Validate Your Markdown Files
In Markdown, any document with valid syntax is accepted. For example, Markdown supports inline HTML, so you can write the HTML tag <h1>title</h1> instead of the Markdown syntax #title.
However, some of these behaviors may be unwanted. For example, you may not want to allow the <script> tag in Markdown, because it can insert arbitrary JavaScript.
In this document, you'll learn how to define Markdown validation rules that check your Markdown documents efficiently.
Markdown validation is part of DFM. If you switch to another Markdown engine, validation might not work.
DocFX provides three kinds of validation rules:
- HTML tag rule, which is used to validate HTML tags in Markdown. There is a common need to restrict usage of HTML tags in Markdown to only allow "safe" HTML tags, so we created this built-in rule for you.
- Markdown token rule. This can be used to validate different kinds of Markdown syntax elements, like headings, links, images, etc.
- Metadata rule. This can be used to validate the metadata of documents. Metadata can be defined in a YAML header, docfx.json, or a single JSON file. The metadata rule gives you a central place to validate metadata against certain principles.
HTML tag validation rules
In most cases, you may want to prohibit certain HTML tags in Markdown, so DocFX provides a built-in HTML tag rule.
To define an HTML tag rule, create a md.style file with the following content:
{
  "tagRules": [
    {
      "tagNames": [ "H1", "H2" ],
      "relation": "In",
      "behavior": "Warning",
      "messageFormatter": "Please do not use <H1> and <H2>, use '#' and '##' instead.",
      "customValidatorContractName": null,
      "openingTagOnly": false
    }
  ]
}
Then, whenever anyone writes <H1> or <H2> in a Markdown file, the build gives a warning.
You can use the following properties to configure the HTML tag rule:
- tagNames is the list of HTML tag names to validate. Required, case-insensitive.
- relation is optional and defines how a tag is matched against tagNames:
  - In: the rule applies when the HTML tag is in tagNames. This is the default value.
  - NotIn: the rule applies when the HTML tag is not in tagNames.
- behavior defines what happens when a matching HTML tag is found. Required. Its value can be one of the following:
  - None: Do nothing.
  - Warning: Log a warning.
  - Error: Log an error, which breaks the current build.
- messageFormatter is the log message written when a matching HTML tag is found. Required. It can contain the following variables:
  - {0}: the name of the tag.
  - {1}: the whole tag.
  For example, if messageFormatter is {0} is the tag name of {1}. and the tag <H1 class="heading"> matches the rule, the output message is: H1 is the tag name of <H1 class="heading">.
- customValidatorContractName is the contract name of an extension tag rule for complex validation. Optional.
- openingTagOnly is a boolean. Optional, default is false. If true, the rule only applies to opening tags, e.g. <H1>; otherwise, it also applies to closing tags, e.g. </H1>.
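To illustrate relation and the messageFormatter variables, the following md.style is a minimal sketch of an allow-list. It uses only the properties described above; the tag names are an arbitrary assumption, not a recommended set. With NotIn, any HTML tag outside the list triggers the rule, and Error makes the build fail.

```json
{
  "tagRules": [
    {
      "tagNames": [ "a", "br", "img" ],
      "relation": "NotIn",
      "behavior": "Error",
      "messageFormatter": "HTML tag '{0}' is not allowed, found: {1}",
      "openingTagOnly": true
    }
  ]
}
```

Setting openingTagOnly to true is a design choice here: it should keep the same element from being reported once for its opening tag and again for its closing tag.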
Test your rule
To enable your rule, put md.style in the same folder as docfx.json, then run docfx. A warning is shown if the build encounters <H1> or <H2>.
Create a custom HTML tag rule
By default, the HTML tag rule only validates whether an HTML tag exists in Markdown. Sometimes you may want additional validation against the content of the tag.
For example, you may not want a tag to contain an onclick attribute, as it can inject JavaScript into the page.
You can create a custom HTML tag rule to achieve this.
- Create a project in your code editor (e.g. Visual Studio).
- Add the NuGet packages Microsoft.DocAsCode.Plugins and Microsoft.Composition.
- Create a class that implements ICustomMarkdownTagValidator.
- Add ExportAttribute with the contract name.
For example, the following validator requires that HTML links (<a>) do not contain an onclick attribute:
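using System.Composition;
using Microsoft.DocAsCode.Plugins;

[Export("should_not_contain_onclick", typeof(ICustomMarkdownTagValidator))]
public class MyMarkdownTagValidator : ICustomMarkdownTagValidator
{
    // Assumption: Validate returns true when the tag is valid, so a tag that contains onclick must return false.
    public bool Validate(string tag)
    {
        // Contains is used for demo purposes; a complete implementation should parse the HTML tag.
        return !tag.Contains("onclick");
    }
}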
Then update your md.style with the following content:
{
  "tagRules": [
    {
      "tagNames": [ "a" ],
      "behavior": "Warning",
      "messageFormatter": "Please do not use 'onclick' in HTML link.",
      "customValidatorContractName": "should_not_contain_onclick",
      "openingTagOnly": true
    }
  ]
}
How to enable custom HTML tag rules
- As with the default HTML tag rule, configure the rule in md.style.
- Create a folder (rules for example) in your DocFX project folder, and put all your custom rule assemblies in a plugins folder under the rules folder. Now your DocFX project should look like this:

    /
    |- docfx.json
    |- md.style
    \- rules
       \- plugins
          \- <your_rule>.dll

- Update your docfx.json with the following content:

    {
      ...
      "dest": "_site",
      "template": [
        "default", "rules"
      ]
    }

- Run docfx, and you'll see your rule being executed.

The rules folder is actually a template folder. In DocFX, a template is the place where you customize build, render, and validation behavior. For more information about templates, please refer to our template and plugin documentation.
Markdown token validation rules
Besides HTML tags, you may also want to validate Markdown syntax such as headings or links. For example, you may want to limit code snippets to a fixed set of languages.
To create such a rule, follow these steps:
- Create a project in your code editor (e.g. Visual Studio).
- Add the NuGet packages Microsoft.DocAsCode.MarkdownLite and Microsoft.Composition.
- Create a class that implements IMarkdownTokenValidatorProvider.
  MarkdownTokenValidatorFactory contains some helper methods to create a validator.
- Add ExportAttribute with the rule name.
For example, the following rule requires all code blocks to be csharp:
[Export("code_snippet_should_be_csharp", typeof(IMarkdownTokenValidatorProvider))]
public class MyMarkdownTokenValidatorProvider : IMarkdownTokenValidatorProvider
{
    public ImmutableArray<IMarkdownTokenValidator> GetValidators()
    {
        return ImmutableArray.Create(
            MarkdownTokenValidatorFactory.FromLambda<MarkdownCodeBlockToken>(t =>
            {
                if (t.Lang != "csharp")
                {
                    throw new DocumentException($"Code lang {t.Lang} is not valid, in file: {t.SourceInfo.File}, at line: {t.SourceInfo.LineNumber}");
                }
            }));
    }
}
To enable this rule, update your md.style to the following:
{
"rules": [ "code_snippet_should_be_csharp" ]
}
Then follow the same steps as in How to enable custom HTML tag rules and run docfx. You'll see your rule being executed.
Logging in your rules
As the example above shows, you can throw DocumentException to raise an error. This stops the build immediately.
You can also use LogWarning(String, String, String, String) and LogError(String, String, String, String) to report a warning and an error respectively.
To use these methods, you need to install the NuGet package Microsoft.DocAsCode.Common first.
The difference between LogError and throwing DocumentException is that throwing an exception stops the build immediately, while LogError doesn't stop the build but eventually fails it after the rules are run.
Advanced usage of md.style
Default rules
If a rule has the contract name of default, it will be enabled by default. You don't need to enable it in md.style.
Enable/disable rules in md.style
You can use disable to specify whether to disable a rule:
{
"rules": [ { "contractName": "<contract_name>", "disable": true } ]
}
This gives you an opportunity to disable the rules enabled by default.
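For instance, a project could turn on one custom rule and turn off another in the same file. The sketch below reuses the object form shown above for both entries. code_snippet_should_be_csharp comes from the earlier example, and <contract_name> remains a placeholder for whichever rule you want to switch off.

```json
{
  "rules": [
    { "contractName": "code_snippet_should_be_csharp", "disable": false },
    { "contractName": "<contract_name>", "disable": true }
  ]
}
```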
Validate metadata in markdown files
In a Markdown file, you can write metadata in a conceptual or overwrite document. DocFX also lets you add plug-ins to validate the metadata written in Markdown files.
Scope of metadata validation
Metadata comes from multiple sources. The following metadata is validated during build:
- YAML header in Markdown.
- Global metadata and file metadata in docfx.json.
- Global metadata and file metadata defined in separate .json files.
For more information about global metadata and file metadata, see docfx.json format.
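As a rough illustration of the docfx.json sources, the fragment below sketches where such metadata might live. The property names (globalMetadata, fileMetadata, globalMetadataFiles, fileMetadataFiles) are taken from the docfx.json format reference rather than from this page. The metadata key audience, the glob pattern, and the file names are hypothetical.

```json
{
  "build": {
    "globalMetadata": {
      "audience": "developer"
    },
    "fileMetadata": {
      "audience": {
        "articles/**.md": "writer"
      }
    },
    "globalMetadataFiles": [ "global.json" ],
    "fileMetadataFiles": [ "file.json" ]
  }
}
```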
Create validation plug-ins
- Create a project in your code editor (e.g. Visual Studio).
- Add the NuGet packages Microsoft.DocAsCode.Plugins and Microsoft.Composition.
- Create a class that implements IInputMetadataValidator.
For example, the following validator prohibits any metadata with name hello:
[Export(typeof(IInputMetadataValidator))]
public class MyInputMetadataValidator : IInputMetadataValidator
{
    public void Validate(string sourceFile, ImmutableDictionary<string, object> metadata)
    {
        if (metadata.ContainsKey("hello"))
        {
            throw new DocumentException($"Metadata 'hello' is not allowed, file: {sourceFile}");
        }
    }
}
Enabling a metadata rule works the same way as for other rules: copy the assemblies to the plugins folder of your template folder and run docfx.
Create configurable metadata validation plug-ins
There are two steps to create a configurable metadata validator:
- Modify the export attribute of the metadata validator plug-in to give it a contract name:

    [Export("hello_is_not_valid", typeof(IInputMetadataValidator))]

  Note
  If the rule doesn't have a contract name, it is always enabled, i.e., there is no way to disable it other than deleting the assembly file.
- Modify md.style with the following content:

    {
      "metadataRules": [
        { "contractName": "hello_is_not_valid", "disable": false }
      ]
    }
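Putting the pieces together, a single md.style can presumably carry all three kinds of rules. The following sketch only combines fragments already shown in this document and assumes the three sections can coexist in one file.

```json
{
  "tagRules": [
    {
      "tagNames": [ "a" ],
      "behavior": "Warning",
      "messageFormatter": "Please do not use 'onclick' in HTML link.",
      "customValidatorContractName": "should_not_contain_onclick",
      "openingTagOnly": true
    }
  ],
  "rules": [ "code_snippet_should_be_csharp" ],
  "metadataRules": [
    { "contractName": "hello_is_not_valid", "disable": false }
  ]
}
```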
Advanced: Share your rules
Some users have many documentation projects and want to share validation rules across all of them without writing a md.style file for each one.
Create template
For this purpose, you can create a template with the following structure:
/ (root folder for plug-in)
  \- md.styles
     |- <category-1>.md.style
     \- <category-2>.md.style
  \- plugins
     \- <your_rule>.dll
The md.styles folder contains a set of definition files with the file extension .md.style. Each file is a category.
Each category contains a set of rule definitions.
For example, create a file named test.md.style with the following content:
{
  "tagRules": {
    "heading": {
      "tagNames": [ "H1", "H2" ],
      "behavior": "Warning",
      "messageFormatter": "Please do not use <H1> and <H2>, use '#' and '##' instead.",
      "openingTagOnly": true
    }
  },
  "rules": {
    "code": "code_snippet_should_be_csharp"
  },
  "metadataRules": {
    "hello": { "contractName": "hello_is_not_valid", "disable": true }
  }
}
Here, test is the category name (taken from the file name) for the three rules, and each rule has its own id: heading, code and hello.
When you build documents with this template, a rule is active when its disable property is false.
Config rules
Some rules need to be enabled or disabled in specific documentation projects.
For example, the hello rule is not required for most projects, but you may want to enable it for a particular project.
To do so, modify the md.style file in that project with the following content:
{
"settings": [
{ "category": "test", "id": "hello", "disable": false }
]
}
For other projects, you may need to disable all rules in the test category:
{
"settings": [
{ "category": "test", "disable": true }
]
}
Note
The disable property is applied in the following order:
- tagRules, rules and metadataRules in md.style.
- Auto-enabled rules with contract name default.
- settings with category and id in md.style.
- settings with category in md.style.
- disable property in definition file.
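To see how this order might play out, consider a project that wants only the heading rule from the test category. The sketch below assumes that entries earlier in the list above take precedence over later ones, so a setting with both category and id overrides a setting with category alone. Check this against your DocFX version before relying on it.

```json
{
  "settings": [
    { "category": "test", "disable": true },
    { "category": "test", "id": "heading", "disable": false }
  ]
}
```

Under that assumption, heading stays enabled while code and hello are disabled.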