This parser assumes that the input document is an HTML document. It automatically converts HTML tags that have a native element representation to their native representation – all other tags as well as processing instructions and HTML comments are parsed as is.
Conversion of HTML tags to native kramdown elements
Here is a description of the HTML tags that have native representations and how they are converted:
em strong blockquote hr br a img p thead tbody tfoot tr td th ul ol dl li dt dd
- These HTML tags are just transformed into their native representation and don’t need any further processing.
b i
- The HTML <b> tag is converted to the strong element and the <i> tag to the em element.
h1 h2 h3 h4 h5 h6
- These six header tags are all mapped to the single header element with different header levels.
code pre
- The HTML code tag is converted to a codespan element and the pre tag to a codeblock element. All child tags are removed and only the contained text is used.
table
- The HTML table tag is converted to its native element counterpart if its cells contain only span-level elements; otherwise it is not converted.
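The tag-to-element mapping described above can be summarised as a small lookup. This is only an illustrative sketch in plain Ruby, not kramdown's own implementation: the constant names and the `native_element_for` helper are hypothetical, and the sketch assumes that directly converted tags map to element types of the same name. The table tag is left out because its conversion depends on the content of its cells.

```ruby
# Illustrative lookup only; the names below are hypothetical and this is not
# how kramdown itself is implemented.
DIRECT_TAGS = %w[em strong blockquote hr br a img p thead tbody tfoot
                 tr td th ul ol dl li dt dd].freeze

# Tags whose native element has a different name than the tag.
RENAMED_TAGS = {
  'b' => :strong,
  'i' => :em,
  'code' => :codespan,
  'pre' => :codeblock
}.freeze

# Returns [element_type, options] for a tag with a native representation,
# or nil when the tag would be kept as a plain HTML element.
# The table tag is omitted: whether it converts depends on its cell content.
def native_element_for(tag)
  tag = tag.downcase
  if (match = tag.match(/\Ah([1-6])\z/))
    # h1..h6 all become one header element that carries the level.
    [:header, { level: match[1].to_i }]
  elsif RENAMED_TAGS.key?(tag)
    [RENAMED_TAGS[tag], {}]
  elsif DIRECT_TAGS.include?(tag)
    [tag.to_sym, {}]
  end
end

p native_element_for('h3')
p native_element_for('b')
p native_element_for('div')
```

Returning nil for unknown tags mirrors the rule that all other tags are parsed as is.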
The following general transformations are also applied:
Direct text children of the following HTML tags are removed:
html head hgroup ol ul dl table colgroup tbody thead tfoot tr select optgroup
Text elements containing only whitespace and appearing either directly after the start tag, directly before the end tag or between block-level child elements are removed from the following HTML tags:
body section nav article aside header footer address div li dd blockquote figure figcaption td th fieldset form
Leading and trailing whitespace in the content of the following HTML tags is removed:
address article aside blockquote body caption dd div dl dt fieldset figcaption form footer header h1 h2 h3 h4 h5 h6 legend li nav p section td th
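To make two of these rules concrete, here is a rough sketch on a toy tree in which children are either strings (text) or nested nodes. The `Node` struct, the `tidy` helper and the constant names are hypothetical and exist only for illustration; the sketch covers the removal of direct text children and the stripping of leading and trailing whitespace, and leaves out the rule about whitespace between block-level children.

```ruby
# Toy model: a node has a tag name and children that are Strings or Nodes.
# All names here are hypothetical; this is not kramdown's implementation.
Node = Struct.new(:tag, :children)

# Tags whose direct text children are removed.
NO_TEXT_CHILDREN = %w[html head hgroup ol ul dl table colgroup tbody thead
                      tfoot tr select optgroup].freeze

# Tags whose content loses leading and trailing whitespace.
STRIP_CONTENT = %w[address article aside blockquote body caption dd div dl dt
                   fieldset figcaption form footer header h1 h2 h3 h4 h5 h6
                   legend li nav p section td th].freeze

def tidy(node)
  # Process nested nodes first; text children are handled below.
  children = node.children.map { |c| c.is_a?(Node) ? tidy(c) : c }

  # Rule: drop every direct text child of the listed container tags.
  children = children.reject { |c| c.is_a?(String) } if NO_TEXT_CHILDREN.include?(node.tag)

  # Rule: strip whitespace at the very start and very end of the content.
  if STRIP_CONTENT.include?(node.tag)
    children[0] = children[0].lstrip if children.first.is_a?(String)
    children[-1] = children[-1].rstrip if children.last.is_a?(String)
    children = children.reject { |c| c == '' }
  end

  Node.new(node.tag, children)
end

list = Node.new('ul', [
  "\n  ",
  Node.new('li', ['  first item ']),
  "\n  ",
  Node.new('li', [' second ', Node.new('em', ['item']), '  ']),
  "\n"
])
result = tidy(list)
p result.children.map(&:tag)
p result.children[0].children
```

The whitespace between the two list items disappears because ul may not have direct text children, while the text inside each li is only trimmed at its outer edges.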
The HTML parser doesn’t have any options.