HTML Converter

Introduction

This converter converts a kramdown element tree into an HTML fragment and supports all available element types. Below is a list of additional features of the HTML converter as well as some additional information.

Automatic Generation of Header IDs

kramdown supports the automatic generation of header IDs if the option auto_ids is set to true (which is the default). This is done by converting the untransformed, i.e. plain, header text (or, if the option auto_id_stripping is set, only the text from real text elements) via the following steps:

All characters except letters, numbers, spaces and dashes are removed.
All characters from the start of the line until the first letter are removed.
Everything except letters and numbers is converted to dashes.
Everything is lowercased.
If nothing is left, the identifier section is used.
If an identifier created this way already exists, a dash and a sequential number are appended (first -1, then -2 and so on).
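
The steps above can be sketched in Ruby as follows. This is an illustrative sketch, not kramdown's actual implementation: the class name `HeaderIdSketch` and its `generate` method are hypothetical, and only ASCII letters and digits are treated as letters and numbers.

```ruby
# Hypothetical helper that mirrors the documented steps; not kramdown's own code.
class HeaderIdSketch
  def initialize
    @used = {} # generated ID => number of duplicates seen so far
  end

  def generate(text)
    id = text.gsub(/[^a-zA-Z0-9 -]/, '') # 1. keep letters, numbers, spaces, dashes
    id = id.sub(/\A[^a-zA-Z]+/, '')      # 2. drop everything before the first letter
    id = id.gsub(/[^a-zA-Z0-9]/, '-')    # 3. everything else becomes a dash
    id = id.downcase                     # 4. lowercase
    id = 'section' if id.empty?          # 5. fallback identifier
    if @used.key?(id)                    # 6. de-duplicate with -1, -2, ...
      "#{id}-#{@used[id] += 1}"
    else
      @used[id] = 0
      id
    end
  end
end

ids = HeaderIdSketch.new
puts ids.generate('12. Another one 1 here') # another-one-1-here
puts ids.generate('Hallo')                  # hallo
puts ids.generate('Hallo')                  # hallo-1
puts ids.generate('123456789')              # section
```

One generator instance is kept per document so that the duplicate counter in step 6 sees all previously generated IDs.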

Note that the option auto_id_stripping will be removed in version 2.0 because this will be the default behaviour!

Following are some examples of header texts and their respective generated IDs:

Sample header	generated ID	generated ID with `auto_id_stripping`
`# This is a header`	this-is-a-header	this-is-a-header
`## 12. Another one 1 here`	another-one-1-here	another-one-1-here
`### Do ^& it now`	do--it-now	do--it-now
`# Hallo`	hallo	hallo
`# Hallo`	hallo-1	hallo-1
`# 123456789`	section	section
`# <i>test</i>`	itesti	test

The automatic creation of header IDs is not part of standard Markdown. The rules on how header text is converted to an identifier are based on the rules specified by Pandoc.

Such generated IDs are only used if no ID has been set manually before.

Standalone Images

By assigning the reference name “standalone” to an image that is the sole content of a paragraph, the image is rendered inside a <figure> element instead of inside a paragraph.

Code Blocks

A code block is wrapped in both <pre> and <code> tags.

Showing Whitespace in a Code Block

It is sometimes useful to visualize whitespace within a code block. This can be achieved by adding the class show-whitespaces to a code block (using a block IAL).

Here is an example where the whitespaces are shown:

	⋅⋅leading⋅tab	and⋅space
trailing⋅⋅⋅tab⋅and⋅space⋅⋅⋅⋅

When showing whitespace in a code block, all spaces are replaced with the entity ⋅. In addition, spaces and tabs in the code block are marked up using HTML span tags and the following CSS classes:

ws-space-l, ws-tab-l: leading spaces/tabs
ws-space-r, ws-tab-r: trailing spaces/tabs
ws-space, ws-tab: spaces/tabs in between
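
The markup described above can be approximated with a single substitution pass. The following is a minimal sketch, not kramdown's own code: the method name `show_whitespace` and the constant `DOT` are hypothetical, the numeric entity for the dot character is an assumption, and HTML escaping of the code itself is left out.

```ruby
# Hypothetical sketch of the whitespace markup; not kramdown's own code.
DOT = '&#8901;' # numeric entity for the dot operator (U+22C5), assumed here

def show_whitespace(code)
  # Alternatives: leading run, trailing run, run in between (checked in that order).
  code.gsub(/(^[ \t]+)|([ \t]+$)|([ \t]+)/) do |run|
    # Pick the class suffix first, before another regexp call resets $1 and $2.
    suffix = if $1
               '-l'
             elsif $2
               '-r'
             else
               ''
             end
    # Split the run into single tabs and groups of spaces.
    run.scan(/\t| +/).map do |part|
      if part == "\t"
        %(<span class="ws-tab#{suffix}">\t</span>)
      else
        %(<span class="ws-space#{suffix}">#{DOT * part.length}</span>)
      end
    end.join
  end
end

puts show_whitespace("  a b\t")
```

For the input above, the two leading spaces get the class ws-space-l, the space between the letters gets ws-space, and the trailing tab gets ws-tab-r.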

Automatic Syntax Highlighting

kramdown supports setting a code language for code blocks and code spans, see Language of Code Blocks. This language will be used for highlighting code blocks and spans.

The actual syntax highlighting is configurable via the option ‘syntax_highlighter’. The default value is ‘rouge’ (see the option list below), which means that the Rouge syntax highlighter is used. Rouge is Pygments compatible, i.e. it supports all of Pygments’ CSS themes.

Another syntax highlighter is Coderay.

Math Support

kramdown supports the use of various math engines. The default math engine is MathJax (which can also be used with KaTeX).

For proper functionality, the HTML template must be configured to link to the engine’s JavaScript and CSS. Note that the CSS includes references to webfonts.

Precompiling versions are also available that eliminate the need for client-side JavaScript: MathJax-Node, KaTeX, and SsKaTeX. Each one requires a JavaScript engine to be installed where kramdown runs in order to perform the precompilation. The resulting pages still require CSS and fonts, but no longer any JavaScript.

Alternative math engines are Ritex and itex2MML, both of which output MathML.

Emphasis

kramdown uses the HTML element em for lightly emphasized text and the element strong for strongly emphasized text.

Definition Lists

kramdown allows the automatic generation of element IDs for the terms of a definition list. The algorithm is the same as for headers (see above), except that the last two steps, falling back to “section” and making IDs unique, are not applied. Also, the algorithm works only on the real text, as if auto_id_stripping were activated.

The automatic generation of IDs is activated by assigning the reference name “auto_ids” to a definition list. This will generate plain IDs without a prefix. By using a reference name of the format “auto_ids-PREFIX”, the prefix is used.

Such generated IDs are only used if no ID has been set manually.
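
The interplay of reference name, prefix and manually set ID can be sketched as follows. This is a hypothetical illustration, not kramdown's own code: the method name `term_id` and its arguments are assumptions, and only ASCII letters and digits are handled.

```ruby
# Hypothetical sketch; the method name and signature are illustrative only.
def term_id(term_text, reference_name, manual_id = nil)
  return manual_id if manual_id # a manually set ID always wins

  match = /\Aauto_ids(?:-(.+))?\z/.match(reference_name)
  return nil unless match # the list was not marked with auto_ids

  # Same basic steps as for headers, without the "section" fallback
  # and without the uniqueness suffix.
  id = term_text.gsub(/[^a-zA-Z0-9 -]/, '')
  id = id.sub(/\A[^a-zA-Z]+/, '').gsub(/[^a-zA-Z0-9]/, '-').downcase
  "#{match[1]}#{id}" # the prefix is empty for a plain "auto_ids"
end

puts term_id('term', 'auto_ids')        # term
puts term_id('term', 'auto_ids-prefix') # prefixterm
```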

Here are examples:

{:auto_ids}
term
: definition

<dl>
  <dt id="term">term</dt>
  <dd>definition</dd>
</dl>

term: definition

{:auto_ids-prefix}
term
: definition

<dl>
  <dt id="prefixterm">term</dt>
  <dd>definition</dd>
</dl>

term: definition

Footnotes

If a document contains footnotes, they are automatically placed at the end of the document.

By assigning the reference name “footnotes” to an ordered or unordered list, the list will be replaced with the footnotes, instead of placing the footnotes at the end of the document.

Automatic “Table of Contents” Generation

kramdown supports the automatic generation of the table of contents of all headers that have an ID set. Just assign the reference name “toc” to an ordered or unordered list by using an IAL and the list will be replaced with the actual table of contents, rendered as nested unordered lists if “toc” was applied to an unordered list or else as nested ordered lists. All attributes applied to the original list will also be applied to the generated TOC list and it will get an ID of markdown-toc if no ID was set.

When the auto_ids option is set, all headers will appear in the table of contents as they all will have an ID. Assign the class name “no_toc” to a header to exclude it from the table of contents.
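
The selection rule just described (a header needs an ID and must not carry the class “no_toc”) can be sketched in a few lines. The `Header` struct and the `toc_entries` method are hypothetical and only model the rule; they are not part of kramdown.

```ruby
# Hypothetical model of a header; kramdown's element tree looks different.
Header = Struct.new(:level, :text, :id, :classes)

# Keep headers that have an ID and are not excluded via the no_toc class.
def toc_entries(headers)
  headers.select { |h| h.id && !h.classes.include?('no_toc') }
end

headers = [
  Header.new(1, 'Contents header', 'contents-header', ['no_toc']),
  Header.new(1, 'H1 header', 'h1-header', []),
  Header.new(2, 'H2 header', 'h2-header', []),
  Header.new(2, 'Header without ID', nil, [])
]

# Print the remaining headers as an indented outline.
toc_entries(headers).each do |h|
  puts "#{'  ' * (h.level - 1)}- #{h.text} (##{h.id})"
end
```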

Here is an example that generates a “Table of Contents” as an unordered list:

# Contents header
{:.no_toc}

* A markdown unordered list which will be replaced with the ToC, excluding the "Contents header" from above
{:toc}

# H1 header

## H2 header

For a “Table of Contents” as an ordered list:

1. The generated ToC will be an ordered list
{:toc}

# H1 header

## H2 header

Options

The HTML converter supports the following options:
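
When kramdown is used from Ruby, options are passed as a hash when the document is created. The sketch below relies on the `Kramdown::Document.new(source, options)` and `to_html` calls of the kramdown gem; the constant `HTML_OPTIONS`, the helper `render_html` and the chosen values are illustrative assumptions.

```ruby
# Illustrative option values; the option names are described in this section.
HTML_OPTIONS = {
  auto_ids: true,          # generate header IDs automatically
  auto_id_prefix: 'doc1-', # avoid ID clashes between separately rendered documents
  toc_levels: '2..3',      # only level 2 and 3 headers appear in the table of contents
  footnote_nr: 1,          # number of the first footnote
  entity_output: :numeric  # output entities in numeric form
}.freeze

# Hypothetical helper that renders kramdown source with the options above.
def render_html(source, options = HTML_OPTIONS)
  require 'kramdown' # third-party gem, loaded only when rendering is requested
  Kramdown::Document.new(source, options).to_html
end
```

With the gem installed, `render_html("# Title")` returns the HTML fragment for the given source.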

auto_ids

Use automatic header ID generation

If this option is true, ID values for all headers are automatically generated if no ID is explicitly specified.

Default: true
Used by: HTML/Latex converter

auto_id_prefix

Prefix used for automatically generated header IDs

This option can be used to set a prefix for the automatically generated header IDs so that there is no conflict when rendering multiple kramdown documents into one output file separately. The prefix should only contain characters that are valid in an ID!

Default: ‘’
Used by: HTML/Latex converter

auto_id_stripping

Strip all formatting from header text for automatic ID generation

If this option is true, only the text elements of a header are used for generating the ID later (in contrast to just using the raw header text line).

This option will be removed in version 2.0 because this will be the default then.

Default: false
Used by: kramdown parser

transliterated_header_ids

Transliterate the header text before generating the ID

Only ASCII characters are used in header IDs. This is not good for languages with many non-ASCII characters. By enabling this option, the header text is transliterated to ASCII as well as possible so that the resulting header ID is more useful.

The stringex library needs to be installed for this feature to work!

Default: false
Used by: HTML/Latex converter

template

The name of an ERB template file that should be used to wrap the output or the ERB template itself.

This is used to wrap the output in an environment so that the output can be used as a stand-alone document. For example, an HTML template would provide the needed header and body tags so that the whole output is a valid HTML file. If no template is specified, the output will be just the converted text.

When resolving the template file, the given template name is used first. If such a file is not found, the converter extension (the same as the converter name) is appended. If the file still cannot be found, the template name is interpreted as the name of a template provided by kramdown (without the converter extension). If the file is still not found and the template name starts with ‘string://’, this prefix is removed and the rest is used as the template content.
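
The lookup order can be sketched as follows. This is a hypothetical illustration rather than kramdown's own code: the method name `resolve_template`, its arguments (the converter extension and the directory holding the templates shipped with kramdown) and the error raised when nothing matches are assumptions of the sketch.

```ruby
# Hypothetical sketch of the lookup order; names and arguments are assumptions.
def resolve_template(name, converter_ext, shipped_dir)
  candidates = [
    name,                                        # 1. the name as given
    name + converter_ext,                        # 2. with the converter extension
    File.join(shipped_dir, name + converter_ext) # 3. a template provided by kramdown
  ]
  file = candidates.find { |path| File.file?(path) }
  return File.read(file) if file

  # 4. an inline template: strip the prefix and use the rest as content
  return name.delete_prefix('string://') if name.start_with?('string://')

  raise ArgumentError, "template #{name.inspect} not found"
end

puts resolve_template('string://<p>inline template</p>', '.html', '/path/to/shipped')
```

The file checks come first, so the ‘string://’ form is only considered when no matching file exists.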

kramdown provides a default template named ‘document’ for each converter.

Default: ‘’
Used by: all converters

footnote_nr

The number of the first footnote

This option can be used to specify the number that is used for the first footnote.

Default: 1
Used by: HTML converter

entity_output

Defines how entities are output

The possible values are :as_input (entities are output in the same form as found in the input), :numeric (entities are output in numeric form), :symbolic (entities are output in symbolic form if possible) or :as_char (entities are output as characters if possible, only available on Ruby 1.9).

Default: :as_char
Used by: HTML converter, kramdown converter
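
The four values of entity_output can be illustrated with a small sketch. The method `entity_text` and its arguments are hypothetical and not part of kramdown's API; decimal notation for the numeric form and a numeric fallback for entities without a name are assumptions.

```ruby
# Hypothetical sketch; the method and its arguments are not kramdown's API.
def entity_text(code_point, name, original, mode)
  numeric = format('&#%d;', code_point)
  case mode
  when :as_input then original                     # exactly as written in the input
  when :numeric  then numeric                      # numeric form
  when :symbolic then name ? "&#{name};" : numeric # symbolic form if a name exists
  when :as_char  then [code_point].pack('U')       # the character itself
  else raise ArgumentError, "unknown mode #{mode.inspect}"
  end
end

puts entity_text(8212, 'mdash', '&#x2014;', :as_input) # &#x2014;
puts entity_text(8212, 'mdash', '&#x2014;', :numeric)  # &#8212;
puts entity_text(8212, 'mdash', '&#x2014;', :symbolic) # &mdash;
```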

smart_quotes

Defines the HTML entity names or code points for smart quote output

The entities identified by entity name or code point that should be used for, in order, a left single quote, a right single quote, a left double and a right double quote are specified by separating them with commas.

Default: lsquo,rsquo,ldquo,rdquo
Used by: HTML/Latex converter
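
How a smart_quotes value maps onto the four quote positions can be sketched like this. The method `parse_smart_quotes` and the role names are hypothetical; treating digit-only entries as code points and requiring exactly four entries are assumptions of the sketch.

```ruby
# Hypothetical sketch of reading the option value; not kramdown's own code.
QUOTE_ROLES = %i[left_single right_single left_double right_double].freeze

def parse_smart_quotes(value)
  parts = value.split(',').map(&:strip)
  raise ArgumentError, 'exactly four entries are required' unless parts.size == 4

  # Entries made only of digits are treated as code points, others as entity names.
  entries = parts.map { |part| part.match?(/\A\d+\z/) ? part.to_i : part }
  QUOTE_ROLES.zip(entries).to_h
end

p parse_smart_quotes('lsquo,rsquo,ldquo,rdquo')
p parse_smart_quotes('8216,8217,8220,8221')
```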

toc_levels

Defines the levels that are used for the table of contents

The individual levels can be specified by separating them with commas (e.g. 1,2,3) or by using the range syntax (e.g. 1..3). Only the specified levels are used for the table of contents.

Default: 1..6
Used by: HTML/Latex converter
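
The two notations accepted for toc_levels can be read into a list of levels as sketched below. The method `parse_toc_levels` is hypothetical and only illustrates the two syntaxes; it is not kramdown's own option parser.

```ruby
# Hypothetical sketch of reading the option value; not kramdown's own code.
def parse_toc_levels(value)
  case value
  when /\A(\d)\.\.(\d)\z/ # range syntax, e.g. "1..3"
    first, last = Regexp.last_match.captures.map(&:to_i)
    (first..last).to_a
  when /\A\d(?:,\d)*\z/   # comma-separated levels, e.g. "1,2,3"
    value.split(',').map(&:to_i).uniq
  else
    raise ArgumentError, "invalid toc_levels value: #{value.inspect}"
  end
end

p parse_toc_levels('1..3') # [1, 2, 3]
p parse_toc_levels('2,4')  # [2, 4]
```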

syntax_highlighter

Set the syntax highlighter

Specifies the syntax highlighter that should be used for highlighting code blocks and spans. If this option is set to nil, no syntax highlighting is done.

Options for the syntax highlighter can be set with the syntax_highlighter_opts configuration option.

Default: rouge
Used by: HTML/Latex converter

syntax_highlighter_opts

Set the syntax highlighter options

Specifies options for the syntax highlighter set via the syntax_highlighter configuration option.

The value needs to be a hash with key-value pairs that are understood by the used syntax highlighter.

Default: {}
Used by: HTML/Latex converter

math_engine

Set the math engine

Specifies the math engine that should be used for converting math blocks/spans. If this option is set to nil, no math engine is used and the math blocks/spans are output as is.

Options for the selected math engine can be set with the math_engine_opts configuration option.

Default: mathjax
Used by: HTML converter

math_engine_opts

Set the math engine options

Specifies options for the math engine set via the math_engine configuration option.

The value needs to be a hash with key-value pairs that are understood by the used math engine.

Default: {}
Used by: HTML converter

footnote_backlink

Defines the text that should be used for the footnote backlinks

The footnote backlink is just text, so any special HTML characters will be escaped.

If the footnote backlink text is an empty string, no footnote backlinks will be generated.

Default: ‘&#8617;’
Used by: HTML converter

footnote_backlink_inline

Specifies whether the footnote backlink should always be inline

With the default of false the footnote backlink is placed at the end of the last paragraph if there is one, or an extra paragraph with only the footnote backlink is created.

Setting this option to true tries to place the footnote backlink in the last, possibly nested paragraph or header. If this fails (e.g. in the case of a table), an extra paragraph with only the footnote backlink is created.

Default: false
Used by: HTML converter

typographic_symbols

Defines a mapping from typographical symbol to output characters

Typographical symbols are normally output using their equivalent Unicode codepoint. However, sometimes one wants to change the output, mostly to fall back to a sequence of ASCII characters.

This option allows this by specifying a mapping from typographical symbol to its output string. For example, the mapping {hellip: ...} would output the standard ASCII representation of an ellipsis.

The available typographical symbol names are:

hellip: ellipsis
mdash: em-dash
ndash: en-dash
laquo: left guillemet
raquo: right guillemet
laquo_space: left guillemet followed by a space
raquo_space: right guillemet preceded by a space

Default: {}
Used by: HTML/Latex converter
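
The effect of a typographic_symbols mapping can be sketched as a lookup with a Unicode fallback. The names `DEFAULT_SYMBOLS` and `symbol_output` are hypothetical, and the sketch covers only three of the symbols listed above.

```ruby
# Hypothetical sketch; covers only three of the available symbol names.
DEFAULT_SYMBOLS = {
  hellip: "\u2026", # ellipsis
  mdash: "\u2014",  # em-dash
  ndash: "\u2013"   # en-dash
}.freeze

# Use the configured output string if there is one, else the Unicode character.
def symbol_output(name, mapping = {})
  mapping.fetch(name) { DEFAULT_SYMBOLS.fetch(name) }
end

ascii = { hellip: '...', mdash: '---' }
puts symbol_output(:hellip, ascii)            # ...
puts symbol_output(:ndash, ascii) == "\u2013" # true
```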

remove_line_breaks_for_cjk

Specifies whether line breaks should be removed between CJK characters

Default: false
Used by: HTML converter