The Markdown Wiki syntax was designed as a simplified syntax for producing HTML text based on widely used conventions for writing e-mails, forum posts, and other plain text pieces. Markdown has since become ubiquitous on the web, being used by services such as GitHub and Stack Exchange, and having close to a hundred independent implementations as of 2016.
Markdown defines a Wiki syntax for the most common structural, typographical, and hyperlinking features of HTML (as explained in the remainder of this text). Where markdown doesn't provide a syntax for a particular HTML construct, HTML can be used directly in markdown text, either as inline markup or as a markup block.
sgmljs.net Markdown is presented as an application of the SGML
SHORTREF feature, which is an SGML mechanism to describe
custom Wiki or other domain-specific syntax. Whereas in regular markdown
it's possible to use HTML markup, in sgmljs.net Markdown it's
also possible to use SGML markup and other constructs, bringing SGML's
vast facilities for text organization and processing to markdown
in a natural way.
sgmljs.net Markdown implements the original markdown syntax with
the "fenced code blocks" and "tables" features of GitHub-flavored
markdown, and with select pandoc markdown
extension features. In the remainder of this text, markdown.pl
is used as a reference to
John Gruber's original Markdown when discussing differences between
sgmljs.net Markdown and other markdown formatters.
Markdown performs the following conversions on special characters in body text, headers, and nearly everywhere else:
*Emphasized text* or
_Emphasized text_
is converted to
<em>Emphasized text</em>
**Strongly emphasized text** or
__Strongly emphasized text__
is converted to
<strong>Strongly emphasized text</strong>
`Text put in *backtick* characters` is output
verbatim within <code> tags, as
<code>Text put in *backtick* characters</code>
(with any markup and markdown content escaped)
Balanced pairs of above character tokens get replaced by the respective start- and end-tags according to these rules:
Markdown produces a hard line break (an HTML
<br> tag) on a line
ending in two or more space characters.
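The span-level conversions above can be illustrated with a small Python sketch. It is not sgmljs.net's implementation (which is based on SGML SHORTREF); the function names `render_spans` and `has_hard_break` are hypothetical, and the regular expressions are a deliberate simplification that ignores escaping and nested or intra-word markers.

```python
import html
import re


def render_spans(line: str) -> str:
    """Convert backtick, strong, and emphasis markers in one line to HTML."""
    placeholders = []

    def stash_code(match):
        # Escape markup inside the code span and protect it from later passes
        code = html.escape(match.group(1), quote=False)
        placeholders.append("<code>" + code + "</code>")
        return "\x00%d\x00" % (len(placeholders) - 1)

    text = re.sub(r"`([^`]+)`", stash_code, line)
    # Strong emphasis first, because ** and __ contain * and _
    text = re.sub(r"(\*\*|__)(.+?)\1", r"<strong>\2</strong>", text)
    text = re.sub(r"(\*|_)(.+?)\1", r"<em>\2</em>", text)
    # Put the protected code spans back
    return re.sub(r"\x00(\d+)\x00",
                  lambda m: placeholders[int(m.group(1))], text)


def has_hard_break(line: str) -> bool:
    """True if the line ends in two or more space characters."""
    return re.search(r" {2,}$", line) is not None
```

Code spans are stashed before the emphasis passes run, which is one simple way to keep their content verbatim.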
See also Typography Examples.
In markdown, a list is created by starting a line with a list marker character, such as * or -:
    * This line starts a list item consisting of a single line of text
    * This list item is formatted similar to the one before, even though
      it is continued on the next line in markdown text.
    * This item has two paragraphs.

      As can be seen, a list item is continued even if its text lines
      are separated by a blank line. A continuation line only needs
      to be indented to the same level.
    * As this list item shows, only the first line following a blank line needs indentation; subsequent lines in markup text may start at the line beginning, and will still be treated as part of the list item
The following example shows how to produce nested lists:
    - Top-level list item 1
    - Top-level list item 2
        - Sublist item 21
        - Sublist item 22
    - Top-level list item 3
        - Sublist item 31
    - Top-level list item 4
        - Sublist item 41
          Continuation line for sublist item 4
            - Sub-sublist item 411
              Continuation line for sub-sublist item 411
                - - Sub-subsublist item 4111
                - Sub-subsublist item 4112
        - Sublist item 42
Lines starting with a list marker character,
followed by one, two, or three space characters or a tab character,
followed by a non-space character, create list items if they are
placed at the same indentation level as previous content (or without
indentation at the beginning of a file).
The first such line following some content at equal or lower indentation
level creates a new list; subsequent such lines add further list items
if no content at a lower indentation level is placed in between.
After a blank line, however, a list item is continued by using one level of indentation more than the list item start line to which it belongs. If list item text is continued after a blank line, it will be put into a new paragraph.
If a list item contains paragraphs, sublists, or other rich content
apart from a single paragraph of text, any text content of that list item
is put into paragraphs (HTML
<p> elements). If a list item only contains
a single paragraph of text, the text is put directly as the sole content
of the HTML
<li> element, without wrapping the item's content into
<p> elements.
sgmljs.net doesn't prune
<p> elements on list item content
which has span-level typography markup (see listitems-in-paras2 example).
Note that a list marker without list item text (an "empty list item") will not start a list.
Note also that a blank line is needed before the start of a list following a paragraph, but not following a header.
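The marker rule above (marker, then one to three spaces or a tab, then a non-space character) can be sketched as a regular expression. This is only an illustration: the name `starts_list_item` is hypothetical, the marker set is limited to the `*` and `-` characters used in the examples of this text, and indentation levels are ignored entirely.

```python
import re

# Marker, then 1-3 spaces or one tab, then a non-space character
LIST_ITEM = re.compile(r"^[*-](?: {1,3}|\t)\S")


def starts_list_item(line: str) -> bool:
    """True if the line looks like the start of a bulleted list item."""
    return LIST_ITEM.match(line) is not None
```

Note that a bare marker without item text does not match, in line with the "empty list item" remark above.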
See also List Examples.
Lists can also be started using numbers. Numbered lists are rendered as HTML ordered lists:

    1. First item
    2. Second item
    3. Third item
To produce HTML definition lists, the following syntax is used:
    definition term
    : definition
    : further definition
    : ...
See also Definition List Examples.
In markdown, section headers are commonly created in the atx-header style as follows:
    # Header #
    Body text
Alternatively, and less commonly, headers can also be created in the setext-header style:
    Header
    ------
    Body text
sgmljs.net, like pandoc, also generates fragment (link) IDs for subsequent (or prior) reference in reference links based on the header text as follows:
See also Header Examples, also including details of automatic link id creation.
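For orientation, the following Python sketch implements the automatic identifier rules documented in the pandoc manual (drop punctuation other than underscores, hyphens, and periods; replace whitespace with hyphens; lowercase; remove everything before the first letter; fall back to `section`). Whether sgmljs.net follows these rules in every detail is an assumption here; the Header Examples remain the authoritative reference, and the function name `pandoc_style_identifier` is hypothetical.

```python
import re


def pandoc_style_identifier(header_text: str) -> str:
    """Derive a fragment identifier from header text, pandoc-style."""
    # Keep letters, digits, underscores, hyphens, periods, and whitespace
    kept = "".join(
        ch for ch in header_text
        if ch.isalnum() or ch in "_-." or ch.isspace()
    )
    # Replace runs of whitespace with a hyphen and lowercase the result
    ident = re.sub(r"\s+", "-", kept.strip()).lower()
    # Drop everything before the first letter
    for index, ch in enumerate(ident):
        if ch.isalpha():
            return ident[index:]
    # Nothing usable is left
    return "section"
```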
Markdown doesn't provide syntax for every HTML markup construct.
Instead, when needed, HTML can be used directly in a markdown file.
For example, to create subscripted text using inline HTML,
H<sub>2</sub>O, CO<sub>2</sub> can be used.
Likewise, to place an anchor around a portion of text, inline HTML
<span> (or other) elements can be used like this:
this is <span id="mylink">some text that will scroll in view</span>
Creating explicit link targets (fragment identifiers) like above isn't often used in markdown documents because link targets are most often placed on headers and similar structural HTML elements; such links are created automatically by sgmljs.net.
See also Inline HTML Examples.
A more common use of link targets on elements other than headers is
with text citations. Often the author will want to make cited text
stand out from the surrounding text.
The following example uses an HTML block
within markdown text:
    markdown text ...

    <div>
    <a id="imitation-cite">
    <p>Imitation is the sincerest form of flattery.</p>
    <i align="right">-- attributed to Charles Caleb Colton</i></a>
    </div>

    ... markdown text
Since a link target element with an id is placed around the citation block, it may be linked to from other places in the document, or from outside of the document.
This example demonstrates HTML blocks. HTML blocks, as opposed to
inline HTML, consist of HTML on one or more lines of its own, rather
than within a running line of markdown text. HTML block level elements,
such as <p>, cannot be used within inline HTML in markdown.
Other possible uses include HTML image maps or HTML tables for tabular data requiring more sophisticated formatting than is possible using markdown alone.
Text lines starting with a markup element tag,
a DOCTYPE declaration, a processing instruction,
or a marked section tag are considered the start of a markup block
if preceded by a blank line, or if at the beginning of a file.
To include a blank line (newline characters) in a markup block, use HTML
character entities. Note that these will be ignored by
HTML renderers unless they appear within preformatted content.
See also HTML Block Examples.
Preformatted text with spacing and line breaks that should be preserved in the output, such as software source code or verbatim HTML/XML text, can be put in Code Blocks to prevent markdown from formatting it.
A Code Block is created by indenting it one level more than the previous or surrounding paragraph or list item to which it belongs.
For example, the following is a code block:

    Newlines are preserved, and markdown syntax
    in code blocks is *NOT* formatted
    /* so that the syntax of a programming language being
     * displayed is rendered as-is */
A code block doesn't have to be terminated by a blank line; instead, a code block ends when indentation returns to a previous level. Unlike with lists, omitting the space characters for indentation on the second and subsequent line(s) of a code block therefore won't continue the code block, with the following exception.
Text from code blocks is put into
<pre><code> HTML elements.
In addition to standard markdown indented code blocks, sgmljs.net also supports GitHub-flavored markdown-style fenced code blocks:
Fenced code block example:

    console.log('Code block goes here')
Fenced code blocks are started and ended using a line beginning with three tilde (~~~) or
backtick (```) characters.
An optional string following the three tilde or backtick characters in a fenced code block
start line is put (after sanitization) into the
class attribute of the produced
code element.
See also Code Block Examples.
Inline links are span-level elements beginning with a left square bracket,
followed by link text, followed by a right square bracket, followed
by a locator in regular parentheses. The locator consists of a URL,
optionally followed by a space character and a double-quoted link title.
The link title is what is placed into the
title attribute of the produced
<a> anchor element.
For example, the inline link
[a link](#) gets formatted into
<a href="#">a link</a>.
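As a rough illustration of the inline link syntax, the Python sketch below matches the bracketed link text and the parenthesized locator with a regular expression. The names `INLINE_LINK` and `render_inline_links` are hypothetical; the pattern does not handle nested brackets, escapes, or image links (a preceding `!`), and it is not how sgmljs.net parses links.

```python
import html
import re

# [link text](url "optional title")
INLINE_LINK = re.compile(r'\[([^\]]*)\]\(([^\s)]+)(?: "([^"]*)")?\)')


def render_inline_links(text: str) -> str:
    """Replace inline links in text with HTML anchor elements."""
    def to_anchor(match):
        link_text, url, title = match.groups()
        attrs = 'href="%s"' % html.escape(url)
        if title is not None:
            # The link title ends up in the title attribute
            attrs += ' title="%s"' % html.escape(title)
        return "<a %s>%s</a>" % (attrs, link_text)

    return INLINE_LINK.sub(to_anchor, text)
```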
See also Inline Link Examples.
An inline link having just a URL, without link text or link title,
can be written in a shortcut auto-link form -- a URL in angle brackets -- such as
<https://daringfireball.net/projects/markdown/> (which will
create a link to John Gruber's original markdown page, using the URL
in <code> tags as link text).
So that an auto-link at the beginning of a line isn't recognized as a
markup block or as inline markup, the initial
scheme: part of the auto-link must be one of the recognized schemes.
Note that, unlike some other markdown implementations, sgmljs.net
does not mangle email addresses
for protection against email address harvesting. Such functionality must be
implemented as a post-processing step using templating.
A mailto: auto-link may be written by just putting an email
address into angle brackets -- the
@ character is enough to make
sgmljs.net recognize this syntax as an auto-link.
In addition, sgmljs.net also supports auto-links with just a
: (colon) as the
scheme: part. Such auto-links are generated as
<a> elements having the scheme part omitted, such that browsers
take the scheme part from the document the link is placed in. This cannot
be expressed by merely leaving out the scheme part in standard markdown
because an auto-link is syntactically required to have a scheme part to
be recognized as such.
See also Auto-Link Examples.
Reference links consist of two parts: a link similar to an inline link (but without a locator), and a link definition elsewhere in the document.
A reference link, like an inline link, begins with a left square bracket followed by link text followed by a right square bracket. After that, reference links take a different form from inline links, and are completed by an optional space character, followed by a left square bracket, followed by the link id, followed by a right square bracket.
A reference link definition is a line of its own beginning with a pair of square brackets containing the link id that is being defined, followed by a colon, followed by a URI, optionally followed by a double-quoted link title. The URI may optionally be surrounded by '<' and '>' characters. Arbitrary whitespace may be placed between the colon and the URI, and between the URI and the title. The title may also be written on the next line.
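The shape of a reference link definition can be sketched with a regular expression in Python. This is a simplified illustration, not sgmljs.net's parser: the name `parse_link_definition` is hypothetical, and a title written on the following line is not handled.

```python
import re

# [id]: <uri> "optional title"   (angle brackets around the URI are optional)
LINK_DEFINITION = re.compile(
    r'^\[([^\]]+)\]:\s*<?([^\s>]+)>?(?:\s+"([^"]*)")?\s*$'
)


def parse_link_definition(line: str):
    """Return (link id, URI, title or None), or None if the line doesn't match."""
    match = LINK_DEFINITION.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)
```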
See also Reference Link Examples.
A short link (introduced by pandoc) has the same syntax as an external reference link, but with an empty, or omitted, second pair of square brackets. For a short link, the link id will act as the link text.
See also Short Link Examples.
An HTML image link can be inserted using the same syntax as an inline link
or as a reference link, with a
! character placed before the construct.
For example, this is an inline image "link":
An ![Image link](http://some.url/image.png)
And this is a reference-style image "link":
    An ![image link][imagelink]

    [imagelink]: <http://some.url>
See also Image Examples.
Text on lines beginning with the
> character, and any continuation
lines following it until a blank line, is converted
into HTML block quotes. For example, in the following text
two levels of block quoting are used:
    Block-quoted text following:

    > > Nested Block-quoted text
    >
    > Block-quoted text
Block quote nesting levels are generally determined by the number
of > characters at the beginning of a line.
With respect to the space and tab characters allowed to follow a > character
up to a subsequent
> character in nested block-quoted lines, the following steps are applied so that these
are recognized as block quote nesting rather than e.g. as a nested code block:
a single space (not tab) following
> isn't significant, and gets
discarded before the next step (but is significant when preceding
> on the line)
in further processing of
> characters, quadspace rules analogous to
those for sublist nesting are applied; however, two spaces rather than
four are taken as tab-equivalent (tabs themselves get recognized in the same
way as in normal quadspace indenting)
If there are tabs/quadspaces in place of a block quote character, markdown.pl and pandoc, like sgmljs.net, will treat this as a block-quoted code block; however, markdown.pl and pandoc will change the first tab or quad-/doublespace after block quote characters (and only that) into two space characters, whereas sgmljs.net leaves it as-is.
The reasoning behind markdown.pl's behavior seems to be an implementation detail: within a block quote, doublespaces rather than quadspaces are significant, and at the point in time markdown.pl recognizes the codeblock, it already has pruned away a doublespace in expectation of another block quote char; markdown.pl then compensates by prepending a doublespace before the code block, irrespective of what actual indentation marker was used
Note that whereas markdown.pl and pandoc change tabs into quadspaces in plain codeblocks, sgmljs.net also leaves those as-is; the reason is that some programming languages (e.g. Python) treat whitespace as significant tokens, and the assumption that a tab translates into four spaces (or two spaces in the case of block quoted code block), while it might work for Python, isn't appropriate for a markdown processor to make; for example, classic Makefile syntax wouldn't roundtrip through markdown.pl cleanly.
Note that a single space following a block quote character always gets discarded in the first step if present, even when no further block quote characters are following, so that subsequent processing of the block-quoted content never gets to see the space (this is what markdown.pl, pandoc, and sgmljs.net do); markdown.pl and pandoc seem to remove even more than a single space (detailed rules aren't documented, though); sgmljs.net doesn't do this, for the reasons already stated.
See Block Quote Examples for details.
A line consisting entirely of asterisk, hyphen, or underscore characters and having one of the following forms

    * * *
    - - -
    _ _ _
    ***
    ---
    ___

is formatted as an HTML
<hr> (horizontal rule) element.
See Ruler Examples for details.
As an extension to standard markdown syntax, sgmljs.net has limited support for producing tables in GitHub-flavored markdown style from markdown text such as the following:
    Header 1     | Header 2
    ------------ | --------
    Cell 11      | Cell 12
    Cell 21      | Cell 22
This will produce the following HTML output:
    <table><tr><th>Header 1</th><th>Header 2</th></tr>
    <tr><td>Cell 11</td><td>Cell 12</td></tr>
    <tr><td>Cell 21</td><td>Cell 22</td></tr></table>
sgmljs.net recognizes table cells when separated by the vertical bar character
|; vertical bar characters aren't treated as cell separators
in code blocks, markup blocks, backticked code spans, or when escaped.
An optional table row with all cells containing just hyphen (-) characters
(ignoring leading and trailing space characters in cells) separates
preceding header rows from subsequent body rows; further separator lines
aren't recognized as such and are output as-is into the result table.
If all cells of the first and the last table column are empty (such that
all table lines begin with and end in
|), then the
first and last columns are ignored for output to HTML.
Note also that double-backticked code spans (such as
considered in escaping vertical bar characters
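The basic table rules can be illustrated with a short Python sketch that turns the example table above into the HTML shown. It is only a sketch under simplifying assumptions, not sgmljs.net's implementation: the name `table_to_html` is hypothetical, and code spans, escaped bars, and empty first/last columns are not handled.

```python
def table_to_html(lines):
    """Convert simple bar-separated table lines into an HTML table string."""
    rows = [[cell.strip() for cell in line.split("|")] for line in lines]

    def is_separator(cells):
        # A separator row has only hyphens in every cell
        return all(cell and set(cell) == {"-"} for cell in cells)

    # Only the first separator row is recognized; later ones stay as data
    sep_index = next(
        (i for i, cells in enumerate(rows) if is_separator(cells)), None
    )
    out = []
    for i, cells in enumerate(rows):
        if i == sep_index:
            continue
        # Rows before the separator are header rows
        tag = "th" if sep_index is not None and i < sep_index else "td"
        out.append(
            "<tr>" + "".join("<%s>%s</%s>" % (tag, c, tag) for c in cells) + "</tr>"
        )
    return "<table>" + "\n".join(out) + "</table>"
```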
See also Table Examples.
As an extension to base markdown, sgmljs.net supports full SGML in markup blocks and inline markup. Among many other features (described in detail in SGML Syntax Reference), this brings support for SGML named entities as a straightforward variable expansion and text reuse technique to markdown.
Named entities are symbolic names which can be assigned a piece of markdown or markup text; once declared this way, a variable may be referenced, i.e. placed in running markdown text. sgmljs.net will replace a reference to a variable by the text assigned to it when producing output.
In SGML, entities are declared in a special piece of SGML at the beginning of the SGML file called a document type definition (DTD). We aren't going to discuss in detail what a DTD is here, but will just explain enough of it to understand entity declarations in a DTD and their use in markdown, so that this section is a self-contained description of SGML entities as used from markdown.
To declare an entity, a DTD must be placed as a markup block at the beginning of a file. For example, the following markdown text
    <!doctype html [
      <!entity my_variable_name "my replacement text">
    ]>
declares the type name of the document to be
html; the type name
must match the document element of the generated markup file; since
markdown is an abbreviated syntax for HTML, this will always be html
for the use cases explained in this chapter.
It also declares an entity named
my_variable_name having the replacement text
my replacement text; the replacement text contains everything
in double quotes.
With the above DTD at the beginning of the document, we can
reference the variable in subsequent markdown text as
&my_variable_name, and sgmljs.net will replace
&my_variable_name; by "my replacement text".
The & (ampersand) character starts an entity reference; the
; (semicolon) character is only necessary to delimit
the entity reference name from subsequent text, if the subsequent text
starts with characters that could be part of an entity reference name
(such as letters, digits or the dot, hyphen, or underscore character).
There's no need to know what a DTD or document type is if all that's desired
is basic assignment and replacement of entities; it's sufficient to assume it
has to be there and declares an
html doctype as in the example above.
An entity reference on a line of its own, when preceded by a blank line, will be processed using markdown rules. In other words, if a line would be parsed as a paragraph (or list continuation paragraph after a blank line), and contains a single entity reference, markdown syntax present in the replacement text for the entity reference will be expanded into HTML according to the rules in this markdown reference.
On the other hand, inline entity references (entity references for which the above condition isn't met) are expanded into markdown-produced HTML as-is, and no processing of markdown syntax is performed on them.
Replacement text for an entity can contain arbitrary markup such as elements and attributes.
Entities are declared in a line starting with
<!entity ... as shown
above; the declaration can also be supplied in uppercase (<!ENTITY ...).
A DTD can contain any number of entity declarations.
For example, the following DTD declares two entities:
    <!DOCTYPE html [
      <!ENTITY my_variable_name "my replacement text">
      <!ENTITY my_other_variable_name "another replacement text">
    ]>
Entity declarations usable from markdown (those without using further SGML features) take either of the following forms in the DTD:
<!ENTITY varname1 "replacement text">
declares varname1 as an "internal entity" and assigns the string
"replacement text" to it
<!ENTITY varname2 "replacement text referencing &varname1;">
declares varname2 as an "internal entity" and assigns the specified
replacement text to it, where
&varname1; is replaced by its
respective replacement text in turn.
Entity replacement text may reference other variables arbitrarily in this way; however, variable references used in entity replacement text must not reference themselves in a circular fashion (neither directly nor indirectly).
<!ENTITY varname3 SYSTEM "filename.txt">
declares varname3 as an external entity;
filename.txt may be any
local file path; file names are resolved relative to the
location of the document declaring the entity.
This may be useful for e.g. creating separate markdown text files for individual chapters of a larger text, and then include all the chapters into a master document; it may also be used to overcome markdown limitations with respect to nesting and indentation.
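The expansion of internal entities, including the rule against circular references, can be sketched in Python as follows. This is an illustration only and not how an SGML parser works internally; the names `ENTITY_REF` and `expand` are hypothetical, the name pattern is an assumption based on the characters listed above (letters, digits, dot, hyphen, underscore), and external (SYSTEM) entities are not covered.

```python
import re

# "&name" with an optional terminating semicolon
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9._-]*);?")


def expand(text, entities, _active=()):
    """Recursively replace entity references using the given declarations."""
    def replace(match):
        name = match.group(1)
        if name not in entities:
            # Undeclared references are left untouched in this sketch
            return match.group(0)
        if name in _active:
            raise ValueError("circular entity reference: " + name)
        # Expand references inside the replacement text in turn
        return expand(entities[name], entities, _active + (name,))

    return ENTITY_REF.sub(replace, text)
```

For example, with `varname1` and `varname2` declared as above, `expand("&varname2;", entities)` yields the replacement text of `varname2` with `&varname1;` already substituted.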
See also Markdown Entity Examples.