The markdown package is built on top of commonmark. It renders Markdown to output formats supported by commonmark, and the primary output formats are HTML and LaTeX.
Historically, it used a C library named sundown, which was removed and replaced by commonmark in v1.3 (2022-10-30). The main advantage of commonmark is that it follows a clear and widely used spec, i.e., GFM (GitHub Flavored Markdown), which can be seen as a subset of Pandoc’s Markdown. Therefore the markdown package can be viewed as a small subset of the rmarkdown package (the latter is based on Pandoc, whereas the former doesn’t depend on external tools). It aims at simplicity, lightness, and speed, at the cost of giving up a lot of features. This package is intended for minimalists. Most users may want to use tools based on Pandoc instead, such as rmarkdown or Quarto.
For the full list of supported document elements, please read the GFM spec. Below is a quick summary:
- Headings start with a number of `#`’s, e.g., `## level-two heading`.
- Inline elements: `**strong**`, `_emphasis_`, `~~strikethrough~~`, `[text](link)`, and `![text](path)`.[^1]
- Inline code is written in a pair of backticks, e.g., `` `code` ``. Code blocks can be indented, or fenced by three backticks.
- List items start with `-`, `+`, or `*`, e.g., `- item`. A task list item is a regular list item with `[ ]` or `[x]` in the beginning, e.g., `- [ ] item`.
- Block quotes start with `>`.
- Tables are created with `|` as the column separator (i.e., Pandoc’s pipe table, which can be generated by `knitr::kable(x, "pipe")`).

In addition to GFM features, the markdown package also supports the following features.
Raw LaTeX and HTML blocks can be written as fenced code blocks with language names `=latex` (or `=tex`) and `=html`, e.g.,
```{=tex}
This only appears in \LaTeX{} output.
```
Raw LaTeX blocks will only appear in LaTeX output, and will be ignored in other output formats. Similarly, raw HTML blocks will only appear in HTML output. One exception is raw LaTeX blocks that are LaTeX math environments, which also work for HTML output (see the next section).
You can write both `$inline$` and `$$display$$` LaTeX math, e.g.,
$\sin^{2}(\theta)+\cos^{2}(\theta) = 1$
$$\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$$
$$|x| = \begin{cases} x &\text{if } x \geq 0 \\ -x &\text{if } x < 0 \end{cases}$$
LaTeX math environments are also supported, e.g., below are an `align` environment and an `equation` environment:
\begin{align}
a^{2}+b^{2} & = c^{2}\\
\sin^{2}(\theta)+\cos^{2}(\theta) & = 1
\end{align}
\begin{equation}
\begin{split}
(a+b)^2 &=(a+b)(a+b)\\
&=a^2+2ab+b^2
\end{split}
\end{equation}
These math environments can be written in raw LaTeX blocks, and they work for both LaTeX and HTML output, e.g.,
```{=latex}
\begin{align}
a^{2}+b^{2} & = c^{2}\\
\sin^{2}(\theta)+\cos^{2}(\theta) & = 1
\end{align}
```
For HTML output, it is up to the JavaScript library (MathJax or KaTeX) whether a math environment can be rendered.
Write superscripts in `^text^` and subscripts in `~text~` (same syntax as Pandoc’s Markdown), e.g., `2^10^` and `H~2~O`. Currently only alphanumeric characters, `*`, `(`, and `)` are allowed in the scripts. For example, `a^b c^` will not be recognized as a superscript (because the space is not allowed). Note that GFM supports striking out text via `~text~`, but this feature has been disabled and replaced by the feature of subscripts in markdown. To strike out text, you must use a pair of double tildes.
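The character rule above can be illustrated with a small base R sketch. The function `scripts_to_html()` is hypothetical: it only approximates the described behavior with regular expressions and is not the package's actual implementation.

```r
# Hypothetical helper (not part of the markdown package): converts ^text^ and
# ~text~ to <sup>/<sub>, allowing only alphanumeric characters, *, (, and ).
scripts_to_html <- function(x) {
  allowed <- "[[:alnum:]*()]+"
  # superscripts: ^text^
  x <- gsub(paste0("\\^(", allowed, ")\\^"), "<sup>\\1</sup>", x)
  # subscripts: ~text~, leaving double tildes (strikethrough) untouched
  gsub(paste0("(?<!~)~(", allowed, ")~(?!~)"), "<sub>\\1</sub>", x, perl = TRUE)
}

print(scripts_to_html("2^10^ and H~2~O"))
print(scripts_to_html("a^b c^"))      # contains a space, so it is left alone
print(scripts_to_html("~~strike~~"))  # double tildes are not subscripts
```

The lookarounds in the second pattern are one way to keep `~~text~~` available for strikethrough; other approaches would work equally well.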
Insert footnotes via `[^n]`, where `n` is a footnote number (a unique identifier). The footnote content should be defined in a separate block starting with `[^n]:`. For example:
Insert a footnote here.[^1]
[^1]: This is the footnote.
The support is limited for LaTeX output at the moment,[^2] and there are two caveats if the document is intended to be converted to LaTeX:
- The footnote content must be a single paragraph.
- Only numbers[^3] are supported as identifiers, and other types of identifiers are not recognized.

These two limitations do not apply to HTML output, e.g., footnotes can contain arbitrary elements and need not be a single paragraph.
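Footnote identifiers only need to be unique; in the output they are numbered in order of appearance. Below is a minimal base R sketch of that renumbering. The helper `renumber_footnotes()` is hypothetical, handles numeric identifiers only, and does not reflect how the package implements footnotes.

```r
# Hypothetical helper: map each distinct [^n] reference to [1], [2], ...
# in order of first appearance.
renumber_footnotes <- function(text) {
  refs <- regmatches(text, gregexpr("\\[\\^[0-9]+\\]", text))[[1]]
  ids <- unique(refs)  # unique() keeps the order of first appearance
  for (i in seq_along(ids)) {
    text <- gsub(ids[i], sprintf("[%d]", i), text, fixed = TRUE)
  }
  text
}

print(renumber_footnotes("First note.[^100] Second note.[^64] First again.[^100]"))
```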
Attributes on images, fenced code blocks, and section headings can be written in `{}`. For example, `![text](path){.foo #bar width="50%"}` will generate an `<img>` tag with attributes in HTML output:
<img src="path" alt="text" id="bar" class="foo" width="50%" />
and `## Heading {#baz}` will generate:
<h2 id="baz">Heading</h2>
For fenced code blocks, a special rule is that the first class name will be treated as the language name for a block, and the `class` attribute of the resulting `<code>` tag will have a `language-` prefix. For example, the following code block
```{.foo .bar #my-code style="color: red;"}
```
will generate the HTML output below:
<pre>
<code class="language-foo bar" id="my-code" style="color: red;">
</code>
</pre>
Most attributes in `{}` are ignored for LaTeX output except for:
- The `width` attribute for images, e.g., `![text](path){width="50%"}` will be converted to `\includegraphics[width=.5\linewidth]{path}`.
- The `.unnumbered` attribute, which will make a heading unnumbered, e.g., `# Hello {.unnumbered}` will be converted to `\section*{Hello}`.
- The `.appendix` attribute on a heading, which will start the appendix, e.g., `# Appendix {.appendix}` will be converted to `\appendix`.

When a top-level heading has the attribute `.appendix`, the rest of the document will be treated as the appendix. If section numbering is enabled, the appendix section headings will be numbered differently.
A fenced `Div` can be written in `:::` fences. Note that the opening fence must have at least one attribute, such as the class name. For example:
::: foo
This is a fenced Div.
:::
::: {.foo}
The syntax `::: foo` is equivalent to `::: {.foo}`.
:::
::: {.foo #bar style="color: red;"}
This div has more attributes.
It will be red in HTML output.
:::
A fenced `Div` will be converted to `<div>` with attributes in HTML output, e.g.,
<div class="foo" id="bar" style="color: red;">
</div>
For LaTeX output, it can be converted to a LaTeX environment if both the class name and an attribute `data-latex` are present. For example,
::: {.tiny data-latex=""}
This is _tiny_ text.
:::
will be converted to:
\begin{tiny}
This is \emph{tiny} text.
\end{tiny}
The `data-latex` attribute can be used to specify arguments to the environment (which can be an empty string if the environment doesn’t need an argument). For example,
::: {.minipage data-latex="{.5\linewidth}"}
will be converted to:
\begin{minipage}{.5\linewidth}
If a fenced `Div` doesn’t have the `data-latex` attribute, the fence will be ignored, and its content will be written out normally without a surrounding environment. If a fenced `Div` has multiple class names (e.g., `{.a .b .c}`), only the first class name will be used as the LaTeX environment name. However, all class names will be used if the output format is HTML (e.g., `<div class="a b c">`).
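The rules for LaTeX output of fenced `Div`s can be summarized in a short base R sketch. The function `div_to_latex()` and its arguments are hypothetical names used only for illustration; it mirrors the rules stated above rather than the package's internals.

```r
# Hypothetical helper: wrap already-converted LaTeX content in an environment
# named after the first class, but only when data-latex is present.
div_to_latex <- function(content, classes, data_latex = NULL) {
  # without data-latex (or without a class name), the fence is ignored
  if (is.null(data_latex) || length(classes) == 0) return(content)
  env <- classes[1]  # only the first class names the environment
  paste0("\\begin{", env, "}", data_latex, "\n", content, "\n\\end{", env, "}")
}

cat(div_to_latex("This is \\emph{tiny} text.", "tiny", ""), "\n")
cat(div_to_latex("Half width.", c("minipage", "b"), "{.5\\linewidth}"), "\n")
cat(div_to_latex("No environment.", "foo"), "\n")
```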
“Smart” HTML entities can be represented by ASCII characters, e.g., you can write fractions in the form `n/m`. Below are some example entities:
| 1/2 | 1/3 | 2/3 | 7/8 | 1/7 | 1/9 | 1/10 | (c) | (r) | (tm) |
|---|---|---|---|---|---|---|---|---|---|
| ½ | ⅓ | ⅔ | ⅞ | ⅐ | ⅑ | ⅒ | © | ® | ™ |
As mentioned earlier, a lot of features in Pandoc’s Markdown are not supported in the markdown package. Any feature that you find missing in previous sections is likely to be unavailable, such as citations and figure/table captions. In addition, a lot of R Markdown and Quarto features (both are based on Pandoc) are not supported either. Some HTML features may be implemented via JavaScript, but this is currently not straightforward and may be improved in the future.
Pandoc can convert Markdown to many output formats, such as Word, PowerPoint, LaTeX beamer, and EPUB. The markdown package is unlikely to support output formats beyond HTML and LaTeX.
The main function to convert Markdown to other formats is `markdown::mark()`; `mark_html()` and `mark_latex()` are simple wrapper functions for HTML and LaTeX output, respectively. The function `mark()` generates a document fragment by default, and the wrapper functions `mark_*()` generate full documents.
You can either call `markdown::mark()` to render a Markdown document programmatically, or click the Knit button in RStudio to render a (Markdown or R Markdown) document interactively. The latter requires you to specify the output format in the `output` field in YAML metadata (see the section “YAML metadata”), e.g.,
---
output:
  markdown::html_format:
    options:
      js_math:
        package: "katex"
        version: "0.16.4"
      number_sections: true
      embed_resources: ["local", "https"]
    meta:
      css: "custom.css"
---
The `options` argument of `mark()` can be used to enable/disable/set options to control Markdown rendering. This argument can take either a list, e.g., `list(toc = TRUE, smart = FALSE)`, or a character vector, e.g., `c("+toc", "-smart")`, or equivalently, `+toc-smart`, where `+` means to enable an option, and `-` means to disable an option. The options can also be set in YAML metadata (recommended). Available options are listed below.
auto_identifiers
Add automatic IDs to headings, e.g.,
# Hello world!
will be converted to
<h1 id="hello-world">Hello world!</h1>
You can override the automatic ID by providing an ID manually via the ID attribute, e.g.,
# Hello world! {#hello}
An automatic ID is generated by substituting non-alphanumeric characters in the heading text with hyphens. If the result is empty, the ID will be `section`. If any ID is duplicated, a numeric suffix will be added to the ID, e.g., `example_1` and `example_2`.
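The ID rules above can be approximated in a few lines of base R. This is only a sketch: the function name `auto_id()` is hypothetical, and details such as lowercasing, trimming leading/trailing hyphens, and leaving the first duplicate without a suffix are assumptions inferred from the `hello-world` example, not confirmed behavior of the package.

```r
# Hypothetical helper: derive heading IDs from heading text.
auto_id <- function(headings) {
  # replace runs of non-alphanumeric characters with a hyphen (assumed: lowercase)
  ids <- tolower(gsub("[^[:alnum:]]+", "-", headings))
  ids <- gsub("^-+|-+$", "", ids)  # assumed: trim leading/trailing hyphens
  ids[ids == ""] <- "section"      # fallback for empty results
  make.unique(ids, sep = "_")      # duplicates get numeric suffixes
}

print(auto_id(c("Hello world!", "Example", "Example", "Example", "???")))
```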
embed_resources
Embed resources (images, CSS, and JS) in the HTML output using their base64-encoded data (images) or raw content (CSS/JS). Possible values are:
- `null` or `false`: Do not embed any resources.
- `"local"` or `true`: Embed local image/CSS/JS files.
- `"https"`: Embed web resources (links that start with `https://`).
- `"all"`: An alias to the union of `"local"` and `"https"`.

The default is `"local"`, i.e., local resources are embedded, whereas `https` resources are not. This means the output document may not work offline. If you have to view the output offline, you need to use the option value `"https"` (or `"all"`) and render the document at least once before you go offline.
js_highlight
Specify the JavaScript library used to syntax-highlight code blocks. Possible values are `highlight` (highlight.js) and `prism` (Prism.js). The default is `prism`. This option can also take a list of the form `list(package, version, style, languages)`, which specifies the package name (`highlight` or `prism`), version, CSS style/theme name, and names of languages to be highlighted.
You can find information about Prism.js from its CDN at https://cdn.jsdelivr.net/npm/prismjs/. Available styles are under the `themes/` directory (e.g., `prism-dark`), and languages are under the `components/` directory (e.g., `prism-c`). You can omit the prefix `prism-`, e.g.,
js_highlight:
  package: prism
  style: dark
  languages: [r, latex, yaml]
The CDN of highlight.js is at https://cdn.jsdelivr.net/gh/highlightjs/cdn-release/build/. Themes are under the `styles/` directory (e.g., `github`), and you can find demos of themes at https://highlightjs.org/static/demo/. Supported languages are under the `languages/` directory (e.g., `latex`).
By default, languages are automatically detected and the required JS files are automatically loaded. Normally you need to specify the `languages` array only if the automatic detection fails.
Technically this option is a shorthand for setting the metadata variables `css` and `js`. If you want full control, you may disable this option (set it to `false` or `null`) and use metadata variables directly, which requires more familiarity with the JS libraries and the jsdelivr CDN.
js_math
Specify the JavaScript library for rendering math expressions in HTML output. Possible values are `"mathjax"` and `"katex"` (the default). Like the `js_highlight` option, this option is also essentially a shorthand for setting the metadata variables `css` and `js`.
- For MathJax, the `js` variable is set to `tex-mml-chtml.js`.
- For KaTeX, the `js` variable is set to `katex.min.js` and the `css` variable is set to `katex.min.css`. KaTeX’s `auto-render` extension (`auto-render.min.js`) is also enabled by default, so math expressions can be immediately rendered when the page is loaded.

If you want finer control, you can provide a list of the form `list(package, version, css, js)`. This will allow you to specify the package name, version, and css/js files. For example, if you want to use MathJax’s `tex-chtml.js` instead, you may set:
js_math:
  package: mathjax
  version: 3
  js: es5/tex-chtml.js
By default, MathJax version 3 is used. If you want to use the older v2, you may set:
js_math:
  package: mathjax
  version: 2
  js: MathJax.js?config=TeX-AMS-MML_CHTML
Please visit the MathJax CDN to know which versions and JS files are available.
For KaTeX, the version is not specified by default, which means the latest version from the CDN will be used. Below is an example of specifying the version 0.16.4 and using the `mhchem` extension:
js_math:
  package: katex
  version: 0.16.4
  js: [dist/katex.min.js, dist/contrib/mhchem.min.js]
Note that if you want the HTML output to be self-contained via the `embed_resources` option, KaTeX can be embedded and used offline, but MathJax cannot be fully embedded due to its complexity. MathJax v3 can be partially embedded and used offline, but currently only its fonts can be embedded, and extensions cannot. If you must view HTML output offline, we recommend using KaTeX, but please also note that KaTeX and MathJax do not fully cover each other’s features.
latex_math
Whether to identify LaTeX math expressions in pairs of single (`$ $`) or double dollar signs (`$$ $$`), and transform them so that they can be correctly rendered by MathJax (HTML output) or LaTeX.
number_sections
Whether to number section headings. To skip numbering a specific heading, add an attribute `{.unnumbered}` to it.
smartypants
Whether to translate certain ASCII strings into smart typographic characters (see `?markdown::smartypants`).
superscript
Whether to translate strings between two carets into superscripts, e.g., `text^foo^` to `text<sup>foo</sup>`.
subscript
Whether to translate strings between two tildes into subscripts, e.g., `text~foo~` to `text<sub>foo</sub>`.
toc
Whether to generate a table of contents (TOC) from section headings. If a heading has an `id` attribute, the corresponding TOC item will be a link to this heading. You can also set a sub-option:
- `depth`: The number of section levels to include in the TOC (`3` by default). Setting `toc` to `true` is equivalent to:
toc:
  depth: 3
top_level
The desired type of the top-level headings in LaTeX output. Possible values are `'chapter'` and `'part'`. For example, if `top_level = 'chapter'`, `# heading` will be rendered to `\chapter{heading}` instead of the default `\section{heading}`.
Options not described above can be found on the help pages of commonmark, e.g., the `hardbreaks` option is for the `hardbreaks` argument of `commonmark::markdown_*()` functions, and the `table` option is for the `table` extension in commonmark’s extensions.
markdown::markdown_options()
#> [1] "+auto_identifiers" "+autolink" "+embed_resources"
#> [4] "+js_highlight" "+js_math" "+latex_math"
#> [7] "+smart" "+smartypants" "+strikethrough"
#> [10] "+subscript" "+superscript" "+table"
#> [13] "+tasklist" "-hardbreaks" "-number_sections"
#> [16] "-tagfilter" "-toc"
# commonmark's arguments
opts = formals(commonmark::markdown_html)
opts = opts[setdiff(names(opts), c('text', 'extensions'))]
unlist(opts)
#> hardbreaks smart normalize sourcepos footnotes
#> FALSE FALSE FALSE FALSE FALSE
# commonmark's extensions
commonmark::list_extensions()
#> [1] "table" "strikethrough" "autolink" "tagfilter"
#> [5] "tasklist"
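As a rough illustration of the `+`/`-` option syntax described in this section, the base R sketch below turns such strings into a named list of logical values. The function `parse_options()` is a hypothetical name for illustration and is not the parser used by the package.

```r
# Hypothetical helper: "+toc-smart" or c("+toc", "-smart") -> named list
parse_options <- function(x) {
  x <- paste(x, collapse = "")
  # each option is a sign followed by an option name
  m <- regmatches(x, gregexpr("[+-][[:alnum:]_]+", x))[[1]]
  res <- as.list(substr(m, 1, 1) == "+")  # "+" enables, "-" disables
  names(res) <- substring(m, 2)
  res
}

str(parse_options("+toc-smart"))
str(parse_options(c("+toc", "-smart")))
```

Both calls should describe the same settings as `list(toc = TRUE, smart = FALSE)`.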
By default, `mark()` generates a document fragment (i.e., the body). To generate a full document, you need a template. Below is a simple HTML template example:
<html>
<head>
<title>$title$</title>
</head>
<body>
$body$
</body>
</html>
It contains two variables, `$title$` and `$body$`. All variables will be substituted by metadata values, except for `$body$`, which takes the value from `mark()`.
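The substitution described above can be sketched in base R as follows. The function `fill_template()` is hypothetical and only demonstrates the idea of replacing `$name$` placeholders; the package's own template handling may differ, e.g., in how unset variables are treated.

```r
# Hypothetical helper: replace $name$ placeholders with metadata values,
# and $body$ with the rendered document fragment.
fill_template <- function(template, body, meta = list()) {
  vars <- c(meta, list(body = body))  # body is substituted last
  for (name in names(vars)) {
    value <- paste(vars[[name]], collapse = "\n")
    template <- gsub(paste0("$", name, "$"), value, template, fixed = TRUE)
  }
  template
}

template <- paste(c(
  "<html>", "<head>", "<title>$title$</title>", "</head>",
  "<body>", "$body$", "</body>", "</html>"
), collapse = "\n")

cat(fill_template(template, "<p>Hello</p>", list(title = "My Title")), "\n")
```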
The markdown package provides default templates for HTML and LaTeX output. To use them, call `mark(..., template = TRUE)`, or the wrapper functions `mark_html()` / `mark_latex()`. To pass metadata to templates, use the `meta` argument, e.g.,
markdown::mark(..., meta = list(title = "My Title"), template = TRUE)
You can provide your own template file to the `template` argument, too.
Alternatively, the `meta` values can be read from YAML metadata in the Markdown document. The following variables can be set in the top-level fields in YAML:
- `author`: The document author(s).
- `date`: The date.
- `title`: The document title.

For example:
---
title: "My Title"
author: "[Frida Gomam](https://example.com)"
date: "2023-01-09"
---
Note that you can use Markdown syntax in them.
Other variables need to be specified under `output -> markdown::*_format -> meta`, where `*` can be `html` or `latex`, e.g.,
---
title: "My Title"
output:
  markdown::html_format:
    meta:
      css: "style.css"
      js: "script.js"
  markdown::latex_format:
    meta:
      documentclass: "book"
      header_includes: "\\usepackage{microtype}"
---
The following metadata variables are supported for both HTML and LaTeX templates:
- `header-includes`, `include-before`, `include-after`: Either a vector of (HTML/LaTeX) code or a code file to be included in the header, before the body, or after the body of the output.

Variables specific to the HTML template:
- `css`: A vector of CSS files to be included in the output. The default value is `markdown:::pkg_file('resources', 'default.css')`. If you want to use built-in CSS files in this package, you can specify just the base name, e.g., `default` means `default.css` in this package. You can also use web resources, e.g., `https://example.org/style.css`. One special case is jsdelivr resources: if a `css` value starts with `@`, it will be recognized as a jsdelivr.com resource. If you are not familiar with jsdelivr, you may read its documentation to understand the following example URLs. The shorthand syntax is as follows (`...` stands for `https://cdn.jsdelivr.net`):
  - `@foo` will be converted to `.../gh/rstudio/markdown/inst/resources/foo`, e.g., `@default` means `.../gh/rstudio/markdown/inst/resources/default.css`.
  - `@path/to/file` (i.e., a value that contains slashes) will be converted to `.../path/to/file`, e.g., `@npm/@xiee/utils/js/center-img.min.js` will be converted to `.../npm/@xiee/utils/js/center-img.min.js`.
  - `@path/to/file-1,file-2` (comma-separated values and later values do not contain slashes) will be converted to `.../combine/path/to/file-1,path/to/file-2` (this can be useful to combine multiple resources and load all at once).
  - `@path-1/to/file-1,path-2/to/file-2` (comma-separated values and later values contain slashes) will be converted to `.../combine/path-1/to/file-1,path-2/to/file-2`.

  This provides a way to reduce the output HTML file size by loading CSS from the web instead of embedding it inside HTML, at the cost of requiring an Internet connection when viewing the HTML file. If you need the external web resources to work after you go offline, you can enable `"https"` in the Markdown option `embed_resources` in advance to embed the resources.
- `js`: A vector of JavaScript files to be included in the output. The syntax is the same as the `css` variable, e.g., `snap` means `snap.js` in this package, and `@snap` means a “jsdelivr” resource.
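To make the four `@` shorthand rules concrete, here is a base R sketch that expands a shorthand value into a URL. The function `resolve_jsdelivr()` and its `ext` argument are hypothetical; in particular, appending the file extension to a bare name is an assumption based on the `@default` example, and the package's real implementation may differ.

```r
# Hypothetical helper: expand a jsdelivr shorthand such as "@default".
resolve_jsdelivr <- function(x, ext = ".css") {
  cdn <- "https://cdn.jsdelivr.net"
  files <- strsplit(sub("^@", "", x), ",", fixed = TRUE)[[1]]
  has_slash <- grepl("/", files, fixed = TRUE)
  if (!has_slash[1]) {
    # rule 1: a bare name refers to this package's resources (extension assumed)
    name <- if (endsWith(files[1], ext)) files[1] else paste0(files[1], ext)
    return(paste0(cdn, "/gh/rstudio/markdown/inst/resources/", name))
  }
  # rule 2: a single path is used as is
  if (length(files) == 1) return(paste0(cdn, "/", files))
  # rule 3: later values without slashes inherit the first value's directory
  files[!has_slash] <- paste0(dirname(files[1]), "/", files[!has_slash])
  # rules 3 and 4: multiple files are served via /combine/
  paste0(cdn, "/combine/", paste(files, collapse = ","))
}

print(resolve_jsdelivr("@default"))
print(resolve_jsdelivr("@path/to/file-1,file-2"))
print(resolve_jsdelivr("@path-1/to/file-1,path-2/to/file-2"))
```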
Variables specific to the LaTeX template:
- `classoption`: A string containing options for the document class.
- `documentclass`: The document class (by default, `article`).

Note that you can use either underscores or hyphens in the variable names. Underscores will be normalized to hyphens internally, e.g., `header_includes` will be converted to `header-includes`. This means if you use a custom template, you must use hyphens instead of underscores as separators in variable names in the template.
The above are variables supported in the default templates. If you use a custom template, you can use arbitrary variable names consisting of alphanumeric characters and hyphens, except for `$body$` (which is a reserved name), and your metadata values will be passed to these variables in your template.
Besides metadata variables, the aforementioned Markdown options can also be set in YAML under `output -> markdown::*_format -> options`, e.g.,
output:
  markdown::html_format:
    options:
      toc: true
      js_highlight:
        package: highlight
        style: github
        languages: [diff, latex]
See the help page `?markdown::html_format` for possible fields in addition to `meta` and `options` that can be specified under the format name, e.g.,
output:
  markdown::latex_format:
    latex_engine: xelatex
    keep_md: true
    template: custom-template.tex
The markdown package aims to be lightweight, with a minimal number of features. You can build lightweight applications on top of it. In this section, we introduce some example applications.
With an extra CSS file and a JS file, you can create lightweight HTML slides:
---
output:
  markdown::html_format:
    meta:
      css: [default, slides]
      js: [slides]
---
You can learn more in `vignette('slides', package = 'markdown')`.
Similarly, you can write an HTML article with extra CSS and JS. Learn more about it in `vignette('article', package = 'markdown')`.
You can load arbitrary external JS and CSS files via the `js` and `css` variables. There are numerous JS libraries and CSS frameworks on the web. Here we will only use the JS/CSS from the repo https://github.com/yihui/misc.js to show a few brief examples.
You can load the script `tabsets.js` and CSS `tabsets.css` to create tabsets from sections (see its documentation).
css: ["@npm/@xiee/utils/css/tabsets.min.css"]
js: ["@npm/@xiee/utils/js/tabsets.min.js"]
Code folding can be supported by `fold-details.js` (see its documentation).
js: ["@npm/@xiee/utils/js/fold-details.min.js"]
You can use the script `right-quote.js` to right-align a blockquote footer if it starts with an em-dash (`---`).
js: ["@npm/@xiee/utils/js/right-quote.min.js"]
The script `heading-anchor.js` adds anchors to section headings. The CSS is necessary only if you want to hide the anchors by default and reveal them on hover.
css: ["default", "@npm/@xiee/utils/css/heading-anchor.min.css"]
js: ["@npm/@xiee/utils/js/heading-anchor.min.js"]
The script `key-buttons.js` identifies keys and the CSS styles them, which can be useful for showing keyboard shortcuts.
css: ["default", "@npm/@xiee/utils/css/key-buttons.min.css"]
js: ["@npm/@xiee/utils/js/key-buttons.min.js"]
Of course, you can combine any number of JS scripts and CSS files if you want multiple features.
If you use the RStudio IDE, the Knit button can render your Markdown or R Markdown document to the output format specified in YAML (e.g., `markdown::html_format` or `markdown::latex_format`). This requires the rmarkdown package (>= v2.18), although the markdown package itself doesn’t really require rmarkdown.
If you only need to render a document to the HTML format, you can bypass RStudio’s requirement for rmarkdown:
1. Insert an HTML comment `<!-- rmarkdown v1 -->` to your document (anywhere in the body after YAML).
2. Set an option in your `.Rprofile`:
file.edit('~/.Rprofile')
options(rstudio.markdownToHTML = function(...) {
  markdown::mark_html(...)
})
Then restart R, and you will be able to use the Knit button to render your (R) Markdown document with the markdown package alone.
Since the Markdown syntax of markdown can be viewed as a small and strict subset of Pandoc’s Markdown, you can use RStudio’s visual Markdown editor to author documents. Please bear in mind that most common, but not all, Markdown features are supported.
When `https` resources need to be embedded (via the `embed_resources` option), only these elements are considered:
<img src="..." />
<link rel="stylesheet" href="...">
<script src="..."></script>
Background images set in the attribute `style="background-image: url(...)"` are also considered. If an external CSS file contains `url()` resources, these resources will also be downloaded and embedded.
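For illustration, the base R sketch below finds `https` URLs in the `src`/`href` attributes of the elements listed above. The function `find_https()` is hypothetical and uses a simple regular expression; it does not check `rel="stylesheet"` or handle `url()` values, and it is not the package's actual implementation.

```r
# Hypothetical helper: extract https URLs from a given tag/attribute pair.
find_https <- function(html, tag, attr) {
  re <- sprintf("<%s\\b[^>]*\\b%s=\"(https://[^\"]+)\"", tag, attr)
  m <- regmatches(html, gregexpr(re, html, perl = TRUE))[[1]]
  sub(re, "\\1", m, perl = TRUE)  # keep only the captured URL
}

html <- paste0(
  '<img src="https://example.org/a.png" />',
  '<link rel="stylesheet" href="https://example.org/s.css">',
  '<script src="local.js"></script>'
)

urls <- c(
  find_https(html, "img", "src"),
  find_https(html, "link", "href"),
  find_https(html, "script", "src")  # local file, so nothing is returned
)
print(urls)
```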
[^1]: Please note that for links and images, their URLs should not contain spaces. If they do, the URLs must be enclosed in `<>`.
[^2]: If you know C, I’ll truly appreciate it if you could help with the LaTeX implementation in GFM: https://github.com/github/cmark-gfm/issues/314
[^3]: The specific number doesn’t matter, as long as it’s a unique footnote number in the document. For example, the first footnote can be `[^100]` and the second can be `[^64]`. Eventually they will appear as `[1]` and `[2]`. If you use the RStudio visual editor to edit Markdown documents, the footnote numbers will be automatically generated and updated when new footnotes are inserted before existing footnotes.