cmark — LaTeX Renderer

Overview

The LaTeX renderer (latex.c) converts a cmark_node AST into LaTeX source, suitable for compilation with pdflatex, xelatex, or lualatex. It uses the generic render framework from render.c, operating through a per-character output callback (outc) and a per-node render callback (S_render_node).

Entry Point

char *cmark_render_latex(cmark_node *root, int options, int width);
  • root — AST root node
  • options — Option flags (CMARK_OPT_SOURCEPOS, CMARK_OPT_HARDBREAKS, CMARK_OPT_NOBREAKS, CMARK_OPT_UNSAFE)
  • width — Target line width for hard-wrapping; 0 disables wrapping

Character Escaping (outc)

The outc function handles per-character output decisions. It is the most complex part of the LaTeX renderer, with different behavior for three escaping modes:

static void outc(cmark_renderer *renderer, cmark_escaping escape,
                 int32_t c, unsigned char nextc);

LITERAL Mode

Pass-through: all characters are output unchanged.

NORMAL Mode

Extensive special-character handling:

Character LaTeX Output Purpose
$ \$ Math mode delimiter
% \% Comment character
& \& Table column separator
_ \_ Subscript operator
# \# Parameter reference
^ \^{} Superscript operator
{ \{ Group open
} \} Group close
~ \textasciitilde{} Non-breaking space
[ {[} Optional argument bracket
] {]} Optional argument bracket
\ \textbackslash{} Escape character
| \textbar{} Pipe
' \textquotesingle{} Straight single quote
" \textquotedbl{} Straight double quote
` | \textasciigrave{} Backtick
\xA0 (NBSP) ~ LaTeX non-breaking space
\x2014 (—) --- Em dash
\x2013 (–) -- En dash
\x2018 (') ` Left single quote
\x2019 (') ' Right single quote
\x201C (") `` Left double quote
\x201D (") '' Right double quote

URL Mode

Only these characters are escaped:

  • $\$
  • %\%
  • &\&
  • _\_
  • #\#
  • {\{
  • }\}

All other characters pass through unchanged.

Link Type Classification

The renderer classifies links into five categories:

typedef enum {
  NO_LINK,
  URL_AUTOLINK,
  EMAIL_AUTOLINK,
  NORMAL_LINK,
  INTERNAL_LINK,
} link_type;

get_link_type()

static link_type get_link_type(cmark_node *node) {
  // 1. "mailto:" links where text matches url
  // 2. "http[s]:" links where text matches url (with or without protocol)
  // 3. Links starting with '#' → INTERNAL_LINK
  // 4. Everything else → NORMAL_LINK
}

Detection logic:

  1. URL_AUTOLINK: The url starts with http:// or https://, the link has exactly one text child, and that child's content matches the URL (or matches the URL minus the protocol prefix).
  2. EMAIL_AUTOLINK: The url starts with mailto:, the link has exactly one text child, and that child's content matches the URL after mailto:.
  3. INTERNAL_LINK: The url starts with #.
  4. NORMAL_LINK: Everything else.

Enumeration Level

For nested ordered lists, the renderer selects the appropriate LaTeX counter style:

static int S_get_enumlevel(cmark_node *node) {
  int enumlevel = 0;
  cmark_node *tmp = node;
  while (tmp) {
    if (tmp->type == CMARK_NODE_LIST &&
        cmark_node_get_list_type(tmp) == CMARK_ORDERED_LIST) {
      enumlevel++;
    }
    tmp = tmp->parent;
  }
  return enumlevel;
}

This walks up the tree, counting ordered list ancestors. LaTeX ordered lists cycle through: enumi (arabic), enumii (alpha), enumiii (roman), enumiv (Alpha).

Node Rendering (S_render_node)

Block Nodes

Document

No output.

Block Quote

ENTER: \begin{quote}\n
EXIT:  \end{quote}\n

List

ENTER (bullet):  \begin{itemize}\n
ENTER (ordered): \begin{enumerate}\n
                 \def\labelenumI{COUNTER}\n  (if start != 1)
                 \setcounter{enumI}{START-1}\n
EXIT:            \end{itemize}\n or \end{enumerate}\n

The counter is formatted based on enumeration level:

  • Level 1: \arabic{enumi}.
  • Level 2: \alph{enumii}. (surrounded by ()
  • Level 3: \roman{enumiii}.
  • Level 4: \Alph{enumiv}.

Period delimiters use ., parenthesis delimiters use ).

Item

ENTER: \item{}   (empty braces prevent ligatures with following content)
EXIT:  \n

Heading

ENTER: \section{  or  \subsection{  or  \subsubsection{  or  \paragraph{  or  \subparagraph{
EXIT:  }\n

Mapping: level 1 → \section, level 2 → \subsection, level 3 → \subsubsection, level 4 → \paragraph, level 5 → \subparagraph.

Code Block

\begin{verbatim}
LITERAL CONTENT
\end{verbatim}

The content is output in LITERAL escape mode (no character escaping). Info strings are ignored.

HTML Block

ENTER: % raw HTML omitted\n  (as a LaTeX comment)

Raw HTML is always omitted in LaTeX output, regardless of CMARK_OPT_UNSAFE.

Thematic Break

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}\n

Paragraph

Same tight-list check as the HTML renderer:

parent = cmark_node_parent(node);
grandparent = cmark_node_parent(parent);
tight = (grandparent && grandparent->type == CMARK_NODE_LIST) ?
        grandparent->as.list.tight : false;
  • Normal: newline before and after
  • Tight: no leading/trailing blank lines

Inline Nodes

Text

Output with NORMAL escaping.

Soft Break

Depends on options:

  • CMARK_OPT_HARDBREAKS: \\\\\n
  • CMARK_OPT_NOBREAKS: space
  • Default: newline

Line Break

\\\\\n

Code (inline)

\texttt{ESCAPED CONTENT}

Special handling: Code content is output character-by-character with inline-code escaping. Special characters (\, {, }, $, %, &, _, #, ^, ~) are escaped.

Emphasis

ENTER: \emph{
EXIT:  }

Strong

ENTER: \textbf{
EXIT:  }

Link

Rendering depends on link type:

NORMAL_LINK:

ENTER: \href{URL}{
EXIT:  }

URL_AUTOLINK:

ENTER: \url{URL}
(children are skipped — no EXIT rendering needed)

EMAIL_AUTOLINK:

ENTER: \href{URL}{\nolinkurl{
EXIT:  }}

INTERNAL_LINK:

ENTER: (nothing — rendered as plain text)
EXIT:  (~\ref{LABEL})

Where LABEL is the URL with the leading # stripped.

NO_LINK: No output.

Image

ENTER: \protect\includegraphics{URL}

Image children (alt text) are skipped. If CMARK_OPT_UNSAFE is not set and the URL matches _scan_dangerous_url(), the URL is omitted.

HTML Inline

% raw HTML omitted

Always omitted, regardless of CMARK_OPT_UNSAFE.

Source Position Comments

When CMARK_OPT_SOURCEPOS is set, the renderer adds LaTeX comments before block elements:

snprintf(buffer, BUFFER_SIZE, "%% %d:%d-%d:%d\n",
         cmark_node_get_start_line(node), cmark_node_get_start_column(node),
         cmark_node_get_end_line(node), cmark_node_get_end_column(node));

Example Output

Markdown input:

# Hello World

A paragraph with *emphasis* and **bold**.

- Item 1
- Item 2

LaTeX output:

\section{Hello World}

A paragraph with \emph{emphasis} and \textbf{bold}.

\begin{itemize}
\item{}Item 1

\item{}Item 2

\end{itemize}

Cross-References

Was this handbook page helpful?

This page is part of the Project Tick Handbook, which is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. View full license details.
Last updated: April 18, 2026