cmark — Render Framework
Overview
The render framework (render.c, render.h) provides a generic rendering infrastructure used by three of the five renderers: LaTeX, man, and CommonMark. It handles line wrapping, prefix management, and character-level output dispatch. The HTML and XML renderers bypass this framework and write directly to buffers.
The cmark_renderer Structure
struct cmark_renderer {
cmark_mem *mem;
cmark_strbuf *buffer; // Output buffer
cmark_strbuf *prefix; // Current line prefix (e.g., "> " for blockquotes)
int column; // Current column position (for wrapping)
int width; // Target width (0 = no wrapping)
int need_cr; // Pending newlines count
bufsize_t last_breakable; // Position of last breakable point in buffer
bool begin_line; // True if at the start of a line
bool begin_content; // True if no content has been output on current line (after prefix)
bool no_linebreaks; // Suppress newlines (for rendering within attributes)
bool in_tight_list_item; // Currently inside a tight list item
void (*outc)(cmark_renderer *, cmark_escaping, int32_t, unsigned char);
// Per-character output callback
int32_t (*render_node)(cmark_renderer *, cmark_node *, cmark_event_type, int);
// Per-node render callback
};
Key Fields
- column: Tracks horizontal position for word-wrap decisions.
- width: If > 0, enables automatic line wrapping at word boundaries.
- prefix: Accumulated prefix string. For nested block quotes and list items, prefixes stack (e.g., "> - " for a list item inside a block quote).
- last_breakable: Buffer position of the last whitespace where a line break could be inserted. Used for retroactive line wrapping.
- begin_line: True immediately after a newline. Used by renderers to decide whether to escape line-start characters.
- begin_content: True until the first non-prefix content on a line. Distinguished from begin_line because the prefix itself isn't "content".
- no_linebreaks: When true, newlines are converted to spaces. Used when rendering content inside constructs that can't contain literal newlines.
Entry Point
char *cmark_render(cmark_mem *mem, cmark_node *root, int options, int width,
void (*outc)(cmark_renderer *, cmark_escaping, int32_t, unsigned char),
int32_t (*render_node)(cmark_renderer *, cmark_node *,
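cmark_event_type, int)) {
cmark_strbuf pref = CMARK_BUF_INIT(mem); // backing storage for the line prefix
cmark_strbuf buf = CMARK_BUF_INIT(mem); // backing storage for the output
cmark_renderer renderer = {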
mem,
&buf, // buffer
&pref, // prefix
0, // column
width, // width
0, // need_cr
0, // last_breakable
true, // begin_line
true, // begin_content
false, // no_linebreaks
false, // in_tight_list_item
outc, // outc
render_node // render_node
};
// ... iterate AST, call render_node for each event
return (char *)cmark_strbuf_detach(&buf);
}
The framework creates a cmark_renderer, iterates over the AST using cmark_iter, and calls the provided render_node function for each event. The outc callback handles per-character output with escaping decisions.
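The callback-driven shape described above can be illustrated with a small, self-contained sketch. It does not use cmark's types: `demo_driver`, `demo_step`, `demo_put`, `demo_render_step`, and `demo_render` are hypothetical stand-ins, and a flat array of enter/exit events takes the place of the AST iterator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for cmark_event_type and an iterator step. */
typedef enum { DEMO_ENTER, DEMO_EXIT } demo_event;
typedef struct { demo_event ev; const char *tag; } demo_step;

typedef struct demo_driver demo_driver;
struct demo_driver {
  char out[128];
  size_t size;
  void (*outc)(demo_driver *, char);                    /* per-character callback */
  int (*render_node)(demo_driver *, const demo_step *); /* per-event callback */
};

/* Per-character callback: append one byte, no escaping. */
void demo_put(demo_driver *d, char c) {
  assert(d->size + 1 < sizeof d->out);
  d->out[d->size++] = c;
  d->out[d->size] = '\0';
}

/* Per-event callback: "tag(" on enter, ")" on exit. */
int demo_render_step(demo_driver *d, const demo_step *s) {
  if (s->ev == DEMO_ENTER) {
    for (const char *p = s->tag; *p; p++)
      d->outc(d, *p);
    d->outc(d, '(');
  } else {
    d->outc(d, ')');
  }
  return 1;
}

/* Driver: store the callbacks, then call render_node once per event. */
const char *demo_render(demo_driver *d, const demo_step *steps, size_t n,
                        void (*outc)(demo_driver *, char),
                        int (*render_node)(demo_driver *, const demo_step *)) {
  d->size = 0;
  d->out[0] = '\0';
  d->outc = outc;
  d->render_node = render_node;
  for (size_t i = 0; i < n; i++)
    d->render_node(d, &steps[i]);
  return d->out;
}
```

The driver knows nothing about the output format; swapping in a different `render_node` or `outc` changes the rendering without touching the loop, which is the separation the framework relies on.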
Escaping Modes
typedef enum {
LITERAL, // No escaping — output characters as-is
NORMAL, // Full escaping for prose text
TITLE, // Escaping for link titles
URL, // Escaping for URLs
} cmark_escaping;
Each renderer's outc function switches on this enum to determine how to handle special characters.
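As a rough illustration of such a switch, the sketch below escapes one character at a time according to a mode. The enum `demo_escaping`, the functions `demo_outc` and `demo_escape`, and the escape rules themselves (backslash before `*`, `_`, and `\` in prose, backslash before quotes in titles, percent-encoding of spaces in URLs) are assumptions chosen for the example, not the rules of any actual cmark renderer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for cmark_escaping. */
typedef enum { DEMO_LITERAL, DEMO_NORMAL, DEMO_TITLE, DEMO_URL } demo_escaping;

/* Append c to out (capacity cap), escaped according to mode.
   Returns the new string length. */
size_t demo_outc(char *out, size_t len, size_t cap, demo_escaping mode, char c) {
  assert(len + 4 < cap); /* room for the longest escape ("%XX") plus NUL */
  switch (mode) {
  case DEMO_NORMAL: /* prose: backslash-escape a few markup characters */
    if (c == '*' || c == '_' || c == '\\')
      out[len++] = '\\';
    out[len++] = c;
    break;
  case DEMO_TITLE: /* titles: protect quotes and backslashes */
    if (c == '"' || c == '\\')
      out[len++] = '\\';
    out[len++] = c;
    break;
  case DEMO_URL: /* URLs: percent-encode spaces */
    if (c == ' ')
      len += (size_t)snprintf(out + len, cap - len, "%%%02X", (unsigned)c);
    else
      out[len++] = c;
    break;
  case DEMO_LITERAL: /* pass through unchanged */
  default:
    out[len++] = c;
    break;
  }
  out[len] = '\0';
  return len;
}

/* Escape a whole string with a single mode. */
void demo_escape(char *out, size_t cap, demo_escaping mode, const char *s) {
  size_t len = 0;
  out[0] = '\0';
  for (; *s; s++)
    len = demo_outc(out, len, cap, mode, *s);
}
```

The same input therefore produces different output depending on where it appears, which is why the mode travels with every call to the callback.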
Output Functions
cmark_render_code_point()
void cmark_render_code_point(cmark_renderer *renderer, int32_t c) {
cmark_utf8proc_encode_char(c, renderer->buffer);
renderer->column += 1;
}
Low-level: encodes a single Unicode codepoint as UTF-8 into the buffer and advances the column counter.
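One consequence worth spelling out is that the column advances by one per code point, not per byte. The sketch below shows this with a hand-written UTF-8 encoder; `demo_encode_utf8`, `demo_out`, and `demo_render_code_point` are hypothetical names, and the choice to silently skip invalid code points is an assumption of this example rather than a description of cmark_utf8proc_encode_char.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode code point c as UTF-8 into dst (>= 4 bytes).
   Returns the number of bytes written, or 0 for an invalid code point. */
size_t demo_encode_utf8(int32_t c, unsigned char *dst) {
  if (c < 0 || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return 0; /* out of range or surrogate */
  if (c < 0x80) {
    dst[0] = (unsigned char)c;
    return 1;
  }
  if (c < 0x800) {
    dst[0] = (unsigned char)(0xC0 | (c >> 6));
    dst[1] = (unsigned char)(0x80 | (c & 0x3F));
    return 2;
  }
  if (c < 0x10000) {
    dst[0] = (unsigned char)(0xE0 | (c >> 12));
    dst[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    dst[2] = (unsigned char)(0x80 | (c & 0x3F));
    return 3;
  }
  dst[0] = (unsigned char)(0xF0 | (c >> 18));
  dst[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
  dst[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
  dst[3] = (unsigned char)(0x80 | (c & 0x3F));
  return 4;
}

typedef struct {
  unsigned char bytes[64];
  size_t size; /* bytes written */
  int column;  /* code points written on the current line */
} demo_out;

/* Append one code point: size grows by 1 to 4 bytes, column by exactly 1. */
void demo_render_code_point(demo_out *o, int32_t c) {
  assert(o->size + 4 <= sizeof o->bytes);
  size_t n = demo_encode_utf8(c, o->bytes + o->size);
  if (n == 0)
    return; /* assumption: skip invalid input in this sketch */
  o->size += n;
  o->column += 1;
}
```

After writing `a`, U+00E9, and U+20AC the buffer holds six bytes while the column is three, so wrapping decisions are based on characters rather than storage size.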
cmark_render_ascii()
void cmark_render_ascii(cmark_renderer *renderer, const char *s) {
int len = (int)strlen(s);
cmark_strbuf_puts(renderer->buffer, s);
renderer->column += len;
}
Outputs an ASCII string and advances the column counter. Used for fixed escape sequences like \&, \textbf{, etc.
S_out() — Main Output Dispatcher
static CMARK_INLINE void S_out(cmark_renderer *renderer, const char *source,
bool wrap, cmark_escaping escape) {
int length = (int)strlen(source);
unsigned char nextc;
int32_t c;
int i = 0;
int len;
cmark_chunk remainder = cmark_chunk_literal("");
int k = renderer->buffer->size - 1;
wrap = wrap && !renderer->no_linebreaks;
if (renderer->need_cr) {
// Output pending newlines
while (renderer->need_cr > 0) {
S_cr(renderer);
renderer->need_cr--;
}
}
while (i < length) {
if (renderer->begin_line) {
// Output prefix at start of each line
cmark_strbuf_puts(renderer->buffer, (char *)renderer->prefix->ptr);
renderer->column = renderer->prefix->size;
renderer->begin_line = false;
renderer->begin_content = true;
}
len = cmark_utf8proc_charlen((uint8_t *)source + i, length - i);
if (len == -1) { // Invalid UTF-8
// ... handle error
}
cmark_utf8proc_iterate((uint8_t *)source + i, len, &c);
nextc = (unsigned char)source[i + len]; // lookahead byte for outc (0 at end of string)
if (c == 10) {
// Newline
cmark_strbuf_putc(renderer->buffer, '\n');
renderer->column = 0;
renderer->begin_line = true;
renderer->begin_content = true;
renderer->last_breakable = 0;
} else if (wrap) {
if (c == 32 && renderer->column > renderer->width / 2) {
// Space past half-width — mark as potential break point
renderer->last_breakable = renderer->buffer->size;
cmark_render_code_point(renderer, c);
} else if (renderer->column > renderer->width &&
renderer->last_breakable > 0) {
// Past target width with a break point — retroactively break
// Replace the space at last_breakable with newline + prefix
// ...
} else {
renderer->outc(renderer, escape, c, nextc);
}
} else {
renderer->outc(renderer, escape, c, nextc);
}
if (c != 10) {
renderer->begin_content = false;
}
i += len;
}
}
This is the core output function. It:
- Handles deferred newlines (need_cr)
- Outputs line prefixes at the start of each line
- Tracks column position
- Implements word wrapping via retroactive line breaks
- Delegates character-level escaping to renderer->outc()
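The prefix and column bookkeeping in the list above can be isolated in a few lines. This is a simplified sketch, not the framework code: `demo_line_renderer` and `demo_line_out` are hypothetical, there is no wrapping or escaping, and the prefix is a plain C string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  char out[256];
  size_t size;
  const char *prefix; /* emitted lazily at the start of each line */
  int column;
  bool begin_line;
} demo_line_renderer;

/* Append s, writing the prefix before the first character of every line. */
void demo_line_out(demo_line_renderer *r, const char *s) {
  size_t plen = strlen(r->prefix);
  for (; *s; s++) {
    if (r->begin_line) {
      assert(r->size + plen < sizeof r->out);
      memcpy(r->out + r->size, r->prefix, plen);
      r->size += plen;
      r->column = (int)plen; /* the prefix counts toward the column */
      r->begin_line = false;
    }
    assert(r->size + 1 < sizeof r->out);
    r->out[r->size++] = *s;
    if (*s == '\n') {
      r->column = 0;
      r->begin_line = true; /* next character triggers the prefix again */
    } else {
      r->column++;
    }
  }
  r->out[r->size] = '\0';
}
```

Because the prefix is written only when the next character arrives, a trailing newline does not leave a dangling prefix at the end of the output.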
Line Wrapping Algorithm
The wrapping algorithm uses a retroactive break strategy:
- As text flows through S_out(), spaces past the half-width mark are recorded as potential break points (last_breakable).
- When the column exceeds width, the buffer is split at last_breakable:
  - Everything after the break point is saved in remainder
  - A newline and the current prefix are inserted at the break point
  - The remainder is reappended
This avoids lookahead: the renderer doesn't need to know the length of upcoming content to decide where to break.
// Retroactive line break:
remainder = cmark_chunk_dup(&renderer->buffer->..., last_breakable, ...);
cmark_strbuf_truncate(renderer->buffer, last_breakable);
cmark_strbuf_putc(renderer->buffer, '\n');
cmark_strbuf_puts(renderer->buffer, (char *)renderer->prefix->ptr);
cmark_strbuf_put(renderer->buffer, remainder.data, remainder.len);
renderer->column = renderer->prefix->size + remainder.len;
renderer->last_breakable = 0;
renderer->begin_line = false;
renderer->begin_content = false;
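To make the retroactive strategy concrete, here is a self-contained sketch that applies the same idea to a plain character buffer. It is a simplification under stated assumptions: `demo_wrap` is a hypothetical function, the text is ASCII, the breakable space is dropped when the line is split, and a fixed-size scratch array plays the role of remainder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Wrap text to about `width` columns, writing the result into out.
   Break points are spaces seen past the half-width mark. */
void demo_wrap(const char *text, int width, const char *prefix,
               char *out, size_t cap) {
  size_t plen = strlen(prefix);
  size_t size = 0;           /* bytes written so far */
  size_t last_breakable = 0; /* index of the last breakable space, 0 = none */
  int column;
  char remainder[128];

  assert(width > 0);
  assert(strlen(text) * (plen + 2) + plen + 1 < cap); /* generous bound */

  memcpy(out, prefix, plen); /* prefix of the first line */
  size = plen;
  column = (int)plen;

  for (const char *p = text; *p; p++) {
    if (*p == ' ' && column > width / 2)
      last_breakable = size; /* remember where this space will land */
    out[size++] = *p;
    column++;

    if (column > width && last_breakable > 0) {
      /* Save everything after the space, cut the buffer at the space,
         then write newline + prefix + saved text. */
      size_t rlen = size - last_breakable - 1;
      assert(rlen < sizeof remainder);
      memcpy(remainder, out + last_breakable + 1, rlen);
      size = last_breakable;
      out[size++] = '\n';
      memcpy(out + size, prefix, plen);
      size += plen;
      memcpy(out + size, remainder, rlen);
      size += rlen;
      column = (int)(plen + rlen);
      last_breakable = 0;
    }
  }
  out[size] = '\0';
}
```

Wrapping "one two three" at width 10 with an empty prefix first writes "one two thr", notices the overflow, and only then goes back to turn the space before "thr" into a newline. A word with no earlier break point is simply left to overflow.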
Convenience Functions
CR()
#define CR() renderer->need_cr = 1
Requests a newline before the next content output. Multiple CR() calls don't stack — only one newline is inserted.
BLANKLINE()
#define BLANKLINE() renderer->need_cr = 2
Requests a blank line (two newlines) before the next content output.
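The deferred-newline behaviour of these two macros can be reproduced in isolation. In the sketch below, `demo_renderer`, `DEMO_CR`, `DEMO_BLANKLINE`, and `demo_lit` are hypothetical names that mirror the assignment-style macros quoted above; the flush loop is a simplified stand-in for what S_out() does with need_cr.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  char out[128];
  size_t size;
  int need_cr; /* pending newline count, flushed before the next output */
} demo_renderer;

/* Assignment-style requests: they record intent, they do not write. */
#define DEMO_CR(r) ((r)->need_cr = 1)
#define DEMO_BLANKLINE(r) ((r)->need_cr = 2)

/* Flush pending newlines, then append s. */
void demo_lit(demo_renderer *r, const char *s) {
  size_t n = strlen(s);
  assert(r->need_cr >= 0);
  assert(r->size + (size_t)r->need_cr + n < sizeof r->out);
  while (r->need_cr > 0) {
    r->out[r->size++] = '\n';
    r->need_cr--;
  }
  memcpy(r->out + r->size, s, n);
  r->size += n;
  r->out[r->size] = '\0';
}
```

Calling `DEMO_CR` twice before the next output still yields a single newline, and a request that is never followed by content produces nothing. With assignment-style macros like these, the most recent request wins, so a CR issued after a BLANKLINE would reduce the pending count back to one.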
OUT()
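#define OUT(s, wrap, escaping) (S_out(renderer, s, wrap, escaping))
General-purpose output: forwards to S_out() with the caller's choice of wrapping and escaping mode.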
LIT()
#define LIT(s) (S_out(renderer, s, false, LITERAL))
Output literal text (no escaping, no wrapping).
NOBREAKS()
#define NOBREAKS(s) \
do { renderer->no_linebreaks = true; OUT(s, false, NORMAL); renderer->no_linebreaks = false; } while(0)
Output text with normal escaping but with newlines suppressed (converted to spaces).
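The effect of the flag can be shown with a minimal sketch. The names `demo_nb_renderer`, `demo_nb_out`, and `demo_nobreaks` are hypothetical, and escaping is left out so that only the newline handling is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  char out[128];
  size_t size;
  bool no_linebreaks; /* when set, '\n' is written as ' ' */
} demo_nb_renderer;

/* Append s, replacing newlines with spaces while the flag is set. */
void demo_nb_out(demo_nb_renderer *r, const char *s) {
  assert(r->size + strlen(s) < sizeof r->out);
  for (; *s; s++)
    r->out[r->size++] = (r->no_linebreaks && *s == '\n') ? ' ' : *s;
  r->out[r->size] = '\0';
}

/* Same shape as the macro: set the flag, output, clear the flag. */
void demo_nobreaks(demo_nb_renderer *r, const char *s) {
  r->no_linebreaks = true;
  demo_nb_out(r, s);
  r->no_linebreaks = false;
}
```

Note that the flag is cleared unconditionally afterwards, so this pattern does not nest: an outer caller that had already set the flag would find it reset.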
Prefix Management
Prefixes are used for block-level indentation. The renderer maintains a cmark_strbuf prefix that is output at the start of each line.
Usage Pattern
// In commonmark.c, entering a block quote:
cmark_strbuf_puts(renderer->prefix, "> ");
// ... render children ...
// On exit:
cmark_strbuf_truncate(renderer->prefix, original_prefix_len);
Renderers save the prefix length before modifying it and restore it on exit. This creates a stack-like behavior for nested containers.
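The save-and-truncate pattern behaves like a stack even though the prefix is a single flat buffer. A minimal sketch, using a hypothetical fixed-size `demo_prefix` type in place of cmark_strbuf:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  char data[64];
  size_t size;
} demo_prefix;

/* Append s to the prefix and return the length before the append. */
size_t demo_prefix_push(demo_prefix *p, const char *s) {
  size_t saved = p->size;
  size_t n = strlen(s);
  assert(p->size + n < sizeof p->data);
  memcpy(p->data + p->size, s, n + 1); /* copies the NUL as well */
  p->size += n;
  return saved;
}

/* Truncate the prefix back to a previously saved length. */
void demo_prefix_restore(demo_prefix *p, size_t saved) {
  assert(saved <= p->size);
  p->size = saved;
  p->data[saved] = '\0';
}
```

Each container only needs to remember one integer, and restoring in reverse order of entry unwinds nested prefixes correctly.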
Framework vs Direct Rendering
| Feature | Framework (render.c) | Direct (html.c, xml.c) |
|---|---|---|
| Line wrapping | Yes (width parameter) | No |
| Prefix management | Yes (automatic) | No (uses HTML tags) |
| Per-char escaping | Via outc callback | Via escape_html() helper |
| Column tracking | Yes | No |
| Break points | Retroactive insertion | N/A |
| cmark_escaping enum | Yes | No |
Which Renderers Use the Framework
| Renderer | Uses Framework | Why/Why Not |
|---|---|---|
| LaTeX (latex.c) | Yes | Needs wrapping for structured text |
| man (man.c) | Yes | Needs wrapping for terminal display |
| CommonMark (commonmark.c) | Yes | Needs wrapping and prefix management |
| HTML (html.c) | No | HTML handles layout via browser |
| XML (xml.c) | No | XML output is structural, not visual |
Cross-References
- render.c: Framework implementation
- render.h: cmark_renderer struct and cmark_escaping enum
- latex-renderer.md: LaTeX outc and S_render_node
- man-renderer.md: Man S_outc and S_render_node
- commonmark-renderer.md: CommonMark outc and S_render_node
- html-renderer.md: Direct renderer (no framework)