LOWDOWN(3)    FreeBSD Library Functions Manual    LOWDOWN(3)
lowdown —
simple markdown translator library
This library parses
lowdown(5)
into various output formats.
The library provides a high-level interface consisting of
lowdown_buf(3),
lowdown_buf_diff(3),
lowdown_file(3),
and
lowdown_file_diff(3).
The high-level functions interface with low-level functions that
perform parsing and formatting. These consist of
lowdown_doc_new(3),
lowdown_doc_parse(3),
and
lowdown_doc_free(3)
for parsing
lowdown(5)
documents into an abstract syntax tree.
The front-end functions for freeing, allocation, and rendering are
as follows.
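- HTML5: lowdown_html_new(3), lowdown_html_rndr(3), lowdown_html_free(3)
- gemini: lowdown_gemini_new(3), lowdown_gemini_rndr(3), lowdown_gemini_free(3)
- LaTeX: lowdown_latex_new(3), lowdown_latex_rndr(3), lowdown_latex_free(3)
- OpenDocument: lowdown_odt_new(3), lowdown_odt_rndr(3), lowdown_odt_free(3)
- roff: lowdown_nroff_new(3), lowdown_nroff_rndr(3), lowdown_nroff_free(3)
- UTF-8 ANSI terminal: lowdown_term_new(3), lowdown_term_rndr(3), lowdown_term_free(3)
- debugging: lowdown_tree_rndr(3)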
To compile and link, use
pkg-config(1):
% cc `pkg-config --cflags lowdown` -c -o sample.o sample.c
% cc -o sample sample.o `pkg-config --libs lowdown`
The lowdown library is built to operate in
security-sensitive environments, such as those using
pledge(2)
on OpenBSD. The only promise required is
stdio for
lowdown_file_diff(3)
and
lowdown_file(3):
both require access to the stream for reading input.
All lowdown functions use one or more of the following
structures.
The struct lowdown_opts structure manages
features. It has the following fields:
- unsigned int feat
- Features used during the parse. This bit-field may have the following bits
OR'd:
LOWDOWN_ATTRS
- Parse PHP extra link, header, and image attributes.
LOWDOWN_AUTOLINK
- Parse
http , https ,
ftp , mailto , and
relative links or link fragments.
LOWDOWN_COMMONMARK
- Tighten input parsing to the CommonMark specification. This also uses
the first ordered list value instead of starting all lists at one.
This feature is experimental and
incomplete.
LOWDOWN_DEFLIST
- Parse PHP extra definition lists. This is currently constrained to
single-key lists.
LOWDOWN_FENCED
- Parse GFM fenced (language-specific) code blocks.
LOWDOWN_FOOTNOTES
- Parse MMD style footnotes. This only supports the referenced footnote
style, not the “inline” style.
LOWDOWN_HILITE
- Parse highlighted sequences. This is disabled by default because such
sequences may be erroneously interpreted as section headers.
LOWDOWN_IMG_EXT
- Deprecated. Use
LOWDOWN_ATTRS instead.
LOWDOWN_MATH
- Parse mathematics equations.
LOWDOWN_METADATA
- Parse in-document MMD metadata. For the first paragraph to count as
meta-data, the first line must have a colon in it.
LOWDOWN_NOCODEIND
- Do not parse indented content as code blocks.
LOWDOWN_NOINTEM
- Do not parse emphasis within words.
LOWDOWN_STRIKE
- Parse strikethrough sequences.
LOWDOWN_SUPER
- Parse super-scripts. This accepts foo^bar, which puts the parts
following the caret until whitespace in superscripts; or foo^(bar),
which puts only the parts in parentheses.
LOWDOWN_TABLES
- Parse GFM tables.
LOWDOWN_TASKLIST
- Parse GFM task list items.
The default value is zero (none).
- unsigned int oflags
- Features used by the output generators. This bit-field may have the
following enabled. Note that bits are by definition specific to an output
type.
For LOWDOWN_HTML :
LOWDOWN_HTML_ESCAPE
- If
LOWDOWN_HTML_SKIP_HTML has not been set,
escapes in-document HTML so that it is rendered as opaque text.
LOWDOWN_HTML_HARD_WRAP
- Retain line-breaks within paragraphs.
LOWDOWN_HTML_HEAD_IDS
- Have an identifier written with each header element consisting of an
HTML-escaped version of the header contents.
LOWDOWN_HTML_OWASP
- When escaping text, be extra paranoid in following the OWASP
suggestions for which characters to escape.
LOWDOWN_HTML_NUM_ENT
- Convert, when possible, HTML entities to their numeric form. If not
set, the entities are used as given in the input.
LOWDOWN_HTML_SKIP_HTML
- Do not render in-document HTML at all.
For LOWDOWN_GEMINI , there are several
flags for controlling link placement. By default, links (images,
autolinks, and links) are queued when specified in-line then emitted in
a block sequence after the nearest block element.
LOWDOWN_GEMINI_LINK_END
- Emit the queue of links at the end of the document instead of after
the nearest block element.
LOWDOWN_GEMINI_LINK_IN
- Render all links within the flow of text. This will cause breakage
when links are nested, such as images within links, links in blockquotes,
etc. It should not be used except in carefully crafted documents.
LOWDOWN_GEMINI_LINK_NOREF
- Do not format link labels. Takes precedence over
LOWDOWN_GEMINI_LINK_ROMAN .
LOWDOWN_GEMINI_LINK_ROMAN
- When formatting link labels, use lower-case Roman numerals instead of
the default lowercase hexavigesimal (i.e., “a”,
“b”, ..., “aa”, “ab”,
...).
LOWDOWN_GEMINI_METADATA
- Print metadata as the canonicalised key followed by a colon then the
value, each on one line (newlines replaced by spaces). The metadata
block is terminated by a double newline. If there is no metadata, this
does nothing.
There may only be one of
LOWDOWN_GEMINI_LINK_END or
LOWDOWN_GEMINI_LINK_IN . If both are specified,
the latter is unset.
For LOWDOWN_FODT :
LOWDOWN_ODT_SKIP_HTML
- Do not render in-document HTML at all. Text within HTML elements
remains.
For LOWDOWN_LATEX :
LOWDOWN_LATEX_NUMBERED
- Use the default numbering scheme for sections, subsections, etc. If
not specified, these are inhibited.
LOWDOWN_LATEX_SKIP_HTML
- Do not render in-document HTML at all. Text within HTML elements
remains.
And for LOWDOWN_MAN and
LOWDOWN_NROFF :
LOWDOWN_NROFF_GROFF
- Use GNU extensions (i.e., for
groff(1))
when rendering output. The groff arguments must include
-m pdfmark for formatting
links with LOWDOWN_MAN or
-m spdf instead of
-m s for
LOWDOWN_NROFF . Applies to the
LOWDOWN_MAN and
LOWDOWN_NROFF output types.
LOWDOWN_NROFF_NUMBERED
- Use numbered sections if
LOWDOWN_NROFF_GROFF
is not specified. Only applies to the
LOWDOWN_NROFF output type.
LOWDOWN_NROFF_SKIP_HTML
- Do not render in-document HTML at all. Text within HTML elements
remains.
LOWDOWN_NROFF_SHORTLINK
- Render link URLs in short form. Applies to images, autolinks, and
regular links. Only in
LOWDOWN_MAN or when
LOWDOWN_NROFF_GROFF is not specified.
LOWDOWN_NROFF_NOLINK
- Don't show links at all if they have embedded text. Applies to images
and regular links. Only in
LOWDOWN_MAN or when
LOWDOWN_NROFF_GROFF is not specified.
For LOWDOWN_TERM :
LOWDOWN_TERM_NOANSI
- Don't apply ANSI style codes at all. This implies
LOWDOWN_TERM_NOCOLOUR .
LOWDOWN_TERM_NOCOLOUR
- Don't apply ANSI colour codes. This will still show underline, bold,
etc. This should not be used in difference mode, as the output will
make no sense.
LOWDOWN_TERM_NOLINK
- Don't show links at all. Applies to images and regular links:
autolinks are still shown. This may be combined with
LOWDOWN_TERM_SHORTLINK to also shorten
autolinks.
LOWDOWN_TERM_SHORTLINK
- Render link URLs in short form. Applies to images, autolinks, and
regular links. This may be combined with
LOWDOWN_TERM_NOLINK to only show shortened
autolinks.
For any mode, you may specify:
LOWDOWN_SMARTY
- Don't use smart typography formatting.
LOWDOWN_STANDALONE
- Emit a full document instead of a document fragment. This envelope is
largely populated from metadata if
LOWDOWN_METADATA was provided as an option or
as given in meta or
metaovr.
- size_t maxdepth
- The maximum parse depth before the parser exits. Most documents will have
a parse depth in the single digits.
- size_t cols
- For
LOWDOWN_TERM , the “soft limit”
for width of terminal output not including margins. If zero, 80 shall be
used.
- size_t hmargin
- For
LOWDOWN_TERM , the left margin (space
characters).
- size_t vmargin
- For
LOWDOWN_TERM , the top/bottom margin
(newlines).
- enum lowdown_type type
- May be set to
LOWDOWN_HTML for HTML5 output,
LOWDOWN_LATEX for LaTeX,
LOWDOWN_MAN for
-m an macros,
LOWDOWN_FODT for “flat”
OpenDocument, LOWDOWN_TERM for ANSI-compatible
UTF-8 terminal output, LOWDOWN_GEMINI for the
Gemini format, or LOWDOWN_NROFF for
-m s macros. The
LOWDOWN_TREE type causes a debug tree to be
written.
- struct lowdown_opts_odt odt
- If type is
LOWDOWN_FODT ,
this contains const char *sty, which is either
NULL or the OpenDocument styles used when creating
standalone documents. If NULL , the default styles
are used.
- char **meta
- An array of metadata key-value pairs or
NULL . Each
pair must appear as if provided on one line (or multiple lines) of the
input, including the terminating newline character. If not consisting of a
valid pair (e.g., no newline, no colon), then it is ignored. When
processed, these values are overridden by those in the document (if
LOWDOWN_METADATA is specified) or by those in
metaovr.
- size_t metasz
- Number of pairs in meta.
- char **metaovr
- See meta. The difference is that
metaovr is applied after meta
and in-document metadata, so it overrides prior values.
- size_t metaovrsz
- Number of pairs in metaovr.
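Several of the rules above (the 80-column default for cols, the exclusion between the two gemini link placements, and LOWDOWN_TERM_NOANSI implying LOWDOWN_TERM_NOCOLOUR) can be sketched in caller code as follows. This is only an illustration of the documented behaviour, not the library's implementation: struct ex_opts, the EX_ constants and their bit values, and ex_opts_normalise() are hypothetical stand-ins so that the fragment compiles without lowdown.h.

```c
#include <stddef.h>

/*
 * Hypothetical stand-ins for the lowdown definitions.  The bit values
 * are invented for this sketch; real code uses those in <lowdown.h>.
 */
enum ex_type { EX_GEMINI, EX_TERM };

#define EX_GEMINI_LINK_END	0x01u
#define EX_GEMINI_LINK_IN	0x02u
#define EX_TERM_NOANSI		0x01u
#define EX_TERM_NOCOLOUR	0x02u

struct ex_opts {
	enum ex_type	 type;		/* output type */
	unsigned int	 oflags;	/* output flags for that type */
	size_t		 cols;		/* terminal width, 0 for default */
};

/* Return a copy of o with the documented defaults and exclusions applied. */
struct ex_opts
ex_opts_normalise(struct ex_opts o)
{
	if (o.type == EX_GEMINI) {
		/* Only one placement may be set: LINK_IN is unset. */
		if ((o.oflags & EX_GEMINI_LINK_END) &&
		    (o.oflags & EX_GEMINI_LINK_IN))
			o.oflags &= ~EX_GEMINI_LINK_IN;
	} else if (o.type == EX_TERM) {
		/* Disabling ANSI styling implies disabling colours. */
		if (o.oflags & EX_TERM_NOANSI)
			o.oflags |= EX_TERM_NOCOLOUR;
		/* A zero width means 80 columns. */
		if (o.cols == 0)
			o.cols = 80;
	}
	return o;
}
```

The flags are checked per output type because, as noted above, the oflags bits are specific to an output type and may share values across types.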
Another common structure is struct
lowdown_metadata, which is used to hold parsed (and output-formatted)
metadata keys and values if LOWDOWN_METADATA was
provided as an input bit. This structure consists of the following
fields:
- char *key
- The metadata key in its lowercase, canonical form.
- char *value
- The metadata value as rendered in the current output format. This may be
an empty string.
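Since keys are stored in lowercase canonical form, a caller can look them up with a plain string comparison. The following is a minimal sketch under stated assumptions: struct ex_meta and ex_meta_find() are hypothetical, and an array is used for brevity where the library is assumed to hand back a queue (see lowdown_metaq_free(3)).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in carrying only the two documented fields. */
struct ex_meta {
	const char	*key;	/* lowercase, canonical key */
	const char	*value;	/* rendered value, possibly "" */
};

/*
 * Return the value stored for key, or NULL if the key is absent.
 * An empty value is a valid result and is distinct from NULL.
 */
const char *
ex_meta_find(const struct ex_meta *m, size_t msz, const char *key)
{
	size_t	 i;

	for (i = 0; i < msz; i++)
		if (strcmp(m[i].key, key) == 0)
			return m[i].value;
	return NULL;
}
```

Callers should distinguish a missing key (NULL) from a key whose value is the empty string, since the latter is permitted.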
The abstract syntax tree is encoded in struct
lowdown_node, which consists of the following.
- enum lowdown_rndrt type
- The node type. (Described below.)
- size_t id
- An identifier unique within the document. This can be used as a table
index since the number is assigned from a monotonically increasing point
during the parse.
- struct lowdown_node *parent
- The parent of the node, or
NULL at the root.
- enum lowdown_chng chng
- Change tracking: whether this node was inserted
(LOWDOWN_CHNG_INSERT), deleted
(LOWDOWN_CHNG_DELETE), or neither
(LOWDOWN_CHNG_NONE).
- struct lowdown_nodeq children
- A possibly-empty list of child nodes.
- <anon union>
- An anonymous union of type-specific structures. See below for a
description of each one.
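The parent pointer and child list make the tree easy to walk in either direction. The sketch below mirrors the documented fields in a hypothetical struct ex_node; the queue(3) TAILQ layout and the entries field name are assumptions, as are the helper functions, none of which are part of the library.

```c
#include <sys/queue.h>
#include <stddef.h>
#include <stdlib.h>

struct ex_node;
TAILQ_HEAD(ex_nodeq, ex_node);

/* Hypothetical stand-in for the documented node fields. */
struct ex_node {
	int			 type;		/* node type */
	size_t			 id;		/* unique, increasing */
	struct ex_node		*parent;	/* NULL at the root */
	struct ex_nodeq		 children;	/* possibly empty */
	TAILQ_ENTRY(ex_node)	 entries;	/* assumed field name */
};

/* Allocate a node with the next identifier and append it to parent. */
struct ex_node *
ex_node_new(struct ex_node *parent, int type, size_t *nextid)
{
	struct ex_node	*n;

	if ((n = calloc(1, sizeof(*n))) == NULL)
		return NULL;
	n->type = type;
	n->id = (*nextid)++;
	n->parent = parent;
	TAILQ_INIT(&n->children);
	if (parent != NULL)
		TAILQ_INSERT_TAIL(&parent->children, n, entries);
	return n;
}

/* Count the nodes in the subtree rooted at n, including n itself. */
size_t
ex_node_count(const struct ex_node *n)
{
	const struct ex_node	*c;
	size_t			 sum = 1;

	TAILQ_FOREACH(c, &n->children, entries)
		sum += ex_node_count(c);
	return sum;
}

/* Distance from n to the root, found by following parent pointers. */
size_t
ex_node_depth(const struct ex_node *n)
{
	size_t	 depth = 0;

	for (; n->parent != NULL; n = n->parent)
		depth++;
	return depth;
}

/* Free n and all of its descendants. */
void
ex_node_free(struct ex_node *n)
{
	struct ex_node	*c;

	while ((c = TAILQ_FIRST(&n->children)) != NULL) {
		TAILQ_REMOVE(&n->children, c, entries);
		ex_node_free(c);
	}
	free(n);
}
```

With a real parse tree, lowdown_doc_parse(3) builds the nodes and lowdown_doc_free(3) releases them; only the traversal functions would carry over.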
The nodes may be one of the following types, with default
rendering in HTML5 to illustrate functionality.
LOWDOWN_BLOCKCODE
- A block-level (and possibly language-specific) snippet of code. Described
by the
<pre><code> elements.
LOWDOWN_BLOCKHTML
- A block-level snippet of HTML. This is simply opaque HTML content. (Only
if configured during parse.)
LOWDOWN_BLOCKQUOTE
- A block-level quotation. Described by the
<blockquote> element.
LOWDOWN_CODESPAN
- A snippet of code. Described by the
<code>
element.
LOWDOWN_DOC_HEADER
- A header with data gathered from document metadata (if configured).
Described by the
<head> element. (Only if
configured during parse.)
LOWDOWN_DOUBLE_EMPHASIS
- Bold (or otherwise notable) content. Described by the
<strong> element.
LOWDOWN_EMPHASIS
- Italic (or otherwise notable) content. Described by the
<em> element.
LOWDOWN_ENTITY
- An HTML entity, which may either be named or numeric.
LOWDOWN_FOOTNOTE
- A footnote. (Only if configured during parse.)
LOWDOWN_HEADER
- A block-level header. Described (in the HTML case) by one of
<h1> through
<h6> .
LOWDOWN_HIGHLIGHT
- Marked text. Described by the
<mark>
element. (Only if configured during parse.)
LOWDOWN_HRULE
- A horizontal line. Described by
<hr> .
LOWDOWN_IMAGE
- An image. Described by the
<img>
element.
LOWDOWN_LINEBREAK
- A hard line-break within a block context. Described by the
<br> element.
LOWDOWN_LINK
- A link to external media. Described by the
<a> element.
LOWDOWN_LINK_AUTO
- Like
LOWDOWN_LINK , except inferred from text
content. Described by the <a> element. (Only
if configured during parse.)
LOWDOWN_LIST
- A block-level list enclosure. Described by
<ul> or
<ol> .
LOWDOWN_LISTITEM
- A block-level list item, always appearing within a
LOWDOWN_LIST . Described by
<li> .
LOWDOWN_MATH_BLOCK
- A block (or inline) of mathematical text in LaTeX format. Described within
\[xx\] or \(xx\) . This is
usually (in HTML) externally handled by a JavaScript renderer. (Only if
configured during parse.)
LOWDOWN_META
- Meta-data keys and values. (Only if configured during parse.) These are
described by elements in the
<head>
element.
LOWDOWN_NORMAL_TEXT
- Normal text content.
LOWDOWN_PARAGRAPH
- A block-level paragraph. Described by the
<p> element.
LOWDOWN_RAW_HTML
- An inline of raw HTML. (Only if configured during parse.)
LOWDOWN_ROOT
- The root of the document. This is always the topmost node, and the only
node where the parent field is
NULL .
LOWDOWN_STRIKETHROUGH
- Content struck through. Described by the
<del> element. (Only if configured during
parse.)
LOWDOWN_SUPERSCRIPT
- A superscript. Described by the
<sup>
element. (Only if configured during parse.)
LOWDOWN_TABLE_BLOCK
- A table block. Described by
<table> . (Only
if configured during parse.)
LOWDOWN_TABLE_BODY
- A table body section. Described by
<tbody> .
Parent is always LOWDOWN_TABLE_BLOCK . (Only if
configured during parse.)
LOWDOWN_TABLE_CELL
- A table cell. Described by
<td> or
<th> if in the header. Parent is always
LOWDOWN_TABLE_ROW . (Only if configured during
parse.)
LOWDOWN_TABLE_HEADER
- A table header section. Described by
<thead> . Parent is always
LOWDOWN_TABLE_BLOCK . (Only if configured during
parse.)
LOWDOWN_TABLE_ROW
- A table row. Described by
<tr> . Parent is
always LOWDOWN_TABLE_HEADER or
LOWDOWN_TABLE_BODY . (Only if configured during
parse.)
LOWDOWN_TRIPLE_EMPHASIS
- Combination of
LOWDOWN_EMPHASIS and
LOWDOWN_DOUBLE_EMPHASIS .
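The correspondence between node types and HTML5 elements listed above can be written as a lookup function. This is a sketch covering only a subset of the types; enum ex_rndrt and ex_html_element() are hypothetical names, and the enumeration values are not those of enum lowdown_rndrt.

```c
#include <stddef.h>

/* Hypothetical stand-in for a subset of enum lowdown_rndrt. */
enum ex_rndrt {
	EX_BLOCKQUOTE,
	EX_CODESPAN,
	EX_DOUBLE_EMPHASIS,
	EX_EMPHASIS,
	EX_HIGHLIGHT,
	EX_HRULE,
	EX_IMAGE,
	EX_LINEBREAK,
	EX_LINK,
	EX_NORMAL_TEXT,
	EX_PARAGRAPH,
	EX_STRIKETHROUGH,
	EX_SUPERSCRIPT
};

/*
 * Return the HTML5 element name that describes a node type, or NULL
 * for types (such as normal text) that have no enclosing element.
 */
const char *
ex_html_element(enum ex_rndrt type)
{
	switch (type) {
	case EX_BLOCKQUOTE:
		return "blockquote";
	case EX_CODESPAN:
		return "code";
	case EX_DOUBLE_EMPHASIS:
		return "strong";
	case EX_EMPHASIS:
		return "em";
	case EX_HIGHLIGHT:
		return "mark";
	case EX_HRULE:
		return "hr";
	case EX_IMAGE:
		return "img";
	case EX_LINEBREAK:
		return "br";
	case EX_LINK:
		return "a";
	case EX_PARAGRAPH:
		return "p";
	case EX_STRIKETHROUGH:
		return "del";
	case EX_SUPERSCRIPT:
		return "sup";
	default:
		return NULL;
	}
}
```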
The following anonymous union structures correspond to certain
nodes. Note that all buffers may be zero-length.
- rndr_autolink
- For
LOWDOWN_LINK_AUTO , the link address as
link and the link type type,
which may be one of HALINK_EMAIL for e-mail links
and HALINK_NORMAL otherwise. Any buffer may be
empty-sized.
- rndr_blockcode
- For
LOWDOWN_BLOCKCODE , the opaque
text of the block and the optional
lang of the code language.
- rndr_blockhtml
- For
LOWDOWN_BLOCKHTML , the opaque HTML
text.
- rndr_codespan
- For
LOWDOWN_CODESPAN, the opaque
text of the contents.
- rndr_definition
- For
LOWDOWN_DEFINITION , containing
flags that may be
HLIST_FL_BLOCK if the definition list should be
interpreted as containing block elements.
- rndr_entity
- For
LOWDOWN_ENTITY , the entity
text.
- rndr_header
- For
LOWDOWN_HEADER , the
level of the header starting at zero (this value is
relative to the metadata base header level, defaulting to one), optional
space-separated class list attr_cls, and optional
single identifier attr_id.
- rndr_image
- For
LOWDOWN_IMAGE , the image address
link, the image title title,
dimensions NxN (width by height) in dims, and
alternate text alt. CSS in-line style for width and
height may be given in attr_width and/or
attr_height, and a space-separated list of classes
may be in attr_cls and a single identifier may be in
attr_id.
- rndr_link
- Like rndr_autolink, but without a type and further
defining an optional link title title, optional
space-separated class list attr_cls, and optional
single identifier attr_id.
- rndr_list
- For
LOWDOWN_LIST , consists of a bitfield
flags that may be set to
HLIST_FL_ORDERED for an ordered list and
HLIST_FL_UNORDERED for an unordered one. If
HLIST_FL_BLOCK is set, the list should be output
as if items were separate blocks. The start value
for HLIST_FL_ORDERED is the starting list item
position, which is one by default and never zero.
- rndr_listitem
- For
LOWDOWN_LISTITEM , consists of a bitfield
flags that may be set to
HLIST_FL_ORDERED for an ordered list,
HLIST_FL_UNORDERED for an unordered list,
HLIST_FL_DEF for definition list data,
HLIST_FL_CHECKED or
HLIST_FL_UNCHECKED for an unordered
“task” list element, and/or
HLIST_FL_BLOCK for list item output as if
containing block elements. The HLIST_FL_BLOCK
should not be used: use the parent list (or definition list) flags for
this. The num is the index in a
HLIST_FL_ORDERED list. It is monotonically
increasing with each item in the list, starting at the
start variable given in struct
rndr_list.
- rndr_math
- For
LOWDOWN_MATH_BLOCK, the mode of display in
blockmode: if 1, in-line math; if 2, multi-line. The
opaque equation, which is assumed to be in LaTeX format, is in the opaque
text.
- rndr_meta
- Each
LOWDOWN_META key-value pair is represented.
The keys are lower-case without spaces or non-ASCII characters. If
provided, enclosed nodes may consist only of
LOWDOWN_NORMAL_TEXT and
LOWDOWN_ENTITY .
- rndr_normal_text
- The basic text content for
LOWDOWN_NORMAL_TEXT .
- rndr_paragraph
- For
LOWDOWN_PARAGRAPH, specifies how many
lines the paragraph has in the input file and
beoln, set to non-zero if the paragraph ends with an
empty line instead of a breaking block element.
- rndr_raw_html
- For
LOWDOWN_RAW_HTML , the opaque HTML
text.
- rndr_table
- For
LOWDOWN_TABLE_BLOCK , the number of
columns in each row or header row. The number of
columns in rndr_table,
rndr_table_header, and
rndr_table_cell are the same.
- rndr_table_cell
- For
LOWDOWN_TABLE_CELL , the current
col column number out of
columns. See rndr_table_header
for a description of the bits in flags. The number
of columns in rndr_table,
rndr_table_header, and
rndr_table_cell are the same.
- rndr_table_header
- For
LOWDOWN_TABLE_HEADER , the number of
columns in each row and the per-column
flags, which may be tested for equality against
HTBL_FL_ALIGN_LEFT ,
HTBL_FL_ALIGN_RIGHT , or
HTBL_FL_ALIGN_CENTER after being masked with
HTBL_FL_ALIGNMASK ; or
HTBL_FL_HEADER . If no alignment is specified after
the mask, the default should be left-aligned. The number of columns in
rndr_table, rndr_table_header,
and rndr_table_cell are the same.
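The alignment test described for rndr_table_header (mask first, then compare for equality, defaulting to left) can be sketched as below. The EX_ constants stand in for the HTBL_FL_ values and their numeric values are assumptions made for this example; ex_cell_align() and ex_cell_is_header() are hypothetical helpers.

```c
/*
 * Hypothetical stand-ins for the HTBL_FL_ constants.  The values are
 * assumed for this sketch; real code uses those in <lowdown.h>.
 */
#define EX_HTBL_FL_ALIGN_LEFT	1u
#define EX_HTBL_FL_ALIGN_RIGHT	2u
#define EX_HTBL_FL_ALIGN_CENTER	3u
#define EX_HTBL_FL_ALIGNMASK	3u
#define EX_HTBL_FL_HEADER	4u

/*
 * Return the alignment keyword for a column's flags.  The flags are
 * masked before comparison because the alignment values are tested
 * for equality and are not independent bits.
 */
const char *
ex_cell_align(unsigned int flags)
{
	switch (flags & EX_HTBL_FL_ALIGNMASK) {
	case EX_HTBL_FL_ALIGN_RIGHT:
		return "right";
	case EX_HTBL_FL_ALIGN_CENTER:
		return "center";
	case EX_HTBL_FL_ALIGN_LEFT:
	default:
		/* No alignment given: fall back to left. */
		return "left";
	}
}

/* Non-zero if the cell belongs to the table header. */
int
ex_cell_is_header(unsigned int flags)
{
	return (flags & EX_HTBL_FL_HEADER) != 0;
}
```

Testing a single alignment bit without the mask would misreport centred columns under these assumed values, since the centre value shares bits with left and right.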
lowdown(1),
lowdown_buf(3),
lowdown_buf_diff(3),
lowdown_diff(3),
lowdown_doc_free(3),
lowdown_doc_new(3),
lowdown_doc_parse(3),
lowdown_file(3),
lowdown_file_diff(3),
lowdown_gemini_free(3),
lowdown_gemini_new(3),
lowdown_gemini_rndr(3),
lowdown_html_free(3),
lowdown_html_new(3),
lowdown_html_rndr(3),
lowdown_latex_free(3),
lowdown_latex_new(3),
lowdown_latex_rndr(3),
lowdown_metaq_free(3),
lowdown_nroff_free(3),
lowdown_nroff_new(3),
lowdown_nroff_rndr(3),
lowdown_odt_free(3),
lowdown_odt_new(3),
lowdown_odt_rndr(3),
lowdown_term_free(3),
lowdown_term_new(3),
lowdown_term_rndr(3),
lowdown_tree_rndr(3),
lowdown(5)
lowdown was forked from
hoedown by
Kristaps Dzonsons,
kristaps@bsd.lv. It has been
considerably modified since.