summaryrefslogtreecommitdiffstats
path: root/docs/using.md
blob: 83872037cb7f518367490b4fd95ae1a465491783 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
---
jupytext:
  formats: ipynb,md:myst
  text_representation:
    extension: .md
    format_name: myst
    format_version: '0.8'
    jupytext_version: 1.4.2
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---

# Using `markdown_it`

> This document can be opened to execute with [Jupytext](https://jupytext.readthedocs.io)!

markdown-it-py may be used as an API *via* the [`markdown-it-py`](https://pypi.org/project/markdown-it-py/) package.

The raw text is first parsed to syntax 'tokens',
then these are converted to other formats using 'renderers'.

+++

## Quick-Start

The simplest way to understand how text will be parsed is using:

```{code-cell} python
from pprint import pprint
from markdown_it import MarkdownIt
```

```{code-cell} python
md = MarkdownIt()
md.render("some *text*")
```

```{code-cell} python
for token in md.parse("some *text*"):
    print(token)
    print()
```

## The Parser

+++

The `MarkdownIt` class is instantiated with parsing configuration options,
dictating the syntax rules and additional options for the parser and renderer.
You can define this configuration *via* directly supplying a dictionary or a preset name:

- `zero`: This configures the minimum components to parse text (i.e. just paragraphs and text)
- `commonmark` (default): This configures the parser to strictly comply with the [CommonMark specification](http://spec.commonmark.org/).
- `js-default`: This is the default in the JavaScript version.
  Compared to `commonmark`, it disables HTML parsing and enables the table and strikethrough components.
- `gfm-like`: This configures the parser to approximately comply with the [GitHub Flavored Markdown specification](https://github.github.com/gfm/).
  Compared to `commonmark`, it enables the table, strikethrough and linkify components.
  **Important**, to use this configuration you must have `linkify-it-py` installed.

```{code-cell} python
from markdown_it.presets import zero
zero.make()
```

```{code-cell} python
md = MarkdownIt("zero")
md.options
```

You can also override specific options:

```{code-cell} python
md = MarkdownIt("zero", {"maxNesting": 99})
md.options
```

```{code-cell} python
pprint(md.get_active_rules())
```

You can find all the parsing rules in the source code:
`parser_core.py`, `parser_block.py`,
`parser_inline.py`.

```{code-cell} python
pprint(md.get_all_rules())
```

Any of the parsing rules can be enabled/disabled, and these methods are "chainable":

```{code-cell} python
md.render("- __*emphasise this*__")
```

```{code-cell} python
md.enable(["list", "emphasis"]).render("- __*emphasise this*__")
```

You can temporarily modify rules with the `reset_rules` context manager.

```{code-cell} python
with md.reset_rules():
    md.disable("emphasis")
    print(md.render("__*emphasise this*__"))
md.render("__*emphasise this*__")
```

Additionally `renderInline` runs the parser with all block syntax rules disabled.

```{code-cell} python
md.renderInline("__*emphasise this*__")
```

### Typographic components

The `smartquotes` and `replacements` components are intended to improve typography:

`smartquotes` will convert basic quote marks to their opening and closing variants:

- 'single quotes' -> ‘single quotes’
- "double quotes" -> “double quotes”

`replacements` will replace particular text constructs:

- ``(c)``, ``(C)`` → ©
- ``(tm)``, ``(TM)`` → ™
- ``(r)``, ``(R)`` → ®
- ``(p)``, ``(P)`` → §
- ``+-`` → ±
- ``...`` → …
- ``?....`` → ?..
- ``!....`` → !..
- ``????????`` → ???
- ``!!!!!`` → !!!
- ``,,,`` → ,
- ``--`` → &ndash
- ``---`` → &mdash

Both of these components require typography to be turned on, as well as the components enabled:

```{code-cell} python
md = MarkdownIt("commonmark", {"typographer": True})
md.enable(["replacements", "smartquotes"])
md.render("'single quotes' (c)")
```

### Linkify

The `linkify` component requires that [linkify-it-py](https://github.com/tsutsu3/linkify-it-py) be installed (e.g. *via* `pip install markdown-it-py[linkify]`).
This allows URI autolinks to be identified, without the need for enclosing in `<>` brackets:

```{code-cell} python
md = MarkdownIt("commonmark", {"linkify": True})
md.enable(["linkify"])
md.render("github.com")
```

### Plugins load

Plugins load collections of additional syntax rules and render methods into the parser.
A number of useful plugins are available in [`mdit_py_plugins`](https://github.com/executablebooks/mdit-py-plugins) (see [the plugin list](./plugins.md)),
or you can create your own (following the [markdown-it design principles](./architecture.md)).

```{code-cell} python
from markdown_it import MarkdownIt
import mdit_py_plugins
from mdit_py_plugins.front_matter import front_matter_plugin
from mdit_py_plugins.footnote import footnote_plugin

md = (
    MarkdownIt()
    .use(front_matter_plugin)
    .use(footnote_plugin)
    .enable('table')
)
text = ("""
---
a: 1
---

a | b
- | -
1 | 2

A footnote [^1]

[^1]: some details
""")
md.render(text)
```

## The Token Stream

+++

Before rendering, the text is parsed to a flat token stream of block level syntax elements, with nesting defined by opening (1) and closing (-1) attributes:

```{code-cell} python
md = MarkdownIt("commonmark")
tokens = md.parse("""
Here's some *text*

1. a list

> a *quote*""")
[(t.type, t.nesting) for t in tokens]
```

Naturally all openings should eventually be closed,
such that:

```{code-cell} python
sum([t.nesting for t in tokens]) == 0
```

All tokens are the same class, which can also be created outside the parser:

```{code-cell} python
tokens[0]
```

```{code-cell} python
from markdown_it.token import Token
token = Token("paragraph_open", "p", 1, block=True, map=[1, 2])
token == tokens[0]
```

The `'inline'` type token contain the inline tokens as children:

```{code-cell} python
tokens[1]
```

You can serialize a token (and its children) to a JSONable dictionary using:

```{code-cell} python
print(tokens[1].as_dict())
```

This dictionary can also be deserialized:

```{code-cell} python
Token.from_dict(tokens[1].as_dict())
```

### Creating a syntax tree

```{versionchanged} 0.7.0
`nest_tokens` and `NestedTokens` are deprecated and replaced by `SyntaxTreeNode`.
```

In some use cases it may be useful to convert the token stream into a syntax tree,
with opening/closing tokens collapsed into a single token that contains children.

```{code-cell} python
from markdown_it.tree import SyntaxTreeNode

md = MarkdownIt("commonmark")
tokens = md.parse("""
# Header

Here's some text and an image ![title](image.png)

1. a **list**

> a *quote*
""")

node = SyntaxTreeNode(tokens)
print(node.pretty(indent=2, show_text=True))
```

You can then use methods to traverse the tree

```{code-cell} python
node.children
```

```{code-cell} python
print(node[0])
node[0].next_sibling
```

## Renderers

+++

After the token stream is generated, it's passed to a [renderer](https://github.com/executablebooks/markdown-it-py/tree/master/markdown_it/renderer.py).
It then plays all the tokens, passing each to a rule with the same name as token type.

Renderer rules are located in `md.renderer.rules` and are simple functions
with the same signature:

```python
def function(renderer, tokens, idx, options, env):
  return htmlResult
```

+++

You can inject render methods into the instantiated render class.

```{code-cell} python
md = MarkdownIt("commonmark")

def render_em_open(self, tokens, idx, options, env):
    return '<em class="myclass">'

md.add_render_rule("em_open", render_em_open)
md.render("*a*")
```

This is a slight change to the JS version, where the renderer argument is at the end.
Also `add_render_rule` method is specific to Python, rather than adding directly to the `md.renderer.rules`, this ensures the method is bound to the renderer.

+++

You can also subclass a render and add the method there:

```{code-cell} python
from markdown_it.renderer import RendererHTML

class MyRenderer(RendererHTML):
    def em_open(self, tokens, idx, options, env):
        return '<em class="myclass">'

md = MarkdownIt("commonmark", renderer_cls=MyRenderer)
md.render("*a*")
```

Plugins can support multiple render types, using the `__ouput__` attribute (this is currently a Python only feature).

```{code-cell} python
from markdown_it.renderer import RendererHTML

class MyRenderer1(RendererHTML):
    __output__ = "html1"

class MyRenderer2(RendererHTML):
    __output__ = "html2"

def plugin(md):
    def render_em_open1(self, tokens, idx, options, env):
        return '<em class="myclass1">'
    def render_em_open2(self, tokens, idx, options, env):
        return '<em class="myclass2">'
    md.add_render_rule("em_open", render_em_open1, fmt="html1")
    md.add_render_rule("em_open", render_em_open2, fmt="html2")

md = MarkdownIt("commonmark", renderer_cls=MyRenderer1).use(plugin)
print(md.render("*a*"))

md = MarkdownIt("commonmark", renderer_cls=MyRenderer2).use(plugin)
print(md.render("*a*"))
```

Here's a more concrete example; let's replace images with vimeo links to player's iframe:

```{code-cell} python
import re
from markdown_it import MarkdownIt

vimeoRE = re.compile(r'^https?:\/\/(www\.)?vimeo.com\/(\d+)($|\/)')

def render_vimeo(self, tokens, idx, options, env):
    token = tokens[idx]

    if vimeoRE.match(token.attrs["src"]):

        ident = vimeoRE.match(token.attrs["src"])[2]

        return ('<div class="embed-responsive embed-responsive-16by9">\n' +
               '  <iframe class="embed-responsive-item" src="//player.vimeo.com/video/' +
                ident + '"></iframe>\n' +
               '</div>\n')
    return self.image(tokens, idx, options, env)

md = MarkdownIt("commonmark")
md.add_render_rule("image", render_vimeo)
print(md.render("![](https://www.vimeo.com/123)"))
```

Here is another example, how to add `target="_blank"` to all links:

```{code-cell} python
from markdown_it import MarkdownIt

def render_blank_link(self, tokens, idx, options, env):
    tokens[idx].attrSet("target", "_blank")

    # pass token to default renderer.
    return self.renderToken(tokens, idx, options, env)

md = MarkdownIt("commonmark")
md.add_render_rule("link_open", render_blank_link)
print(md.render("[a]\n\n[a]: b"))
```