.. -*- mode: rst -*-

=====================
The full Pygments API
=====================

This page describes the Pygments API.

High-level API
==============

.. module:: pygments

Functions from the :mod:`pygments` module:

.. function:: lex(code, lexer)

    Lex `code` with the `lexer` (must be a `Lexer` instance)
    and return an iterable of tokens. Currently, this only calls
    `lexer.get_tokens()`.

.. function:: format(tokens, formatter, outfile=None)

    Format a token stream (iterable of tokens) `tokens` with the
    `formatter` (must be a `Formatter` instance). The result is
    written to `outfile`, or if that is ``None``, returned as a
    string.

.. function:: highlight(code, lexer, formatter, outfile=None)

    This is the most high-level highlighting function.
    It combines `lex` and `format` in one function.
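
    A minimal sketch of how the three functions relate. ``PythonLexer`` and
    ``HtmlFormatter`` are used here only as convenient built-in choices; any
    lexer and formatter instances should behave the same way.

    .. sourcecode:: python

        from io import StringIO

        import pygments
        from pygments.formatters import HtmlFormatter
        from pygments.lexers import PythonLexer

        code = 'print("hello")\n'

        # Two steps: lex() produces the token stream, format() renders it.
        tokens = pygments.lex(code, PythonLexer())
        two_step = pygments.format(tokens, HtmlFormatter())

        # One step: highlight() combines both calls.
        one_step = pygments.highlight(code, PythonLexer(), HtmlFormatter())

        # With an outfile, the result is written there instead of returned.
        buffer = StringIO()
        pygments.highlight(code, PythonLexer(), HtmlFormatter(), outfile=buffer)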


.. module:: pygments.lexers

Functions from :mod:`pygments.lexers`:

.. function:: get_lexer_by_name(alias, **options)

    Return an instance of a `Lexer` subclass that has `alias` in its
    aliases list. The lexer is given the `options` at its
    instantiation.

    Will raise :exc:`pygments.util.ClassNotFound` if no lexer with that alias is
    found.

.. function:: get_lexer_for_filename(fn, **options)

    Return a `Lexer` subclass instance that has a filename pattern
    matching `fn`. The lexer is given the `options` at its
    instantiation.

    Will raise :exc:`pygments.util.ClassNotFound` if no lexer for that filename
    is found.

.. function:: get_lexer_for_mimetype(mime, **options)

    Return a `Lexer` subclass instance that has `mime` in its mimetype
    list. The lexer is given the `options` at its instantiation.

    Will raise :exc:`pygments.util.ClassNotFound` if no lexer for that mimetype
    is found.
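
The three lookup functions above can be sketched as follows. The alias,
filename and MIME type are illustrative values for the built-in Python lexer;
``stripall`` is one of the options processed by the `Lexer` constructor.

.. sourcecode:: python

    from pygments.lexers import (
        get_lexer_by_name,
        get_lexer_for_filename,
        get_lexer_for_mimetype,
    )
    from pygments.util import ClassNotFound

    by_alias = get_lexer_by_name("python", stripall=True)
    by_filename = get_lexer_for_filename("setup.py")
    by_mimetype = get_lexer_for_mimetype("text/x-python")
    print(by_alias.name, by_filename.name, by_mimetype.name)

    # An unknown alias raises ClassNotFound rather than returning None.
    try:
        get_lexer_by_name("no-such-language")
        lookup_failed = False
    except ClassNotFound as exc:
        lookup_failed = True
        print("lookup failed:", exc)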

.. function:: load_lexer_from_file(filename, lexername="CustomLexer", **options)

    Return a `Lexer` subclass instance loaded from the provided file, relative
    to the current directory. The file is expected to contain a Lexer class
    named `lexername` (by default, CustomLexer). Users should be very careful with
    the input, because this method is equivalent to running eval on the input file.
    The lexer is given the `options` at its instantiation.

    :exc:`pygments.util.ClassNotFound` is raised if there are any errors
    loading the Lexer.
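
    A hedged sketch of loading a lexer from a file. The file name
    ``demo_lexer.py`` and the throwaway lexer it contains are hypothetical and
    exist only for this illustration; the file is written to a temporary
    directory and loaded through an absolute path.

    .. sourcecode:: python

        import os
        import tempfile

        from pygments.lexers import load_lexer_from_file

        # Source of a throwaway lexer; the class name matches the default
        # value of the lexername argument.
        LEXER_SOURCE = '''
        from pygments.lexer import Lexer
        from pygments.token import Text

        class CustomLexer(Lexer):
            name = "Demo"

            def get_tokens_unprocessed(self, text):
                yield 0, Text, text
        '''.replace("\n        ", "\n")

        with tempfile.TemporaryDirectory() as tmpdir:
            path = os.path.join(tmpdir, "demo_lexer.py")
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(LEXER_SOURCE)
            # The file's code is executed, so only load files you trust.
            lexer = load_lexer_from_file(path, stripall=True)

        print(lexer.name, lexer.stripall)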

    .. versionadded:: 2.2

.. function:: guess_lexer(text, **options)

    Return a `Lexer` subclass instance that's guessed from the text in
    `text`. For that, the :meth:`.analyse_text()` method of every known lexer
    class is called with the text as argument, and the lexer which returned the
    highest value will be instantiated and returned.

    :exc:`pygments.util.ClassNotFound` is raised if no lexer thinks it can
    handle the content.

.. function:: guess_lexer_for_filename(filename, text, **options)

    As :func:`guess_lexer()`, but only lexers which have a pattern in `filenames`
    or `alias_filenames` that matches `filename` are taken into consideration.

    :exc:`pygments.util.ClassNotFound` is raised if no lexer thinks it can
    handle the content.
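
A small sketch of both guessing functions. The sample text and the filename
``hello.py`` are made up for illustration; which lexer wins depends on the
`analyse_text()` results of the installed lexers.

.. sourcecode:: python

    from pygments.lexers import guess_lexer, guess_lexer_for_filename

    source = "#!/usr/bin/env python\nprint('hello')\n"

    # Every known lexer may be consulted.
    guessed = guess_lexer(source)

    # Only lexers whose patterns match the filename are consulted.
    narrowed = guess_lexer_for_filename("hello.py", source)

    print(guessed.name, narrowed.name)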

.. function:: get_all_lexers()

    Return an iterable over all registered lexers, yielding tuples in the
    format::

        (longname, tuple of aliases, tuple of filename patterns, tuple of mimetypes)
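
    For instance, the tuples can be unpacked to build an alias index. This is
    only a sketch; the ``alias_to_name`` dictionary is a name chosen for the
    example.

    .. sourcecode:: python

        from pygments.lexers import get_all_lexers

        # Map every alias to the long name of the lexer that declares it.
        alias_to_name = {
            alias: longname
            for longname, aliases, _patterns, _mimetypes in get_all_lexers()
            for alias in aliases
        }

        print(alias_to_name["python"])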

    .. versionadded:: 0.6

.. function:: find_lexer_class_by_name(alias)

    Return the `Lexer` subclass that has `alias` in its aliases list, without
    instantiating it.

    Will raise :exc:`pygments.util.ClassNotFound` if no lexer with that alias is
    found.

    .. versionadded:: 2.2

.. function:: find_lexer_class(name)

    Return the `Lexer` subclass whose *name* attribute matches the *name*
    argument.
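
    A brief sketch contrasting the two class lookups: one goes through the
    aliases list, the other through the *name* attribute. The alias
    ``"python"`` is just an example value.

    .. sourcecode:: python

        from pygments.lexers import find_lexer_class, find_lexer_class_by_name

        # Look the class up by alias; nothing is instantiated yet.
        by_alias = find_lexer_class_by_name("python")

        # Look the same class up again through its human-readable name.
        by_name = find_lexer_class(by_alias.name)

        # Instantiate only when a lexer object is actually needed.
        lexer = by_alias(tabsize=4)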


.. module:: pygments.formatters

Functions from :mod:`pygments.formatters`:

.. function:: get_formatter_by_name(alias, **options)

    Return an instance of a :class:`.Formatter` subclass that has `alias` in its
    aliases list. The formatter is given the `options` at its instantiation.

    Will raise :exc:`pygments.util.ClassNotFound` if no formatter with that
    alias is found.

.. function:: get_formatter_for_filename(fn, **options)

    Return a :class:`.Formatter` subclass instance that has a filename pattern
    matching `fn`. The formatter is given the `options` at its instantiation.

    Will raise :exc:`pygments.util.ClassNotFound` if no formatter for that filename
    is found.
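
The formatter lookups mirror the lexer ones. A minimal sketch, using the
``html`` alias and an illustrative output filename:

.. sourcecode:: python

    from pygments.formatters import (
        get_formatter_by_name,
        get_formatter_for_filename,
    )
    from pygments.util import ClassNotFound

    by_alias = get_formatter_by_name("html", linenos=True)
    by_filename = get_formatter_for_filename("report.html")
    print(by_alias.name, by_filename.name)

    # An unknown alias raises ClassNotFound.
    try:
        get_formatter_by_name("no-such-format")
        lookup_failed = False
    except ClassNotFound:
        lookup_failed = True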

.. function:: load_formatter_from_file(filename, formattername="CustomFormatter", **options)

    Return a `Formatter` subclass instance loaded from the provided file, relative
    to the current directory. The file is expected to contain a Formatter class
    named ``formattername`` (by default, CustomFormatter). Users should be very
    careful with the input, because this method is equivalent to running eval
    on the input file. The formatter is given the `options` at its instantiation.

    :exc:`pygments.util.ClassNotFound` is raised if there are any errors
    loading the Formatter.

    .. versionadded:: 2.2

.. module:: pygments.styles

Functions from :mod:`pygments.styles`:

.. function:: get_style_by_name(name)

    Return a style class by its short name. The names of the builtin styles
    are listed in :data:`pygments.styles.STYLE_MAP`.

    Will raise :exc:`pygments.util.ClassNotFound` if no style of that name is
    found.

.. function:: get_all_styles()

    Return an iterable over all registered styles, yielding their names.

    .. versionadded:: 0.6
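
A short sketch of both style functions. It assumes the builtin ``default``
style is available, which is the case in a standard installation.

.. sourcecode:: python

    from pygments.styles import get_all_styles, get_style_by_name
    from pygments.util import ClassNotFound

    # get_all_styles() yields short names, not classes.
    names = sorted(get_all_styles())

    # get_style_by_name() returns the style class itself.
    style = get_style_by_name("default")
    print(style.__name__, len(names))

    try:
        get_style_by_name("no-such-style")
        lookup_failed = False
    except ClassNotFound:
        lookup_failed = True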


.. module:: pygments.lexer

Lexers
======

The base lexer class from which all lexers are derived is:

.. class:: Lexer(**options)

    The constructor takes a \*\*keywords dictionary of options.
    Every subclass must first process its own options and then call
    the `Lexer` constructor, since it processes the `stripnl`,
    `stripall` and `tabsize` options.

    An example looks like this:

    .. sourcecode:: python

        def __init__(self, **options):
            self.compress = options.get('compress', '')
            Lexer.__init__(self, **options)

    As these options must all be specifiable as strings (due to the
    command line usage), there are various utility functions
    available to help with that, see `Option processing`_.

    .. method:: get_tokens(text)

        This method is the basic interface of a lexer. It is called by
        the `highlight()` function. It must process the text and return an
        iterable of ``(tokentype, value)`` pairs from `text`.

        Normally, you don't need to override this method. The default
        implementation processes the `stripnl`, `stripall` and `tabsize`
        options and then yields all tokens from `get_tokens_unprocessed()`,
        with the ``index`` dropped.

    .. method:: get_tokens_unprocessed(text)

        This method should process the text and return an iterable of
        ``(index, tokentype, value)`` tuples where ``index`` is the starting
        position of the token within the input text.

        This method must be overridden by subclasses.
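
        A minimal sketch of a subclass that overrides this method.
        ``DigitLexer`` and its attribute values are hypothetical and exist
        only for this example.

        .. sourcecode:: python

            import re

            from pygments.lexer import Lexer
            from pygments.token import Number, Text


            class DigitLexer(Lexer):
                """Digit runs are numbers, everything else is plain text."""

                name = "Digits"
                aliases = ["digits"]
                filenames = ["*.digits"]
                mimetypes = ["text/x-digits"]

                _pattern = re.compile(r"(?P<number>\d+)|(?P<other>\D+)")

                def get_tokens_unprocessed(self, text):
                    for match in self._pattern.finditer(text):
                        is_number = match.lastgroup == "number"
                        tokentype = Number if is_number else Text
                        # match.start() is the token's index in the input.
                        yield match.start(), tokentype, match.group()


            lexer = DigitLexer()
            unprocessed = list(lexer.get_tokens_unprocessed("ab12"))

            # get_tokens() preprocesses the input and drops the index.
            processed = list(lexer.get_tokens("ab12"))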

    .. staticmethod:: analyse_text(text)

        A static method which is called for lexer guessing. It should analyse
        the text and return a float in the range from ``0.0`` to ``1.0``.
        If it returns ``0.0``, the lexer will not be selected as the most
        probable one; if it returns ``1.0``, it will be selected immediately.

        .. note:: You don't have to add ``@staticmethod`` to the definition of
                  this method; the Lexer's metaclass takes care of that.
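
        A hedged sketch of an ``analyse_text`` implementation. ``MarkerLexer``
        and the ``%%marker`` header it looks for are invented for this
        example; the scores are arbitrary illustrations of the ``0.0`` to
        ``1.0`` range.

        .. sourcecode:: python

            from pygments.lexer import Lexer
            from pygments.token import Text


            class MarkerLexer(Lexer):
                name = "Marker"
                aliases = ["marker"]
                filenames = []
                mimetypes = []

                def get_tokens_unprocessed(self, text):
                    yield 0, Text, text

                # No @staticmethod and no self: the metaclass handles that.
                def analyse_text(text):
                    if text.startswith("%%marker"):
                        return 1.0  # certain match
                    if "marker" in text:
                        return 0.3  # weak hint
                    return 0.0      # not this lexer


            certain = MarkerLexer.analyse_text("%%marker\nbody\n")
            weak = MarkerLexer.analyse_text("a marker somewhere\n")
            none = MarkerLexer.analyse_text("plain text\n")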

    For a list of known tokens, have a look at the :doc:`tokens` page.

    A lexer can also have the following attributes (in fact, all of them are
    mandatory except `alias_filenames`), which are used by the builtin lookup
    mechanism.

    .. attribute:: name

        Full name for the lexer, in human-readable form.

    .. attribute:: aliases

        A list of short, unique identifiers that can be used to look up
        the lexer from a list, e.g. using `get_lexer_by_name()`.

    .. attribute:: filenames

        A list of `fnmatch` patterns that match filenames which contain
        content for this lexer. The patterns in this list should be unique among
        all lexers.

    .. attribute:: alias_filenames

        A list of `fnmatch` patterns that match filenames which may or may not
        contain content for this lexer. This list is used by the
        :func:`.guess_lexer_for_filename()` function, to determine which lexers
        are then included in guessing the correct one. That means that,
        for example, every lexer for HTML and a template language should
        include ``*.html`` in this list.

    .. attribute:: mimetypes

        A list of MIME types for content that can be lexed with this
        lexer.
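
    As an illustration, these attributes can be read directly from a lexer
    class. The sketch below inspects the built-in ``PythonLexer`` and checks a
    made-up filename against its `fnmatch` patterns.

    .. sourcecode:: python

        from fnmatch import fnmatch

        from pygments.lexers import PythonLexer

        print(PythonLexer.name)
        print(PythonLexer.aliases)
        print(PythonLexer.mimetypes)

        # The filenames attribute holds fnmatch patterns, not regexes.
        matches = any(
            fnmatch("setup.py", pattern) for pattern in PythonLexer.filenames
        )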


.. module:: pygments.formatter

Formatters
==========

A formatter is derived from this class:


.. class:: Formatter(**options)

    As with lexers, this constructor processes options and then must call the
    base class :meth:`__init__`.

    The :class:`Formatter` class recognizes the options `style`, `full` and
    `title`.  It is up to the formatter class whether it uses them.

    .. method:: get_style_defs(arg='')

        This method must return statements or declarations suitable to define
        the current style for subsequent highlighted text (e.g. CSS classes
        in the `HtmlFormatter`).

        The optional argument `arg` can be used to modify the generation and
        is formatter dependent (it is standardized because it can be given on
        the command line).

        This method is called by the ``-S`` :doc:`command-line option <cmdline>`,
        the `arg` is then given by the ``-a`` option.
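
        For example, with the ``HtmlFormatter`` the argument is used as a CSS
        selector prefix. A minimal sketch, assuming the builtin ``default``
        style:

        .. sourcecode:: python

            from pygments.formatters import HtmlFormatter

            formatter = HtmlFormatter(style="default")

            # Returns CSS rules scoped to elements inside ".highlight".
            css = formatter.get_style_defs(".highlight")
            print(css.splitlines()[0])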

    .. method:: format(tokensource, outfile)

        This method must format the tokens from the `tokensource` iterable and
        write the formatted version to the file object `outfile`.

        Formatter options can control how exactly the tokens are converted.

    .. versionadded:: 0.7
       A formatter must have the following attributes that are used by the
       builtin lookup mechanism.

    .. attribute:: name

        Full name for the formatter, in human-readable form.

    .. attribute:: aliases

        A list of short, unique identifiers that can be used to look up
        the formatter from a list, e.g. using :func:`.get_formatter_by_name()`.

    .. attribute:: filenames

        A list of :mod:`fnmatch` patterns that match filenames for which this
        formatter can produce output. The patterns in this list should be unique
        among all formatters.
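
    A minimal sketch of a formatter subclass. ``PlainFormatter`` and its
    ``upper`` option are hypothetical and serve only to show option handling
    and the `format()` contract.

    .. sourcecode:: python

        from pygments import highlight
        from pygments.formatter import Formatter
        from pygments.lexers import PythonLexer
        from pygments.util import get_bool_opt


        class PlainFormatter(Formatter):
            """Writes token values without any markup."""

            name = "Plain demo"
            aliases = ["plain-demo"]
            filenames = []

            def __init__(self, **options):
                # Process our own option, then call the base constructor.
                self.upper = get_bool_opt(options, "upper", False)
                Formatter.__init__(self, **options)

            def format(self, tokensource, outfile):
                for _tokentype, value in tokensource:
                    outfile.write(value.upper() if self.upper else value)


        result = highlight(
            "print(1)\n", PythonLexer(), PlainFormatter(upper="yes")
        )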


.. module:: pygments.util

Option processing
=================

The :mod:`pygments.util` module has some utility functions usable for processing
command line options. All of the following functions get values from a
dictionary of options. If the value is already in the type expected by the
option, it is returned as-is. Otherwise, if the value is a string, it is first
converted to the expected type if possible.

.. exception:: OptionError

    This exception will be raised by all option processing functions if
    the type or value of the argument is not correct.

.. function:: get_bool_opt(options, optname, default=None)

    Intuitively, this is `options.get(optname, default)`, but restricted to
    Boolean values. A Boolean can also be given as a string, so that Boolean
    values can be accepted from command line arguments. If the key `optname`
    is present in the dictionary `options` but its value cannot be interpreted
    as a Boolean, an `OptionError` is raised. If the key is absent, `default`
    is used instead.

    The valid string values for ``True`` are ``1``, ``yes``, ``true`` and
    ``on``, the ones for ``False`` are ``0``, ``no``, ``false`` and ``off``
    (matched case-insensitively).
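
    A short sketch of these rules; the option names in the dictionary are
    arbitrary examples.

    .. sourcecode:: python

        from pygments.util import OptionError, get_bool_opt

        options = {"linenos": "Yes", "full": False, "nowrap": "maybe"}

        linenos = get_bool_opt(options, "linenos")      # string form
        full = get_bool_opt(options, "full")            # already a bool
        missing = get_bool_opt(options, "title", True)  # falls back to default

        # A string outside the accepted spellings is rejected.
        try:
            get_bool_opt(options, "nowrap")
            rejected = False
        except OptionError:
            rejected = True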

.. function:: get_int_opt(options, optname, default=None)

    As :func:`get_bool_opt`, but interpret the value as an integer.
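
    For instance, with an illustrative ``tabsize`` option:

    .. sourcecode:: python

        from pygments.util import OptionError, get_int_opt

        from_string = get_int_opt({"tabsize": "8"}, "tabsize")
        from_default = get_int_opt({}, "tabsize", 4)

        # A value that cannot be read as an integer is rejected.
        try:
            get_int_opt({"tabsize": "wide"}, "tabsize")
            rejected = False
        except OptionError:
            rejected = True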

.. function:: get_list_opt(options, optname, default=None)

    If the value for the key `optname` in the dictionary `options` is a
    string, split it at whitespace and return the resulting list. If it is
    already a list or a tuple, it is returned as a list.
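
    A brief sketch; the ``filters`` option name is only an example.

    .. sourcecode:: python

        from pygments.util import get_list_opt

        # A string is split at whitespace.
        from_string = get_list_opt({"filters": "a b  c"}, "filters")

        # A tuple is converted to a list.
        from_tuple = get_list_opt({"filters": ("a", "b")}, "filters")

        # The default is used when the key is absent.
        from_default = get_list_opt({}, "filters", [])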

.. function:: get_choice_opt(options, optname, allowed, default=None)

    If the value for the key `optname` in the dictionary `options` is not in
    the sequence `allowed`, raise an `OptionError`; otherwise return it.
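
    A short sketch, using a made-up ``mode`` option with two allowed values:

    .. sourcecode:: python

        from pygments.util import OptionError, get_choice_opt

        allowed = ["fast", "slow"]

        chosen = get_choice_opt({"mode": "fast"}, "mode", allowed)
        fallback = get_choice_opt({}, "mode", allowed, "slow")

        # A value outside the allowed sequence is rejected.
        try:
            get_choice_opt({"mode": "medium"}, "mode", allowed)
            rejected = False
        except OptionError:
            rejected = True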

    .. versionadded:: 0.8