diff options
Diffstat (limited to '')
-rw-r--r-- | _doc/detail.rst | 289 |
1 files changed, 289 insertions, 0 deletions
diff --git a/_doc/detail.rst b/_doc/detail.rst new file mode 100644 index 0000000..2f7d682 --- /dev/null +++ b/_doc/detail.rst @@ -0,0 +1,289 @@ +******* +Details +******* + + + +- support for simple lists as mapping keys by transforming these to tuples +- ``!!omap`` generates ordereddict (C) on Python 2, collections.OrderedDict + on Python 3, and ``!!omap`` is generated for these types. +- Tests whether the C yaml library is installed as well as the header + files. That library doesn't generate CommentTokens, so it cannot be used to + do round trip editing on comments. It can be used to speed up normal + processing (so you don't need to install ``ruyaml`` and ``PyYaml``). + See the section *Optional requirements*. +- Basic support for multiline strings with preserved newlines and + chomping ( '``|``', '``|+``', '``|-``' ). As this subclasses the string type + the information is lost on reassignment. (This might be changed + in the future so that the preservation/folding/chomping is part of the + parent container, like comments). +- anchors names that are hand-crafted (not of the form``idNNN``) are preserved +- `merges <http://yaml.org/type/merge.html>`_ in dictionaries are preserved +- adding/replacing comments on block-style sequences and mappings + with smart column positioning +- collection objects (when read in via RoundTripParser) have an ``lc`` + property that contains line and column info ``lc.line`` and ``lc.col``. + Individual positions for mappings and sequences can also be retrieved + (``lc.key('a')``, ``lc.value('a')`` resp. ``lc.item(3)``) +- preservation of whitelines after block scalars. Contributed by Sam Thursfield. + +*In the following examples it is assumed you have done something like:*:: + + from ruyaml import YAML + yaml = YAML() + +*if not explicitly specified.* + +Indentation of block sequences +============================== + +Although ruyaml doesn't preserve individual indentations of block sequence +items, it does properly dump:: + + x: + - b: 1 + - 2 + +back to:: + + x: + - b: 1 + - 2 + +if you specify ``yaml.indent(sequence=4)`` (indentation is counted to the +beginning of the sequence element). + +PyYAML (and older versions of ruyaml) gives you non-indented +scalars (when specifying default_flow_style=False):: + + x: + - b: 1 + - 2 + +You can use ``mapping=4`` to also have the mappings values indented. +The dump also observes an additional ``offset=2`` setting that +can be used to push the dash inwards, *within the space defined by* ``sequence``. + +The above example with the often seen ``yaml.indent(mapping=2, sequence=4, offset=2)`` +indentation:: + + x: + y: + - b: 1 + - 2 + +The defaults are as if you specified ``yaml.indent(mapping=2, sequence=2, offset=0)``. + +If the ``offset`` equals ``sequence``, there is not enough +room for the dash and the space that has to follow it. In that case the +element itself would normally be pushed to the next line (and older versions +of ruyaml did so). But this is +prevented from happening. However the ``indent`` level is what is used +for calculating the cumulative indent for deeper levels and specifying +``sequence=3`` resp. ``offset=2``, might give correct, but counter +intuitive results. + +**It is best to always have** ``sequence >= offset + 2`` +**but this is not enforced**. Depending on your structure, not following +this advice **might lead to invalid output**. + +Inconsistently indented YAML +++++++++++++++++++++++++++++ + +If your input is inconsistently indented, such indentation cannot be preserved. +The first round-trip will make it consistent/normalize it. Here are some +inconsistently indented YAML examples. + +``b`` indented 3, ``c`` indented 4 positions:: + + a: + b: + c: 1 + +Top level sequence is indented 2 without offset, the other sequence 4 (with offset 2):: + + - key: + - foo + - bar + + +Positioning ':' in top level mappings, prefixing ':' +==================================================== + +If you want your toplevel mappings to look like:: + + library version: 1 + comment : | + this is just a first try + +then set ``yaml.top_level_colon_align = True`` +(and ``yaml.indent = 4``). ``True`` causes calculation based on the longest key, +but you can also explicitly set a number. + +If you want an extra space between a mapping key and the colon specify +``yaml.prefix_colon = ' '``:: + + - https://myurl/abc.tar.xz : 23445 + # ^ extra space here + - https://myurl/def.tar.xz : 944 + +If you combine ``prefix_colon`` with ``top_level_colon_align``, the +top level mapping doesn't get the extra prefix. If you want that +anyway, specify ``yaml.top_level_colon_align = 12`` where ``12`` has to be an +integer that is one more than length of the widest key. + + +Document version support +++++++++++++++++++++++++ + +In YAML a document version can be explicitly set by using:: + + %YAML 1.x + +before the document start (at the top or before a +``---``). For ``ruyaml`` x has to be 1 or 2. If no explicit +version is set `version 1.2 <http://www.yaml.org/spec/1.2/spec.html>`_ +is assumed (which has been released in 2009). + +The 1.2 version does **not** support: + +- sexagesimals like ``12:34:56`` +- octals that start with 0 only: like ``012`` for number 10 (``0o12`` **is** + supported by YAML 1.2) +- Unquoted Yes and On as alternatives for True and No and Off for False. + +If you cannot change your YAML files and you need them to load as 1.1 +you can load with ``yaml.version = (1, 1)``, +or the equivalent (version can be a tuple, list or string) ``yaml.version = "1.1"`` + +*If you cannot change your code, stick with ruyaml==0.10.23 and let +me know if it would help to be able to set an environment variable.* + +This does not affect dump as ruyaml never emitted sexagesimals, nor +octal numbers, and emitted booleans always as true resp. false + +Round trip including comments ++++++++++++++++++++++++++++++ + +The major motivation for this fork is the round-trip capability for +comments. The integration of the sources was just an initial step to +make this easier. + +adding/replacing comments +^^^^^^^^^^^^^^^^^^^^^^^^^ + +Starting with version 0.8, you can add/replace comments on block style +collections (mappings/sequences resuting in Python dict/list). The basic +for for this is:: + + from __future__ import print_function + + import sys + import ruyaml + + yaml = ruyaml.YAML() # defaults to round-trip + + inp = """\ + abc: + - a # comment 1 + xyz: + a: 1 # comment 2 + b: 2 + c: 3 + d: 4 + e: 5 + f: 6 # comment 3 + """ + + data = yaml.load(inp) + data['abc'].append('b') + data['abc'].yaml_add_eol_comment('comment 4', 1) # takes column of comment 1 + data['xyz'].yaml_add_eol_comment('comment 5', 'c') # takes column of comment 2 + data['xyz'].yaml_add_eol_comment('comment 6', 'e') # takes column of comment 3 + data['xyz'].yaml_add_eol_comment('comment 7', 'd', column=20) + + yaml.dump(data, sys.stdout) + +Resulting in:: + + abc: + - a # comment 1 + - b # comment 4 + xyz: + a: 1 # comment 2 + b: 2 + c: 3 # comment 5 + d: 4 # comment 7 + e: 5 # comment 6 + f: 6 # comment 3 + +If the comment doesn't start with '#', this will be added. The key is +the element index for list, the actual key for dictionaries. As can be seen +from the example, the column to choose for a comment is derived +from the previous, next or preceding comment column (picking the first one +found). + +Config file formats ++++++++++++++++++++ + +There are only a few configuration file formats that are easily +readable and editable: JSON, INI/ConfigParser, YAML (XML is to cluttered +to be called easily readable). + +Unfortunately `JSON <http://www.json.org/>`_ doesn't support comments, +and although there are some solutions with pre-processed filtering of +comments, there are no libraries that support round trip updating of +such commented files. + +INI files support comments, and the excellent `ConfigObj +<http://www.voidspace.org.uk/python/configobj.html>`_ library by Foord +and Larosa even supports round trip editing with comment preservation, +nesting of sections and limited lists (within a value). Retrieval of +particular value format is explicit (and extensible). + +YAML has basic mapping and sequence structures as well as support for +ordered mappings and sets. It supports scalars various types +including dates and datetimes (missing in JSON). +YAML has comments, but these are normally thrown away. + +Block structured YAML is a clean and very human readable +format. By extending the Python YAML parser to support round trip +preservation of comments, it makes YAML a very good choice for +configuration files that are human readable and editable while at +the same time interpretable and modifiable by a program. + +Extending ++++++++++ + +There are normally six files involved when extending the roundtrip +capabilities: the reader, parser, composer and constructor to go from YAML to +Python and the resolver, representer, serializer and emitter to go the other +way. + +Extending involves keeping extra data around for the next process step, +eventuallly resulting in a different Python object (subclass or alternative), +that should behave like the original, but on the way from Python to YAML +generates the original (or at least something much closer). + +Smartening +++++++++++ + +When you use round-tripping, then the complex data you get are +already subclasses of the built-in types. So you can patch +in extra methods or override existing ones. Some methods are already +included and you can do:: + + yaml_str = """\ + a: + - b: + c: 42 + - d: + f: 196 + e: + g: 3.14 + """ + + + data = yaml.load(yaml_str) + + assert data.mlget(['a', 1, 'd', 'f'], list_ok=True) == 196 |