diff --git a/_doc/detail.rst b/_doc/detail.rst
new file mode 100644
index 0000000..2f7d682
--- /dev/null
+++ b/_doc/detail.rst
@@ -0,0 +1,289 @@
+*******
+Details
+*******
+
+
+
+- support for simple lists as mapping keys by transforming these to tuples
+- ``!!omap`` generates ordereddict (C) on Python 2, collections.OrderedDict
+ on Python 3, and ``!!omap`` is generated for these types.
+- Tests whether the C yaml library is installed as well as the header
+ files. That library doesn't generate CommentTokens, so it cannot be used to
+ do round trip editing on comments. It can be used to speed up normal
+ processing (so you don't need to install ``ruyaml`` and ``PyYAML``).
+ See the section *Optional requirements*.
+- Basic support for multiline strings with preserved newlines and
+ chomping ( '``|``', '``|+``', '``|-``' ). As this subclasses the string type
+ the information is lost on reassignment. (This might be changed
+ in the future so that the preservation/folding/chomping is part of the
+ parent container, like comments).
+- anchor names that are hand-crafted (not of the form ``idNNN``) are preserved
+- `merges <http://yaml.org/type/merge.html>`_ in dictionaries are preserved
+- adding/replacing comments on block-style sequences and mappings
+ with smart column positioning
+- collection objects (when read in via RoundTripParser) have an ``lc``
+ property that contains line and column info ``lc.line`` and ``lc.col``.
+ Individual positions for mappings and sequences can also be retrieved
+ (``lc.key('a')``, ``lc.value('a')`` resp. ``lc.item(3)``)
+- preservation of blank lines after block scalars. Contributed by Sam Thursfield.
+
+*In the following examples it is assumed you have done something like:*::
+
+ from ruyaml import YAML
+ yaml = YAML()
+
+*if not explicitly specified.*
+
+Indentation of block sequences
+==============================
+
+Although ruyaml doesn't preserve individual indentations of block sequence
+items, it does properly dump::
+
+ x:
+ - b: 1
+ - 2
+
+back to::
+
+ x:
+ -   b: 1
+ -   2
+
+if you specify ``yaml.indent(sequence=4)`` (indentation is counted to the
+beginning of the sequence element).
+
+PyYAML (and older versions of ruyaml) gives you non-indented
+scalars (when specifying default_flow_style=False)::
+
+ x:
+ -   b: 1
+ - 2
+
+You can use ``mapping=4`` to also have the mapping values indented.
+The dump also observes an additional ``offset=2`` setting that
+can be used to push the dash inwards, *within the space defined by* ``sequence``.
+
+The above example with the often seen ``yaml.indent(mapping=2, sequence=4, offset=2)``
+indentation::
+
+ x:
+   y:
+     - b: 1
+     - 2
+
+The defaults are as if you specified ``yaml.indent(mapping=2, sequence=2, offset=0)``.
+
+If the ``offset`` equals ``sequence``, there is not enough
+room for the dash and the space that has to follow it. In that case the
+element itself would normally be pushed to the next line (and older versions
+of ruyaml did so), but this is now prevented from happening. However, the
+``indent`` level is what is used for calculating the cumulative indent for
+deeper levels, so specifying ``sequence=3`` with ``offset=2`` might give
+correct but counterintuitive results.
+
+**It is best to always have** ``sequence >= offset + 2``
+**but this is not enforced**. Depending on your structure, not following
+this advice **might lead to invalid output**.
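+
+The rule of thumb above can be written as a small check. This is only a
+sketch: ``indent_is_safe`` is a hypothetical helper, not part of ruyaml, and
+it assumes both values are counted relative to the parent's indentation:
+
```python
def indent_is_safe(sequence: int, offset: int) -> bool:
    """Return True when the dash and the space after it fit before the element.

    Hypothetical helper for illustration only; ruyaml itself does not
    enforce this rule.
    """
    # Assumption: the dash is placed `offset` columns in and the element
    # starts `sequence` columns in, so "- " needs two columns in between.
    return sequence >= offset + 2


for sequence, offset in [(2, 0), (4, 2), (3, 2), (2, 2)]:
    status = "ok" if indent_is_safe(sequence, offset) else "risky"
    print(f"sequence={sequence} offset={offset}: {status}")
```
+
+The first two pairs are the defaults and the often seen
+``sequence=4, offset=2`` combination; the last two break the rule.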
+
+Inconsistently indented YAML
+++++++++++++++++++++++++++++
+
+If your input is inconsistently indented, such indentation cannot be preserved.
+The first round-trip will make it consistent/normalize it. Here are some
+inconsistently indented YAML examples.
+
+``b`` indented 3, ``c`` indented 4 positions::
+
+ a:
+    b:
+        c: 1
+
+Top level sequence is indented 2 without offset, the other sequence 4 (with offset 2)::
+
+ - key:
+     - foo
+     - bar
+
+
+Positioning ':' in top level mappings, prefixing ':'
+====================================================
+
+If you want your top-level mappings to look like::
+
+ library version: 1
+ comment        : |
+     this is just a first try
+
+then set ``yaml.top_level_colon_align = True``
+(and ``yaml.indent = 4``). ``True`` causes calculation based on the longest key,
+but you can also explicitly set a number.
+
+If you want an extra space between a mapping key and the colon specify
+``yaml.prefix_colon = ' '``::
+
+ - https://myurl/abc.tar.xz : 23445
+ #                         ^ extra space here
+ - https://myurl/def.tar.xz : 944
+
+If you combine ``prefix_colon`` with ``top_level_colon_align``, the
+top level mapping doesn't get the extra prefix. If you want that
+anyway, specify ``yaml.top_level_colon_align = 12`` where ``12`` has to be an
+integer that is one more than the length of the widest key.
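+
+A minimal sketch of how such a number could be derived from the keys of a
+top-level mapping. ``colon_align_width`` is a hypothetical helper, not a
+ruyaml function; it only applies the "one more than the widest key" rule
+stated above:
+
```python
def colon_align_width(keys):
    """Return one more than the length of the widest key.

    Hypothetical helper for illustration; the result is the kind of integer
    one could assign to yaml.top_level_colon_align.
    """
    return max(len(str(key)) for key in keys) + 1


keys = ["library version", "comment"]
print(colon_align_width(keys))  # widest key has 15 characters, so 16
```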
+
+
+Document version support
+++++++++++++++++++++++++
+
+In YAML a document version can be explicitly set by using::
+
+ %YAML 1.x
+
+before the document start (at the top or before a
+``---``). For ``ruyaml``, x has to be 1 or 2. If no explicit
+version is set, `version 1.2 <http://www.yaml.org/spec/1.2/spec.html>`_
+is assumed (which was released in 2009).
+
+The 1.2 version does **not** support:
+
+- sexagesimals like ``12:34:56``
+- octals that start with 0 only: like ``012`` for number 10 (``0o12`` **is**
+ supported by YAML 1.2)
+- unquoted ``Yes`` and ``On`` as alternatives for ``True``, and ``No`` and
+ ``Off`` as alternatives for ``False``.
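+
+To make the first two items concrete, the sketch below shows the arithmetic
+behind the YAML 1.1 reading of such scalars. Both functions are hypothetical
+illustrations in plain Python; they are not ruyaml's resolver:
+
```python
def sexagesimal_to_int(text: str) -> int:
    """Read colon-separated digits as a base-60 integer (YAML 1.1 style)."""
    value = 0
    for part in text.split(":"):
        value = value * 60 + int(part)
    return value


def leading_zero_octal_to_int(text: str) -> int:
    """Read a leading-zero literal such as '012' as octal (YAML 1.1 style)."""
    return int(text, 8)


print(sexagesimal_to_int("12:34:56"))    # 12*3600 + 34*60 + 56 = 45296
print(leading_zero_octal_to_int("012"))  # 10
print(int("0o12", 8))                    # 10, the spelling YAML 1.2 accepts
```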
+
+If you cannot change your YAML files and you need them to load as 1.1
+you can load with ``yaml.version = (1, 1)``,
+or the equivalent (version can be a tuple, list or string) ``yaml.version = "1.1"``
+
+*If you cannot change your code, stick with ruyaml==0.10.23 and let
+me know if it would help to be able to set an environment variable.*
+
+This does not affect dump, as ruyaml never emitted sexagesimals or
+octal numbers, and always emitted booleans as ``true`` or ``false``.
+
+Round trip including comments
++++++++++++++++++++++++++++++
+
+The major motivation for this fork is the round-trip capability for
+comments. The integration of the sources was just an initial step to
+make this easier.
+
+adding/replacing comments
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Starting with version 0.8, you can add/replace comments on block style
+collections (mappings/sequences resulting in Python dict/list). The basic
+form for this is::
+
+ from __future__ import print_function
+
+ import sys
+ import ruyaml
+
+ yaml = ruyaml.YAML() # defaults to round-trip
+
+ inp = """\
+ abc:
+   - a     # comment 1
+ xyz:
+   a: 1    # comment 2
+   b: 2
+   c: 3
+   d: 4
+   e: 5
+   f: 6 # comment 3
+ """
+
+ data = yaml.load(inp)
+ data['abc'].append('b')
+ data['abc'].yaml_add_eol_comment('comment 4', 1) # takes column of comment 1
+ data['xyz'].yaml_add_eol_comment('comment 5', 'c') # takes column of comment 2
+ data['xyz'].yaml_add_eol_comment('comment 6', 'e') # takes column of comment 3
+ data['xyz'].yaml_add_eol_comment('comment 7', 'd', column=20)
+
+ yaml.dump(data, sys.stdout)
+
+Resulting in::
+
+ abc:
+ - a       # comment 1
+ - b       # comment 4
+ xyz:
+   a: 1    # comment 2
+   b: 2
+   c: 3    # comment 5
+   d: 4              # comment 7
+   e: 5 # comment 6
+   f: 6 # comment 3
+
+If the comment doesn't start with '#', this will be added. The key is
+the element index for a list, the actual key for a dictionary. As can be seen
+from the example, the column chosen for a comment is derived from the
+column of a neighbouring comment, preceding or following (picking the
+first one found).
+
+Config file formats
++++++++++++++++++++
+
+There are only a few configuration file formats that are easily
+readable and editable: JSON, INI/ConfigParser, YAML (XML is too cluttered
+to be called easily readable).
+
+Unfortunately `JSON <http://www.json.org/>`_ doesn't support comments,
+and although there are some solutions with pre-processed filtering of
+comments, there are no libraries that support round trip updating of
+such commented files.
+
+INI files support comments, and the excellent `ConfigObj
+<http://www.voidspace.org.uk/python/configobj.html>`_ library by Foord
+and Larosa even supports round trip editing with comment preservation,
+nesting of sections and limited lists (within a value). Retrieval of
+a particular value format is explicit (and extensible).
+
+YAML has basic mapping and sequence structures as well as support for
+ordered mappings and sets. It supports scalars of various types
+including dates and datetimes (missing in JSON).
+YAML has comments, but these are normally thrown away.
+
+Block structured YAML is a clean and very human readable
+format. By extending the Python YAML parser to support round trip
+preservation of comments, it makes YAML a very good choice for
+configuration files that are human readable and editable while at
+the same time interpretable and modifiable by a program.
+
+Extending
++++++++++
+
+There are normally eight files involved when extending the roundtrip
+capabilities: the reader, parser, composer and constructor to go from YAML to
+Python and the resolver, representer, serializer and emitter to go the other
+way.
+
+Extending involves keeping extra data around for the next process step,
+eventually resulting in a different Python object (subclass or alternative),
+that should behave like the original, but on the way from Python to YAML
+generates the original (or at least something much closer).
+
+Smartening
+++++++++++
+
+When you use round-tripping, then the complex data you get are
+already subclasses of the built-in types. So you can patch
+in extra methods or override existing ones. Some methods are already
+included and you can do::
+
+ yaml_str = """\
+ a:
+ - b:
+     c: 42
+ - d:
+     f: 196
+   e:
+     g: 3.14
+ """
+
+
+ data = yaml.load(yaml_str)
+
+ assert data.mlget(['a', 1, 'd', 'f'], list_ok=True) == 196