diff options
Diffstat (limited to '')
-rw-r--r-- | _doc/detail.ryd | 298 |
1 files changed, 298 insertions, 0 deletions
diff --git a/_doc/detail.ryd b/_doc/detail.ryd new file mode 100644 index 0000000..26ce7f0 --- /dev/null +++ b/_doc/detail.ryd @@ -0,0 +1,298 @@ +version: 0.2 +text: md +pdf: false +--- | +# Details + +- support for simple lists as mapping keys by transforming these to + tuples +- `!!omap` generates ordereddict (C) on Python 2, + collections.OrderedDict on Python 3, and `!!omap` is generated for + these types. +- Tests whether the C yaml library is installed as well as the header + files. That library doesn\'t generate CommentTokens, so it cannot be + used to do round trip editing on comments. It can be used to speed + up normal processing (so you don\'t need to install `ruamel.yaml` + and `PyYaml`). See the section *Optional requirements*. +- Basic support for multiline strings with preserved newlines and + chomping ( \'`|`\', \'`|+`\', \'`|-`\' ). As this subclasses the + string type the information is lost on reassignment. (This might be + changed in the future so that the preservation/folding/chomping is + part of the parent container, like comments). +- anchors names that are hand-crafted (not of the form`idNNN`) are + preserved +- [merges](http://yaml.org/type/merge.html) in dictionaries are + preserved +- adding/replacing comments on block-style sequences and mappings with + smart column positioning +- collection objects (when read in via RoundTripParser) have an `lc` + property that contains line and column info `lc.line` and `lc.col`. + Individual positions for mappings and sequences can also be + retrieved (`lc.key('a')`, `lc.value('a')` resp. `lc.item(3)`) +- preservation of whitelines after block scalars. Contributed by Sam + Thursfield. + +*In the following examples it is assumed you have done something like:*: + + from ruamel.yaml import YAML + yaml = YAML() + +*if not explicitly specified.* + +## Indentation of block sequences + +Although ruamel.yaml doesn\'t preserve individual indentations of block +sequence items, it does properly dump: + + x: + - b: 1 + - 2 + +back to: + + x: + - b: 1 + - 2 + +if you specify `yaml.indent(sequence=4)` (indentation is counted to the +beginning of the sequence element). + +PyYAML (and older versions of ruamel.yaml) gives you non-indented +scalars (when specifying default_flow_style=False): + + x: + - b: 1 + - 2 + +You can use `mapping=4` to also have the mappings values indented. The +dump also observes an additional `offset=2` setting that can be used to +push the dash inwards, *within the space defined by* `sequence`. + +The above example with the often seen +`yaml.indent(mapping=2, sequence=4, offset=2)` indentation: + + x: + y: + - b: 1 + - 2 + +The defaults are as if you specified +`yaml.indent(mapping=2, sequence=2, offset=0)`. + +If the `offset` equals `sequence`, there is not enough room for the dash +and the space that has to follow it. In that case the element itself +would normally be pushed to the next line (and older versions of +`ruamel.yaml` did so). But this is prevented from happening. However the +`indent` level is what is used for calculating the cumulative indent for +deeper levels and specifying `sequence=3` resp. `offset=2`, might give +correct, but counter-intuitive results. + +**It is best to always have** `sequence >= offset + 2` **but this is not +enforced**. Depending on your structure, not following this advice +**might lead to invalid output**. + +### Inconsistently indented YAML + +If your input is inconsistently indented, such indentation cannot be +preserved. The first round-trip will make it consistent/normalize it. +Here are some inconsistently indented YAML examples. + +`b` indented 3, `c` indented 4 positions: + + a: + b: + c: 1 + +Top level sequence is indented 2 without offset, the other sequence 4 +(with offset 2): + + - key: + - foo + - bar + +### Indenting using `typ="safe"` + +The C based emitter doesn't have the fine control, distinguishing between +block mappings and sequences. Do only use the `pure` Python versions +of the dumper if you want to have that sort of control. + + +## Positioning ':' in top level mappings, prefixing ':' + +If you want your toplevel mappings to look like: + + library version: 1 + comment : | + this is just a first try + +then set `yaml.top_level_colon_align = True` (and `yaml.indent = 4`). +`True` causes calculation based on the longest key, but you can also +explicitly set a number. + +If you want an extra space between a mapping key and the colon specify +`yaml.prefix_colon = ' '`: + + - https://myurl/abc.tar.xz : 23445 + # ^ extra space here + - https://myurl/def.tar.xz : 944 + +If you combine `prefix_colon` with `top_level_colon_align`, the top +level mapping doesn\'t get the extra prefix. If you want that anyway, +specify `yaml.top_level_colon_align = 12` where `12` has to be an +integer that is one more than length of the widest key. + +### Document version support + +In YAML a document version can be explicitly set by using: + + %YAML 1.x + +before the document start (at the top or before a `---`). For +`ruamel.yaml` x has to be 1 or 2. If no explicit version is set [version +1.2](http://www.yaml.org/spec/1.2/spec.html) is assumed (which has been +released in 2009). + +The 1.2 version does **not** support: + +- sexagesimals like `12:34:56` +- octals that start with 0 only: like `012` for number 10 (`0o12` + **is** supported by YAML 1.2) +- Unquoted Yes and On as alternatives for True and No and Off for + False. + +If you cannot change your YAML files and you need them to load as 1.1 +you can load with `yaml.version = (1, 1)`, or the equivalent (version +can be a tuple, list or string) `yaml.version = "1.1"` + +*If you cannot change your code, stick with ruamel.yaml==0.10.23 and let +me know if it would help to be able to set an environment variable.* + +This does not affect dump as ruamel.yaml never emitted sexagesimals, nor +octal numbers, and emitted booleans always as true resp. false + +### Round trip including comments + +The major motivation for this fork is the round-trip capability for +comments. The integration of the sources was just an initial step to +make this easier. + +#### adding/replacing comments + +Starting with version 0.8, you can add/replace comments on block style +collections (mappings/sequences resuting in Python dict/list). The basic +for for this is: +--- !python | + from __future__ import print_function + + import sys + import ruamel.yaml + + yaml = ruamel.yaml.YAML() # defaults to round-trip + + inp = """\ + abc: + - a # comment 1 + xyz: + a: 1 # comment 2 + b: 2 + c: 3 + d: 4 + e: 5 + f: 6 # comment 3 + """ + + data = yaml.load(inp) + data['abc'].append('b') + data['abc'].yaml_add_eol_comment('comment 4', 1) # takes column of comment 1 + data['xyz'].yaml_add_eol_comment('comment 5', 'c') # takes column of comment 2 + data['xyz'].yaml_add_eol_comment('comment 6', 'e') # takes column of comment 3 + data['xyz'].yaml_add_eol_comment('comment 7\n\n# that\'s all folks', 'd', column=20) + + yaml.dump(data, sys.stdout) +--- !stdout | +Resulting in:: +--- !comment | + abc: + - a # comment 1 + - b # comment 4 + xyz: + a: 1 # comment 2 + b: 2 + c: 3 # comment 5 + d: 4 # comment 7 + e: 5 # comment 6 + f: 6 # comment 3 + +--- | +If the comment doesn\'t start with \'#\', this will be added. The key is +the element index for list, the actual key for dictionaries. As can be +seen from the example, the column to choose for a comment is derived +from the previous, next or preceding comment column (picking the first +one found). + +Make sure that the added comment is correct, in the sense that when it +contains newlines, the following is either an empty line or a line with +only spaces, or the first non-space is a `#`. + +# Config file formats + +There are only a few configuration file formats that are easily readable +and editable: JSON, INI/ConfigParser, YAML (XML is to cluttered to be +called easily readable). + +Unfortunately [JSON](http://www.json.org/) doesn\'t support comments, +and although there are some solutions with pre-processed filtering of +comments, there are no libraries that support round trip updating of +such commented files. + +INI files support comments, and the excellent +[ConfigObj](http://www.voidspace.org.uk/python/configobj.html) library +by Foord and Larosa even supports round trip editing with comment +preservation, nesting of sections and limited lists (within a value). +Retrieval of particular value format is explicit (and extensible). + +YAML has basic mapping and sequence structures as well as support for +ordered mappings and sets. It supports scalars various types including +dates and datetimes (missing in JSON). YAML has comments, but these are +normally thrown away. + +Block structured YAML is a clean and very human readable format. By +extending the Python YAML parser to support round trip preservation of +comments, it makes YAML a very good choice for configuration files that +are human readable and editable while at the same time interpretable and +modifiable by a program. + +# Extending + +There are normally six files involved when extending the roundtrip +capabilities: the reader, parser, composer and constructor to go from +YAML to Python and the resolver, representer, serializer and emitter to +go the other way. + +Extending involves keeping extra data around for the next process step, +eventuallly resulting in a different Python object (subclass or +alternative), that should behave like the original, but on the way from +Python to YAML generates the original (or at least something much +closer). + +# Smartening + +When you use round-tripping, then the complex data you get are already +subclasses of the built-in types. So you can patch in extra methods or +override existing ones. Some methods are already included and you can +do: + + yaml_str = """\ + a: + - b: + c: 42 + - d: + f: 196 + e: + g: 3.14 + """ + + + data = yaml.load(yaml_str) + + assert data.mlget(['a', 1, 'd', 'f'], list_ok=True) == 196 |