Coverage for src/debputy/lsp/vendoring/_deb822_repro/__init__.py: 100%
3 statements
« prev ^ index » next coverage.py v7.2.7, created at 2024-04-07 12:14 +0200
« prev ^ index » next coverage.py v7.2.7, created at 2024-04-07 12:14 +0200
1# The "from X import Y as Y" looks weird, but we are stuck in a fight
2# between mypy and pylint in the CI.
3#
4# mypy --strict insists on either of following for re-exporting
5# 1) Do a "from debian._deb822_repro.X import *"
6# 2) Do a "from .X import Y"
7# 3) Do a "from debian._deb822_repro.X import Y as Z"
8#
9# pylint on the CI fails on relative imports (it assumes "lib" is a
10# part of the python package name in relative imports). This rules
11# out 2) from the mypy list. The use of 1) would cause overlapping
12# imports (and also it felt prudent to import only what was exported).
13#
14# This left 3) as the only option for now, which pylint then complains
15# about (not unreasonably in general). Unfortunately, we can disable
16# that warning in this work around. But once 2) becomes an option
17# without pylint tripping over itself on the CI, then it considerably
18# better than this approach.
19#
21""" Round-trip safe dictionary-like interfaces to RFC822-like files
23This module is a round-trip safe API for working with RFC822-like Debian data
24formats. It is primarily aimed files managed by humans, like debian/control.
25While it is be able to process any Deb822 file, you might find the debian.deb822
26module better suited for larger files such as the `Packages` and `Sources`
27from the Debian archive due to reasons explained below.
29Being round-trip safe means that this module will faithfully preserve the original
30formatting including whitespace and comments from the input where not modified.
31A concrete example::
33 >>> from debian._deb822_repro import parse_deb822_file
34 >>> example_deb822_paragraph = '''
35 ... Package: foo
36 ... # Field comment (because it becomes just before a field)
37 ... Section: main/devel
38 ... Depends: libfoo,
39 ... # Inline comment (associated with the next line)
40 ... libbar,
41 ... '''
42 >>> deb822_file = parse_deb822_file(example_deb822_paragraph.splitlines())
43 >>> paragraph = next(iter(deb822_file))
44 >>> paragraph['Section'] = 'devel'
45 >>> output = deb822_file.dump()
46 >>> output == example_deb822_paragraph.replace('Section: main/devel', 'Section: devel')
47 True
49This makes it particularly good for automated changes/corrections to files (partly)
50maintained by humans.
52Compared to debian.deb822
53-------------------------
55The round-trip safe API is primarily useful when your program is editing files
56and the file in question is (likely) to be hand-edited or formatted directly by
57human maintainers. This includes files like debian/control and the
58debian/copyright using the "DEP-5" format.
60The round-trip safe API also supports parsing and working with invalid files.
61This enables programs to work on the file in cases where the file was a left
62with an error in an attempt to correct it (or ignore it).
64On the flip side, the debian.deb822 module generally uses less memory than the
65round trip safe API. In some cases, it will also have faster data structures
66because its internal data structures are simpler. Accordingly, when you are doing
67read-only work or/and working with large files a la the Packages or Sources
68files from the Debian archive, then the round-trip safe API either provides no
69advantages or its trade-offs might show up in performance statistics.
71The memory and runtime performance difference should generally be constant for
72valid files but not necessarily a small one. For invalid files, some operations
73can degrade in runtime performance in particular cases (memory performance for
74invalid files are comparable to that of valid files).
76Converting from debian.deb822
77=============================
79The following is a short example for how to migrate from debian.deb822 to
80the round-trip safe API. Given the following source text::
82 >>> dctrl_input = b'''
83 ... Source: foo
84 ... Build-Depends: debhelper-compat (= 13)
85 ...
86 ... Package: bar
87 ... Architecture: any
88 ... Depends: ${misc:Depends},
89 ... ${shlibs:Depends},
90 ... Description: provides some exciting feature
91 ... yada yada yada
92 ... .
93 ... more deskription with a misspelling
94 ... '''.lstrip() # To remove the leading newline
95 >>> # A few definitions to emulate file I/O (would be different in the program)
96 >>> import contextlib, os
97 >>> @contextlib.contextmanager
98 ... def open_input():
99 ... # Works with and without keepends=True.
100 ... # Keep the ends here to truly emulate an open file.
101 ... yield dctrl_input.splitlines(keepends=True)
102 >>> def open_output():
103 ... return open(os.devnull, 'wb')
105With debian.deb822, your code might look like this::
107 >>> from debian.deb822 import Deb822
108 >>> with open_input() as in_fd, open_output() as out_fd:
109 ... for paragraph in Deb822.iter_paragraphs(in_fd):
110 ... if 'Description' not in paragraph:
111 ... continue
112 ... description = paragraph['Description']
113 ... # Fix typo
114 ... paragraph['Description'] = description.replace('deskription', 'description')
115 ... paragraph.dump(out_fd)
117With the round-trip safe API, the rewrite would look like this::
119 >>> from debian._deb822_repro import parse_deb822_file
120 >>> with open_input() as in_fd, open_output() as out_fd:
121 ... parsed_file = parse_deb822_file(in_fd)
122 ... for paragraph in parsed_file:
123 ... if 'Description' not in paragraph:
124 ... continue
125 ... description = paragraph['Description']
126 ... # Fix typo
127 ... paragraph['Description'] = description.replace('deskription', 'description')
128 ... parsed_file.dump(out_fd)
130Key changes are:
132 1. Imports are different.
133 2. Deb822.iter_paragraphs is replaced by parse_deb822_file and a reference to
134 its return value is kept for later.
135 3. Instead of dumping paragraphs one by one, the return value from
136 parse_deb822_file is dumped at the end.
138 - The round-trip safe api does support "per-paragraph" but formatting
139 and comments between paragraphs would be lost in the output. This may
140 be an acceptable tradeoff or desired for some cases.
142Note that the round trip safe API does not accept all the same parameters as the
143debian.deb822 module does. Often this is because the feature is not relevant for
144the round-trip safe API (e.g., python-apt cannot be used as it discard comments)
145or is obsolete in the debian.deb822 module and therefore omitted.
147For list based fields, you may want to have a look at the
148Deb822ParagraphElement.as_interpreted_dict_view method.
150Stability of this API
151---------------------
153The API is subject to change based on feedback from early adopters and beta
154testers. That said, the code for valid files is unlikely to change in
155a backwards incompatible way.
157Things that might change in an incompatible way include:
158 * Whether invalid files are accepted (parsed without errors) by default.
159 (currently they are)
160 * How invalid files are parsed. As an example, currently a syntax error acts
161 as a paragraph separator. Whether it should is open to debate.
163"""
165# pylint: disable=useless-import-alias
166from .parsing import (
167 parse_deb822_file as parse_deb822_file,
168 LIST_SPACE_SEPARATED_INTERPRETATION as LIST_SPACE_SEPARATED_INTERPRETATION,
169 LIST_COMMA_SEPARATED_INTERPRETATION as LIST_COMMA_SEPARATED_INTERPRETATION,
170 Interpretation as Interpretation,
171 # Primarily for documentation purposes / help()
172 Deb822FileElement as Deb822FileElement,
173 Deb822NoDuplicateFieldsParagraphElement,
174 Deb822ParagraphElement as Deb822ParagraphElement,
175)
176from .types import (
177 AmbiguousDeb822FieldKeyError as AmbiguousDeb822FieldKeyError,
178 SyntaxOrParseError,
179)
181__all__ = [
182 "parse_deb822_file",
183 "AmbiguousDeb822FieldKeyError",
184 "LIST_SPACE_SEPARATED_INTERPRETATION",
185 "LIST_COMMA_SEPARATED_INTERPRETATION",
186 "Interpretation",
187 "Deb822FileElement",
188 "Deb822NoDuplicateFieldsParagraphElement",
189 "Deb822ParagraphElement",
190 "SyntaxOrParseError",
191]