Coverage for src/debputy/lsp/vendoring/_deb822_repro/__init__.py: 100%

3 statements  

coverage.py v7.2.7, created at 2024-04-07 12:14 +0200

# The "from X import Y as Y" looks weird, but we are stuck in a fight
# between mypy and pylint in the CI.
#
# mypy --strict insists on one of the following for re-exporting:
#   1) Do a "from debian._deb822_repro.X import *"
#   2) Do a "from .X import Y"
#   3) Do a "from debian._deb822_repro.X import Y as Z"
#
# pylint on the CI fails on relative imports (it assumes "lib" is a
# part of the python package name in relative imports). This rules
# out 2) from the mypy list. The use of 1) would cause overlapping
# imports (and also it felt prudent to import only what was exported).
#
# This left 3) as the only option for now, which pylint then complains
# about (not unreasonably in general). Fortunately, we can disable
# that warning in this workaround. But once 2) becomes an option
# without pylint tripping over itself on the CI, it would be
# considerably better than this approach.
#

""" Round-trip safe dictionary-like interfaces to RFC822-like files

This module is a round-trip safe API for working with RFC822-like Debian data
formats. It is primarily aimed at files managed by humans, like debian/control.
While it is able to process any Deb822 file, you might find the debian.deb822
module better suited for larger files such as the `Packages` and `Sources`
files from the Debian archive due to reasons explained below.

Being round-trip safe means that this module will faithfully preserve the
original formatting, including whitespace and comments, from the input where
not modified. A concrete example::

    >>> from debian._deb822_repro import parse_deb822_file
    >>> example_deb822_paragraph = '''
    ... Package: foo
    ... # Field comment (because it comes just before a field)
    ... Section: main/devel
    ... Depends: libfoo,
    ... # Inline comment (associated with the next line)
    ...          libbar,
    ... '''
    >>> deb822_file = parse_deb822_file(example_deb822_paragraph.splitlines())
    >>> paragraph = next(iter(deb822_file))
    >>> paragraph['Section'] = 'devel'
    >>> output = deb822_file.dump()
    >>> output == example_deb822_paragraph.replace('Section: main/devel', 'Section: devel')
    True

This makes it particularly good for automated changes/corrections to files (partly)
maintained by humans.
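
To make the idea of round-trip safety concrete without relying on this module,
the following is a minimal standalone sketch. It is not how this module is
implemented: `replace_field_value` is a hypothetical helper, it only handles
single-line fields, and it does no real parsing. It changes one field value
and copies every other line, including comments, through unchanged:

```python
def replace_field_value(text, field, new_value):
    # Hypothetical helper for illustration only; NOT part of this module.
    # It rewrites the first single-line "Field: value" entry and copies
    # every other line (comments, continuation lines) through verbatim.
    prefix = field.lower() + ":"
    result = []
    replaced = False
    for line in text.splitlines(keepends=True):
        if not replaced and line.lower().startswith(prefix):
            # Keep the original field name and the trailing whitespace
            # (including the line ending) of the line being rewritten.
            line_ending = line[len(line.rstrip()):]
            line = line[:len(prefix)] + " " + new_value + line_ending
            replaced = True
        result.append(line)
    return "".join(result)


original = '''Package: foo
# Field comment (kept as-is)
Section: main/devel
Depends: libfoo,
# Inline comment (kept as-is)
         libbar,
'''
updated = replace_field_value(original, "Section", "devel")
```

Here `updated` differs from `original` only on the `Section` line. A real
round-trip safe parser has to give the same guarantee while also handling
multi-line values, multiple paragraphs and syntax errors, which is why it
needs richer data structures than a line-by-line copy.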

Compared to debian.deb822
-------------------------

The round-trip safe API is primarily useful when your program is editing files
and the file in question is likely to be hand-edited or formatted directly by
human maintainers. This includes files like debian/control and
debian/copyright using the "DEP-5" format.

The round-trip safe API also supports parsing and working with invalid files.
This enables programs to work on a file that was left with an error, in an
attempt to correct the error (or ignore it).

On the flip side, the debian.deb822 module generally uses less memory than the
round-trip safe API. In some cases, it will also be faster because its
internal data structures are simpler. Accordingly, when you are doing
read-only work and/or working with large files such as the Packages or Sources
files from the Debian archive, the round-trip safe API either provides no
advantages or its trade-offs might show up in performance statistics.

The memory and runtime performance difference should generally be constant for
valid files but not necessarily a small one. For invalid files, some operations
can degrade in runtime performance in particular cases (memory performance for
invalid files is comparable to that of valid files).

Converting from debian.deb822
=============================

The following is a short example of how to migrate from debian.deb822 to
the round-trip safe API. Given the following source text::

    >>> dctrl_input = b'''
    ... Source: foo
    ... Build-Depends: debhelper-compat (= 13)
    ...
    ... Package: bar
    ... Architecture: any
    ... Depends: ${misc:Depends},
    ...          ${shlibs:Depends},
    ... Description: provides some exciting feature
    ...  yada yada yada
    ...  .
    ...  more deskription with a misspelling
    ... '''.lstrip()  # To remove the leading newline
    >>> # A few definitions to emulate file I/O (would be different in the program)
    >>> import contextlib, os
    >>> @contextlib.contextmanager
    ... def open_input():
    ...     # Works with and without keepends=True.
    ...     # Keep the ends here to truly emulate an open file.
    ...     yield dctrl_input.splitlines(keepends=True)
    >>> def open_output():
    ...     return open(os.devnull, 'wb')

With debian.deb822, your code might look like this::

    >>> from debian.deb822 import Deb822
    >>> with open_input() as in_fd, open_output() as out_fd:
    ...     for paragraph in Deb822.iter_paragraphs(in_fd):
    ...         if 'Description' not in paragraph:
    ...             continue
    ...         description = paragraph['Description']
    ...         # Fix typo
    ...         paragraph['Description'] = description.replace('deskription', 'description')
    ...         paragraph.dump(out_fd)

With the round-trip safe API, the rewrite would look like this::

    >>> from debian._deb822_repro import parse_deb822_file
    >>> with open_input() as in_fd, open_output() as out_fd:
    ...     parsed_file = parse_deb822_file(in_fd)
    ...     for paragraph in parsed_file:
    ...         if 'Description' not in paragraph:
    ...             continue
    ...         description = paragraph['Description']
    ...         # Fix typo
    ...         paragraph['Description'] = description.replace('deskription', 'description')
    ...     parsed_file.dump(out_fd)

Key changes are:

  1. Imports are different.
  2. Deb822.iter_paragraphs is replaced by parse_deb822_file and a reference to
     its return value is kept for later.
  3. Instead of dumping paragraphs one by one, the return value from
     parse_deb822_file is dumped at the end.

     - The round-trip safe API does support dumping "per-paragraph", but
       formatting and comments between paragraphs would be lost in the
       output. This may be an acceptable trade-off or desired for some cases.

Note that the round-trip safe API does not accept all the same parameters as the
debian.deb822 module does. Often this is because the feature is not relevant for
the round-trip safe API (e.g., python-apt cannot be used as it discards comments)
or is obsolete in the debian.deb822 module and therefore omitted.

For list-based fields, you may want to have a look at the
Deb822ParagraphElement.as_interpreted_dict_view method.

Stability of this API
---------------------

The API is subject to change based on feedback from early adopters and beta
testers. That said, the code for valid files is unlikely to change in
a backwards-incompatible way.

Things that might change in an incompatible way include:
 * Whether invalid files are accepted (parsed without errors) by default.
   (Currently they are.)
 * How invalid files are parsed. As an example, currently a syntax error acts
   as a paragraph separator. Whether it should is open to debate.

"""


# pylint: disable=useless-import-alias
from .parsing import (
    parse_deb822_file as parse_deb822_file,
    LIST_SPACE_SEPARATED_INTERPRETATION as LIST_SPACE_SEPARATED_INTERPRETATION,
    LIST_COMMA_SEPARATED_INTERPRETATION as LIST_COMMA_SEPARATED_INTERPRETATION,
    Interpretation as Interpretation,
    # Primarily for documentation purposes / help()
    Deb822FileElement as Deb822FileElement,
    Deb822NoDuplicateFieldsParagraphElement,
    Deb822ParagraphElement as Deb822ParagraphElement,
)
from .types import (
    AmbiguousDeb822FieldKeyError as AmbiguousDeb822FieldKeyError,
    SyntaxOrParseError,
)

__all__ = [
    "parse_deb822_file",
    "AmbiguousDeb822FieldKeyError",
    "LIST_SPACE_SEPARATED_INTERPRETATION",
    "LIST_COMMA_SEPARATED_INTERPRETATION",
    "Interpretation",
    "Deb822FileElement",
    "Deb822NoDuplicateFieldsParagraphElement",
    "Deb822ParagraphElement",
    "SyntaxOrParseError",
]