=encoding utf-8
=head1 NAME
Lintian::Tutorial::WritingChecks -- Writing checks for Lintian
=head1 SYNOPSIS
Warning: This tutorial may be outdated.
This tutorial will quickly guide you through the basics of writing a
Lintian check. Most of the work is in writing the two files:
    checks/<my-check>.pm
    checks/<my-check>.desc
and then either adding a Lintian profile or extending an existing
one.
=head1 DESCRIPTION
The basics of writing a check are outlined in the Lintian User Manual
(§3.3). This tutorial will focus on the act of writing the actual
check. In this tutorial, we will assume the name of the check to be
written is "deb/pkg-check".
The tutorial will work with a "binary" and "udeb" check. Checking
source packages works in a similar fashion.
=head2 Create a check I<.desc> file
As mentioned, this tutorial will focus on the writing of a check.
Please see the Lintian User Manual (§3.3) for how to do this part.
=head2 Create the Perl check module
Start with the template:
    # deb/pkg-check is loaded as Lintian::deb::pkg_check
    # - See Lintian User Manual §3.3 for more info
    package Lintian::deb::pkg_check;
    use strict;
    use warnings;
    sub run {
        my ($pkg, $type, $info, $proc, $group) = @_;
        return;
    }
The snippet above is a simple valid check that does "nothing at all".
We will extend it in just a moment, but first let us have a look at
the setup and the arguments.
The I<run> sub is the entry point of our "deb/pkg-check" check; it
will be invoked once per package it should process. In our case, that
will be once per "binary" (.deb) and once per udeb package processed.
It is given 5 arguments (in the future, possibly more), which are:
=over 4
=item $pkg - The name of the package being processed.
(Same as $proc->pkg_name)
=item $type - The type of the package being processed.
At the moment, $type is one of "binary" (.deb), "udeb", "source"
(.dsc) or "changes". This argument is mostly useful if certain checks
do not apply equally to all package types being processed.
Generally it is advisable to check only binaries ("binary" and
"udeb"), sources or changes in a given check. But in rare cases, it
makes sense to lump multiple types together in the same check and this
argument helps you do that.
(Currently it is always identical to $proc->pkg_type)
=item $info - Accessor to the data Lintian has extracted
Basically all information you want about a given package comes from
the $info object. It is sometimes referred to as the "info object" or
as (an instance of) L<Lintian::Collect>.
This object (together with a properly set Needs-Info in the I<.desc>
file) will grant you access to all of the data Lintian has extracted
about this package.
Based on the value of the $type argument, it will be one of
L<Lintian::Collect::Binary>, L<Lintian::Collect::Changes> or
L<Lintian::Collect::Source>.
(Currently it is the same as $proc->info)
=item $proc - Basic metadata about the package
This is an instance of L<Lintian::Processable> and is useful for
trivially obtaining very basic package metadata. In particular, the
name and version of the source package are readily available through
this object.
=item $group - Group of processables from the same source
If you want to do a cross-check between different packages built from
the same source, $group helps you access those other packages
(if they are available).
This is an instance of L<Lintian::ProcessableGroup>.
=back
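As a minimal sketch of how the $type argument can be put to use, the
template could single out udebs as shown below. The helper name
I<_is_udeb> is an assumption made for this illustration and is not part
of Lintian.

```perl
package Lintian::deb::pkg_check;

use strict;
use warnings;

# Hypothetical helper (not part of Lintian): true if the package being
# processed is a udeb.
sub _is_udeb {
    my ($type) = @_;
    return $type eq 'udeb' ? 1 : 0;
}

sub run {
    my ($pkg, $type, $info, $proc, $group) = @_;

    if (_is_udeb($type)) {
        # ... logic that only applies to udebs ...
    } else {
        # ... logic that only applies to regular binary packages ...
    }
    return;
}

1;
```

Branching inside one check like this keeps rules that are shared between
the two package types in a single place.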
Now back to the coding.
=head2 Accessing fields
Let's do a slightly harder example. Assume we wanted to emit a tag for
all packages without a (valid) Multi-Arch field. This requires us to
A) identify if the package has a Multi-Arch field and B) identify if
the content of the field was valid.
Starting from the top. All $info objects have a method called field,
which gives you access to a (raw) field from the control file of the
package. It returns C<undef> if said field is not present or the
content of said field otherwise. Note that field names must be given
in all lowercase letters (i.e. use "multi-arch", not "Multi-Arch").
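A minimal sketch of this first half, assuming only the "field" method
described above. The sub name I<missing_multi_arch> is hypothetical, and
the sketch returns a truth value rather than emitting a tag so that it
does not depend on how tags are emitted.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the package has no Multi-Arch field.
# $info is expected to provide the "field" method described above.
sub missing_multi_arch {
    my ($info) = @_;

    # Field names are given in all lowercase letters.
    my $multiarch = $info->field('multi-arch');

    # "field" returns undef when the field is not present.
    return defined($multiarch) ? 0 : 1;
}
```

Inside I<run>, the result could then decide whether a tag for the
missing field should be emitted.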
This was the first half. Let's look at checking the value. The
Multi-Arch field can (currently) be one of "no", "same", "foreign" or
"allowed".
One way of checking this would be using the regex:
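The pattern itself is not reproduced here. As a minimal sketch, assuming
the four values listed above, such a check might look as follows. The
sub name I<is_valid_multi_arch> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $value is one of the Multi-Arch values
# listed in the text above.
sub is_valid_multi_arch {
    my ($value) = @_;
    return 0 unless defined($value);

    # The alternation is grouped so that the anchors apply to every
    # alternative and not just to the first and the last one.
    return $value =~ m/\A(?:no|same|foreign|allowed)\z/ ? 1 : 0;
}
```

Without the grouping, a pattern such as C<^no|same|foreign|allowed$>
would also accept a value like "nonsense", because "^no" alone is enough
to match.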
Notice that Lintian automatically strips leading and trailing spaces
on the I<first> line in a field. It also strips trailing spaces from
all other lines, but leading spaces and the " ."-continuation markers
are kept as is.
=head2 Checking dependencies
Lintian can do some checking of dependencies. In most cases it works
similarly to a normal dependency check, but keep in mind that Lintian
uses I<pure> logic to determine if dependencies are satisfied (i.e. it
will not look up relations like Provides for you).
Suppose you wanted all packages with a multi-arch "same" field to
pre-depend on the package "multiarch-support". Well, we could use the
L<< $info->relation|Lintian::Collect::Binary/relation (FIELD) >> method for
this.
$info->relation returns an instance of L<Lintian::Relation>. This
object has an "implies" method that can be used to check if a package
has an explicit dependency. Note that "implies" actually checks if
one relation "implies" another (i.e. if you satisfied relationA then
you definitely also satisfied relationB).
As with the "field"-method, field names have to be given in all
lowercase. However "relation" will never return C<undef> (not even if the
field is missing).
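Putting the pieces of this section together, a minimal sketch of the
"multiarch-support" test could look like this. It assumes only the
"field", "relation" and "implies" methods described above; the sub name
I<lacks_multiarch_support> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the package declares "Multi-Arch: same"
# but has no explicit pre-dependency on multiarch-support.
sub lacks_multiarch_support {
    my ($info) = @_;

    my $multiarch = $info->field('multi-arch');
    return 0 unless defined($multiarch) && $multiarch eq 'same';

    # "relation" never returns undef, so it is safe to call "implies"
    # directly, even if the Pre-Depends field is missing.
    my $predepends = $info->relation('pre-depends');
    return $predepends->implies('multiarch-support') ? 0 : 1;
}
```

Since "implies" is purely logical, a package that only reaches
"multiarch-support" through a Provides would still be reported.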
=head2 Using static data files
Currently our check mixes data and code. Namely all the valid values
for the Multi-Arch field are currently hard-coded in our check. We can
move those out of the check by using a data file.
Lintian natively supports data files that are either "sets" or
"tables" via L<Lintian::Data> (i.e. "unordered" collections). As an
added bonus, L<Lintian::Data> transparently supports vendor specific
data files for us.
First we need to make a data file containing the values. It could
look like this:
    # A table of all the valid values for the multi-arch field.
    no
    same
    foreign
    allowed
This can then be stored in the data directory as
I<data/deb/pkg-check/multiarch-values>.
Now we can load it by using:
    use Lintian::Data;
    my $VALID_MULTI_ARCH_VALUES =
        Lintian::Data->new('deb/pkg-check/multiarch-values');
Actually, this is not quite true. L<Lintian::Data> is lazy, so it
will not load anything before we force it to do so. Most of the time
this is just an added bonus. However, if you ever have to force it to
load something immediately, you can do so by invoking its "known"
method with an arbitrary defined string and ignoring the result.
Data files work with 3 access methods, "all", "known" and "value".
=over 4
=item all
"all" (i.e. $data->all) returns a list of all the entries in the data
file (for key/value tables, all returns the keys). The list is not
sorted in any order (not even input order).
=item known
"known" (i.e. $data->known('item')) returns a truth value if a given
item or key is known (present) in the data set or table. For key/value
tables, the value associated with the key can be retrieved with
"value" (see below).
=item value
"value" (i.e. $data->value('key')) returns a value associated with a
key for key/value tables. For unknown keys, it returns C<undef>. If
the data file is not a key/value table but just a set, value returns
a truth value for known keys.
=back
While we could use either "value" or "known", we will use the latter
for readability (and to remind ourselves that this is a data set and
not a data table).
Basically we will be replacing:
    unless exists $VALID_MULTI_ARCH_VALUES{$multiarch};
with
    unless $VALID_MULTI_ARCH_VALUES->known($multiarch);
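To try out the three access methods without a data directory at hand, a
small in-memory stand-in can be handy. The sketch below is an
illustration only: I<DemoData> and I<is_valid_multi_arch_value> are
hypothetical names, and the class is B<not> L<Lintian::Data> (it does no
lazy loading and reads no files). It merely mimics "all", "known" and
"value" as described above.

```perl
use strict;
use warnings;

package DemoData;    # hypothetical stand-in, NOT Lintian::Data

sub new {
    my ($class, %entries) = @_;
    return bless { entries => {%entries} }, $class;
}

# All entries (keys), in no particular order.
sub all { my ($self) = @_; return keys %{ $self->{entries} }; }

# Truth value if the item or key is present.
sub known {
    my ($self, $key) = @_;
    return exists $self->{entries}{$key} ? 1 : 0;
}

# Associated value, or undef for unknown keys.
sub value { my ($self, $key) = @_; return $self->{entries}{$key}; }

package main;

# A "set": every entry simply maps to a true value.
my $VALID_MULTI_ARCH_VALUES
  = DemoData->new(map { ($_ => 1) } qw(no same foreign allowed));

sub is_valid_multi_arch_value {
    my ($multiarch) = @_;
    return $VALID_MULTI_ARCH_VALUES->known($multiarch);
}
```

Because the check only calls "known", swapping the stand-in for a real
L<Lintian::Data> object should not require any change to the check
itself.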
=head2 Accessing contents of the package
Another heavily used mechanism is to check for the presence (or absence)
of a given file. Generally this is what the
L<< $info->index|Lintian::Collect::Package/index (FILE) >> and
L<< $info->sorted_index|Lintian::Collect::Package/sorted_index >> methods
are for. The "index" method returns instances of L<Lintian::Path>,
which has a number of utility methods.
If you want to loop over all files in a package, the "sorted_index"
method will do this for you. If you are looking for a specific file (or
directory), a call to "index" will be much faster. For the contents of
a specific directory, you can use something like:
    if (my $dir = $info->index('path/to/dir/')) {
        foreach my $elem ($dir->children) {
            print $elem->name . " is a file" if $elem->is_file;
            # ...
        }
    }
Keep in mind that using the "index" or "sorted_index" method will
require that you put "unpacked" in Needs-Info. See L</Keeping Needs-Info
up to date>.
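For the "loop over all files" case, a minimal sketch could look like
the one below. It assumes that "sorted_index" returns a list whose
elements are either L<Lintian::Path> objects or plain file names; which
of the two applies depends on the Lintian version, so the sketch accepts
both. The sub name I<regular_files_under> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: names of all regular files in the package whose
# name starts with $prefix (e.g. "usr/share/doc/").
sub regular_files_under {
    my ($info, $prefix) = @_;
    my @names;

    foreach my $item ($info->sorted_index) {
        # Assumption: $item is a path object or a plain name. A plain
        # name is turned into a path object via "index".
        my $entry = ref($item) ? $item : $info->index($item);
        next unless $entry && $entry->is_file;

        my $name = $entry->name;
        push(@names, $name) if index($name, $prefix) == 0;
    }
    return @names;
}
```

For a single known directory, the "index" and "children" approach shown
above avoids walking the whole package.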
There is also a pair of methods for accessing the control files of a
binary package. These are
L<< $info->control_index|Lintian::Collect::Package/control_index (FILE) >> and
L<< $info->sorted_control_index|Lintian::Collect::Package/sorted_control_index >>.
=head3 Accessing contents of a file in a package
When you actually want to see the contents of a file, you can use
L<open|Lintian::Path/open> (or L<open_gz|Lintian::Path/open_gz>) on
an object returned by e.g.
L<< $info->index|Lintian::Collect::Package/index (FILE) >>. These
methods will open the underlying file for reading (the latter
applying a gzip decompression).
However, please do assert that the file is safe to read by calling
L<is_open_ok|Lintian::Path/is_open_ok> first. Generally, it will
only be true for files or safely resolvable symlinks pointing to
files. Should you attempt to open a path that does not satisfy
those criteria, L<Lintian::Path> will raise a trappable error at
runtime.
Alternatively, if you need to access the underlying file system
object, you can use the L<fs_path|Lintian::Path/fs_path> method.
Usually, you will want to test either
L<is_open_ok|Lintian::Path/is_open_ok> or
L<is_valid_path|Lintian::Path/is_valid_path> first to ensure you do
not follow unsafe symlinks. The "is_open_ok" check will also assert
that the path is not (e.g.) a named pipe.
Should you call L<fs_path|Lintian::Path/fs_path> on a symlink that
escapes the package root, the method will throw a trappable error at
runtime. Once the path is returned, there are no more built-in
fail-safes. When you use the returned path, keep things like symlinks
to "../../../../../etc/passwd" and named pipes (FIFOs) in mind.
In some cases, you may even need to access the file system objects
I<without> using L<Lintian::Path>. This is, of course, discouraged
and suffers from the same issues as above (all checking must be done
manually by you). Here you have to use the "unpacked", "debfiles" or
"control" methods from L<Lintian::Collect> or its subclasses.
The following snippet may be useful for testing that a given path does
not escape the root.
    use Lintian::Util qw(is_ancestor_of);
    my $path = ...;
    # The snippet applies equally well to $info->debfiles and
    # $info->control (just remember to subst all occurrences of
    # $info->unpacked).
    my $unpacked_file = $info->unpacked($path);
    if ( -f $unpacked_file && is_ancestor_of($info->unpacked, $unpacked_file)) {
        # a file and contained within the package root.
    } else {
        # not a file or an unsafe path
    }
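To illustrate what such a containment test has to guard against, here
is a stand-alone sketch using only core Perl modules. It is B<not> the
implementation of is_ancestor_of; the sub name I<demo_is_within_root>
is hypothetical and the approach (resolving both paths with
C<Cwd::realpath> and comparing prefixes) is an assumption made for this
illustration.

```perl
use strict;
use warnings;
use Cwd qw(realpath);

# Hypothetical helper: true if $path, after resolving all symlinks,
# is $root itself or lives somewhere below $root.
sub demo_is_within_root {
    my ($root, $path) = @_;

    my $real_root = realpath($root);
    my $real_path = realpath($path);
    return 0 unless defined($real_root) && defined($real_path);
    return 1 if $real_path eq $real_root;

    # Compare against "root/" so that "/tmp/rootfoo" is not mistaken
    # for something inside "/tmp/root".
    $real_root .= '/' unless $real_root =~ m{/\z};
    return index($real_path, $real_root) == 0 ? 1 : 0;
}
```

A symlink inside the root that points to a file outside of it resolves
to the outside location and is therefore rejected, even though its own
name lies within the root.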
=head2 Keeping Needs-Info up to date
Keeping the "Needs-Info" field of your I<.desc> file up to date is a
bit of manual work. In the API description for each method there will
generally be a line looking something like:
    Needs-Info requirements for using methodx: Y
This means that methodx requires Y to work. Here Y is a
comma-separated list and each element of Y falls into one of 3 cases.
=over 4
=item * The element is the word I<none>
In this case, the method has no "external" requirements and can be
used without any changes to your Needs-Info. The "field" method
is an example of this.
This only makes sense if it is the only element in the list.
=item * The element is a link to a method
In this case, the method uses another method to do its job. An example
is the
L<sorted_control_index|Lintian::Collect::Binary/sorted_control_index>
method, which uses the
L<control_index|Lintian::Collect::Binary/control_index (FILE)>
method. So using I<sorted_control_index> has the same requirements as
using I<control_index>.
=item * The element is the name of a collection (e.g. "bin-pkg-control").
In this case, the method needs the given collection to be run. So to
use (e.g.) L<control_index|Lintian::Collect::Binary/control_index (FILE)>,
you have to put "bin-pkg-control" in your Needs-Info.
=back
CAVEAT: Methods can have different requirements based on the type of
package! An example of this is "changelog", which requires
"changelog-file" in binary packages and "Same as debfiles" in source
packages.
=head2 Avoiding security issues
Over the years a couple of security issues have been discovered in
Lintian. The problem is that people can in theory create some really nasty
packages. Please keep the following in mind when writing a check:
=over 4
=item * Avoid 2-arg open, system/exec($shellcmd), `$shellcmd` like the
plague.
When you get any one of those wrong you introduce "arbitrary code
execution" vulnerabilities (we learned this the hard way via
CVE-2009-4014).
Usually 3-arg open and the non-shell variant of system/exec are
enough. When you actually need a shell pipeline, consider using
L<Lintian::Command>. It also provides a I<safe_qx> command to assist
with capturing stdout as an alternative to `$cmd` (or qx/$cmd/).
=item * Do not trust field values.
This is especially true if you intend to use the value as part of a
file name. Verify that the field contains what you expect before you use
it.
=item * Use L<Lintian::Path> (or, failing that, is_ancestor_of)
You might be tempted to think that the following code is safe:
    use autodie;
    my $filename = 'some/file';
    my $ufile = $info->unpacked($filename);
    if ( ! -l $ufile) {
        # Looks safe, but isn't in general
        open(my $fd, '<', $ufile);
        ...;
    }
This is definitely unsafe if "$filename" contains at least one
directory segment. So, if in doubt, use
L<is_ancestor_of|Lintian::Util/is_ancestor_of(PARENTDIR, PATH)> to
verify that the requested file is indeed the file you think it is. A
better version of the above would be:
    use autodie;
    use Lintian::Util qw(is_ancestor_of);
    [...]
    my $filename = 'some/file';
    my $ufile = $info->unpacked($filename);
    if ( ! -l $ufile && -f $ufile && is_ancestor_of($info->unpacked, $ufile)) {
        # $ufile is a file and it is contained within the package root.
        open(my $fd, '<', $ufile);
        ...;
    }
In some cases you can even drop the "! -l $ufile" part.
Of course, it is much easier to use the L<Lintian::Path> object
(whenever possible).
    my $filename = 'some/file';
    my $ufile = $info->index($filename);
    if ( $ufile && $ufile->is_file && $ufile->is_open_ok) {
        my $fd = $ufile->open;
        ...;
    }
Here you can drop the " && $ufile->is_file" if you want to permit
safe symlinks.
For more information on the is_ancestor_of check, see
L<is_ancestor_of|Lintian::Util/is_ancestor_of(PARENTDIR, PATH)>.
=back
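The first two points of the list above can be sketched with core Perl
alone. Both sub names are hypothetical, and the set of characters
accepted by I<is_safe_file_component> is an assumption made for this
illustration; a real check should accept exactly what the field in
question is allowed to contain.

```perl
use strict;
use warnings;

# Hypothetical whitelist: lowercase letters, digits, '+', '-' and '.',
# starting with a letter or digit. No '/' can appear, so the value can
# never name a file in another directory.
sub is_safe_file_component {
    my ($value) = @_;
    return 0 unless defined($value);
    return $value =~ m/\A[a-z0-9][a-z0-9+.\-]*\z/ ? 1 : 0;
}

# The non-shell variant of system: the arguments are handed to the
# program as they are and no shell ever gets to interpret them.
# Returns true if the command exited with status 0.
sub run_without_shell {
    my (@cmd) = @_;
    my $status = system { $cmd[0] } @cmd;
    return $status == 0 ? 1 : 0;
}
```

With the block form of system, a field value such as "foo; rm -rf ~"
reaches the program as one literal argument instead of being split and
executed by a shell.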
=head1 SEE ALSO
L<Lintian::Tutorial::WritingTests>, L<Lintian::Tutorial::TestSuite>
=cut