# dpkg manual page - deb-src-symbols(5)
#
# Copyright © 2007-2011 Raphaël Hertzog <hertzog@debian.org>
# Copyright © 2009-2010 Modestas Vainius <modestas@vainius.eu>
# Copyright © 2012-2015 Guillem Jover <guillem@debian.org>
#
# This is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <https://www.gnu.org/licenses/>.

=encoding utf8

=head1 NAME

deb-src-symbols - Debian's extended shared library template file

=head1 SYNOPSIS

B<debian/>I<package>B<.symbols.>I<arch>,
B<debian/symbols.>I<arch>,
B<debian/>I<package>B<.symbols>,
B<debian/symbols>

=head1 DESCRIPTION

The symbol file templates are shipped in Debian source packages, and their
format is a superset of the symbols files shipped in binary packages,
see L<deb-symbols(5)>.

=head2 Comments

Comments are supported in template symbol files. Any line with ‘#’ as
the first character is a comment except if it starts with ‘#include’
(see section B<Using includes>).
Lines starting with ‘#MISSING:’ are special comments documenting
symbols that have disappeared.
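
The line classes described above can be summarised in a minimal sketch.
The function name C<classify> and its return labels are illustrative
assumptions, not part of B<dpkg-gensymbols>, and tagged include directives
(which start with an opening bracket) are not handled here.

```python
def classify(line):
    """Classify one template line by its first characters (sketch only)."""
    if line.startswith("#include"):
        # An include directive, not a comment.
        return "include"
    if line.startswith("#MISSING:"):
        # A special comment documenting a symbol that has disappeared.
        return "missing"
    if line.startswith("#"):
        # Any other line with '#' as its first character is a comment.
        return "comment"
    return "entry"
```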

=head2 Using #PACKAGE# substitution

In some rare cases, the name of the library varies between architectures.
To avoid hardcoding the name of the package in the symbols file, you can
use the marker I<#PACKAGE#>. It will be replaced by the real package
name during installation of the symbols files. Contrary to the
I<#MINVER#> marker, I<#PACKAGE#> will never appear in a symbols file
inside a binary package.
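
As a rough illustration, the substitution amounts to a plain text
replacement. The function name and the package name C<libfoo1> below are
hypothetical; the sketch only shows that I<#PACKAGE#> is resolved while
I<#MINVER#> is left in place.

```python
def substitute_package(template_text, package):
    """Replace the #PACKAGE# marker with the real package name (sketch)."""
    return template_text.replace("#PACKAGE#", package)


# Hypothetical header line from a template; #MINVER# is left untouched.
header = "libfoo.so.1 #PACKAGE# #MINVER#"
resolved = substitute_package(header, "libfoo1")
```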

=head2 Using symbol tags

Symbol tagging is useful for marking symbols that are special in some way.  Any
symbol can have an arbitrary number of tags associated with it. While all tags are
parsed and stored, only some of them are understood by
B<dpkg-gensymbols> and trigger special handling of the symbols. See
subsection B<Standard symbol tags> for reference of these tags.

Tag specification comes right before the symbol name (no whitespace is allowed
in between). It always starts with an opening bracket B<(>, ends with a
closing bracket B<)> and must contain at least one tag. Multiple tags are
separated by the B<|> character. Each tag can optionally have a value which
is separated from the tag name by the B<=> character. Tag names and values
can be arbitrary strings except they cannot contain any of the special B<)>
B<|> B<=> characters. Symbol names following a tag specification can
optionally be quoted with either B<'> or B<"> characters to allow
whitespace in them. However, if there are no tags specified for the symbol,
quotes are treated as part of the symbol name which continues up until the
first space.

  (tag1=i am marked|tag name with space)"tagged quoted symbol"@Base 1.0
  (optional)tagged_unquoted_symbol@Base 1.0 1
  untagged_symbol@Base 1.0

The first symbol in the example is named I<tagged quoted symbol> and has two
tags: I<tag1> with value I<i am marked> and I<tag name with space>
that has no value. The second symbol named I<tagged_unquoted_symbol> is
only tagged with the tag named I<optional>. The last symbol is an
example of the normal untagged symbol.
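
The parsing rules above can be approximated with a short sketch. It is not
the parser used by B<dpkg-gensymbols>: the function name
C<parse_symbol_line> and the returned tuple layout are assumptions made for
illustration, and the text following a closing quote (such as I<@Base>) is
simply appended to the quoted name.

```python
def parse_symbol_line(line):
    """Split a template line into (tags, symbol, remaining fields)."""
    line = line.strip()
    tags = {}
    if not line.startswith("("):
        # No tag specification: quotes are part of the name, which
        # continues up until the first space.
        symbol, _, rest = line.partition(" ")
        return tags, symbol, rest.split()
    # Tags cannot contain ')', so the first ')' closes the specification.
    tagspec, sep, remainder = line[1:].partition(")")
    if not sep or not tagspec:
        raise ValueError("tag specification must be closed and non-empty")
    for tag in tagspec.split("|"):
        name, _, value = tag.partition("=")
        tags[name] = value if "=" in tag else None
    if remainder[:1] in ("'", '"'):
        # Quoted symbol name: read up to the matching quote.
        quote = remainder[0]
        end = remainder.index(quote, 1)
        tail, _, rest = remainder[end + 1:].partition(" ")
        symbol = remainder[1:end] + tail
    else:
        symbol, _, rest = remainder.partition(" ")
    return tags, symbol, rest.split()
```

Applied to the three example lines, this yields two tags for the first
symbol, the single I<optional> tag for the second, and no tags for the
third.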

Since symbol tags are an extension of the B<deb-symbols>(5) format, they
can only be part of the symbols files used in source packages (those files
should then be seen as templates used to build the symbols files that are
embedded in binary packages). When
B<dpkg-gensymbols> is called without the B<-t> option, it will
output symbols files compatible with the B<deb-symbols>(5) format:
it fully processes symbols according to the requirements of their standard tags
and strips all tags from the output. On the contrary, in template mode
(B<-t>) all symbols and their tags (both standard and unknown ones)
are kept in the output and are written in their original form as they were
loaded.

=head2 Standard symbol tags

=over

=item B<optional>

A symbol marked as optional can disappear from the library at any time and that
will never cause B<dpkg-gensymbols> to fail. However, disappeared optional
symbols will continuously appear as MISSING in the diff in each new package
revision.  This behaviour serves as a reminder for the maintainer that such a
symbol needs to be removed from the symbol file or readded to the library. When
the optional symbol, which was previously declared as MISSING, suddenly
reappears in the next revision, it will be upgraded back to the “existing”
status with its minimum version unchanged.

This tag is useful for private symbols whose disappearance does
not cause ABI breakage. For example, most C++ template instantiations fall
into this category. Like any other tag, this one may also have an arbitrary
value: it could be used to indicate why the symbol is considered optional.

=item B<arch=>I<architecture-list>

=item B<arch-bits=>I<architecture-bits>

=item B<arch-endian=>I<architecture-endianness>

These tags allow one to restrict the set of architectures where the symbol
is supposed to exist. The B<arch-bits> and B<arch-endian> tags
are supported since dpkg 1.18.0. When the symbols list is updated with
the symbols
discovered in the library, all arch-specific symbols which do not concern
the current host architecture are treated as if they did not exist. If an
arch-specific symbol matching the current host architecture does not exist
in the library, normal procedures for missing symbols apply and it may
cause B<dpkg-gensymbols> to fail. On the other hand, if the
arch-specific symbol is found when it was not supposed to exist (because
the current host architecture is not listed in the tag or does not match
the endianness and bits), it is made arch neutral (i.e. the arch, arch-bits
and arch-endian tags are dropped and the symbol will appear in the diff due
to this change), but it is not considered as new.

When operating in the default non-template mode, among arch-specific symbols
only those that match the current host architecture are written to the
symbols file. On the contrary, all arch-specific symbols (including those
from foreign arches) are always written to the symbol file when operating
in template mode.

The format of I<architecture-list> is the same as the one used in the
B<Build-Depends> field of I<debian/control> (except the enclosing
square brackets []). For example, the first symbol from the list below
will be considered only on alpha, any-amd64 and ia64 architectures,
the second only on linux architectures, while the third one anywhere
except on armel.

  (arch=alpha any-amd64 ia64)64bit_specific_symbol@Base 1.0
  (arch=linux-any)linux_specific_symbol@Base 1.0
  (arch=!armel)symbol_armel_does_not_have@Base 1.0

The I<architecture-bits> is either B<32> or B<64>.

  (arch-bits=32)32bit_specific_symbol@Base 1.0
  (arch-bits=64)64bit_specific_symbol@Base 1.0

The I<architecture-endianness> is either B<little> or B<big>.

  (arch-endian=little)little_endian_specific_symbol@Base 1.0
  (arch-endian=big)big_endian_specific_symbol@Base 1.0

Multiple restrictions can be chained.

  (arch-bits=32|arch-endian=little)32bit_le_symbol@Base 1.0

=item B<allow-internal>

dpkg-gensymbols has a list of internal symbols that should not
appear in symbols files as they are usually only side-effects of
implementation details of the toolchain (since dpkg 1.20.1).
If for some reason, you really want one of those symbols to be included in
the symbols file, you should tag the symbol with B<allow-internal>.
It can be necessary for some low level toolchain libraries like “libgcc”.

=item B<ignore-blacklist>

A deprecated alias for B<allow-internal> (since dpkg 1.20.1,
supported since dpkg 1.15.3).

=item B<c++>

Denotes a I<c++> symbol pattern. See B<Using symbol patterns> subsection
below.

=item B<symver>

Denotes a I<symver> (symbol version) symbol pattern. See B<Using symbol
patterns> subsection below.

=item B<regex>

Denotes a I<regex> symbol pattern. See B<Using symbol patterns> subsection
below.

=back
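
The way the B<arch>, B<arch-bits> and B<arch-endian> restrictions combine
can be sketched as follows. This is a simplified model, not the
implementation in dpkg: the C<host> dictionary, both function names, and the
treatment of a bare architecture name as implying the linux OS are
assumptions made for illustration.

```python
def entry_matches(entry, host_os, host_cpu):
    """Match one list entry such as 'amd64', 'linux-any' or 'any-amd64'."""
    if "-" in entry:
        os_part, cpu_part = entry.split("-", 1)
    else:
        # Assumption: a bare name implies the linux OS.
        os_part, cpu_part = "linux", entry
    return os_part in ("any", host_os) and cpu_part in ("any", host_cpu)


def symbol_applies(tags, host):
    """Return True if every architecture restriction matches the host."""
    if "arch-bits" in tags and tags["arch-bits"] != str(host["bits"]):
        return False
    if "arch-endian" in tags and tags["arch-endian"] != host["endian"]:
        return False
    if "arch" in tags:
        entries = tags["arch"].split()
        negated = [e[1:] for e in entries if e.startswith("!")]
        positive = [e for e in entries if not e.startswith("!")]
        if any(entry_matches(e, host["os"], host["cpu"]) for e in negated):
            return False
        if positive and not any(
            entry_matches(e, host["os"], host["cpu"]) for e in positive
        ):
            return False
    return True


# Hypothetical host descriptions used only for this sketch.
AMD64 = {"os": "linux", "cpu": "amd64", "bits": 64, "endian": "little"}
ARMEL = {"os": "linux", "cpu": "armel", "bits": 32, "endian": "little"}
```

Under these assumptions, a symbol tagged I<arch=!armel> applies on the
amd64 host but not on armel, while one tagged
I<arch-bits=32|arch-endian=little> applies only on armel.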

=head2 Using symbol patterns

Unlike a standard symbol specification, a pattern may cover multiple real
symbols from the library. B<dpkg-gensymbols> will attempt to match each
pattern against each real symbol that does I<not> have a specific symbol
counterpart defined in the symbol file. Whenever the first matching pattern is
found, all its tags and properties will be used as a basis specification of the
symbol. If none of the patterns matches, the symbol will be considered as new.

A pattern is considered lost if it does not match any symbol in the library. By
default this will trigger a B<dpkg-gensymbols> failure under B<-c1> or
higher level. However, if the failure is undesired, the pattern may be marked
with the I<optional> tag. Then if the pattern does not match anything, it
will only appear in the diff as MISSING. Moreover, like any symbol, the pattern
may be limited to specific architectures with the I<arch> tag. Please
refer to B<Standard symbol tags> subsection above for more information.

Patterns are an extension of the B<deb-symbols>(5) format hence they are
only valid in symbol file templates. The pattern specification syntax is no
different from that of a specific symbol. However, the symbol name part of the
specification serves as an expression to be matched against I<name@version>
of the real symbol. In order to distinguish among different pattern types, a
pattern will typically be tagged with a special tag.

At the moment, B<dpkg-gensymbols> supports three basic pattern types:

=over

=item B<c++>

This pattern is denoted by the I<c++> tag. It matches only C++ symbols by
their demangled symbol name (as emitted by the B<c++filt>(1) utility). This
pattern is very handy for matching symbols whose mangled names might vary
across different architectures while their demangled names remain the same. One
group of such symbols is I<non-virtual thunks> which have architecture
specific offsets embedded in their mangled names. A common instance of this
case is a virtual destructor which under diamond inheritance needs a
non-virtual thunk symbol. For example, even if _ZThn8_N3NSB6ClassDD1Ev@Base on
32bit architectures will probably be _ZThn16_N3NSB6ClassDD1Ev@Base on 64bit
ones, it can be matched with a single I<c++> pattern:

 libdummy.so.1 libdummy1 #MINVER#
  [...]
  (c++)"non-virtual thunk to NSB::ClassD::~ClassD()@Base" 1.0
  [...]

The demangled name above can be obtained by executing the following command:

  $ echo '_ZThn8_N3NSB6ClassDD1Ev@Base' | c++filt

Please note that while a mangled name is unique in the library by definition,
this is not necessarily true for demangled names. A couple of distinct real
symbols may have the same demangled name. For example, that's the case with
non-virtual thunk symbols in complex inheritance configurations or with most
constructors and destructors (since g++ typically generates two real symbols
for them). However, as these collisions happen on the ABI level, they should
not degrade quality of the symbol file.

=item B<symver>

This pattern is denoted by the I<symver> tag. Well maintained libraries have
versioned symbols where each version corresponds to the upstream version where
the symbol got added. If that's the case, you can use a I<symver> pattern to
match any symbol associated with the specific version. For example:

 libc.so.6 libc6 #MINVER#
  (symver)GLIBC_2.0 2.0
  [...]
  (symver)GLIBC_2.7 2.7
  access@GLIBC_2.0 2.2

All symbols associated with versions GLIBC_2.0 and GLIBC_2.7 will lead to
a minimal version of 2.0 and 2.7 respectively with the exception of the symbol
access@GLIBC_2.0. The latter will lead to a minimal dependency on libc6 version
2.2 despite being in the scope of the "(symver)GLIBC_2.0" pattern because
specific symbols take precedence over patterns.

Please note that while old style wildcard patterns (denoted by "*@version" in
the symbol name field) are still supported, they have been deprecated by new
style syntax "(symver|optional)version". For example, "*@GLIBC_2.0 2.0" should
be written as "(symver|optional)GLIBC_2.0 2.0" if the same behaviour is needed.

=item B<regex>

Regular expression patterns are denoted by the I<regex> tag. They match by
the Perl regular expression specified in the symbol name field. A regular
expression is matched as it is, therefore do not forget to start it with the
I<^> character or it may match any part of the real symbol
I<name@version> string. For example:

 libdummy.so.1 libdummy1 #MINVER#
  (regex)"^mystack_.*@Base$" 1.0
  (regex|optional)"private" 1.0

Symbols like "mystack_new@Base", "mystack_push@Base", "mystack_pop@Base" etc.
will be matched by the first pattern while e.g. "ng_mystack_new@Base" won't.
The second pattern will match all symbols having the string "private" in their
names and matches will inherit the I<optional> tag from the pattern.

=back
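
The effect of anchoring a I<regex> pattern can be checked with a small
sketch. Python's C<re> module is used here as a stand-in for Perl regular
expressions, which behave the same way for these simple patterns; the
helper name C<first_match> and the symbol C<lib_private_helper@Base> are
illustrative assumptions.

```python
import re

# The two patterns from the example above, in template order.
PATTERNS = [r"^mystack_.*@Base$", r"private"]


def first_match(name_version, patterns=PATTERNS):
    """Return the first pattern matching the name@version string."""
    for pattern in patterns:
        # re.search matches anywhere in the string unless '^' anchors it.
        if re.search(pattern, name_version):
            return pattern
    return None


# Without the leading '^', the prefixed symbol would match as well.
unanchored_hit = re.search(r"mystack_.*@Base$", "ng_mystack_new@Base")
```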

Basic patterns listed above can be combined where it makes sense. In that case,
they are processed in the order in which the tags are specified. For example,
both:

  (c++|regex)"^NSA::ClassA::Private::privmethod\d\(int\)@Base" 1.0
  (regex|c++)N3NSA6ClassA7Private11privmethod\dEi@Base 1.0

will match symbols "_ZN3NSA6ClassA7Private11privmethod1Ei@Base" and
"_ZN3NSA6ClassA7Private11privmethod2Ei@Base". When matching the first pattern,
the raw symbol is first demangled as C++ symbol, then the demangled name is
matched against the regular expression. On the other hand, when matching the
second pattern, the regular expression is matched against the raw symbol name,
then the symbol is tested to see whether it is a C++ one by attempting to
demangle it. A failure
of any basic pattern will result in the failure of the whole pattern.
Therefore, for example, "__N3NSA6ClassA7Private11privmethod\dEi@Base" will not
match either of the patterns because it is not a valid C++ symbol.

In general, all patterns are divided into two groups: aliases (basic I<c++>
and I<symver>) and generic patterns (I<regex>, all combinations of
multiple basic patterns). Matching of basic alias-based patterns is fast (O(1))
while generic patterns are O(N) (N - generic pattern count) for each symbol.
Therefore, it is recommended not to overuse generic patterns.

When multiple patterns match the same real symbol, aliases (first I<c++>,
then I<symver>) are preferred over generic patterns. Generic patterns are
matched in the order they are found in the symbol file template until the first
success.  Please note, however, that manual reordering of template file entries
is not recommended because B<dpkg-gensymbols> generates diffs based on the
alphanumerical order of their names.
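
The lookup order described above (specific symbols first, then aliases, then
generic patterns in template order) can be sketched as follows. The function
name C<resolve> and its data structures are assumptions for illustration,
and the I<c++> alias step is omitted because it would require a demangler.

```python
import re


def resolve(name_version, specific, symver_aliases, generic):
    """Return the minimal version for a real symbol, or None if it is new."""
    # 1. A specific symbol entry takes precedence over any pattern.
    if name_version in specific:
        return specific[name_version]
    # 2. Alias lookup on the symbol version is a dictionary access.
    version = name_version.rpartition("@")[2]
    if version in symver_aliases:
        return symver_aliases[version]
    # 3. Generic patterns are tried in template order until the first hit.
    for regex, minver in generic:
        if re.search(regex, name_version):
            return minver
    return None


# Data modelled on the libc example; the generic pattern is hypothetical.
SPECIFIC = {"access@GLIBC_2.0": "2.2"}
SYMVER = {"GLIBC_2.0": "2.0", "GLIBC_2.7": "2.7"}
GENERIC = [(r"private", "1.0")]
```

The dictionary lookups in the first two steps take constant time, while the
loop in the third step is linear in the number of generic patterns.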

=head2 Using includes

When the set of exported symbols differs between architectures, it may become
inefficient to use a single symbol file. In those cases, an include directive
may prove to be useful in a couple of ways:

=over

=item *

You can factorize the common part in some external file
and include that file in your I<package>.symbols.I<arch> file by
using an include directive like this:

 #include "package.symbols.common"

=item *

The include directive may also be tagged like any symbol:

 (tag|...|tagN)#include "file-to-include"

As a result, all symbols included from I<file-to-include> will be considered
to be tagged with I<tag> ... I<tagN> by default. You can use this feature
to create a common I<package>.symbols file which includes architecture
specific symbol files:

  common_symbol1@Base 1.0
 (arch=amd64 ia64 alpha)#include "package.symbols.64bit"
 (arch=!amd64 !ia64 !alpha)#include "package.symbols.32bit"
  common_symbol2@Base 1.0

=back

The symbols files are read line by line, and include directives are processed
as soon as they are encountered. This means that the content of the included
file can override any content that appeared before the include directive and
that any content after the directive can override anything contained in the
included file. Any symbol (or even another #include directive) in the included
file can specify additional tags or override values of the inherited tags in
its tag specification. However, there is no way for the symbol to remove
any of the inherited tags.
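
Tag inheritance through include directives can be modelled with a short
sketch. It reads templates from an in-memory dictionary rather than from
disk and ignores header lines and quoted symbol names; the function name
C<load>, the file contents, and the symbols C<wide_symbol> and
C<picky_symbol> are hypothetical.

```python
def load(name, files, inherited=None):
    """Return (symbol, tags) pairs, applying tags inherited from includes."""
    result = []
    for raw in files[name].splitlines():
        line = raw.strip()
        if not line:
            continue
        # Start from the inherited tags; a line can add or override
        # tags but has no way to remove an inherited one.
        tags = dict(inherited or {})
        if line.startswith("("):
            tagspec, _, line = line[1:].partition(")")
            for tag in tagspec.split("|"):
                key, _, value = tag.partition("=")
                tags[key] = value if "=" in tag else None
        if line.startswith("#include"):
            # Includes are processed as soon as they are encountered.
            target = line.split('"')[1]
            result.extend(load(target, files, tags))
        elif line.startswith("#"):
            continue  # ordinary comment
        else:
            result.append((line.split()[0], tags))
    return result


FILES = {
    "package.symbols": (
        "common_symbol1@Base 1.0\n"
        '(arch=amd64 ia64 alpha)#include "package.symbols.64bit"\n'
        "common_symbol2@Base 1.0\n"
    ),
    "package.symbols.64bit": (
        "wide_symbol@Base 1.0\n"
        "(optional|arch=amd64)picky_symbol@Base 1.1\n"
    ),
}
symbols = load("package.symbols", FILES)
```

In this sketch C<wide_symbol> inherits the I<arch> tag from the include
directive, while C<picky_symbol> overrides its value and adds I<optional>.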

An included file can repeat the header line containing the SONAME of the
library. In that case, it overrides any header line previously read.
However, in general it's best to avoid duplicating header lines. One way
to do it is the following:

 #include "libsomething1.symbols.common"
  arch_specific_symbol@Base 1.0

=head1 SEE ALSO

B<deb-symbols>(5),
B<dpkg-shlibdeps>(1),
B<dpkg-gensymbols>(1).