summaryrefslogtreecommitdiffstats
path: root/doc/developer/tracing.rst
blob: 76f6004034e70a53b0567304e8eeac438c3bbac0 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
.. _tracing:

Tracing
=======

FRR has a small but growing number of static tracepoints available for use with
various tracing systems. These tracepoints can assist with debugging,
performance analysis and to help understand program flow. They can also be used
for monitoring.

Developers are encouraged to write new static tracepoints where sensible. They
are not compiled in by default, and even when they are, they have no overhead
unless enabled by a tracer, so it is okay to be liberal with them.


Supported tracers
-----------------

Presently two types of tracepoints are supported:

- `LTTng tracepoints <https://lttng.org/>`_
- `USDT probes <http://dtrace.org/guide/chp-usdt.html>`_

LTTng is a tracing framework for Linux only. It offers extremely low overhead
and very rich tracing capabilities. FRR supports LTTng-UST, which is the
userspace implementation. LTTng tracepoints are very rich in detail. No kernel
modules are needed. Besides only being available for Linux, the primary
downside of LTTng is the need to link to ``lttng-ust``.

USDT probes originate from Solaris, where they were invented for use with
dtrace. They are a kernel feature. At least Linux and FreeBSD support them. No
library is needed; support is compiled in via a system header
(``<sys/sdt.h>``). USDT probes are much slower than LTTng tracepoints and offer
less flexibility in what information can be gleaned from them.

LTTng is capable of tracing USDT probes but has limited support for them.
SystemTap and dtrace both work only with USDT probes.


Usage
-----

To compile with tracepoints, use one of the following configure flags:

.. program:: configure.ac

.. option:: --enable-lttng=yes

   Generate LTTng tracepoints

.. option:: --enable-usdt=yes

   Generate USDT probes

To trace with LTTng, compile with either one (prefer :option:`--enable-lttng`
run the target in non-forking mode (no ``-d``) and use LTTng as usual (refer to
LTTng user manual). When using USDT probes with LTTng, follow the example in
`this article
<https://lttng.org/blog/2019/10/15/new-dynamic-user-space-tracing-in-lttng/>`_.
To trace with dtrace or SystemTap, compile with `--enable-usdt=yes` and
use your tracer as usual.

To see available USDT probes::

   readelf -n /usr/lib/frr/bgpd

Example::

   root@host ~> readelf -n /usr/lib/frr/bgpd

   Displaying notes found in: .note.ABI-tag
     Owner                 Data size	Description
     GNU                  0x00000010	NT_GNU_ABI_TAG (ABI version tag)
       OS: Linux, ABI: 3.2.0

   Displaying notes found in: .note.gnu.build-id
     Owner                 Data size	Description
     GNU                  0x00000014	NT_GNU_BUILD_ID (unique build ID bitstring)
       Build ID: 4f42933a69dcb42a519bc459b2105177c8adf55d

   Displaying notes found in: .note.stapsdt
     Owner                 Data size	Description
     stapsdt              0x00000045	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: packet_read
       Location: 0x000000000045ee48, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-96(%rbp) 8@-104(%rbp)
     stapsdt              0x00000047	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: open_process
       Location: 0x000000000047c43b, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-224(%rbp) 2@-226(%rbp)
     stapsdt              0x00000049	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: update_process
       Location: 0x000000000047c4bf, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-208(%rbp) 2@-210(%rbp)
     stapsdt              0x0000004f	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: notification_process
       Location: 0x000000000047c557, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-192(%rbp) 2@-194(%rbp)
     stapsdt              0x0000004c	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: keepalive_process
       Location: 0x000000000047c5db, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-176(%rbp) 2@-178(%rbp)
     stapsdt              0x0000004a	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: refresh_process
       Location: 0x000000000047c673, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-160(%rbp) 2@-162(%rbp)
     stapsdt              0x0000004d	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: capability_process
       Location: 0x000000000047c6f7, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-144(%rbp) 2@-146(%rbp)
     stapsdt              0x0000006f	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: output_filter
       Location: 0x000000000048e33a, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-144(%rbp) 8@-152(%rbp) 4@-156(%rbp) 4@-160(%rbp) 8@-168(%rbp)
     stapsdt              0x0000007d	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: process_update
       Location: 0x0000000000491f10, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-800(%rbp) 8@-808(%rbp) 4@-812(%rbp) 4@-816(%rbp) 4@-820(%rbp) 8@-832(%rbp)
     stapsdt              0x0000006e	NT_STAPSDT (SystemTap probe descriptors)
       Provider: frr_bgp
       Name: input_filter
       Location: 0x00000000004940ed, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000
       Arguments: 8@-144(%rbp) 8@-152(%rbp) 4@-156(%rbp) 4@-160(%rbp) 8@-168(%rbp)


To see available LTTng probes, run the target, create a session and then::

   lttng list --userspace | grep frr

Example::

   root@host ~> lttng list --userspace | grep frr
   PID: 11157 - Name: /usr/lib/frr/bgpd
         frr_libfrr:route_node_get (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:list_sort (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:list_delete_node (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:list_remove (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:list_add (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:memfree (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:memalloc (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:frr_pthread_stop (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:frr_pthread_run (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
         frr_libfrr:thread_call (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:event_cancel_async (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:event_cancel (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:schedule_write (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:schedule_read (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:schedule_event (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:schedule_timer (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:hash_release (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:hash_insert (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_libfrr:hash_get (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:output_filter (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:input_filter (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:process_update (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:packet_read (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:refresh_process (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:capability_process (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:notification_process (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:update_process (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:keepalive_process (loglevel: TRACE_INFO (6)) (type: tracepoint)
         frr_bgp:open_process (loglevel: TRACE_INFO (6)) (type: tracepoint)

When using LTTng, you can also get zlogs as trace events by enabling
the ``lttng_ust_tracelog:*`` event class.

To see available SystemTap USDT probes, run::

   stap -L 'process("/usr/lib/frr/bgpd").mark("*")'

Example::

   root@host ~> stap -L 'process("/usr/lib/frr/bgpd").mark("*")'
   process("/usr/lib/frr/bgpd").mark("capability_process") $arg1:long $arg2:long
   process("/usr/lib/frr/bgpd").mark("input_filter") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long
   process("/usr/lib/frr/bgpd").mark("keepalive_process") $arg1:long $arg2:long
   process("/usr/lib/frr/bgpd").mark("notification_process") $arg1:long $arg2:long
   process("/usr/lib/frr/bgpd").mark("open_process") $arg1:long $arg2:long
   process("/usr/lib/frr/bgpd").mark("output_filter") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long
   process("/usr/lib/frr/bgpd").mark("packet_read") $arg1:long $arg2:long
   process("/usr/lib/frr/bgpd").mark("process_update") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long $arg6:long
   process("/usr/lib/frr/bgpd").mark("refresh_process") $arg1:long $arg2:long
   process("/usr/lib/frr/bgpd").mark("update_process") $arg1:long $arg2:long

When using SystemTap, you can also easily attach to an existing function::

   stap -L 'process("/usr/lib/frr/bgpd").function("bgp_update_receive")'

Example::

   root@host ~> stap -L 'process("/usr/lib/frr/bgpd").function("bgp_update_receive")'
   process("/usr/lib/frr/bgpd").function("bgp_update_receive@bgpd/bgp_packet.c:1531") $peer:struct peer* $size:bgp_size_t $attr:struct attr $restart:_Bool $nlris:struct bgp_nlri[] $__func__:char const[] const

Complete ``bgp.stp`` example using SystemTap to show BGP peer, prefix and aspath
using ``process_update`` USDT::

   global pkt_size;
   probe begin
   {
     ansi_clear_screen();
     println("Starting...");
   }
   probe process("/usr/lib/frr/bgpd").function("bgp_update_receive")
   {
     pkt_size <<< $size;
   }
   probe process("/usr/lib/frr/bgpd").mark("process_update")
   {
     aspath = @cast($arg6, "attr")->aspath;
     printf("> %s via %s (%s)\n",
       user_string($arg2),
       user_string(@cast($arg1, "peer")->host),
       user_string(@cast(aspath, "aspath")->str));
   }
   probe end
   {
     if (@count(pkt_size))
       print(@hist_linear(pkt_size, 0, 20, 2));
   }

Output::

   Starting...
   > 192.168.0.0/24 via 192.168.0.1 (65534)
   > 192.168.100.1/32 via 192.168.0.1 (65534)
   > 172.16.16.1/32 via 192.168.0.1 (65534 65030)
   ^Cvalue |-------------------------------------------------- count
       0 |                                                   0
       2 |                                                   0
       4 |@                                                  1
       6 |                                                   0
       8 |                                                   0
         ~
     18 |                                                   0
     20 |                                                   0
     >20 |@@@@@                                              5


Concepts
--------

Tracepoints are statically defined points in code where a developer has
determined that outside observers might gain something from knowing what is
going on at that point. It's like logging but with the ability to dump large
amounts of internal data with much higher performance. LTTng has a good summary
`here <https://lttng.org/docs/#doc-what-is-tracing>`_.

Each tracepoint has a "provider" and name. The provider is basically a
namespace; for example, ``bgpd`` uses the provider name ``frr_bgp``. The name
is arbitrary, but because providers share a global namespace on the user's
system, all providers from FRR should be prefixed by ``frr_``. The tracepoint
name is just the name of the event. Events are globally named by their provider
and name. For example, the event when BGP reads a packet from a peer is
``frr_bgp:packet_read``.

To do tracing, the tracing tool of choice is told which events to listen to.
For example, to listen to all events from FRR's BGP implementation, you would
enable the events ``frr_bgp:*``. In the same tracing session you could also
choose to record all memory allocations by enabling the ``malloc`` tracepoints
in ``libc`` as well as all kernel skb operations using the various in-kernel
tracepoints. This allows you to build as complete a view as desired of what the
system is doing during the tracing window (subject to what tracepoints are
available).

Of particular use are the tracepoints for FRR's internal event scheduler;
tracing these allows you to see all events executed by all event loops for the
target(s) in question. Here's a couple events selected from a trace of BGP
during startup::

   ...

   [18:41:35.750131763] (+0.000048901) host frr_libfrr:thread_call: { cpu_id =
   1 }, { threadmaster_name = "default", function_name = "zclient_connect",
   scheduled_from = "lib/zclient.c", scheduled_on_line = 3877, thread_addr =
   0x0, file_descriptor = 0, event_value = 0, argument_ptr = 0xA37F70, timer =
   0 }

   [18:41:35.750175124] (+0.000020001) host frr_libfrr:thread_call: { cpu_id =
   1 }, { threadmaster_name = "default", function_name = "frr_config_read_in",
   scheduled_from = "lib/libfrr.c", scheduled_on_line = 934, thread_addr = 0x0,
   file_descriptor = 0, event_value = 0, argument_ptr = 0x0, timer = 0 }

   [18:41:35.753341264] (+0.000010532) host frr_libfrr:thread_call: { cpu_id =
   1 }, { threadmaster_name = "default", function_name = "bgp_event",
   scheduled_from = "bgpd/bgpd.c", scheduled_on_line = 142, thread_addr = 0x0,
   file_descriptor = 2, event_value = 2, argument_ptr = 0xE4D780, timer = 2 }

   [18:41:35.753404186] (+0.000004910) host frr_libfrr:thread_call: { cpu_id =
   1 }, { threadmaster_name = "default", function_name = "zclient_read",
   scheduled_from = "lib/zclient.c", scheduled_on_line = 3891, thread_addr =
   0x0, file_descriptor = 40, event_value = 40, argument_ptr = 0xA37F70, timer
   = 40 }

   ...


Very useful for getting a time-ordered look into what the process is doing.


Adding Tracepoints
------------------

Adding new tracepoints is a two step process:

1. Define the tracepoint
2. Use the tracepoint

Tracepoint definitions state the "provider" and name of the tracepoint, along
with any values it will produce, and how to format them. This is done with
macros provided by LTTng. USDT probes do not use definitions and are inserted
at the trace site with a single macro. However, to maintain support for both
platforms, you must define an LTTng tracepoint when adding a new one.
``frrtrace()`` will expand to the appropriate ``DTRACE_PROBEn`` macro when USDT
is in use.

If you are adding new tracepoints to a daemon that has no tracepoints, that
daemon's ``subdir.am`` must be updated to conditionally link ``lttng-ust``.
Look at ``bgpd/subdir.am`` for an example of how to do this; grep for
``UST_LIBS``. Create new files named ``<daemon>_trace.[ch]``. Use
``bgpd/bgp_trace.[h]`` as boilerplate. If you are adding tracepoints to a
daemon that already has them, look for the ``<daemon>_trace.h`` file;
tracepoints are written here.

Refer to the `LTTng developer docs
<https://lttng.org/docs/#doc-c-application>`_ for details on how to define
tracepoints.

To use them, simply add a call to ``frrtrace()`` at the point you'd like the
event to be emitted, like so:

.. code-block:: c

   ...

   switch (type) {
   case BGP_MSG_OPEN:
           frrtrace(2, frr_bgp, open_process, peer, size); /* tracepoint */
           atomic_fetch_add_explicit(&peer->open_in, 1,
                                     memory_order_relaxed);
           mprc = bgp_open_receive(peer, size);

   ...

After recompiling this tracepoint will now be available, either as a USDT probe
or LTTng tracepoint, depending on your compilation choice.


trace.h
^^^^^^^

Because FRR supports multiple types of tracepoints, the code for creating them
abstracts away the underlying system being used. This abstraction code is in
``lib/trace.h``. There are 2 function-like macros that are used for working
with tracepoints.

- ``frrtrace()`` defines tracepoints
- ``frrtrace_enabled()`` checks whether a tracepoint is enabled

There is also ``frrtracelog()``, which is used in zlog core code to make zlog
messages available as trace events to LTTng. This should not be used elsewhere.

There is additional documentation in the header. The key thing to note is that
you should never include ``trace.h`` in source where you plan to put
tracepoints; include the tracepoint definition header instead (e.g.
:file:`bgp_trace.h`).


Limitations
-----------

Tracers do not like ``fork()`` or ``dlopen()``. LTTng has some workarounds for
this involving interceptor libraries using ``LD_PRELOAD``.

If you're running FRR in a typical daemonizing way (``-d`` to the daemons)
you'll need to run the daemons like so:

.. code-block:: shell

   LD_PRELOAD=liblttng-ust-fork.so <daemon>


If you're using systemd this you can accomplish this for all daemons by
modifying ``frr.service`` like so:

.. code-block:: diff

   --- a/frr.service
   +++ b/frr.service
   @@ -7,6 +7,7 @@ Before=network.target
    OnFailure=heartbeat-failed@%n

    [Service]
   +Environment="LD_PRELOAD=liblttng-ust-fork.so"
    Nice=-5
    Type=forking
    NotifyAccess=all


USDT tracepoints are relatively high overhead and probably shouldn't be used
for "flight recorder" functionality, i.e. enabling and passively recording all
events for monitoring purposes. It's generally okay to use LTTng like this,
though.