diff options
Diffstat (limited to 'doc/developer/tracing.rst')
-rw-r--r-- | doc/developer/tracing.rst | 411 |
1 files changed, 411 insertions, 0 deletions
diff --git a/doc/developer/tracing.rst b/doc/developer/tracing.rst new file mode 100644 index 0000000..63b0458 --- /dev/null +++ b/doc/developer/tracing.rst @@ -0,0 +1,411 @@ +.. _tracing: + +Tracing +======= + +FRR has a small but growing number of static tracepoints available for use with +various tracing systems. These tracepoints can assist with debugging, +performance analysis and to help understand program flow. They can also be used +for monitoring. + +Developers are encouraged to write new static tracepoints where sensible. They +are not compiled in by default, and even when they are, they have no overhead +unless enabled by a tracer, so it is okay to be liberal with them. + + +Supported tracers +----------------- + +Presently two types of tracepoints are supported: + +- `LTTng tracepoints <https://lttng.org/>`_ +- `USDT probes <http://dtrace.org/guide/chp-usdt.html>`_ + +LTTng is a tracing framework for Linux only. It offers extremely low overhead +and very rich tracing capabilities. FRR supports LTTng-UST, which is the +userspace implementation. LTTng tracepoints are very rich in detail. No kernel +modules are needed. Besides only being available for Linux, the primary +downside of LTTng is the need to link to ``lttng-ust``. + +USDT probes originate from Solaris, where they were invented for use with +dtrace. They are a kernel feature. At least Linux and FreeBSD support them. No +library is needed; support is compiled in via a system header +(``<sys/sdt.h>``). USDT probes are much slower than LTTng tracepoints and offer +less flexibility in what information can be gleaned from them. + +LTTng is capable of tracing USDT probes but has limited support for them. +SystemTap and dtrace both work only with USDT probes. + + +Usage +----- + +To compile with tracepoints, use one of the following configure flags: + +.. program:: configure.ac + +.. option:: --enable-lttng=yes + + Generate LTTng tracepoints + +.. option:: --enable-usdt=yes + + Generate USDT probes + +To trace with LTTng, compile with either one (prefer :option:`--enable-lttng` +run the target in non-forking mode (no ``-d``) and use LTTng as usual (refer to +LTTng user manual). When using USDT probes with LTTng, follow the example in +`this article +<https://lttng.org/blog/2019/10/15/new-dynamic-user-space-tracing-in-lttng/>`_. +To trace with dtrace or SystemTap, compile with `--enable-usdt=yes` and +use your tracer as usual. + +To see available USDT probes:: + + readelf -n /usr/lib/frr/bgpd + +Example:: + + root@host ~> readelf -n /usr/lib/frr/bgpd + + Displaying notes found in: .note.ABI-tag + Owner Data size Description + GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag) + OS: Linux, ABI: 3.2.0 + + Displaying notes found in: .note.gnu.build-id + Owner Data size Description + GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) + Build ID: 4f42933a69dcb42a519bc459b2105177c8adf55d + + Displaying notes found in: .note.stapsdt + Owner Data size Description + stapsdt 0x00000045 NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: packet_read + Location: 0x000000000045ee48, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-96(%rbp) 8@-104(%rbp) + stapsdt 0x00000047 NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: open_process + Location: 0x000000000047c43b, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-224(%rbp) 2@-226(%rbp) + stapsdt 0x00000049 NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: update_process + Location: 0x000000000047c4bf, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-208(%rbp) 2@-210(%rbp) + stapsdt 0x0000004f NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: notification_process + Location: 0x000000000047c557, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-192(%rbp) 2@-194(%rbp) + stapsdt 0x0000004c NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: keepalive_process + Location: 0x000000000047c5db, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-176(%rbp) 2@-178(%rbp) + stapsdt 0x0000004a NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: refresh_process + Location: 0x000000000047c673, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-160(%rbp) 2@-162(%rbp) + stapsdt 0x0000004d NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: capability_process + Location: 0x000000000047c6f7, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-144(%rbp) 2@-146(%rbp) + stapsdt 0x0000006f NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: output_filter + Location: 0x000000000048e33a, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-144(%rbp) 8@-152(%rbp) 4@-156(%rbp) 4@-160(%rbp) 8@-168(%rbp) + stapsdt 0x0000007d NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: process_update + Location: 0x0000000000491f10, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-800(%rbp) 8@-808(%rbp) 4@-812(%rbp) 4@-816(%rbp) 4@-820(%rbp) 8@-832(%rbp) + stapsdt 0x0000006e NT_STAPSDT (SystemTap probe descriptors) + Provider: frr_bgp + Name: input_filter + Location: 0x00000000004940ed, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 + Arguments: 8@-144(%rbp) 8@-152(%rbp) 4@-156(%rbp) 4@-160(%rbp) 8@-168(%rbp) + + +To see available LTTng probes, run the target, create a session and then:: + + lttng list --userspace | grep frr + +Example:: + + root@host ~> lttng list --userspace | grep frr + PID: 11157 - Name: /usr/lib/frr/bgpd + frr_libfrr:route_node_get (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:list_sort (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:list_delete_node (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:list_remove (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:list_add (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:memfree (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:memalloc (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:frr_pthread_stop (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:frr_pthread_run (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) + frr_libfrr:thread_call (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:thread_cancel_async (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:thread_cancel (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:schedule_write (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:schedule_read (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:schedule_event (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:schedule_timer (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:hash_release (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:hash_insert (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_libfrr:hash_get (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:output_filter (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:input_filter (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:process_update (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:packet_read (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:refresh_process (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:capability_process (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:notification_process (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:update_process (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:keepalive_process (loglevel: TRACE_INFO (6)) (type: tracepoint) + frr_bgp:open_process (loglevel: TRACE_INFO (6)) (type: tracepoint) + +When using LTTng, you can also get zlogs as trace events by enabling +the ``lttng_ust_tracelog:*`` event class. + +To see available SystemTap USDT probes, run:: + + stap -L 'process("/usr/lib/frr/bgpd").mark("*")' + +Example:: + + root@host ~> stap -L 'process("/usr/lib/frr/bgpd").mark("*")' + process("/usr/lib/frr/bgpd").mark("capability_process") $arg1:long $arg2:long + process("/usr/lib/frr/bgpd").mark("input_filter") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long + process("/usr/lib/frr/bgpd").mark("keepalive_process") $arg1:long $arg2:long + process("/usr/lib/frr/bgpd").mark("notification_process") $arg1:long $arg2:long + process("/usr/lib/frr/bgpd").mark("open_process") $arg1:long $arg2:long + process("/usr/lib/frr/bgpd").mark("output_filter") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long + process("/usr/lib/frr/bgpd").mark("packet_read") $arg1:long $arg2:long + process("/usr/lib/frr/bgpd").mark("process_update") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long $arg6:long + process("/usr/lib/frr/bgpd").mark("refresh_process") $arg1:long $arg2:long + process("/usr/lib/frr/bgpd").mark("update_process") $arg1:long $arg2:long + +When using SystemTap, you can also easily attach to an existing function:: + + stap -L 'process("/usr/lib/frr/bgpd").function("bgp_update_receive")' + +Example:: + + root@host ~> stap -L 'process("/usr/lib/frr/bgpd").function("bgp_update_receive")' + process("/usr/lib/frr/bgpd").function("bgp_update_receive@bgpd/bgp_packet.c:1531") $peer:struct peer* $size:bgp_size_t $attr:struct attr $restart:_Bool $nlris:struct bgp_nlri[] $__func__:char const[] const + +Complete ``bgp.stp`` example using SystemTap to show BGP peer, prefix and aspath +using ``process_update`` USDT:: + + global pkt_size; + probe begin + { + ansi_clear_screen(); + println("Starting..."); + } + probe process("/usr/lib/frr/bgpd").function("bgp_update_receive") + { + pkt_size <<< $size; + } + probe process("/usr/lib/frr/bgpd").mark("process_update") + { + aspath = @cast($arg6, "attr")->aspath; + printf("> %s via %s (%s)\n", + user_string($arg2), + user_string(@cast($arg1, "peer")->host), + user_string(@cast(aspath, "aspath")->str)); + } + probe end + { + if (@count(pkt_size)) + print(@hist_linear(pkt_size, 0, 20, 2)); + } + +Output:: + + Starting... + > 192.168.0.0/24 via 192.168.0.1 (65534) + > 192.168.100.1/32 via 192.168.0.1 (65534) + > 172.16.16.1/32 via 192.168.0.1 (65534 65030) + ^Cvalue |-------------------------------------------------- count + 0 | 0 + 2 | 0 + 4 |@ 1 + 6 | 0 + 8 | 0 + ~ + 18 | 0 + 20 | 0 + >20 |@@@@@ 5 + + +Concepts +-------- + +Tracepoints are statically defined points in code where a developer has +determined that outside observers might gain something from knowing what is +going on at that point. It's like logging but with the ability to dump large +amounts of internal data with much higher performance. LTTng has a good summary +`here <https://lttng.org/docs/#doc-what-is-tracing>`_. + +Each tracepoint has a "provider" and name. The provider is basically a +namespace; for example, ``bgpd`` uses the provider name ``frr_bgp``. The name +is arbitrary, but because providers share a global namespace on the user's +system, all providers from FRR should be prefixed by ``frr_``. The tracepoint +name is just the name of the event. Events are globally named by their provider +and name. For example, the event when BGP reads a packet from a peer is +``frr_bgp:packet_read``. + +To do tracing, the tracing tool of choice is told which events to listen to. +For example, to listen to all events from FRR's BGP implementation, you would +enable the events ``frr_bgp:*``. In the same tracing session you could also +choose to record all memory allocations by enabling the ``malloc`` tracepoints +in ``libc`` as well as all kernel skb operations using the various in-kernel +tracepoints. This allows you to build as complete a view as desired of what the +system is doing during the tracing window (subject to what tracepoints are +available). + +Of particular use are the tracepoints for FRR's internal event scheduler; +tracing these allows you to see all events executed by all event loops for the +target(s) in question. Here's a couple events selected from a trace of BGP +during startup:: + + ... + + [18:41:35.750131763] (+0.000048901) host frr_libfrr:thread_call: { cpu_id = + 1 }, { threadmaster_name = "default", function_name = "zclient_connect", + scheduled_from = "lib/zclient.c", scheduled_on_line = 3877, thread_addr = + 0x0, file_descriptor = 0, event_value = 0, argument_ptr = 0xA37F70, timer = + 0 } + + [18:41:35.750175124] (+0.000020001) host frr_libfrr:thread_call: { cpu_id = + 1 }, { threadmaster_name = "default", function_name = "frr_config_read_in", + scheduled_from = "lib/libfrr.c", scheduled_on_line = 934, thread_addr = 0x0, + file_descriptor = 0, event_value = 0, argument_ptr = 0x0, timer = 0 } + + [18:41:35.753341264] (+0.000010532) host frr_libfrr:thread_call: { cpu_id = + 1 }, { threadmaster_name = "default", function_name = "bgp_event", + scheduled_from = "bgpd/bgpd.c", scheduled_on_line = 142, thread_addr = 0x0, + file_descriptor = 2, event_value = 2, argument_ptr = 0xE4D780, timer = 2 } + + [18:41:35.753404186] (+0.000004910) host frr_libfrr:thread_call: { cpu_id = + 1 }, { threadmaster_name = "default", function_name = "zclient_read", + scheduled_from = "lib/zclient.c", scheduled_on_line = 3891, thread_addr = + 0x0, file_descriptor = 40, event_value = 40, argument_ptr = 0xA37F70, timer + = 40 } + + ... + + +Very useful for getting a time-ordered look into what the process is doing. + + +Adding Tracepoints +------------------ + +Adding new tracepoints is a two step process: + +1. Define the tracepoint +2. Use the tracepoint + +Tracepoint definitions state the "provider" and name of the tracepoint, along +with any values it will produce, and how to format them. This is done with +macros provided by LTTng. USDT probes do not use definitions and are inserted +at the trace site with a single macro. However, to maintain support for both +platforms, you must define an LTTng tracepoint when adding a new one. +``frrtrace()`` will expand to the appropriate ``DTRACE_PROBEn`` macro when USDT +is in use. + +If you are adding new tracepoints to a daemon that has no tracepoints, that +daemon's ``subdir.am`` must be updated to conditionally link ``lttng-ust``. +Look at ``bgpd/subdir.am`` for an example of how to do this; grep for +``UST_LIBS``. Create new files named ``<daemon>_trace.[ch]``. Use +``bgpd/bgp_trace.[h]`` as boilerplate. If you are adding tracepoints to a +daemon that already has them, look for the ``<daemon>_trace.h`` file; +tracepoints are written here. + +Refer to the `LTTng developer docs +<https://lttng.org/docs/#doc-c-application>`_ for details on how to define +tracepoints. + +To use them, simply add a call to ``frrtrace()`` at the point you'd like the +event to be emitted, like so: + +.. code-block:: c + + ... + + switch (type) { + case BGP_MSG_OPEN: + frrtrace(2, frr_bgp, open_process, peer, size); /* tracepoint */ + atomic_fetch_add_explicit(&peer->open_in, 1, + memory_order_relaxed); + mprc = bgp_open_receive(peer, size); + + ... + +After recompiling this tracepoint will now be available, either as a USDT probe +or LTTng tracepoint, depending on your compilation choice. + + +trace.h +^^^^^^^ + +Because FRR supports multiple types of tracepoints, the code for creating them +abstracts away the underlying system being used. This abstraction code is in +``lib/trace.h``. There are 2 function-like macros that are used for working +with tracepoints. + +- ``frrtrace()`` defines tracepoints +- ``frrtrace_enabled()`` checks whether a tracepoint is enabled + +There is also ``frrtracelog()``, which is used in zlog core code to make zlog +messages available as trace events to LTTng. This should not be used elsewhere. + +There is additional documentation in the header. The key thing to note is that +you should never include ``trace.h`` in source where you plan to put +tracepoints; include the tracepoint definition header instead (e.g. +:file:`bgp_trace.h`). + + +Limitations +----------- + +Tracers do not like ``fork()`` or ``dlopen()``. LTTng has some workarounds for +this involving interceptor libraries using ``LD_PRELOAD``. + +If you're running FRR in a typical daemonizing way (``-d`` to the daemons) +you'll need to run the daemons like so: + +.. code-block:: shell + + LD_PRELOAD=liblttng-ust-fork.so <daemon> + + +If you're using systemd this you can accomplish this for all daemons by +modifying ``frr.service`` like so: + +.. code-block:: diff + + --- a/frr.service + +++ b/frr.service + @@ -7,6 +7,7 @@ Before=network.target + OnFailure=heartbeat-failed@%n + + [Service] + +Environment="LD_PRELOAD=liblttng-ust-fork.so" + Nice=-5 + Type=forking + NotifyAccess=all + + +USDT tracepoints are relatively high overhead and probably shouldn't be used +for "flight recorder" functionality, i.e. enabling and passively recording all +events for monitoring purposes. It's generally okay to use LTTng like this, +though. |