.. SPDX-License-Identifier: GPL-3.0-or-later

.. _dns-over-xdp:

XDP for higher UDP performance
------------------------------

.. warning::
   As of version 5.2.0, XDP support in Knot Resolver is considered
   experimental. The impact on overall throughput and performance may not
   always be beneficial.

XDP can significantly speed up UDP packet processing on recent Linux kernels,
especially with network drivers that implement good support for it.
The basic idea is that the Linux networking stack is bypassed for selected packets,
and some drivers can even use the user-space buffers directly for reading and writing.

.. TODO perhaps some hint/link about how significant speedup one might get? (link to some talk video?)

Prerequisites
^^^^^^^^^^^^^
.. this is mostly copied from knot-dns doc/operations.rst

.. warning::
   Bypassing the network stack has significant implications, such as bypassing the firewall
   and monitoring solutions.
   Make sure you're familiar with the trade-offs before using this feature.
   Read more in :ref:`dns-over-xdp_limitations`.

* Linux kernel 4.18+ (5.x+ is recommended for optimal performance) compiled with
  the ``CONFIG_XDP_SOCKETS=y`` option. XDP isn't supported on other operating systems.
* libknot compiled with XDP support
* **A multiqueue network card with native XDP support is highly recommended**,
  otherwise the performance gain will be much lower and you may encounter
  issues due to XDP emulation.
  Successfully tested cards:

  * Intel series 700 (driver ``i40e``), maximum number of queues per interface is 64.
  * Intel series 500 (driver ``ixgbe``), maximum number of queues per interface is 64.
    The number of CPUs available has to be at most 64!


Set up
^^^^^^
.. first parts are mostly copied from knot-dns doc/operations.rst

The server instances need additional Linux **capabilities** during startup
(alternatively, you could start them as ``root``).
Execute the command:

.. code-block:: bash

	systemctl edit kresd@.service

And insert these lines:

.. code-block:: ini

	[Service]
	CapabilityBoundingSet=CAP_NET_RAW CAP_NET_ADMIN CAP_SYS_ADMIN CAP_IPC_LOCK CAP_SYS_RESOURCE
	AmbientCapabilities=CAP_NET_RAW CAP_NET_ADMIN CAP_SYS_ADMIN CAP_IPC_LOCK CAP_SYS_RESOURCE

``CAP_SYS_RESOURCE`` is only needed on Linux < 5.11.

.. TODO suggest some way for ethtool -L?  Perhaps via systemd units?

You want the number of kresd instances to match the number of network **queues** on your card;
you can set the queue count with ``ethtool -L`` before the services start.
With XDP this is more important than with vanilla UDP, as we only support one instance
per queue and unclaimed queues will fall back to vanilla UDP.
Ideally, set both numbers as high as the number of CPUs that you want kresd to use.

The required change to ``/etc/knot-resolver/kresd.conf`` is often quite simple, for example:

.. code-block:: lua

	net.listen('eth2', 53, { kind = 'xdp' })
	net.listen('203.0.113.53', 53, { kind = 'dns' })

Note that you also want to keep the vanilla DNS line to serve TCP
and possibly any fallback UDP (e.g. from unclaimed queues).
XDP listening is in principle done on queues of whole network interfaces
and the target addresses of incoming packets aren't checked in any way,
but you are still allowed to specify the interface by an address
(if it's unambiguous at that moment):

.. code-block:: lua

	net.listen('203.0.113.53', 53, { kind = 'xdp' })
	net.listen('203.0.113.53', 53, { kind = 'dns' })
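
As a further sketch of combining the two kinds, assume the same interface also
carries an IPv6 service address (``2001:db8::53`` below is a hypothetical
placeholder from the documentation prefix, not taken from the examples above).
One XDP line for the interface can then be paired with one vanilla line per address:

.. code-block:: lua

	-- XDP listens on the interface as a whole (target addresses are not checked)
	net.listen('eth2', 53, { kind = 'xdp' })
	-- vanilla listeners for TCP and fallback UDP, one per service address
	net.listen('203.0.113.53', 53, { kind = 'dns' })
	net.listen('2001:db8::53', 53, { kind = 'dns' })  -- assumed IPv6 address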

The default selection of queues is tailored for the usual naming convention:
``kresd@1.service``, ``kresd@2.service``, ...
but you can still specify them explicitly, e.g. the default is effectively the same as:

.. code-block:: lua

	net.listen('eth2', 53, { kind = 'xdp', nic_queue = env.SYSTEMD_INSTANCE - 1 })
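
If the same configuration might also run outside the ``kresd@N`` naming scheme,
the explicit form can be guarded. The following is only a sketch: it assumes that
``env.SYSTEMD_INSTANCE`` is unset or non-numeric in that case, and the choice to
skip XDP then (rather than fail) is illustrative, not the documented default behaviour:

.. code-block:: lua

	-- tonumber() yields nil when the variable is unset or not a number
	local instance = tonumber(env.SYSTEMD_INSTANCE)
	if instance and instance >= 1 then
		-- instances are numbered from 1, NIC queues from 0
		net.listen('eth2', 53, { kind = 'xdp', nic_queue = instance - 1 })
	end
	-- the vanilla listener is kept regardless (TCP and fallback UDP)
	net.listen('203.0.113.53', 53, { kind = 'dns' })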


Optimizations
^^^^^^^^^^^^^
.. this is basically copied from knot-dns doc/operations.rst

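Some helpful commands for spreading UDP flows across queues, setting the number of
queues and the ring sizes, and lowering the priority of the ``ksoftirqd`` threads: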

.. code-block:: text

	ethtool -N <interface> rx-flow-hash udp4 sdfn
	ethtool -N <interface> rx-flow-hash udp6 sdfn
	ethtool -L <interface> combined <queue-number>
	ethtool -G <interface> rx <ring-size> tx <ring-size>
	renice -n 19 -p $(pgrep '^ksoftirqd/[0-9]*$')

.. TODO CPU affinities?  `CPUAffinity=%i` in systemd unit sounds good.


.. _dns-over-xdp_limitations:

Limitations
^^^^^^^^^^^
.. this is basically copied from knot-dns doc/operations.rst

* VLAN segmentation is not supported.
* MTU higher than 1792 bytes is not supported.
* Multiple BPF filters per one network device are not supported.
* Symmetrical routing is required (query source MAC/IP addresses and
  reply destination MAC/IP addresses are the same).
* Systems with big-endian byte ordering require special recompilation of libknot.
* IPv4 header and UDP checksums are not verified on received DNS messages.
* DNS over XDP traffic is not visible to common system tools (e.g. firewalls or tcpdump).
* The BPF filter is not automatically unloaded from the network device. To unload it manually::

	ip link set dev <interface> xdp off

* Knot Resolver currently supports XDP only towards clients (not towards upstreams).
* When starting up an XDP socket, you may get a harmless warning::

	libbpf: Kernel error message: XDP program already attached