AF_XDP
======

AF_XDP (eXpress Data Path) is a high speed capture framework for Linux that was
introduced in Linux v4.18. AF_XDP aims to improve capture performance by
redirecting ingress frames to user-space memory rings, thus bypassing the
network stack.

Note that during ``af_xdp`` operation the selected interface cannot be used for
regular network usage.

Further reading:

 - https://www.kernel.org/doc/html/latest/networking/af_xdp.html

Compiling Suricata
------------------

Linux
~~~~~

libxdp and libbpf are required for this feature. When building from source the
development files will also be required.

Example::

  dnf -y install libxdp-devel libbpf-devel

This feature is enabled automatically when the libraries above are installed;
the user does not need to add any additional command line options.

The command line option ``--disable-af-xdp`` can be used to disable this
feature.

Example::

  ./configure --disable-af-xdp

Starting Suricata
-----------------

IDS
~~~

Suricata can be started as follows to use af-xdp:

::

  af-xdp:
    suricata --af-xdp=<interface>
    suricata --af-xdp=igb0

In the above example Suricata will start reading from the ``igb0`` network
interface.

AF_XDP Configuration
--------------------

Each of these settings can be configured under ``af-xdp`` within the "Configure
common capture settings" section of the suricata.yaml configuration file.

The number of threads created can be configured in the suricata.yaml
configuration file. It is recommended to use a number of threads equal to the
number of NIC queues/CPU cores.

Another option is to select ``auto``, which will allow Suricata to configure
the number of threads based on the number of RSS queues available on the NIC.
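To pick a sensible fixed ``threads`` value, you can check how many combined RSS queues the interface currently has. Below is a minimal sketch; the interface name ``igb0`` and the captured sample output are illustrative assumptions (on a live system you would run ``ethtool -l igb0`` directly and parse its output the same way):

```shell
# Sketch: derive a `threads:` value from `ethtool -l`-style output.
# The sample below is a captured, illustrative example; on a real
# system replace it with the output of: ethtool -l igb0
sample='Channel parameters for igb0:
Pre-set maximums:
Combined:	8
Current hardware settings:
Combined:	4'

# The last "Combined:" line reflects the current hardware settings.
queues=$(printf '%s\n' "$sample" | awk '/^Combined:/ {n=$2} END {print n}')
echo "threads: $queues"
```

Here the last ``Combined:`` line is used because ``ethtool -l`` prints the pre-set maximum first and the current setting second.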
::

  af-xdp:
    threads: <number>
    threads: auto
    threads: 8

Advanced setup
--------------

The af-xdp capture source will operate using the default configuration
settings. However, these settings are available in the suricata.yaml
configuration file.

Available configuration options are:

force-xdp-mode
~~~~~~~~~~~~~~

There are two operating modes employed when loading the XDP program:

- XDP_DRV: mode chosen when the driver supports AF_XDP
- XDP_SKB: mode chosen when the driver does not support AF_XDP

XDP_DRV mode is the preferred mode, used to ensure best performance.

::

  af-xdp:
    force-xdp-mode: <value> where: value = <skb|drv|none>
    force-xdp-mode: drv

force-bind-mode
~~~~~~~~~~~~~~~

During binding the kernel will first attempt to use zero-copy (preferred). If
zero-copy support is unavailable it will fall back to copy mode, copying all
packets out to user space.

::

  af-xdp:
    force-bind-mode: <value> where: value = <copy|zero|none>
    force-bind-mode: zero

For both options, the kernel will attempt the preferred option first and fall
back upon failure. Therefore the default (``none``) means the kernel has
control of which option to apply. If the user forces an option, the bind will
attempt only that option and fail if it is unavailable, i.e. there is no
fallback.

mem-unaligned
~~~~~~~~~~~~~

AF_XDP can operate in two memory alignment modes:

- Aligned chunk mode
- Unaligned chunk mode

Aligned chunk mode is the default option, which ensures alignment of the data
within the UMEM.

Unaligned chunk mode uses hugepages for the UMEM. Hugepages start at a size of
2MB but can be as large as 1GB. A lower number of pages (memory chunks) allows
faster lookup of page entries.
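As a rough sizing aid for unaligned chunk mode, the required hugepage count can be computed from the worker thread count. A minimal sketch, assuming roughly 16MB of UMEM per worker thread (the per-thread figure noted later in this section) and the default 2MB hugepage size; the figures are illustrative and should be checked against your system:

```shell
# Sketch: estimate the number of 2MB hugepages needed for the AF_XDP UMEM.
# Assumption: ~16MB of UMEM per socket/thread, 2MB default hugepage size.
threads=8         # number of af-xdp worker threads (illustrative)
mb_per_thread=16  # approximate UMEM per worker thread, in MB
hugepage_mb=2     # default hugepage size on x86-64

# Round up so partial pages are still allocated.
pages=$(( (threads * mb_per_thread + hugepage_mb - 1) / hugepage_mb ))
echo "hugepages required: $pages"
```

With 8 threads this yields 64 pages; confirm the actual hugepage size on your system with ``cat /proc/meminfo``.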
The hugepages need to be allocated on the NUMA node where the NIC and CPU
reside. Otherwise, if the hugepages are allocated only on NUMA node 0 and the
NIC is connected to NUMA node 1, the application will fail to start. Therefore,
it is recommended to first find out which NUMA node the NIC is connected to,
and only then allocate hugepages and set CPU core affinity to that NUMA node.

Memory assigned per socket/thread is 16MB, so each worker thread requires at
least 16MB of free hugepage space. As stated above, hugepages can be of various
sizes; consult the OS to confirm with ``cat /proc/meminfo``.

Example::

  8 worker threads * 16MB = 128MB
  hugepage size = 2MB
  so: pages required = 128MB / 2MB = 64 pages

See https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt for a detailed
description.

To enable unaligned chunk mode:

::

  af-xdp:
    mem-unaligned: <yes/no>
    mem-unaligned: yes

Introduced in Linux v5.11, the ``SO_PREFER_BUSY_POLL`` socket option allows
true polling of the socket queues. This feature was added to reduce context
switching and improve CPU reaction time during traffic reception.

This feature is enabled by default. The following options are used to
configure it.

enable-busy-poll
~~~~~~~~~~~~~~~~

Enables or disables busy polling.

::

  af-xdp:
    enable-busy-poll: <yes/no>
    enable-busy-poll: yes

busy-poll-time
~~~~~~~~~~~~~~

Sets the approximate time in microseconds to busy poll on a blocking receive
when there is no data.

::

  af-xdp:
    busy-poll-time: <time>
    busy-poll-time: 20

busy-poll-budget
~~~~~~~~~~~~~~~~

Budget allowed for batching of ingress frames. Larger values mean more frames
can be stored/read. It is recommended to test this setting for performance.
::

  af-xdp:
    busy-poll-budget: <budget>
    busy-poll-budget: 64

Linux tunables
~~~~~~~~~~~~~~

The ``SO_PREFER_BUSY_POLL`` option works in concert with the following two
Linux knobs to ensure best capture performance. These are not socket options:

- gro-flush-timeout
- napi-defer-hard-irq

The purpose of these two knobs is to defer interrupts and to allow the NAPI
context to be scheduled from a watchdog timer instead.

The ``gro-flush-timeout`` indicates the timeout period for the watchdog timer.
When no traffic is received for ``gro-flush-timeout``, the timer will exit and
softirq handling will resume.

The ``napi-defer-hard-irq`` indicates the number of queue scan attempts before
exiting to interrupt context. When enabled, the softirq NAPI context will exit
early, allowing busy polling.

::

  af-xdp:
    gro-flush-timeout: 2000000
    napi-defer-hard-irq: 2


Hardware setup
--------------

Intel NIC setup
~~~~~~~~~~~~~~~

Intel network cards don't support symmetric hashing, but it is possible to
emulate it by using a specific hashing function.

Follow these instructions closely for the desired result. First bring the
interface down, then enable symmetric hashing::

  ifconfig eth3 down
  ethtool -L eth3 combined 16 # if you have at least 16 cores
  ethtool -K eth3 rxhash on
  ethtool -K eth3 ntuple on
  ifconfig eth3 up
  ./set_irq_affinity 0-15 eth3
  ethtool -X eth3 hkey 6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A equal 16
  ethtool -x eth3
  ethtool -n eth3

In the above setup you are free to use any recent ``set_irq_affinity`` script.
It is available in any Intel x520/710 NIC sources driver download.

**NOTE:** We use a special low entropy key for the symmetric hashing.
`More info about the research for symmetric hashing setup
<http://www.ndsl.kaist.edu/~kyoungsoo/papers/TR-symRSS.pdf>`_

Disable any NIC offloading
~~~~~~~~~~~~~~~~~~~~~~~~~~

Suricata disables NIC offloading based on the configuration parameter
``disable-offloading``, which is enabled by default. See the ``capture``
section of the yaml file.

::

  capture:
    # disable NIC offloading. It's restored when Suricata exits.
    # Enabled by default.
    #disable-offloading: false

Balance as much as you can
~~~~~~~~~~~~~~~~~~~~~~~~~~

Try to use the network card's flow balancing as much as possible::

  for proto in tcp4 udp4 ah4 esp4 sctp4 tcp6 udp6 ah6 esp6 sctp6; do
     /sbin/ethtool -N eth3 rx-flow-hash $proto sd
  done

This command triggers load balancing using only source and destination IPs.
This may not be optimal in terms of load balancing fairness, but it ensures
that all packets of a flow will reach the same thread even in the case of IP
fragmentation (where source and destination ports will not be available for
some fragmented packets).
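If you want to review the exact commands before touching the NIC, the loop above can first be run as a dry run that only prints each command (``eth3`` is illustrative):

```shell
# Dry run: print the ethtool commands instead of executing them.
iface=eth3   # illustrative interface name
count=0
for proto in tcp4 udp4 ah4 esp4 sctp4 tcp6 udp6 ah6 esp6 sctp6; do
    echo "/sbin/ethtool -N $iface rx-flow-hash $proto sd"
    count=$((count + 1))
done
echo "$count commands generated"
```

Once the printed commands look correct for your interface, drop the ``echo`` to apply them.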