diff --git a/doc/userguide/capture-hardware/af-xdp.rst b/doc/userguide/capture-hardware/af-xdp.rst
new file mode 100644
index 0000000..ebe8585
--- /dev/null
+++ b/doc/userguide/capture-hardware/af-xdp.rst
@@ -0,0 +1,287 @@
+AF_XDP
+======
+
+AF_XDP (eXpress Data Path) is a high-speed capture framework for Linux that was
+introduced in Linux v4.18. AF_XDP aims to improve capture performance by
+redirecting ingress frames to user-space memory rings, thus bypassing the network
+stack.
+
+Note that during ``af_xdp`` operation the selected interface cannot be used for
+regular network usage.
+
+Further reading:
+
+ - https://www.kernel.org/doc/html/latest/networking/af_xdp.html
+
+Compiling Suricata
+------------------
+
+Linux
+~~~~~
+
+libxdp and libbpf are required for this feature. When building from source, the
+development files are also required.
+
+Example::
+
+ dnf -y install libxdp-devel libbpf-devel
+
+This feature is enabled automatically when the libraries above are installed;
+no additional configure options are needed.
+
+The configure option ``--disable-af-xdp`` can be used to disable this
+feature.
+
+Example::
+
+ ./configure --disable-af-xdp
+
+Starting Suricata
+-----------------
+
+IDS
+~~~
+
+Suricata can be started as follows to use af-xdp:
+
+::
+
+  suricata --af-xdp=<interface>
+  suricata --af-xdp=igb0
+
+In the above example Suricata will start reading from the ``igb0`` network interface.
+
+AF_XDP Configuration
+--------------------
+
+Each of these settings can be configured under ``af-xdp`` within the "Configure
+common capture settings" section of the suricata.yaml configuration file.
+
+The number of threads created can be configured in the suricata.yaml configuration
+file. It is recommended to use a thread count equal to the number of NIC
+queues/CPU cores.
+
+Another option is to select ``auto``, in which case Suricata spawns one receive
+thread per RSS queue configured on the interface.
+
+::
+
+  af-xdp:
+    threads: <number>
+    threads: auto
+    threads: 8
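+
+The queue count that ``auto`` relies on can usually be inspected with ``ethtool``.
+A minimal sketch, assuming the driver supports channel reporting and using
+``eth3`` as a placeholder interface name::
+
+  # Show the maximum and currently configured channel (queue) counts
+  ethtool -l eth3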
+
+Advanced setup
+---------------
+
+The af-xdp capture source operates with default settings unless configured
+otherwise. The defaults can be overridden in the suricata.yaml configuration file.
+
+Available configuration options are:
+
+force-xdp-mode
+~~~~~~~~~~~~~~
+
+There are two operating modes employed when loading the XDP program:
+
+- XDP_DRV: Mode chosen when the driver supports AF_XDP
+- XDP_SKB: Mode chosen when driver support for AF_XDP is unavailable
+
+XDP_DRV mode is the preferred mode, used to ensure best performance.
+
+::
+
+  af-xdp:
+    force-xdp-mode: <value> where: value = <skb|drv|none>
+    force-xdp-mode: drv
+
+force-bind-mode
+~~~~~~~~~~~~~~~
+
+During binding the kernel will first attempt to use zero-copy (preferred). If
+zero-copy support is unavailable it will fall back to copy mode, copying all
+packets out to user space.
+
+::
+
+  af-xdp:
+    force-bind-mode: <value> where: value = <copy|zero|none>
+    force-bind-mode: zero
+
+For both options, the kernel will attempt the 'preferred' option first and
+fall back upon failure. The default (none) therefore leaves the kernel in
+control of which option to apply. Setting either option forces that mode:
+the bind will attempt only the configured mode and will fail if it is not
+supported, i.e. there is no fallback.
+
+mem-unaligned
+~~~~~~~~~~~~~~~~
+
+AF_XDP can operate in two memory alignment modes, these are:
+
+- Aligned chunk mode
+- Unaligned chunk mode
+
+Aligned chunk mode is the default option which ensures alignment of the
+data within the UMEM.
+
+Unaligned chunk mode uses hugepages for the UMEM.
+Hugepages start at a size of 2MB but can be as large as 1GB.
+A lower count of pages (memory chunks) allows faster lookup of page entries.
+The hugepages need to be allocated on the NUMA node where the NIC and CPU reside.
+Otherwise, if the hugepages are allocated only on NUMA node 0 and the NIC is
+connected to NUMA node 1, then the application will fail to start.
+Therefore, it is recommended to first find out which NUMA node the NIC is
+connected to and only then allocate hugepages and set CPU core affinity
+to the given NUMA node.
+
+Memory assigned per socket/thread is 16MB, so each worker thread requires at least
+16MB of free space. As stated above, hugepages can be of various sizes; confirm the
+size in use on the system with ``cat /proc/meminfo``.
+
+Example ::
+
+  8 worker threads * 16MB = 128MB
+  hugepages = 2048 kB
+  so: pages required = 128MB / 2MB = 64 pages
+
+See https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt for detailed
+description.
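+
+The NUMA placement described above can be checked and applied through sysfs. A
+minimal sketch, assuming a placeholder interface ``eth3``, 2MB hugepages, and a
+NIC that reports NUMA node 1 (adjust the node and page count to the system)::
+
+  # NUMA node the NIC is attached to (-1 means no NUMA information)
+  cat /sys/class/net/eth3/device/numa_node
+  # Reserve 64 hugepages of 2MB on NUMA node 1 (requires root)
+  echo 64 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
+  # Confirm the hugepage size and totals
+  grep Huge /proc/meminfo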
+
+To enable unaligned chunk mode:
+
+::
+
+  af-xdp:
+    mem-unaligned: <yes/no>
+    mem-unaligned: yes
+
+Linux v5.11 added a ``SO_PREFER_BUSY_POLL`` option to AF_XDP that allows
+true polling of the socket queues. This feature was introduced to reduce
+context switching and improve CPU reaction time during traffic reception.
+
+Busy polling is enabled by default and applies the options described below
+unless it is disabled.
+
+enable-busy-poll
+~~~~~~~~~~~~~~~~
+
+Enables or disables busy polling.
+
+::
+
+  af-xdp:
+    enable-busy-poll: <yes/no>
+    enable-busy-poll: yes
+
+busy-poll-time
+~~~~~~~~~~~~~~
+
+Sets the approximate time in microseconds to busy poll on a ``blocking receive``
+when there is no data.
+
+::
+
+  af-xdp:
+    busy-poll-time: <time>
+    busy-poll-time: 20
+
+busy-poll-budget
+~~~~~~~~~~~~~~~~
+
+Budget allowed for batching of ingress frames. Larger values mean more
+frames can be stored/read. It is recommended to test this for performance.
+
+::
+
+  af-xdp:
+    busy-poll-budget: <budget>
+    busy-poll-budget: 64
+
+Linux tunables
+~~~~~~~~~~~~~~~
+
+The ``SO_PREFER_BUSY_POLL`` option works in concert with the following two Linux
+knobs to ensure best capture performance. These are not socket options:
+
+- gro-flush-timeout
+- napi-defer-hard-irq
+
+The purpose of these two knobs is to defer interrupts and to allow the
+NAPI context to be scheduled from a watchdog timer instead.
+
+The ``gro-flush-timeout`` indicates the timeout period for the watchdog
+timer. When no traffic is received for ``gro-flush-timeout`` the timer will
+exit and softirq handling will resume.
+
+The ``napi-defer-hard-irq`` indicates the number of queue scan attempts
+before exiting to interrupt context. When enabled, the softirq NAPI context will
+exit early, allowing busy polling.
+
+::
+
+  af-xdp:
+    gro-flush-timeout: 2000000
+    napi-defer-hard-irq: 2
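+
+The values currently applied to an interface can be read back from sysfs. A
+minimal sketch, assuming a placeholder interface ``eth3`` and a kernel that
+exposes these per-device attributes::
+
+  # Watchdog timer period in nanoseconds
+  cat /sys/class/net/eth3/gro_flush_timeout
+  # Number of deferred hard IRQ rounds
+  cat /sys/class/net/eth3/napi_defer_hard_irqs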
+
+
+Hardware setup
+---------------
+
+Intel NIC setup
+~~~~~~~~~~~~~~~
+
+Intel network cards don't support symmetric hashing but it is possible to emulate
+it by using a specific hashing function.
+
+Follow these instructions closely to enable symmetric hashing::
+
+  ifconfig eth3 down
+  ethtool -L eth3 combined 16 # if you have at least 16 cores
+  ethtool -K eth3 rxhash on
+  ethtool -K eth3 ntuple on
+  ifconfig eth3 up
+  ./set_irq_affinity 0-15 eth3
+  ethtool -X eth3 hkey 6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A equal 16
+  ethtool -x eth3
+  ethtool -n eth3
+
+In the above setup you are free to use any recent ``set_irq_affinity`` script. It is available in any Intel x520/710 NIC driver source download.
+
+**NOTE:**
+We use a special low entropy key for the symmetric hashing. `More info about the research for symmetric hashing setup <http://www.ndsl.kaist.edu/~kyoungsoo/papers/TR-symRSS.pdf>`_.
+
+Disable any NIC offloading
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Suricata disables NIC offloading based on the configuration parameter ``disable-offloading``, which is enabled by default.
+See the ``capture`` section of the yaml file.
+
+::
+
+  capture:
+    # disable NIC offloading. It's restored when Suricata exits.
+    # Enabled by default.
+    #disable-offloading: false
+
+Balance as much as you can
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Try to use the network card's flow balancing as much as possible ::
+
+  for proto in tcp4 udp4 ah4 esp4 sctp4 tcp6 udp6 ah6 esp6 sctp6; do
+    /sbin/ethtool -N eth3 rx-flow-hash $proto sd
+  done
+
+This command triggers load balancing using only source and destination IPs. This may not be optimal
+in terms of load balancing fairness but it ensures all packets of a flow will reach the same thread
+even in the case of IP fragmentation (where source and destination ports will not be available for
+some fragmented packets).