..  BSD LICENSE
    Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Libpcap and Ring Based Poll Mode Drivers
========================================

In addition to Poll Mode Drivers (PMDs) for physical and virtual hardware,
the DPDK also includes two pure-software PMDs. These two drivers are:

*   A libpcap-based PMD (librte_pmd_pcap) that reads and writes packets using libpcap,
    both from files on disk and from physical NIC devices using standard Linux kernel drivers.

*   A ring-based PMD (librte_pmd_ring) that allows a set of software FIFOs (that is, rte_ring)
    to be accessed using the PMD APIs, as though they were physical NICs.

.. note::

    The libpcap-based PMD is disabled by default in the build configuration files,
    owing to an external dependency on the libpcap development files which must be installed on the board.
    Once the libpcap development files are installed,
    the library can be enabled by setting CONFIG_RTE_LIBRTE_PMD_PCAP=y and recompiling the DPDK.

Using the Drivers from the EAL Command Line
-------------------------------------------

For ease of use, the DPDK EAL has also been extended to allow pseudo-Ethernet devices,
using one or more of these drivers,
to be created at application startup time during EAL initialization.

To do so, the --vdev= parameter must be passed to the EAL.
This takes options that allow ring and pcap-based Ethernet devices to be allocated and used transparently by the application.
This can be used, for example, for testing on a virtual machine where there are no Ethernet ports.

Libpcap-based PMD
~~~~~~~~~~~~~~~~~

Pcap-based devices can be created using the virtual device --vdev option.
The device name must start with the net_pcap prefix followed by numbers or letters.
The name is unique for each device. Each device can have multiple stream options and multiple devices can be used.
Multiple device definitions can be arranged using multiple --vdev.
Device name and stream options must be separated by commas as shown below:

.. code-block:: console

   $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
       --vdev 'net_pcap0,stream_opt0=..,stream_opt1=..' \
       --vdev='net_pcap1,stream_opt0=..'
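
As a rough illustration of the comma-separated format described above, the following sketch assembles a ``--vdev`` argument string from a device name and a list of stream options.
The ``stream_opt`` structure and the ``build_vdev_arg()`` helper are hypothetical names introduced for this example; they are not part of the DPDK API.

.. code-block:: c

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical key/value pair for one stream option (not a DPDK type). */
    struct stream_opt {
        const char *key;   /* for example "rx_pcap" */
        const char *value; /* for example "/path/to/file.pcap" */
    };

    /*
     * Hypothetical helper (not a DPDK function): writes
     * "name,key=value,key=value" into buf.
     * Returns the string length, or -1 if the name does not start with
     * "net_pcap" or if buf is too small.
     */
    static int
    build_vdev_arg(char *buf, size_t len, const char *name,
                   const struct stream_opt *opts, size_t nb_opts)
    {
        size_t used;
        int n;

        if (strncmp(name, "net_pcap", 8) != 0)
            return -1;

        n = snprintf(buf, len, "%s", name);
        if (n < 0 || (size_t)n >= len)
            return -1;
        used = (size_t)n;

        for (size_t i = 0; i < nb_opts; i++) {
            n = snprintf(buf + used, len - used, ",%s=%s",
                         opts[i].key, opts[i].value);
            if (n < 0 || (size_t)n >= len - used)
                return -1;
            used += (size_t)n;
        }
        return (int)used;
    }

The resulting string could then be placed after ``--vdev`` in the argument vector that the application hands to the EAL.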

Device Streams
^^^^^^^^^^^^^^

Stream definitions can be combined in several ways, as long as the following two rules are respected:

*   A device is provided with two different streams - reception and transmission.

*   A device is provided with one network interface name used for reading and writing packets.

The different stream types are:

*   rx_pcap: Defines a reception stream based on a pcap file.
    The driver reads each packet within the given pcap file as if it was receiving it from the wire.
    The value is a path to a valid pcap file.

        rx_pcap=/path/to/file.pcap

*   tx_pcap: Defines a transmission stream based on a pcap file.
    The driver writes each received packet to the given pcap file.
    The value is a path to a pcap file.
    The file is overwritten if it already exists and it is created if it does not.

        tx_pcap=/path/to/file.pcap

*   rx_iface: Defines a reception stream based on a network interface name.
    The driver reads packets coming from the given interface using the Linux kernel driver for that interface.
    The value is an interface name.

        rx_iface=eth0

*   tx_iface: Defines a transmission stream based on a network interface name.
    The driver sends packets to the given interface using the Linux kernel driver for that interface.
    The value is an interface name.

        tx_iface=eth0

*   iface: Defines a device mapping a network interface.
    The driver both reads and writes packets from and to the given interface.
    The value is an interface name.

        iface=eth0
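
One possible reading of the rules above is that a device is defined either by a single ``iface`` option, or by at least one reception stream together with at least one transmission stream.
The sketch below checks a list of option keys against that reading.
The function name ``pcap_streams_valid()`` is hypothetical, and the check is an assumption made for illustration; it is not taken from the driver, whose actual validation may differ.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /*
     * Hypothetical check (not driver code): accepts either a lone "iface"
     * option, or at least one rx_* option plus at least one tx_* option.
     */
    static bool
    pcap_streams_valid(const char *const keys[], size_t nb_keys)
    {
        size_t rx = 0, tx = 0, iface = 0;

        for (size_t i = 0; i < nb_keys; i++) {
            if (strcmp(keys[i], "iface") == 0)
                iface++;
            else if (strcmp(keys[i], "rx_pcap") == 0 ||
                     strcmp(keys[i], "rx_iface") == 0)
                rx++;
            else if (strcmp(keys[i], "tx_pcap") == 0 ||
                     strcmp(keys[i], "tx_iface") == 0)
                tx++;
            else
                return false; /* not one of the stream types listed above */
        }

        if (iface > 0)
            return iface == 1 && rx == 0 && tx == 0;
        return rx > 0 && tx > 0;
    }

Under this reading, ``rx_pcap`` combined with ``tx_iface`` is accepted, while ``iface`` combined with ``tx_pcap`` is rejected.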

Examples of Usage
^^^^^^^^^^^^^^^^^

Read packets from one pcap file and write them to another:

.. code-block:: console

    $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
        --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_pcap=file_tx.pcap' \
        -- --port-topology=chained

Read packets from a network interface and write them to a pcap file:

.. code-block:: console

    $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
        --vdev 'net_pcap0,rx_iface=eth0,tx_pcap=file_tx.pcap' \
        -- --port-topology=chained

Read packets from a pcap file and write them to a network interface:

.. code-block:: console

    $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
        --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_iface=eth1' \
        -- --port-topology=chained

Forward packets through two network interfaces:

.. code-block:: console

    $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
        --vdev 'net_pcap0,iface=eth0' --vdev='net_pcap1,iface=eth1'

Using libpcap-based PMD with the testpmd Application
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

One of the first things that testpmd does before starting to forward packets is to flush the RX streams
by reading the first 512 packets on every RX stream and discarding them.
When using a libpcap-based PMD this behavior can be turned off using the following command line option:

.. code-block:: console

    --no-flush-rx

It is also available in the runtime command line:

.. code-block:: console

    set flush_rx on/off

This is useful when rx_pcap is being used and no packets are meant to be discarded.
Otherwise, the first 512 packets from the input pcap file will be discarded by the RX flushing operation.

.. code-block:: console

    $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
        --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_pcap=file_tx.pcap' \
        -- --port-topology=chained --no-flush-rx


Rings-based PMD
~~~~~~~~~~~~~~~

To run a DPDK application on a machine without any Ethernet devices, a pair of ring-based rte_ethdevs can be used as below.
The device names passed to the --vdev option must start with net_ring and take no additional parameters.
Multiple devices may be specified, separated by commas.

.. code-block:: console

    ./testpmd -l 1-3 -n 4 --vdev=net_ring0 --vdev=net_ring1 -- -i
    EAL: Detected lcore 1 as core 1 on socket 0
    ...

    Interactive-mode selected
    Configuring Port 0 (socket 0)
    Configuring Port 1 (socket 0)
    Checking link statuses...
    Port 0 Link Up - speed 10000 Mbps - full-duplex
    Port 1 Link Up - speed 10000 Mbps - full-duplex
    Done

    testpmd> start tx_first
    io packet forwarding - CRC stripping disabled - packets/burst=16
    nb forwarding cores=1 - nb forwarding ports=2
    RX queues=1 - RX desc=128 - RX free threshold=0
    RX threshold registers: pthresh=8 hthresh=8 wthresh=4
    TX queues=1 - TX desc=512 - TX free threshold=0
    TX threshold registers: pthresh=36 hthresh=0 wthresh=0
    TX RS bit threshold=0 - TXQ flags=0x0

    testpmd> stop
    Telling cores to stop...
    Waiting for lcores to finish...

.. image:: img/forward_stats.*

.. code-block:: console

    +++++++++++++++ Accumulated forward statistics for allports++++++++++
    RX-packets: 462384736  RX-dropped: 0 RX-total: 462384736
    TX-packets: 462384768  TX-dropped: 0 TX-total: 462384768
    +++++++++++++++++++++++++++++++++++++++++++++++++++++

    Done.


Using the Poll Mode Driver from an Application
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Both drivers provide similar APIs that allow the user to create PMD instances, that is,
rte_ethdev structures, at run-time in the end-application,
for example, using the rte_eth_from_rings() or rte_eth_from_pcaps() APIs.
For the rings-based PMD, this functionality could be used, for example,
to allow data exchange between cores using rings to be done in exactly the
same way as sending or receiving packets from an Ethernet device.
For the libpcap-based PMD, it allows an application to open one or more pcap files
and use these as a source of packet input to the application.

Usage Examples
^^^^^^^^^^^^^^

To create two pseudo-Ethernet ports where all traffic sent to a port is looped back
for reception on the same port (error handling omitted for clarity):

.. code-block:: c

    #define RING_SIZE 256
    #define NUM_RINGS 2
    #define SOCKET0 0

    struct rte_ring *ring[NUM_RINGS];
    int port0, port1;

    ring[0] = rte_ring_create("R0", RING_SIZE, SOCKET0, RING_F_SP_ENQ|RING_F_SC_DEQ);
    ring[1] = rte_ring_create("R1", RING_SIZE, SOCKET0, RING_F_SP_ENQ|RING_F_SC_DEQ);

    /* create two ethdev's */

    port0 = rte_eth_from_rings("net_ring0", ring, NUM_RINGS, ring, NUM_RINGS, SOCKET0);
    port1 = rte_eth_from_rings("net_ring1", ring, NUM_RINGS, ring, NUM_RINGS, SOCKET0);


To create two pseudo-Ethernet ports where the traffic is switched between them,
that is, traffic sent to port 0 is read back from port 1 and vice-versa,
the final two lines could be changed as below:

.. code-block:: c

    port0 = rte_eth_from_rings("net_ring0", &ring[0], 1, &ring[1], 1, SOCKET0);
    port1 = rte_eth_from_rings("net_ring1", &ring[1], 1, &ring[0], 1, SOCKET0);
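
The difference between the looped-back and the switched wiring can be pictured with a toy model that does not depend on DPDK.
In the sketch below, ``toy_ring`` is a simplified single-threaded FIFO standing in for rte_ring, and ``toy_port`` stands in for an ethdev built from one RX ring and one TX ring.
All names are hypothetical and exist only for this illustration.

.. code-block:: c

    #include <stddef.h>

    #define TOY_RING_SIZE 8 /* power of two, so index wrap-around stays consistent */

    /* Toy single-threaded FIFO standing in for rte_ring (illustration only). */
    struct toy_ring {
        int slots[TOY_RING_SIZE];
        size_t head; /* index of the next item to read */
        size_t tail; /* index of the next free slot to write */
    };

    /* Toy port: receives from one ring and transmits to another. */
    struct toy_port {
        struct toy_ring *rx;
        struct toy_ring *tx;
    };

    /* Enqueue one packet on the port's TX ring; returns 1 on success, 0 if full. */
    static int
    toy_tx(struct toy_port *p, int pkt)
    {
        struct toy_ring *r = p->tx;

        if (r->tail - r->head == TOY_RING_SIZE)
            return 0;
        r->slots[r->tail++ % TOY_RING_SIZE] = pkt;
        return 1;
    }

    /* Dequeue one packet from the port's RX ring; returns 1 on success, 0 if empty. */
    static int
    toy_rx(struct toy_port *p, int *pkt)
    {
        struct toy_ring *r = p->rx;

        if (r->tail == r->head)
            return 0;
        *pkt = r->slots[r->head++ % TOY_RING_SIZE];
        return 1;
    }

With two rings ``r0`` and ``r1``, a port defined as ``{ &r0, &r1 }`` and a second one defined as ``{ &r1, &r0 }`` mirror the switched configuration: a packet passed to ``toy_tx()`` on the first port is returned by ``toy_rx()`` on the second.
A port defined as ``{ &r0, &r0 }`` mirrors the looped-back configuration.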

This type of configuration could be useful in a pipeline model, for example,
where one may want to have inter-core communication using pseudo Ethernet devices rather than raw rings,
for reasons of API consistency.

Enqueuing and dequeuing items from an rte_ring using the rings-based PMD may be slower than using the native rings API.
This is because DPDK Ethernet drivers make use of function pointers to call the appropriate enqueue or dequeue functions,
while the rte_ring specific functions are direct function calls in the code and are often inlined by the compiler.
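
The cost difference described above comes from the extra level of indirection.
The following sketch contrasts a directly callable inline enqueue with the same operation reached through a function pointer stored in a device object.
The ``toy_`` types and functions are hypothetical and only loosely model the two call paths; they are not DPDK code.

.. code-block:: c

    #include <stddef.h>

    #define TOY_QUEUE_LEN 16

    struct toy_queue {
        int items[TOY_QUEUE_LEN];
        size_t count;
    };

    /* Direct path: the compiler sees the callee and may inline it. */
    static inline int
    toy_enqueue(struct toy_queue *q, int item)
    {
        if (q->count == TOY_QUEUE_LEN)
            return 0; /* queue full */
        q->items[q->count++] = item;
        return 1;
    }

    /* Indirect path: the device carries a pointer to its transmit function. */
    struct toy_dev {
        int (*tx)(void *queue, int item);
        void *queue;
    };

    /* Driver-side transmit function registered in a toy_dev. */
    static int
    toy_ring_tx(void *queue, int item)
    {
        return toy_enqueue(queue, item);
    }

    /* Generic transmit entry point that dispatches through the pointer. */
    static inline int
    toy_dev_tx(struct toy_dev *dev, int item)
    {
        /* The target is only known at run time, so this call generally
         * cannot be inlined. */
        return dev->tx(dev->queue, item);
    }

Both paths place the item in the same queue; the second pays for an indirect call on every operation, which is the overhead the paragraph above refers to.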

Once an ethdev has been created, for either a ring or a pcap-based PMD,
it should be configured and started in the same way as a regular Ethernet device, that is,
by calling rte_eth_dev_configure() to set the number of receive and transmit queues,
then calling rte_eth_rx_queue_setup() / rte_eth_tx_queue_setup() for each of those queues and
finally calling rte_eth_dev_start() to allow transmission and reception of packets to begin.