.TH MQPRIO 8 "24 Sept 2013" "iproute2" "Linux"
.SH NAME
MQPRIO \- Multiqueue Priority Qdisc (Offloaded Hardware QOS)
.SH SYNOPSIS
.B tc qdisc ... dev
dev (
.B parent
classid | root) [
.B handle
major: ]
.B mqprio
.ti +8
[
.B num_tc
tcs ] [
.B map
P0 P1 P2... ] [
.B queues
count1@offset1 count2@offset2 ... ]
.ti +8
[
.B hw
1|0 ] [
.B mode
dcb|channel ] [
.B shaper
dcb|bw_rlimit ]
.ti +8
[
.B min_rate
min_rate1 min_rate2 ... ] [
.B max_rate
max_rate1 max_rate2 ... ]
.ti +8
[
.B fp
FP0 FP1 FP2 ... ]
.SH DESCRIPTION
The MQPRIO qdisc is a simple queuing discipline that allows mapping
traffic flows to hardware queue ranges using priorities and a configurable
priority to traffic class mapping. A traffic class in this context is
a set of contiguous qdisc classes which map 1:1 to a set of hardware
exposed queues.
By default the qdisc allocates a pfifo qdisc (packet limited first in, first
out queue) per TX queue exposed by the lower layer device. Other queuing
disciplines may be added subsequently. Packets are enqueued using the
.B map
parameter and hashed across the queues indicated by the
.B offset
and
.BR count .
By default these parameters are configured by the hardware
driver to match the hardware QOS structures.
.B Channel
mode supports full offload of the mqprio options, the traffic classes, the queue
configurations and QOS attributes to the hardware. Enabled hardware can provide
hardware QOS with the ability to steer traffic flows to designated traffic
classes provided by this qdisc. Hardware based QOS is configured using the
.B shaper
parameter.
.B bw_rlimit
with minimum and maximum bandwidth rates can be used for setting
transmission rates on each traffic class. Further qdiscs may also be added
to the classes of MQPRIO to create more complex configurations.
.SH ALGORITHM
On creation with 'tc qdisc add', eight traffic classes are created mapping
priorities 0..7 to traffic classes 0..7 and priorities greater than 7 to
traffic class 0. This requires base driver support and the creation will
fail on devices that do not support hardware QOS schemes.
These defaults can be overridden using the qdisc parameters. Providing
the 'hw 0' flag allows software to run without hardware coordination.
If hardware coordination is being used and arguments are provided that
the hardware cannot support, an error is returned. For many users
hardware defaults should work reasonably well.
As one specific example, numerous Ethernet cards support the 802.1Q
link strict priority transmission selection algorithm (TSA). MQPRIO
enabled hardware in conjunction with the classification methods below
can provide hardware offloaded support for this TSA.
.SH CLASSIFICATION
Multiple methods are available to set the SKB priority, which MQPRIO
uses to select the traffic class in which to enqueue the packet.
.TP
From user space
A process with sufficient privileges can encode the destination class
directly with SO_PRIORITY, see
.BR socket (7).
.TP
with iptables/nftables
An iptables/nftables rule can be created to match traffic flows and
set the priority. See
.BR iptables (8).
.TP
with net_prio cgroups
The net_prio cgroup can be used to set the priority of all sockets
belonging to an application. See kernel and cgroup documentation for details.
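.PP
As a minimal sketch of the iptables method, the rule below uses the CLASSIFY
target to set the SKB priority of matching packets. The interface name eth0,
the TCP port 5001 and the priority value 4 are placeholders chosen for
illustration and should be adapted to the system at hand.

```shell
# CLASSIFY stores the class ID in skb->priority. With major number 0,
# the minor number (read as hexadecimal) becomes the priority that
# mqprio looks up in its map. Requires root.
iptables -t mangle -A POSTROUTING -o eth0 -p tcp --dport 5001 \
    -j CLASSIFY --set-class 0:4
```

With a map such as "0 0 0 0 1 1 1 1 ...", priority 4 selects traffic class 1.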
.SH QDISC PARAMETERS
.TP
num_tc
Number of traffic classes to use. Up to 16 classes are supported.
You cannot have more classes than queues.
.TP
map
The priority to traffic class map. Maps priorities 0..15 to a specified
traffic class.
.TP
queues
Provide count and offset of queue range for each traffic class. In the
format,
.B count@offset.
Queue ranges for the traffic classes cannot overlap, and each must be a
contiguous range of queues.
.TP
hw
Set to
.B 1
to support hardware offload. Set to
.B 0
to configure user specified values in software only.
The default value of this parameter is
.BR 1 .
.TP
mode
Set to
.B channel
for full use of the mqprio options. Use
.B dcb
to offload only TC values and use hardware QOS defaults. Supported with 'hw'
set to 1 only.
.TP
shaper
Use
.B bw_rlimit
to set bandwidth rate limits for a traffic class. Use
.B dcb
for hardware QOS defaults. Supported with 'hw' set to 1 only.
.TP
min_rate
Minimum value of bandwidth rate limit for a traffic class. Supported only when
the
.B 'shaper'
argument is set to
.B 'bw_rlimit'.
.TP
max_rate
Maximum value of bandwidth rate limit for a traffic class. Supported only when
the
.B 'shaper'
argument is set to
.B 'bw_rlimit'.
.TP
fp
Selects whether traffic classes are express (deliver packets via the eMAC) or
preemptible (deliver packets via the pMAC), according to IEEE 802.1Q-2018
clause 6.7.2 Frame preemption. Takes the form of an array (one element per
traffic class) with values being
.B 'E'
(for express) or
.B 'P'
(for preemptible).
Multiple priorities which map to the same traffic class, as well as multiple
TXQs which map to the same traffic class, must have the same FP attributes.
To interpret the FP as an attribute per priority, the
.B 'map'
argument can be used for translation. To interpret FP as an attribute per TXQ,
the
.B 'queues'
argument can be used for translation.
Traffic classes are express by default. The argument is supported only with
.B 'hw'
set to 1. Preemptible traffic classes are accepted only if the device has a MAC
Merge layer configurable through
.BR ethtool (8).
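.PP
The offload parameters above can be combined as in the following sketch. It
assumes a device named eth0 with at least 8 TX queues whose driver supports
channel mode, the bw_rlimit shaper and frame preemption. The device name,
queue layout and rate values are illustrative only, and a given driver may
reject any of them.

```shell
# Two traffic classes of four queues each, fully offloaded, with
# per-class minimum and maximum rates (one value per traffic class).
tc qdisc add dev eth0 root mqprio \
    num_tc 2 \
    map 0 0 0 0 1 1 1 1 \
    queues 4@0 4@4 \
    hw 1 mode channel shaper bw_rlimit \
    min_rate 1Gbit 2Gbit \
    max_rate 4Gbit 5Gbit

# Four traffic classes of one queue each; tc0 is preemptible and
# tc1 to tc3 are express (one fp value per traffic class).
tc qdisc replace dev eth0 root mqprio \
    num_tc 4 \
    map 0 1 2 3 \
    queues 1@0 1@1 1@2 1@3 \
    hw 1 \
    fp P E E E
```

Priorities left out of a shortened map are assumed here to fall back to
traffic class 0.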
.SH SEE ALSO
.BR ethtool (8)
.SH EXAMPLE
The following example shows how to map priorities to 4 traffic classes ("num_tc 4"),
and then how to pair these traffic classes with 4 hardware queues with mqprio,
with hardware coordination ("hw 1", which may also be left unspecified because 1 is the default value).
Traffic class 0 (tc0) is mapped to hardware queue 0 (q0), tc1 is mapped to q1,
tc2 is mapped to q2, and tc3 is mapped to q3.
.EX
# tc qdisc add dev eth0 root mqprio \
num_tc 4 \
map 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 \
queues 1@0 1@1 1@2 1@3 \
hw 1
.EE
The next example shows how to map priorities to 3 traffic classes ("num_tc 3"),
and how to pair these traffic classes with 4 queues,
without hardware coordination ("hw 0").
Traffic class 0 (tc0) is mapped to hardware queue 0 (q0), tc1 is mapped to q1,
and tc2 is mapped to q2 and q3. Packets of tc2 are hashed across these
two queues, so the choice between them appears somewhat random.
.EX
# tc qdisc add dev eth0 root mqprio \
num_tc 3 \
map 0 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 \
queues 1@0 1@1 2@2 \
hw 0
.EE
In both examples above, the priority values from 0 to 3 (prio0-3) are
mapped to tc0, prio4-7 are mapped to tc1, and
prio8-11 are mapped to tc2 ("map" attribute). The last four priority values
(prio12-15) are mapped differently in the two examples:
to tc3 in the first example and to tc2 in the second example.
The resulting mappings for the two examples are the following:
.EX
┌────┬────┬───────┐  ┌────┬────┬────────┐
│Prio│ tc │ queue │  │Prio│ tc │  queue │
├────┼────┼───────┤  ├────┼────┼────────┤
│  0 │  0 │     0 │  │  0 │  0 │      0 │
│  1 │  0 │     0 │  │  1 │  0 │      0 │
│  2 │  0 │     0 │  │  2 │  0 │      0 │
│  3 │  0 │     0 │  │  3 │  0 │      0 │
│  4 │  1 │     1 │  │  4 │  1 │      1 │
│  5 │  1 │     1 │  │  5 │  1 │      1 │
│  6 │  1 │     1 │  │  6 │  1 │      1 │
│  7 │  1 │     1 │  │  7 │  1 │      1 │
│  8 │  2 │     2 │  │  8 │  2 │ 2 or 3 │
│  9 │  2 │     2 │  │  9 │  2 │ 2 or 3 │
│ 10 │  2 │     2 │  │ 10 │  2 │ 2 or 3 │
│ 11 │  2 │     2 │  │ 11 │  2 │ 2 or 3 │
│ 12 │  3 │     3 │  │ 12 │  2 │ 2 or 3 │
│ 13 │  3 │     3 │  │ 13 │  2 │ 2 or 3 │
│ 14 │  3 │     3 │  │ 14 │  2 │ 2 or 3 │
│ 15 │  3 │     3 │  │ 15 │  2 │ 2 or 3 │
└────┴────┴───────┘  └────┴────┴────────┘
     example1              example2
.EE
Another example of queue mapping is the following.
There are 5 traffic classes, and there are 8 hardware queues.
.EX
# tc qdisc add dev eth0 root mqprio \
num_tc 5 \
map 0 0 0 1 1 1 1 2 2 3 3 4 4 4 4 4 \
queues 1@0 2@1 1@3 1@4 3@5
.EE
The value mapping is the following for this example:
.EX
       ┌───────┐
tc0────┤Queue 0│◄────1@0
       ├───────┤
     ┌─┤Queue 1│◄────2@1
tc1──┤ ├───────┤
     └─┤Queue 2│
       ├───────┤
tc2────┤Queue 3│◄────1@3
       ├───────┤
tc3────┤Queue 4│◄────1@4
       ├───────┤
     ┌─┤Queue 5│◄────3@5
     │ ├───────┤
tc4──┼─┤Queue 6│
     │ ├───────┤
     └─┤Queue 7│
       └───────┘
.EE
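.PP
The priority to traffic class to queue lookup shown in the examples can be
reproduced offline with a small POSIX shell helper. This is only a sketch for
checking a configuration before applying it. The function name mqprio_table
is hypothetical and not part of tc, and the helper does not check queue
ranges for overlap.

```shell
# mqprio_table MAP QUEUES
# MAP is the space-separated "map" argument and QUEUES the
# space-separated "queues" argument (count@offset per traffic class).
# Prints one line per priority with its traffic class and queue range.
mqprio_table() {
    map=$1
    queues=$2
    prio=0
    for tc in $map; do
        # Find the count@offset entry belonging to this traffic class.
        idx=0
        spec=
        for q in $queues; do
            if [ "$idx" -eq "$tc" ]; then spec=$q; fi
            idx=$((idx + 1))
        done
        if [ -z "$spec" ]; then
            echo "prio $prio: no queue range for tc $tc" >&2
            return 1
        fi
        count=${spec%@*}     # part before the '@'
        offset=${spec#*@}    # part after the '@'
        last=$((offset + count - 1))
        if [ "$count" -eq 1 ]; then
            echo "prio $prio -> tc $tc -> queue $offset"
        else
            echo "prio $prio -> tc $tc -> queues $offset-$last"
        fi
        prio=$((prio + 1))
    done
}

# The 5 traffic class example from this page.
mqprio_table "0 0 0 1 1 1 1 2 2 3 3 4 4 4 4 4" "1@0 2@1 1@3 1@4 3@5"
```

For the arguments shown, priorities 11 to 15 are reported as belonging to
tc 4 with queues 5-7, which matches the diagram.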
.SH AUTHORS
John Fastabend, <john.r.fastabend@intel.com>