.TH PRIO 8 "16 December 2001" "iproute2" "Linux"
.SH NAME
PRIO \- Priority qdisc
.SH SYNOPSIS
.B tc qdisc ... dev
dev
.B  ( parent
classid
.B | root) [ handle
major:
.B ] prio [ bands
bands
.B ] [ priomap
band band band...
.B ] [ estimator
interval timeconstant
.B ]

.SH DESCRIPTION
The PRIO qdisc is a simple classful queueing discipline that contains
an arbitrary number of classes of differing priority. The classes are
dequeued in descending order of priority, that is, in ascending order of
band number. PRIO is a scheduler and never delays packets: it is a
work-conserving qdisc, though the qdiscs contained in the classes may not be.

It is very useful for lowering latency when there is no need to slow down
traffic.

.SH ALGORITHM
On creation with 'tc qdisc add', a fixed number of bands is created. Each
band is a class, although it is not possible to add classes with 'tc class
add'; the number of bands to be created must instead be specified on the
command line when attaching PRIO to its root.

When dequeueing, band 0 is tried first, and only if it did not deliver a
packet does PRIO try band 1, and so on. Minimum delay packets
should therefore go to band 0, best effort traffic to band 1 and bulk
traffic to band 2.

As the PRIO qdisc itself will have minor number 0, band 0 is actually
major:1, band 1 is major:2, etc. For major, substitute the major number
assigned to the qdisc on 'tc qdisc add' with the
.B handle
parameter.
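.P
The dequeue order and the class numbering described above can be sketched in
Python. This is an illustrative model only, not the kernel implementation:
the names classid and dequeue, and the use of plain deques as bands, are
assumptions made for this example.

.nf
```python
from collections import deque

def classid(major, band):
    # Band N is exposed as class major:(N+1); minor 0 is the qdisc itself.
    # tc prints handles in hexadecimal, hence the %x format.
    return "%x:%x" % (major, band + 1)

def dequeue(bands):
    # Try band 0 first; move on to the next band only when the
    # current one has no packet to deliver.
    for queue in bands:
        if queue:
            return queue.popleft()
    return None

# Three bands, as in the default configuration.
bands = [deque(["a1"]), deque(["b1"]), deque(["c1", "c2"])]

order = []
pkt = dequeue(bands)
while pkt is not None:
    order.append(pkt)
    pkt = dequeue(bands)

print(order)
print(classid(1, 0), classid(1, 1), classid(1, 2))
```
.fi

With three bands and handle 1:, the sketch empties band 0 before band 1 and
band 1 before band 2, and names the bands 1:1, 1:2 and 1:3.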

.SH CLASSIFICATION
Three methods are available to PRIO to determine in which band a packet will
be enqueued.
.TP
From userspace
A process with sufficient privileges can encode the destination class
directly with SO_PRIORITY, see
.BR socket (7).
.TP
With a tc filter
A tc filter attached to the root qdisc can point traffic directly to a class.
.TP
With the priomap
Based on the packet priority, which in turn is derived from the Type of
Service assigned to the packet.
.P
Only the priomap is specific to this qdisc.
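.P
The userspace method can be sketched in Python as follows. This is a minimal
example that assumes a Linux host. The helper names set_priority and
classid_priority are hypothetical, and the encoding of a class handle as
major in the upper 16 bits and minor in the lower 16 bits is an assumption
of this sketch. According to
.BR socket (7),
priorities outside the range 0 to 6 require CAP_NET_ADMIN, so the sketch
only sets the value 6 on the socket and merely computes the class-encoded
value.

.nf
```python
import socket

# Fall back to 12, the value of SO_PRIORITY in the generic Linux headers,
# if this Python build does not export the constant (assumption).
SO_PRIORITY = getattr(socket, "SO_PRIORITY", 12)

def classid_priority(major, minor):
    # Assumed encoding of a class handle major:minor as a 32-bit priority.
    return (major << 16) | minor

def set_priority(sock, priority):
    # Set the priority on the socket and read it back from the kernel.
    sock.setsockopt(socket.SOL_SOCKET, SO_PRIORITY, priority)
    return sock.getsockopt(socket.SOL_SOCKET, SO_PRIORITY)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Values 0 to 6 need no special privilege.
print(set_priority(sock, 6))
# Class 1:1 (band 0 of a PRIO qdisc with handle 1:) as a priority value.
print(hex(classid_priority(1, 1)))
sock.close()
```
.fi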
.SH QDISC PARAMETERS
.TP
bands
Number of bands. If changed from the default of 3,
.B priomap
must be updated as well.
.TP
priomap
The priomap determines how packet priorities, as assigned by the kernel, map
to bands. The priority can either be set directly from userspace, or be
derived from the Type of Service of the packet.

When derived from the Type of Service, the mapping is based on the TOS octet
of the packet, which looks like this:

.nf
  0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+---+
|           |               |   |
|PRECEDENCE |      TOS      |MBZ|
|           |               |   |
+---+---+---+---+---+---+---+---+
.fi

The four TOS bits (the 'TOS field') are defined as:

.nf
Binary  Decimal  Meaning
-----------------------------------------
1000    8        Minimize delay (md)
0100    4        Maximize throughput (mt)
0010    2        Maximize reliability (mr)
0001    1        Minimize monetary cost (mmc)
0000    0        Normal Service
.fi

As there is 1 bit to the right of these four bits, the actual value of the
TOS field is double the value of the TOS bits. Running tcpdump -v -v shows
you the value of the entire TOS field, not just the four bits. It is the
value you see in the first column of this table:

.nf
TOS     Bits  Means                    Linux Priority    Band
------------------------------------------------------------
0x0     0     Normal Service           0 Best Effort     1
0x2     1     Minimize Monetary Cost   0 Best Effort     1
0x4     2     Maximize Reliability     0 Best Effort     1
0x6     3     mmc+mr                   0 Best Effort     1
0x8     4     Maximize Throughput      2 Bulk            2
0xa     5     mmc+mt                   2 Bulk            2
0xc     6     mr+mt                    2 Bulk            2
0xe     7     mmc+mr+mt                2 Bulk            2
0x10    8     Minimize Delay           6 Interactive     0
0x12    9     mmc+md                   6 Interactive     0
0x14    10    mr+md                    6 Interactive     0
0x16    11    mmc+mr+md                6 Interactive     0
0x18    12    mt+md                    4 Int. Bulk       1
0x1a    13    mmc+mt+md                4 Int. Bulk       1
0x1c    14    mr+mt+md                 4 Int. Bulk       1
0x1e    15    mmc+mr+mt+md             4 Int. Bulk       1
.fi

The second column contains the value of the relevant four TOS bits, followed
by their translated meaning. For example, 15 stands for a packet wanting
Minimum Monetary Cost, Maximum Reliability, Maximum Throughput AND Minimum
Delay.

The fourth column lists the way the Linux kernel interprets the TOS bits, by
showing to which Priority they are mapped.

The last column shows the result of the default priomap. On the command line,
the default priomap looks like this:

    1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

This means that priority 4, for example, gets mapped to band number 1.
The priomap also allows you to list higher priorities (> 7) which do not
correspond to TOS mappings, but which are set by other means.

This table from RFC 1349 (read it for more details) explains how
applications might very well set their TOS bits:

.nf
TELNET                   1000           (minimize delay)
FTP
        Control          1000           (minimize delay)
        Data             0100           (maximize throughput)

TFTP                     1000           (minimize delay)

SMTP
        Command phase    1000           (minimize delay)
        DATA phase       0100           (maximize throughput)

Domain Name Service
        UDP Query        1000           (minimize delay)
        TCP Query        0000
        Zone Transfer    0100           (maximize throughput)

NNTP                     0001           (minimize monetary cost)

ICMP
        Errors           0000
        Requests         0000 (mostly)
        Responses        <same as request> (mostly)
.fi
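.IP
The lookup performed with the default priomap can be sketched in Python.
The two lists are copied from the table of TOS values and from the default
priomap shown earlier in this section. The function name band_for_tos is an
assumption made for this example, and the sketch ignores priorities that are
set by other means.

.nf
```python
# Linux priority for each value of the four TOS bits (0..15), copied
# from the fourth column of the TOS table.
TOS_BITS_TO_PRIORITY = [0, 0, 0, 0, 2, 2, 2, 2, 6, 6, 6, 6, 4, 4, 4, 4]

# Default priomap, indexed by Linux priority (0..15).
DEFAULT_PRIOMAP = [1, 2, 2, 2, 1, 2, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]

def band_for_tos(tos, priomap=DEFAULT_PRIOMAP):
    # tos is the TOS field as shown by tcpdump: the four TOS bits
    # shifted left by one because of the trailing MBZ bit.
    tos_bits = (tos >> 1) & 0xF
    priority = TOS_BITS_TO_PRIORITY[tos_bits]
    return priomap[priority]

for tos in (0x00, 0x08, 0x10, 0x18):
    print(hex(tos), band_for_tos(tos))
```
.fi

For these four TOS values the sketch yields bands 1, 2, 0 and 1, matching
the last column of the TOS table.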


.SH CLASSES
PRIO classes cannot be configured further - they are automatically created
when the PRIO qdisc is attached. Each class however can contain yet a
further qdisc.

.SH BUGS
Large amounts of traffic in the lower-numbered (higher-priority) bands can
cause starvation of the higher-numbered bands. This can be prevented by
attaching a shaper (for example,
.BR tc-tbf (8))
to these bands to make sure they cannot dominate the link.

.SH AUTHORS
Alexey N. Kuznetsov, <kuznet@ms2.inr.ac.ru>, J Hadi Salim
<hadi@cyberus.ca>. This manpage is maintained by bert hubert <ahu@ds9a.nl>