diff options
Diffstat (limited to '')
-rw-r--r-- | man/man8/tc-cbq.8 | 351 |
1 files changed, 351 insertions, 0 deletions
diff --git a/man/man8/tc-cbq.8 b/man/man8/tc-cbq.8 new file mode 100644 index 0000000..301265d --- /dev/null +++ b/man/man8/tc-cbq.8 @@ -0,0 +1,351 @@ +.TH CBQ 8 "16 December 2001" "iproute2" "Linux" +.SH NAME +CBQ \- Class Based Queueing +.SH SYNOPSIS +.B tc qdisc ... dev +dev +.B ( parent +classid +.B | root) [ handle +major: +.B ] cbq [ allot +bytes +.B ] avpkt +bytes +.B bandwidth +rate +.B [ cell +bytes +.B ] [ ewma +log +.B ] [ mpu +bytes +.B ] + +.B tc class ... dev +dev +.B parent +major:[minor] +.B [ classid +major:minor +.B ] cbq allot +bytes +.B [ bandwidth +rate +.B ] [ rate +rate +.B ] prio +priority +.B [ weight +weight +.B ] [ minburst +packets +.B ] [ maxburst +packets +.B ] [ ewma +log +.B ] [ cell +bytes +.B ] avpkt +bytes +.B [ mpu +bytes +.B ] [ bounded isolated ] [ split +handle +.B & defmap +defmap +.B ] [ estimator +interval timeconstant +.B ] + +.SH DESCRIPTION +Class Based Queueing is a classful qdisc that implements a rich +linksharing hierarchy of classes. It contains shaping elements as +well as prioritizing capabilities. Shaping is performed using link +idle time calculations based on the timing of dequeue events and +underlying link bandwidth. + +.SH SHAPING ALGORITHM +When shaping a 10mbit/s connection to 1mbit/s, the link will +be idle 90% of the time. If it isn't, it needs to be throttled so that it +IS idle 90% of the time. + +During operations, the effective idletime is measured using an +exponential weighted moving average (EWMA), which considers recent +packets to be exponentially more important than past ones. The Unix +loadaverage is calculated in the same way. + +The calculated idle time is subtracted from the EWMA measured one, +the resulting number is called 'avgidle'. A perfectly loaded link has +an avgidle of zero: packets arrive exactly at the calculated +interval. + +An overloaded link has a negative avgidle and if it gets too negative, +CBQ throttles and is then 'overlimit'. + +Conversely, an idle link might amass a huge avgidle, which would then +allow infinite bandwidths after a few hours of silence. To prevent +this, avgidle is capped at +.B maxidle. + +If overlimit, in theory, the CBQ could throttle itself for exactly the +amount of time that was calculated to pass between packets, and then +pass one packet, and throttle again. Due to timer resolution constraints, +this may not be feasible, see the +.B minburst +parameter below. + +.SH CLASSIFICATION +Within the one CBQ instance many classes may exist. Each of these classes +contains another qdisc, by default +.BR tc-pfifo (8). + +When enqueueing a packet, CBQ starts at the root and uses various methods to +determine which class should receive the data. + +In the absence of uncommon configuration options, the process is rather easy. +At each node we look for an instruction, and then go to the class the +instruction refers us to. If the class found is a barren leaf-node (without +children), we enqueue the packet there. If it is not yet a leaf node, we do +the whole thing over again starting from that node. + +The following actions are performed, in order at each node we visit, until one +sends us to another node, or terminates the process. +.TP +(i) +Consult filters attached to the class. If sent to a leafnode, we are done. +Otherwise, restart. +.TP +(ii) +Consult the defmap for the priority assigned to this packet, which depends +on the TOS bits. Check if the referral is leafless, otherwise restart. +.TP +(iii) +Ask the defmap for instructions for the 'best effort' priority. Check the +answer for leafness, otherwise restart. +.TP +(iv) +If none of the above returned with an instruction, enqueue at this node. +.P +This algorithm makes sure that a packet always ends up somewhere, even while +you are busy building your configuration. + +For more details, see +.BR tc-cbq-details(8). + +.SH LINK SHARING ALGORITHM +When dequeuing for sending to the network device, CBQ decides which of its +classes will be allowed to send. It does so with a Weighted Round Robin process +in which each class with packets gets a chance to send in turn. The WRR process +starts by asking the highest priority classes (lowest numerically - +highest semantically) for packets, and will continue to do so until they +have no more data to offer, in which case the process repeats for lower +priorities. + +Classes by default borrow bandwidth from their siblings. A class can be +prevented from doing so by declaring it 'bounded'. A class can also indicate +its unwillingness to lend out bandwidth by being 'isolated'. + +.SH QDISC +The root of a CBQ qdisc class tree has the following parameters: + +.TP +parent major:minor | root +This mandatory parameter determines the place of the CBQ instance, either at the +.B root +of an interface or within an existing class. +.TP +handle major: +Like all other qdiscs, the CBQ can be assigned a handle. Should consist only +of a major number, followed by a colon. Optional, but very useful if classes +will be generated within this qdisc. +.TP +allot bytes +This allotment is the 'chunkiness' of link sharing and is used for determining packet +transmission time tables. The qdisc allot differs slightly from the class allot discussed +below. Optional. Defaults to a reasonable value, related to avpkt. +.TP +avpkt bytes +The average size of a packet is needed for calculating maxidle, and is also used +for making sure 'allot' has a safe value. Mandatory. +.TP +bandwidth rate +To determine the idle time, CBQ must know the bandwidth of your underlying +physical interface, or parent qdisc. This is a vital parameter, more about it +later. Mandatory. +.TP +cell +The cell size determines he granularity of packet transmission time calculations. Has a sensible default. +.TP +mpu +A zero sized packet may still take time to transmit. This value is the lower +cap for packet transmission time calculations - packets smaller than this value +are still deemed to have this size. Defaults to zero. +.TP +ewma log +When CBQ needs to measure the average idle time, it does so using an +Exponentially Weighted Moving Average which smooths out measurements into +a moving average. The EWMA LOG determines how much smoothing occurs. Lower +values imply greater sensitivity. Must be between 0 and 31. Defaults +to 5. +.P +A CBQ qdisc does not shape out of its own accord. It only needs to know certain +parameters about the underlying link. Actual shaping is done in classes. + +.SH CLASSES +Classes have a host of parameters to configure their operation. + +.TP +parent major:minor +Place of this class within the hierarchy. If attached directly to a qdisc +and not to another class, minor can be omitted. Mandatory. +.TP +classid major:minor +Like qdiscs, classes can be named. The major number must be equal to the +major number of the qdisc to which it belongs. Optional, but needed if this +class is going to have children. +.TP +weight weight +When dequeuing to the interface, classes are tried for traffic in a +round-robin fashion. Classes with a higher configured qdisc will generally +have more traffic to offer during each round, so it makes sense to allow +it to dequeue more traffic. All weights under a class are normalized, so +only the ratios matter. Defaults to the configured rate, unless the priority +of this class is maximal, in which case it is set to 1. +.TP +allot bytes +Allot specifies how many bytes a qdisc can dequeue +during each round of the process. This parameter is weighted using the +renormalized class weight described above. Silently capped at a minimum of +3/2 avpkt. Mandatory. + +.TP +prio priority +In the round-robin process, classes with the lowest priority field are tried +for packets first. Mandatory. + +.TP +avpkt +See the QDISC section. + +.TP +rate rate +Maximum rate this class and all its children combined can send at. Mandatory. + +.TP +bandwidth rate +This is different from the bandwidth specified when creating a CBQ disc! Only +used to determine maxidle and offtime, which are only calculated when +specifying maxburst or minburst. Mandatory if specifying maxburst or minburst. + +.TP +maxburst +This number of packets is used to calculate maxidle so that when +avgidle is at maxidle, this number of average packets can be burst +before avgidle drops to 0. Set it higher to be more tolerant of +bursts. You can't set maxidle directly, only via this parameter. + +.TP +minburst +As mentioned before, CBQ needs to throttle in case of +overlimit. The ideal solution is to do so for exactly the calculated +idle time, and pass 1 packet. However, Unix kernels generally have a +hard time scheduling events shorter than 10ms, so it is better to +throttle for a longer period, and then pass minburst packets in one +go, and then sleep minburst times longer. + +The time to wait is called the offtime. Higher values of minburst lead +to more accurate shaping in the long term, but to bigger bursts at +millisecond timescales. Optional. + +.TP +minidle +If avgidle is below 0, we are overlimits and need to wait until +avgidle will be big enough to send one packet. To prevent a sudden +burst from shutting down the link for a prolonged period of time, +avgidle is reset to minidle if it gets too low. + +Minidle is specified in negative microseconds, so 10 means that +avgidle is capped at -10us. Optional. + +.TP +bounded +Signifies that this class will not borrow bandwidth from its siblings. +.TP +isolated +Means that this class will not borrow bandwidth to its siblings + +.TP +split major:minor & defmap bitmap[/bitmap] +If consulting filters attached to a class did not give a verdict, +CBQ can also classify based on the packet's priority. There are 16 +priorities available, numbered from 0 to 15. + +The defmap specifies which priorities this class wants to receive, +specified as a bitmap. The Least Significant Bit corresponds to priority +zero. The +.B split +parameter tells CBQ at which class the decision must be made, which should +be a (grand)parent of the class you are adding. + +As an example, 'tc class add ... classid 10:1 cbq .. split 10:0 defmap c0' +configures class 10:0 to send packets with priorities 6 and 7 to 10:1. + +The complimentary configuration would then +be: 'tc class add ... classid 10:2 cbq ... split 10:0 defmap 3f' +Which would send all packets 0, 1, 2, 3, 4 and 5 to 10:1. +.TP +estimator interval timeconstant +CBQ can measure how much bandwidth each class is using, which tc filters +can use to classify packets with. In order to determine the bandwidth +it uses a very simple estimator that measures once every +.B interval +microseconds how much traffic has passed. This again is a EWMA, for which +the time constant can be specified, also in microseconds. The +.B time constant +corresponds to the sluggishness of the measurement or, conversely, to the +sensitivity of the average to short bursts. Higher values mean less +sensitivity. + +.SH BUGS +The actual bandwidth of the underlying link may not be known, for example +in the case of PPoE or PPTP connections which in fact may send over a +pipe, instead of over a physical device. CBQ is quite resilient to major +errors in the configured bandwidth, probably a the cost of coarser shaping. + +Default kernels rely on coarse timing information for making decisions. These +may make shaping precise in the long term, but inaccurate on second long scales. + +See +.BR tc-cbq-details(8) +for hints on how to improve this. + +.SH SOURCES +.TP +o +Sally Floyd and Van Jacobson, "Link-sharing and Resource +Management Models for Packet Networks", +IEEE/ACM Transactions on Networking, Vol.3, No.4, 1995 + +.TP +o +Sally Floyd, "Notes on CBQ and Guaranteed Service", 1995 + +.TP +o +Sally Floyd, "Notes on Class-Based Queueing: Setting +Parameters", 1996 + +.TP +o +Sally Floyd and Michael Speer, "Experimental Results +for Class-Based Queueing", 1998, not published. + + + +.SH SEE ALSO +.BR tc (8) + +.SH AUTHOR +Alexey N. Kuznetsov, <kuznet@ms2.inr.ac.ru>. This manpage maintained by +bert hubert <ahu@ds9a.nl> |