1 files changed, 125 insertions, 32 deletions
diff --git a/man/iperf.1 b/man/iperf.1
index 8ca1343..b867354 100644
--- a/man/iperf.1
+++ b/man/iperf.1
@@ -1,4 +1,4 @@
-.TH IPERF 1 "February 2023" NLANR/DAST "User Manuals"
+.TH IPERF 1 "March 2024" NLANR/DAST "User Manuals"
 .SH NAME
 iperf \- perform network traffic tests using network sockets. Metrics include throughput and latency or link capacity and responsiveness.
 .SH SYNOPSIS
@@ -30,7 +30,7 @@ computers but need not be.
 .TP
 .BR -b ", " --bandwidth " "
 set the target bandwidth and optional standard deviation per
-\fI<mean>\fR,\fI[<stdev>]\fR (See NOTES for suffixes)
+\fI<mean>\fR,\fI[<stdev>]\fR (See NOTES for suffixes). Setting the target bitrate on the client to 0 disables bitrate limits (particularly useful for UDP tests). On the server, this option limits the read rate.
 .TP
 .BR -e ", " --enhanced " "
 Display enhanced output in reports otherwise use legacy report (ver
@@ -65,7 +65,7 @@ Override the default shared memory size between the traffic thread(s) and report
 output the report or error message to this specified file
 .TP
 .BR "    --permit-key [=" \fI<value>\fR "]"
-Set a key value that must match for the server to accept traffic on a connection. If the option is given without a value on the server a key value will be autogenerated and displayed in its initial settings report. The lifetime of the key is set using --permit-key-timeout and defaults to twenty seconds. The value is required on clients. The value will also be used as part of the transfer id in reports. The option set on the client but not the server will also cause the server to reject the client's traffic. TCP only, no UDP support.
+Set a key value that must match for the server to accept traffic on a connection. If the option is given without a value on the server a key value will be autogenerated and displayed in its initial settings report. The lifetime of the key is set using --permit-key-timeout and defaults to twenty seconds. On clients the value must be given using '=', e.g. --permit-key=password (the '=' is needed even though the value is a required command line option). The server will autogenerate a value if '=password' is not given. The value will also be used as part of the transfer id in reports. The option set on the client but not the server will also cause the server to reject the client's traffic. TCP only, no UDP support.
 .TP
 .BR -p ", " --port " \fIm\fR[-\fIn\fR]"
 set client or server port(s) to send or listen on per \fIm\fR (default 5001) w/optional port range per m-n (e.g. -p 6002-6008) (see NOTES)
@@ -76,6 +76,9 @@ sum traffic threads based upon the destination IP address (default is source ip
 .BR "    --sum-only "
 set the output to sum reports only. Useful for -P at large values
 .TP
+.BR "    --tcp-tx-delay " \fIn\fR,\fI[<prob>]\fR
+Set TCP_TX_DELAY on the socket. Delay units are milliseconds and the probability must satisfy 0 <= prob <= 1. Both values may be floats. See NOTES for qdisc requirements.
+.TP
 .BR -t ", " --time " \fIn\fR"
 time in seconds to listen for new traffic connections, receive traffic or send traffic
 .TP
@@ -112,8 +115,14 @@ exclude C(connection) D(data) M(multicast) S(settings) V(server) reports
 .BR -y ", " --reportstyle " C|c"
 if set to C or c report results as CSV (comma separated values)
 .TP
+.BR "    --tcp-cca "
+Set the congestion control algorithm to be used for TCP connections. See SPECIFIC OPTIONS for more.
+.TP
+.BR "    --working-load-cca "
+Set the congestion control algorithm to be used for TCP working loads. See SPECIFIC OPTIONS for more.
+.TP
 .BR -Z ", " --tcp-congestion " "
-Set the default congestion-control algorithm to be used for new connections. Platforms must support setsockopt's TCP_CONGESTION. (Notes: See sysctl and tcp_allowed_congestion_control for available options. May require root privileges.)
+Set the default congestion control algorithm to be used for new connections. Platforms must support setsockopt's TCP_CONGESTION. (Notes: See sysctl and tcp_allowed_congestion_control for available options. May require root privileges.)
 .SH "SERVER SPECIFIC OPTIONS"
 .TP
 .BR -1 ", " --singleclient " "
@@ -143,6 +152,9 @@ Set the receive interface to the TAP device as specified.
 .BR "    --tcp-rx-window-clamp "  \fIn\fR[kmKM]
 Set the socket option of TCP_WINDOW_CLAMP, units is bytes.
 .TP
+.BR "    --test-exchange-timeout " \fI<value>\fR
+Set the maximum wait time for a test exchange in seconds. Defaults to 60 seconds if not set. A value of zero will disable the timeout.
+.TP
 .BR -t ", " --time " \fIn\fR"
 time in seconds to listen for new traffic connections and/or receive traffic (defaults to infinite)
 .TP
@@ -170,6 +182,15 @@ run in single threaded UDP mode
 .TP
 .BR -V ", " --ipv6_domain " "
 Enable IPv6 reception by setting the domain and socket to AF_INET6 (Can receive on both IPv4 and IPv6)
+.TP
+.BR "    --tcp-cca "
+Set the congestion control algorithm to be used for TCP connections - will override any client side settings (same as --tcp-congestion)
+.TP
+.BR "    --working-load "
+Enable support for TCP working loads on UDP traffic streams
+.TP
+.BR "    --working-load-cca "
+Set the congestion control algorithm to be used for TCP working loads - will override any client side settings
 .SH "CLIENT SPECIFIC OPTIONS"
 .TP
 .BR -b ", " --bandwidth " \fIn\fR[kmgKMG][,\fIn\fR[kmgKMG]] | \fIn\fR\fR[kmgKMG]pps"
@@ -177,16 +198,22 @@ set target bandwidth to \fIn\fR bits/sec (default 1 Mbit/sec) or
 \fIn\fR packets per sec. This may be used with TCP or UDP. Optionally, for variable loads, use format of  mean,standard deviation
 .TP
 .BR "    --bounceback[=" \fIn\fR "]"
-run a TCP bounceback or rps test with optional number writes in a burst per value of n. The default is ten writes every period and the default period is one second (Note: set size with -l or --len which defaults to 100 bytes)
+run a TCP bounceback or rps test with an optional number of writes in a burst per value of n. The default is ten writes every period and the default period is one second (Note: set size with --bounceback-request). See NOTES on clock unsynchronized detections.
 .TP
 .BR "    --bounceback-hold " \fIn\fR
 request the server to insert a delay of n milliseconds between its read and write (default is no delay)
 .TP
+.BR "    --bounceback-no-quickack "
+request the server not set the TCP_QUICKACK socket option (disabling TCP ACK delays) during a bounceback test (see NOTES)
+.TP
 .BR "    --bounceback-period[=" \fIn\fR "]"
 request the client schedule its send(s) every n seconds (default is one second, use zero value for immediate or continuous back to back)
 .TP
-.BR "    --bounceback-no-quickack "
-request the server not set the TCP_QUICKACK socket option (disabling TCP ACK delays) during a bounceback test (see NOTES)
+.BR "    --bounceback-request " \fIn\fR
+set the bounceback request size in bytes. Default value is 100 bytes.
+.TP
+.BR "    --bounceback-reply " \fIn\fR
+set the bounceback reply size in bytes. This supports asymmetric message sizes between the request and the reply. Default value is zero, which uses the value of --bounceback-request.
 .TP
 .BR "    --bounceback-txdelay " \fIn\fR
 request the client to delay n seconds between the start of the working load and the bounceback traffic (default is no delay)
@@ -196,28 +223,45 @@ Set the burst period in seconds. Defaults to one second. (Note: assumed use case
 .TP
 .BR "    --burst-size " \fIn\fR
 Set the burst size in bytes. Defaults to 1M if no value is given.
+.TP
 .BR -c ", " --client " \fI\fIhost\fR | \fIhost\fR%\fIdevice\fR"
 run in client mode, connecting to \fIhost\fR  where the optional %dev will SO_BINDTODEVICE that output interface (requires root and see NOTES)
 .TP
-.TP
 .BR "    --connect-only[=" \fIn\fR "]"
 only perform a TCP connect (or 3WHS) without any data transfer, useful to measure TCP connect() times. Optional value of n is the total number of connects to do (zero is run forever.) Note that -i will rate limit the connects where -P will create bursts and -t will end the client and hence end its connect attempts.
 .TP
-.BR "    --connect-retries[= " \fIn\fR "]"
-number of times to retry a TCP connect at the application level.  See operating system information on the details of TCP connect related settings.
+.BR "    --connect-retry-time " \fIn\fR
+time value in seconds for application level retries of TCP connect(s). See --connect-retry-timer for the retry time interval. See operating system information for the details of system or kernel TCP connect related settings. This is an application level retry of the connect() call and not the system level connect.
+.TP
+.TP
+.BR "    --connect-retry-timer " \fIn\fR
+The minimum time value in seconds to wait before retrying the connect. Note: This is a minimum time to wait between retries and can be longer depending upon the system connect time taken. See operating system information for the details of system or kernel TCP connect related settings.
+.TP
+.BR "    --dscp"
+set the DSCP field (masking ECN bits) in the TOS byte (used by IP_TOS & setsockopt)
+.TP
 .TP
 .BR -d ", " --dualtest " "
 Do a bidirectional test simultaneous test using two unidirectional sockets
 .TP
-.BR "    --fq-rate n[kmgKMG]"
+.BR "    --fq-rate \fIn\fR[kmgKMG]"
 Set a rate to be used with fair-queuing based socket-level pacing, in bytes or bits per second. Only available on platforms supporting the SO_MAX_PACING_RATE socket option. (Note: Here the suffixes indicate bytes/sec or bits/sec per use of uppercase or lowercase, respectively)
 .TP
+.BR "    --fq-rate-step \fIn\fR[kmgKMG]"
+Set the rate step to be used with fair-queuing based socket-level pacing, in bytes or bits per second. A step occurs every fq-rate-step-interval (defaults to one second).
+.TP
+.BR "    --fq-rate-step-interval \fIn\fR"
+Time in seconds before stepping the fq-rate
+.TP
 .BR "    --full-duplex"
 run a full duplex test, i.e. traffic in both transmit and receive directions using the \fBsame socket\fR
 .TP
 .BR "    --histograms[="\fIbinwidth\fR[u],\fIbincount\fR,[\fIlowerci\fR],[\fIupperci\fR] "]"
 enable select()/write() histograms with --tcp-write-times or --bounceback (these options are mutually exclusive.) The binning can be modified. Bin widths (default 100 microseconds, append u for microseconds, m for milliseconds) bincount is total bins (default 10000), ci is confidence interval between 0-100% (default lower 5%, upper 95%, 3 stdev 99.7%)
 .TP
+.BR "    --ignore-shutdown"
+don't wait on the TCP shutdown or close (FIN & FIN-ACK); rather, use the final write as the ending event
+.TP
 .BR "    --incr-dstip"
 increment the destination ip address when using the parallel (-P) or port range option
 .TP
@@ -258,10 +302,16 @@ number of bytes to transmit (instead of -t)
 .BR "    --permit-key [=" \fI<value>\fR "]"
 Set a key value that must match the server's value (also set with --permit-key) in order for the server to accept traffic from the client. TCP only, no UDP support.
 .TP
+.BR "    --sync-transfer-id"
+Pass the client's transfer id(s) to the server so both will use the same id in their respective outputs
+.TP
 .BR -r ", " --tradeoff " "
 Do a bidirectional test individually - client-to-server, followed by
 a reversed test, server-to-client
 .TP
+.BR "    --tcp-cca "
+Set the congestion control algorithm to be used for TCP connections & exchange with the server (same as --tcp-congestion)
+.TP
 .BR "    --tcp-quickack "
 Set TCP_QUICKACK on the socket
 .TP
@@ -310,6 +360,9 @@ time-to-live, for multicast (default 1)
 .BR "    --working-load[="\fBup|down|bidir][\fR,\fIn\fR\fB]\fR
 request a concurrent working load, currently TCP stream(s), defaults to full duplex (or bidir) unless the \fBup\fR or \fBdown\fR option is provided. The number of TCP streams defaults to 1 and can be changed via the n value, e.g. \fB--working-load=down,4\fR will use four TCP streams from server to the client as the working load. The IP ToS will be BE (0x0) for working load traffic.
 .TP
+.BR "    --working-load-cca "
+Set the congestion control algorithm to be used for TCP working loads & exchange with the server
+.TP
 .BR -V ", " --ipv6_domain " "
 Set the domain to IPv6 (send packets over IPv6)
 .TP
@@ -321,44 +374,50 @@ set TCP congestion control algorithm (Linux only)
 .SH EXAMPLES
 
 .B TCP tests (client)
-
 .B iperf -c <host> -e -i 1
 .br
 ------------------------------------------------------------
 .br
-Client connecting to <host>, TCP port 5001 with pid 5149
+Client connecting to 192.168.1.35, TCP port 5001 with pid 256370 (1/0 flows/load)
+.br
+Write buffer size: 131072 Byte
 .br
-Write buffer size:  128 KByte
+TCP congestion control using cubic
 .br
-TCP window size:  340 KByte (default)
+TOS set to 0x0 (dscp=0,ecn=0) (Nagle on)
+.br
+TCP window size:  100 MByte (default)
 .br
 ------------------------------------------------------------
 .br
-[  3] local 45.56.85.133 port 49960 connected with 45.33.58.123 port 5001 (ct=3.23 ms)
+[  1] local 192.168.1.103%enp4s0 port 41024 connected with 192.168.1.35 port 5001 (sock=3) (icwnd/mss/irtt=14/1448/158) (ct=0.21 ms) on 2024-03-26 10:48:47.867 (PDT)
 .br
-[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry     Cwnd/RTT        NetPwr
+[ ID] Interval        Transfer    Bandwidth       Write/Err  Rtry     InF(pkts)/Cwnd(pkts)/RTT(var)        NetPwr
+.br
+[  1] 0.00-1.00 sec   201 MBytes  1.68 Gbits/sec  1605/0        73     1531K(1083)/1566K(1108)/13336(112) us  15775
 .br
-[  3] 0.00-1.00 sec   126 MBytes  1.05 Gbits/sec  1006/0          0       56K/626 us  210636.47
+[  1] 1.00-2.00 sec   101 MBytes   846 Mbits/sec  807/0         0     1670K(1181)/1689K(1195)/14429(83) us  7331
 .br
-[  3] 1.00-2.00 sec   138 MBytes  1.15 Gbits/sec  1100/0        299      483K/3884 us  37121.32
+[  1] 2.00-3.00 sec   101 MBytes   847 Mbits/sec  808/0         0     1790K(1266)/1790K(1266)/15325(97) us  6911
 .br
-[  3] 2.00-3.00 sec   137 MBytes  1.15 Gbits/sec  1093/0         24      657K/5087 us  28162.31
+[  1] 3.00-4.00 sec   134 MBytes  1.13 Gbits/sec  1075/0         0     1858K(1314)/1892K(1338)/16188(99) us  8704
 .br
-[  3] 3.00-4.00 sec   126 MBytes  1.06 Gbits/sec  1010/0        284      294K/2528 us  52366.58
+[  1] 4.00-5.00 sec   101 MBytes   846 Mbits/sec  807/0         1     1350K(955)/1370K(969)/11620(98) us  9103
 .br
-[  3] 4.00-5.00 sec   117 MBytes   980 Mbits/sec  935/0        373      487K/2025 us  60519.66
+[  1] 5.00-6.00 sec   121 MBytes  1.01 Gbits/sec  966/0         0     1422K(1006)/1453K(1028)/12405(118) us  10207
 .br
-[  3] 5.00-6.00 sec   144 MBytes  1.20 Gbits/sec  1149/0          2      644K/3570 us  42185.36
+[  1] 6.00-7.00 sec   115 MBytes   962 Mbits/sec  917/0         0     1534K(1085)/1537K(1087)/13135(105) us  9151
 .br
-[  3] 6.00-7.00 sec   126 MBytes  1.06 Gbits/sec  1011/0        112      582K/5281 us  25092.56
+[  1] 7.00-8.00 sec   101 MBytes   844 Mbits/sec  805/0         0     1532K(1084)/1580K(1118)/13582(136) us  7769
 .br
-[  3] 7.00-8.00 sec   110 MBytes   922 Mbits/sec  879/0         56      279K/1957 us  58871.89
+[  1] 8.00-9.00 sec   134 MBytes  1.13 Gbits/sec  1076/0         0     1603K(1134)/1619K(1145)/13858(105) us  10177
 .br
-[  3] 8.00-9.00 sec   127 MBytes  1.06 Gbits/sec  1014/0         46      483K/3372 us  39414.89
+[  1] 9.00-10.00 sec   101 MBytes   846 Mbits/sec  807/0         0     1602K(1133)/1650K(1167)/14113(105) us  7495
 .br
-[  3] 9.00-10.00 sec   132 MBytes  1.11 Gbits/sec  1054/0          0      654K/3380 us  40872.75
+[  1] 10.00-10.78 sec   128 KBytes  1.34 Mbits/sec  1/0         0        0K(0)/1681K(1189)/14424(111) us  11.64
+.br
+[  1] 0.00-10.78 sec  1.18 GBytes   941 Mbits/sec  9674/0        74        0K(0)/1681K(1189)/14424(111) us  8154
 .br
-[  3] 0.00-10.00 sec  1.25 GBytes  1.07 Gbits/sec  10251/0       1196       -1K/3170 us  42382.03
 
 .TP
 .B where (per -e,)
@@ -371,15 +430,15 @@ Total number of successful socket writes. Total number of non-fatal socket write
 .B Rtry
 Total number of TCP retries
 .br
-.B Cwnd/RTT (*nix only)
-TCP congestion window and round trip time (sampled where NA indicates no value)
+.B InF(pkts)/Cwnd(pkts)/RTT(var) (*nix only)
+TCP bytes and packets in flight, congestion window and round trip time (sampled where NA indicates no value). Inflight is in units of Kbytes and packets where packets_in_flight = (tcp_info_buf.tcpi_unacked - tcp_info_buf.tcpi_sacked - tcp_info_buf.tcpi_lost + tcp_info_buf.tcpi_retrans). RTT(var) is RTT variance.
 .br
 .B NetPwr (*nix only)
 Network power defined as (throughput / RTT)
 
 .PP
-
-.B iperf -c host.doamin.com -i 1 --bounceback --permit-key=mytest --hide-ips
+.TP
+.B iperf -c host.domain.com -i 1 --bounceback --permit-key=mytest --hide-ips
 .br
 ------------------------------------------------------------
 .br
@@ -830,6 +889,7 @@ The --trip-times option indicates that the client's and server's clocks are sync
 Network Time Protocol (NTP) or Precision Time Protocol (PTP) are commonly used for
 this. The reference clock(s) error and the synchronization protocols will affect
 the accuracy of any end to end latency measurements.
+.B See bounceback NOTES section on clock unsynchronized detections
 .P
 .B Histograms and non-parametric statistics:
 The --histograms option provides the raw data where nothing is averaged. This is useful for non-parametric
@@ -938,6 +998,18 @@ is set to a non-zero value. The socket option is applied after every read() on t
 and before the hold delay call. It's also applied on the client. Use --bounceback-no-quickack
 to have TCP run in default mode per the socket (which is most likely TCP_QUICKACK being off.)
 .P
+.B Unsynchronized clock detections with --bounceback and --trip-times (as of March 19, 2023):
+Iperf 2 can detect when the clocks have synchronization errors larger than the bounceback RTT. This is done via the client's send timestamp (clock A),
+the server's receive timestamp (clock B) and the client's final receive timestamp (clock A). The check, done on each bounceback, is
+write(A) < read(B) < read(A). This is supported in bounceback tests with
+a slight adjustment: clock write(A) < clock read(B) < clock read(A) - (clock write(B) - clock read(B)). All the
+timestamps are sampled at the initial write or read (not at its completion).
+Error output looks as shown below and there is no output for a zero value.
+
+.br
+[  1] 0.00-10.00 sec  Clock sync error count = 100
+.br
+.P
 .B TCP Connect times:
 The TCP connect time (or three way handshake) can be seen on the iperf
 client when the -e (--enhanced) option is set. Look for the
@@ -978,6 +1050,27 @@ Iperf 2 supports multicast with a couple of caveats. First, multicast streams ca
 .B TCP_QUICKACK:
 The TCP_QUICKACK socket option will be applied after every read() on the server such that TCP acks are sent immediately, rather than possibly delayed.
 .P
+.B TCP_TX_DELAY (--tcp-tx-delay):
+Iperf 2 flows can set different delays, simulating real world conditions. Units are microseconds.
+This \fBrequires the FQ packet scheduler\fR or an EDT-enabled NIC.
+Note that the FQ packet scheduler limits might need some tweaking:
+  man tc-fq
+    PARAMETERS
+    limit
+        Hard  limit  on  the  real  queue  size. When this limit is
+        reached, new packets are dropped. If the value is  lowered,
+        packets  are  dropped so that the new limit is met. Default
+        is 10000 packets.
+
+     flow_limit
+        Hard limit on the maximum  number  of  packets  queued  per
+        flow.  Default value is 100.
+
+Use of the TCP_TX_DELAY option will increase the number of skbs in the FQ qdisc,
+so packets will be dropped if either of the limits above is hit.
+Using big delays might very well trigger
+old bugs in TSO auto defer logic and/or sndbuf limited detection.
+.P
 .B Fast Sampling:
 Use
 .B ./configure --enable-fastsampling