author     Daniel Baumann <daniel.baumann@progress-linux.org>    2024-06-03 05:11:10 +0000
committer  Daniel Baumann <daniel.baumann@progress-linux.org>    2024-06-03 05:11:10 +0000
commit     cff6d757e3ba609c08ef2aaa00f07e53551e5bf6 (patch)
tree       08c4fc3255483ad397d712edb4214ded49149fd9 /doc
parent     Adding upstream version 2.9.7. (diff)
download   haproxy-cff6d757e3ba609c08ef2aaa00f07e53551e5bf6.tar.xz
           haproxy-cff6d757e3ba609c08ef2aaa00f07e53551e5bf6.zip

Adding upstream version 3.0.0. (tag: upstream/3.0.0)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc')
-rw-r--r--  doc/DeviceAtlas-device-detection.txt    32
-rw-r--r--  doc/configuration.txt                 1965
-rw-r--r--  doc/design-thoughts/ring-v2.txt        312
-rw-r--r--  doc/internals/api/buffer-api.txt        12
-rw-r--r--  doc/intro.txt                            2
-rw-r--r--  doc/lua-api/index.rst                   66
-rw-r--r--  doc/management.txt                     253
-rw-r--r--  doc/peers-v2.0.txt                       2
8 files changed, 2083 insertions, 561 deletions
diff --git a/doc/DeviceAtlas-device-detection.txt b/doc/DeviceAtlas-device-detection.txt index b600918..9df9783 100644 --- a/doc/DeviceAtlas-device-detection.txt +++ b/doc/DeviceAtlas-device-detection.txt @@ -3,15 +3,20 @@ DeviceAtlas Device Detection In order to add DeviceAtlas Device Detection support, you would need to download the API source code from https://deviceatlas.com/deviceatlas-haproxy-module. -The build supports the USE_PCRE and USE_PCRE2 options. Once extracted : +Once extracted : - $ make TARGET=<target> USE_PCRE=1 (or USE_PCRE2=1) USE_DEVICEATLAS=1 DEVICEATLAS_SRC=<path to the API root folder> + $ make TARGET=<target> USE_DEVICEATLAS=1 DEVICEATLAS_SRC=<path to the API root folder> Optionally DEVICEATLAS_INC and DEVICEATLAS_LIB may be set to override the path to the include files and libraries respectively if they're not in the source -directory. However, if the API had been installed beforehand, DEVICEATLAS_SRC -can be omitted. Note that the DeviceAtlas C API version supported is the 2.4.0 -at minimum. +directory. Also, in the case the api cache support is not needed and/or a C++ toolchain + could not be used, DEVICEATLAS_NOCACHE is available. + + $ make TARGET=<target> USE_DEVICEATLAS=1 DEVICEATLAS_SRC=<path to the API root folder> DEVICEATLAS_NOCACHE=1 + +However, if the API had been installed beforehand, DEVICEATLAS_SRC +can be omitted. Note that the DeviceAtlas C API version supported is from the 3.x +releases series (3.2.1 minimum recommended). For HAProxy developers who need to verify that their changes didn't accidentally break the DeviceAtlas code, it is possible to build a dummy library provided in @@ -20,7 +25,7 @@ full library. This will not provide the full functionalities, it will just allow haproxy to start with a deviceatlas configuration, which generally is enough to validate API changes : - $ make TARGET=<target> USE_PCRE=1 USE_DEVICEATLAS=1 DEVICEATLAS_SRC=$PWD/addons/deviceatlas/dummy + $ make TARGET=<target> USE_DEVICEATLAS=1 DEVICEATLAS_SRC=$PWD/addons/deviceatlas/dummy These are supported DeviceAtlas directives (see doc/configuration.txt) : - deviceatlas-json-file <path to the DeviceAtlas JSON data file>. @@ -28,6 +33,7 @@ These are supported DeviceAtlas directives (see doc/configuration.txt) : the API, 0 by default). - deviceatlas-property-separator <character> (character used to separate the properties produced by the API, | by default). + - deviceatlas-cache-size <number> (number of cache entries, 0 by default). Sample configuration : @@ -64,18 +70,8 @@ Single HTTP header acl device_type_tablet req.fhdr(User-Agent),da-csv-conv(primaryHardwareType) "Tablet" -Optionally a JSON download scheduler is provided to allow a data file being -fetched automatically in a daily basis without restarting HAProxy : - - $ cd addons/deviceatlas && make [DEVICEATLAS_SRC=<path to the API root folder>] - -Similarly, if the DeviceAtlas API is installed, DEVICEATLAS_SRC can be omitted. - - $ ./dadwsch -u JSON data file URL e.g. "https://deviceatlas.com/getJSON?licencekey=<your licence key>&format=zip&data=my&index=web" \ - [-p download directory path /tmp by default] \ - [-d scheduled hour of download, hour when the service is launched by default] - -Noted it needs to be started before HAProxy. +Note that the JSON download scheduler is now part of the API's package, it is recommended +to read its documentation. Note it needs to be started before HAProxy. 
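A minimal configuration sketch tying the directives above together (the JSON file path, the cache size and the backend names are only illustrative):

    global
        deviceatlas-json-file /usr/local/share/deviceatlas/deviceatlas.json
        deviceatlas-cache-size 10000

    frontend www
        bind :80
        acl is_tablet req.fhdr(User-Agent),da-csv-conv(primaryHardwareType) "Tablet"
        use_backend tablet_site if is_tablet
        default_backend desktop_site
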
Please find more information about DeviceAtlas and the detection methods at diff --git a/doc/configuration.txt b/doc/configuration.txt index e1c5034..6a02988 100644 --- a/doc/configuration.txt +++ b/doc/configuration.txt @@ -2,8 +2,8 @@ HAProxy Configuration Manual ---------------------- - version 2.9 - 2024/04/05 + version 3.0 + 2024/05/29 This document covers the configuration language as implemented in the version @@ -44,7 +44,8 @@ Summary 2.4. Conditional blocks 2.5. Time format 2.6. Size format -2.7. Examples +2.7. Name format for maps and ACLs +2.8. Examples 3. Global parameters 3.1. Process management and security @@ -58,6 +59,8 @@ Summary 3.9. Rings 3.10. Log forwarding 3.11. HTTPClient tuning +3.12. Certificate Storage +3.12.1. Load options 4. Proxies 4.1. Proxy keywords matrix @@ -240,7 +243,7 @@ sometimes more) streams in parallel over a same connection, and let the server sort them out and respond in any order depending on what response is available. The main benefit of the multiplexed mode is that it significantly reduces the number of round trips, and speeds up page loading time over high latency -networks. It is sometimes visibles on sites using many images, where all images +networks. It is sometimes visible on sites using many images, where all images appear to load in parallel. These protocols have also improved their efficiency by adopting some mechanisms @@ -259,7 +262,8 @@ is called "head of line blocking" or "HoL blocking" or sometimes just "HoL". HTTP/3 is implemented over QUIC, itself implemented over UDP. QUIC solves the head of line blocking at the transport level by means of independently handled streams. Indeed, when experiencing loss, an impacted stream does not affect the -other streams, and all of them can be accessed in parallel. +other streams, and all of them can be accessed in parallel. QUIC also provides +connection migration support but currently haproxy does not support it. By default HAProxy operates in keep-alive mode with regards to persistent connections: for each connection it processes each request and response, and @@ -282,7 +286,7 @@ HAProxy essentially supports 3 connection modes : In addition to this, by default, the server-facing connection is reusable by any request from any client, as mandated by the HTTP protocol specification, so any information pertaining to a specific client has to be passed along with -each request if needed (e.g. client's source adress etc). When HTTP/2 is used +each request if needed (e.g. client's source address etc). When HTTP/2 is used with a server, by default HAProxy will dedicate this connection to the same client to avoid the risk of head of line blocking between clients. @@ -1148,7 +1152,43 @@ for every keyword. Supported units are case insensitive : Both time and size formats require integers, decimal notation is not allowed. -2.7. Examples +2.7. Name format for maps and ACLs +------------------------------------- + +It is possible to use a list of pattern for maps or ACLs. A list of pattern is +identified by its name and may be used at different places in the +configuration. List of pattern are split on three categories depending on +the name format: + + * Lists of pattern based on regular files: It is the default case. The + filename, absolute or relative, is used as name. The file must exist + otherwise an error is triggered. But it may be empty. The "file@" prefix + may also be specified but it is not part of the name identifying the + list. 
A filename, with or without the prefix, references the same list of + pattern. + + * Lists of pattern based on optional files: The filename must be preceded by + "opt@" prefix. The file existence is optional. If the file exists, its + content is loaded but no error is reported if not. The prefix is not part + of the name identifying the list. It means, for a given filename, Optional + files and regular files reference the same list of pattern. + + * Lists of pattern based on virtual files: The name is just an identified. It + is not a reference to any file. "virt@" prefix must be used. It is part of + the name. Thus it cannot be mixed with other kind of lists. + +Virtual files are useful when patterns are fully dynamically managed with no +patterns on startup and on reload. Optional files may be used under the same +conditions. But patterns can be dumped in the file, via an external script based +on the "show map" CLI command for instance. This way, it is possible to keep +patterns on reload. + +Note: Even if it is unlikely, it means no regular file starting with "file@", + "opt@" or "virt@" can be loaded, except by adding "./" explicitly in + front of the filename (for instance "file@./virt@map"). + + +2.8. Examples ------------- # Simple configuration for an HTTP proxy listening on port 80 on all @@ -1225,6 +1265,7 @@ The following keywords are supported in the "global" section : - deviceatlas-log-level - deviceatlas-properties-cookie - deviceatlas-separator + - expose-deprecated-directives - expose-experimental-directives - external-check - fd-hard-limit @@ -1236,9 +1277,12 @@ The following keywords are supported in the "global" section : - h1-case-adjust-file - h2-workaround-bogus-websocket-clients - hard-stop-after + - harden.reject-privileged-ports.tcp + - harden.reject-privileged-ports.quic - insecure-fork-wanted - insecure-setuid-wanted - issuers-chain-path + - key-base - localpeer - log - log-send-hostname @@ -1250,6 +1294,11 @@ The following keywords are supported in the "global" section : - nbthread - node - numa-cpu-mapping + - ocsp-update.disable + - ocsp-update.maxdelay + - ocsp-update.mindelay + - ocsp-update.httpproxy + - ocsp-update.mode - pidfile - pp2-never-send-local - presetenv @@ -1274,9 +1323,11 @@ The following keywords are supported in the "global" section : - ssl-propquery - ssl-provider - ssl-provider-path + - ssl-security-level - ssl-server-verify - ssl-skip-self-issued-ca - stats + - stats-file - strict-limits - uid - ulimit-n @@ -1314,6 +1365,7 @@ The following keywords are supported in the "global" section : - spread-checks - ssl-engine - ssl-mode-async + - tune.applet.zero-copy-forwarding - tune.buffers.limit - tune.buffers.reserve - tune.bufsize @@ -1360,7 +1412,9 @@ The following keywords are supported in the "global" section : - tune.pool-high-fd-ratio - tune.pool-low-fd-ratio - tune.pt.zero-copy-forwarding + - tune.quic.cc-hystart - tune.quic.frontend.conn-tx-buffers.limit + - tune.quic.frontend.glitches-threshold - tune.quic.frontend.max-idle-timeout - tune.quic.frontend.max-streams-bidi - tune.quic.max-frame-loss @@ -1373,6 +1427,7 @@ The following keywords are supported in the "global" section : - tune.rcvbuf.frontend - tune.rcvbuf.server - tune.recv_enough + - tune.ring.queues - tune.runqueue-depth - tune.sched.low-latency - tune.sndbuf.backend @@ -1390,8 +1445,8 @@ The following keywords are supported in the "global" section : - tune.ssl.lifetime - tune.ssl.maxrecord - tune.ssl.ssl-ctx-cache-size - - tune.ssl.ocsp-update.maxdelay - - 
tune.ssl.ocsp-update.mindelay + - tune.ssl.ocsp-update.maxdelay (deprecated) + - tune.ssl.ocsp-update.mindelay (deprecated) - tune.vars.global-max-size - tune.vars.proc-max-size - tune.vars.reqres-max-size @@ -1699,6 +1754,12 @@ deviceatlas-separator <char> Sets the character separator for the API properties results. This directive is optional and set to | by default if not set. +expose-deprecated-directives + This statement must appear before using some directives tagged as deprecated + to silent warnings and make sure the config file will not be rejected. Not + all deprecated directives are concerned, only those without any alternative + solution. + expose-experimental-directives This statement must appear before using directives tagged as experimental or the config file will be rejected. @@ -1886,6 +1947,48 @@ hard-stop-after <time> See also: grace +harden.reject-privileged-ports.tcp { on | off } +harden.reject-privileged-ports.quic { on | off } + Toggle per protocol protection which forbid communication with clients which + use privileged ports as their source port. This range of ports is defined + according to RFC 6335. By default, protection is active for QUIC protocol as + this behavior is suspicious and may be used as a spoofing or DNS/NTP + amplification attack. + +http-err-codes [+-]<range>[,...] [...] + Replace, reduce or extend the list of status codes that define an error as + considered by the termination codes and the "http_err_cnt" counter in stick + tables. The default range for errors is 400 to 499, but in certain contexts + some users prefer to exclude specific codes, especially when tracking client + errors (e.g. 404 on systems with dynamically generated contents). See also + "http-fail-codes" and "http_err_cnt". + + A range specified without '+' nor '-' redefines the existing range to the new + one. A range starting with '+' extends the existing range to also include the + specified one, which may or may not overlap with the existing one. A range + starting with '-' removes the specified range from the existing one. A range + consists in a number from 100 to 599, optionally followed by "-" followed by + another number greater than or equal to the first one to indicate the high + boundary of the range. Multiple ranges may be delimited by commas for a same + add/del/ replace operation. + + Example: + http-err-codes 400,402-444,446-480,490 # sets exactly these codes + http-err-codes 400-499 -450 +500 # sets 400 to 500 except 450 + http-err-codes -450-459 # removes 450 to 459 from range + http-err-codes +501,505 # adds 501 and 505 to range + +http-fail-codes [+-]<range>[,...] [...] + Replace, reduce or extend the list of status codes that define a failure as + considered by the termination codes and the "http_fail_cnt" counter in stick + tables. The default range for failures is 500 to 599 except 501 and 505 which + can be triggered by clients, and normally indicate a failure from the server + to process the request. Some users prefer to exclude certain codes in certain + contexts where it is known they're not relevant, such as 500 in certain SOAP + environments as it doesn't translate a server fault there. The syntax is + exactly the same as for http-err-codes above. See also "http-err-codes" and + "http_fail_cnt". + insecure-fork-wanted By default HAProxy tries hard to prevent any thread and process creation after it starts. 
Doing so is particularly important when using Lua files of @@ -1902,7 +2005,8 @@ insecure-fork-wanted highly recommended that this option is never used and that any workload requiring such a fork be reconsidered and moved to a safer solution (such as agents instead of external checks). This option supports the "no" prefix to - disable it. + disable it. This can also be activated with "-dI" on the haproxy command + line. insecure-setuid-wanted HAProxy doesn't need to call executables at run time (except when using @@ -1933,6 +2037,11 @@ issuers-chain-path <dir> "issuers-chain-path" directory. All other certificates with the same issuer will share the chain in memory. +key-base <dir> + Assigns a default directory to fetch SSL private keys from when a relative + path is used with "key" directives. Absolute locations specified prevail and + ignore "key-base". This option only works with a crt-store load line. + limited-quic This setting must be used to explicitly enable the QUIC listener bindings when haproxy is compiled against a TLS/SSL stack without QUIC support, typically @@ -2058,7 +2167,8 @@ nbthread <number> bound to upon startup. This means that the thread count can easily be adjusted from the calling process using commands like "taskset" or "cpuset". Otherwise, this value defaults to 1. The default value is reported in the - output of "haproxy -vv". + output of "haproxy -vv". Note that values set here or automatically detected + are subject to the limit set by "thread-hard-limit" (if set). no-quic Disable QUIC transport protocol. All the QUIC listeners will still be created. @@ -2077,6 +2187,40 @@ numa-cpu-mapping already specified, for example via the 'cpu-map' directive or the taskset utility. +ocsp-update.disable [ on | off ] + Disable completely the ocsp-update in HAProxy. Any ocsp-update configuration + will be ignored. Default is "off". + See option "ocsp-update" for more information about the auto update + mechanism. + +ocsp-update.httpproxy <address>[:port] + Allow to use an HTTP proxy for the OCSP updates. This only works with HTTP, + HTTPS is not supported. This option will allow the OCSP updater to send + absolute URI in the request to the proxy. + +ocsp-update.maxdelay <number> +tune.ssl.ocsp-update.maxdelay <number> (deprecated) + Sets the maximum interval between two automatic updates of the same OCSP + response. This time is expressed in seconds and defaults to 3600 (1 hour). It + must be set to a higher value than "ocsp-update.mindelay". See + option "ocsp-update" for more information about the auto update mechanism. + +ocsp-update.mindelay <number> +tune.ssl.ocsp-update.mindelay <number> (deprecated) + Sets the minimum interval between two automatic updates of the same OCSP + response. This time is expressed in seconds and defaults to 300 (5 minutes). + It is particularly useful for OCSP response that do not have explicit + expiration times. It must be set to a lower value than + "ocsp-update.maxdelay". See option "ocsp-update" for more + information about the auto update mechanism. + +ocsp-update.mode [ on | off ] + Sets the default ocsp-update mode for all certificates used in the + configuration. This global option can be superseded by the crt-list + "ocsp-update" option. This option is set to "off" by default. + See option "ocsp-update" for more information about the auto update + mechanism. + pidfile <pidfile> Writes PIDs of all daemons into file <pidfile> when daemon mode or writes PID of master process into file <pidfile> when master-worker mode. 
This option is @@ -2182,7 +2326,7 @@ set-var-fmt <var-name> <fmt> are only those using internal data, typically 'int(value)' or 'str(value)'. It is possible to reference previously allocated variables as well. These variables will then be readable (and modifiable) from the regular rule sets. - Please see section 8.2.4 for details on the log-format syntax. + Please see section 8.2.6 for details on the Custom log format syntax. Example: global @@ -2190,20 +2334,29 @@ set-var-fmt <var-name> <fmt> set-var-fmt proc.bootid "%pid|%t" setcap <name>[,<name>...] - Sets a list of capabilities that must be preserved when starting with uid 0 - and switching to a non-zero uid. By default all permissions are lost by the - uid switch, but some are often needed when trying connecting to a server from - a foreign address during transparent proxying, or when binding to a port - below 1024, e.g. when using "tune.quic.socket-owner connection", resulting in - setups running entirely under uid 0. Setting capabilities generally is a - safer alternative, as only the required capabilities will be preserved. The - feature is OS-specific and only enabled on Linux when USE_LINUX_CAP=1 is set - at build time. The list of supported capabilities also depends on the OS and - is enumerated by the error message displayed when an invalid capability name - or an empty one is passed. Multiple capabilities may be passed, delimited by - commas. Among those commonly used, "cap_net_raw" allows to transparently bind - to a foreign address, and "cap_net_bind_service" allows to bind to a - privileged port and may be used by QUIC. + Sets a list of capabilities that must be preserved when starting and running + either as a non-root user (uid > 0), or when starting with uid 0 (root) + and switching then to a non-root. By default all permissions are + lost by the uid switch, but some are often needed when trying to connect to + a server from a foreign address during transparent proxying, or when binding + to a port below 1024, e.g. when using "tune.quic.socket-owner connection", + resulting in setups running entirely under uid 0. Setting capabilities + generally is a safer alternative, as only the required capabilities will be + preserved. The feature is OS-specific and only enabled on Linux when + USE_LINUX_CAP=1 is set at build time. The list of supported capabilities also + depends on the OS and is enumerated by the error message displayed when an + invalid capability name or an empty one is passed. Multiple capabilities may + be passed, delimited by commas. Among those commonly used, "cap_net_raw" + allows to transparently bind to a foreign address, and "cap_net_bind_service" + allows to bind to a privileged port and may be used by QUIC. If the process + is started and run under the same non-root user, needed capabilities should + be set on haproxy binary file with setcap along with this keyword. For more + details about setting capabilities on haproxy binary, please see chapter + 13.1 Linux capabilities support in the Management guide. + + Example: + global + setcap cap_net_bind_service,cap_net_admin setenv <name> <value> Sets environment variable <name> to value <value>. If the variable exists, it @@ -2516,6 +2669,17 @@ ssl-load-extra-files <none|all|bundle|sctl|ocsp|issuer|key>* See also: "crt", section 5.1 about bind options and section 5.2 about server options. 
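A minimal global-section sketch combining the ocsp-update.* keywords described above (the delays and the proxy address are arbitrary example values):

    global
        ocsp-update.mode on
        ocsp-update.mindelay 600
        ocsp-update.maxdelay 7200
        ocsp-update.httpproxy 192.0.2.1:3128

With such settings, each OCSP response is refreshed at most every 10 minutes and at least every 2 hours, and the requests to the responders are relayed through the HTTP proxy at 192.0.2.1:3128.
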
+ssl-security-level <number> + This directive allows to chose the OpenSSL security level as described in + https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_security_level.html + The security level will be applied to every SSL contextes in HAProxy. + Only a value between 0 and 5 is supported. + + The default value depends on your OpenSSL version, distribution and how was + compiled the library. + + This directive requires at least OpenSSL 1.1.1. + ssl-server-verify [none|required] The default behavior for SSL verify on servers side. If specified to 'none', servers certificates are not verified. The default is 'required' except if @@ -2551,6 +2715,11 @@ stats timeout <timeout, in milliseconds> to change this value with "stats timeout". The value must be passed in milliseconds, or be suffixed by a time unit among { us, ms, s, m, h, d }. +stats-file <path> + Path to a generated haproxy stats-file. On startup haproxy will preload the + values to its internal counters. Use the CLI command "dump stats-file" to + produce such stats-file. See the management manual for more details. + strict-limits Makes process fail at startup when a setrlimit fails. HAProxy tries to set the best setrlimit according to what has been calculated. If it fails, it will @@ -2578,6 +2747,19 @@ thread-groups <number> since up to 64 threads per group may be configured. The maximum number of groups is configured at compile time and defaults to 16. See also "nbthread". +thread-hard-limit <number> + This setting is used to enforce a limit to the number of threads, either + detected, or configured. This is particularly useful on operating systems + where the number of threads is automatically detected, where a number of + threads lower than the number of CPUs is desired in generic and portable + configurations. Indeed, while "nbthread" enforces a number of threads that + will result in a warning and bad performance if higher than CPUs available, + thread-hard-limit will only cap the maximum value and automatically limit + the number of threads to no higher than this value, but will not raise lower + values. If "nbthread" is forced to a higher value, thread-hard-limit wins, + and a warning is emitted in so that the configuration anomaly can be + fixed. By default there is no limit. See also "nbthread". + trace <args...> This command configures one "trace" subsystem statement. Each of them can be found in the management manual, and follow the exact same syntax. Only one @@ -2974,28 +3156,32 @@ ssl-mode-async read/write operations (it is only enabled during initial and renegotiation handshakes). +tune.applet.zero-copy-forwarding { on | off } + Enables ('on') of disabled ('off') the zero-copy forwarding of data for the + applets. It is enabled by default. + + See also: tune.disable-zero-copy-forwarding. + tune.buffers.limit <number> Sets a hard limit on the number of buffers which may be allocated per process. - The default value is zero which means unlimited. The minimum non-zero value - will always be greater than "tune.buffers.reserve" and should ideally always - be about twice as large. Forcing this value can be particularly useful to - limit the amount of memory a process may take, while retaining a sane - behavior. When this limit is reached, streams which need a buffer wait for - another one to be released by another stream. Since buffers are dynamically - allocated and released, the waiting time is very short and not perceptible - provided that limits remain reasonable. 
In fact sometimes reducing the limit - may even increase performance by increasing the CPU cache's efficiency. Tests - have shown good results on average HTTP traffic with a limit to 1/10 of the - expected global maxconn setting, which also significantly reduces memory - usage. The memory savings come from the fact that a number of connections - will not allocate 2*tune.bufsize. It is best not to touch this value unless - advised to do so by an HAProxy core developer. + The default value is zero which means unlimited. The limit will automatically + be re-adjusted to satisfy the reserved buffers for emergency situations so + that the user doesn't have to perform complicated calculations. Forcing this + value can be particularly useful to limit the amount of memory a process may + take, while retaining a sane behavior. When this limit is reached, a task + that requests a buffer waits for another one to be released first. Most of + the time the waiting time is very short and not perceptible provided that + limits remain reasonable. However, some historical limitations have weakened + this mechanism over versions and it is known that in certain situations of + sustained shortage, some tasks may freeze until their timeout expires, so it + is safer to avoid using this when not strictly necessary. tune.buffers.reserve <number> - Sets the number of buffers which are pre-allocated and reserved for use only - during memory shortage conditions resulting in failed memory allocations. The - minimum value is 2 and is also the default. There is no reason a user would - want to change this value, it's mostly aimed at HAProxy core developers. + Sets the number of per-thread buffers which are pre-allocated and reserved + for use only during memory shortage conditions resulting in failed memory + allocations. The minimum value is 0 and the default is 4. There is no reason + a user would want to change this value, unless a core developer suggests to + change it for a very specific reason. tune.bufsize <number> Sets the buffer size to this size (in bytes). Lower values allow more @@ -3036,7 +3222,7 @@ tune.disable-zero-copy-forwarding Thanks to this directive, it is possible to disable this optimization. Note it also disable any kernel tcp splicing. - See also: tune.pt.zero-copy-forwarding, + See also: tune.pt.zero-copy-forwarding, tune.applet.zero-copy-forwarding, tune.h1.zero-copy-fwd-recv, tune.h1.zero-copy-fwd-send, tune.h2.zero-copy-fwd-send, tune.quic.zero-copy-fwd-send @@ -3330,10 +3516,17 @@ tune.lua.forced-yield <number> This directive forces the Lua engine to execute a yield each <number> of instructions executed. This permits interrupting a long script and allows the HAProxy scheduler to process other tasks like accepting connections or - forwarding traffic. The default value is 10000 instructions. If HAProxy often - executes some Lua code but more responsiveness is required, this value can be - lowered. If the Lua code is quite long and its result is absolutely required - to process the data, the <number> can be increased. + forwarding traffic. The default value is 10000 instructions for scripts loaded + using "lua-load-per-thread" and MAX(500, 10000 / nbthread) instructions for + scripts loaded using "lua-load" (it was found to be an optimal value for + performance while taking care of not creating thread contention with multiple + threads competing for the global lua lock). + + If HAProxy often executes some Lua code but more responsiveness is required, + this value can be lowered. 
If the Lua code is quite long and its result is + absolutely required to process the data, the <number> can be increased, but + the value should be set wisely as in multithreading context it could increase + contention. tune.lua.maxmem <number> Sets the maximum amount of RAM in megabytes per process usable by Lua. By @@ -3551,6 +3744,11 @@ tune.pt.zero-copy-forwarding { on | off } See also: tune.disable-zero-copy-forwarding, option splice-auto, option splice-request and option splice-response +tune.quic.cc-hystart { on | off } + Enables ('on') or disabled ('off') the HyStart++ (RFC 9406) algorithm for + QUIC connections used as a replacement for the slow start phase of congestion + control algorithms which may cause high packet loss. It is disabled by default. + tune.quic.frontend.conn-tx-buffers.limit <number> This settings defines the maximum number of buffers allocated for a QUIC connection on data emission. By default, it is set to 30. QUIC buffers are @@ -3558,6 +3756,18 @@ tune.quic.frontend.conn-tx-buffers.limit <number> and memory consumption and can be adjusted according to an estimated round time-trip. Each buffer is tune.bufsize. +tune.quic.frontend.glitches-threshold <number> + Sets the threshold for the number of glitches on a frontend connection, where + that connection will automatically be killed. This allows to automatically + kill misbehaving connections without having to write explicit rules for them. + The default value is zero, indicating that no threshold is set so that no + event will cause a connection to be closed. Beware that some QUIC clients may + occasionally cause a few glitches over long lasting connection, so any non- + zero value here should probably be in the hundreds or thousands to be + effective without affecting slightly bogus clients. + + See also: fc_glitches + tune.quic.frontend.max-idle-timeout <timeout> Sets the QUIC max_idle_timeout transport parameters in milliseconds for frontends which determines the period of time after which a connection silently @@ -3634,7 +3844,7 @@ tune.quic.socket-owner { connection | listener } tune.quic.zero-copy-fwd-send { on | off } Enables ('on') of disabled ('off') the zero-copy sends of data for the QUIC - multiplexer. It is disabled by default. + multiplexer. It is enabled by default. See also: tune.disable-zero-copy-forwarding @@ -3671,6 +3881,15 @@ tune.recv_enough <number> may be changed by this setting to better deal with workloads involving lots of short messages such as telnet or SSH sessions. +tune.ring.queues <number> + Sets the number of write queues in front of ring buffers. This can have an + effect on the CPU usage of traces during debugging sessions, and both too + low or too large a value can have an important effect. The good value was + determined experimentally by developers and there should be no reason to + try to change it unless instructed to do so in order to try to address + specific issues. Such a setting should not be left in the configuration + across version upgrades because its optimal value may evolve over time. + tune.runqueue-depth <number> Sets the maximum amount of task that can be processed at once when running tasks. 
The default value depends on the number of threads but sits between 35 @@ -3793,13 +4012,16 @@ tune.ssl.keylog { on | off } SSLKEYLOGFILE Label | Sample fetches for the Secrets --------------------------------|----------------------------------------- - CLIENT_EARLY_TRAFFIC_SECRET | %[ssl_fc_client_early_traffic_secret] - CLIENT_HANDSHAKE_TRAFFIC_SECRET | %[ssl_fc_client_handshake_traffic_secret] - SERVER_HANDSHAKE_TRAFFIC_SECRET | %[ssl_fc_server_handshake_traffic_secret] - CLIENT_TRAFFIC_SECRET_0 | %[ssl_fc_client_traffic_secret_0] - SERVER_TRAFFIC_SECRET_0 | %[ssl_fc_server_traffic_secret_0] - EXPORTER_SECRET | %[ssl_fc_exporter_secret] - EARLY_EXPORTER_SECRET | %[ssl_fc_early_exporter_secret] + CLIENT_EARLY_TRAFFIC_SECRET | %[ssl_xx_client_early_traffic_secret] + CLIENT_HANDSHAKE_TRAFFIC_SECRET | %[ssl_xx_client_handshake_traffic_secret] + SERVER_HANDSHAKE_TRAFFIC_SECRET | %[ssl_xx_server_handshake_traffic_secret] + CLIENT_TRAFFIC_SECRET_0 | %[ssl_xx_client_traffic_secret_0] + SERVER_TRAFFIC_SECRET_0 | %[ssl_xx_server_traffic_secret_0] + EXPORTER_SECRET | %[ssl_xx_exporter_secret] + EARLY_EXPORTER_SECRET | %[ssl_xx_early_exporter_secret] + + These fetches exists for frontend (fc) or backend (bc) sides, replace "xx" by + "fc" or "bc" to use the right side. This is only available with OpenSSL 1.1.1, and useful with TLS1.3 session. @@ -3808,6 +4030,17 @@ tune.ssl.keylog { on | off } "CLIENT_RANDOM %[ssl_fc_client_random,hex] %[ssl_fc_session_key,hex]" + A complete keylog could be generate with a log-format these way, even though + this is not ideal for syslog: + + log-format "CLIENT_EARLY_TRAFFIC_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_client_early_traffic_secret]\n + CLIENT_HANDSHAKE_TRAFFIC_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_client_handshake_traffic_secret]\n + SERVER_HANDSHAKE_TRAFFIC_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_server_handshake_traffic_secret]\n + CLIENT_TRAFFIC_SECRET_0 %[ssl_bc_client_random,hex] %[ssl_bc_client_traffic_secret_0]\n + SERVER_TRAFFIC_SECRET_0 %[ssl_bc_client_random,hex] %[ssl_bc_server_traffic_secret_0]\n + EXPORTER_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_exporter_secret]\n + EARLY_EXPORTER_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_early_exporter_secret]" + tune.ssl.lifetime <timeout> Sets how long a cached SSL session may remain valid. This time is expressed in seconds and defaults to 300 (5 min). It is important to understand that it @@ -3837,20 +4070,6 @@ tune.ssl.ssl-ctx-cache-size <number> dynamically is expensive, they are cached. The default cache size is set to 1000 entries. -tune.ssl.ocsp-update.maxdelay <number> - Sets the maximum interval between two automatic updates of the same OCSP - response. This time is expressed in seconds and defaults to 3600 (1 hour). It - must be set to a higher value than "tune.ssl.ocsp-update.mindelay". See - option "ocsp-update" for more information about the auto update mechanism. - -tune.ssl.ocsp-update.mindelay <number> - Sets the minimum interval between two automatic updates of the same OCSP - response. This time is expressed in seconds and defaults to 300 (5 minutes). - It is particularly useful for OCSP response that do not have explicit - expiration times. It must be set to a lower value than - "tune.ssl.ocsp-update.maxdelay". See option "ocsp-update" for more - information about the auto update mechanism. 
- tune.stick-counters <number> Sets the number of stick-counters that may be tracked at the same time by a connection or a request via "track-sc*" actions in "tcp-request" or @@ -3965,7 +4184,15 @@ user <username> [password|insecure-password <password>] designed to be expensive to compute to achieve resistance against brute force attacks. They do not simply salt/hash the clear text password once, but thousands of times. This can quickly become a major factor in HAProxy's - overall CPU consumption! + overall CPU consumption, and can even lead to application crashes! + + To address the high CPU usage of hash functions, one approach is to reduce + the number of rounds of the hash function (SHA family algorithms) or decrease + the "cost" of the function, if the algorithm supports it. + + As a side note, musl (e.g. Alpine Linux) implementations are known to be + slower than their glibc counterparts when calculating hashes, so you might + want to consider this aspect too. Example: userlist L1 @@ -4577,6 +4804,196 @@ httpclient.timeout.connect <timeout> The default value is 5000ms. + +3.12. Certificate Storage +------------------------- + +HAProxy uses an internal storage mechanism to load and store certificates used +in the configuration. This storage can be configured by using a "crt-store" +section. It allows to configure certificate definitions and which files should +be loaded in it. A certificate definition must be written before it is used +elsewhere in the configuration. + +The "crt-store" takes an optional name in argument. If a name is specified, +every certificate of this store must be referenced using "@<name>/<crt>" or +"@<name>/<alias>". + +Files in the certificate storage can also be updated dynamically with the CLI. +See "set ssl cert" in the section 9.3 of the management guide. + + +The following keywords are supported in the "crt-store" section : + - crt-base + - key-base + - load + +crt-base <dir> + Assigns a default directory to fetch SSL certificates from when a relative + path is used with "crt" directives. Absolute locations specified prevail and + ignore "crt-base". When used in a crt-store, the crt-base of the global + section is ignored. + +key-base <dir> + Assigns a default directory to fetch SSL private keys from when a relative + path is used with "key" directives. Absolute locations specified prevail and + ignore "key-base". When used in a crt-store, the key-base of the global + section is ignored. + +load [crt <filename>] [param*] + Load SSL files in the certificate storage. For the parameter list, see section + "3.12.1. Load options" + +Example: + + crt-store + load crt "site1.crt" key "site1.key" ocsp "site1.ocsp" alias "site1" + load crt "site2.crt" key "site2.key" + + frontend in2 + bind *:443 ssl crt "@/site1" crt "site2.crt" + + crt-store web + crt-base /etc/ssl/certs/ + key-base /etc/ssl/private/ + load crt "site3.crt" alias "site3" + load crt "site4.crt" key "site4.key" + + frontend in2 + bind *:443 ssl crt "@web/site1" crt "site2.crt" crt "@web/site3" crt "@web/site4.crt" + +3.12.1. Load options +-------------------- + +Load SSL files in the certificate storage. The load keyword can take multiple +parameters which are listed below. These keywords are also usable in a +crt-list. + +crt <filename> + This argument is mandatory, it loads a PEM which must contain the public + certificate but could also contain the intermediate certificates and the + private key. If no private key is provided in this file, a key can be provided + with the "key" keyword. 
+ +alias <string> + Optional argument. Allow to name the certificate with an alias, so it can be + referenced with it in the configuration. An alias must be prefixed with '@/' + when called elsewhere in the configuration. + +key <filename> + This argument is optional. Load a private key in PEM format. If a private key + was already defined in "crt", it will overwrite it. + +ocsp <filename> + This argument is optional, it loads an OCSP response in DER format. It can + be updated with the CLI. + +issuer <filename> + This argument is optional. Load the OCSP issuer in PEM format. In order to + identify which certificate an OCSP Response applies to, the issuer's + certificate is necessary. If the issuer's certificate is not found in the + "crt" file, it could be loaded from a file with this argument. + +sctl <filename> + This argument is optional. Support for Certificate Transparency (RFC6962) TLS + extension is enabled. The file must contain a valid Signed Certificate + Timestamp List, as described in RFC. File is parsed to check basic syntax, + but no signatures are verified. + +ocsp-update [ off | on ] + Enable automatic OCSP response update when set to 'on', disable it otherwise. + Its value defaults to 'off'. + To enable the OCSP auto update on a bind line, you can use this option in a + crt-store or you can use the global option "tune.ocsp-update.mode". + If a given certificate is used in multiple crt-lists with different values of + the 'ocsp-update' set, an error will be raised. Likewise, if a certificate + inherits from the global option on a bind line and has an incompatible + explicit 'ocsp-update' option set in a crt-list, the same error will be + raised. + + Examples: + + Here is an example configuration enabling it with a crt-list: + + haproxy.cfg: + frontend fe + bind :443 ssl crt-list haproxy.list + + haproxy.list: + server_cert.pem [ocsp-update on] foo.bar + + Here is an example configuration enabling it with a crt-store: + + haproxy.cfg: + + crt-store + load crt foobar.pem ocsp-update on + + frontend fe + bind :443 ssl crt foobar.pem + + When the option is set to 'on', we will try to get an ocsp response whenever + an ocsp uri is found in the frontend's certificate. The only limitation of + this mode is that the certificate's issuer will have to be known in order for + the OCSP certid to be built. + Each OCSP response will be updated at least once an hour, and even more + frequently if a given OCSP response has an expire date earlier than this one + hour limit. A minimum update interval of 5 minutes will still exist in order + to avoid updating too often responses that have a really short expire time or + even no 'Next Update' at all. Because of this hard limit, please note that + when auto update is set to 'on', any OCSP response loaded during init will + not be updated until at least 5 minutes, even if its expire time ends before + now+5m. This should not be too much of a hassle since an OCSP response must + be valid when it gets loaded during init (its expire time must be in the + future) so it is unlikely that this response expires in such a short time + after init. + On the other hand, if a certificate has an OCSP uri specified and no OCSP + response, setting this option to 'on' for the given certificate will ensure + that the OCSP response gets fetched automatically right after init. + The default minimum and maximum delays (5 minutes and 1 hour respectively) + can be configured by the "ocsp-update.maxdelay" and "ocsp-update.mindelay" + global options. 
+ + Whenever an OCSP response is updated by the auto update task or following a + call to the "update ssl ocsp-response" CLI command, a dedicated log line is + emitted. It follows a dedicated format that contains the following header + "<OCSP-UPDATE>" and is followed by specific OCSP-related information: + - the path of the corresponding frontend certificate + - a numerical update status + - a textual update status + - the number of update failures for the given response + - the number of update successes for the givan response + See "show ssl ocsp-updates" CLI command for a full list of error codes and + error messages. This line is emitted regardless of the success or failure of + the concerned OCSP response update. + The OCSP request/response is sent and received through an http_client + instance that has the dontlog-normal option set and that uses the regular + HTTP log format in case of error (unreachable OCSP responder for instance). + If such an error occurs, another log line that contains HTTP-related + information will then be emitted alongside the "regular" OCSP one (which will + likely have "HTTP error" as text status). But if a purely HTTP error happens + (unreachable OCSP responder for instance), an extra log line that follows the + regular HTTP log-format will be emitted. + Here are two examples of such log lines, with a successful OCSP update log + line first and then an example of an HTTP error with the two different lines + (lines were spit and the URL was shortened for readability): + <133>Mar 6 11:16:53 haproxy[14872]: <OCSP-UPDATE> /path_to_cert/foo.pem 1 \ + "Update successful" 0 1 + + <133>Mar 6 11:18:55 haproxy[14872]: <OCSP-UPDATE> /path_to_cert/bar.pem 2 \ + "HTTP error" 1 0 + <133>Mar 6 11:18:55 haproxy[14872]: -:- [06/Mar/2023:11:18:52.200] \ + <OCSP-UPDATE> -/- 2/0/-1/-1/3009 503 217 - - SC-- 0/0/0/0/3 0/0 {} \ + "GET http://127.0.0.1:12345/MEMwQT HTTP/1.1" + + Troubleshooting: + A common error that can happen with let's encrypt certificates is if the DNS + resolution provides an IPv6 address and your system does not have a valid + outgoing IPv6 route. In such a case, you can either create the appropriate + route or set the "httpclient.resolvers.prefer ipv4" option in the global + section. + In case of "OCSP response check failure" error, you might want to check that + the issuer certificate that you provided is valid. + 4. Proxies ---------- @@ -4771,6 +5188,9 @@ error-log-format X X X - force-persist - - X X filter - X X X fullconn X - X X +guid - X X X +hash-balance-factor X - X X +hash-key X - X X hash-type X - X X http-after-response X (!) X X X http-check comment X - X X @@ -5241,8 +5661,7 @@ balance url_param <param> [check_post] the log messages. When the server goes DOWN, the next server in the list takes its place. When a previously DOWN server goes back UP it is added at the end of the list so that the - sticky server doesn't change until it becomes DOWN. This - algorithm is only usable for backends in LOG mode. + sticky server doesn't change until it becomes DOWN. <arguments> is an optional list of arguments which may be needed by some algorithms. Right now, only "url_param", "uri" and "log-hash" @@ -5250,7 +5669,7 @@ balance url_param <param> [check_post] The load balancing algorithm of a backend is set to roundrobin when no other algorithm, mode nor option have been set. The algorithm may only be set once - for each backend. In backends in LOG mode, server "weight" is always ignored. + for each backend. 
With authentication schemes that require the same connection like NTLM, URI based algorithms must not be used, as they would cause subsequent requests @@ -6404,7 +6823,7 @@ email-alert to <emailaddr> "email-alert myhostname", section 3.6 about mailers. -error-log-format <string> +error-log-format <fmt> Specifies the log format string to use in case of connection error on the frontend side. May be used in the following contexts: tcp, http @@ -6419,8 +6838,8 @@ error-log-format <string> connection errors described in section 8.2.5. If the directive is used in a defaults section, all subsequent frontends will - use the same log format. Please see section 8.2.4 which covers the log format - string in depth. + use the same log format. Please see section 8.2.6 which covers the custom log + format string in depth. "error-log-format" directive overrides previous "error-log-format" directives. @@ -6534,6 +6953,14 @@ fullconn <conns> See also : "maxconn", "server" +guid <string> + Specify a case-sensitive global unique ID for this proxy. This must be unique + across all haproxy configuration on every object types. Format is left + unspecified to allow the user to select its naming policy. The only + restriction is its length which cannot be greater than 127 characters. All + alphanumerical values and '.', ':', '-' and '_' characters are valid. + + hash-balance-factor <factor> Specify the balancing factor for bounded-load consistent hashing @@ -6567,6 +6994,29 @@ hash-balance-factor <factor> See also : "balance" and "hash-type". +hash-key <key> + Specify how "hash-type consistent" node keys are computed + + Arguments : + <key> <key> may be one of the following : + + id The node keys will be derived from the server's numeric + identifier as set from "id" or which defaults to its position + in the server list. + + addr The node keys will be derived from the server's address, when + available, or else fall back on "id". + + addr-port The node keys will be derived from the server's address and + port, when available, or else fall back on "id". + + The "addr" and "addr-port" options may be useful in scenarios where multiple + HAProxy processes are balancing traffic to the same set of servers. If the + server order of each process is different (because, for example, DNS records + were resolved in different orders) then this will allow each independent + HAProxy processes to agree on routing decisions. + + hash-type <method> <function> <modifier> Specify a method to use for mapping hashes to servers @@ -6897,12 +7347,13 @@ http-check expect [min-recv <int>] [comment <msg>] on-success <fmt> is optional and can be used to customize the informational message reported in logs if the expect rule is successfully evaluated and if it is the last rule - in the tcp-check ruleset. <fmt> is a log-format string. + in the tcp-check ruleset. <fmt> is a Custom log format + string (see section 8.2.6). on-error <fmt> is optional and can be used to customize the informational message reported in logs if an error occurred during the expect rule evaluation. <fmt> is a - log-format string. + Custom log format string (see section 8.2.6). <match> is a keyword indicating how to look for a specific pattern in the response. The keyword may be one of "status", "rstatus", "hdr", @@ -6948,17 +7399,18 @@ http-check expect [min-recv <int>] [comment <msg>] match), "end" (suffix match), "sub" (substring match) or "reg" (regex match). If not specified, exact matching method is used. 
If the "name-lf" parameter is used, - <name> is evaluated as a log-format string. If "value-lf" - parameter is used, <value> is evaluated as a log-format - string. These parameters cannot be used with the regex - matching method. Finally, the header value is considered - as comma-separated list. Note that matchings are case - insensitive on the header names. + <name> is evaluated as a Custom log format string (see + section 8.2.6). If "value-lf" parameter is used, <value> + is evaluated as a log-format string. These parameters + cannot be used with the regex matching method. Finally, + the header value is considered as comma-separated + list. Note that matchings are case insensitive on the + header names. fhdr { name | name-lf } [ -m <meth> ] <name> [ { value | value-lf } [ -m <meth> ] <value> : test the specified full header pattern on the HTTP - response headers. It does exactly the same than "hdr" + response headers. It does exactly the same as the "hdr" keyword, except the full header value is tested, commas are not considered as delimiters. @@ -6981,12 +7433,13 @@ http-check expect [min-recv <int>] [comment <msg>] of a dynamic page, or to detect a failure when a specific error appears on the check page (e.g. a stack trace). - string-lf <fmt> : test a log-format string match in the HTTP response body. - A health check response will be considered valid if the - response's body contains the string resulting of the - evaluation of <fmt>, which follows the log-format rules. - If prefixed with "!", then the response will be - considered invalid if the body contains the string. + string-lf <fmt> : test a Custom log format string (see section 8.2.6) match + in the HTTP response body. A health check response will + be considered valid if the response's body contains the + string resulting of the evaluation of <fmt>, which + follows the log-format rules. If prefixed with "!", then + the response will be considered invalid if the body + contains the string. It is important to note that the responses will be limited to a certain size defined by the global "tune.bufsize" option, which defaults to 16384 bytes. @@ -7052,9 +7505,10 @@ http-check send [meth <method>] [{ uri <uri> | uri-lf <fmt> }>] [ver <version>] other URI. Query strings are permitted. uri-lf <fmt> is optional and set the URI referenced in the HTTP requests - using the log-format string <fmt>. It defaults to "/" which - is accessible by default on almost any server, but may be - changed to any other URI. Query strings are permitted. + using the Custom log format <fmt> (see section 8.2.6). It + defaults to "/" which is accessible by default on almost any + server, but may be changed to any other URI. Query strings + are permitted. ver <version> is the optional HTTP version string. It defaults to "HTTP/1.0" but some servers might behave incorrectly in HTTP @@ -7064,16 +7518,16 @@ http-check send [meth <method>] [{ uri <uri> | uri-lf <fmt> }>] [ver <version>] hdr <name> <fmt> adds the HTTP header field whose name is specified in <name> and whose value is defined by <fmt>, which follows - to the log-format rules. + the Custom log format rules described in section 8.2.6. body <string> add the body defined by <string> to the request sent during HTTP health checks. If defined, the "Content-Length" header is thus automatically added to the request. - body-lf <fmt> add the body defined by the log-format string <fmt> to the - request sent during HTTP health checks. 
If defined, the - "Content-Length" header is thus automatically added to the - request. + body-lf <fmt> add the body defined by the Custom log format <fmt> (see + section 8.2.6) to the request sent during HTTP health + checks. If defined, the "Content-Length" header is thus + automatically added to the request. In addition to the request line defined by the "option httpchk" directive, this one is the valid way to add some headers and optionally a body to the @@ -7182,8 +7636,8 @@ http-check set-var-fmt(<var-name>[,<cond>...]) <fmt> <expr> Is a sample-fetch expression potentially followed by converters. - <fmt> This is the value expressed using log-format rules (see Custom - Log Format in section 8.2.4). + <fmt> This is the value expressed using Custom log format (see Custom + Log Format in section 8.2.6). Examples : http-check set-var(check.port) int(1234) @@ -7265,7 +7719,7 @@ http-error status <code> [content-type <type>] file is not empty, its content-type must be set as argument to "content-type", otherwise, any "content-type" argument is ignored. <file> is - evaluated as a log-format string. + evaluated as a Custom log format (see section 8.2.6). lf-string <str> specifies the log-format string to use as response payload. The content-type must always be set as @@ -7273,8 +7727,9 @@ http-error status <code> [content-type <type>] hdr <name> <fmt> adds to the response the HTTP header field whose name is specified in <name> and whose value is defined by - <fmt>, which follows to the log-format rules. - This parameter is ignored if an errorfile is used. + <fmt>, which follows the Custom log format rules (see + section 8.2.6). This parameter is ignored if an + errorfile is used. This directive may be used instead of "errorfile", to define a custom error message. As "errorfile" directive, it is used for errors detected and @@ -7430,12 +7885,29 @@ http-reuse { never | safe | aggressive | always } May be used in sections: defaults | frontend | listen | backend yes | no | yes | yes - By default, a connection established between HAProxy and the backend server - which is considered safe for reuse is moved back to the server's idle - connections pool so that any other request can make use of it. This is the - "safe" strategy below. - - The argument indicates the desired connection reuse strategy : + In order to avoid the cost of setting up new connections to backend servers + for each HTTP request, HAProxy tries to keep such idle connections opened + after being used. These connections are specific to a server and are stored + in a list called a pool, and are grouped together by a set of common key + properties. Subsequent HTTP requests will cause a lookup of a compatible + connection sharing identical properties in the associated pool and result in + this connection being reused instead of establishing a new one. + + A limit on the number of idle connections to keep on a server can be + specified via the "pool-max-conn" server keyword. Unused connections are + periodically purged according to the "pool-purge-delay" interval. + + The following connection properties are used to determine if an idle + connection is eligible for reuse on a given request: + - source and destination addresses + - proxy protocol + - TOS and mark socket options + - connection name, determined either by the result of the evaluation of the + "pool-conn-name" expression if present, otherwise by the "sni" expression + + In some occasions, connection lookup or reuse is not performed due to extra + restrictions. 
This is determined by the reuse strategy specified via the + keyword argument: - "never" : idle connections are never shared between sessions. This mode may be enforced to cancel a different strategy inherited from @@ -7486,20 +7958,12 @@ http-reuse { never | safe | aggressive | always } gains as "aggressive" but with more risks. It should only be used when it improves the situation over "aggressive". - When http connection sharing is enabled, a great care is taken to respect the - connection properties and compatibility. Indeed, some properties are specific - and it is not possibly to reuse it blindly. Those are the SSL SNI, source - and destination address and proxy protocol block. A connection is reused only - if it shares the same set of properties with the request. - Also note that connections with certain bogus authentication schemes (relying - on the connection) like NTLM are marked private and never shared. - - A connection pool is involved and configurable with "pool-max-conn". - - Note: connection reuse improves the accuracy of the "server maxconn" setting, - because almost no new connection will be established while idle connections - remain available. This is particularly true with the "always" strategy. + on the connection) like NTLM are marked private if possible and never shared. + This won't be the case however when using a protocol with multiplexing + abilities and using reuse mode level value greater than the default "safe" + strategy as in this case nothing prevents the connection from being already + shared. The rules to decide to keep an idle connection opened or to close it after processing are also governed by the "tune.pool-low-fd-ratio" (default: 20%) @@ -7513,13 +7977,14 @@ http-reuse { never | safe | aggressive | always } too few connections are kept open. It may be desirable in this case to adjust such thresholds or simply to increase the global "maxconn" value. - Similarly, when thread groups are explicitly enabled, it is important to - understand that idle connections are only usable between threads from a same - group. As such it may happen that unfair load between groups leads to more - idle connections being needed, causing a lower reuse rate. The same solution - may then be applied (increase global "maxconn" or increase pool ratios). + When thread groups are explicitly enabled, it is important to understand that + idle connections are only usable between threads from a same group. As such + it may happen that unfair load between groups leads to more idle connections + being needed, causing a lower reuse rate. The same solution may then be + applied (increase global "maxconn" or increase pool ratios). - See also : "option http-keep-alive", "server maxconn", "thread-groups", + See also : "option http-keep-alive", "pool-conn-name", "pool-max-conn", + "pool-purge-delay", "server maxconn", "sni", "thread-groups", "tune.pool-high-fd-ratio", "tune.pool-low-fd-ratio" @@ -7883,8 +8348,8 @@ no log # level and send in tcp log "${LOCAL_SYSLOG}:514" local0 notice # send to local server -log-format <string> - Specifies the log format string to use for traffic logs +log-format <fmt> + Specifies the custom log format string to use for traffic logs May be used in the following contexts: tcp, http @@ -7894,16 +8359,17 @@ log-format <string> This directive specifies the log format string that will be used for all logs resulting from traffic passing through the frontend using this line. 
If the directive is used in a defaults section, all subsequent frontends will use - the same log format. Please see section 8.2.4 which covers the log format - string in depth. + the same log format. Please see section 8.2.6 which covers the custom log + format string in depth. + A specific log-format used only in case of connection error can also be defined, see the "error-log-format" option. "log-format" directive overrides previous "option tcplog", "log-format", "option httplog" and "option httpslog" directives. -log-format-sd <string> - Specifies the RFC5424 structured-data log format string +log-format-sd <fmt> + Specifies the Custom log format string used to produce RFC5424 structured-data May be used in the following contexts: tcp, http @@ -7913,7 +8379,7 @@ log-format-sd <string> This directive specifies the RFC5424 structured-data log format string that will be used for all logs resulting from traffic passing through the frontend using this line. If the directive is used in a defaults section, all - subsequent frontends will use the same log format. Please see section 8.2.4 + subsequent frontends will use the same log format. Please see section 8.2.6 which covers the log format string in depth. See https://tools.ietf.org/html/rfc5424#section-6.3 for more information @@ -9421,7 +9887,7 @@ no option logasap Arguments : none - By default, logs are emitted when all the log format variables and sample + By default, logs are emitted when all the log format aliases and sample fetches used in the definition of the log-format string return a value, or when the stream is terminated. This allows the built in log-format strings to account for the transfer time, or the number of bytes in log messages. @@ -10467,8 +10933,8 @@ redirect scheme <sch> [code <code>] <option> [{if | unless} <condition>] Arguments : <loc> With "redirect location", the exact value in <loc> is placed into the HTTP "Location" header. When used in an "http-request" rule, - <loc> value follows the log-format rules and can include some - dynamic values (see Custom Log Format in section 8.2.4). + <loc> value follows the Custom log format rules and can include + some dynamic values (see Custom log format in section 8.2.6). <pfx> With "redirect prefix", the "Location" header is built from the concatenation of <pfx> and the complete URI path, including the @@ -10476,9 +10942,9 @@ redirect scheme <sch> [code <code>] <option> [{if | unless} <condition>] below). As a special case, if <pfx> equals exactly "/", then nothing is inserted before the original URI. It allows one to redirect to the same URL (for instance, to insert a cookie). When - used in an "http-request" rule, <pfx> value follows the log-format - rules and can include some dynamic values (see Custom Log Format - in section 8.2.4). + used in an "http-request" rule, <pfx> value follows the Custom + Log Format rules and can include some dynamic values (see Custom + Log Format in section 8.2.6). <sch> With "redirect scheme", then the "Location" header is built by concatenating <sch> with "://" then the first occurrence of the @@ -10489,8 +10955,8 @@ redirect scheme <sch> [code <code>] <option> [{if | unless} <condition>] returned, which most recent browsers interpret as redirecting to the same host. This directive is mostly used to redirect HTTP to HTTPS. When used in an "http-request" rule, <sch> value follows - the log-format rules and can include some dynamic values (see - Custom Log Format in section 8.2.4). 
+ the Custom log format rules and can include some dynamic values + (see Custom log format in section 8.2.6). <code> The code is optional. It indicates which type of HTTP redirection is desired. Only codes 301, 302, 303, 307 and 308 are supported, @@ -12367,12 +12833,13 @@ tcp-check expect [min-recv <int>] [comment <msg>] on-success <fmt> is optional and can be used to customize the informational message reported in logs if the expect rule is successfully evaluated and if it is the last rule - in the tcp-check ruleset. <fmt> is a log-format string. + in the tcp-check ruleset. <fmt> is a Custom log format + (see section 8.2.6). on-error <fmt> is optional and can be used to customize the informational message reported in logs if an error occurred during the expect rule evaluation. <fmt> is a - log-format string. + Custom log format (see section 8.2.6). status-code <expr> is optional and can be used to set the check status code reported in logs, on success or on error. <expr> is a @@ -12405,12 +12872,13 @@ tcp-check expect [min-recv <int>] [comment <msg>] will be considered invalid if the body matches the expression. - string-lf <fmt> : test a log-format string match in the response's buffer. + string-lf <fmt> : test a Custom log format match in the response's buffer. A health check response will be considered valid if the response's buffer contains the string resulting of the - evaluation of <fmt>, which follows the log-format rules. - If prefixed with "!", then the response will be - considered invalid if the buffer contains the string. + evaluation of <fmt>, which follows the Custom log format + rules described in section 8.2.6. If prefixed with "!", + then the response will be considered invalid if the + buffer contains the string. binary <hexstring> : test the exact string in its hexadecimal form matches in the response buffer. A health check response will @@ -12427,16 +12895,16 @@ tcp-check expect [min-recv <int>] [comment <msg>] pattern should work on at-most half the response buffer size. - binary-lf <hexfmt> : test a log-format string in its hexadecimal form - match in the response's buffer. A health check response - will be considered valid if the response's buffer - contains the hexadecimal string resulting of the - evaluation of <fmt>, which follows the log-format - rules. If prefixed with "!", then the response will be - considered invalid if the buffer contains the - hexadecimal string. The hexadecimal string is converted - in a binary string before matching the response's - buffer. + binary-lf <hexfmt> : test a Custom log format in its hexadecimal form match + in the response's buffer. A health check response will + be considered valid if the response's buffer contains + the hexadecimal string resulting of the evaluation of + <fmt>, which follows the Custom log format rules (see + section 8.2.6). If prefixed with "!", then the + response will be considered invalid if the buffer + contains the hexadecimal string. The hexadecimal + string is converted in a binary string before matching + the response's buffer. It is important to note that the responses will be limited to a certain size defined by the global "tune.bufsize" option, which defaults to 16384 bytes. 
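
  As an illustration of the "string-lf" matcher described above, here is a
  minimal sketch of a generic check; the backend name, the HELLO banner and the
  EXPECTED_BANNER environment variable are hypothetical:

      backend be_custom_proto
          option tcp-check
          tcp-check connect
          tcp-check send HELLO\r\n
          # the expected reply is evaluated at check time via env()
          tcp-check expect string-lf %[env(EXPECTED_BANNER)]
          server srv1 192.0.2.10:7000 check

  Keeping the expected pattern well below "tune.bufsize" avoids the truncation
  limit mentioned above.
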
@@ -12475,7 +12943,7 @@ tcp-check expect [min-recv <int>] [comment <msg>] tcp-check send <data> [comment <msg>] tcp-check send-lf <fmt> [comment <msg>] - Specify a string or a log-format string to be sent as a question during a + Specify a string or a Custom log format to be sent as a question during a generic health check May be used in the following contexts: tcp, http, log @@ -12489,8 +12957,8 @@ tcp-check send-lf <fmt> [comment <msg>] <data> is the string that will be sent during a generic health check session. - <fmt> is the log-format string that will be sent, once evaluated, - during a generic health check session. + <fmt> is the Custom log format that will be sent, once evaluated, + during a generic health check session (see section 8.2.6). Examples : # look for the redis master server @@ -12504,7 +12972,7 @@ tcp-check send-lf <fmt> [comment <msg>] tcp-check send-binary <hexstring> [comment <msg>] tcp-check send-binary-lf <hexfmt> [comment <msg>] - Specify an hex digits string or an hex digits log-format string to be sent as + Specify an hex digits string or an hex digits Custom log format to be sent as a binary question during a raw tcp health check May be used in the following contexts: tcp, http, log @@ -12518,9 +12986,9 @@ tcp-check send-binary-lf <hexfmt> [comment <msg>] <hexstring> is the hexadecimal string that will be send, once converted to binary, during a generic health check session. - <hexfmt> is the hexadecimal log-format string that will be send, once + <hexfmt> is the hexadecimal Custom log format that will be send, once evaluated and converted to binary, during a generic health - check session. + check session (see section 8.2.6). Examples : # redis check in binary @@ -12559,8 +13027,8 @@ tcp-check set-var-fmt(<var-name>[,<cond>...]) <fmt> <expr> Is a sample-fetch expression potentially followed by converters. - <fmt> This is the value expressed using log-format rules (see Custom - Log Format in section 8.2.4). + <fmt> This is the value expressed using Custom log format rules (see + Custom log format in section 8.2.6). Examples : tcp-check set-var(check.port) int(1234) @@ -13499,7 +13967,7 @@ transparent (deprecated) See also: "option transparent" -unique-id-format <string> +unique-id-format <fmt> Generate a unique ID for each request. May be used in the following contexts: tcp, http @@ -13508,12 +13976,12 @@ unique-id-format <string> yes | yes | yes | no Arguments : - <string> is a log-format string. + <fmt> is a Custom log format string (see section 8.2.6). This keyword creates a ID for each request using the custom log format. A unique ID is useful to trace a request passing through many components of a complex infrastructure. The newly created ID may also be logged using the - %ID tag the log-format string. + %ID alias in the Custom log format string. The format should be composed from elements that are guaranteed to be unique when combined together. For instance, if multiple HAProxy instances @@ -13572,7 +14040,8 @@ use_backend <backend> [{if | unless} <condition>] Arguments : <backend> is the name of a valid backend or "listen" section, or a - "log-format" string resolving to a backend name. + Custom log format resolving to a backend name (see Custom + Log Format in section 8.2.6). <condition> is a condition composed of ACLs, as described in section 7. If it is omitted, the rule is unconditionally applied. 
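
  As an illustration of a dynamically resolved backend name, here is a minimal
  sketch; the frontend and backend names are hypothetical:

      frontend fe_main
          bind :80
          # route to a backend named after the Host header, e.g. "be_example.com"
          use_backend be_%[req.hdr(host),lower,word(1,:)] if { req.hdr(host) -m found }
          default_backend be_fallback
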
@@ -13604,7 +14073,7 @@ use_backend <backend> [{if | unless} <condition>] When <backend> is a simple name, it is resolved at configuration time, and an error is reported if the specified backend does not exist. If <backend> is - a log-format string instead, no check may be done at configuration time, so + a Custom log format instead, no check may be done at configuration time, so the backend name is resolved dynamically at run time. If the resulting backend name does not correspond to any valid backend, no other rule is evaluated, and the default_backend directive is applied instead. Note that @@ -13643,7 +14112,8 @@ use-server <server> unless <condition> Arguments : <server> is the name of a valid server in the same backend section - or a "log-format" string resolving to a server name. + or a Custom log format string resolving to a server name + (see section 8.2.6). <condition> is a condition composed of ACLs, as described in section 7. @@ -13691,10 +14161,10 @@ use-server <server> unless <condition> When <server> is a simple name, it is checked against existing servers in the configuration and an error is reported if the specified server does not exist. - If it is a log-format, no check is performed when parsing the configuration, - and if we can't resolve a valid server name at runtime but the use-server rule - was conditioned by an ACL returning true, no other use-server rule is applied - and we fall back to load balancing. + If it is a Custom log format, no check is performed when parsing the + configuration, and if we can't resolve a valid server name at runtime but the + use-server rule was conditioned by an ACL returning true, no other use-server + rule is applied and we fall back to load balancing. See also: "use_backend", section 5 about server and section 7 about ACLs. @@ -13766,12 +14236,16 @@ sc-set-gpt X X X X X X X sc-set-gpt0 X X X X X X X send-spoe-group - - X X X X - set-bandwidth-limit - - X X X X - +set-bc-mark - - X - X - - +set-bc-tos - - X - X - - set-dst X X X - X - - set-dst-port X X X - X - - +set-fc-mark X X X X X X - +set-fc-tos X X X X X X - set-header - - - - X X X set-log-level - - X X X X X set-map - - - - X X X -set-mark X X X X X X - +set-mark (deprecated) X X X X X X - set-method - - - - X - - set-nice - - X X X X - set-path - - - - X - - @@ -13784,7 +14258,7 @@ set-src X X X - X - - set-src-port X X X - X - - set-status - - - - - X X set-timeout - - - - X X - -set-tos X X X X X X - +set-tos (deprecated) X X X X X X - set-uri - - - - X - - set-var X X X X X X X set-var-fmt X X X X X X X @@ -13827,10 +14301,10 @@ add-acl(<file-name>) <key fmt> This is used to add a new entry into an ACL. The ACL must be loaded from a file (even a dummy empty file). The file name of the ACL to be updated is passed between parentheses. It takes one argument: <key fmt>, which follows - log-format rules, to collect content of the new entry. It performs a lookup - in the ACL before insertion, to avoid duplicated (or more) values. - It is the equivalent of the "add acl" command from the stats socket, but can - be triggered by an HTTP request. + Custom log format rules described in section 8.2.6, to collect content of the + new entry. It performs a lookup in the ACL before insertion, to avoid + duplicated (or more) values. It is the equivalent of the "add acl" command + from the stats socket, but can be triggered by an HTTP request. 
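
  As an illustration of "add-acl", here is a minimal sketch; the ACL file path
  and the /block-me path are hypothetical:

      frontend fe_main
          bind :80
          acl is_blocked src -f /etc/haproxy/blocklist.acl
          # a client requesting this path adds its own address to the ACL file
          http-request add-acl(/etc/haproxy/blocklist.acl) %[src] if { path /block-me }
          http-request deny if is_blocked
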
add-header <name> <fmt> @@ -13838,13 +14312,13 @@ add-header <name> <fmt> - | - | - | - | X | X | X This appends an HTTP header field whose name is specified in <name> and - whose value is defined by <fmt> which follows the log-format rules (see - Custom Log Format in section 8.2.4). This is particularly useful to pass + whose value is defined by <fmt> which follows the Custom log format rules + (see Custom log format in section 8.2.6). This is particularly useful to pass connection-specific information to the server (e.g. the client's SSL - certificate), or to combine several headers into one. This rule is not - final, so it is possible to add other similar rules. Note that header - addition is performed immediately, so one rule might reuse the resulting - header from a previous rule. + certificate), or to combine several headers into one. This rule is not final, + so it is possible to add other similar rules. Note that header addition is + performed immediately, so one rule might reuse the resulting header from a + previous rule. allow @@ -13868,20 +14342,19 @@ attach-srv <srv> [name <expr>] [ EXPERIMENTAL ] pool of server <srv>. This may only be used with servers having an 'rhttp@' address. - An extra parameter <expr> can be specified. Its value is interpreted as a - sample expression to name the connection inside the server idle pool. When - routing an outgoing request through this server, this name will be matched - against the 'sni' parameter of the server line. Otherwise, the connection - will have no name and will only match requests without SNI. - - This rule is only valid for frontend in HTTP mode. Also all listeners must - not require a protocol different from HTTP/2. + The connection is inserted into the server idle pool with a name defined by + the result of the <expr> evaluation. This is the name that will be matched + against by requests subject to "pool-conn-name" or "sni" parameter. See + "http-reuse" for more details. Reverse HTTP is currently still in active development. Configuration mechanism may change in the future. For this reason it is internally marked as experimental, meaning that "expose-experimental-directives" must appear on a line before this directive. + Note that a very similar but independent protocol is under development. See + https://www.ietf.org/archive/id/draft-bt-httpbis-reverse-http-00.html. + auth [realm <realm>] Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft - | - | - | - | X | - | - @@ -13935,7 +14408,7 @@ capture <sample> [ len <length> | id <id> ] This captures sample expression <sample> from the request or response buffer, and converts it to a string of at most <len> characters. The resulting string - is stored into the next "capture" slot (either request or reponse), so it + is stored into the next "capture" slot (either request or response), so it will possibly appear next to some captured HTTP headers. It will then automatically appear in the logs, and it will be possible to extract it using sample fetch methods to feed it into headers or anything. The length should @@ -13974,9 +14447,9 @@ del-acl(<file-name>) <key fmt> This is used to delete an entry from an ACL. The ACL must be loaded from a file (even a dummy empty file). The file name of the ACL to be updated is passed between parentheses. It takes one argument: <key fmt>, which follows - log-format rules, to collect content of the entry to delete. - It is the equivalent of the "del acl" command from the stats socket, but can - be triggered by an HTTP request or response. 
+  Custom log format rules of section 8.2.6, to collect content of the entry to
+  delete. It is the equivalent of the "del acl" command from the stats socket,
+  but can be triggered by an HTTP request or response.
 
 
 del-header <name> [ -m <meth> ]
@@ -13990,17 +14463,17 @@ del-header <name> [ -m <meth> ]
   method is used.
 
 
-del-map(<file-name>) <key fmt>
+del-map(<map-name>) <key fmt>
   Usable in:  TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft
                   -   |   -  |   -  |   -  |     X   |  X  |  X
 
-  This is used to delete an entry from a MAP. The MAP must be loaded from a
-  file (even a dummy empty file). The file name of the MAP to be updated is
-  passed between parentheses. It takes one argument: <key fmt>, which follows
-  log-format rules, to collect content of the entry to delete.
-  It takes one argument: "file name" It is the equivalent of the "del map"
-  command from the stats socket, but can be triggered by an HTTP request or
-  response.
+  This is used to delete an entry from a MAP. <map-name> must follow the format
+  described in section 2.7 about name format for maps and ACLs. The name of the
+  MAP to be updated is passed between parentheses. It takes one argument:
+  <key fmt>, which follows Custom log format rules of section 8.2.6, to collect
+  content of the entry to delete. It is the equivalent of the "del map" command
+  from the stats socket, but can be triggered by an HTTP request or response.
 
 
 deny [ { status | deny_status } <code> ] [ content-type <type> ]
@@ -14095,10 +14568,10 @@ early-hint <name> <fmt>
   This is used to build an HTTP 103 Early Hints response prior to any other one.
   This appends an HTTP header field to this response whose name is specified in
-  <name> and whose value is defined by <fmt> which follows the log-format rules
-  (see Custom Log Format in section 8.2.4). This is particularly useful to pass
-  to the client some Link headers to preload resources required to render the
-  HTML documents.
+  <name> and whose value is defined by <fmt> which follows the Custom Log
+  Format rules (see Custom log format in section 8.2.6). This is particularly
+  useful to pass to the client some Link headers to preload resources required
+  to render the HTML documents.
 
   See RFC 8297 for more information.
 
@@ -14285,7 +14758,7 @@ redirect <rule>
   This performs an HTTP redirection based on a redirect rule. This is exactly
   the same as the "redirect" statement except that it inserts a redirect rule
   which is processed in the middle of other "http-request" or "http-response"
-  rules and that these rules use the "log-format" strings. For responses, only
+  rules and that these rules use the Custom log format. For responses, only
   the "location" type of redirect is permitted. In addition, when a redirect is
   performed during a response, the transfer from the server to HAProxy is
   interrupted so that no payload can be forwarded to the client. This may cause
@@ -14511,19 +14984,20 @@ return [ status <code> ] [ content-type <type> ]
     used as the response payload. If the file is not empty, its content-type
     must be set as argument to "content-type". Otherwise, any "content-type"
     argument is ignored. With a "lf-file" argument, the file's content is
-    evaluated as a log-format string. With a "file" argument, it is considered
-    as a raw content.
+    evaluated as a Custom log format (see section 8.2.6). With a "file"
+    argument, it is considered as a raw content.
 
   * If a "string" or "lf-string" argument is specified, the defined string is
     used as the response payload.
The content-type must always be set as argument to "content-type". With a "lf-string" argument, the string is - evaluated as a log-format string. With a "string" argument, it is - considered as a raw string. + evaluated as a Custom log format (see section 8.2.6). With a "string" + argument, it is considered as a raw string. When the response is not based on an errorfile, it is possible to append HTTP header fields to the response using "hdr" arguments. Otherwise, all "hdr" arguments are ignored. For each one, the header name is specified in <name> - and its value is defined by <fmt> which follows the log-format rules. + and its value is defined by <fmt> which follows the Custom log format rules + described in section 8.2.6. Note that the generated response must be smaller than a buffer. And to avoid any warning, when an errorfile or a raw file is loaded, the buffer space @@ -14674,6 +15148,42 @@ set-bandwidth-limit <name> [limit {<expr> | <size>}] [period {<expr> | <time>}] See section 9.7 about bandwidth limitation filter setup. +set-bc-mark { <mark> | <expr> } + Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft + - | - | X | - | X | - | - + + This is used to set the Netfilter/IPFW MARK on the backend connection (all + packets sent to the server) to the value passed in <mark> or <expr> on + platforms which support it. This value is an unsigned 32 bit value which can + be matched by netfilter/ipfw and by the routing table or monitoring the + packets through DTrace. <mark> can be expressed both in decimal or hexadecimal + format (prefixed by "0x"). Alternatively, <expr> can be used: it is a standard + HAProxy expression formed by a sample-fetch followed by some converters which + must resolve to integer type. This action can be useful to force certain + packets to take a different route (for example a cheaper network path for bulk + downloads). This works on Linux kernels 2.6.32 and above and requires admin + privileges, as well on FreeBSD and OpenBSD. The mark will be set for the whole + duration of the backend/server connection (from connect to close). + + +set-bc-tos { <tos> | <expr> } + Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft + - | - | X | - | X | - | - + + This is used to set the TOS or DSCP field value on the backend connection + (all packets sent to the server) to the value passed in <tos> or <expr> on + platforms which support this. This value represents the whole 8 bits of the + IP TOS field. Note that only the 6 higher bits are used in DSCP or TOS, and + the two lower bits are always 0. Alternatively, <expr> can be used: it is a + standard HAProxy expression formed by a sample-fetch followed by some + converters which must resolve to integer type. This action can be used to + adjust some routing behavior on inner routers based on some information from + the request. The tos will be set for the whole duration of the backend/server + connection (from connect to close). + + See RFC 2474, 2597, 3260 and 4594 for more information. + + set-dst <expr> Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft X | X | X | - | X | - | - @@ -14717,6 +15227,39 @@ set-dst-port <expr> destination address to IPv4 "0.0.0.0" before rewriting the port. +set-fc-mark { <mark> | <expr> } + Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft + X | X | X | X | X | X | - + + This is used to set the Netfilter/IPFW MARK on all packets sent to the client + to the value passed in <mark> or <expr> on platforms which support it. 
This
+  value is an unsigned 32 bit value which can be matched by netfilter/ipfw and
+  by the routing table or monitoring the packets through DTrace. <mark> can be
+  expressed both in decimal or hexadecimal format (prefixed by "0x").
+  Alternatively, <expr> can be used: it is a standard HAProxy expression formed
+  by a sample-fetch followed by some converters which must resolve to integer
+  type. This action can be useful to force certain packets to take a different
+  route (for example a cheaper network path for bulk downloads). This works on
+  Linux kernels 2.6.32 and above and requires admin privileges, as well on
+  FreeBSD and OpenBSD.
+
+
+set-fc-tos { <tos> | <expr> }
+  Usable in:  TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft
+                  X   |   X  |   X  |   X  |     X   |  X  |  -
+
+  This is used to set the TOS or DSCP field value of packets sent to the client
+  to the value passed in <tos> or <expr> on platforms which support this. This
+  value represents the whole 8 bits of the IP TOS field. Note that only the 6
+  higher bits are used in DSCP or TOS, and the two lower bits are always 0.
+  Alternatively, <expr> can be used: it is a standard HAProxy expression formed
+  by a sample-fetch followed by some converters which must resolve to integer
+  type. This action can be used to adjust some routing behavior on border
+  routers based on some information from the request.
+
+  See RFC 2474, 2597, 3260 and 4594 for more information.
+
+
 set-header <name> <fmt>
   Usable in:  TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft
                   -   |   -  |   -  |   -  |     X   |  X  |  X
@@ -14751,33 +15294,23 @@ set-log-level <level>
   can be useful to disable health checks coming from another equipment.
 
 
-set-map(<file-name>) <key fmt> <value fmt>
+set-map(<map-name>) <key fmt> <value fmt>
   Usable in:  TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft
                   -   |   -  |   -  |   -  |     X   |  X  |  X
 
-  This is used to add a new entry into a map. The map must be loaded from a
-  file (even a dummy empty file). The file name of the map to be updated is
-  passed between parentheses. It takes 2 arguments: <key fmt>, which follows
-  log-format rules, used to collect map key, and <value fmt>, which follows
-  log-format rules, used to collect content for the new entry.
-  It performs a lookup in the map before insertion, to avoid duplicated (or
-  more) values. It is the equivalent of the "set map" command from the
-  stats socket, but can be triggered by an HTTP request.
+  This is used to add a new entry into a map. <map-name> must follow the format
+  described in 2.7. about name format for maps and ACLs. The name of the MAP to
+  be updated is passed between parentheses. It takes 2 arguments: <key fmt>,
+  which follows Custom log format rules described in section 8.2.6, used to
+  collect map key, and <value fmt>, which follows Custom log format rules, used
+  to collect content for the new entry. It performs a lookup in the map before
+  insertion, to avoid duplicated (or more) values. It is the equivalent of the
+  "set map" command from the stats socket, but can be triggered by an HTTP
+  request.
 
 
-set-mark <mark>
-  Usable in:  TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft
-                  X   |   X  |   X  |   X  |     X   |  X  |  -
-
-  This is used to set the Netfilter/IPFW MARK on all packets sent to the client
-  to the value passed in <mark> on platforms which support it. This value is an
-  unsigned 32 bit value which can be matched by netfilter/ipfw and by the
-  routing table or monitoring the packets through DTrace. It can be expressed
-  both in decimal or hexadecimal format (prefixed by "0x").
- This can be useful to force certain packets to take a different route (for - example a cheaper network path for bulk downloads). This works on Linux - kernels 2.6.32 and above and requires admin privileges, as well on FreeBSD - and OpenBSD. +set-mark <mark> (deprecated) + This is an alias for "set-fc-mark" (which should be used instead). set-method <fmt> @@ -14964,19 +15497,8 @@ set-timeout { client | server | tunnel } { <timeout> | <expr> } http-response set-timeout server res.hdr(X-Refresh-Seconds),mul(1000) -set-tos <tos> - Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft - X | X | X | X | X | X | - - - This is used to set the TOS or DSCP field value of packets sent to the client - to the value passed in <tos> on platforms which support this. This value - represents the whole 8 bits of the IP TOS field, and can be expressed both in - decimal or hexadecimal format (prefixed by "0x"). Note that only the 6 higher - bits are used in DSCP or TOS, and the two lower bits are always 0. This can - be used to adjust some routing behavior on border routers based on some - information from the request. - - See RFC 2474, 2597, 3260 and 4594 for more information. +set-tos <tos> (deprecated) + This is an alias for "set-fc-tos" (which should be used instead). set-uri <fmt> @@ -15024,8 +15546,8 @@ set-var-fmt(<var-name>[,<cond>...]) <fmt> <expr> Is a standard HAProxy expression formed by a sample-fetch followed by some converters. - <fmt> This is the value expressed using log-format rules (see Custom - Log Format in section 8.2.4). + <fmt> This is the value expressed using Custom log format rules (see + Custom log format in section 8.2.6). All scopes are usable for HTTP rules, but scopes "proc" and "sess" are the only usable ones in rule sets which do not have access to contents such as @@ -15058,7 +15580,7 @@ silent-drop [ rst-ttl <ttl> ] the RST packet travels through the local infrastructure, deleting the connection in firewalls and other systems, but disappears before reaching the client. Future packets from the client will then be dropped already by - front equipments. These local RSTs protect local resources, but not the + front equipment. These local RSTs protect local resources, but not the client's. This must not be used unless the consequences of doing this are fully understood. @@ -15152,7 +15674,7 @@ track-sc2 <key> [table <table>] <key> is mandatory, and is a sample expression rule as described in section 7.3. It describes what elements of the incoming connection, - request or reponse will be analyzed, extracted, combined, and used + request or response will be analyzed, extracted, combined, and used to select which table entry to update the counters. <table> is an optional table to be used instead of the default one, which @@ -15224,7 +15746,7 @@ wait-for-body time <time> [ at-least <bytes> ] case HAProxy will respond with a 408 "Request Timeout" error to the client and stop processing the request. Note that if any of the other conditions happens first, this timeout will not occur even if the full body has - not yet been recieved. + not yet been received. This action may be used as a replacement for "option http-buffer-request". @@ -15480,7 +16002,13 @@ crt <cert> match any certificate, then the first loaded certificate will be presented. This means that when loading certificates from a directory, it is highly recommended to load the default one first as a file or to ensure that it will - always be the first one in the directory. + always be the first one in the directory. 
In order to chose multiple default + certificates (1 rsa and 1 ecdsa), there are 3 options: + - A multi-cert bundle can be configured as the first certificate + (`crt foobar.pem` in the configuration where the existing files + are `foobar.pem.ecdsa` and `foobar.pem.rsa`. + - Or a '*' filter for each certificate in a crt-list line. + - The 'default-crt' keyword can be used. Note that the same cert may be loaded multiple times without side effects. @@ -15538,11 +16066,38 @@ crt-list <file> <crtfile> [\[<sslbindconf> ...\]] [[!]<snifilter> ...] - sslbindconf supports "allow-0rtt", "alpn", "ca-file", "ca-verify-file", - "ciphers", "ciphersuites", "crl-file", "curves", "ecdhe", "no-ca-names", - "npn", "verify" configuration. With BoringSSL and Openssl >= 1.1.1 - "ssl-min-ver" and "ssl-max-ver" are also supported. It overrides the - configuration set in bind line for the certificate. + sslbindconf supports the following keywords from the bind line + (see Section 5.1. Bind options): + + - allow-0rtt + - alpn + - ca-file + - ca-verify-file + - ciphers + - ciphersuites + - client-sigalgs + - crl-file + - curves + - ecdhe + - no-alpn + - no-ca-names + - npn + - sigalgs + - ssl-min-ver + - ssl-max-ver + - verify + + sslbindconf also supports the following keywords from the crt-store load + keyword (see Section 3.12.1. Load options): + + - crt + - key + - ocsp + - issuer + - sctl + - ocsp-update + + It overrides the configuration set in bind line for the certificate. Wildcards are supported in the SNI filter. Negative filter are also supported, useful in combination with a wildcard filter to exclude a particular SNI, or @@ -15567,7 +16122,10 @@ crt-list <file> filter is found on any crt-list. The SNI filter !* can be used after the first declared certificate to not include its CN and SAN in the SNI tree, so it will never match except if no other certificate matches. This way the first - declared certificate act as a fallback. + declared certificate act as a fallback. It is also possible to declare a '*' + filter, which will allow to chose this certificate as default. When multiple + default certificates are defined, HAProxy is able to chose the right ECDSA or + RSA one depending on what the client supports. When no ALPN is set, the "bind" line's default one is used. If a "bind" line has no "no-alpn", "alpn" nor "npn" set, a default value will be used @@ -15581,6 +16139,25 @@ crt-list <file> cert2.pem [alpn h2,http/1.1] certW.pem *.domain.tld !secure.domain.tld certS.pem [curves X25519:P-256 ciphers ECDHE-ECDSA-AES256-GCM-SHA384] secure.domain.tld + default.pem.rsa * + default.pem.ecdsa * + +default-crt <cert> + This option does the same as the "crt" option, with the difference that this + certificate will be used as a default one. It is possible to add multiple + default certificates to have an ECDSA and an RSA one, having more is not + really useful. + + A default certificate is used when no "strict-sni" option is used on the bind + line. A default certificate is provided when the servername extension was not + used by the client, or when the servername does not match any configured + certificate. + + Example: + + bind *:443 default-crt foobar.pem.rsa default-crt foobar.pem.ecdsa crt website.pem.rsa + + See also the "crt" keyword. defer-accept Is an optional keyword which is supported only on certain Linux kernels. It @@ -15660,6 +16237,12 @@ group <group> "gid" setting except that the group name is used instead of its gid. This setting is ignored by non UNIX sockets. 
+guid-prefix <string> + Generate case-sensitive global unique IDs for each listening sockets + allocated on this bind line. Prefix will be concatenated to listeners + position index on the current bind line, with character '-' as separator. See + "guid" proxy keyword description for more information on its format. + id <id> Fixes the socket ID. By default, socket IDs are automatically assigned, but sometimes it is more convenient to fix them to ease monitoring. This value @@ -15843,87 +16426,6 @@ npn <protocols> at the time of writing this. It is possible to enable both NPN and ALPN though it probably doesn't make any sense out of testing. -ocsp-update [ off | on ] (crt-list only) - Enable automatic OCSP response update when set to 'on', disable it otherwise. - Its value defaults to 'off'. - Please note that for now, this option can only be used in a crt-list line, it - cannot be used directly on a bind line. It lies in this "Bind options" - section because it is still a frontend option. This limitation was set so - that the option applies to only one certificate at a time. - If a given certificate is used in multiple crt-lists with different values of - the 'ocsp-update' set, an error will be raised. Here is an example - configuration enabling it: - - haproxy.cfg: - frontend fe - bind :443 ssl crt-list haproxy.list - - haproxy.list: - server_cert.pem [ocsp-update on] foo.bar - - When the option is set to 'on', we will try to get an ocsp response whenever - an ocsp uri is found in the frontend's certificate. The only limitation of - this mode is that the certificate's issuer will have to be known in order for - the OCSP certid to be built. - Each OCSP response will be updated at least once an hour, and even more - frequently if a given OCSP response has an expire date earlier than this one - hour limit. A minimum update interval of 5 minutes will still exist in order - to avoid updating too often responses that have a really short expire time or - even no 'Next Update' at all. Because of this hard limit, please note that - when auto update is set to 'on' or 'auto', any OCSP response loaded during - init will not be updated until at least 5 minutes, even if its expire time - ends before now+5m. This should not be too much of a hassle since an OCSP - response must be valid when it gets loaded during init (its expire time must - be in the future) so it is unlikely that this response expires in such a - short time after init. - On the other hand, if a certificate has an OCSP uri specified and no OCSP - response, setting this option to 'on' for the given certificate will ensure - that the OCSP response gets fetched automatically right after init. - The default minimum and maximum delays (5 minutes and 1 hour respectively) - can be configured by the "tune.ssl.ocsp-update.maxdelay" and - "tune.ssl.ocsp-update.mindelay" global options. - - Whenever an OCSP response is updated by the auto update task or following a - call to the "update ssl ocsp-response" CLI command, a dedicated log line is - emitted. It follows a dedicated log-format that contains the following header - "%ci:%cp [%tr] %ft" and is followed by specific OCSP-related information: - - the path of the corresponding frontend certificate - - a numerical update status - - a textual update status - - the number of update failures for the given response - - the number of update successes for the givan response - See "show ssl ocsp-updates" CLI command for a full list of error codes and - error messages. 
This line is emitted regardless of the success or failure of - the concerned OCSP response update. - The OCSP request/response is sent and received through an http_client - instance that has the dontlog-normal option set and that uses the regular - HTTP log format in case of error (unreachable OCSP responder for instance). - If such an error occurs, another log line that contains HTTP-related - information will then be emitted alongside the "regular" OCSP one (which will - likely have "HTTP error" as text status). But if a purely HTTP error happens - (unreachable OCSP responder for instance), an extra log line that follows the - regular HTTP log-format will be emitted. - Here are two examples of such log lines, with a successful OCSP update log - line first and then an example of an HTTP error with the two different lines - (lines were spit and the URL was shortened for readability): - <134>Mar 6 11:16:53 haproxy[14872]: -:- [06/Mar/2023:11:16:52.808] \ - <OCSP-UPDATE> /path_to_cert/foo.pem 1 "Update successful" 0 1 - - <134>Mar 6 11:18:55 haproxy[14872]: -:- [06/Mar/2023:11:18:54.207] \ - <OCSP-UPDATE> /path_to_cert/bar.pem 2 "HTTP error" 1 0 - <134>Mar 6 11:18:55 haproxy[14872]: -:- [06/Mar/2023:11:18:52.200] \ - <OCSP-UPDATE> -/- 2/0/-1/-1/3009 503 217 - - SC-- 0/0/0/0/3 0/0 {} \ - "GET http://127.0.0.1:12345/MEMwQT HTTP/1.1" - - Troubleshooting: - A common error that can happen with let's encrypt certificates is if the DNS - resolution provides an IPv6 address and your system does not have a valid - outgoing IPv6 route. In such a case, you can either create the appropriate - route or set the "httpclient.resolvers.prefer ipv4" option in the global - section. - In case of "OCSP response check failure" error, you might want to check that - the issuer certificate that you provided is valid. - prefer-client-ciphers Use the client's preference when selecting the cipher suite, by default the server's preference is enforced. This option is also available on @@ -16053,9 +16555,9 @@ ssl-min-ver [ SSLv3 | TLSv1.0 | TLSv1.1 | TLSv1.2 | TLSv1.3 ] strict-sni This setting is only available when support for OpenSSL was built in. The - SSL/TLS negotiation is allow only if the client provided an SNI which match + SSL/TLS negotiation is allowed only if the client provided an SNI that matches a certificate. The default certificate is not used. This option also allows - to start without any certificate on a bind line, so an empty directory could + starting without any certificate on a bind line, so an empty directory could be used and filled later from the stats socket. See the "crt" option for more information. See "add ssl crt-list" command in the management guide. @@ -16144,7 +16646,7 @@ thread [<thread-group>/]<thread-set>[,...] lines and their assignment to multiple groups of threads. This keyword is compatible with reverse HTTP binds. However, it is forbidden - to specify a thread set which spans accross several thread groups for such a + to specify a thread set which spans across several thread groups for such a listener as this may caused "nbconn" to not work as intended. tls-ticket-keys <keyfile> @@ -16233,6 +16735,8 @@ keywords, except "id" which is only supported by "server". The currently supported settings are the following ones. addr <ipv4|ipv6> + May be used in the following contexts: tcp, http, log + Using the "addr" parameter, it becomes possible to use a different IP address to send health-checks or to probe the agent-check. 
On some servers, it may be desirable to dedicate an IP address to specific component able to perform @@ -16241,6 +16745,8 @@ addr <ipv4|ipv6> "port" parameter. agent-check + May be used in the following contexts: tcp, http, log + Enable an auxiliary agent check which is run independently of a regular health check. An agent health check is performed by making a TCP connection to the port set by the "agent-port" parameter and reading an ASCII string @@ -16302,6 +16808,8 @@ agent-check and "no-agent-check" parameters. agent-send <string> + May be used in the following contexts: tcp, http, log + If this option is specified, HAProxy will send the given string (verbatim) to the agent server upon connection. You could, for example, encode the backend name into this string, which would enable your agent to send @@ -16309,6 +16817,8 @@ agent-send <string> you want to terminate your request with a newline. agent-inter <delay> + May be used in the following contexts: tcp, http, log + The "agent-inter" parameter sets the interval between two agent checks to <delay> milliseconds. If left unspecified, the delay defaults to 2000 ms. @@ -16325,6 +16835,8 @@ agent-inter <delay> See also the "agent-check" and "agent-port" parameters. agent-addr <addr> + May be used in the following contexts: tcp, http, log + The "agent-addr" parameter sets address for agent check. You can offload agent-check to another target, so you can make single place @@ -16333,16 +16845,22 @@ agent-addr <addr> hostname, it will be resolved. agent-port <port> + May be used in the following contexts: tcp, http, log + The "agent-port" parameter sets the TCP port used for agent checks. See also the "agent-check" and "agent-inter" parameters. allow-0rtt + May be used in the following contexts: tcp, http, log, peers, ring + Allow sending early data to the server when using TLS 1.3. Note that early data will be sent only if the client used early data, or if the backend uses "retry-on" with the "0rtt-rejected" keyword. alpn <protocols> + May be used in the following contexts: tcp, http + This enables the TLS ALPN extension and advertises the specified protocol list as supported on top of ALPN. The protocol list consists in a comma- delimited list of protocol names, for instance: "http/1.1,http/1.0" (without @@ -16359,6 +16877,8 @@ alpn <protocols> See also "ws" to use an alternative ALPN for websocket streams. backup + May be used in the following contexts: tcp, http, log + When "backup" is present on a server line, the server is only used in load balancing when all other non-backup servers are unavailable. Requests coming with a persistence cookie referencing the server will always be served @@ -16367,6 +16887,8 @@ backup "allbackups" options. ca-file <cafile> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. It designates a PEM file from which to load CA certificates used to verify server's certificate. It is possible to load a directory containing multiple @@ -16378,6 +16900,8 @@ ca-file <cafile> overwritten by setting the SSL_CERT_DIR environment variable. check + May be used in the following contexts: tcp, http, log + This option enables health checks on a server: - when not set, no health checking is performed, and the server is always considered available. 
@@ -16435,6 +16959,8 @@ check server s1 192.168.0.1:443 ssl check check-send-proxy + May be used in the following contexts: tcp, http + This option forces emission of a PROXY protocol line with outgoing health checks, regardless of whether the server uses send-proxy or not for the normal traffic. By default, the PROXY protocol is enabled for health checks @@ -16444,11 +16970,15 @@ check-send-proxy protocol. See also the "send-proxy" option for more information. check-alpn <protocols> + May be used in the following contexts: tcp, http + Defines which protocols to advertise with ALPN. The protocol list consists in a comma-delimited list of protocol names, for instance: "http/1.1,http/1.0" (without quotes). If it is not set, the server ALPN is used. check-proto <name> + May be used in the following contexts: tcp, http + Forces the multiplexer's protocol to use for the server's health-check connections. It must be compatible with the health-check type (TCP or HTTP). It must also be usable on the backend side. The list of available @@ -16472,11 +17002,15 @@ check-proto <name> If not defined, the server one will be used, if set. check-sni <sni> + May be used in the following contexts: tcp, http, log + This option allows you to specify the SNI to be used when doing health checks over SSL. It is only possible to use a string to set <sni>. If you want to set a SNI for proxied traffic, see "sni". check-ssl + May be used in the following contexts: tcp, http, log + This option forces encryption of all health checks over SSL, regardless of whether the server uses SSL or not for the normal traffic. This is generally used when an explicit "port" or "addr" directive is specified and SSL health @@ -16489,11 +17023,15 @@ check-ssl this option. check-via-socks4 + May be used in the following contexts: tcp, http, log + This option enables outgoing health checks using upstream socks4 proxy. By default, the health checks won't go through socks tunnel even it was enabled for normal traffic. ciphers <ciphers> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. This option sets the string describing the list of cipher algorithms that is negotiated during the SSL/TLS handshake with the server. The format of the @@ -16504,6 +17042,8 @@ ciphers <ciphers> cipher configuration, please check the "ciphersuites" keyword. ciphersuites <ciphersuites> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in and OpenSSL 1.1.1 or later was used to build HAProxy. This option sets the string describing the list of cipher algorithms that is negotiated during the TLS @@ -16513,6 +17053,8 @@ ciphersuites <ciphersuites> keyword. client-sigalgs <sigalgs> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. It sets the string describing the list of signature algorithms related to client authentication that are negotiated . The format of the string is defined in @@ -16520,6 +17062,8 @@ client-sigalgs <sigalgs> recommended to use this setting if no specific usecase was identified. cookie <value> + May be used in the following contexts: http + The "cookie" parameter sets the cookie value assigned to the server to <value>. This value will be checked in incoming requests, and the first operational server possessing the same value will be selected. 
In return, in @@ -16529,11 +17073,15 @@ cookie <value> backup servers. See also the "cookie" keyword in backend section. crl-file <crlfile> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. It designates a PEM file from which to load certificate revocation list used to verify server's certificate. crt <cert> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. It designates a PEM file from which to load both a certificate and the associated private key. This file can be built by concatenating both PEM @@ -16545,6 +17093,8 @@ crt <cert> option is set accordingly). curves <curves> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. It sets the string describing the list of elliptic curves algorithms ("curve suite") that are negotiated during the SSL/TLS handshake with ECDHE. The format of the @@ -16552,6 +17102,8 @@ curves <curves> Example: "X25519:P-256" (without quote) disabled + May be used in the following contexts: tcp, http, log + The "disabled" keyword starts the server in the "disabled" state. That means that it is marked down in maintenance mode, and no connection other than the ones allowed by persist mode will reach it. It is very well suited to setup @@ -16560,6 +17112,8 @@ disabled See also "enabled" setting. enabled + May be used in the following contexts: tcp, http, log + This option may be used as 'server' setting to reset any 'disabled' setting which would have been inherited from 'default-server' directive as default value. @@ -16567,6 +17121,8 @@ enabled 'default-server' 'disabled' setting. error-limit <count> + May be used in the following contexts: tcp, http, log + If health observing is enabled, the "error-limit" parameter specifies the number of consecutive errors that triggers event selected by the "on-error" option. By default it is set to 10 consecutive errors. @@ -16574,42 +17130,63 @@ error-limit <count> See also the "check", "error-limit" and "on-error". fall <count> + May be used in the following contexts: tcp, http, log + The "fall" parameter states that a server will be considered as dead after <count> consecutive unsuccessful health checks. This value defaults to 3 if unspecified. See also the "check", "inter" and "rise" parameters. force-sslv3 + May be used in the following contexts: tcp, http, log, peers, ring + This option enforces use of SSLv3 only when SSL is used to communicate with the server. SSLv3 is generally less expensive than the TLS counterparts for high connection rates. This option is also available on global statement "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". force-tlsv10 + May be used in the following contexts: tcp, http, log, peers, ring + This option enforces use of TLSv1.0 only when SSL is used to communicate with the server. This option is also available on global statement "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". force-tlsv11 + May be used in the following contexts: tcp, http, log, peers, ring + This option enforces use of TLSv1.1 only when SSL is used to communicate with the server. This option is also available on global statement "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". 
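
  As an illustration, a minimal sketch pinning the TLS version used towards a
  server with the "ssl-min-ver"/"ssl-max-ver" pair rather than a "force-tlsv*"
  keyword; the backend name and address are hypothetical:

      backend be_legacy
          server app1 192.0.2.20:443 ssl verify none ssl-min-ver TLSv1.2 ssl-max-ver TLSv1.2
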
force-tlsv12 + May be used in the following contexts: tcp, http, log, peers, ring + This option enforces use of TLSv1.2 only when SSL is used to communicate with the server. This option is also available on global statement "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". force-tlsv13 + May be used in the following contexts: tcp, http, log, peers, ring + This option enforces use of TLSv1.3 only when SSL is used to communicate with the server. This option is also available on global statement "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". +guid <string> + Specify a case-sensitive global unique ID for this server. This must be + unique across all haproxy configuration on every object types. See "guid" + proxy keyword description for more information on its format. + id <value> + May be used in the following contexts: tcp, http, log + Set a persistent ID for the server. This ID must be positive and unique for the proxy. An unused ID will automatically be assigned if unset. The first assigned value will be 1. This ID is currently only returned in statistics. init-addr {last | libc | none | <ip>},[...]* + May be used in the following contexts: tcp, http, log + Indicate in what order the server's address should be resolved upon startup if it uses an FQDN. Attempts are made to resolve the address by applying in turn each of the methods mentioned in the comma-delimited list. The first @@ -16639,6 +17216,8 @@ init-addr {last | libc | none | <ip>},[...]* inter <delay> fastinter <delay> downinter <delay> + May be used in the following contexts: tcp, http, log + The "inter" parameter sets the interval between two consecutive health checks to <delay> milliseconds. If left unspecified, the delay defaults to 2000 ms. It is also possible to use "fastinter" and "downinter" to optimize delays @@ -16674,6 +17253,8 @@ downinter <delay> reduce the time spent in the queue. log-bufsize <bufsize> + May be used in the following contexts: log + The "log-bufsize" specifies the ring bufsize to use for the implicit ring that will be associated to the log server in a log backend. When not specified, this defaults to BUFSIZE. Use of a greater value will increase @@ -16682,12 +17263,16 @@ log-bufsize <bufsize> This keyword may only be used in log backend sections (with "mode log") log-proto <logproto> + May be used in the following contexts: log, ring + The "log-proto" specifies the protocol used to forward event messages to a server configured in a log or ring section. Possible values are "legacy" and "octet-count" corresponding respectively to "Non-transparent-framing" and "Octet counting" in rfc6587. "legacy" is the default. maxconn <maxconn> + May be used in the following contexts: tcp, http + The "maxconn" parameter specifies the maximal number of concurrent connections that will be sent to this server. If the number of incoming concurrent connections goes higher than this value, they will be queued, @@ -16704,6 +17289,8 @@ maxconn <maxconn> than 50 concurrent requests. maxqueue <maxqueue> + May be used in the following contexts: tcp, http + The "maxqueue" parameter specifies the maximal number of connections which will wait in the queue for this server. If this limit is reached, next requests will be redispatched to other servers instead of indefinitely @@ -16717,6 +17304,8 @@ maxqueue <maxqueue> and "balance leastconn". 
max-reuse <count> + May be used in the following contexts: http + The "max-reuse" argument indicates the HTTP connection processors that they should not reuse a server connection more than this number of times to send new requests. Permitted values are -1 (the default), which disables this @@ -16727,6 +17316,8 @@ max-reuse <count> enforce. At least HTTP/2 connections to servers will respect it. minconn <minconn> + May be used in the following contexts: tcp, http + When the "minconn" parameter is set, the maxconn limit becomes a dynamic limit following the backend's load. The server will always accept at least <minconn> connections, never more than <maxconn>, and the limit will be on @@ -16737,12 +17328,16 @@ minconn <minconn> and "maxqueue" parameters, as well as the "fullconn" backend keyword. namespace <name> + May be used in the following contexts: tcp, http, log, peers, ring + On Linux, it is possible to specify which network namespace a socket will belong to. This directive makes it possible to explicitly bind a server to a namespace different from the default one. Please refer to your operating system's documentation to find more details about network namespaces. no-agent-check + May be used in the following contexts: tcp, http, log + This option may be used as "server" setting to reset any "agent-check" setting which would have been inherited from "default-server" directive as default value. @@ -16750,6 +17345,8 @@ no-agent-check "default-server" "agent-check" setting. no-backup + May be used in the following contexts: tcp, http, log + This option may be used as "server" setting to reset any "backup" setting which would have been inherited from "default-server" directive as default value. @@ -16757,6 +17354,8 @@ no-backup "default-server" "backup" setting. no-check + May be used in the following contexts: tcp, http, log + This option may be used as "server" setting to reset any "check" setting which would have been inherited from "default-server" directive as default value. @@ -16764,6 +17363,8 @@ no-check "default-server" "check" setting. no-check-ssl + May be used in the following contexts: tcp, http, log + This option may be used as "server" setting to reset any "check-ssl" setting which would have been inherited from "default-server" directive as default value. @@ -16771,6 +17372,8 @@ no-check-ssl "default-server" "check-ssl" setting. no-send-proxy + May be used in the following contexts: tcp, http + This option may be used as "server" setting to reset any "send-proxy" setting which would have been inherited from "default-server" directive as default value. @@ -16778,6 +17381,8 @@ no-send-proxy "default-server" "send-proxy" setting. no-send-proxy-v2 + May be used in the following contexts: tcp, http + This option may be used as "server" setting to reset any "send-proxy-v2" setting which would have been inherited from "default-server" directive as default value. @@ -16785,6 +17390,8 @@ no-send-proxy-v2 "default-server" "send-proxy-v2" setting. no-send-proxy-v2-ssl + May be used in the following contexts: tcp, http + This option may be used as "server" setting to reset any "send-proxy-v2-ssl" setting which would have been inherited from "default-server" directive as default value. @@ -16792,6 +17399,8 @@ no-send-proxy-v2-ssl "default-server" "send-proxy-v2-ssl" setting. 
no-send-proxy-v2-ssl-cn + May be used in the following contexts: tcp, http + This option may be used as "server" setting to reset any "send-proxy-v2-ssl-cn" setting which would have been inherited from "default-server" directive as default value. @@ -16799,6 +17408,8 @@ no-send-proxy-v2-ssl-cn "default-server" "send-proxy-v2-ssl-cn" setting. no-ssl + May be used in the following contexts: tcp, http, log, peers, ring + This option may be used as "server" setting to reset any "ssl" setting which would have been inherited from "default-server" directive as default value. @@ -16810,12 +17421,16 @@ no-ssl runtime API: see `set server` commands in management doc. no-ssl-reuse + May be used in the following contexts: tcp, http, log, peers, ring + This option disables SSL session reuse when SSL is used to communicate with the server. It will force the server to perform a full handshake for every new connection. It's probably only useful for benchmarking, troubleshooting, and for paranoid users. no-sslv3 + May be used in the following contexts: tcp, http, log, peers, ring + This option disables support for SSLv3 when SSL is used to communicate with the server. Note that SSLv2 is disabled in the code and cannot be enabled using any configuration option. Use "ssl-min-ver" and "ssl-max-ver" instead. @@ -16823,6 +17438,8 @@ no-sslv3 Supported in default-server: No no-tls-tickets + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. It disables the stateless session resumption (RFC 5077 TLS Ticket extension) and force to use stateful session resumption. Stateless @@ -16834,6 +17451,8 @@ no-tls-tickets See also "tls-tickets". no-tlsv10 + May be used in the following contexts: tcp, http, log, peers, ring + This option disables support for TLSv1.0 when SSL is used to communicate with the server. Note that SSLv2 is disabled in the code and cannot be enabled using any configuration option. TLSv1 is more expensive than SSLv3 so it @@ -16844,6 +17463,8 @@ no-tlsv10 Supported in default-server: No no-tlsv11 + May be used in the following contexts: tcp, http, log, peers, ring + This option disables support for TLSv1.1 when SSL is used to communicate with the server. Note that SSLv2 is disabled in the code and cannot be enabled using any configuration option. TLSv1 is more expensive than SSLv3 so it @@ -16854,6 +17475,8 @@ no-tlsv11 Supported in default-server: No no-tlsv12 + May be used in the following contexts: tcp, http, log, peers, ring + This option disables support for TLSv1.2 when SSL is used to communicate with the server. Note that SSLv2 is disabled in the code and cannot be enabled using any configuration option. TLSv1 is more expensive than SSLv3 so it @@ -16864,6 +17487,8 @@ no-tlsv12 Supported in default-server: No no-tlsv13 + May be used in the following contexts: tcp, http, log, peers, ring + This option disables support for TLSv1.3 when SSL is used to communicate with the server. Note that SSLv2 is disabled in the code and cannot be enabled using any configuration option. TLSv1 is more expensive than SSLv3 so it @@ -16874,6 +17499,8 @@ no-tlsv13 Supported in default-server: No no-verifyhost + May be used in the following contexts: tcp, http, log, peers, ring + This option may be used as "server" setting to reset any "verifyhost" setting which would have been inherited from "default-server" directive as default value. @@ -16881,6 +17508,8 @@ no-verifyhost "default-server" "verifyhost" setting. 
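  A minimal sketch of SSL settings inherited from "default-server" and
  selectively reset on one server; the CA file path is arbitrary and, as
  recommended above, "ssl-min-ver" is preferred over the "no-tlsvXX"
  keywords :

  Example :
     backend app-tls
        default-server ssl verify required ca-file /etc/haproxy/ca.pem check check-ssl
        server s1 203.0.113.10:443 ssl-min-ver TLSv1.2 no-tls-tickets
        # this server is reached in clear text, so the inherited SSL
        # settings are reset
        server s2 203.0.113.11:8080 no-ssl no-check-ssl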
no-tfo + May be used in the following contexts: tcp, http, log, peers, ring + This option may be used as "server" setting to reset any "tfo" setting which would have been inherited from "default-server" directive as default value. @@ -16888,11 +17517,15 @@ no-tfo "default-server" "tfo" setting. non-stick + May be used in the following contexts: tcp, http + Never add connections allocated to this sever to a stick-table. This may be used in conjunction with backup to ensure that stick-table persistence is disabled for backup servers. npn <protocols> + May be used in the following contexts: tcp, http + This enables the NPN TLS extension and advertises the specified protocol list as supported on top of NPN. The protocol list consists in a comma-delimited list of protocol names, for instance: "http/1.1,http/1.0" (without quotes). @@ -16902,6 +17535,8 @@ npn <protocols> only available starting with OpenSSL 1.0.2. observe <mode> + May be used in the following contexts: tcp, http + This option enables health adjusting based on observing communication with the server. By default this functionality is disabled and enabling it also requires to enable health checks. There are two supported modes: "layer4" and @@ -16913,6 +17548,8 @@ observe <mode> See also the "check", "on-error" and "error-limit". on-error <mode> + May be used in the following contexts: tcp, http, log + Select what should happen when enough consecutive errors are detected. Currently, four modes are available: - fastinter: force fastinter @@ -16924,6 +17561,8 @@ on-error <mode> See also the "check", "observe" and "error-limit". on-marked-down <action> + May be used in the following contexts: tcp, http, log + Modify what occurs when a server is marked down. Currently one action is available: - shutdown-sessions: Shutdown peer streams. When this setting is enabled, @@ -16938,6 +17577,8 @@ on-marked-down <action> Actions are disabled by default on-marked-up <action> + May be used in the following contexts: tcp, http, log + Modify what occurs when a server is marked up. Currently one action is available: - shutdown-backup-sessions: Shutdown streams on all backup servers. This is @@ -16951,7 +17592,25 @@ on-marked-up <action> Actions are disabled by default +pool-conn-name <expr> + May be used in the following contexts: http + + When a backend connection is established, this expression is evaluated to + generate the connection name. This name is one of the key properties of the + connection in the idle server pool. See the "http-reuse" keyword. When a + request looks up an existing idle connection, this expression is evaluated to + match an identical connection. + + In context where SSL SNI is used for backend connection, the connection name + is automatically assigned to the result of the "sni" expression. This suits + the most common usage. For more advanced setup, "pool-conn-name" may be used + to override this. + + See also: "http-reuse", "sni" + pool-low-conn <max> + May be used in the following contexts: http + Set a low threshold on the number of idling connections for a server, below which a thread will not try to steal a connection from another thread. This can be useful to improve CPU usage patterns in scenarios involving many very @@ -16968,6 +17627,8 @@ pool-low-conn <max> connection reuse rate will decrease as thread count increases. pool-max-conn <max> + May be used in the following contexts: http + Set the maximum number of idling connections for a server. -1 means unlimited connections, 0 means no idle connections. 
The default is -1. When idle connections are enabled, orphaned idle connections which do not belong to any @@ -16976,11 +17637,15 @@ pool-max-conn <max> according to the same principles as those applying to "http-reuse". pool-purge-delay <delay> + May be used in the following contexts: http + Sets the delay to start purging idle connections. Each <delay> interval, half of the idle connections are closed. 0 means we don't keep any idle connection. The default is 5s. port <port> + May be used in the following contexts: tcp, http, log + Using the "port" parameter, it becomes possible to use a different port to send health-checks or to probe the agent-check. On some servers, it may be desirable to dedicate a port to a specific component able to perform complex @@ -16989,6 +17654,8 @@ port <port> ignored if the "check" parameter is not set. See also the "addr" parameter. proto <name> + May be used in the following contexts: tcp, http + Forces the multiplexer's protocol to use for the outgoing connections to this server. It must be compatible with the mode of the backend (TCP or HTTP). It must also be usable on the backend side. The list of available protocols is @@ -17013,6 +17680,8 @@ proto <name> See also "ws" to use an alternative protocol for websocket streams. redir <prefix> + May be used in the following contexts: http + The "redir" parameter enables the redirection mode for all GET and HEAD requests addressing this server. This means that instead of having HAProxy forward the request to the server, it will send an "HTTP 302" response with @@ -17031,11 +17700,15 @@ redir <prefix> Example : server srv1 192.168.1.1:80 redir http://image1.mydomain.com check rise <count> + May be used in the following contexts: tcp, http, log + The "rise" parameter states that a server will be considered as operational after <count> consecutive successful health checks. This value defaults to 2 if unspecified. See also the "check", "inter" and "fall" parameters. resolve-opts <option>,<option>,... + May be used in the following contexts: tcp, http, log + Comma separated list of options to apply to DNS resolution linked to this server. @@ -17075,6 +17748,8 @@ resolve-opts <option>,<option>,... Default value: not set resolve-prefer <family> + May be used in the following contexts: tcp, http, log + When DNS resolution is enabled for a server and multiple IP addresses from different families are returned, HAProxy will prefer using an IP address from the family mentioned in the "resolve-prefer" parameter. @@ -17087,6 +17762,8 @@ resolve-prefer <family> server s1 app1.domain.com:80 resolvers mydns resolve-prefer ipv6 resolve-net <network>[,<network[,...]] + May be used in the following contexts: tcp, http, log + This option prioritizes the choice of an ip address matching a network. This is useful with clouds to prefer a local ip. In some cases, a cloud high availability service can be announced with many ip addresses on many @@ -17099,6 +17776,8 @@ resolve-net <network>[,<network[,...]] server s1 app1.domain.com:80 resolvers mydns resolve-net 10.0.0.0/8 resolvers <id> + May be used in the following contexts: tcp, http, log + Points to an existing "resolvers" section to resolve current server's hostname. @@ -17109,6 +17788,8 @@ resolvers <id> See also section 5.3 send-proxy + May be used in the following contexts: tcp, http + The "send-proxy" parameter enforces use of the PROXY protocol over any connection established to this server. 
The PROXY protocol informs the other end about the layer 3/4 addresses of the incoming connection, so that it can @@ -17127,6 +17808,8 @@ send-proxy "accept-netscaler-cip" option of the "bind" keyword. send-proxy-v2 + May be used in the following contexts: tcp, http + The "send-proxy-v2" parameter enforces use of the PROXY protocol version 2 over any connection established to this server. The PROXY protocol informs the other end about the layer 3/4 addresses of the incoming connection, so @@ -17137,6 +17820,8 @@ send-proxy-v2 this section and send-proxy" option of the "bind" keyword. set-proxy-v2-tlv-fmt(<id>) <fmt> + May be used in the following contexts: tcp, http + The "set-proxy-v2-tlv-fmt" parameter is used to send arbitrary PROXY protocol version 2 TLVs. For the type (<id>) range of the defined TLV type please refer to section 2.2.8. of the proxy protocol specification. However, the value can @@ -17153,6 +17838,8 @@ set-proxy-v2-tlv-fmt(<id>) <fmt> of a newly created TLV that also has the type 0x20. proxy-v2-options <option>[,<option>]* + May be used in the following contexts: tcp, http + The "proxy-v2-options" parameter add options to send in PROXY protocol version 2 when "send-proxy-v2" is used. Options available are: @@ -17172,6 +17859,8 @@ proxy-v2-options <option>[,<option>]* within a Keep-Alive connection. send-proxy-v2-ssl + May be used in the following contexts: tcp, http + The "send-proxy-v2-ssl" parameter enforces use of the PROXY protocol version 2 over any connection established to this server. The PROXY protocol informs the other end about the layer 3/4 addresses of the incoming connection, so @@ -17183,6 +17872,8 @@ send-proxy-v2-ssl "send-proxy-v2" option of the "bind" keyword. send-proxy-v2-ssl-cn + May be used in the following contexts: tcp, http + The "send-proxy-v2-ssl" parameter enforces use of the PROXY protocol version 2 over any connection established to this server. The PROXY protocol informs the other end about the layer 3/4 addresses of the incoming connection, so @@ -17195,6 +17886,8 @@ send-proxy-v2-ssl-cn the "send-proxy-v2" option of the "bind" keyword. shard <shard> + May be used in the following contexts: peers + This parameter in used only in the context of stick-tables synchronisation with peers protocol. The "shard" parameter identifies the peers which will receive all the stick-table updates for keys with this shard as distribution @@ -17213,6 +17906,8 @@ shard <shard> peer D 127.0.0.1:40004 shard 3 sigalgs <sigalgs> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. It sets the string describing the list of signature algorithms that are negotiated during the TLSv1.2 and TLSv1.3 handshake. The format of the string is defined @@ -17221,6 +17916,8 @@ sigalgs <sigalgs> required. slowstart <start_time_in_ms> + May be used in the following contexts: tcp, http + The "slowstart" parameter for a server accepts a value in milliseconds which indicates after how long a server which has just come back up will run at full speed. Just as with every other time-based parameter, it can be entered @@ -17241,6 +17938,8 @@ slowstart <start_time_in_ms> seen as failed. sni <expression> + May be used in the following contexts: tcp, http, log, peers, ring + The "sni" parameter evaluates the sample fetch expression, converts it to a string and uses the result as the host name sent in the SNI TLS extension to the server. 
A typical use case is to send the SNI received from the client in @@ -17253,9 +17952,14 @@ sni <expression> "verify" directive for more details. If you want to set a SNI for health checks, see the "check-sni" directive for more details. + By default, the SNI is assigned to the connection name for "http-reuse", + unless overriden by the "pool-conn-name" server keyword. + source <addr>[:<pl>[-<ph>]] [usesrc { <addr2>[:<port2>] | client | clientip } ] source <addr>[:<port>] [usesrc { <addr2>[:<port2>] | hdr_ip(<hdr>[,<occ>]) } ] source <addr>[:<pl>[-<ph>]] [interface <name>] ... + May be used in the following contexts: tcp, http, log, peers, ring + The "source" parameter sets the source address which will be used when connecting to the server. It follows the exact same parameters and principle as the backend "source" keyword, except that it only applies to the server @@ -17273,6 +17977,8 @@ source <addr>[:<pl>[-<ph>]] [interface <name>] ... specifying the source address without port(s). ssl + May be used in the following contexts: tcp, http, log, peers, ring + This option enables SSL ciphering on outgoing connections to the server. It is critical to verify server certificates using "verify" when using SSL to connect to servers, otherwise the communication is prone to trivial man in @@ -17283,16 +17989,22 @@ ssl SSL health checks. ssl-max-ver [ SSLv3 | TLSv1.0 | TLSv1.1 | TLSv1.2 | TLSv1.3 ] + May be used in the following contexts: tcp, http, log, peers, ring + This option enforces use of <version> or lower when SSL is used to communicate with the server. This option is also available on global statement "ssl-default-server-options". See also "ssl-min-ver". ssl-min-ver [ SSLv3 | TLSv1.0 | TLSv1.1 | TLSv1.2 | TLSv1.3 ] + May be used in the following contexts: tcp, http, log, peers, ring + This option enforces use of <version> or upper when SSL is used to communicate with the server. This option is also available on global statement "ssl-default-server-options". See also "ssl-max-ver". ssl-reuse + May be used in the following contexts: tcp, http, log, peers, ring + This option may be used as "server" setting to reset any "no-ssl-reuse" setting which would have been inherited from "default-server" directive as default value. @@ -17300,6 +18012,8 @@ ssl-reuse "default-server" "no-ssl-reuse" setting. stick + May be used in the following contexts: tcp, http + This option may be used as "server" setting to reset any "non-stick" setting which would have been inherited from "default-server" directive as default value. @@ -17307,11 +18021,15 @@ stick "default-server" "non-stick" setting. socks4 <addr>:<port> + May be used in the following contexts: tcp, http, log, peers, ring + This option enables upstream socks4 tunnel for outgoing connections to the server. Using this option won't force the health check to go via socks4 by default. You will have to use the keyword "check-via-socks4" to enable it. tcp-ut <delay> + May be used in the following contexts: tcp, http, log, peers, ring + Sets the TCP User Timeout for all outgoing connections to this server. This option is available on Linux since version 2.6.37. It allows HAProxy to configure a timeout for sockets which contain data not receiving an @@ -17327,6 +18045,8 @@ tcp-ut <delay> regular TCP connections, and is ignored for other protocols. tfo + May be used in the following contexts: tcp, http, log, peers, ring + This option enables using TCP fast open when connecting to servers, on systems that support it (currently only the Linux kernel >= 4.11). 
See the "tfo" bind option for more information about TCP fast open. @@ -17335,6 +18055,8 @@ tfo won't be able to retry the connection on failure. See also "no-tfo". track [<backend>/]<server> + May be used in the following contexts: tcp, http, log + This option enables ability to set the current state of the server by tracking another one. It is possible to track a server which itself tracks another server, provided that at the end of the chain, a server has health checks @@ -17342,6 +18064,8 @@ track [<backend>/]<server> used, it has to be enabled on both proxies. tls-tickets + May be used in the following contexts: tcp, http, log, peers, ring + This option may be used as "server" setting to reset any "no-tls-tickets" setting which would have been inherited from "default-server" directive as default value. @@ -17352,6 +18076,8 @@ tls-tickets "default-server" "no-tls-tickets" setting. verify [none|required] + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in. If set to 'none', server certificate is not verified. In the other case, The certificate provided by the server is verified using CAs from 'ca-file' and @@ -17367,6 +18093,8 @@ verify [none|required] the global section, "verify" is set to "required" by default. verifyhost <hostname> + May be used in the following contexts: tcp, http, log, peers, ring + This setting is only available when support for OpenSSL was built in, and only takes effect if 'verify required' is also specified. This directive sets a default static hostname to check the server's certificate against when no @@ -17378,6 +18106,8 @@ verifyhost <hostname> include wildcards. See also "verify", "sni" and "no-verifyhost" options. weight <weight> + May be used in the following contexts: tcp, http + The "weight" parameter is used to adjust the server's weight relative to other servers. All servers will receive a load proportional to their weight relative to the sum of all weights, so the higher the weight, the higher the @@ -17389,6 +18119,8 @@ weight <weight> room above and below for later adjustments. ws { auto | h1 | h2 } + May be used in the following contexts: http + This option allows to configure the protocol used when relaying websocket streams. This is most notably useful when using an HTTP/2 backend without the support for H2 websockets through the RFC8441. @@ -17856,25 +18588,27 @@ The ACL engine can match these types against patterns of the following types : The following ACL flags are currently supported : -i : ignore case during matching of all subsequent patterns. - -f : load patterns from a file. + -f : load patterns from a list. -m : use a specific pattern matching method -n : forbid the DNS resolutions - -M : load the file pointed by -f like a map file. + -M : load the file pointed by -f like a map. -u : force the unique id of the ACL -- : force end of flags. Useful when a string looks like one of the flags. -The "-f" flag is followed by the name of a file from which all lines will be -read as individual values. It is even possible to pass multiple "-f" arguments -if the patterns are to be loaded from multiple files. Empty lines as well as -lines beginning with a sharp ('#') will be ignored. All leading spaces and tabs -will be stripped. If it is absolutely necessary to insert a valid pattern -beginning with a sharp, just prefix it with a space so that it is not taken for -a comment. 
Depending on the data type and match method, HAProxy may load the -lines into a binary tree, allowing very fast lookups. This is true for IPv4 and -exact string matching. In this case, duplicates will automatically be removed. - -The "-M" flag allows an ACL to use a map file. If this flag is set, the file is -parsed as two column file. The first column contains the patterns used by the +The "-f" flag is followed by the name that must follow the format described in +2.7. about name format for maps and ACLs. It is even possible to pass multiple +"-f" arguments if the patterns are to be loaded from multiple lists. if an +existing file is referenced, all lines will be read as individual values. Empty +lines as well as lines beginning with a sharp ('#') will be ignored. All +leading spaces and tabs will be stripped. If it is absolutely necessary to +insert a valid pattern beginning with a sharp, just prefix it with a space so +that it is not taken for a comment. Depending on the data type and match +method, HAProxy may load the lines into a binary tree, allowing very fast +lookups. This is true for IPv4 and exact string matching. In this case, +duplicates will automatically be removed. + +The "-M" flag allows an ACL to use a map. If this flag is set, the list is +parsed as two column entries. The first column contains the patterns used by the ACL, and the second column contain the samples. The sample can be used later by a map. This can be useful in some rare cases where an ACL would just be used to check for the existence of a pattern in a map before a mapping is applied. @@ -18362,6 +19096,7 @@ The following keywords are supported: add(value) integer integer add_item(delim,[var][,suff]]) string string aes_gcm_dec(bits,nonce,key,aead_tag) binary binary +aes_gcm_enc(bits,nonce,key,aead_tag) binary binary and(value) integer integer b64dec string binary base64 binary string @@ -18560,6 +19295,18 @@ aes_gcm_dec(<bits>,<nonce>,<key>,<aead_tag>) http-response set-header X-Decrypted-Text %[var(txn.enc),\ aes_gcm_dec(128,txn.nonce,Zm9vb2Zvb29mb29wZm9vbw==,txn.aead_tag)] +aes_gcm_enc(<bits>,<nonce>,<key>,<aead_tag>) + Encrypts the raw byte input using the AES128-GCM, AES192-GCM or + AES256-GCM algorithm, depending on the <bits> parameter. <nonce> and <key> + parameters must be base64 encoded. Last parameter, <aead_tag>, must be a + variable. The AEAD tag will be stored base64 encoded into that variable. + The returned result is in raw byte format. The <nonce> and <key> can either + be strings or variables. This converter requires at least OpenSSL 1.0.1. + + Example: + http-response set-header X-Encrypted-Text %[var(txn.plain),\ + aes_gcm_enc(128,txn.nonce,Zm9vb2Zvb29mb29wZm9vbw==,txn.aead_tag)] + and(<value>) Performs a bitwise "AND" between <value> and the input value of type signed integer, and returns the result as an signed integer. <value> can be a @@ -19109,17 +19856,18 @@ ltrim(<chars>) Skips any characters from <chars> from the beginning of the string representation of the input sample. -map(<map_file>[,<default_value>]) -map_<match_type>(<map_file>[,<default_value>]) -map_<match_type>_<output_type>(<map_file>[,<default_value>]) - Search the input value from <map_file> using the <match_type> matching method, - and return the associated value converted to the type <output_type>. If the - input value cannot be found in the <map_file>, the converter returns the - <default_value>. If the <default_value> is not set, the converter fails and - acts as if no input value could be fetched. 
If the <match_type> is not set, it - defaults to "str". Likewise, if the <output_type> is not set, it defaults to - "str". For convenience, the "map" keyword is an alias for "map_str" and maps a - string to another string. +map(<map_name>[,<default_value>]) +map_<match_type>(<map_name>[,<default_value>]) +map_<match_type>_<output_type>(<map_name>[,<default_value>]) + Search the input value from <map_name> using the <match_type> matching + method, and return the associated value converted to the type <output_type>. + If the input value cannot be found in the <map_name>, the converter returns + the <default_value>. If the <default_value> is not set, the converter fails + and acts as if no input value could be fetched. If the <match_type> is not + set, it defaults to "str". Likewise, if the <output_type> is not set, it + defaults to "str". For convenience, the "map" keyword is an alias for + "map_str" and maps a string to another string. <map_name> must follow the + format described in 2.7. about name format for maps and ACLs It is important to avoid overlapping between the keys : IP addresses and strings are stored in trees, so the first of the finest match will be used. @@ -19128,38 +19876,43 @@ map_<match_type>_<output_type>(<map_file>[,<default_value>]) The following array contains the list of all map functions available sorted by input type, match type and output type. - input type | match method | output type str | output type int | output type ip - -----------+--------------+-----------------+-----------------+--------------- - str | str | map_str | map_str_int | map_str_ip - -----------+--------------+-----------------+-----------------+--------------- - str | beg | map_beg | map_beg_int | map_end_ip - -----------+--------------+-----------------+-----------------+--------------- - str | sub | map_sub | map_sub_int | map_sub_ip - -----------+--------------+-----------------+-----------------+--------------- - str | dir | map_dir | map_dir_int | map_dir_ip - -----------+--------------+-----------------+-----------------+--------------- - str | dom | map_dom | map_dom_int | map_dom_ip - -----------+--------------+-----------------+-----------------+--------------- - str | end | map_end | map_end_int | map_end_ip - -----------+--------------+-----------------+-----------------+--------------- - str | reg | map_reg | map_reg_int | map_reg_ip - -----------+--------------+-----------------+-----------------+--------------- - str | reg | map_regm | map_reg_int | map_reg_ip - -----------+--------------+-----------------+-----------------+--------------- - int | int | map_int | map_int_int | map_int_ip - -----------+--------------+-----------------+-----------------+--------------- - ip | ip | map_ip | map_ip_int | map_ip_ip - -----------+--------------+-----------------+-----------------+--------------- + input type | match method | output type str | output type int | output type ip | output type key + -----------+--------------+-----------------+-----------------+----------------+---------------- + str | str | map_str | map_str_int | map_str_ip | map_str_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + str | beg | map_beg | map_beg_int | map_end_ip | map_end_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + str | sub | map_sub | map_sub_int | map_sub_ip | map_sub_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + str | dir | 
map_dir | map_dir_int | map_dir_ip | map_dir_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + str | dom | map_dom | map_dom_int | map_dom_ip | map_dom_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + str | end | map_end | map_end_int | map_end_ip | map_end_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + str | reg | map_reg | map_reg_int | map_reg_ip | map_reg_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + str | reg | map_regm | map_reg_int | map_reg_ip | map_reg_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + int | int | map_int | map_int_int | map_int_ip | map_int_key + -----------+--------------+-----------------+-----------------+----------------+---------------- + ip | ip | map_ip | map_ip_int | map_ip_ip | map_ip_key + -----------+--------------+-----------------+-----------------+----------------+---------------- The special map called "map_regm" expect matching zone in the regular expression and modify the output replacing back reference (like "\1") by the corresponding match text. - The file contains one key + value per line. Lines which start with '#' are - ignored, just like empty lines. Leading tabs and spaces are stripped. The key - is then the first "word" (series of non-space/tabs characters), and the value - is what follows this series of space/tab till the end of the line excluding - trailing spaces/tabs. + Output type "key" means that it is the matched entry's key (as found in the + map file) that will be returned as a string instead of the value. Note that + optional <default_value> argument is not supported when "key" output type is + used. + + Files referenced by <map_name> contains one key + value per line. Lines which + start with '#' are ignored, just like empty lines. Leading tabs and spaces + are stripped. The key is then the first "word" (series of non-space/tabs + characters), and the value is what follows this series of space/tab till the + end of the line excluding trailing spaces/tabs. Example : @@ -19699,6 +20452,21 @@ table_expire(<table>[,<default_value>]) input sample in the designated table. See also the table_idle sample fetch keyword. +table_glitch_cnt(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the cumulative number of front + connection glitches associated with the input sample in the designated table. + See also the sc_glitch_cnt sample fetch keyword and fc_glitches for the value + measured on the current front connection. + +table_glitch_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the average front connection + glitch rate associated with the input sample in the designated table. See + also the sc_glitch_rate sample fetch keyword. + table_gpc(<idx>,<table>) Uses the string representation of the input sample to perform a lookup in the specified table. 
If the key is not found in the table, integer value zero @@ -20212,7 +20980,6 @@ table_avl([<table>]) integer table_cnt([<table>]) integer thread integer txn.id32 integer -txn.conn_retries integer txn.sess_term_state string uuid([<version>]) string var(<var-name>[,<default>]) undefined @@ -20669,14 +21436,8 @@ txn.id32 : integer depends on the request rate. In practice, it should not be an issue. For a true unique ID, see "unique-id-format" directive. -txn.conn_retries : integer - Returns the the number of connection retries experienced by this stream when - trying to connect to the server. This value is subject to change while the - connection is not fully established. For HTTP connections, the value may be - affected by L7 retries. - txn.sess_term_state : string - Retruns the TCP or HTTP stream termination state, as reported in the log. It + Returns the TCP or HTTP stream termination state, as reported in the log. It is a 2-characters string, The final stream state followed by the event which caused its to terminate. See section 8.5 about stream state at disconnection for the list of possible events. The current value at time the sample fetch @@ -20687,10 +21448,14 @@ txn.sess_term_state : string # Return a 429-Too-Many-Requests if stream timed out in queue http-after-response set-status 429 if { txn.sess_term_state "sQ" } +uptime : integer + Returns the uptime of the current HAProxy worker in seconds. + uuid([<version>]) : string - Returns a UUID following the RFC4122 standard. If the version is not + Returns a UUID following the RFC 9562 standard. If the version is not specified, a UUID version 4 (fully random) is returned. - Currently, only version 4 is supported. + + Versions 4 and 7 are supported. var(<var-name>[,<default>]) : undefined Returns a variable with the stored type. 
If the variable is not set, the @@ -20730,14 +21495,18 @@ Summary of sample fetch methods in this section and their respective types: -------------------------------------------------+------------- accept_date([<unit>]) integer bc.timer.connect integer +bc_be_queue integer bc_dst ip bc_dst_port integer bc_err integer bc_err_str string bc_glitches integer bc_http_major integer +bc_nb_streams integer bc_src ip bc_src_port integer +bc_srv_queue integer +bc_settings_streams_limit integer be_id integer be_name string bc_rtt(<unit>) integer @@ -20764,6 +21533,7 @@ fc_fackets integer fc_glitches integer fc_http_major integer fc_lost integer +fc_nb_streams integer fc_pp_authority string fc_pp_unique_id string fc_pp_tlv(<id>) string @@ -20776,6 +21546,7 @@ fc_sacked integer fc_src ip fc_src_is_local boolean fc_src_port integer +fc_settings_streams_limit integer fc_unacked integer fe_defbe string fe_id integer @@ -20825,6 +21596,14 @@ sc_get_gpt0(<ctr>[,<table>]) integer sc0_get_gpt0([<table>]) integer sc1_get_gpt0([<table>]) integer sc2_get_gpt0([<table>]) integer +sc_glitch_cnt(<ctr>[,<table>]) integer +sc0_glitch_cnt([<table>]) integer +sc1_glitch_cnt([<table>]) integer +sc2_glitch_cnt([<table>]) integer +sc_glitch_rate(<ctr>[,<table>]) integer +sc0_glitch_rate([<table>]) integer +sc1_glitch_rate([<table>]) integer +sc2_glitch_rate([<table>]) integer sc_gpc_rate(<idx>,<ctr>[,<table>]) integer sc_gpc0_rate(<ctr>[,<table>]) integer sc0_gpc0_rate([<table>]) integer @@ -20929,6 +21708,7 @@ src_updt_conn_cnt([<table>]) integer srv_id integer srv_name string txn.conn_retries integer +txn.redispatched boolean -------------------------------------------------+------------- Detailed list: @@ -20955,6 +21735,10 @@ bc.timer.connect : integer equivalent of %Tc in the log-format. This is reported in milliseconds (ms). For more information see Section 8.4 "Timing events" +bc_be_queue : integer + Number of streams de-queued while waiting for a connection slot on the + target backend. This is the equivalent of %bq in the log-format. + bc_dst : ip This is the destination ip address of the connection on the server side, which is the server address HAProxy connected to. It is of type IP and works @@ -20995,6 +21779,9 @@ bc_http_major : integer for HTTP/0.9 to HTTP/1.1 or 2 for HTTP/2. Note, this is based on the on-wire encoding and not the version present in the request header. +bc_nb_streams : integer + Returns the number of streams opened on the backend connection. + bc_src : ip This is the source ip address of the connection on the server side, which is the server address HAProxy connected from. It is of type IP and works on both @@ -21005,6 +21792,15 @@ bc_src_port : integer Returns an integer value corresponding to the TCP source port of the connection on the server side, which is the port HAProxy connected from. +bc_srv_queue : integer + Number of streams de-queued while waiting for a connection slot on the + target server. This is the equivalent of %sq in the log-format. + +bc_settings_streams_limit : integer + Returns the maximum number of streams allowed on the backend connection. For + TCP and HTTP/1.1 connections, it is always 1. For other protocols, it depends + on the settings negociated with the server. + be_id : integer Returns an integer containing the current backend's id. It can be used in frontends with responses to check which backend processed the request. If @@ -21137,13 +21933,13 @@ fc_err : integer Returns the ID of the error that might have occurred on the current connection. 
Any strictly positive value of this fetch indicates that the connection did not succeed and would result in an error log being output (as - described in section 8.2.6). See the "fc_err_str" fetch for a full list of + described in section 8.2.5). See the "fc_err_str" fetch for a full list of error codes and their corresponding error message. fc_err_str : string Returns an error message describing what problem happened on the current connection, resulting in a connection failure. This string corresponds to the - "message" part of the error log format (see section 8.2.6). See below for a + "message" part of the error log format (see section 8.2.5). See below for a full list of error codes and their corresponding error messages : +----+---------------------------------------------------------------------------+ @@ -21229,6 +22025,9 @@ fc_lost : integer not TCP or if the operating system does not support TCP_INFO, for example Linux kernels before 2.4, the sample fetch fails. +fc_nb_streams : integer + Returns the number of streams opened on the frontend connection. + fc_pp_authority : string Returns the first authority TLV sent by the client in the PROXY protocol header, if any. @@ -21314,6 +22113,10 @@ fc_src_port : integer connection on the client side. Only "tcp-request connection" rules may alter this address. See "src-port" for details. +fc_settings_streams_limit : integer + Returns the maximum number of streams allowed on the frontend connection. For + TCP and HTTP/1.1 connections, it is always 1. For other protocols, it depends + on the settings negociated with the client. fc_unacked : integer Returns the unacked counter measured by the kernel for the client connection. @@ -21464,6 +22267,34 @@ sc2_get_gpt0([<table>]) : integer Returns the value of the first General Purpose Tag associated to the currently tracked counters. See also src_get_gpt0. +sc_glitch_cnt(<ctr>[,<table>]) : integer +sc0_glitch_cnt([<table>]) : integer +sc1_glitch_cnt([<table>]) : integer +sc2_glitch_cnt([<table>]) : integer + Returns the cumulative number of front connection glitches that were observed + on connections associated with the currently tracked counters. Usually these + result in requests or connections to be aborted so the returned value will + often correspond to past connections. There is no good nor bad value, but a + poor quality client may occasionally cause a few glitches per connection, + while a very bogus or malevolent client may quickly cause thousands of events + to be added on a connection. See also fc_glitches for the number affecting + the current connection, src_glitch_cnt to look them up per source, and + sc_glitch_rate for the event rate measurements. + +sc_glitch_rate(<ctr>[,<table>]) : integer +sc0_glitch_rate([<table>]) : integer +sc1_glitch_rate([<table>]) : integer +sc2_glitch_rate([<table>]) : integer + Returns the average rate at which front connection glitches were observed for + the currently tracked counters, measured in amount of events over the period + configured in the table. Usually these glitches result in requests or + connections to be aborted so the returned value will often be related to past + connections. There is no good nor bad value, but a poor quality client may + occasionally cause a few glitches per connection, hence a low rate is + generally expected. However, a very bogus or malevolent client may quickly + cause thousands of events to be added per connection, and maintain a high + rate here. See also src_glitch_rate and sc_glitch_cnt. 
+ sc_gpc_rate(<idx>,<ctr>[,<table>]) : integer Returns the average increment rate of the General Purpose Counter at the index <idx> of the array associated to the tracked counter of ID <ctr> from @@ -21778,6 +22609,29 @@ src_get_gpt0([<table>]) : integer the designated stick-table. If the address is not found, zero is returned. See also sc/sc0/sc1/sc2_get_gpt0. +src_glitch_cnt([<table>]) : integer + Returns the cumulative number of front connection glitches that were observed + on connections from the current connection's source address. Usually these + result in requests or connections to be aborted so the returned value will + often correspond to past connections. There is no good nor bad value, but a + poor quality client may occasionally cause a few glitches per connection, + while a very bogus or malevolent client may quickly cause thousands of events + to be added on a connection. See also fc_glitches for the number affecting + the current connection, sc_glitch_cnt to look them up in currently tracked + counters, and src_glitch_rate for the event rate measurements. + +src_glitch_rate([<table>]) : integer + Returns the average rate at which front connection glitches were observed for + on connections from the current connection's source address, measured in + amount of events over the period configured in the table. Usually these + glitches result in requests or connections to be aborted so the returned + value will often be related to past connections. There is no good nor bad + value, but a poor quality client may occasionally cause a few glitches per + connection, hence a low rate is generally expected. However, a very bogus or + malevolent client may quickly cause thousands of events to be added per + connection, and maintain a high rate here. See also sc_glitch_rate and + src_glitch_cnt. + src_gpc_rate(<idx>[,<table>]) : integer Returns the average increment rate of the General Purpose Counter at the index <idx> of the array associated to the incoming connection's @@ -21963,6 +22817,12 @@ txn.conn_retries : integer connection is not fully established. For HTTP connections, the value may be affected by L7 retries. +txn.redispatched : boolean + Returns true if the connection has experienced redispatch upon retry according + to "option redispatch" configuration. This value is subject to change while + the connection is not fully established. For HTTP connections, the value may + be affected by L7 retries. + 7.3.4. Fetching samples at Layer 5 ---------------------------------- @@ -21982,6 +22842,11 @@ ssl_bc_alg_keysize integer ssl_bc_alpn string ssl_bc_cipher string ssl_bc_client_random binary +ssl_bc_client_early_traffic_secret string +ssl_bc_client_handshake_traffic_secret string +ssl_bc_client_traffic_secret_0 string +ssl_bc_exporter_secret string +ssl_bc_early_exporter_secret string ssl_bc_curve string ssl_bc_err integer ssl_bc_err_str string @@ -21989,6 +22854,8 @@ ssl_bc_is_resumed boolean ssl_bc_npn string ssl_bc_protocol string ssl_bc_unique_id binary +ssl_bc_server_handshake_traffic_secret string +ssl_bc_server_traffic_secret_0 string ssl_bc_server_random binary ssl_bc_session_id binary ssl_bc_session_key binary @@ -22122,6 +22989,51 @@ ssl_bc_client_random : binary sent using ephemeral ciphers. This requires OpenSSL >= 1.1.0, or BoringSSL. It can be used in a tcp-check or an http-check ruleset. 
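  As a sketch only, the backend-side TLS 1.3 secrets listed above can be
  combined with "ssl_bc_client_random" to emit SSLKEYLOGFILE-style lines,
  assuming "tune.ssl.keylog on" is enabled in the global section (proxy and
  server names, address and log target are arbitrary) :

  Example :
     global
        tune.ssl.keylog on

     listen app-tls
        mode http
        log global
        bind :8443
        # one SSLKEYLOGFILE-style line per request (TLS 1.3 towards the server)
        log-format "CLIENT_HANDSHAKE_TRAFFIC_SECRET %[ssl_bc_client_random,hex] %[ssl_bc_client_handshake_traffic_secret]"
        server s1 203.0.113.10:443 ssl verify none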
+ssl_bc_client_early_traffic_secret : string + Return the CLIENT_EARLY_TRAFFIC_SECRET as an hexadecimal string for the + back connection when the outgoing connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_bc_client_handshake_traffic_secret : string + Return the CLIENT_HANDSHAKE_TRAFFIC_SECRET as an hexadecimal string for the + bacl connection when the outgoing connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_bc_client_traffic_secret_0 : string + Return the CLIENT_TRAFFIC_SECRET_0 as an hexadecimal string for the + back connection when the outgoing connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_bc_exporter_secret : string + Return the EXPORTER_SECRET as an hexadecimal string for the + back connection when the outgoing connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_bc_early_exporter_secret : string + Return the EARLY_EXPORTER_SECRET as an hexadecimal string for the + back connection when the outgoing connection was made over an TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + ssl_bc_curve : string Returns the name of the curve used in the key agreement when the outgoing connection was made over an SSL/TLS transport layer. This requires @@ -22171,6 +23083,24 @@ ssl_bc_unique_id : binary can be encoded to base64 using the converter: "ssl_bc_unique_id,base64". It can be used in a tcp-check or an http-check ruleset. +ssl_bc_server_handshake_traffic_secret : string + Return the SERVER_HANDSHAKE_TRAFFIC_SECRET as an hexadecimal string for the + back connection when the outgoing connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_bc_server_traffic_secret_0 : string + Return the SERVER_TRAFFIC_SECRET_0 as an hexadecimal string for the + back connection when the outgoing connection was made over an TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. 
See also + "tune.ssl.keylog" + ssl_bc_server_random : binary Returns the server random of the back connection when the incoming connection was made over an SSL/TLS transport layer. It is useful to to decrypt traffic @@ -22821,7 +23751,7 @@ Warning : Following sample fetches are ignored if used from HTTP proxies. They HTTP proxies use structured content. Thus raw representation of these data are meaningless. A warning is emitted if an ACL relies on one of the following sample fetches. But it is not possible to detect - all invalid usage (for instance inside a log-format string or a + all invalid usage (for instance inside a Custom log format or a sample expression). So be careful. Summary of sample fetch methods in this section and their respective types: @@ -22829,9 +23759,13 @@ Summary of sample fetch methods in this section and their respective types: keyword output type ----------------------------------------------------+------------- bs.id integer +bs.aborted boolean +bs.rst_code integer distcc_body(<token>[,<occ>]) binary distcc_param(<token>[,<occ>]) integer fs.id integer +fs.aborted boolean +fs.rst_code integer payload(<offset>,<length>) binary payload_lv(<offset1>,<length>[,<offset2>]) binary req.len integer @@ -22867,6 +23801,16 @@ bs.id : integer Returns the multiplexer's stream ID on the server side. It is the multiplexer's responsibility to return the appropriate information. +bs.aborted: boolean + Returns true is an abort was received from the server for the current + stream. Otherwise false is returned. + +bs.rst_code: integer + Returns the reset code received from the server for the current stream. The + code of the H2 RST_STREAM frame or the QUIC STOP_SENDING frame received from + the server is returned. The sample fetch fails if no abort was received or if + the server stream is not an H2/QUIC stream. + distcc_body(<token>[,<occ>]) : binary Parses a distcc message and returns the body associated to occurrence #<occ> of the token <token>. Occurrences start at 1, and when unspecified, any may @@ -22898,6 +23842,16 @@ fs.id : integer multiplexer's responsibility to return the appropriate information. For instance, on a raw TCP, 0 is always returned because there is no stream. +fs.aborted: boolean + Returns true is an abort was received from the client for the current + stream. Otherwise false is returned. + +fs.rst_code: integer + Returns the reset code received from the client for the current stream. The + code of the H2 RST_STREAM frame or the QUIC STOP_SENDING frame received from + the client is returned. The sample fetch fails if no abort was received or + if the client stream is not an H2/QUIC stream. + payload(<offset>,<length>) : binary (deprecated) This is an alias for "req.payload" when used in the context of a request (e.g. "stick on", "stick match"), and for "res.payload" when used in the context of @@ -23438,7 +24392,7 @@ hdr([<name>[,<occ>]]) : string request_date([<unit>]) : integer This is the exact date when the first byte of the HTTP request was received - by HAProxy (log-format tag %tr). This is computed from accept_date + + by HAProxy (log-format alias %tr). This is computed from accept_date + handshake time (%Th) + idle time (%Ti). Returns a value in number of seconds since epoch. @@ -23705,8 +24659,8 @@ req.hdr_names([<delim>]) : string req.ver : string req_ver : string (deprecated) Returns the version string from the HTTP request, for example "1.1". This can - be useful for ACL. For logs use the "%HV" log variable. 
Some predefined ACL - already check for versions 1.0 and 1.1. + be useful for ACL. For logs use the "%HV" logformat alias. Some predefined + ACL already check for versions 1.0 and 1.1. Common values are "1.0", "1.1", "2.0" or "3.0". @@ -24934,42 +25888,116 @@ regular traffic log (see option httplog or option httpslog). 8.2.6. Custom log format ------------------------ -When the default log formats are not sufficient, it is possible to define new -ones in very fine details. As creating a log-format from scratch is not always -a trivial task, it is strongly recommended to first have a look at the existing -formats ("option tcplog", "option httplog", "option httpslog"), pick the one -looking the closest to the expectation, copy its "log-format" equivalent string -and adjust it. - -HAProxy understands some log format variables. % precedes log format variables. -Variables can take arguments using braces ('{}'), and multiple arguments are -separated by commas within the braces. Flags may be added or removed by -prefixing them with a '+' or '-' sign. - -Special variable "%o" may be used to propagate its flags to all other -variables on the same format string. This is particularly handy with quoted -("Q") and escaped ("E") string formats. - -If a variable is named between square brackets ('[' .. ']') then it is used +Historically, custom log formats were only used to produce logs. But their +convenience when used to produce a string by assembling multiple complex +expressions has got them adopted by many directives which used to take only +a string in argument and which may now also take an such a Custom log format +definition. Such arguments, which are commonly designated by "<fmt>" in this +document, are defined exactly the same way as the argument to the "log-format" +directive, described here. + +When it comes to logs and when the default log formats are not sufficient, it +is possible to define new ones in very fine details. As creating a log-format +from scratch is not always a trivial task, it is strongly recommended to first +have a look at the existing formats ("option tcplog", "option httplog", "option +httpslog"), pick the one looking the closest to the expectation, copy its +"log-format" equivalent string and adjust it. + +A Custom log format definition is a single argument from a configuration +perspective. This means that it may not contain blanks (spaces or tabs), unless +these blanks are escaped using the backslash character ('\'), or the whole +definition is enclosed between quotes (which is the recommended way to use +them). The use of unquoted format strings is not recommended anymore as history +has shown that it was very error prone since a single missing backslash +character could result in silent truncation of the format. Such configurations +are still commonly encountered due to the massive adoption of log formats after +version 1.5-dev9, 3 years before quotes were usable, but it is recommended to +convert them to quoted strings and to drop the backslashes now. + +A log format definition is made of any number of log format items separated +by text and spaces. A log format item starts with character '%'. In order to +emit a verbatim '%', it must be preceded by another '%' resulting in '%%'. + +Logformat items may either be aliases or sample expressions: + +If an item is named between square brackets ('[' .. ']') then it is used as a sample expression rule (see section 7.3). 
This it useful to add some less common information such as the client's SSL certificate's DN, or to log -the key that would be used to store an entry into a stick table. +the key that would be used to store an entry into a stick table. It is also +commonly used with non-log actions (header manipulation, variables etc). + +Else if the item is named using an alpha-numerical name, it is an alias. +(Refer to the table below for the list of available aliases) -Note: spaces must be escaped. In configuration directives "log-format", -"log-format-sd" and "unique-id-format", spaces are considered as -delimiters and are merged. In order to emit a verbatim '%', it must be -preceded by another '%' resulting in '%%'. +Items can take arguments using braces ('{}'), and multiple arguments are +separated by commas within the braces. Flags may be added or removed by +prefixing them with a '+' or '-' sign (see below for the list of available +flags). + +Special alias "%o" may be used to propagate its flags to all other +logformat items on the same format string. This is particularly handy with +quoted ("Q") and escaped ("E") string formats. + +Items can optionally be named using ('()'). The name must be provided right +after '%' (before arguments). It will automatically be used as key name when +encoding flag such as "json" or "cbor" is set. When no encoding flag is +specified (default), item name will be ignored. It is also possible to force +the item's output to a given type by appending ':type' after the name, like +this: %(itemname:itemtype)aliasname or %(itemname:itemtype)[expr] where +itemtype may be 'str', 'sint' or 'bool'. Specifying the type is only relevant +when an encoding method is used. Also, it is supported to provide an empty name +to force the output type on an anonymous item: %(:itemtype), ie: when encoding +is not set globally, see flags definitions below for more information. + +Due to the original goal of custom log formats to be used for logging only, +there is a special case made of non-printable and unsafe characters (those +outside ASCII codes 32 to 126 plus a few other ones) depending where they are +used. Section 8.6 describes what's done exactly for logs in order to make sure +one will not send unsafe codes that alter the readability of the output in a +terminal. When used to form header fields, health checks or payload responses, +the rules are less strict and only characters forbidden in HTTP header fields +are replaced by their hexadecimal encoding preceded by character '%'. This is +normally not a problem, but it might affect the output when the character was +expected to be reproduced verbatim (e.g. when building an error page or a full +response payload, where line feeds could appear as "%0A"). + +Note: in configuration directives "log-format", "log-format-sd" and +"unique-id-format", spaces are considered as delimiters and are merged. Note: when using the RFC5424 syslog message format, the characters '"', '\' and ']' inside PARAM-VALUE should be escaped with '\' as prefix (see https://tools.ietf.org/html/rfc5424#section-6.3.3 for more details). In such cases, the use of the flag "E" should be considered. 
-Flags are : +Supported item flags are (may be enabled/disabled from item's arguments): * Q: quote a string * X: hexadecimal representation (IPs, Ports, %Ts, %rt, %pid) * E: escape characters '"', '\' and ']' in a string with '\' as prefix (intended purpose is for the RFC5424 structured-data log formats) + * bin: try to preserve binary data, this can be useful with sample + expressions that output binary data in order to preserve the original + data. Be careful however, because it can obviously generate non- + printable chars, including NULL-byte, which most syslog endpoints + don't expect. Thus it is mainly intended for use with set-var-fmt, + rings and binary-capable log endpoints. + This option can only be set globally (with %o), it will be ignored + if set on an individual item's options. + * json: automatically encode value in JSON format + (when set globally, only named logformat items are considered) + Incomplete numerical values (e.g.: '%B' when logasap is used), + which are normally prefixed with '+' without encoding, will be + encoded as-is. Also, '+E' option will be ignored. + * cbor: automatically encode value in CBOR format + (when set globally, only named logformat items are considered) + By default, cbor encoded data is represented in HEX form so + that it remains printable on stdout an can be used with usual + syslog endpoints. + As with json encoding, incomplete numerical values will be encoded + as-is and '+E' option will be ignored. + When combined with '+bin' option, it will directly generate raw + binary CBOR payload. Be careful, because it will obviously generate + non-printable chars, thus it is mainly intended for use with + set-var-fmt, rings and binary-capable log endpoints. Example: @@ -24978,13 +26006,16 @@ Flags are : log-format-sd %{+Q,+E}o\ [exampleSDID@1234\ header=%[capture.req.hdr(0)]] -Please refer to the table below for currently defined variables : + log-format "%{+json}o %(request)r %(custom_expr)[str(custom)]" + log-format "%{+cbor}o %(request)r %(custom_expr)[str(custom)]" + +Please refer to the table below for currently defined aliases : +---+------+------------------------------------------------------+---------+ - | R | var | field name (8.2.2 and 8.2.3 for description) | type | + | R | alias| field name (8.2.2 and 8.2.3 for description) | type | | | | sample fetch alternative | | +===+======+======================================================+=========+ - | | %o | special variable, apply flags on all next var | | + | | %o | special, apply flags on all following items | | +---+------+------------------------------------------------------+---------+ | date formats | +---+------+------------------------------------------------------+---------+ @@ -24995,12 +26026,13 @@ Please refer to the table below for currently defined variables : | | | %[accept_date,ltime("%d/%b/%Y:%H:%M:%S %z")] | date | +---+------+------------------------------------------------------+---------+ | | %Ts | Accept date as a UNIX timestamp | numeric | + | | | %[accept_date] | | +---+------+------------------------------------------------------+---------+ | | %t | Accept date local (with millisecond resolution) | | | | | %[accept_date(ms),ms_ltime("%d/%b/%Y:%H:%M:%S.%3N")] | date | +---+------+------------------------------------------------------+---------+ | | %ms | Accept date milliseconds | | - | | | %[accept_date(ms),ms_utime("%3N") | numeric | + | | | %[accept_date(ms),ms_utime("%3N")] | numeric | 
+---+------+------------------------------------------------------+---------+ | H | %tr | Request date local (with millisecond resolution) | | | | | %[request_date(ms),ms_ltime("%d/%b/%Y:%H:%M:%S.%3N")]| date | @@ -25056,8 +26088,10 @@ Please refer to the table below for currently defined variables : | H | %CS | captured_response_cookie | string | +---+------+------------------------------------------------------+---------+ | | %H | hostname | string | + | | | %[hostname] | | +---+------+------------------------------------------------------+---------+ | H | %HM | HTTP method (ex: POST) | string | + | | | %[method] +---+------+------------------------------------------------------+---------+ | H | %HP | HTTP request URI without query string | string | +---+------+------------------------------------------------------+---------+ @@ -25072,6 +26106,7 @@ Please refer to the table below for currently defined variables : | | | HTTP/%[req.ver] | | +---+------+------------------------------------------------------+---------+ | | %ID | unique-id | string | + | | | %[unique-id] | | +---+------+------------------------------------------------------+---------+ | | %ST | status_code | numeric | | | | %[txn.status] | | @@ -25086,6 +26121,7 @@ Please refer to the table below for currently defined variables : | | | %[be_name] | string | +---+------+------------------------------------------------------+---------+ | | %bc | beconn (backend concurrent connections) | numeric | + | | | %[be_conn] | | +---+------+------------------------------------------------------+---------+ | | %bi | backend_source_ip (connecting address) | | | | | %[bc_src] | IP | @@ -25094,6 +26130,7 @@ Please refer to the table below for currently defined variables : | | | %[bc_src_port] | numeric | +---+------+------------------------------------------------------+---------+ | | %bq | backend_queue | numeric | + | | | %[bc_be_queue] | | +---+------+------------------------------------------------------+---------+ | | %ci | client_ip (accepted address) | | | | | %[src] | IP | @@ -25102,8 +26139,10 @@ Please refer to the table below for currently defined variables : | | | %[src_port] | numeric | +---+------+------------------------------------------------------+---------+ | | %f | frontend_name | string | + | | | %[fe_name] | | +---+------+------------------------------------------------------+---------+ | | %fc | feconn (frontend concurrent connections) | numeric | + | | | %[fe_conn] | | +---+------+------------------------------------------------------+---------+ | | %fi | frontend_ip (accepting address) | | | | | %[dst] | IP | @@ -25131,12 +26170,13 @@ Please refer to the table below for currently defined variables : | H | %r | http_request | string | +---+------+------------------------------------------------------+---------+ | | %rc | retries | numeric | - | | | %[txn.conn_retries] | | + | | | %[txn.redispatched,iif(+,)]%[txn.conn_retries] | | +---+------+------------------------------------------------------+---------+ | | %rt | request_counter (HTTP req or TCP session) | numeric | | | | %[txn.id32] | | +---+------+------------------------------------------------------+---------+ | | %s | server_name | string | + | | | %[srv_name] | | +---+------+------------------------------------------------------+---------+ | | %sc | srv_conn (server concurrent connections) | numeric | +---+------+------------------------------------------------------+---------+ @@ -25147,6 +26187,7 @@ Please refer to the table below for currently defined 
variables : | | | %[bc_dst_port] | numeric | +---+------+------------------------------------------------------+---------+ | | %sq | srv_queue | numeric | + | | | %[bc_srv_queue] | | +---+------+------------------------------------------------------+---------+ | S | %sslc| ssl_ciphers (ex: AES-SHA) | | | | | %[ssl_fc_cipher] | string | @@ -25280,7 +26321,7 @@ Timings events in TCP mode: all request to calculate the amortized value. The second and subsequent request will always report zero here. - This timer is named %Th as a log-format tag, and fc.timer.handshake as a + This timer is named %Th as a log-format alias, and fc.timer.handshake as a sample fetch. - Ti: is the idle time before the HTTP request (HTTP mode only). This timer @@ -25293,7 +26334,7 @@ Timings events in TCP mode: pending until they need it. This delay will be reported as the idle time. A value of -1 indicates that nothing was received on the connection. - This timer is named %Ti as a log-format tag, and req.timer.idle as a + This timer is named %Ti as a log-format alias, and req.timer.idle as a sample fetch. - TR: total time to get the client request (HTTP mode only). It's the time @@ -25304,7 +26345,7 @@ Timings events in TCP mode: since most requests fit in a single packet. A large time may indicate a request typed by hand during a test. - This timer is named %TR as a log-format tag, and req.timer.hdr as a + This timer is named %TR as a log-format alias, and req.timer.hdr as a sample fetch. - Tq: total time to get the client request from the accept date or since the @@ -25315,7 +26356,7 @@ Timings events in TCP mode: it in favor of TR nowadays, as the idle time adds a lot of noise to the reports. - This timer is named %Tq as a log-format tag, and req.timer.tq as a + This timer is named %Tq as a log-format alias, and req.timer.tq as a sample fetch. - Tw: total time spent in the queues waiting for a connection slot. It @@ -25324,7 +26365,7 @@ Timings events in TCP mode: requests. The value "-1" means that the request was killed before reaching the queue, which is generally what happens with invalid or denied requests. - This timer is named %Tw as a log-format tag, and req.timer.queue as a + This timer is named %Tw as a log-format alias, and req.timer.queue as a sample fetch. - Tc: total time to establish the TCP connection to the server. It's the time @@ -25333,7 +26374,7 @@ Timings events in TCP mode: the matching SYN/ACK packet in return. The value "-1" means that the connection never established. - This timer is named %Tc as a log-format tag, and bc.timer.connect as a + This timer is named %Tc as a log-format alias, and bc.timer.connect as a sample fetch. - Tr: server response time (HTTP mode only). It's the time elapsed between @@ -25348,7 +26389,7 @@ Timings events in TCP mode: header (empty line) was never seen, most likely because the server timeout stroke before the server managed to process the request. - This timer is named %Tr as a log-format tag, and res.timer.hdr as a + This timer is named %Tr as a log-format alias, and res.timer.hdr as a sample fetch. - Td: this is the total transfer time of the response payload till the last @@ -25358,7 +26399,7 @@ Timings events in TCP mode: The data sent are not guaranteed to be received by the client, they can be stuck in either the kernel or the network. - This timer is named %Td as a log-format tag, and res.timer.data as a + This timer is named %Td as a log-format alias, and res.timer.data as a sample fetch. 
- Ta: total active time for the HTTP request, between the moment the proxy @@ -25373,7 +26414,7 @@ Timings events in TCP mode: Timers with "-1" values have to be excluded from this equation. Note that "Ta" can never be negative. - This timer is named %Ta as a log-format tag, and txn.timer.total as a + This timer is named %Ta as a log-format alias, and txn.timer.total as a sample fetch. - Tt: total stream duration time, between the moment the proxy accepted it @@ -25388,7 +26429,7 @@ Timings events in TCP mode: mode, "Ti", "Tq" and "Tr" have to be excluded too. Note that "Tt" can never be negative and that for HTTP, Tt is simply equal to (Th+Ti+Ta). - This timer is named %Tt as a log-format tag, and fc.timer.total as a + This timer is named %Tt as a log-format alias, and fc.timer.total as a sample fetch. - Tu: total estimated time as seen from client, between the moment the proxy @@ -25400,7 +26441,7 @@ Timings events in TCP mode: option is specified. In this case, it only equals (Th+TR+Tw+Tc+Tr), and is prefixed with a '+' sign. - This timer is named %Tu as a log-format tag, and txn.timer.user as a + This timer is named %Tu as a log-format alias, and txn.timer.user as a sample fetch. These timers provide precious indications on trouble causes. Since the TCP @@ -26462,8 +27503,8 @@ no option mpxs-conns set-param <name> <fmt> [ { if | unless } <condition> ] Set a FastCGI parameter that should be passed to this application. Its - value, defined by <fmt> must follows the log-format rules (see section 8.2.4 - "Custom Log format"). It may optionally be followed by an ACL-based + value, defined by <fmt> must follows the Custom log format rules (see section + 8.2.6 "Custom Log format"). It may optionally be followed by an ACL-based condition, in which case it will only be evaluated if the condition is true. With this directive, it is possible to overwrite the value of default FastCGI diff --git a/doc/design-thoughts/ring-v2.txt b/doc/design-thoughts/ring-v2.txt new file mode 100644 index 0000000..48c539a --- /dev/null +++ b/doc/design-thoughts/ring-v2.txt @@ -0,0 +1,312 @@ +2024-02-20 - Ring buffer v2 +=========================== + +Goals: + - improve the multi-thread performance of rings so that traces can be written + from all threads in parallel without the huge bottleneck of the lock that + is currently necessary to protect the buffer. This is important for mmapped + areas that are left as a file when the process crashes. + + - keep traces synchronous within a given thread, i.e. when the TRACE() call + returns, the trace is either written into the ring or lost due to slow + readers. + + - try hard to limit the cache line bounces between threads due to the use of + a shared work area. + + - make waiting threads not disturb working ones + + - continue to work on all supported platforms, with a particular focus on + performance for modern platforms (memory ordering, DWCAS etc can be used if + they provide any benefit), with a fallback for inferior platforms. + + - do not reorder traces within a given thread. + + - do not break existing features + + - do not significantly increase memory usage + + +Analysis of the current situation +================================= + +Currently, there is a read lock around the call to __sink_write() in order to +make sure that an attempt to write the number of lost messages is delivered +with highest priority and is consistent with the lost counter. This doesn't +seem to pose any problem at this point though if it were, it could possibly +be revisited. 
+ +__sink_write() calls ring_write() which first measures the input string length +from the multiple segments, and locks the ring: + - while trying to free space + - while copying the message, due to the buffer's API + +Because of this, there is a huge serialization and threads wait in queue. Tests +involving a split of the lock and a release around the message copy have shown +a +60% performance increase, which is still not acceptable. + + +First proposed approach +======================= + +The first approach would have consisted in writing messages in small parts: + 1) write 0xFF in the tag to mean "size not filled yet" + 2) write the message's length and write a zero tag after the message's + location + 3) replace the first tag to 0xFE to indicate the size is known, but the + message is not filled yet. + 4) memcpy() of the message to the area + 5) replace the first tag to 0 to mark the entry as valid. + +It's worth noting that doing that without any lock will allow a second thread +looping on the first tag to jump to the second tag after step 3. But the cost +is high: in a 64-thread scenario where each of them wants to send one message, +the work would look like this: + - 64 threads try to CAS the tag. One gets it, 63 fail. They loop on the byte + in question in read-only mode, waiting for the byte to change. This loop + constantly forces the cache line to switch from MODIFIED to SHARED in the + writer thread, and makes it a pain for it to write the message's length + just after it. + + - once the first writer thread finally manages to write the length (step 2), + it writes 0xFE on the tag to release the waiting threads, and starts with + step 4. At this point, 63 threads try a CAS on the same entry, and this + hammering further complicates the memcpy() of step 4 for the first 63 bytes + of the message (well, 32 on avg since the tag is not necessarily aligned). + One thread wins, 62 fail. All read the size field and jump to the next tag, + waiting in read loops there. The second thread starts to write its size and + faces the same difficulty as described above, facing 62 competitors when + writing its size and the beginning of its message. + + - when the first writer thread writes the end of its message, it gets close + to the final tag where the 62 waiting threads are still reading, causing + a slow down again with the loss of exclusivity on the cache line. This is + the same for the second thread etc. + +Thus, on average, a writing thread is hindered by N-1 threads at the beginning +of its message area (in the first 32 bytes on avg) and by N-2 threads at the +end of its area (in the last 32 bytes on avg). Given that messages are roughly +218 bytes on avg for HTTP/1, this means that roughly 1/3 of the message is +written under severe cache contention. + +In addition to this, the buffer's tail needs to be updated once all threads are +ready, something that adds the need for synchronization so that the last writing +threads (the most likely to complete fast due to less perturbations) needs to +wait for all previous ones. This also means N atomic writes to the tail. + + +New proposal +============ + +In order to address the contention scenarios above, let's try to factor the +work as much as possible. The principle is that threads that want to write will +either do it themselves or declare their intent and wait for a writing thread +to do it for them. 
This aims at ensuring a maximum usage of read-only data
+between threads, and to leave the work area read-write between very few
+threads, and exclusive for multiple messages at once, avoiding the bounces.
+
+First, the buffer will have 2 indexes:
+ - head: where the valid data start
+ - tail: where new data need to be appended
+
+When a thread starts to work, it will keep a copy of $tail and push it forward
+by as many bytes as needed to write all the messages it has to. In order to
+guarantee that neither the previous nor the new $tail point to an outdated or
+overwritten location but that there is always a tag there, $tail contains a
+lock bit in its highest bit that will guarantee that only one at a time will
+update it. The goal here is to perform as few atomic ops as possible in the
+contended path so as to later amortize the costs and make sure to limit the
+number of atomic ops on the wait path to the strict minimum so that waiting
+threads do not hinder the workers:
+
+ Fast path:
+ 1 load($tail) to check the topmost bit
+ 1 CAS($tail,$tail|BIT63) to set the bit (atomic_fetch_or / atomic_bts also work)
+ 1 store(1 byte tag=0xFF) at the beginning to mark the area busy
+ 1 store($tail) to update the new value
+ 1 copy of the whole message
+ 1 store(1 byte tag=0) at the beginning to release the message
+
+ Contended path:
+ N load($tail) while waiting for the bit to be zero
+ M CAS($tail,$tail|BIT63) to try to set the bit on tail, competing with others
+ 1 store(1 byte tag=0xFF) at the beginning to mark the area busy
+ 1 store($tail) to update the new value
+ 1 copy of the whole message
+ 1 store(1 byte tag=0) at the beginning to release the message
+
+Queue
+-----
+
+In order to limit the contention, writers will not start to write but will wait
+in a queue, announcing their message pointers/lengths and total lengths. The
+queue is made of a (ptr, len) pair that points to one such descriptor, located
+in the waiter thread's stack, that itself points to the next pair. In fact
+messages are ordered in a LIFO fashion but that isn't important since intra-
+thread ordering is preserved (and in the worst case it will also be possible
+to write them from end to beginning).
+
+The approach is the following: a writer loads $tail and sees it's busy, there's
+no point continuing, it will add itself to the queue, announcing (ptr, len +
+next->len) so that by just reading the first entry, one knows the total size
+of the queue. And it will wait there as long as $tail has its topmost bit set
+and the queue points to itself (meaning it's the queue's leader), so that only
+one thread in the queue watches $tail, limiting the number of cache line
+bounces. If the queue doesn't point anymore to the current thread, it means
+another thread has taken it over so there's no point continuing, this thread
+just becomes passive. If the lock bit is dropped from $tail, the watching
+thread needs to re-check that it's still the queue's leader before trying to
+grab the lock, so that only the leading thread will attempt it. Indeed, a few
+of the last leading threads might still be looping, unaware that they're no
+longer leaders. A CAS(&queue, self, self) will do it. Upon failure, the thread
+just becomes a passive thread. Upon success, the thread is a confirmed leader,
+it must then try to grab the tail lock. Only this thread and a few potential
+newcomers will compete on this one. If the leading thread wins, it brings all
+the queue with it and the newcomers will queue again.
If the leading thread +loses, it needs to loop back to the point above, watching $tail and the +queue. In this case a newcomer might have grabbed the lock. It will notice +the non-empty queue and will take it with it. Thus in both cases the winner +thread does a CAS(queue, queue, NULL) to reset the queue, keeping the previous +pointer. + +At this point the winner thread considers its own message size plus the +retrieved queue's size as the total required size and advances $tail by as +much, and will iterate over all messages to copy them in turn. The passive +threads are released by doing XCHG(&ptr->next, ptr) for each message, that +is normally impossible otherwise. As such, a passive thread just has to +loop over its own value, stored in its own stack, reading from its L1 cache +in loops without any risk of disturbing others, hence no need for EBO. + +During the time it took to update $tail, more messages will have been +accumulating in the queue from various other threads, and once $tail is +written, one thread can pick them up again. + +The benefit here is that the longer it takes one thread to free some space, +the more messages add up in the queue and the larger the next batch, so that +there are always very few contenders on the ring area and on the tail index. +At worst, the queue pointer is hammered but it's not on the fast path, since +wasting time here means all waiters will be queued. + +Also, if we keep the first tag unchanged after it's set to 0xFF, it allows to +avoid atomic ops inside all the message. Indeed there's no reader in the area +as long as the tag is 0xFF, so we can just write all contents at once including +the varints and subsequent message tags without ever using atomic ops, hence +not forcing ordered writes. So maybe in the end there is some value in writing +the messages backwards from end to beginning, and just writing the first tag +atomically but not the rest. + +The scenario would look like this: + + (without queue) + + - before starting to work: + do { + while (ret=(load(&tail) & BIT63)) + ; + } while (!cas(&tail, &ret, ret | BIT63)); + + - at this point, alone on it and guaranteed not to change + - after new size is calculated, write it and drop the lock: + + store(&tail, new_tail & ~BIT63); + + - that's sufficient to unlock other waiters. + + (with queue) + + in_queue = 0; + do { + ret = load(&tail); + if (ret & BIT63) { + if (!in_queue) { + queue_this_node(); + in_queue = 1; + } + while (ret & BIT63) + ; + } + } while (!cas(&tail, &ret, ret | BIT63)); + + dequeue(in_queue) etc. 
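To make the locking sequence above concrete, here is a rough C11 sketch of the
"without queue" variant (purely illustrative: the names and types are assumptions,
not HAProxy's actual ring code, and the 0xFF/0x00 tags, free-space checks and
wrapping are deliberately left out):

    #include <stdatomic.h>
    #include <stdint.h>

    #define TAIL_LOCK (1ULL << 63)   /* the BIT63 lock bit described above */

    /* Reserve <len> bytes past the current tail and return the old tail. */
    static uint64_t tail_reserve(_Atomic uint64_t *tail, uint64_t len)
    {
        uint64_t old;

        do {
            /* wait for the lock bit to be released by the current writer */
            while ((old = atomic_load(tail)) & TAIL_LOCK)
                ;
            /* try to set the lock bit; on failure, <old> is refreshed */
        } while (!atomic_compare_exchange_weak(tail, &old, old | TAIL_LOCK));

        /* sole owner here: storing the new value also drops the lock bit */
        atomic_store(tail, old + len);
        return old;
    }

A real implementation would additionally write the 0xFF/0x00 tags around the copied
messages and handle the queueing and batching discussed above.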
+
+ Fast path:
+ 1 load($tail) to check the topmost bit
+ 1 CAS($tail,$tail|BIT63) to set the bit (atomic_fetch_or / atomic_bts also work)
+ 1 load of the queue to see that it's empty
+ 1 store(1 byte tag=0xFF) at the beginning to mark the area busy
+ 1 store($tail) to update the new value
+ 1 copy of the whole message
+ 1 store(1 byte tag=0) at the beginning to release the message
+
+ Contended path:
+ 1 load($tail) to see the tail is changing
+ M CAS(queue,queue,self) to try to add the thread to the queue (avgmax nbthr/2)
+ N load($tail) while waiting for the lock bit to become zero
+ 1 CAS(queue,self,self) to check it is still the leader
+ M CAS($tail,$tail|BIT63) to try to set the bit on tail, competing with others
+ 1 CAS(queue,queue,NULL) to reset the queue
+ 1 store(1 byte tag=0xFF) at the beginning to mark the area busy
+ 1 store($tail) to update the new value
+ 1 copy of the whole message
+ P copies of individual messages
+ P stores of individual pointers to release writers
+ 1 store(1 byte tag=0) at the beginning to release the message
+
+Optimal approach (later if needed?): multiple queues. Each thread has one queue
+assigned, either from a thread group, or using a modulo from the thread ID.
+Same as above then.
+
+
+Steps
+-----
+
+It looks like the queue is what allows the process to scale by amortizing a
+single lock for every N messages, but it's not a prerequisite to start:
+without a queue, threads can just wait on $tail.
+
+
+Options
+-------
+
+It is possible to avoid the extra check on CAS(queue,self,self) by forcing
+writers into the queue all the time. It would slow down the fast path but
+may improve the slow path, both of which would become the same:
+
+ Contended path:
+ 1 XCHG(queue,self) to try to add the thread to the queue
+ N load($tail) while waiting for the lock bit to become zero
+ M CAS($tail,$tail|BIT63) to try to set the bit on tail, competing with others
+ 1 CAS(queue,self,NULL) to reset the queue
+ 1 store(1 byte tag=0xFF) at the beginning to mark the area busy
+ 1 store($tail) to update the new value
+ 1 copy of the whole message
+ P copies of individual messages
+ P stores of individual pointers to release writers
+ 1 store(1 byte tag=0) at the beginning to release the message
+
+There seems to remain a race when resetting the queue, where a newcomer thread
+would queue itself while not being the leader. It seems it can be addressed by
+deciding that whoever gets the bit is not important, what matters is the thread
+that manages to reset the queue. This can then be done using another XCHG:
+
+ 1 XCHG(queue,self) to try to add the thread to the queue
+ N load($tail) while waiting for the lock bit to become zero
+ M CAS($tail,$tail|BIT63) to try to set the bit on tail, competing with others
+ 1 XCHG(queue,NULL) to reset the queue
+ 1 store(1 byte tag=0xFF) at the beginning to mark the area busy
+ 1 store($tail) to update the new value
+ 1 copy of the whole message
+ P copies of individual messages
+ P stores of individual pointers to release writers
+ 1 store(1 byte tag=0) at the beginning to release the message
+
+However this time this can cause fragmentation of multiple sub-queues that will
+need to be reassembled. So finally the CAS is better, the leader thread should
+recognize itself.
+
+It seems tricky to reliably store the next pointer in each element, and a DWCAS
+wouldn't help here either.
Maybe uninitialized elements should just have a
+special value (eg 0x1) for their next pointer, meaning "not initialized yet",
+which the thread will then replace with the previous queue pointer. A reader
+would have to wait on this value when meeting it, knowing the pointer is not
+filled yet but is coming.
diff --git a/doc/internals/api/buffer-api.txt b/doc/internals/api/buffer-api.txt
index ac35300..1e09ff9 100644
--- a/doc/internals/api/buffer-api.txt
+++ b/doc/internals/api/buffer-api.txt
@@ -548,11 +548,15 @@ buffer_almost_full | const buffer *buf| returns true if the buffer is not null
 | | are used. A waiting buffer will match.
 --------------------+------------------+---------------------------------------
 b_alloc | buffer *buf | ensures that <buf> is allocated or
- | ret: buffer * | allocates a buffer and assigns it to
- | | *buf. If no memory is available, (1)
- | | is assigned instead with a zero size.
+ | enum dynbuf_crit | allocates a buffer and assigns it to
+ | criticality | *buf. If no memory is available, (1)
+ | ret: buffer * | is assigned instead with a zero size.
 | | The allocated buffer is returned, or
- | | NULL in case no memory is available
+ | | NULL in case no memory is available.
+ | | The criticality indicates how the
+ | | buffer might be used and how likely it
+ | | is that the allocated memory will be
+ | | quickly released.
 --------------------+------------------+---------------------------------------
 __b_free | buffer *buf | releases <buf> which must be allocated
 | ret: void | and marks it empty
diff --git a/doc/intro.txt b/doc/intro.txt
index f4133a1..c3f6cda 100644
--- a/doc/intro.txt
+++ b/doc/intro.txt
@@ -1,7 +1,7 @@
 -----------------------
 HAProxy Starter Guide
 -----------------------
- version 2.9
+ version 3.0
 
 This document is an introduction to HAProxy for all those who don't know it, as
diff --git a/doc/lua-api/index.rst b/doc/lua-api/index.rst
index 17927f3..0d69a2f 100644
--- a/doc/lua-api/index.rst
+++ b/doc/lua-api/index.rst
@@ -348,33 +348,33 @@ Core class
 end
 ..
-.. js:function:: core.add_acl(filename, key)
+.. js:function:: core.add_acl(name, key)
 **context**: init, task, action, sample-fetch, converter
- Add the ACL *key* in the ACLs list referenced by the file *filename*.
+ Add the ACL *key* in the ACLs list referenced by *name*.
- :param string filename: the filename that reference the ACL entries.
+ :param string name: the name that references the ACL entries.
 :param string key: the key which will be added.
-.. js:function:: core.del_acl(filename, key)
+.. js:function:: core.del_acl(name, key)
 **context**: init, task, action, sample-fetch, converter
 Delete the ACL entry referenced by the key *key* in the list of ACLs
- referenced by *filename*.
+ referenced by *name*.
- :param string filename: the filename that reference the ACL entries.
+ :param string name: the name that references the ACL entries.
 :param string key: the key which will be deleted.
-.. js:function:: core.del_map(filename, key)
+.. js:function:: core.del_map(name, key)
 **context**: init, task, action, sample-fetch, converter
 Delete the map entry indexed with the specified key in the list of maps
- referenced by his filename.
+ referenced by its name.
- :param string filename: the filename that reference the map entries.
+ :param string name: the name that references the map entries.
 :param string key: the key which will be deleted.
 ..
js:function:: core.get_info() @@ -828,14 +828,14 @@ Core class :param integer nice: the nice value, it must be between -1024 and 1024. -.. js:function:: core.set_map(filename, key, value) +.. js:function:: core.set_map(name, key, value) **context**: init, task, action, sample-fetch, converter Set the value *value* associated to the key *key* in the map referenced by - *filename*. + *name*. - :param string filename: the Map reference + :param string name: the Map reference :param string key: the key to set or replace :param string value: the associated value @@ -2877,6 +2877,22 @@ TXN class :see: :js:func:`TXN.reply`, :js:class:`Reply` +.. js:function:: TXN.set_fc_tos(txn, tos) + + Is used to set the TOS or DSCP field value of packets sent to the client to + the value passed in "tos" on platforms which support this. + + :param class_txn txn: The class txn object containing the data. + :param integer tos: The new TOS os DSCP. + +.. js:function:: TXN.set_fc_mark(txn, mark) + + Is used to set the Netfilter MARK on all packets sent to the client to the + value passed in "mark" on platforms which support it. + + :param class_txn txn: The class txn object containing the data. + :param integer mark: The mark value. + .. js:function:: TXN.set_loglevel(txn, loglevel) Is used to change the log level of the current request. The "loglevel" must @@ -2888,21 +2904,21 @@ TXN class :js:attr:`core.crit`, :js:attr:`core.err`, :js:attr:`core.warning`, :js:attr:`core.notice`, :js:attr:`core.info`, :js:attr:`core.debug` (log level definitions) -.. js:function:: TXN.set_tos(txn, tos) +.. js:function:: TXN.set_mark(txn, mark) - Is used to set the TOS or DSCP field value of packets sent to the client to - the value passed in "tos" on platforms which support this. + Alias for :js:func:`TXN.set_fc_mark()`. - :param class_txn txn: The class txn object containing the data. - :param integer tos: The new TOS os DSCP. + .. warning:: + This function is deprecated. :js:func:`TXN.set_fc_mark()` must be used + instead. -.. js:function:: TXN.set_mark(txn, mark) +.. js:function:: TXN.set_tos(txn, tos) - Is used to set the Netfilter MARK on all packets sent to the client to the - value passed in "mark" on platforms which support it. + Alias for :js:func:`TXN.set_fc_tos()`. - :param class_txn txn: The class txn object containing the data. - :param integer mark: The mark value. + .. warning:: + This function is deprecated. :js:func:`TXN.set_fc_tos()` must be used + instead. .. js:function:: TXN.set_priority_class(txn, prio) @@ -3367,11 +3383,11 @@ Map class Note that :js:attr:`Map.reg` is also available for compatibility. -.. js:function:: Map.new(file, method) +.. js:function:: Map.new(name, method) Creates and load a map. - :param string file: Is the file containing the map. + :param string name: Is the name referencing the map. :param integer method: Is the map pattern matching method. See the attributes of the Map class. :returns: a class Map object. @@ -3913,7 +3929,7 @@ Filter class This class contains return codes some filter callback functions may return. It also contains configuration flags and some helper functions. To understand how - the filter API works, see `doc/internal/filters.txt` documentation. + the filter API works, see `doc/internals/api/filters.txt` documentation. .. 
js:attribute:: filter.CONTINUE diff --git a/doc/management.txt b/doc/management.txt index 9cbc772..d036018 100644 --- a/doc/management.txt +++ b/doc/management.txt @@ -1,7 +1,7 @@ ------------------------ HAProxy Management Guide ------------------------ - version 2.9 + version 3.0 This document describes how to start, stop, manage, and troubleshoot HAProxy, @@ -32,10 +32,12 @@ Summary 9.3. Unix Socket commands 9.4. Master CLI 9.4.1. Master CLI commands +9.5. Stats-file 10. Tricks for easier configuration management 11. Well-known traps to avoid 12. Debugging and performance issues 13. Security considerations +13.1. Linux capabilities support 1. Prerequisites @@ -49,7 +51,7 @@ familiar with troubleshooting utilities such as strace and tcpdump. 2. Quick reminder about HAProxy's architecture ---------------------------------------------- -HAProxy is a multi-threaded, event-driven, non-blocking daemon. This means is +HAProxy is a multi-threaded, event-driven, non-blocking daemon. This means it uses event multiplexing to schedule all of its activities instead of relying on the system to schedule between multiple activities. Most of the time it runs as a single process, so the output of "ps aux" on a system will report only one @@ -128,7 +130,7 @@ followed by one of more letters, and possibly followed by one or multiple extra arguments. Without any option, HAProxy displays the help page with a reminder about supported options. Available options may vary slightly based on the operating system. A fair number of these options overlap with an equivalent one -if the "global" section. In this case, the command line always has precedence +in the "global" section. In this case, the command line always has precedence over the configuration file, so that the command line can be used to quickly enforce some settings without touching the configuration files. The current list of options is : @@ -230,6 +232,11 @@ list of options is : getaddrinfo() exist on various systems and cause anomalies that are difficult to troubleshoot. + -dI : enable the insecure fork. This is the equivalent of the + "insecure-fork-wanted" in the global section. It can be useful when running + all the reg-tests with ASAN which need to fork addr2line to resolve the + addresses. + -dK<class[,class]*> : dumps the list of registered keywords in each class. The list of classes is available with "-dKhelp". All classes may be dumped using "-dKall", otherwise a selection of those shown in the help can be @@ -407,16 +414,20 @@ list of options is : detect protocol violations from clients or servers. An optional argument can be used to specify a list of various trace configurations using ',' as separator. Each element activates one or all trace sources. Additionally, - level and verbosity can be optionally specified on each element using ':' as - inner separator with trace name. - - -m <limit> : limit the total allocatable memory to <limit> megabytes across - all processes. This may cause some connection refusals or some slowdowns - depending on the amount of memory needed for normal operations. This is - mostly used to force the processes to work in a constrained resource usage - scenario. It is important to note that the memory is not shared between - processes, so in a multi-process scenario, this value is first divided by - global.nbproc before forking. + level and verbosity can be optionally specified on each element using ':' + as inner separator with trace name. 
When entering an invalid verbosity or
+ level name, the list of available keywords is presented. For example it can
+ be convenient to pass 'help' for each field to consult the list first.
+
+ -m <limit> : limit allocatable memory, which is used to keep the process's data,
+ to <limit> megabytes. This may cause some connection refusals or some
+ slowdowns depending on the amount of memory needed for normal operations.
+ This is mostly used to force the haproxy process to work in a constrained
+ resource consumption scenario. It is important to note that the memory is
+ not shared between haproxy processes and a child process created via the fork()
+ system call inherits its parent's resource limits. So, in master-worker
+ mode this memory limit is separately applied to the master and its forked
+ worker process.
 -n <limit> : limits the per-process connection limit to <limit>. This is
 equivalent to the global section's keyword "maxconn". It has precedence
@@ -450,7 +461,7 @@ list of options is :
 -st <pid>* : send the "terminate" signal (SIGTERM) to older processes after
 boot completion to terminate them immediately without finishing what they
 were doing. <pid> is a list of pids to signal (one per argument). The list
- is ends on any option starting with a "-". It is not a problem if the list
+ ends on any option starting with a "-". It is not a problem if the list
 of pids is empty, so that it can be built on the fly based on the result of
 a command like "pidof" or "pgrep".
@@ -462,11 +473,16 @@ list of options is :
 -x <unix_socket> : connect to the specified socket and try to retrieve any
 listening sockets from the old process, and use them instead of trying to
 bind new ones. This is useful to avoid missing any new connection when
- reloading the configuration on Linux. The capability must be enable on the
- stats socket using "expose-fd listeners" in your configuration.
- In master-worker mode, the master will use this option upon a reload with
- the "sockpair@" syntax, which allows the master to connect directly to a
- worker without using stats socket declared in the configuration.
+ reloading the configuration on Linux.
+
+ Without master-worker mode, the capability must be enabled on the stats
+ socket using "expose-fd listeners" in your configuration.
+
+ In master-worker mode, it does not need "expose-fd listeners"; the master
+ will automatically use this option upon a reload with the "sockpair@"
+ syntax, which allows the master to connect directly to a worker without using
+ any stats socket declared in the configuration. If you want to disable this,
+ you can pass -x /dev/null.
 A safe way to start HAProxy from an init file consists in forcing the daemon
 mode, storing existing pids to a pid file and using this pid file to notify
@@ -1553,7 +1569,7 @@ Limitations do exist: the length of the whole buffer passed to the CLI must not
 be greater than tune.bfsize and the pattern "<<" must not be glued to the last
 word of the line.
-When entering a paylod while in interactive mode, the prompt will change from
+When entering a payload while in interactive mode, the prompt will change from
 "> " to "+ ".
 It is important to understand that when multiple haproxy processes are started
@@ -1586,7 +1602,7 @@ abort ssl crl-file <crlfile>
 See also "set ssl crl-file" and "commit ssl crl-file".
 add acl [@<ver>] <acl> <pattern>
- Add an entry into the acl <acl>. <acl> is the #<id> or the <file> returned by
+ Add an entry into the acl <acl>. <acl> is the #<id> or the <name> returned by
 "show acl".
This command does not verify if the entry already exists. Entries are added to the current version of the ACL, unless a specific version is specified with "@<ver>". This version number must have preliminary been @@ -1595,7 +1611,7 @@ add acl [@<ver>] <acl> <pattern> added with a specific version number will not match until a "commit acl" operation is performed on them. They may however be consulted using the "show acl @<ver>" command, and cleared using a "clear acl @<ver>" command. - This command cannot be used if the reference <acl> is a file also used with + This command cannot be used if the reference <acl> is a name also used with a map. In this case, the "add map" command must be used instead. add map [@<ver>] <map> <key> <value> @@ -1692,7 +1708,6 @@ add server <backend>/<server> [args]* - crt - disabled - downinter - - enabled - error-limit - fall - fastinter @@ -1788,15 +1803,15 @@ clear counters all and can only be issued on sockets configured for level "admin". clear acl [@<ver>] <acl> - Remove all entries from the acl <acl>. <acl> is the #<id> or the <file> - returned by "show acl". Note that if the reference <acl> is a file and is + Remove all entries from the acl <acl>. <acl> is the #<id> or the <name> + returned by "show acl". Note that if the reference <acl> is a name and is shared with a map, this map will be also cleared. By default only the current version of the ACL is cleared (the one being matched against). However it is possible to specify another version using '@' followed by this version. clear map [@<ver>] <map> - Remove all entries from the map <map>. <map> is the #<id> or the <file> - returned by "show map". Note that if the reference <map> is a file and is + Remove all entries from the map <map>. <map> is the #<id> or the <name> + returned by "show map". Note that if the reference <map> is a name and is shared with a acl, this acl will be also cleared. By default only the current version of the map is cleared (the one being matched against). However it is possible to specify another version using '@' followed by this version. @@ -1851,7 +1866,7 @@ clear table <table> [ data.<type> <operator> <value> ] | [ key <key> ] commit acl @<ver> <acl> Commit all changes made to version <ver> of ACL <acl>, and deletes all past - versions. <acl> is the #<id> or the <file> returned by "show acl". The + versions. <acl> is the #<id> or the <name> returned by "show acl". The version number must be between "curr_ver"+1 and "next_ver" as reported in "show acl". The contents to be committed to the ACL can be consulted with "show acl @<ver> <acl>" if desired. The specified version number has normally @@ -1861,12 +1876,12 @@ commit acl @<ver> <acl> and all entries in the new version to become visible. It is also possible to use this command to perform an atomic removal of all visible entries of an ACL by calling "prepare acl" first then committing without adding any - entries. This command cannot be used if the reference <acl> is a file also + entries. This command cannot be used if the reference <acl> is a name also used as a map. In this case, the "commit map" command must be used instead. commit map @<ver> <map> Commit all changes made to version <ver> of map <map>, and deletes all past - versions. <map> is the #<id> or the <file> returned by "show map". The + versions. <map> is the #<id> or the <name> returned by "show map". The version number must be between "curr_ver"+1 and "next_ver" as reported in "show map". 
The contents to be committed to the map can be consulted with "show map @<ver> <map>" if desired. The specified version number has normally @@ -1903,7 +1918,7 @@ commit ssl cert <filename> Commit a temporary SSL certificate update transaction. In the case of an existing certificate (in a "Used" state in "show ssl - cert"), generate every SSL contextes and SNIs it need, insert them, and + cert"), generate every SSL contexts and SNIs it needs, insert them, and remove the previous ones. Replace in memory the previous SSL certificates everywhere the <filename> was used in the configuration. Upon failure it doesn't remove or insert anything. Once the temporary transaction is @@ -1952,16 +1967,16 @@ debug dev <command> [args]* del acl <acl> [<key>|#<ref>] Delete all the acl entries from the acl <acl> corresponding to the key <key>. - <acl> is the #<id> or the <file> returned by "show acl". If the <ref> is used, + <acl> is the #<id> or the <name> returned by "show acl". If the <ref> is used, this command delete only the listed reference. The reference can be found with - listing the content of the acl. Note that if the reference <acl> is a file and + listing the content of the acl. Note that if the reference <acl> is a name and is shared with a map, the entry will be also deleted in the map. del map <map> [<key>|#<ref>] Delete all the map entries from the map <map> corresponding to the key <key>. - <map> is the #<id> or the <file> returned by "show map". If the <ref> is used, + <map> is the #<id> or the <name> returned by "show map". If the <ref> is used, this command delete only the listed reference. The reference can be found with - listing the content of the map. Note that if the reference <map> is a file and + listing the content of the map. Note that if the reference <map> is a name and is shared with a acl, the entry will be also deleted in the map. del ssl ca-file <cafile> @@ -1992,7 +2007,7 @@ del server <backend>/<server> Remove a server attached to the backend <backend>. All servers are eligible, except servers which are referenced by other configuration elements. The server must be put in maintenance mode prior to its deletion. The operation - is cancelled if the serveur still has active or idle connection or its + is cancelled if the server still has active or idle connection or its connection queue is not empty. disable agent <backend>/<server> @@ -2060,6 +2075,10 @@ disable server <backend>/<server> This command is restricted and can only be issued on sockets configured for level "admin". +dump stats-file + Generate a stats-file which can be used to preload haproxy counters values on + startup. See "Stats-file" section for more detail. + enable agent <backend>/<server> Resume auxiliary agent check that was temporarily stopped. @@ -2142,7 +2161,7 @@ expert-mode [on|off] get map <map> <value> get acl <acl> <value> Lookup the value <value> in the map <map> or in the ACL <acl>. <map> or <acl> - are the #<id> or the <file> returned by "show map" or "show acl". This command + are the #<id> or the <name> returned by "show map" or "show acl". This command returns all the matching patterns associated with this map. This is useful for debugging maps and ACLs. The output format is composed by one line par matching type. Each line is composed by space-delimited series of words. @@ -2219,7 +2238,7 @@ new ssl crl-file <crlfile> prepare acl <acl> Allocate a new version number in ACL <acl> for atomic replacement. <acl> is - the #<id> or the <file> returned by "show acl". 
The new version number is + the #<id> or the <name> returned by "show acl". The new version number is shown in response after "New version created:". This number will then be usable to prepare additions of new entries into the ACL which will then atomically replace the current ones once committed. It is reported as @@ -2227,12 +2246,12 @@ prepare acl <acl> unused versions will automatically be removed once a more recent version is committed. Version numbers are unsigned 32-bit values which wrap at the end, so care must be taken when comparing them in an external program. This - command cannot be used if the reference <acl> is a file also used as a map. + command cannot be used if the reference <acl> is a name also used as a map. In this case, the "prepare map" command must be used instead. prepare map <map> Allocate a new version number in map <map> for atomic replacement. <map> is - the #<id> or the <file> returned by "show map". The new version number is + the #<id> or the <name> returned by "show map". The new version number is shown in response after "New version created:". This number will then be usable to prepare additions of new entries into the map which will then atomically replace the current ones once committed. It is reported as @@ -2281,7 +2300,7 @@ set anon global-key <key> set map <map> [<key>|#<ref>] <value> Modify the value corresponding to each key <key> in a map <map>. <map> is the - #<id> or <file> returned by "show map". If the <ref> is used in place of + #<id> or <name> returned by "show map". If the <ref> is used in place of <key>, only the entry pointed by <ref> is changed. The new value is <value>. set maxconn frontend <frontend> <value> @@ -2547,7 +2566,7 @@ set weight <backend>/<server> <weight>[%] show acl [[@<ver>] <acl>] Dump info about acl converters. Without argument, the list of all available acls is returned. If a <acl> is specified, its contents are dumped. <acl> is - the #<id> or <file>. By default the current version of the ACL is shown (the + the #<id> or <name>. By default the current version of the ACL is shown (the version currently being matched against and reported as 'curr_ver' in the ACL list). It is possible to instead dump other versions by prepending '@<ver>' before the ACL's identifier. The version works as a filter and non-existing @@ -2930,7 +2949,7 @@ show libs show map [[@<ver>] <map>] Dump info about map converters. Without argument, the list of all available maps is returned. If a <map> is specified, its contents are dumped. <map> is - the #<id> or <file>. By default the current version of the map is shown (the + the #<id> or <name>. By default the current version of the map is shown (the version currently being matched against and reported as 'curr_ver' in the map list). It is possible to instead dump other versions by prepending '@<ver>' before the map's identifier. The version works as a filter and non-existing @@ -3068,14 +3087,22 @@ show resolvers [<resolvers section id>] too_big: too big response outdated: number of response arrived too late (after another name server) -show quic [oneline|full] [all] +show quic [<format>] [<filter>] Dump information on all active QUIC frontend connections. This command is restricted and can only be issued on sockets configured for levels "operator" - or "admin". An optional format can be specified as first argument to control - the verbosity. Currently supported values are "oneline" which is the default - if format is unspecified or "full". 
By default, connections on closing or
- draining state are not displayed. Use the extra argument "all" to include
- them in the output.
+ or "admin".
+
+ An optional argument can be specified to control the verbosity. Its value can
+ be interpreted in different ways. The first possibility is to use predefined
+ values, "oneline" for the default format and "full" to display all
+ information. Alternatively, a list of comma-delimited fields can be specified
+ to restrict output. Currently supported values are "tp", "sock", "pktns",
+ "cc" and "mux".
+
+ The final argument is used to restrict or extend the connection list. By
+ default, connections on closing or draining state are not displayed. Use the
+ extra argument "all" to include them in the output. It's also possible to
+ restrict to a single connection by specifying its hexadecimal address.
 show servers conn [<backend>]
 Dump the current and idle connections state of the servers belonging to the
@@ -3998,6 +4025,37 @@ update ssl ocsp-response <certfile>
 local tree, its contents will be displayed on the standard output. The format
 is the same as the one described in "show ssl ocsp-response".
+wait { -h | <delay> } [<condition> [<args>...]]
+ In its simplest form without any condition, this simply waits for the
+ requested delay before continuing. This can be used to collect metrics around
+ a specific interval.
+
+ With a condition and optional arguments, the command will wait for the
+ specified condition to be satisfied, to unrecoverably fail, or to remain
+ unsatisfied for the whole <delay> duration. The supported conditions are:
+
+ - srv-removable <proxy>/<server> : this will wait for the specified server to
+ be removable, i.e. be in maintenance and no longer have any connection on
+ it. Some conditions will never be accepted (e.g. not in maintenance) and
+ will cause the report of a specific error message indicating what condition
+ is not met. The server might even have been removed in parallel and no
+ longer exist. If everything is OK before the delay, a success is returned
+ and the operation is terminated.
+
+ The default unit for the delay is milliseconds, though other units are
+ accepted if suffixed with the usual timer units (us, ms, s, m, h, d). When
+ used with the 'socat' utility, do not forget to extend socat's close timeout
+ to cover the wait time. Passing "-h" as the first or second argument provides
+ the command's usage.
+ Example:
+ $ socat -t20 /path/to/socket - <<< "show activity; wait 10s; show activity"
+
+ $ socat -t5 /path/to/socket - <<< "
+ disable server px/srv1
+ shutdown sessions server px/srv1
+ wait 2s srv-removable px/srv1
+ del server px/srv1"
+
 9.4. Master CLI
 ---------------
@@ -4122,7 +4180,7 @@ reload
 return a reload status, once the reload was performed. Be careful with the
 timeout if a tool is used to parse it, it is only returned once the
 configuration is parsed and the new worker is forked. The "socat" command uses
- a timeout of 0.5s by default so it will quits before showing the message if
+ a timeout of 0.5s by default so it will quit before showing the message if
 the reload is too long. "ncat" does not have a timeout by default.
 When compiled with USE_SHM_OPEN=1, the reload command is also able to dump
 the startup-logs of the master.
@@ -4189,6 +4247,29 @@ show startup-logs
 Those messages are also dumped with the "reload" command.
+
+
+9.5. Stats-file
+---------------
+
+A so-called stats-file can be used to preload internal haproxy counters on
+process startup with non-null values.
Its main purpose is to preserve
+statistics for worker processes across reloads. Only an excerpt of all the
+exposed haproxy statistics is present in a stats-file as it only makes sense to
+preload metric-type values.
+
+For the moment, only proxy counters are supported in stats-file. This allows
+preloading values for frontends, backends, servers and listeners. However,
+only object instances with a non-empty GUID are stored in a stats-file. This
+guarantees that values will be preloaded for objects with matching type and GUID,
+even if other parameters differ.
+
+The purpose of the CLI command "dump stats-file" is to generate a stats-file. The format
+of the stats-file is internally defined and freely subject to future changes
+and extension. It is designed to be compatible at least across adjacent
+haproxy stable branch releases, but may require optional extra configuration
+when loading a stats-file into a process running on an older version.
+
+
 10. Tricks for easier configuration management
 ----------------------------------------------
@@ -4208,7 +4289,7 @@ using regular expressions involving the dollar symbol).
 Environment variables also make it convenient to write configurations which are
 expected to work on various sites where only the address changes. It can also
-permit to remove passwords from some configs. Example below where the the file
+permit to remove passwords from some configs. Example below where the file
 "site1.env" file is sourced by the init script upon startup :
 $ cat site1.env
@@ -4520,3 +4601,73 @@ A safe configuration will have :
 stats socket /var/run/haproxy.stat uid hatop gid hatop mode 600
+13.1. Linux capabilities support
+--------------------------------
+
+Since version v2.9 haproxy supports Linux capabilities. If the binary is
+compiled with USE_LINUX_CAP=1, it is able to preserve capabilities given in
+the 'setcap' keyword when switching from the root user to a non-root one.
+
+Since version v3.1 haproxy also checks if capabilities given in the 'setcap'
+keyword were set in its binary file Permitted set by the administrator
+(capget syscall). If this is the case, it performs the transition of these
+capabilities into its process Effective set (capset syscall), while running
+as a non-root user.
+
+This was done to avoid having haproxy start and run as root in the use cases
+which used to require it: transparent proxy mode, binding to privileged ports.
+
+The 'setcap' keyword supports the following network capabilities:
+- cap_net_admin: transparent proxying, binding a socket to a specific network
+ interface, using the set-mark action;
+- cap_net_raw (subset of cap_net_admin): transparent proxying;
+- cap_net_bind_service: binding a socket to a privileged port;
+- cap_sys_admin: creating a socket in a specific network namespace.
+
+Haproxy never does the transition of these capabilities from its Permitted set
+to the Effective set if they are not listed as a 'setcap' argument. See more
+information about the 'setcap' keyword and supported capabilities in chapter
+3.1 Process management and security in the Configuration guide.
+
+The administrator may add the needed capabilities to the haproxy binary file
+Permitted set with the following command:
+
+Example:
+ # setcap cap_net_admin,cap_net_bind_service=p /usr/local/sbin/haproxy
+
+Added capabilities will be seen in the process Permitted set after it starts.
+If the same capabilities are given as arguments to the 'setcap' keyword, they
+can also be seen in the process Effective set.
This can be checked with the following
+ command:
+
+Example:
+ # grep Cap /proc/<haproxy PID>/status
+ CapInh: 0000000000000000
+ CapPrm: 0000000000001400
+ CapEff: 0000000000001400
+ CapBnd: 000001ffffffffff
+ CapAmb: 0000000000000000
+
+See more details about setcap and capability sets in the Linux man pages
+(capabilities(7)).
+
+In some use cases like transparent proxying or creating a socket in a specific
+network namespace, the configuration file parser detects that cap_net_raw or
+cap_sys_admin or some other supported capabilities are needed. Then, during
+the initialization stage, the haproxy process checks if these capabilities can
+be put in its Effective set. If it's not possible due to capget or capset
+syscall failure (restrictions set on syscalls by some security modules like
+SELinux, Seccomp, etc), the process emits diagnostic warnings (when started
+with -dD).
+
+Due to the support of many different platforms with different system settings,
+it's impossible for the parser to deduce from the configuration file whether
+binding to privileged ports will be done. So, in the case of insufficient
+privileges (running as non-root), the process will terminate with only an alert
+message like the one below. It's up to the user to recheck the configuration
+and the haproxy binary's capability set.
+
+Example:
+ $ haproxy -dD -f haproxy.cfg
+ ...
+ [ALERT] (96797) : Binding [haproxy.cfg:36] for frontend fe: cannot bind socket (Permission denied) for [0.0.0.0:80]
+ [ALERT] (96797) : [haproxy.main()] Some protocols failed to start their listeners! Exiting.
diff --git a/doc/peers-v2.0.txt b/doc/peers-v2.0.txt
index 711c949..3b82369 100644
--- a/doc/peers-v2.0.txt
+++ b/doc/peers-v2.0.txt
@@ -227,6 +227,8 @@ bit
 22: gpt array
 23: gpc array
 24: gpc rate array
+ 25: glitch counter
+ 26: glitch rate
 
 d) Table Switch Message |