diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-28 09:35:11 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-28 09:35:11 +0000 |
commit | da76459dc21b5af2449af2d36eb95226cb186ce2 (patch) | |
tree | 542ebb3c1e796fac2742495b8437331727bbbfa0 /doc | |
parent | Initial commit. (diff) | |
download | haproxy-da76459dc21b5af2449af2d36eb95226cb186ce2.tar.xz haproxy-da76459dc21b5af2449af2d36eb95226cb186ce2.zip |
Adding upstream version 2.6.12.upstream/2.6.12upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc')
94 files changed, 57283 insertions, 0 deletions
diff --git a/doc/51Degrees-device-detection.txt b/doc/51Degrees-device-detection.txt new file mode 100644 index 0000000..8c69bb1 --- /dev/null +++ b/doc/51Degrees-device-detection.txt @@ -0,0 +1,152 @@ +51Degrees Device Detection +-------------------------- + +You can also include 51Degrees for inbuilt device detection enabling attributes +such as screen size (physical & pixels), supported input methods, release date, +hardware vendor and model, browser information, and device price among many +others. Such information can be used to improve the user experience of a web +site by tailoring the page content, layout and business processes to the +precise characteristics of the device. Such customisations improve profit by +making it easier for customers to get to the information or services they +need. Attributes of the device making a web request can be added to HTTP +headers as configurable parameters. + +In order to enable 51Degrees download the 51Degrees source code from the +official git repository : + + - either use the proven stable but frozen 3.2.10 version which + supports the Trie algorithm : + + git clone https://git.51Degrees.com/Device-Detection.git -b v3.2.10 + + - or use the new 3.2.12.12 version which continues to receive database + updates and supports a new Hash Trie algorithm, but which is not + compatible with older Trie databases : + + git clone https://github.com/51Degrees/Device-Detection.git -b v3.2.12 + +then run 'make' with USE_51DEGREES and 51DEGREES_SRC set. Both 51DEGREES_INC +and 51DEGREES_LIB may additionally be used to force specific different paths +for .o and .h, but will default to 51DEGREES_SRC. Make sure to replace +'51D_REPO_PATH' with the path to the 51Degrees repository. + +51Degrees provide 3 different detection algorithms: + + 1. Pattern - balances main memory usage and CPU. + 2. Trie - a very high performance detection solution which uses more main + memory than Pattern. + 3. Hash Trie - replaces Trie, 3x faster, 80% lower memory consumption and + tuning options. + +To make with 51Degrees Pattern algorithm use the following command line. + + $ make TARGET=<target> USE_51DEGREES=1 51DEGREES_SRC='51D_REPO_PATH'/src/pattern + +To use the 51Degrees Trie algorithm use the following command line. + + $ make TARGET=<target> USE_51DEGREES=1 51DEGREES_SRC='51D_REPO_PATH'/src/trie + +A data file containing information about devices, browsers, operating systems +and their associated signatures is then needed. 51Degrees provide a free +database with Github repo for this purpose. These free data files are located +in '51D_REPO_PATH'/data with the extensions .dat for Pattern data and .trie for +Trie data. Free Hash Trie data file can be obtained by signing up for a licence +key at https://51degrees.com/products/store/on-premise-device-detection. + +For HAProxy developers who need to verify that their changes didn't affect the +51Degrees implementation, a dummy library is provided in the +"addons/51degrees/dummy" directory. This does not function, but implements the +API such that the 51Degrees module can be used (but not return any meaningful +information). To test either Pattern or Hash Trie, build with: + + $ make TARGET=<target> USE_51DEGREES=1 51DEGREES_SRC=addons/51degrees/dummy/pattern +or + $ make TARGET=<target> USE_51DEGREES=1 51DEGREES_SRC=addons/51degrees/dummy/trie + +respectively. + +The configuration file needs to set the following parameters: + + global + 51degrees-data-file path to the Pattern or Trie data file + 51degrees-property-name-list list of 51Degrees properties to detect + 51degrees-property-separator separator to use between values + 51degrees-cache-size LRU-based cache size (disabled by default) + +The following is an example of the settings for Pattern. + + global + 51degrees-data-file '51D_REPO_PATH'/data/51Degrees-LiteV3.2.dat + 51degrees-property-name-list IsTablet DeviceType IsMobile + 51degrees-property-separator , + 51degrees-cache-size 10000 + +HAProxy needs a way to pass device information to the backend servers. This is +done by using the 51d converter or fetch method, which intercepts the HTTP +headers and creates some new headers. This is controlled in the frontend +http-in section. + +The following is an example which adds two new HTTP headers prefixed X-51D- + + frontend http-in + bind *:8081 + default_backend servers + http-request set-header X-51D-DeviceTypeMobileTablet %[51d.all(DeviceType,IsMobile,IsTablet)] + http-request set-header X-51D-Tablet %[51d.all(IsTablet)] + +Here, two headers are created with 51Degrees data, X-51D-DeviceTypeMobileTablet +and X-51D-Tablet. Any number of headers can be created this way and can be +named anything. 51d.all( ) invokes the 51degrees fetch. It can be passed up to +five property names of values to return. Values will be returned in the same +order, separated by the 51-degrees-property-separator configured earlier. If a +property name can't be found the value 'NoData' is returned instead. + +In addition to the device properties three additional properties related to the +validity of the result can be returned when used with the Pattern method. The +following example shows how Method, Difference and Rank could be included as one +new HTTP header X-51D-Stats. + + frontend http-in + ... + http-request set-header X-51D-Stats %[51d.all(Method,Difference,Rank)] + +These values indicate how confident 51Degrees is in the result that that was +returned. More information is available on the 51Degrees web site at: + + https://51degrees.com/support/documentation/pattern + +The above 51d.all fetch method uses all available HTTP headers for detection. A +modest performance improvement can be obtained by only passing one HTTP header +to the detection method with the 51d.single converter. The following example +uses the User-Agent HTTP header only for detection. + + frontend http-in + ... + http-request set-header X-51D-DeviceTypeMobileTablet %[req.fhdr(User-Agent),51d.single(DeviceType,IsMobile,IsTablet)] + +Any HTTP header could be used inplace of User-Agent by changing the parameter +provided to req.fhdr. + +When compiled to use the Trie detection method the trie format data file needs +to be provided. Changing the extension of the data file from dat to trie will +use the correct data. + + global + 51degrees-data-file '51D_REPO_PATH'/data/51Degrees-LiteV3.2.trie + +When used with Trie the Method, Difference and Rank properties are not +available. + +The free Lite data file contains information about screen size in pixels and +whether the device is a mobile. A full list of available properties is located +on the 51Degrees web site at: + + https://51degrees.com/resources/property-dictionary + +Some properties are only available in the paid for Premium and Enterprise +versions of 51Degrees. These data sets not only contain more properties but +are updated weekly and daily and contain signatures for 100,000s of different +device combinations. For more information see the data options comparison web +page: + + https://51degrees.com/compare-data-options diff --git a/doc/DeviceAtlas-device-detection.txt b/doc/DeviceAtlas-device-detection.txt new file mode 100644 index 0000000..b600918 --- /dev/null +++ b/doc/DeviceAtlas-device-detection.txt @@ -0,0 +1,82 @@ +DeviceAtlas Device Detection +---------------------------- + +In order to add DeviceAtlas Device Detection support, you would need to download +the API source code from https://deviceatlas.com/deviceatlas-haproxy-module. +The build supports the USE_PCRE and USE_PCRE2 options. Once extracted : + + $ make TARGET=<target> USE_PCRE=1 (or USE_PCRE2=1) USE_DEVICEATLAS=1 DEVICEATLAS_SRC=<path to the API root folder> + +Optionally DEVICEATLAS_INC and DEVICEATLAS_LIB may be set to override the path +to the include files and libraries respectively if they're not in the source +directory. However, if the API had been installed beforehand, DEVICEATLAS_SRC +can be omitted. Note that the DeviceAtlas C API version supported is the 2.4.0 +at minimum. + +For HAProxy developers who need to verify that their changes didn't accidentally +break the DeviceAtlas code, it is possible to build a dummy library provided in +the addons/deviceatlas/dummy directory and to use it as an alternative for the +full library. This will not provide the full functionalities, it will just allow +haproxy to start with a deviceatlas configuration, which generally is enough to +validate API changes : + + $ make TARGET=<target> USE_PCRE=1 USE_DEVICEATLAS=1 DEVICEATLAS_SRC=$PWD/addons/deviceatlas/dummy + +These are supported DeviceAtlas directives (see doc/configuration.txt) : + - deviceatlas-json-file <path to the DeviceAtlas JSON data file>. + - deviceatlas-log-level <number> (0 to 3, level of information returned by + the API, 0 by default). + - deviceatlas-property-separator <character> (character used to separate the + properties produced by the API, | by default). + +Sample configuration : + + global + deviceatlas-json-file <path to json file> + + ... + frontend + bind *:8881 + default_backend servers + +There are two distinct methods available, one which leverages all HTTP headers +and one which uses only a single HTTP header for the detection. The former +method is highly recommended and more accurate. There are several possible use +cases. + +# To transmit the DeviceAtlas data downstream to the target application + +All HTTP headers via the sample / fetch + + http-request set-header X-DeviceAtlas-Data %[da-csv-fetch(primaryHardwareType,osName,osVersion,browserName,browserVersion,browserRenderingEngine)] + +Single HTTP header (e.g. User-Agent) via the converter + + http-request set-header X-DeviceAtlas-Data %[req.fhdr(User-Agent),da-csv-conv(primaryHardwareType,osName,osVersion,browserName,browserVersion,browserRenderingEngine)] + +# Mobile content switching with ACL + +All HTTP headers + + acl is_mobile da-csv-fetch(mobileDevice) 1 + +Single HTTP header + + acl device_type_tablet req.fhdr(User-Agent),da-csv-conv(primaryHardwareType) "Tablet" + +Optionally a JSON download scheduler is provided to allow a data file being +fetched automatically in a daily basis without restarting HAProxy : + + $ cd addons/deviceatlas && make [DEVICEATLAS_SRC=<path to the API root folder>] + +Similarly, if the DeviceAtlas API is installed, DEVICEATLAS_SRC can be omitted. + + $ ./dadwsch -u JSON data file URL e.g. "https://deviceatlas.com/getJSON?licencekey=<your licence key>&format=zip&data=my&index=web" \ + [-p download directory path /tmp by default] \ + [-d scheduled hour of download, hour when the service is launched by default] + +Noted it needs to be started before HAProxy. + + +Please find more information about DeviceAtlas and the detection methods at +https://deviceatlas.com/resources . diff --git a/doc/SOCKS4.protocol.txt b/doc/SOCKS4.protocol.txt new file mode 100644 index 0000000..06aee8a --- /dev/null +++ b/doc/SOCKS4.protocol.txt @@ -0,0 +1 @@ +Please reference to "https://www.openssh.com/txt/socks4.protocol".
\ No newline at end of file diff --git a/doc/SPOE.txt b/doc/SPOE.txt new file mode 100644 index 0000000..9a4cc4b --- /dev/null +++ b/doc/SPOE.txt @@ -0,0 +1,1255 @@ + ----------------------------------------------- + Stream Processing Offload Engine (SPOE) + Version 1.2 + ( Last update: 2020-06-13 ) + ----------------------------------------------- + Author : Christopher Faulet + Contact : cfaulet at haproxy dot com + + +SUMMARY +-------- + + 0. Terms + 1. Introduction + 2. SPOE configuration + 2.1. SPOE scope + 2.2. "spoe-agent" section + 2.3. "spoe-message" section + 2.4. "spoe-group" section + 2.5. Example + 3. SPOP specification + 3.1. Data types + 3.2. Frames + 3.2.1. Frame capabilities + 3.2.2. Frame types overview + 3.2.3. Workflow + 3.2.4. Frame: HAPROXY-HELLO + 3.2.5. Frame: AGENT-HELLO + 3.2.6. Frame: NOTIFY + 3.2.7. Frame: ACK + 3.2.8. Frame: HAPROXY-DISCONNECT + 3.2.9. Frame: AGENT-DISCONNECT + 3.3. Events & messages + 3.4. Actions + 3.5. Errors & timeouts + 4. Logging + + +0. Terms +--------- + +* SPOE : Stream Processing Offload Engine. + + A SPOE is a filter talking to servers managed by a SPOA to offload the + stream processing. An engine is attached to a proxy. A proxy can have + several engines. Each engine is linked to an agent and only one. + +* SPOA : Stream Processing Offload Agent. + + A SPOA is a service that will receive info from a SPOE to offload the + stream processing. An agent manages several servers. It uses a backend to + reference all of them. By extension, these servers can also be called + agents. + +* SPOP : Stream Processing Offload Protocol, used by SPOEs to talk to SPOA + servers. + + This protocol is used by engines to talk to agents. It is an in-house + binary protocol described in this documentation. + + +1. Introduction +---------------- + +SPOE is a feature introduced in HAProxy 1.7. It makes possible the +communication with external components to retrieve some info. The idea started +with the problems caused by most ldap libs not working fine in event-driven +systems (often at least the connect() is blocking). So, it is hard to properly +implement Single Sign On solution (SSO) in HAProxy. The SPOE will ease this +kind of processing, or we hope so. + +Now, the aim of SPOE is to allow any kind of offloading on the streams. First +releases won't do lot of things. As we will see, there are few handled events +and even less actions supported. Actually, for now, the SPOE can offload the +processing before "tcp-request content", "tcp-response content", "http-request" +and "http-response" rules. And it only supports variables definition. But, in +spite of these limited features, we can easily imagine to implement SSO +solution, ip reputation or ip geolocation services. + +Some example implementations in various languages are linked to from the +HAProxy Wiki page dedicated to this mechanism: + + https://github.com/haproxy/wiki/wiki/SPOE:-Stream-Processing-Offloading-Engine + +2. SPOE configuration +---------------------- + +Because SPOE is implemented as a filter, To use it, you must declare a "filter +spoe" line in a proxy section (frontend/backend/listen) : + + frontend my-front + ... + filter spoe [engine <name>] config <file> + ... + +The "config" parameter is mandatory. It specififies the SPOE configuration +file. The engine name is optional. It can be set to declare the scope to use in +the SPOE configuration. So it is possible to use the same SPOE configuration +for several engines. If no name is provided, the SPOE configuration must not +contain any scope directive. + +We use a separate configuration file on purpose. By commenting SPOE filter +line, you completely disable the feature, including the parsing of sections +reserved to SPOE. This is also a way to keep the HAProxy configuration clean. + +A SPOE configuration file must contains, at least, the SPOA configuration +("spoe-agent" section) and SPOE messages/groups ("spoe-message" or "spoe-group" +sections) attached to this agent. + +IMPORTANT : The configuration of a SPOE filter must be located in a dedicated +file. But the backend used by a SPOA must be declared in HAProxy configuration +file. + +2.1. SPOE scope +------------------------- + +If you specify an engine name on the SPOE filter line, then you need to define +scope in the SPOE configuration with the same name. You can have several SPOE +scope in the same file. In each scope, you must define one and only one +"spoe-agent" section to configure the SPOA linked to your SPOE and several +"spoe-message" and "spoe-group" sections to describe, respectively, messages and +group of messages sent to servers mananged by your SPOA. + +A SPOE scope starts with this kind of line : + + [<name>] + +where <name> is the same engine name specified on the SPOE filter line. The +scope ends when the file ends or when another scope is found. + + Example : + [my-first-engine] + spoe-agent my-agent + ... + spoe-message msg1 + ... + spoe-message msg2 + ... + spoe-group grp1 + ... + spoe-group grp2 + ... + + [my-second-engine] + ... + +If no engine name is provided on the SPOE filter line, no SPOE scope must be +found in the SPOE configuration file. All the file is considered to be in the +same anonymous and implicit scope. + +The engine name must be uniq for a proxy. If no engine name is provided on the +SPOE filter line, the SPOE agent name is used by default. + +2.2. "spoe-agent" section +-------------------------- + +For each engine, you must define one and only one "spoe-agent" section. In this +section, you will declare SPOE messages and the backend you will use. You will +also set timeouts and options to customize your agent's behaviour. + + +spoe-agent <name> + Create a new SPOA with the name <name>. It must have one and only one + "spoe-agent" definition by SPOE scope. + + Arguments : + <name> is the name of the agent section. + + following keywords are supported : + - groups + - log + - maxconnrate + - maxerrrate + - max-frame-size + - max-waiting-frames + - messages + - [no] option async + - [no] option dontlog-normal + - [no] option pipelining + - [no] option send-frag-payload + - option continue-on-error + - option force-set-var + - option set-on-error + - option set-process-time + - option set-total-time + - option var-prefix + - register-var-names + - timeout hello|idle|processing + - use-backend + + +groups <grp-name> ... + Declare the list of SPOE groups that an agent will handle. + + Arguments : + <grp-name> is the name of a SPOE group. + + Groups declared here must be found in the same engine scope, else an error is + triggered during the configuration parsing. You can have many "groups" lines. + + See also: "spoe-group" section. + + +log global +log <address> [len <length>] [format <format>] <facility> [<level> [<minlevel>]] +no log + Enable per-instance logging of events and traffic. + + Prefix : + no should be used when the logger list must be flushed. + + See the HAProxy Configuration Manual for details about this option. + +maxconnrate <number> + Set the maximum number of connections per second to <number>. The SPOE will + stop to open new connections if the maximum is reached and will wait to + acquire an existing one. So it is important to set "timeout hello" to a + relatively small value. + + +maxerrrate <number> + Set the maximum number of errors per second to <number>. The SPOE will stop + its processing if the maximum is reached. + + +max-frame-size <number> + Set the maximum allowed size for frames exchanged between HAProxy and SPOA. + It must be in the range [256, tune.bufsize-4] (4 bytes are reserved for the + frame length). By default, it is set to (tune.bufsize-4). + +max-waiting-frames <number> + Set the maximum number of frames waiting for an acknowledgement on the same + connection. This value is only used when the pipelinied or asynchronus + exchanges between HAProxy and SPOA are enabled. By default, it is set to 20. + +messages <msg-name> ... + Declare the list of SPOE messages that an agent will handle. + + Arguments : + <msg-name> is the name of a SPOE message. + + Messages declared here must be found in the same engine scope, else an error + is triggered during the configuration parsing. You can have many "messages" + lines. + + See also: "spoe-message" section. + + +option async +no option async + Enable or disable the support of asynchronus exchanges between HAProxy and + SPOA. By default, this option is enabled. + + +option continue-on-error + Do not stop the events processing when an error occurred on a stream. + + By default, for a specific stream, when an abnormal/unexpected error occurs, + the SPOE is disabled for all the transaction. So if you have several events + configured, such error on an event will disabled all following. For TCP + streams, this will disable the SPOE for the whole session. For HTTP streams, + this will disable it for the transaction (request and response). + + When set, this option bypass this behaviour and only the current event will + be ignored. + + +option dontlog-normal +no option dontlog-normal + Enable or disable logging of normal, successful processing. + + Arguments : none + + See also: "log" and section 4 about logging. + + +option force-set-var + By default, SPOE filter only register already known variables (mainly from + parsing of the configuration), and process-wide variables (those of scope + "proc") cannot be created. If you want that haproxy trusts the agent and + registers all variables (ex: can be useful for LUA workload), activate this + option. + + Caution : this option opens to a variety of attacks such as a rogue SPOA that + asks to register too many variables. + + +option pipelining +no option pipelining + Enable or disable the support of pipelined exchanges between HAProxy and + SPOA. By default, this option is enabled. + + +option send-frag-payload +no option send-frag-payload + Enable or disable the sending of fragmented payload to SPOA. By default, this + option is enabled. + + +option set-on-error <var name> + Define the variable to set when an error occurred during an event processing. + + Arguments : + + <var name> is the variable name, without the scope. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + + This variable will only be set when an error occurred in the scope of the + transaction. As for all other variables define by the SPOE, it will be + prefixed. So, if your variable name is "error" and your prefix is + "my_spoe_pfx", the variable will be "txn.my_spoe_pfx.error". + + When set, the variable is an integer representing the error reason. For values + under 256, it represents an error coming from the engine. Below 256, it + reports a SPOP error. In this case, to retrieve the right SPOP status code, + you must remove 256 to this value. Here are possible values: + + * 1 a timeout occurred during the event processing. + + * 2 an error was triggered during the resources allocation. + + * 3 the frame payload exceeds the frame size and it cannot be + fragmented. + + * 4 the fragmentation of a payload is aborted. + + * 5 The frame processing has been interrupted by HAProxy. + + * 255 an unknown error occurred during the event processing. + + * 256+N a SPOP error occurred during the event processing (see section + "Errors & timeouts"). + + Note that if "option continue-on-error" is set, the variable is not + automatically removed between events processing. + + See also: "option continue-on-error", "option var-prefix". + + +option set-process-time <var name> + Define the variable to set to report the processing time of the last event or + group. + + Arguments : + + <var name> is the variable name, without the scope. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + + This variable will be set in the scope of the transaction. As for all other + variables define by the SPOE, it will be prefixed. So, if your variable name + is "process_time" and your prefix is "my_spoe_pfx", the variable will be + "txn.my_spoe_pfx.process_time". + + When set, the variable is an integer representing the delay to process the + event or the group, in milliseconds. From the stream point of view, it is the + latency added by the SPOE processing for the last handled event or group. + + If several events or groups are processed for the same stream, this value + will be overrideen. + + See also: "option set-total-time". + + +option set-total-time <var name> + Define the variable to set to report the total processing time SPOE for a + stream. + + Arguments : + + <var name> is the variable name, without the scope. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + + This variable will be set in the scope of the transaction. As for all other + variables define by the SPOE, it will be prefixed. So, if your variable name + is "total_time" and your prefix is "my_spoe_pfx", the variable will be + "txn.my_spoe_pfx.total_time". + + When set, the variable is an integer representing the sum of processing times + for a stream, in milliseconds. From the stream point of view, it is the + latency added by the SPOE processing. + + If several events or groups are processed for the same stream, this value + will be updated. + + See also: "option set-process-time". + + +option var-prefix <prefix> + Define the prefix used when variables are set by an agent. + + Arguments : + + <prefix> is the prefix used to limit the scope of variables set by an + agent. + + To avoid conflict with other variables defined by HAProxy, all variables + names will be prefixed. By default, the "spoe-agent" name is used. This + option can be used to customize it. + + The prefix will be added between the variable scope and its name, separated + by a '.'. It may only contain characters 'a-z', 'A-Z', '0-9', '.' and '_', as + for variables name. In HAProxy configuration, you need to use this prefix as + a part of the variables name. For example, if an agent define the variable + "myvar" in the "txn" scope, with the prefix "my_spoe_pfx", then you should + use "txn.my_spoe_pfx.myvar" name in your HAProxy configuration. + + By default, an agent will never set new variables at runtime: It can only set + new value for existing ones. If you want a different behaviour, see + force-set-var option and register-var-names directive. + +register-var-names <var name> ... + Register some variable names. By default, an agent will not be allowed to set + new variables at runtime. This rule can be totally relaxed by setting the + option "force-set-var". If you know all the variables you will need, this + directive is a good way to register them without letting an agent doing what + it want. This is only required if these variables are not referenced anywhere + in the HAProxy configuration or the SPOE one. + + Arguments: + <var name> is a variable name without the scope. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + + The prefix will be automatically added during the registration. You can have + many "register-var-names" lines. + + See also: "option force-set-var", "option var-prefix". + +timeout hello <timeout> + Set the maximum time to wait for an agent to receive the AGENT-HELLO frame. + It is applied on the stream that handle the connection with the agent. + + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + This timeout is an applicative timeout. It differ from "timeout connect" + defined on backends. + + +timeout idle <timeout> + Set the maximum time to wait for an agent to close an idle connection. It is + applied on the stream that handle the connection with the agent. + + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + +timeout processing <timeout> + Set the maximum time to wait for a stream to process an event, i.e to acquire + a stream to talk with an agent, to encode all messages, to send the NOTIFY + frame, to receive the corresponding acknowledgement and to process all + actions. It is applied on the stream that handle the client and the server + sessions. + + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + +use-backend <backend> + Specify the backend to use. It must be defined. + + Arguments : + <backend> is the name of a valid "backend" section. + + +2.3. "spoe-message" section +---------------------------- + +To offload the stream processing, SPOE will send messages with specific +information at a specific moment in the stream life and will wait for +corresponding replies to know what to do. + + +spoe-message <name> + Create a new SPOE message with the name <name>. + + Arguments : + <name> is the name of the SPOE message. + + Here you define a message that can be referenced in a "spoe-agent" + section. Following keywords are supported : + - acl + - args + - event + + See also: "spoe-agent" section. + + +acl <aclname> <criterion> [flags] [operator] <value> ... + Declare or complete an access list. + + See section 7 about ACL usage in the HAProxy Configuration Manual. + + +args [name=]<sample> ... + Define arguments passed into the SPOE message. + + Arguments : + <sample> is a sample expression. + + When the message is processed, if a sample expression is not available, it is + set to NULL. Arguments are processed in their declaration order and added in + the message in that order. It is possible to declare named arguments. + + For example: + args frontend=fe_id src dst + + +event <name> [ { if | unless } <condition> ] + Set the event that triggers sending of the message. It may optionally be + followed by an ACL-based condition, in which case it will only be evaluated + if the condition is true. A SPOE message can only be sent on one event. If + several events are defined, only the last one is considered. + + ACL-based conditions are executed in the context of the stream that handle + the client and the server connections. + + Arguments : + <name> is the event name. + <condition> is a standard ACL-based condition. + + Supported events are: + - on-client-session + - on-server-session + - on-frontend-tcp-request + - on-backend-tcp-request + - on-tcp-response + - on-frontend-http-request + - on-backend-http-request + - on-http-response + + See section "Events & Messages" for more details about supported events. + See section 7 about ACL usage in the HAProxy Configuration Manual. + +2.4. "spoe-group" section +-------------------------- + +This section can be used to declare a group of SPOE messages. Unlike messages +referenced in a "spoe-agent" section, messages inside a group are not sent on a +specific event. The sending must be triggered by TCP or HTTP rules, from the +HAProxy configuration. + + +spoe-group <name> + Create a new SPOE group with the name <name>. + + Arguments : + <name> is the name of the SPOE group. + + Here you define a group of SPOE messages that can be referenced in a + "spoe-agent" section. Following keywords are supported : + - messages + + See also: "spoe-agent" and "spoe-message" sections. + + +messages <msg-name> ... + Declare the list of SPOE messages belonging to the group. + + Arguments : + <msg-name> is the name of a SPOE message. + + Messages declared here must be found in the same engine scope, else an error + is triggered during the configuration parsing. Furthermore, a message belongs + at most to a group. You can have many "messages" lines. + + See also: "spoe-message" section. + + +2.5. Example +------------- + +Here is a simple but complete example that sends client-ip address to a ip +reputation service. This service can set the variable "ip_score" which is an +integer between 0 and 100, indicating its reputation (100 means totally safe +and 0 a blacklisted IP with no doubt). + + ### + ### HAProxy configuration + frontend www + mode http + bind *:80 + + filter spoe engine ip-reputation config spoe-ip-reputation.conf + + # Reject connection if the IP reputation is under 20 + tcp-request content reject if { var(sess.iprep.ip_score) -m int lt 20 } + + default_backend http-servers + + backend http-servers + mode http + server http A.B.C.D:80 + + backend iprep-servers + mode tcp + balance roundrobin + + timeout connect 5s # greater than hello timeout + timeout server 3m # greater than idle timeout + + server iprep1 A1.B1.C1.D1:12345 + server iprep2 A2.B2.C2.D2:12345 + + #### + ### spoe-ip-reputation.conf + [ip-reputation] + + spoe-agent iprep-agent + messages get-ip-reputation + + option var-prefix iprep + + timeout hello 2s + timeout idle 2m + timeout processing 10ms + + use-backend iprep-servers + + spoe-message get-ip-reputation + args ip=src + event on-client-session if ! { src -f /etc/haproxy/whitelist.lst } + + +3. SPOP specification +---------------------- + +3.1. Data types +---------------- + +Here is the bytewise representation of typed data: + + TYPED-DATA : <TYPE:4 bits><FLAGS:4 bits><DATA> + +Supported types and their representation are: + + TYPE | ID | DESCRIPTION + -----------------------------+-----+---------------------------------- + NULL | 0 | NULL : <0> + Boolean | 1 | BOOL : <1+FLAG> + 32bits signed integer | 2 | INT32 : <2><VALUE:varint> + 32bits unsigned integer | 3 | UINT32 : <3><VALUE:varint> + 64bits signed integer | 4 | INT64 : <4><VALUE:varint> + 32bits unsigned integer | 5 | UNIT64 : <5><VALUE:varint> + IPV4 | 6 | IPV4 : <6><STRUCT IN_ADDR:4 bytes> + IPV6 | 7 | IPV6 : <7><STRUCT IN_ADDR6:16 bytes> + String | 8 | STRING : <8><LENGTH:varint><BYTES> + Binary | 9 | BINARY : <9><LENGTH:varint><BYTES> + 10 -> 15 unused/reserved | - | - + -----------------------------+-----+---------------------------------- + +Variable-length integer (varint) are encoded using Peers encoding: + + + 0 <= X < 240 : 1 byte (7.875 bits) [ XXXX XXXX ] + 240 <= X < 2288 : 2 bytes (11 bits) [ 1111 XXXX ] [ 0XXX XXXX ] + 2288 <= X < 264432 : 3 bytes (18 bits) [ 1111 XXXX ] [ 1XXX XXXX ] [ 0XXX XXXX ] + 264432 <= X < 33818864 : 4 bytes (25 bits) [ 1111 XXXX ] [ 1XXX XXXX ]*2 [ 0XXX XXXX ] + 33818864 <= X < 4328786160 : 5 bytes (32 bits) [ 1111 XXXX ] [ 1XXX XXXX ]*3 [ 0XXX XXXX ] + ... + +For booleans, the value (true or false) is the first bit in the FLAGS +bitfield. if this bit is set to 0, then the boolean is evaluated as false, +otherwise, the boolean is evaluated as true. + +3.2. Frames +------------ + +Exchange between HAProxy and agents are made using FRAME packets. All frames +must be prefixed with their size encoded on 4 bytes in network byte order: + + <FRAME-LENGTH:4 bytes> <FRAME> + +A frame always starts with its type, on one byte, followed by metadata +containing flags, on 4 bytes and a two variable-length integer representing the +stream identifier and the frame identifier inside the stream: + + FRAME : <FRAME-TYPE:1 byte> <METADATA> <FRAME-PAYLOAD> + METADATA : <FLAGS:4 bytes> <STREAM-ID:varint> <FRAME-ID:varint> + +Then comes the frame payload. Depending on the frame type, the payload can be +of three types: a simple key/value list, a list of messages or a list of +actions. + + FRAME-PAYLOAD : <LIST-OF-MESSAGES> | <LIST-OF-ACTIONS> | <KV-LIST> + + LIST-OF-MESSAGES : [ <MESSAGE-NAME> <NB-ARGS:1 byte> <KV-LIST> ... ] + MESSAGE-NAME : <STRING> + + LIST-OF-ACTIONS : [ <ACTION-TYPE:1 byte> <NB-ARGS:1 byte> <ACTION-ARGS> ... ] + ACTION-ARGS : [ <TYPED-DATA>... ] + + KV-LIST : [ <KV-NAME> <KV-VALUE> ... ] + KV-NAME : <STRING> + KV-VALUE : <TYPED-DATA> + + FLAGS : + + Flags are a 32 bits field. They are encoded on 4 bytes in network byte + order, where the bit 0 is the LSB. + + 0 1 2-31 + +---+---+----------+ + | | A | | + | F | B | | + | I | O | RESERVED | + | N | R | | + | | T | | + +---+---+----------+ + + FIN: Indicates that this is the final payload fragment. The first fragment + may also be the final fragment. + + ABORT: Indicates that the processing of the current frame must be + cancelled. This bit should be set on frames with a fragmented + payload. It can be ignore for frames with an unfragemnted + payload. When it is set, the FIN bit must also be set. + + +Frames cannot exceed a maximum size negotiated between HAProxy and agents +during the HELLO handshake. Most of time, payload will be small enough to send +it in one frame. But when supported by the peer, it will be possible to +fragment huge payload on many frames. This ability is announced during the +HELLO handshake and it can be asynmetric (supported by agents but not by +HAProxy or the opposite). The following rules apply to fragmentation: + + * An unfragemnted payload consists of a single frame with the FIN bit set. + + * A fragemented payload consists of several frames with the FIN bit clear and + terminated by a single frame with the FIN bit set. All these frames must + share the same STREAM-ID and FRAME-ID. The first frame must set the right + FRAME-TYPE (e.g, NOTIFY). The following frames must have an unset type (0). + +Beside the support of fragmented payload by a peer, some payload must not be +fragmented. See below for details. + +IMPORTANT : The maximum size supported by peers for a frame must be greater +than or equal to 256 bytes. + +3.2.1. Frame capabilities +-------------------------- + +Here are the list of official capabilities that HAProxy and agents can support: + + * fragmentation: This is the ability for a peer to support fragmented + payload in received frames. This is an asymmectical + capability, it only concerns the peer that announces + it. This is the responsibility to the other peer to use it + or not. + + * pipelining: This is the ability for a peer to decouple NOTIFY and ACK + frames. This is a symmectical capability. To be used, it must + be supported by HAProxy and agents. Unlike HTTP pipelining, the + ACK frames can be send in any order, but always on the same TCP + connection used for the corresponding NOTIFY frame. + + * async: This ability is similar to the pipelining, but here any TCP + connection established between HAProxy and the agent can be used to + send ACK frames. if an agent accepts connections from multiple + HAProxy, it can use the "engine-id" value to group TCP + connections. See details about HAPROXY-HELLO frame. + +Unsupported or unknown capabilities are silently ignored, when possible. + +NOTE: HAProxy does not support the fragmentation for now. This means it is not + able to handle fragmented frames. However, if an agent announces the + fragmentation support, HAProxy may choose to send fragemented frames. + +3.2.2. Frame types overview +---------------------------- + +Here are types of frame supported by SPOE. Frames sent by HAProxy come first, +then frames sent by agents : + + TYPE | ID | DESCRIPTION + -----------------------------+-----+------------------------------------- + UNSET | 0 | Used for all frames but the first when a + | | payload is fragmented. + -----------------------------+-----+------------------------------------- + HAPROXY-HELLO | 1 | Sent by HAProxy when it opens a + | | connection on an agent. + | | + HAPROXY-DISCONNECT | 2 | Sent by HAProxy when it want to close + | | the connection or in reply to an + | | AGENT-DISCONNECT frame + | | + NOTIFY | 3 | Sent by HAProxy to pass information + | | to an agent + -----------------------------+-----+------------------------------------- + AGENT-HELLO | 101 | Reply to a HAPROXY-HELLO frame, when + | | the connection is established + | | + AGENT-DISCONNECT | 102 | Sent by an agent just before closing + | | the connection + | | + ACK | 103 | Sent to acknowledge a NOTIFY frame + -----------------------------+-----+------------------------------------- + +Unknown frames may be silently skipped. + +3.2.3. Workflow +---------------- + + * Successful HELLO handshake: + + HAPROXY AGENT SRV + | HAPROXY-HELLO | + | (healthcheck: false) | + | --------------------------> | + | | + | AGENT-HELLO | + | <-------------------------- | + | | + + * Successful HELLO healthcheck: + + HAPROXY AGENT SRV + | HAPROXY-HELLO | + | (healthcheck: true) | + | --------------------------> | + | | + | AGENT-HELLO + close() | + | <-------------------------- | + | | + + + * Error encountered by agent during the HELLO handshake: + + HAPROXY AGENT SRV + | HAPROXY-HELLO | + | --------------------------> | + | | + | DISCONNECT + close() | + | <-------------------------- | + | | + + * Error encountered by HAProxy during the HELLO handshake: + + HAPROXY AGENT SRV + | HAPROXY-HELLO | + | --------------------------> | + | | + | AGENT-HELLO | + | <-------------------------- | + | | + | DISCONNECT | + | --------------------------> | + | | + | DISCONNECT + close() | + | <-------------------------- | + | | + + * Notify / Ack exchange (unfragmented payload): + + HAPROXY AGENT SRV + | NOTIFY | + | --------------------------> | + | | + | ACK | + | <-------------------------- | + | | + + * Notify / Ack exchange (fragmented payload): + + HAPROXY AGENT SRV + | NOTIFY (frag 1) | + | --------------------------> | + | | + | UNSET (frag 2) | + | --------------------------> | + | ... | + | UNSET (frag N) | + | --------------------------> | + | | + | ACK | + | <-------------------------- | + | | + + * Aborted fragmentation of a NOTIFY frame: + + HAPROXY AGENT SRV + | ... | + | UNSET (frag X) | + | --------------------------> | + | | + | ACK/ABORT | + | <-------------------------- | + | | + | UNSET (frag X+1) | + | -----------X | + | | + | | + + * Connection closed by haproxy: + + HAPROXY AGENT SRV + | DISCONNECT | + | --------------------------> | + | | + | DISCONNECT + close() | + | <-------------------------- | + | | + + * Connection closed by agent: + + HAPROXY AGENT SRV + | DISCONNECT + close() | + | <-------------------------- | + | | + +3.2.4. Frame: HAPROXY-HELLO +---------------------------- + +This frame is the first one exchanged between HAProxy and an agent, when the +connection is established. The payload of this frame is a KV-LIST. It cannot be +fragmented. STREAM-ID and FRAME-ID are must be set 0. + +Following items are mandatory in the KV-LIST: + + * "supported-versions" <STRING> + + Last SPOP major versions supported by HAProxy. It is a comma-separated list + of versions, following the format "Major.Minor". Spaces must be ignored, if + any. When a major version is announced by HAProxy, it means it also support + all previous minor versions. + + Example: "2.0, 1.5" means HAProxy supports SPOP 2.0 and 1.0 to 1.5 + + * "max-frame-size" <UINT32> + + This is the maximum size allowed for a frame. The HAPROXY-HELLO frame must + be lower or equal to this value. + + * "capabilities" <STRING> + + This a comma-separated list of capabilities supported by HAProxy. Spaces + must be ignored, if any. + +Following optional items can be added in the KV-LIST: + + * "healthcheck" <BOOLEAN> + + If this item is set to TRUE, then the HAPROXY-HELLO frame is sent during a + SPOE health check. When set to FALSE, this item can be ignored. + + * "engine-id" <STRING> + + This is a uniq string that identify a SPOE engine. + +To finish the HELLO handshake, the agent must return an AGENT-HELLO frame with +its supported SPOP version, the lower value between its maximum size allowed +for a frame and the HAProxy one and capabilities it supports. If an error +occurs or if an incompatibility is detected with the agent configuration, an +AGENT-DISCONNECT frame must be returned. + +3.2.5. Frame: AGENT-HELLO +-------------------------- + +This frame is sent in reply to a HAPROXY-HELLO frame to finish a HELLO +handshake. As for HAPROXY-HELLO frame, STREAM-ID and FRAME-ID are also set +0. The payload of this frame is a KV-LIST and it cannot be fragmented. + +Following items are mandatory in the KV-LIST: + + * "version" <STRING> + + This is the SPOP version the agent supports. It must follow the format + "Major.Minor" and it must be lower or equal than one of major versions + announced by HAProxy. + + * "max-frame-size" <UINT32> + + This is the maximum size allowed for a frame. It must be lower or equal to + the value in the HAPROXY-HELLO frame. This value will be used for all + subsequent frames. + + * "capabilities" <STRING> + + This a comma-separated list of capabilities supported by agent. Spaces must + be ignored, if any. + +At this time, if everything is ok for HAProxy (supported version and valid +max-frame-size value), the HELLO handshake is successfully completed. Else, +HAProxy sends a HAPROXY-DISCONNECT frame with the corresponding error. + +If "healthcheck" item was set to TRUE in the HAPROXY-HELLO frame, the agent can +safely close the connection without DISCONNECT frame. In all cases, HAProxy +will close the connection at the end of the health check. + +3.2.6. Frame: NOTIFY +--------------------- + +Information are sent to the agents inside NOTIFY frames. These frames are +attached to a stream, so STREAM-ID and FRAME-ID must be set. The payload of +NOTIFY frames is a LIST-OF-MESSAGES and, if supported by agents, it can be +fragmented. + +NOTIFY frames must be acknowledge by agents sending an ACK frame, repeating +right STREAM-ID and FRAME-ID. + +3.2.7. Frame: ACK +------------------ + +ACK frames must be sent by agents to reply to NOTIFY frames. STREAM-ID and +FRAME-ID found in a NOTIFY frame must be reuse in the corresponding ACK +frame. The payload of ACK frames is a LIST-OF-ACTIONS and, if supported by +HAProxy, it can be fragmented. + +3.2.8. Frame: HAPROXY-DISCONNECT +--------------------------------- + +If an error occurs, at anytime, from the HAProxy side, a HAPROXY-DISCONNECT +frame is sent with information describing the error. HAProxy will wait an +AGENT-DISCONNECT frame in reply. All other frames will be ignored. The agent +must then close the socket. + +The payload of this frame is a KV-LIST. It cannot be fragmented. STREAM-ID and +FRAME-ID are must be set 0. + +Following items are mandatory in the KV-LIST: + + * "status-code" <UINT32> + + This is the code corresponding to the error. + + * "message" <STRING> + + This is a textual message describing the error. + +For more information about known errors, see section "Errors & timeouts" + +3.2.9. Frame: AGENT-DISCONNECT +------------------------------- + +If an error occurs, at anytime, from the agent size, a AGENT-DISCONNECT frame +is sent, with information describing the error. such frame is also sent in reply +to a HAPROXY-DISCONNECT. The agent must close the socket just after sending +this frame. + +The payload of this frame is a KV-LIST. It cannot be fragmented. STREAM-ID and +FRAME-ID are must be set 0. + +Following items are mandatory in the KV-LIST: + + * "status-code" <UINT32> + + This is the code corresponding to the error. + + * "message" <STRING> + + This is a textual message describing the error. + +For more information about known errors, see section "Errors & timeouts" + +3.3. Events & Messages +----------------------- + +Information about streams are sent in NOTIFY frames. You can specify which kind +of information to send by defining "spoe-message" sections in your SPOE +configuration file. for each "spoe-message" there will be a message in a NOTIFY +frame when the right event is triggered. + +A NOTIFY frame is sent for an specific event when there is at least one +"spoe-message" attached to this event. All messages for an event will be added +in the same NOTIFY frame. + +Here is the list of supported events: + + * on-client-session is triggered when a new client session is created. + This event is only available for SPOE filters + declared in a frontend or a listen section. + + * on-frontend-tcp-request is triggered just before the evaluation of + "tcp-request content" rules on the frontend side. + This event is only available for SPOE filters + declared in a frontend or a listen section. + + * on-backend-tcp-request is triggered just before the evaluation of + "tcp-request content" rules on the backend side. + This event is skipped for SPOE filters declared + in a listen section. + + * on-frontend-http-request is triggered just before the evaluation of + "http-request" rules on the frontend side. This + event is only available for SPOE filters declared + in a frontend or a listen section. + + * on-backend-http-request is triggered just before the evaluation of + "http-request" rules on the backend side. This + event is skipped for SPOE filters declared in a + listen section. + + * on-server-session is triggered when the session with the server is + established. + + * on-tcp-response is triggered just before the evaluation of + "tcp-response content" rules. + + * on-http-response is triggered just before the evaluation of + "http-response" rules. + + +The stream processing will loop on these events, when triggered, waiting the +agent reply. + +3.4. Actions +------------- + +An agent must acknowledge each NOTIFY frame by sending the corresponding ACK +frame. Actions can be added in these frames to dynamically take action on the +processing of a stream. + +Here is the list of supported actions: + + * set-var set the value for an existing variable. 3 arguments must be + attached to this action: the variable scope (proc, sess, txn, + req or res), the variable name (a string) and its value. + + ACTION-SET-VAR : <SET-VAR:1 byte><NB-ARGS:1 byte><VAR-SCOPE:1 byte><VAR-NAME><VAR-VALUE> + + SET-VAR : <1> + NB-ARGS : <3> + VAR-SCOPE : <PROCESS> | <SESSION> | <TRANSACTION> | <REQUEST> | <RESPONSE> + VAR-NAME : <STRING> + VAR-VALUE : <TYPED-DATA> + + PROCESS : <0> + SESSION : <1> + TRANSACTION : <2> + REQUEST : <3> + RESPONSE : <4> + + * unset-var unset the value for an existing variable. 2 arguments must be + attached to this action: the variable scope (proc, sess, txn, + req or res) and the variable name (a string). + + ACTION-UNSET-VAR : <UNSET-VAR:1 byte><NB-ARGS:1 byte><VAR-SCOPE:1 byte><VAR-NAME> + + UNSET-VAR : <2> + NB-ARGS : <2> + VAR-SCOPE : <PROCESS> | <SESSION> | <TRANSACTION> | <REQUEST> | <RESPONSE> + VAR-NAME : <STRING> + + PROCESS : <0> + SESSION : <1> + TRANSACTION : <2> + REQUEST : <3> + RESPONSE : <4> + + +NOTE: Name of the variables will be automatically prefixed by HAProxy to avoid + name clashes with other variables used in HAProxy. Moreover, unknown + variable will be silently ignored. + +3.5. Errors & timeouts +---------------------- + +Here is the list of all known errors: + + STATUS CODE | DESCRIPTION + ----------------+-------------------------------------------------------- + 0 | normal (no error occurred) + 1 | I/O error + 2 | A timeout occurred + 3 | frame is too big + 4 | invalid frame received + 5 | version value not found + 6 | max-frame-size value not found + 7 | capabilities value not found + 8 | unsupported version + 9 | max-frame-size too big or too small + 10 | payload fragmentation is not supported + 11 | invalid interlaced frames + 12 | frame-id not found (it does not match any referenced frame) + 13 | resource allocation error + 99 | an unknown error occurrde + ----------------+-------------------------------------------------------- + +An agent can define its own errors using a not yet assigned status code. + +IMPORTANT NOTE: By default, for a specific stream, when an abnormal/unexpected + error occurs, the SPOE is disabled for all the transaction. So + if you have several events configured, such error on an event + will disabled all following. For TCP streams, this will + disable the SPOE for the whole session. For HTTP streams, this + will disable it for the transaction (request and response). + See 'option continue-on-error' to bypass this limitation. + +To avoid a stream to wait undefinetly, you must carefully choose the +acknowledgement timeout. In most of cases, it will be quiet low. But it depends +on the responsivness of your service. + +You must also choose idle timeout carefully. Because connection with your +service depends on the backend configuration used by the SPOA, it is important +to use a lower value for idle timeout than the server timeout. Else the +connection will be closed by HAProxy. The same is true for hello timeout. You +should choose a lower value than the connect timeout. + +4. Logging +----------- + +Activity of an SPOE is logged using HAProxy's logger. The messages are logged +in the context of the streams that handle the client and the server +connections. A message is emitted for each event or group handled by an +SPOE. Depending on the status code, the log level will be different. In the +normal case, when no error occurred, the message is logged with the level +LOG_NOTICE. Otherwise, the message is logged with the level LOG_WARNING. + +The messages are logged using the agent's logger, if defined, and use the +following format: + + SPOE: [AGENT] <TYPE:NAME> sid=STREAM-ID st=STATUS-CODE reqT/qT/wT/resT/pT \ + <idles>/<applets> <nb_sending>/<nb_waiting> <nb_error>/<nb_processed> + + AGENT is the agent name + TYPE is EVENT of GROUP + NAME is the event or the group name + STREAM-ID is an integer, the unique id of the stream + STATUS_CODE is the processing's status code + reqT/qT/wT/resT/pT are the following time events: + + * reqT : the encoding time. It includes ACLs processing, if any. For + fragmented frames, it is the sum of all fragments. + * qT : the delay before the request gets out the sending queue. For + fragmented frames, it is the sum of all fragments. + * wT : the delay before the response is received. No fragmentation + supported here. + * resT : the delay to process the response. No fragmentation supported + here. + * pT : the delay to process the event or the group. From the stream + point of view, it is the latency added by the SPOE processing. + It is more or less the sum of values above. + + <idle> is the numbers of idle SPOE applets + <applets> is the numbers of SPOE applets + <nb_sending> is the numbers of streams waiting to send data + <nb_waiting> is the numbers of streams waiting for a ack + <nb_error> is the numbers of processing errors + <nb_processed> is the numbers of events/groups processed + + +For all these time events, -1 means the processing was interrupted before the +end. So -1 for the queue time means the request was never dequeued. For +fragmented frames it is harder to know when the interruption happened. + +/* + * Local variables: + * fill-column: 79 + * End: + */ diff --git a/doc/WURFL-device-detection.txt b/doc/WURFL-device-detection.txt new file mode 100644 index 0000000..4786e22 --- /dev/null +++ b/doc/WURFL-device-detection.txt @@ -0,0 +1,71 @@ +Scientiamobile WURFL Device Detection +------------------------------------- + +You can also include WURFL for inbuilt device detection enabling attributes. + +WURFL is a high-performance and low-memory footprint mobile device detection +software component that can quickly and accurately detect over 500 capabilities +of visiting devices. It can differentiate between portable mobile devices, desktop devices, +SmartTVs and any other types of devices on which a web browser can be installed. + +In order to add WURFL device detection support, you would need to download Scientiamobile +InFuze C API and install it on your system. Refer to www.scientiamobile.com to obtain a valid +InFuze license. +Compile haproxy as shown : + + $ make TARGET=<target> USE_WURFL=1 + +Optionally WURFL_DEBUG=1 may be set to increase logs verbosity + +For HAProxy developers who need to verify that their changes didn't accidentally +break the WURFL code, it is possible to build a dummy library provided in the +addons/wurfl/dummy directory and to use it as an alternative for the full library. +This will not provide the full functionalities, it will just allow haproxy to +start with a wurfl configuration, which generally is enough to validate API +changes : + + $ make -C addons/wurfl/dummy + $ make TARGET=<target> USE_WURFL=1 WURFL_INC=$PWD/addons/wurfl/dummy WURFL_LIB=$PWD/addons/wurfl/dummy + +These are the supported WURFL directives (see doc/configuration.txt) : +- wurfl-data-file <path to WURFL data file> +- wurfl-information-list [<string>] (list of WURFL capabilities, + virtual capabilities, property names we plan to use in injected headers) +- wurfl-information-list-separator <char> (character that will be + used to separate values in a response header, ',' by default). +- wurfl-cache-size <string> (Sets the WURFL caching strategy) +- wurfl-patch-file [<file path>] (Sets the paths to custom WURFL patch files) + +Sample configuration : + + global + wurfl-data-file /usr/share/wurfl/wurfl.zip + + wurfl-information-list wurfl_id model_name + + #wurfl-information-list-separator | + + ## single LRU cache + #wurfl-cache-size 100000 + ## no cache + #wurfl-cache-size 0 + + #wurfl-patch-file <paths to custom patch files> + + ... + frontend + bind *:8888 + default_backend servers + +There are two distinct methods available to transmit the WURFL data downstream +to the target application: + +All data listed in wurfl-information-list + + http-request set-header X-WURFL-All %[wurfl-get-all()] + +A subset of data listed in wurfl-information-list + + http-request set-header X-WURFL-Properties %[wurfl-get(wurfl_id,is_tablet)] + +Please find more information about WURFL and the detection methods at https://www.scientiamobile.com diff --git a/doc/acl.fig b/doc/acl.fig new file mode 100644 index 0000000..253a053 --- /dev/null +++ b/doc/acl.fig @@ -0,0 +1,229 @@ +#FIG 3.2 Produced by xfig version 3.2.5-alpha5 +Portrait +Center +Metric +A4 +100.00 +Single +-2 +1200 2 +6 2430 1080 2700 2250 +1 2 0 1 0 11 52 -1 20 0.000 1 0.0000 2587 1687 113 563 2474 1687 2700 1687 +4 1 0 50 -1 16 8 1.5708 4 120 840 2610 1710 tcp-req inspect\001 +-6 +6 5805 1080 6255 2250 +1 2 0 1 0 29 52 -1 20 0.000 1 0.0000 6052 1687 203 563 5849 1687 6255 1687 +4 1 0 50 -1 16 8 1.5708 4 90 300 6030 1710 HTTP\001 +4 1 0 50 -1 16 8 1.5708 4 120 615 6165 1710 processing\001 +-6 +6 1575 3375 1800 4500 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 1575 3375 1800 3375 1800 4500 1575 4500 1575 3375 +4 1 0 50 -1 16 8 1.5708 4 120 735 1710 3960 http-resp out\001 +-6 +6 2025 3375 2250 4500 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 2025 3375 2250 3375 2250 4500 2025 4500 2025 3375 +4 1 0 50 -1 16 8 1.5708 4 120 735 2160 3960 http-resp out\001 +-6 +6 810 3600 1080 4230 +4 1 0 50 -1 16 8 1.5708 4 105 555 900 3915 Response\001 +4 1 0 50 -1 16 8 1.5708 4 105 450 1065 3915 to client\001 +-6 +6 720 1350 1035 2070 +4 1 0 50 -1 16 8 1.5708 4 120 540 855 1710 Requests \001 +4 1 0 50 -1 16 8 1.5708 4 105 645 1020 1710 from clients\001 +-6 +6 7695 1350 8010 1980 +4 1 0 50 -1 16 8 1.5708 4 120 510 7830 1665 Requests\001 +4 1 0 50 -1 16 8 1.5708 4 105 555 7995 1665 to servers\001 +-6 +6 7785 3600 8055 4230 +4 1 0 50 -1 16 8 1.5708 4 105 555 7875 3915 Response\001 +4 1 0 50 -1 16 8 1.5708 4 105 630 8055 3915 from server\001 +-6 +1 2 0 1 0 11 52 -1 20 0.000 1 0.0000 1687 1687 113 563 1574 1687 1800 1687 +1 2 0 1 0 11 52 -1 20 0.000 1 0.0000 7087 3937 113 563 6974 3937 7200 3937 +1 2 0 1 0 29 52 -1 20 0.000 1 0.0000 4072 3937 203 563 3869 3937 4275 3937 +1 2 0 1 0 29 52 -1 20 0.000 1 0.0000 2903 3937 203 563 2700 3937 3106 3937 +2 3 0 1 0 6 54 -1 20 0.000 0 0 -1 0 0 9 + 1485 900 1485 2475 4140 2475 4140 1035 6390 1035 6390 2340 + 6840 2340 6840 900 1485 900 +2 3 0 1 0 2 54 -1 20 0.000 0 0 -1 0 0 9 + 4365 1035 4365 2475 7290 2475 7290 900 6840 900 6840 2340 + 5715 2340 5715 1035 4365 1035 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 4950 1125 5175 1125 5175 2250 4950 2250 4950 1125 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 5400 1125 5625 1125 5625 2250 5400 2250 5400 1125 +2 2 0 1 0 11 52 -1 20 0.000 0 0 -1 0 0 5 + 2025 1125 2250 1125 2250 2250 2025 2250 2025 1125 +2 2 0 1 0 11 52 -1 20 0.000 0 0 -1 0 0 5 + 2925 1125 3150 1125 3150 2250 2925 2250 2925 1125 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 1125 1710 1575 1710 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 1125 1935 1575 1755 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 1125 1485 1575 1665 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 3825 1125 4050 1125 4050 2250 3825 2250 3825 1125 +2 2 0 1 0 6 50 -1 20 0.000 0 0 -1 0 0 5 + 1575 450 2025 450 2025 540 1575 540 1575 450 +2 2 0 1 0 2 50 -1 20 0.000 0 0 -1 0 0 5 + 1575 675 2025 675 2025 765 1575 765 1575 675 +2 2 0 1 0 11 50 -1 20 0.000 0 0 -1 0 0 5 + 3150 450 3600 450 3600 540 3150 540 3150 450 +2 2 0 1 0 29 50 -1 20 0.000 0 0 -1 0 0 5 + 3150 675 3600 675 3600 765 3150 765 3150 675 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 6525 1125 6750 1125 6750 2250 6525 2250 6525 1125 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 7200 1665 7650 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 7200 1620 7650 1530 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 7200 1710 7650 1800 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 6975 1125 7200 1125 7200 2250 6975 2250 6975 1125 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 3375 1125 3600 1125 3600 2250 3375 2250 3375 1125 +2 2 0 1 0 11 52 -1 20 0.000 0 0 -1 0 0 5 + 4500 1125 4725 1125 4725 2250 4500 2250 4500 1125 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 1800 1665 2025 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 2250 1665 2475 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 2700 1665 2925 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 3150 1665 3375 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 3600 1665 3825 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 4725 1665 4950 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 5175 1665 5400 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 5625 1665 5850 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 6750 1665 6975 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 6255 1665 6525 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 4050 1665 4500 1665 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 4050 1620 4500 1530 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 4050 1710 4500 1800 +2 2 0 1 0 11 52 -1 20 0.000 0 0 -1 0 0 5 + 6525 3375 6750 3375 6750 4500 6525 4500 6525 3375 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 6075 3375 6300 3375 6300 4500 6075 4500 6075 3375 +2 3 0 1 0 2 54 -1 20 0.000 0 0 -1 0 0 9 + 7290 3150 7290 4725 5985 4725 5985 3285 2385 3285 2385 4590 + 1935 4590 1935 3150 7290 3150 +2 3 0 1 0 6 54 -1 20 0.000 0 0 -1 0 0 9 + 1935 3150 1485 3150 1485 4725 5985 4725 5985 3285 5085 3285 + 5085 4590 1935 4590 1935 3150 +2 2 0 1 0 11 52 -1 20 0.000 0 0 -1 0 0 5 + 5625 3375 5850 3375 5850 4500 5625 4500 5625 3375 +2 2 0 1 0 29 52 -1 20 0.000 0 0 -1 0 0 5 + 5175 3375 5400 3375 5400 4500 5175 4500 5175 3375 +2 1 0 1 0 0 54 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 7650 3915 7200 3915 +2 1 0 1 0 0 54 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 1575 3915 1125 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 6975 3915 6750 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 6525 3915 6300 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 6075 3915 5850 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 5625 3915 5400 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 2025 3915 1800 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 5175 3915 4275 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 3870 3915 3105 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 30.00 60.00 + 2700 3915 2250 3915 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 3 + 1 1 1.00 30.00 60.00 + 3465 2250 3465 2880 2970 3465 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 4 + 1 1 1.00 30.00 60.00 + 5040 2250 5040 2655 3600 2880 3015 3510 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 4 + 1 1 1.00 30.00 60.00 + 6075 2250 6075 2565 3645 2925 3060 3555 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 4 + 1 1 1.00 30.00 60.00 + 6615 2250 6615 2610 3690 2970 3060 3645 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 4 + 1 1 1.00 30.00 60.00 + 7065 2250 7065 2655 3735 3015 3060 3690 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 4 + 1 1 1.00 30.00 60.00 + 5265 3375 5265 2970 3825 3105 3105 3780 +2 1 0 1 0 0 50 -1 -1 0.000 0 0 -1 1 0 4 + 1 1 1.00 30.00 60.00 + 6165 3375 6165 2835 3780 3060 3105 3735 +4 1 0 50 -1 16 8 1.5708 4 120 630 2160 1710 tcp-request\001 +4 1 0 50 -1 16 8 1.5708 4 120 870 3060 1710 tcp-req content\001 +4 1 0 50 -1 16 8 1.5708 4 120 600 5085 1710 http-req in\001 +4 1 0 50 -1 16 8 1.5708 4 105 690 3960 1710 use-backend\001 +4 1 0 50 -1 16 8 1.5708 4 75 570 5535 1710 use-server\001 +4 1 0 50 -1 16 8 1.5708 4 120 360 1710 1710 accept\001 +4 0 0 50 -1 18 6 0.0000 4 90 435 2115 540 frontend\001 +4 0 0 50 -1 18 6 0.0000 4 90 405 2115 765 backend\001 +4 0 0 50 -1 18 6 0.0000 4 105 150 3735 540 tcp\001 +4 0 0 50 -1 18 6 0.0000 4 105 450 3735 765 http only\001 +4 2 0 50 -1 18 6 0.0000 4 90 435 4050 2430 frontend\001 +4 0 0 50 -1 18 6 0.0000 4 90 405 4455 2430 backend\001 +4 1 0 50 -1 16 8 1.5708 4 120 675 6660 1710 http-req out\001 +4 1 0 50 -1 16 8 1.5708 4 120 675 7110 1710 http-req out\001 +4 1 0 50 -1 16 8 1.5708 4 120 600 3510 1710 http-req in\001 +4 1 0 50 -1 16 8 1.5708 4 120 870 4635 1710 tcp-req content\001 +4 1 0 50 -1 16 8 1.5708 4 120 660 6210 3960 http-resp in\001 +4 1 0 50 -1 16 8 1.5708 4 120 930 6660 3960 tcp-resp content\001 +4 1 0 50 -1 16 8 1.5708 4 120 900 7110 3960 tcp-resp inspect\001 +4 1 0 50 -1 16 8 1.5708 4 120 930 5760 3960 tcp-resp content\001 +4 1 0 50 -1 16 8 1.5708 4 120 660 5310 3960 http-resp in\001 +4 0 0 50 -1 18 6 0.0000 4 90 405 6075 4680 backend\001 +4 1 0 50 -1 16 8 1.5708 4 90 300 4050 3960 HTTP\001 +4 1 0 50 -1 16 8 1.5708 4 120 615 4185 3960 processing\001 +4 1 0 50 -1 16 8 1.5708 4 90 300 2835 3915 Error\001 +4 1 0 50 -1 16 8 1.5708 4 120 615 2970 3915 processing\001 +4 2 0 50 -1 18 6 0.0000 4 90 435 5895 4680 frontend\001 diff --git a/doc/architecture.txt b/doc/architecture.txt new file mode 100644 index 0000000..c37632f --- /dev/null +++ b/doc/architecture.txt @@ -0,0 +1,1448 @@ + ------------------- + HAProxy + Architecture Guide + ------------------- + version 1.1.34 + willy tarreau + 2006/01/29 + + +This document provides real world examples with working configurations. +Please note that except stated otherwise, global configuration parameters +such as logging, chrooting, limits and time-outs are not described here. + +=================================================== +1. Simple HTTP load-balancing with cookie insertion +=================================================== + +A web application often saturates the front-end server with high CPU loads, +due to the scripting language involved. It also relies on a back-end database +which is not much loaded. User contexts are stored on the server itself, and +not in the database, so that simply adding another server with simple IP/TCP +load-balancing would not work. + + +-------+ + |clients| clients and/or reverse-proxy + +---+---+ + | + -+-----+--------+---- + | _|_db + +--+--+ (___) + | web | (___) + +-----+ (___) + 192.168.1.1 192.168.1.2 + + +Replacing the web server with a bigger SMP system would cost much more than +adding low-cost pizza boxes. The solution is to buy N cheap boxes and install +the application on them. Install haproxy on the old one which will spread the +load across the new boxes. + + 192.168.1.1 192.168.1.11-192.168.1.14 192.168.1.2 + -------+-----------+-----+-----+-----+--------+---- + | | | | | _|_db + +--+--+ +-+-+ +-+-+ +-+-+ +-+-+ (___) + | LB1 | | A | | B | | C | | D | (___) + +-----+ +---+ +---+ +---+ +---+ (___) + haproxy 4 cheap web servers + + +Config on haproxy (LB1) : +------------------------- + + listen webfarm 192.168.1.1:80 + mode http + balance roundrobin + cookie SERVERID insert indirect + option httpchk HEAD /index.html HTTP/1.0 + server webA 192.168.1.11:80 cookie A check + server webB 192.168.1.12:80 cookie B check + server webC 192.168.1.13:80 cookie C check + server webD 192.168.1.14:80 cookie D check + + +Description : +------------- + - LB1 will receive clients requests. + - if a request does not contain a cookie, it will be forwarded to a valid + server + - in return, a cookie "SERVERID" will be inserted in the response holding the + server name (eg: "A"). + - when the client comes again with the cookie "SERVERID=A", LB1 will know that + it must be forwarded to server A. The cookie will be removed so that the + server does not see it. + - if server "webA" dies, the requests will be sent to another valid server + and a cookie will be reassigned. + + +Flows : +------- + +(client) (haproxy) (server A) + >-- GET /URI1 HTTP/1.0 ------------> | + ( no cookie, haproxy forwards in load-balancing mode. ) + | >-- GET /URI1 HTTP/1.0 ----------> + | <-- HTTP/1.0 200 OK -------------< + ( the proxy now adds the server cookie in return ) + <-- HTTP/1.0 200 OK ---------------< | + Set-Cookie: SERVERID=A | + >-- GET /URI2 HTTP/1.0 ------------> | + Cookie: SERVERID=A | + ( the proxy sees the cookie. it forwards to server A and deletes it ) + | >-- GET /URI2 HTTP/1.0 ----------> + | <-- HTTP/1.0 200 OK -------------< + ( the proxy does not add the cookie in return because the client knows it ) + <-- HTTP/1.0 200 OK ---------------< | + >-- GET /URI3 HTTP/1.0 ------------> | + Cookie: SERVERID=A | + ( ... ) + + +Limits : +-------- + - if clients use keep-alive (HTTP/1.1), only the first response will have + a cookie inserted, and only the first request of each session will be + analyzed. This does not cause trouble in insertion mode because the cookie + is put immediately in the first response, and the session is maintained to + the same server for all subsequent requests in the same session. However, + the cookie will not be removed from the requests forwarded to the servers, + so the server must not be sensitive to unknown cookies. If this causes + trouble, you can disable keep-alive by adding the following option : + + option httpclose + + - if for some reason the clients cannot learn more than one cookie (eg: the + clients are indeed some home-made applications or gateways), and the + application already produces a cookie, you can use the "prefix" mode (see + below). + + - LB1 becomes a very sensible server. If LB1 dies, nothing works anymore. + => you can back it up using keepalived (see below) + + - if the application needs to log the original client's IP, use the + "forwardfor" option which will add an "X-Forwarded-For" header with the + original client's IP address. You must also use "httpclose" to ensure + that you will rewrite every requests and not only the first one of each + session : + + option httpclose + option forwardfor + + - if the application needs to log the original destination IP, use the + "originalto" option which will add an "X-Original-To" header with the + original destination IP address. You must also use "httpclose" to ensure + that you will rewrite every requests and not only the first one of each + session : + + option httpclose + option originalto + + The web server will have to be configured to use this header instead. + For example, on apache, you can use LogFormat for this : + + LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b " combined + CustomLog /var/log/httpd/access_log combined + +Hints : +------- +Sometimes on the internet, you will find a few percent of the clients which +disable cookies on their browser. Obviously they have troubles everywhere on +the web, but you can still help them access your site by using the "source" +balancing algorithm instead of the "roundrobin". It ensures that a given IP +address always reaches the same server as long as the number of servers remains +unchanged. Never use this behind a proxy or in a small network, because the +distribution will be unfair. However, in large internal networks, and on the +internet, it works quite well. Clients which have a dynamic address will not +be affected as long as they accept the cookie, because the cookie always has +precedence over load balancing : + + listen webfarm 192.168.1.1:80 + mode http + balance source + cookie SERVERID insert indirect + option httpchk HEAD /index.html HTTP/1.0 + server webA 192.168.1.11:80 cookie A check + server webB 192.168.1.12:80 cookie B check + server webC 192.168.1.13:80 cookie C check + server webD 192.168.1.14:80 cookie D check + + +================================================================== +2. HTTP load-balancing with cookie prefixing and high availability +================================================================== + +Now you don't want to add more cookies, but rather use existing ones. The +application already generates a "JSESSIONID" cookie which is enough to track +sessions, so we'll prefix this cookie with the server name when we see it. +Since the load-balancer becomes critical, it will be backed up with a second +one in VRRP mode using keepalived under Linux. + +Download the latest version of keepalived from this site and install it +on each load-balancer LB1 and LB2 : + + http://www.keepalived.org/ + +You then have a shared IP between the two load-balancers (we will still use the +original IP). It is active only on one of them at any moment. To allow the +proxy to bind to the shared IP on Linux 2.4, you must enable it in /proc : + +# echo 1 >/proc/sys/net/ipv4/ip_nonlocal_bind + + + shared IP=192.168.1.1 + 192.168.1.3 192.168.1.4 192.168.1.11-192.168.1.14 192.168.1.2 + -------+------------+-----------+-----+-----+-----+--------+---- + | | | | | | _|_db + +--+--+ +--+--+ +-+-+ +-+-+ +-+-+ +-+-+ (___) + | LB1 | | LB2 | | A | | B | | C | | D | (___) + +-----+ +-----+ +---+ +---+ +---+ +---+ (___) + haproxy haproxy 4 cheap web servers + keepalived keepalived + + +Config on both proxies (LB1 and LB2) : +-------------------------------------- + + listen webfarm 192.168.1.1:80 + mode http + balance roundrobin + cookie JSESSIONID prefix + option httpclose + option forwardfor + option httpchk HEAD /index.html HTTP/1.0 + server webA 192.168.1.11:80 cookie A check + server webB 192.168.1.12:80 cookie B check + server webC 192.168.1.13:80 cookie C check + server webD 192.168.1.14:80 cookie D check + + +Notes: the proxy will modify EVERY cookie sent by the client and the server, +so it is important that it can access to ALL cookies in ALL requests for +each session. This implies that there is no keep-alive (HTTP/1.1), thus the +"httpclose" option. Only if you know for sure that the client(s) will never +use keep-alive (eg: Apache 1.3 in reverse-proxy mode), you can remove this +option. + + +Configuration for keepalived on LB1/LB2 : +----------------------------------------- + + vrrp_script chk_haproxy { # Requires keepalived-1.1.13 + script "killall -0 haproxy" # cheaper than pidof + interval 2 # check every 2 seconds + weight 2 # add 2 points of prio if OK + } + + vrrp_instance VI_1 { + interface eth0 + state MASTER + virtual_router_id 51 + priority 101 # 101 on master, 100 on backup + virtual_ipaddress { + 192.168.1.1 + } + track_script { + chk_haproxy + } + } + + +Description : +------------- + - LB1 is VRRP master (keepalived), LB2 is backup. Both monitor the haproxy + process, and lower their prio if it fails, leading to a failover to the + other node. + - LB1 will receive clients requests on IP 192.168.1.1. + - both load-balancers send their checks from their native IP. + - if a request does not contain a cookie, it will be forwarded to a valid + server + - in return, if a JESSIONID cookie is seen, the server name will be prefixed + into it, followed by a delimiter ('~') + - when the client comes again with the cookie "JSESSIONID=A~xxx", LB1 will + know that it must be forwarded to server A. The server name will then be + extracted from cookie before it is sent to the server. + - if server "webA" dies, the requests will be sent to another valid server + and a cookie will be reassigned. + + +Flows : +------- + +(client) (haproxy) (server A) + >-- GET /URI1 HTTP/1.0 ------------> | + ( no cookie, haproxy forwards in load-balancing mode. ) + | >-- GET /URI1 HTTP/1.0 ----------> + | X-Forwarded-For: 10.1.2.3 + | <-- HTTP/1.0 200 OK -------------< + ( no cookie, nothing changed ) + <-- HTTP/1.0 200 OK ---------------< | + >-- GET /URI2 HTTP/1.0 ------------> | + ( no cookie, haproxy forwards in lb mode, possibly to another server. ) + | >-- GET /URI2 HTTP/1.0 ----------> + | X-Forwarded-For: 10.1.2.3 + | <-- HTTP/1.0 200 OK -------------< + | Set-Cookie: JSESSIONID=123 + ( the cookie is identified, it will be prefixed with the server name ) + <-- HTTP/1.0 200 OK ---------------< | + Set-Cookie: JSESSIONID=A~123 | + >-- GET /URI3 HTTP/1.0 ------------> | + Cookie: JSESSIONID=A~123 | + ( the proxy sees the cookie, removes the server name and forwards + to server A which sees the same cookie as it previously sent ) + | >-- GET /URI3 HTTP/1.0 ----------> + | Cookie: JSESSIONID=123 + | X-Forwarded-For: 10.1.2.3 + | <-- HTTP/1.0 200 OK -------------< + ( no cookie, nothing changed ) + <-- HTTP/1.0 200 OK ---------------< | + ( ... ) + +Hints : +------- +Sometimes, there will be some powerful servers in the farm, and some smaller +ones. In this situation, it may be desirable to tell haproxy to respect the +difference in performance. Let's consider that WebA and WebB are two old +P3-1.2 GHz while WebC and WebD are shiny new Opteron-2.6 GHz. If your +application scales with CPU, you may assume a very rough 2.6/1.2 performance +ratio between the servers. You can inform haproxy about this using the "weight" +keyword, with values between 1 and 256. It will then spread the load the most +smoothly possible respecting those ratios : + + server webA 192.168.1.11:80 cookie A weight 12 check + server webB 192.168.1.12:80 cookie B weight 12 check + server webC 192.168.1.13:80 cookie C weight 26 check + server webD 192.168.1.14:80 cookie D weight 26 check + + +======================================================== +2.1 Variations involving external layer 4 load-balancers +======================================================== + +Instead of using a VRRP-based active/backup solution for the proxies, +they can also be load-balanced by a layer4 load-balancer (eg: Alteon) +which will also check that the services run fine on both proxies : + + | VIP=192.168.1.1 + +----+----+ + | Alteon | + +----+----+ + | + 192.168.1.3 | 192.168.1.4 192.168.1.11-192.168.1.14 192.168.1.2 + -------+-----+------+-----------+-----+-----+-----+--------+---- + | | | | | | _|_db + +--+--+ +--+--+ +-+-+ +-+-+ +-+-+ +-+-+ (___) + | LB1 | | LB2 | | A | | B | | C | | D | (___) + +-----+ +-----+ +---+ +---+ +---+ +---+ (___) + haproxy haproxy 4 cheap web servers + + +Config on both proxies (LB1 and LB2) : +-------------------------------------- + + listen webfarm 0.0.0.0:80 + mode http + balance roundrobin + cookie JSESSIONID prefix + option httpclose + option forwardfor + option httplog + option dontlognull + option httpchk HEAD /index.html HTTP/1.0 + server webA 192.168.1.11:80 cookie A check + server webB 192.168.1.12:80 cookie B check + server webC 192.168.1.13:80 cookie C check + server webD 192.168.1.14:80 cookie D check + +The "dontlognull" option is used to prevent the proxy from logging the health +checks from the Alteon. If a session exchanges no data, then it will not be +logged. + +Config on the Alteon : +---------------------- + + /c/slb/real 11 + ena + name "LB1" + rip 192.168.1.3 + /c/slb/real 12 + ena + name "LB2" + rip 192.168.1.4 + /c/slb/group 10 + name "LB1-2" + metric roundrobin + health tcp + add 11 + add 12 + /c/slb/virt 10 + ena + vip 192.168.1.1 + /c/slb/virt 10/service http + group 10 + + +Note: the health-check on the Alteon is set to "tcp" to prevent the proxy from +forwarding the connections. It can also be set to "http", but for this the +proxy must specify a "monitor-net" with the Alteons' addresses, so that the +Alteon can really check that the proxies can talk HTTP but without forwarding +the connections to the end servers. Check next section for an example on how to +use monitor-net. + + +============================================================ +2.2 Generic TCP relaying and external layer 4 load-balancers +============================================================ + +Sometimes it's useful to be able to relay generic TCP protocols (SMTP, TSE, +VNC, etc...), for example to interconnect private networks. The problem comes +when you use external load-balancers which need to send periodic health-checks +to the proxies, because these health-checks get forwarded to the end servers. +The solution is to specify a network which will be dedicated to monitoring +systems and must not lead to a forwarding connection nor to any log, using the +"monitor-net" keyword. Note: this feature expects a version of haproxy greater +than or equal to 1.1.32 or 1.2.6. + + + | VIP=172.16.1.1 | + +----+----+ +----+----+ + | Alteon1 | | Alteon2 | + +----+----+ +----+----+ + 192.168.1.252 | GW=192.168.1.254 | 192.168.1.253 + | | + ------+---+------------+--+-----------------> TSE farm : 192.168.1.10 + 192.168.1.1 | | 192.168.1.2 + +--+--+ +--+--+ + | LB1 | | LB2 | + +-----+ +-----+ + haproxy haproxy + + +Config on both proxies (LB1 and LB2) : +-------------------------------------- + + listen tse-proxy + bind :3389,:1494,:5900 # TSE, ICA and VNC at once. + mode tcp + balance roundrobin + server tse-farm 192.168.1.10 + monitor-net 192.168.1.252/31 + +The "monitor-net" option instructs the proxies that any connection coming from +192.168.1.252 or 192.168.1.253 will not be logged nor forwarded and will be +closed immediately. The Alteon load-balancers will then see the proxies alive +without perturbating the service. + +Config on the Alteon : +---------------------- + + /c/l3/if 1 + ena + addr 192.168.1.252 + mask 255.255.255.0 + /c/slb/real 11 + ena + name "LB1" + rip 192.168.1.1 + /c/slb/real 12 + ena + name "LB2" + rip 192.168.1.2 + /c/slb/group 10 + name "LB1-2" + metric roundrobin + health tcp + add 11 + add 12 + /c/slb/virt 10 + ena + vip 172.16.1.1 + /c/slb/virt 10/service 1494 + group 10 + /c/slb/virt 10/service 3389 + group 10 + /c/slb/virt 10/service 5900 + group 10 + + +Special handling of SSL : +------------------------- +Sometimes, you want to send health-checks to remote systems, even in TCP mode, +in order to be able to failover to a backup server in case the first one is +dead. Of course, you can simply enable TCP health-checks, but it sometimes +happens that intermediate firewalls between the proxies and the remote servers +acknowledge the TCP connection themselves, showing an always-up server. Since +this is generally encountered on long-distance communications, which often +involve SSL, an SSL health-check has been implemented to work around this issue. +It sends SSL Hello messages to the remote server, which in turns replies with +SSL Hello messages. Setting it up is very easy : + + listen tcp-syslog-proxy + bind :1514 # listen to TCP syslog traffic on this port (SSL) + mode tcp + balance roundrobin + option ssl-hello-chk + server syslog-prod-site 192.168.1.10 check + server syslog-back-site 192.168.2.10 check backup + + +========================================================= +3. Simple HTTP/HTTPS load-balancing with cookie insertion +========================================================= + +This is the same context as in example 1 above, but the web +server uses HTTPS. + + +-------+ + |clients| clients + +---+---+ + | + -+-----+--------+---- + | _|_db + +--+--+ (___) + | SSL | (___) + | web | (___) + +-----+ + 192.168.1.1 192.168.1.2 + + +Since haproxy does not handle SSL, this part will have to be extracted from the +servers (freeing even more resources) and installed on the load-balancer +itself. Install haproxy and apache+mod_ssl on the old box which will spread the +load between the new boxes. Apache will work in SSL reverse-proxy-cache. If the +application is correctly developed, it might even lower its load. However, +since there now is a cache between the clients and haproxy, some security +measures must be taken to ensure that inserted cookies will not be cached. + + + 192.168.1.1 192.168.1.11-192.168.1.14 192.168.1.2 + -------+-----------+-----+-----+-----+--------+---- + | | | | | _|_db + +--+--+ +-+-+ +-+-+ +-+-+ +-+-+ (___) + | LB1 | | A | | B | | C | | D | (___) + +-----+ +---+ +---+ +---+ +---+ (___) + apache 4 cheap web servers + mod_ssl + haproxy + + +Config on haproxy (LB1) : +------------------------- + + listen 127.0.0.1:8000 + mode http + balance roundrobin + cookie SERVERID insert indirect nocache + option httpchk HEAD /index.html HTTP/1.0 + server webA 192.168.1.11:80 cookie A check + server webB 192.168.1.12:80 cookie B check + server webC 192.168.1.13:80 cookie C check + server webD 192.168.1.14:80 cookie D check + + +Description : +------------- + - apache on LB1 will receive clients requests on port 443 + - it forwards it to haproxy bound to 127.0.0.1:8000 + - if a request does not contain a cookie, it will be forwarded to a valid + server + - in return, a cookie "SERVERID" will be inserted in the response holding the + server name (eg: "A"), and a "Cache-control: private" header will be added + so that the apache does not cache any page containing such cookie. + - when the client comes again with the cookie "SERVERID=A", LB1 will know that + it must be forwarded to server A. The cookie will be removed so that the + server does not see it. + - if server "webA" dies, the requests will be sent to another valid server + and a cookie will be reassigned. + +Notes : +------- + - if the cookie works in "prefix" mode, there is no need to add the "nocache" + option because it is an application cookie which will be modified, and the + application flags will be preserved. + - if apache 1.3 is used as a front-end before haproxy, it always disables + HTTP keep-alive on the back-end, so there is no need for the "httpclose" + option on haproxy. + - configure apache to set the X-Forwarded-For header itself, and do not do + it on haproxy if you need the application to know about the client's IP. + + +Flows : +------- + +(apache) (haproxy) (server A) + >-- GET /URI1 HTTP/1.0 ------------> | + ( no cookie, haproxy forwards in load-balancing mode. ) + | >-- GET /URI1 HTTP/1.0 ----------> + | <-- HTTP/1.0 200 OK -------------< + ( the proxy now adds the server cookie in return ) + <-- HTTP/1.0 200 OK ---------------< | + Set-Cookie: SERVERID=A | + Cache-Control: private | + >-- GET /URI2 HTTP/1.0 ------------> | + Cookie: SERVERID=A | + ( the proxy sees the cookie. it forwards to server A and deletes it ) + | >-- GET /URI2 HTTP/1.0 ----------> + | <-- HTTP/1.0 200 OK -------------< + ( the proxy does not add the cookie in return because the client knows it ) + <-- HTTP/1.0 200 OK ---------------< | + >-- GET /URI3 HTTP/1.0 ------------> | + Cookie: SERVERID=A | + ( ... ) + + + +======================================== +3.1. Alternate solution using Stunnel +======================================== + +When only SSL is required and cache is not needed, stunnel is a cheaper +solution than Apache+mod_ssl. By default, stunnel does not process HTTP and +does not add any X-Forwarded-For header, but there is a patch on the official +haproxy site to provide this feature to recent stunnel versions. + +This time, stunnel will only process HTTPS and not HTTP. This means that +haproxy will get all HTTP traffic, so haproxy will have to add the +X-Forwarded-For header for HTTP traffic, but not for HTTPS traffic since +stunnel will already have done it. We will use the "except" keyword to tell +haproxy that connections from local host already have a valid header. + + + 192.168.1.1 192.168.1.11-192.168.1.14 192.168.1.2 + -------+-----------+-----+-----+-----+--------+---- + | | | | | _|_db + +--+--+ +-+-+ +-+-+ +-+-+ +-+-+ (___) + | LB1 | | A | | B | | C | | D | (___) + +-----+ +---+ +---+ +---+ +---+ (___) + stunnel 4 cheap web servers + haproxy + + +Config on stunnel (LB1) : +------------------------- + + cert=/etc/stunnel/stunnel.pem + setuid=stunnel + setgid=proxy + + socket=l:TCP_NODELAY=1 + socket=r:TCP_NODELAY=1 + + [https] + accept=192.168.1.1:443 + connect=192.168.1.1:80 + xforwardedfor=yes + + +Config on haproxy (LB1) : +------------------------- + + listen 192.168.1.1:80 + mode http + balance roundrobin + option forwardfor except 192.168.1.1 + cookie SERVERID insert indirect nocache + option httpchk HEAD /index.html HTTP/1.0 + server webA 192.168.1.11:80 cookie A check + server webB 192.168.1.12:80 cookie B check + server webC 192.168.1.13:80 cookie C check + server webD 192.168.1.14:80 cookie D check + +Description : +------------- + - stunnel on LB1 will receive clients requests on port 443 + - it forwards them to haproxy bound to port 80 + - haproxy will receive HTTP client requests on port 80 and decrypted SSL + requests from Stunnel on the same port. + - stunnel will add the X-Forwarded-For header + - haproxy will add the X-Forwarded-For header for everyone except the local + address (stunnel). + + +======================================== +4. Soft-stop for application maintenance +======================================== + +When an application is spread across several servers, the time to update all +instances increases, so the application seems jerky for a longer period. + +HAProxy offers several solutions for this. Although it cannot be reconfigured +without being stopped, nor does it offer any external command, there are other +working solutions. + + +========================================= +4.1 Soft-stop using a file on the servers +========================================= + +This trick is quite common and very simple: put a file on the server which will +be checked by the proxy. When you want to stop the server, first remove this +file. The proxy will see the server as failed, and will not send it any new +session, only the old ones if the "persist" option is used. Wait a bit then +stop the server when it does not receive anymore connections. + + + listen 192.168.1.1:80 + mode http + balance roundrobin + cookie SERVERID insert indirect + option httpchk HEAD /running HTTP/1.0 + server webA 192.168.1.11:80 cookie A check inter 2000 rise 2 fall 2 + server webB 192.168.1.12:80 cookie B check inter 2000 rise 2 fall 2 + server webC 192.168.1.13:80 cookie C check inter 2000 rise 2 fall 2 + server webD 192.168.1.14:80 cookie D check inter 2000 rise 2 fall 2 + option persist + redispatch + contimeout 5000 + + +Description : +------------- + - every 2 seconds, haproxy will try to access the file "/running" on the + servers, and declare the server as down after 2 attempts (4 seconds). + - only the servers which respond with a 200 or 3XX response will be used. + - if a request does not contain a cookie, it will be forwarded to a valid + server + - if a request contains a cookie for a failed server, haproxy will insist + on trying to reach the server anyway, to let the user finish what they were + doing. ("persist" option) + - if the server is totally stopped, the connection will fail and the proxy + will rebalance the client to another server ("redispatch") + +Usage on the web servers : +-------------------------- +- to start the server : + # /etc/init.d/httpd start + # touch /home/httpd/www/running + +- to soft-stop the server + # rm -f /home/httpd/www/running + +- to completely stop the server : + # /etc/init.d/httpd stop + +Limits +------ +If the server is totally powered down, the proxy will still try to reach it +for those clients who still have a cookie referencing it, and the connection +attempt will expire after 5 seconds ("contimeout"), and only after that, the +client will be redispatched to another server. So this mode is only useful +for software updates where the server will suddenly refuse the connection +because the process is stopped. The problem is the same if the server suddenly +crashes. All of its users will be fairly perturbated. + + +================================== +4.2 Soft-stop using backup servers +================================== + +A better solution which covers every situation is to use backup servers. +Version 1.1.30 fixed a bug which prevented a backup server from sharing +the same cookie as a standard server. + + + listen 192.168.1.1:80 + mode http + balance roundrobin + redispatch + cookie SERVERID insert indirect + option httpchk HEAD / HTTP/1.0 + server webA 192.168.1.11:80 cookie A check port 81 inter 2000 + server webB 192.168.1.12:80 cookie B check port 81 inter 2000 + server webC 192.168.1.13:80 cookie C check port 81 inter 2000 + server webD 192.168.1.14:80 cookie D check port 81 inter 2000 + + server bkpA 192.168.1.11:80 cookie A check port 80 inter 2000 backup + server bkpB 192.168.1.12:80 cookie B check port 80 inter 2000 backup + server bkpC 192.168.1.13:80 cookie C check port 80 inter 2000 backup + server bkpD 192.168.1.14:80 cookie D check port 80 inter 2000 backup + +Description +----------- +Four servers webA..D are checked on their port 81 every 2 seconds. The same +servers named bkpA..D are checked on the port 80, and share the exact same +cookies. Those servers will only be used when no other server is available +for the same cookie. + +When the web servers are started, only the backup servers are seen as +available. On the web servers, you need to redirect port 81 to local +port 80, either with a local proxy (eg: a simple haproxy tcp instance), +or with iptables (linux) or pf (openbsd). This is because we want the +real web server to reply on this port, and not a fake one. Eg, with +iptables : + + # /etc/init.d/httpd start + # iptables -t nat -A PREROUTING -p tcp --dport 81 -j REDIRECT --to-port 80 + +A few seconds later, the standard server is seen up and haproxy starts to send +it new requests on its real port 80 (only new users with no cookie, of course). + +If a server completely crashes (even if it does not respond at the IP level), +both the standard and backup servers will fail, so clients associated to this +server will be redispatched to other live servers and will lose their sessions. + +Now if you want to enter a server into maintenance, simply stop it from +responding on port 81 so that its standard instance will be seen as failed, +but the backup will still work. Users will not notice anything since the +service is still operational : + + # iptables -t nat -D PREROUTING -p tcp --dport 81 -j REDIRECT --to-port 80 + +The health checks on port 81 for this server will quickly fail, and the +standard server will be seen as failed. No new session will be sent to this +server, and existing clients with a valid cookie will still reach it because +the backup server will still be up. + +Now wait as long as you want for the old users to stop using the service, and +once you see that the server does not receive any traffic, simply stop it : + + # /etc/init.d/httpd stop + +The associated backup server will in turn fail, and if any client still tries +to access this particular server, they will be redispatched to any other valid +server because of the "redispatch" option. + +This method has an advantage : you never touch the proxy when doing server +maintenance. The people managing the servers can make them disappear smoothly. + + +4.2.1 Variations for operating systems without any firewall software +-------------------------------------------------------------------- + +The downside is that you need a redirection solution on the server just for +the health-checks. If the server OS does not support any firewall software, +this redirection can also be handled by a simple haproxy in tcp mode : + + global + daemon + quiet + pidfile /var/run/haproxy-checks.pid + listen 0.0.0.0:81 + mode tcp + dispatch 127.0.0.1:80 + contimeout 1000 + clitimeout 10000 + srvtimeout 10000 + +To start the web service : + + # /etc/init.d/httpd start + # haproxy -f /etc/haproxy/haproxy-checks.cfg + +To soft-stop the service : + + # kill $(</var/run/haproxy-checks.pid) + +The port 81 will stop responding and the load-balancer will notice the failure. + + +4.2.2 Centralizing the server management +---------------------------------------- + +If one finds it preferable to manage the servers from the load-balancer itself, +the port redirector can be installed on the load-balancer itself. See the +example with iptables below. + +Make the servers appear as operational : + # iptables -t nat -A OUTPUT -d 192.168.1.11 -p tcp --dport 81 -j DNAT --to-dest :80 + # iptables -t nat -A OUTPUT -d 192.168.1.12 -p tcp --dport 81 -j DNAT --to-dest :80 + # iptables -t nat -A OUTPUT -d 192.168.1.13 -p tcp --dport 81 -j DNAT --to-dest :80 + # iptables -t nat -A OUTPUT -d 192.168.1.14 -p tcp --dport 81 -j DNAT --to-dest :80 + +Soft stop one server : + # iptables -t nat -D OUTPUT -d 192.168.1.12 -p tcp --dport 81 -j DNAT --to-dest :80 + +Another solution is to use the "COMAFILE" patch provided by Alexander Lazic, +which is available for download here : + + http://w.ods.org/tools/haproxy/contrib/ + + +4.2.3 Notes : +------------- + - Never, ever, start a fake service on port 81 for the health-checks, because + a real web service failure will not be detected as long as the fake service + runs. You must really forward the check port to the real application. + + - health-checks will be sent twice as often, once for each standard server, + and once for each backup server. All this will be multiplicated by the + number of processes if you use multi-process mode. You will have to ensure + that all the checks sent to the server do not overload it. + +======================= +4.3 Hot reconfiguration +======================= + +There are two types of haproxy users : + - those who can never do anything in production out of maintenance periods ; + - those who can do anything at any time provided that the consequences are + limited. + +The first ones have no problem stopping the server to change configuration +because they got some maintenance periods during which they can break anything. +So they will even prefer doing a clean stop/start sequence to ensure everything +will work fine upon next reload. Since those have represented the majority of +haproxy uses, there has been little effort trying to improve this. + +However, the second category is a bit different. They like to be able to fix an +error in a configuration file without anyone noticing. This can sometimes also +be the case for the first category because humans are not failsafe. + +For this reason, a new hot reconfiguration mechanism has been introduced in +version 1.1.34. Its usage is very simple and works even in chrooted +environments with lowered privileges. The principle is very simple : upon +reception of a SIGTTOU signal, the proxy will stop listening to all the ports. +This will release the ports so that a new instance can be started. Existing +connections will not be broken at all. If the new instance fails to start, +then sending a SIGTTIN signal back to the original processes will restore +the listening ports. This is possible without any special privileges because +the sockets will not have been closed, so the bind() is still valid. Otherwise, +if the new process starts successfully, then sending a SIGUSR1 signal to the +old one ensures that it will exit as soon as its last session ends. + +A hot reconfiguration script would look like this : + + # save previous state + mv /etc/haproxy/config /etc/haproxy/config.old + mv /var/run/haproxy.pid /var/run/haproxy.pid.old + + mv /etc/haproxy/config.new /etc/haproxy/config + kill -TTOU $(cat /var/run/haproxy.pid.old) + if haproxy -p /var/run/haproxy.pid -f /etc/haproxy/config; then + echo "New instance successfully loaded, stopping previous one." + kill -USR1 $(cat /var/run/haproxy.pid.old) + rm -f /var/run/haproxy.pid.old + exit 1 + else + echo "New instance failed to start, resuming previous one." + kill -TTIN $(cat /var/run/haproxy.pid.old) + rm -f /var/run/haproxy.pid + mv /var/run/haproxy.pid.old /var/run/haproxy.pid + mv /etc/haproxy/config /etc/haproxy/config.new + mv /etc/haproxy/config.old /etc/haproxy/config + exit 0 + fi + +After this, you can still force old connections to end by sending +a SIGTERM to the old process if it still exists : + + kill $(cat /var/run/haproxy.pid.old) + rm -f /var/run/haproxy.pid.old + +Be careful with this as in multi-process mode, some pids might already +have been reallocated to completely different processes. + + +================================================== +5. Multi-site load-balancing with local preference +================================================== + +5.1 Description of the problem +============================== + +Consider a world-wide company with sites on several continents. There are two +production sites SITE1 and SITE2 which host identical applications. There are +many offices around the world. For speed and communication cost reasons, each +office uses the nearest site by default, but can switch to the backup site in +the event of a site or application failure. There also are users on the +production sites, which use their local sites by default, but can switch to the +other site in case of a local application failure. + +The main constraints are : + + - application persistence : although the application is the same on both + sites, there is no session synchronisation between the sites. A failure + of one server or one site can cause a user to switch to another server + or site, but when the server or site comes back, the user must not switch + again. + + - communication costs : inter-site communication should be reduced to the + minimum. Specifically, in case of a local application failure, every + office should be able to switch to the other site without continuing to + use the default site. + +5.2 Solution +============ + - Each production site will have two haproxy load-balancers in front of its + application servers to balance the load across them and provide local HA. + We will call them "S1L1" and "S1L2" on site 1, and "S2L1" and "S2L2" on + site 2. These proxies will extend the application's JSESSIONID cookie to + put the server name as a prefix. + + - Each production site will have one front-end haproxy director to provide + the service to local users and to remote offices. It will load-balance + across the two local load-balancers, and will use the other site's + load-balancers as backup servers. It will insert the local site identifier + in a SITE cookie for the local load-balancers, and the remote site + identifier for the remote load-balancers. These front-end directors will + be called "SD1" and "SD2" for "Site Director". + + - Each office will have one haproxy near the border gateway which will direct + local users to their preference site by default, or to the backup site in + the event of a previous failure. It will also analyze the SITE cookie, and + direct the users to the site referenced in the cookie. Thus, the preferred + site will be declared as a normal server, and the backup site will be + declared as a backup server only, which will only be used when the primary + site is unreachable, or when the primary site's director has forwarded + traffic to the second site. These proxies will be called "OP1".."OPXX" + for "Office Proxy #XX". + + +5.3 Network diagram +=================== + +Note : offices 1 and 2 are on the same continent as site 1, while + office 3 is on the same continent as site 3. Each production + site can reach the second one either through the WAN or through + a dedicated link. + + + Office1 Office2 Office3 + users users users +192.168 # # # 192.168 # # # # # # +.1.0/24 | | | .2.0/24 | | | 192.168.3.0/24 | | | + --+----+-+-+- --+----+-+-+- ---+----+-+-+- + | | .1 | | .1 | | .1 + | +-+-+ | +-+-+ | +-+-+ + | |OP1| | |OP2| | |OP3| ... + ,-:-. +---+ ,-:-. +---+ ,-:-. +---+ + ( X ) ( X ) ( X ) + `-:-' `-:-' ,---. `-:-' + --+---------------+------+----~~~( X )~~~~-------+---------+- + | `---' | + | | + +---+ ,-:-. +---+ ,-:-. + |SD1| ( X ) |SD2| ( X ) + ( SITE 1 ) +-+-+ `-:-' ( SITE 2 ) +-+-+ `-:-' + |.1 | |.1 | + 10.1.1.0/24 | | ,---. 10.2.1.0/24 | | + -+-+-+-+-+-+-+-----+-+--( X )------+-+-+-+-+-+-+-----+-+-- + | | | | | | | `---' | | | | | | | + ...# # # # # |.11 |.12 ...# # # # # |.11 |.12 + Site 1 +-+--+ +-+--+ Site 2 +-+--+ +-+--+ + Local |S1L1| |S1L2| Local |S2L1| |S2L2| + users +-+--+ +--+-+ users +-+--+ +--+-+ + | | | | + 10.1.2.0/24 -+-+-+--+--++-- 10.2.2.0/24 -+-+-+--+--++-- + |.1 |.4 |.1 |.4 + +-+-+ +-+-+ +-+-+ +-+-+ + |W11| ~~~ |W14| |W21| ~~~ |W24| + +---+ +---+ +---+ +---+ + 4 application servers 4 application servers + on site 1 on site 2 + + + +5.4 Description +=============== + +5.4.1 Local users +----------------- + - Office 1 users connect to OP1 = 192.168.1.1 + - Office 2 users connect to OP2 = 192.168.2.1 + - Office 3 users connect to OP3 = 192.168.3.1 + - Site 1 users connect to SD1 = 10.1.1.1 + - Site 2 users connect to SD2 = 10.2.1.1 + +5.4.2 Office proxies +-------------------- + - Office 1 connects to site 1 by default and uses site 2 as a backup. + - Office 2 connects to site 1 by default and uses site 2 as a backup. + - Office 3 connects to site 2 by default and uses site 1 as a backup. + +The offices check the local site's SD proxy every 30 seconds, and the +remote one every 60 seconds. + + +Configuration for Office Proxy OP1 +---------------------------------- + + listen 192.168.1.1:80 + mode http + balance roundrobin + redispatch + cookie SITE + option httpchk HEAD / HTTP/1.0 + server SD1 10.1.1.1:80 cookie SITE1 check inter 30000 + server SD2 10.2.1.1:80 cookie SITE2 check inter 60000 backup + + +Configuration for Office Proxy OP2 +---------------------------------- + + listen 192.168.2.1:80 + mode http + balance roundrobin + redispatch + cookie SITE + option httpchk HEAD / HTTP/1.0 + server SD1 10.1.1.1:80 cookie SITE1 check inter 30000 + server SD2 10.2.1.1:80 cookie SITE2 check inter 60000 backup + + +Configuration for Office Proxy OP3 +---------------------------------- + + listen 192.168.3.1:80 + mode http + balance roundrobin + redispatch + cookie SITE + option httpchk HEAD / HTTP/1.0 + server SD2 10.2.1.1:80 cookie SITE2 check inter 30000 + server SD1 10.1.1.1:80 cookie SITE1 check inter 60000 backup + + +5.4.3 Site directors ( SD1 and SD2 ) +------------------------------------ +The site directors forward traffic to the local load-balancers, and set a +cookie to identify the site. If no local load-balancer is available, or if +the local application servers are all down, it will redirect traffic to the +remote site, and report this in the SITE cookie. In order not to uselessly +load each site's WAN link, each SD will check the other site at a lower +rate. The site directors will also insert their client's address so that +the application server knows which local user or remote site accesses it. + +The SITE cookie which is set by these directors will also be understood +by the office proxies. This is important because if SD1 decides to forward +traffic to site 2, it will write "SITE2" in the "SITE" cookie, and on next +request, the office proxy will automatically and directly talk to SITE2 if +it can reach it. If it cannot, it will still send the traffic to SITE1 +where SD1 will in turn try to reach SITE2. + +The load-balancers checks are performed on port 81. As we'll see further, +the load-balancers provide a health monitoring port 81 which reroutes to +port 80 but which allows them to tell the SD that they are going down soon +and that the SD must not use them anymore. + + +Configuration for SD1 +--------------------- + + listen 10.1.1.1:80 + mode http + balance roundrobin + redispatch + cookie SITE insert indirect + option httpchk HEAD / HTTP/1.0 + option forwardfor + server S1L1 10.1.1.11:80 cookie SITE1 check port 81 inter 4000 + server S1L2 10.1.1.12:80 cookie SITE1 check port 81 inter 4000 + server S2L1 10.2.1.11:80 cookie SITE2 check port 81 inter 8000 backup + server S2L2 10.2.1.12:80 cookie SITE2 check port 81 inter 8000 backup + +Configuration for SD2 +--------------------- + + listen 10.2.1.1:80 + mode http + balance roundrobin + redispatch + cookie SITE insert indirect + option httpchk HEAD / HTTP/1.0 + option forwardfor + server S2L1 10.2.1.11:80 cookie SITE2 check port 81 inter 4000 + server S2L2 10.2.1.12:80 cookie SITE2 check port 81 inter 4000 + server S1L1 10.1.1.11:80 cookie SITE1 check port 81 inter 8000 backup + server S1L2 10.1.1.12:80 cookie SITE1 check port 81 inter 8000 backup + + +5.4.4 Local load-balancers S1L1, S1L2, S2L1, S2L2 +------------------------------------------------- +Please first note that because SD1 and SD2 use the same cookie for both +servers on a same site, the second load-balancer of each site will only +receive load-balanced requests, but as soon as the SITE cookie will be +set, only the first LB will receive the requests because it will be the +first one to match the cookie. + +The load-balancers will spread the load across 4 local web servers, and +use the JSESSIONID provided by the application to provide server persistence +using the new 'prefix' method. Soft-stop will also be implemented as described +in section 4 above. Moreover, these proxies will provide their own maintenance +soft-stop. Port 80 will be used for application traffic, while port 81 will +only be used for health-checks and locally rerouted to port 80. A grace time +will be specified to service on port 80, but not on port 81. This way, a soft +kill (kill -USR1) on the proxy will only kill the health-check forwarder so +that the site director knows it must not use this load-balancer anymore. But +the service will still work for 20 seconds and as long as there are established +sessions. + +These proxies will also be the only ones to disable HTTP keep-alive in the +chain, because it is enough to do it at one place, and it's necessary to do +it with 'prefix' cookies. + +Configuration for S1L1/S1L2 +--------------------------- + + listen 10.1.1.11:80 # 10.1.1.12:80 for S1L2 + grace 20000 # don't kill us until 20 seconds have elapsed + mode http + balance roundrobin + cookie JSESSIONID prefix + option httpclose + option forwardfor + option httpchk HEAD / HTTP/1.0 + server W11 10.1.2.1:80 cookie W11 check port 81 inter 2000 + server W12 10.1.2.2:80 cookie W12 check port 81 inter 2000 + server W13 10.1.2.3:80 cookie W13 check port 81 inter 2000 + server W14 10.1.2.4:80 cookie W14 check port 81 inter 2000 + + server B11 10.1.2.1:80 cookie W11 check port 80 inter 4000 backup + server B12 10.1.2.2:80 cookie W12 check port 80 inter 4000 backup + server B13 10.1.2.3:80 cookie W13 check port 80 inter 4000 backup + server B14 10.1.2.4:80 cookie W14 check port 80 inter 4000 backup + + listen 10.1.1.11:81 # 10.1.1.12:81 for S1L2 + mode tcp + dispatch 10.1.1.11:80 # 10.1.1.12:80 for S1L2 + + +Configuration for S2L1/S2L2 +--------------------------- + + listen 10.2.1.11:80 # 10.2.1.12:80 for S2L2 + grace 20000 # don't kill us until 20 seconds have elapsed + mode http + balance roundrobin + cookie JSESSIONID prefix + option httpclose + option forwardfor + option httpchk HEAD / HTTP/1.0 + server W21 10.2.2.1:80 cookie W21 check port 81 inter 2000 + server W22 10.2.2.2:80 cookie W22 check port 81 inter 2000 + server W23 10.2.2.3:80 cookie W23 check port 81 inter 2000 + server W24 10.2.2.4:80 cookie W24 check port 81 inter 2000 + + server B21 10.2.2.1:80 cookie W21 check port 80 inter 4000 backup + server B22 10.2.2.2:80 cookie W22 check port 80 inter 4000 backup + server B23 10.2.2.3:80 cookie W23 check port 80 inter 4000 backup + server B24 10.2.2.4:80 cookie W24 check port 80 inter 4000 backup + + listen 10.2.1.11:81 # 10.2.1.12:81 for S2L2 + mode tcp + dispatch 10.2.1.11:80 # 10.2.1.12:80 for S2L2 + + +5.5 Comments +------------ +Since each site director sets a cookie identifying the site, remote office +users will have their office proxies direct them to the right site and stick +to this site as long as the user still uses the application and the site is +available. Users on production sites will be directed to the right site by the +site directors depending on the SITE cookie. + +If the WAN link dies on a production site, the remote office users will not +see their site anymore, so they will redirect the traffic to the second site. +If there are dedicated inter-site links as on the diagram above, the second +SD will see the cookie and still be able to reach the original site. For +example : + +Office 1 user sends the following to OP1 : + GET / HTTP/1.0 + Cookie: SITE=SITE1; JSESSIONID=W14~123; + +OP1 cannot reach site 1 because its external router is dead. So the SD1 server +is seen as dead, and OP1 will then forward the request to SD2 on site 2, +regardless of the SITE cookie. + +SD2 on site 2 receives a SITE cookie containing "SITE1". Fortunately, it +can reach Site 1's load balancers S1L1 and S1L2. So it forwards the request +so S1L1 (the first one with the same cookie). + +S1L1 (on site 1) finds "W14" in the JSESSIONID cookie, so it can forward the +request to the right server, and the user session will continue to work. Once +the Site 1's WAN link comes back, OP1 will see SD1 again, and will not route +through SITE 2 anymore. + +However, when a new user on Office 1 connects to the application during a +site 1 failure, it does not contain any cookie. Since OP1 does not see SD1 +because of the network failure, it will direct the request to SD2 on site 2, +which will by default direct the traffic to the local load-balancers, S2L1 and +S2L2. So only initial users will load the inter-site link, not the new ones. + + +=================== +6. Source balancing +=================== + +Sometimes it may reveal useful to access servers from a pool of IP addresses +instead of only one or two. Some equipment (NAT firewalls, load-balancers) +are sensible to source address, and often need many sources to distribute the +load evenly amongst their internal hash buckets. + +To do this, you simply have to use several times the same server with a +different source. Example : + + listen 0.0.0.0:80 + mode tcp + balance roundrobin + server from1to1 10.1.1.1:80 source 10.1.2.1 + server from2to1 10.1.1.1:80 source 10.1.2.2 + server from3to1 10.1.1.1:80 source 10.1.2.3 + server from4to1 10.1.1.1:80 source 10.1.2.4 + server from5to1 10.1.1.1:80 source 10.1.2.5 + server from6to1 10.1.1.1:80 source 10.1.2.6 + server from7to1 10.1.1.1:80 source 10.1.2.7 + server from8to1 10.1.1.1:80 source 10.1.2.8 + + +============================================= +7. Managing high loads on application servers +============================================= + +One of the roles often expected from a load balancer is to mitigate the load on +the servers during traffic peaks. More and more often, we see heavy frameworks +used to deliver flexible and evolutive web designs, at the cost of high loads +on the servers, or very low concurrency. Sometimes, response times are also +rather high. People developing web sites relying on such frameworks very often +look for a load balancer which is able to distribute the load in the most +evenly fashion and which will be nice with the servers. + +There is a powerful feature in haproxy which achieves exactly this : request +queueing associated with concurrent connections limit. + +Let's say you have an application server which supports at most 20 concurrent +requests. You have 3 servers, so you can accept up to 60 concurrent HTTP +connections, which often means 30 concurrent users in case of keep-alive (2 +persistent connections per user). + +Even if you disable keep-alive, if the server takes a long time to respond, +you still have a high risk of multiple users clicking at the same time and +having their requests unserved because of server saturation. To work around +the problem, you increase the concurrent connection limit on the servers, +but their performance stalls under higher loads. + +The solution is to limit the number of connections between the clients and the +servers. You set haproxy to limit the number of connections on a per-server +basis, and you let all the users you want connect to it. It will then fill all +the servers up to the configured connection limit, and will put the remaining +connections in a queue, waiting for a connection to be released on a server. + +This ensures five essential principles : + + - all clients can be served whatever their number without crashing the + servers, the only impact it that the response time can be delayed. + + - the servers can be used at full throttle without the risk of stalling, + and fine tuning can lead to optimal performance. + + - response times can be reduced by making the servers work below the + congestion point, effectively leading to shorter response times even + under moderate loads. + + - no domino effect when a server goes down or starts up. Requests will be + queued more or less, always respecting servers limits. + + - it's easy to achieve high performance even on memory-limited hardware. + Indeed, heavy frameworks often consume huge amounts of RAM and not always + all the CPU available. In case of wrong sizing, reducing the number of + concurrent connections will protect against memory shortages while still + ensuring optimal CPU usage. + + +Example : +--------- + +HAProxy is installed in front of an application servers farm. It will limit +the concurrent connections to 4 per server (one thread per CPU), thus ensuring +very fast response times. + + + 192.168.1.1 192.168.1.11-192.168.1.13 192.168.1.2 + -------+-------------+-----+-----+------------+---- + | | | | _|_db + +--+--+ +-+-+ +-+-+ +-+-+ (___) + | LB1 | | A | | B | | C | (___) + +-----+ +---+ +---+ +---+ (___) + haproxy 3 application servers + with heavy frameworks + + +Config on haproxy (LB1) : +------------------------- + + listen appfarm 192.168.1.1:80 + mode http + maxconn 10000 + option httpclose + option forwardfor + balance roundrobin + cookie SERVERID insert indirect + option httpchk HEAD /index.html HTTP/1.0 + server railsA 192.168.1.11:80 cookie A maxconn 4 check + server railsB 192.168.1.12:80 cookie B maxconn 4 check + server railsC 192.168.1.13:80 cookie C maxconn 4 check + contimeout 60000 + + +Description : +------------- +The proxy listens on IP 192.168.1.1, port 80, and expects HTTP requests. It +can accept up to 10000 concurrent connections on this socket. It follows the +roundrobin algorithm to assign servers to connections as long as servers are +not saturated. + +It allows up to 4 concurrent connections per server, and will queue the +requests above this value. The "contimeout" parameter is used to set the +maximum time a connection may take to establish on a server, but here it +is also used to set the maximum time a connection may stay unserved in the +queue (1 minute here). + +If the servers can each process 4 requests in 10 ms on average, then at 3000 +connections, response times will be delayed by at most : + + 3000 / 3 servers / 4 conns * 10 ms = 2.5 seconds + +Which is not that dramatic considering the huge number of users for such a low +number of servers. + +When connection queues fill up and application servers are starving, response +times will grow and users might abort by clicking on the "Stop" button. It is +very undesirable to send aborted requests to servers, because they will eat +CPU cycles for nothing. + +An option has been added to handle this specific case : "option abortonclose". +By specifying it, you tell haproxy that if an input channel is closed on the +client side AND the request is still waiting in the queue, then it is highly +likely that the user has stopped, so we remove the request from the queue +before it will get served. + + +Managing unfair response times +------------------------------ + +Sometimes, the application server will be very slow for some requests (eg: +login page) and faster for other requests. This may cause excessive queueing +of expectedly fast requests when all threads on the server are blocked on a +request to the database. Then the only solution is to increase the number of +concurrent connections, so that the server can handle a large average number +of slow connections with threads left to handle faster connections. + +But as we have seen, increasing the number of connections on the servers can +be detrimental to performance (eg: Apache processes fighting for the accept() +lock). To improve this situation, the "minconn" parameter has been introduced. +When it is set, the maximum connection concurrency on the server will be bound +by this value, and the limit will increase with the number of clients waiting +in queue, till the clients connected to haproxy reach the proxy's maxconn, in +which case the connections per server will reach the server's maxconn. It means +that during low-to-medium loads, the minconn will be applied, and during surges +the maxconn will be applied. It ensures both optimal response times under +normal loads, and availability under very high loads. + +Example : +--------- + + listen appfarm 192.168.1.1:80 + mode http + maxconn 10000 + option httpclose + option abortonclose + option forwardfor + balance roundrobin + # The servers will get 4 concurrent connections under low + # loads, and 12 when there will be 10000 clients. + server railsA 192.168.1.11:80 minconn 4 maxconn 12 check + server railsB 192.168.1.12:80 minconn 4 maxconn 12 check + server railsC 192.168.1.13:80 minconn 4 maxconn 12 check + contimeout 60000 + + diff --git a/doc/close-options.txt b/doc/close-options.txt new file mode 100644 index 0000000..0554bb8 --- /dev/null +++ b/doc/close-options.txt @@ -0,0 +1,39 @@ +2011/04/20 - List of keep-alive / close options with associated behaviours. + +PK="http-pretend-keepalive", HC="httpclose", SC="http-server-close", + +0 = option not set +1 = option is set +* = option doesn't matter + +Options can be split between frontend and backend, so some of them might have +a meaning only when combined by associating a frontend to a backend. Some forms +are not the normal ones and provide a behaviour compatible with another normal +form. Those are considered alternate forms and are marked "(alt)". + +SC HC PK Behaviour + 0 0 X tunnel mode + 0 1 0 passive close, only set headers then tunnel + 0 1 1 forced close with keep-alive announce (alt) + 1 0 0 server close + 1 0 1 server close with keep-alive announce + 1 1 0 forced close (alt) + 1 1 1 forced close with keep-alive announce (alt) + +At this point this results in 4 distinct effective modes for a request being +processed : + - tunnel mode : Connection header is left untouched and body is ignored + - passive close : Connection header is changed and body is ignored + - server close : Connection header set, body scanned, client-side keep-alive + is made possible regardless of server-side capabilities + - forced close : Connection header set, body scanned, connection closed. + +The "close" modes may be combined with a fake keep-alive announce to the server +in order to workaround buggy servers that disable chunked encoding and content +length announces when the client does not ask for keep-alive. + +Note: "http-pretend-keepalive" alone has no effect. However, if it is set in a + backend while a frontend is in "http-close" mode, then the combination of + both will result in a forced close with keep-alive announces for requests + passing through both. + diff --git a/doc/coding-style.txt b/doc/coding-style.txt new file mode 100644 index 0000000..02a55f5 --- /dev/null +++ b/doc/coding-style.txt @@ -0,0 +1,1566 @@ +2020/07/07 - HAProxy coding style - Willy Tarreau <w@1wt.eu> +------------------------------------------------------------ + +A number of contributors are often embarrassed with coding style issues, they +don't always know if they're doing it right, especially since the coding style +has elvoved along the years. What is explained here is not necessarily what is +applied in the code, but new code should as much as possible conform to this +style. Coding style fixes happen when code is replaced. It is useless to send +patches to fix coding style only, they will be rejected, unless they belong to +a patch series which needs these fixes prior to get code changes. Also, please +avoid fixing coding style in the same patches as functional changes, they make +code review harder. + +A good way to quickly validate your patch before submitting it is to pass it +through the Linux kernel's checkpatch.pl utility which can be downloaded here : + + http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/scripts/checkpatch.pl + +Running it with the following options relaxes its checks to accommodate to the +extra degree of freedom that is tolerated in HAProxy's coding style compared to +the stricter style used in the kernel : + + checkpatch.pl -q --max-line-length=160 --no-tree --no-signoff \ + --ignore=LEADING_SPACE,CODE_INDENT,DEEP_INDENTATION \ + --ignore=ELSE_AFTER_BRACE < patch + +You can take its output as hints instead of strict rules, but in general its +output will be accurate and it may even spot some real bugs. + +When modifying a file, you must accept the terms of the license of this file +which is recalled at the top of the file, or is explained in the LICENSE file, +or if not stated, defaults to LGPL version 2.1 or later for files in the +'include' directory, and GPL version 2 or later for all other files. + +When adding a new file, you must add a copyright banner at the top of the +file with your real name, e-mail address and a reminder of the license. +Contributions under incompatible licenses or too restrictive licenses might +get rejected. If in doubt, please apply the principle above for existing files. + +All code examples below will intentionally be prefixed with " | " to mark +where the code aligns with the first column, and tabs in this document will be +represented as a series of 8 spaces so that it displays the same everywhere. + + +1) Indentation and alignment +---------------------------- + +1.1) Indentation +---------------- + +Indentation and alignment are two completely different things that people often +get wrong. Indentation is used to mark a sub-level in the code. A sub-level +means that a block is executed in the context of another block (eg: a function +or a condition) : + + | main(int argc, char **argv) + | { + | int i; + | + | if (argc < 2) + | exit(1); + | } + +In the example above, the code belongs to the main() function and the exit() +call belongs to the if statement. Indentation is made with tabs (\t, ASCII 9), +which allows any developer to configure their preferred editor to use their +own tab size and to still get the text properly indented. Exactly one tab is +used per sub-level. Tabs may only appear at the beginning of a line or after +another tab. It is illegal to put a tab after some text, as it mangles displays +in a different manner for different users (particularly when used to align +comments or values after a #define). If you're tempted to put a tab after some +text, then you're doing it wrong and you need alignment instead (see below). + +Note that there are places where the code was not properly indented in the +past. In order to view it correctly, you may have to set your tab size to 8 +characters. + + +1.2) Alignment +-------------- + +Alignment is used to continue a line in a way to makes things easier to group +together. By definition, alignment is character-based, so it uses spaces. Tabs +would not work because for one tab there would not be as many characters on all +displays. For instance, the arguments in a function declaration may be broken +into multiple lines using alignment spaces : + + | int http_header_match2(const char *hdr, const char *end, + | const char *name, int len) + | { + | ... + | } + +In this example, the "const char *name" part is aligned with the first +character of the group it belongs to (list of function arguments). Placing it +here makes it obvious that it's one of the function's arguments. Multiple lines +are easy to handle this way. This is very common with long conditions too : + + | if ((len < eol - sol) && + | (sol[len] == ':') && + | (strncasecmp(sol, name, len) == 0)) { + | ctx->del = len; + | } + +If we take again the example above marking tabs with "[-Tabs-]" and spaces +with "#", we get this : + + | [-Tabs-]if ((len < eol - sol) && + | [-Tabs-]####(sol[len] == ':') && + | [-Tabs-]####(strncasecmp(sol, name, len) == 0)) { + | [-Tabs-][-Tabs-]ctx->del = len; + | [-Tabs-]} + +It is worth noting that some editors tend to confuse indentations and alignment. +Emacs is notoriously known for this brokenness, and is responsible for almost +all of the alignment mess. The reason is that Emacs only counts spaces, tries +to fill as many as possible with tabs and completes with spaces. Once you know +it, you just have to be careful, as alignment is not used much, so generally it +is just a matter of replacing the last tab with 8 spaces when this happens. + +Indentation should be used everywhere there is a block or an opening brace. It +is not possible to have two consecutive closing braces on the same column, it +means that the innermost was not indented. + +Right : + + | main(int argc, char **argv) + | { + | if (argc > 1) { + | printf("Hello\n"); + | } + | exit(0); + | } + +Wrong : + + | main(int argc, char **argv) + | { + | if (argc > 1) { + | printf("Hello\n"); + | } + | exit(0); + | } + +A special case applies to switch/case statements. Due to my editor's settings, +I've been used to align "case" with "switch" and to find it somewhat logical +since each of the "case" statements opens a sublevel belonging to the "switch" +statement. But indenting "case" after "switch" is accepted too. However in any +case, whatever follows the "case" statement must be indented, whether or not it +contains braces : + + | switch (*arg) { + | case 'A': { + | int i; + | for (i = 0; i < 10; i++) + | printf("Please stop pressing 'A'!\n"); + | break; + | } + | case 'B': + | printf("You pressed 'B'\n"); + | break; + | case 'C': + | case 'D': + | printf("You pressed 'C' or 'D'\n"); + | break; + | default: + | printf("I don't know what you pressed\n"); + | } + + +2) Braces +--------- + +Braces are used to delimit multiple-instruction blocks. In general it is +preferred to avoid braces around single-instruction blocks as it reduces the +number of lines : + +Right : + + | if (argc >= 2) + | exit(0); + +Wrong : + + | if (argc >= 2) { + | exit(0); + | } + +But it is not that strict, it really depends on the context. It happens from +time to time that single-instruction blocks are enclosed within braces because +it makes the code more symmetrical, or more readable. Example : + + | if (argc < 2) { + | printf("Missing argument\n"); + | exit(1); + | } else { + | exit(0); + | } + +Braces are always needed to declare a function. A function's opening brace must +be placed at the beginning of the next line : + +Right : + + | int main(int argc, char **argv) + | { + | exit(0); + | } + +Wrong : + + | int main(int argc, char **argv) { + | exit(0); + | } + +Note that a large portion of the code still does not conforms to this rule, as +it took years to get all authors to adapt to this more common standard which +is now preferred, as it avoids visual confusion when function declarations are +broken on multiple lines : + +Right : + + | int foo(const char *hdr, const char *end, + | const char *name, const char *err, + | int len) + | { + | int i; + +Wrong : + + | int foo(const char *hdr, const char *end, + | const char *name, const char *err, + | int len) { + | int i; + +Braces should always be used where there might be an ambiguity with the code +later. The most common example is the stacked "if" statement where an "else" +may be added later at the wrong place breaking the code, but it also happens +with comments or long arguments in function calls. In general, if a block is +more than one line long, it should use braces. + +Dangerous code waiting of a victim : + + | if (argc < 2) + | /* ret must not be negative here */ + | if (ret < 0) + | return -1; + +Wrong change : + + | if (argc < 2) + | /* ret must not be negative here */ + | if (ret < 0) + | return -1; + | else + | return 0; + +It will do this instead of what your eye seems to tell you : + + | if (argc < 2) + | /* ret must not be negative here */ + | if (ret < 0) + | return -1; + | else + | return 0; + +Right : + + | if (argc < 2) { + | /* ret must not be negative here */ + | if (ret < 0) + | return -1; + | } + | else + | return 0; + +Similarly dangerous example : + + | if (ret < 0) + | /* ret must not be negative here */ + | complain(); + | init(); + +Wrong change to silent the annoying message : + + | if (ret < 0) + | /* ret must not be negative here */ + | //complain(); + | init(); + +... which in fact means : + + | if (ret < 0) + | init(); + + +3) Breaking lines +----------------- + +There is no strict rule for line breaking. Some files try to stick to the 80 +column limit, but given that various people use various tab sizes, it does not +make much sense. Also, code is sometimes easier to read with less lines, as it +represents less surface on the screen (since each new line adds its tabs and +spaces). The rule is to stick to the average line length of other lines. If you +are working in a file which fits in 80 columns, try to keep this goal in mind. +If you're in a function with 120-chars lines, there is no reason to add many +short lines, so you can make longer lines. + +In general, opening a new block should lead to a new line. Similarly, multiple +instructions should be avoided on the same line. But some constructs make it +more readable when those are perfectly aligned : + +A copy-paste bug in the following construct will be easier to spot : + + | if (omult % idiv == 0) { omult /= idiv; idiv = 1; } + | if (idiv % omult == 0) { idiv /= omult; omult = 1; } + | if (imult % odiv == 0) { imult /= odiv; odiv = 1; } + | if (odiv % imult == 0) { odiv /= imult; imult = 1; } + +than in this one : + + | if (omult % idiv == 0) { + | omult /= idiv; + | idiv = 1; + | } + | if (idiv % omult == 0) { + | idiv /= omult; + | omult = 1; + | } + | if (imult % odiv == 0) { + | imult /= odiv; + | odiv = 1; + | } + | if (odiv % imult == 0) { + | odiv /= imult; + | imult = 1; + | } + +What is important is not to mix styles. For instance there is nothing wrong +with having many one-line "case" statements as long as most of them are this +short like below : + + | switch (*arg) { + | case 'A': ret = 1; break; + | case 'B': ret = 2; break; + | case 'C': ret = 4; break; + | case 'D': ret = 8; break; + | default : ret = 0; break; + | } + +Otherwise, prefer to have the "case" statement on its own line as in the +example in section 1.2 about alignment. In any case, avoid to stack multiple +control statements on the same line, so that it will never be the needed to +add two tab levels at once : + +Right : + + | switch (*arg) { + | case 'A': + | if (ret < 0) + | ret = 1; + | break; + | default : ret = 0; break; + | } + +Wrong : + + | switch (*arg) { + | case 'A': if (ret < 0) + | ret = 1; + | break; + | default : ret = 0; break; + | } + +Right : + + | if (argc < 2) + | if (ret < 0) + | return -1; + +or Right : + + | if (argc < 2) + | if (ret < 0) return -1; + +but Wrong : + + | if (argc < 2) if (ret < 0) return -1; + + +When complex conditions or expressions are broken into multiple lines, please +do ensure that alignment is perfectly appropriate, and group all main operators +on the same side (which you're free to choose as long as it does not change for +every block. Putting binary operators on the right side is preferred as it does +not mangle with alignment but various people have their preferences. + +Right : + + | if ((txn->flags & TX_NOT_FIRST) && + | ((req->flags & BF_FULL) || + | req->r < req->lr || + | req->r > req->data + req->size - global.tune.maxrewrite)) { + | return 0; + | } + +Right : + + | if ((txn->flags & TX_NOT_FIRST) + | && ((req->flags & BF_FULL) + | || req->r < req->lr + | || req->r > req->data + req->size - global.tune.maxrewrite)) { + | return 0; + | } + +Wrong : + + | if ((txn->flags & TX_NOT_FIRST) && + | ((req->flags & BF_FULL) || + | req->r < req->lr + | || req->r > req->data + req->size - global.tune.maxrewrite)) { + | return 0; + | } + +If it makes the result more readable, parenthesis may even be closed on their +own line in order to align with the opening one. Note that should normally not +be needed because such code would be too complex to be digged into. + +The "else" statement may either be merged with the closing "if" brace or lie on +its own line. The later is preferred but it adds one extra line to each control +block which is annoying in short ones. However, if the "else" is followed by an +"if", then it should really be on its own line and the rest of the if/else +blocks must follow the same style. + +Right : + + | if (a < b) { + | return a; + | } + | else { + | return b; + | } + +Right : + + | if (a < b) { + | return a; + | } else { + | return b; + | } + +Right : + + | if (a < b) { + | return a; + | } + | else if (a != b) { + | return b; + | } + | else { + | return 0; + | } + +Wrong : + + | if (a < b) { + | return a; + | } else if (a != b) { + | return b; + | } else { + | return 0; + | } + +Wrong : + + | if (a < b) { + | return a; + | } + | else if (a != b) { + | return b; + | } else { + | return 0; + | } + + +4) Spacing +---------- + +Correctly spacing code is very important. When you have to spot a bug at 3am, +you need it to be clear. When you expect other people to review your code, you +want it to be clear and don't want them to get nervous when trying to find what +you did. + +Always place spaces around all binary or ternary operators, commas, as well as +after semi-colons and opening braces if the line continues : + +Right : + + | int ret = 0; + | /* if (x >> 4) { x >>= 4; ret += 4; } */ + | ret += (x >> 4) ? (x >>= 4, 4) : 0; + | val = ret + ((0xFFFFAA50U >> (x << 1)) & 3) + 1; + +Wrong : + + | int ret=0; + | /* if (x>>4) {x>>=4;ret+=4;} */ + | ret+=(x>>4)?(x>>=4,4):0; + | val=ret+((0xFFFFAA50U>>(x<<1))&3)+1; + +Never place spaces after unary operators (&, *, -, !, ~, ++, --) nor cast, as +they might be confused with they binary counterpart, nor before commas or +semicolons : + +Right : + + | bit = !!(~len++ ^ -(unsigned char)*x); + +Wrong : + + | bit = ! ! (~len++ ^ - (unsigned char) * x) ; + +Note that "sizeof" is a unary operator which is sometimes considered as a +language keyword, but in no case it is a function. It does not require +parenthesis so it is sometimes followed by spaces and sometimes not when +there are no parenthesis. Most people do not really care as long as what +is written is unambiguous. + +Braces opening a block must be preceded by one space unless the brace is +placed on the first column : + +Right : + + | if (argc < 2) { + | } + +Wrong : + + | if (argc < 2){ + | } + +Do not add unneeded spaces inside parenthesis, they just make the code less +readable. + +Right : + + | if (x < 4 && (!y || !z)) + | break; + +Wrong : + + | if ( x < 4 && ( !y || !z ) ) + | break; + +Language keywords must all be followed by a space. This is true for control +statements (do, for, while, if, else, return, switch, case), and for types +(int, char, unsigned). As an exception, the last type in a cast does not take +a space before the closing parenthesis). The "default" statement in a "switch" +construct is generally just followed by the colon. However the colon after a +"case" or "default" statement must be followed by a space. + +Right : + + | if (nbargs < 2) { + | printf("Missing arg at %c\n", *(char *)ptr); + | for (i = 0; i < 10; i++) beep(); + | return 0; + | } + | switch (*arg) { + +Wrong : + + | if(nbargs < 2){ + | printf("Missing arg at %c\n", *(char*)ptr); + | for(i = 0; i < 10; i++)beep(); + | return 0; + | } + | switch(*arg) { + +Function calls are different, the opening parenthesis is always coupled to the +function name without any space. But spaces are still needed after commas : + +Right : + + | if (!init(argc, argv)) + | exit(1); + +Wrong : + + | if (!init (argc,argv)) + | exit(1); + + +5) Excess or lack of parenthesis +-------------------------------- + +Sometimes there are too many parenthesis in some formulas, sometimes there are +too few. There are a few rules of thumb for this. The first one is to respect +the compiler's advice. If it emits a warning and asks for more parenthesis to +avoid confusion, follow the advice at least to shut the warning. For instance, +the code below is quite ambiguous due to its alignment : + + | if (var1 < 2 || var2 < 2 && + | var3 != var4) { + | /* fail */ + | return -3; + | } + +Note that this code does : + + | if (var1 < 2 || (var2 < 2 && var3 != var4)) { + | /* fail */ + | return -3; + | } + +But maybe the author meant : + + | if ((var1 < 2 || var2 < 2) && var3 != var4) { + | /* fail */ + | return -3; + | } + +A second rule to put parenthesis is that people don't always know operators +precedence too well. Most often they have no issue with operators of the same +category (eg: booleans, integers, bit manipulation, assignment) but once these +operators are mixed, it causes them all sort of issues. In this case, it is +wise to use parenthesis to avoid errors. One common error concerns the bit +shift operators because they're used to replace multiplies and divides but +don't have the same precedence : + +The expression : + + | x = y * 16 + 5; + +becomes : + + | x = y << 4 + 5; + +which is wrong because it is equivalent to : + + | x = y << (4 + 5); + +while the following was desired instead : + + | x = (y << 4) + 5; + +It is generally fine to write boolean expressions based on comparisons without +any parenthesis. But on top of that, integer expressions and assignments should +then be protected. For instance, there is an error in the expression below +which should be safely rewritten : + +Wrong : + + | if (var1 > 2 && var1 < 10 || + | var1 > 2 + 256 && var2 < 10 + 256 || + | var1 > 2 + 1 << 16 && var2 < 10 + 2 << 16) + | return 1; + +Right (may remove a few parenthesis depending on taste) : + + | if ((var1 > 2 && var1 < 10) || + | (var1 > (2 + 256) && var2 < (10 + 256)) || + | (var1 > (2 + (1 << 16)) && var2 < (10 + (1 << 16)))) + | return 1; + +The "return" statement is not a function, so it takes no argument. It is a +control statement which is followed by the expression to be returned. It does +not need to be followed by parenthesis : + +Wrong : + + | int ret0() + | { + | return(0); + | } + +Right : + + | int ret0() + | { + | return 0; + | } + +Parenthesisis are also found in type casts. Type casting should be avoided as +much as possible, especially when it concerns pointer types. Casting a pointer +disables the compiler's type checking and is the best way to get caught doing +wrong things with data not the size you expect. If you need to manipulate +multiple data types, you can use a union instead. If the union is really not +convenient and casts are easier, then try to isolate them as much as possible, +for instance when initializing function arguments or in another function. Not +proceeding this way causes huge risks of not using the proper pointer without +any notification, which is especially true during copy-pastes. + +Wrong : + + | void *check_private_data(void *arg1, void *arg2) + | { + | char *area; + | + | if (*(int *)arg1 > 1000) + | return NULL; + | if (memcmp(*(const char *)arg2, "send(", 5) != 0)) + | return NULL; + | area = malloc(*(int *)arg1); + | if (!area) + | return NULL; + | memcpy(area, *(const char *)arg2 + 5, *(int *)arg1); + | return area; + | } + +Right : + + | void *check_private_data(void *arg1, void *arg2) + | { + | char *area; + | int len = *(int *)arg1; + | const char *msg = arg2; + | + | if (len > 1000) + | return NULL; + | if (memcmp(msg, "send(", 5) != 0) + | return NULL; + | area = malloc(len); + | if (!area) + | return NULL; + | memcpy(area, msg + 5, len); + | return area; + | } + + +6) Ambiguous comparisons with zero or NULL +------------------------------------------ + +In C, '0' has no type, or it has the type of the variable it is assigned to. +Comparing a variable or a return value with zero means comparing with the +representation of zero for this variable's type. For a boolean, zero is false. +For a pointer, zero is NULL. Very often, to make things shorter, it is fine to +use the '!' unary operator to compare with zero, as it is shorter and easier to +remind or understand than a plain '0'. Since the '!' operator is read "not", it +helps read code faster when what follows it makes sense as a boolean, and it is +often much more appropriate than a comparison with zero which makes an equal +sign appear at an undesirable place. For instance : + + | if (!isdigit(*c) && !isspace(*c)) + | break; + +is easier to understand than : + + | if (isdigit(*c) == 0 && isspace(*c) == 0) + | break; + +For a char this "not" operator can be reminded as "no remaining char", and the +absence of comparison to zero implies existence of the tested entity, hence the +simple strcpy() implementation below which automatically stops once the last +zero is copied : + + | void my_strcpy(char *d, const char *s) + | { + | while ((*d++ = *s++)); + | } + +Note the double parenthesis in order to avoid the compiler telling us it looks +like an equality test. + +For a string or more generally any pointer, this test may be understood as an +existence test or a validity test, as the only pointer which will fail to +validate equality is the NULL pointer : + + | area = malloc(1000); + | if (!area) + | return -1; + +However sometimes it can fool the reader. For instance, strcmp() precisely is +one of such functions whose return value can make one think the opposite due to +its name which may be understood as "if strings compare...". Thus it is strongly +recommended to perform an explicit comparison with zero in such a case, and it +makes sense considering that the comparison's operator is the same that is +wanted to compare the strings (note that current config parser lacks a lot in +this regards) : + + strcmp(a, b) == 0 <=> a == b + strcmp(a, b) != 0 <=> a != b + strcmp(a, b) < 0 <=> a < b + strcmp(a, b) > 0 <=> a > b + +Avoid this : + + | if (strcmp(arg, "test")) + | printf("this is not a test\n"); + | + | if (!strcmp(arg, "test")) + | printf("this is a test\n"); + +Prefer this : + + | if (strcmp(arg, "test") != 0) + | printf("this is not a test\n"); + | + | if (strcmp(arg, "test") == 0) + | printf("this is a test\n"); + + +7) System call returns +---------------------- + +This is not directly a matter of coding style but more of bad habits. It is +important to check for the correct value upon return of syscalls. The proper +return code indicating an error is described in its man page. There is no +reason to consider wider ranges than what is indicated. For instance, it is +common to see such a thing : + + | if ((fd = open(file, O_RDONLY)) < 0) + | return -1; + +This is wrong. The man page says that -1 is returned if an error occurred. It +does not suggest that any other negative value will be an error. It is possible +that a few such issues have been left in existing code. They are bugs for which +fixes are accepted, even though they're currently harmless since open() is not +known for returning negative values at the moment. + + +8) Declaring new types, names and values +---------------------------------------- + +Please refrain from using "typedef" to declare new types, they only obfuscate +the code. The reader never knows whether he's manipulating a scalar type or a +struct. For instance it is not obvious why the following code fails to build : + + | int delay_expired(timer_t exp, timer_us_t now) + | { + | return now >= exp; + | } + +With the types declared in another file this way : + + | typedef unsigned int timer_t; + | typedef struct timeval timer_us_t; + +This cannot work because we're comparing a scalar with a struct, which does +not make sense. Without a typedef, the function would have been written this +way without any ambiguity and would not have failed : + + | int delay_expired(unsigned int exp, struct timeval *now) + | { + | return now >= exp->tv_sec; + | } + +Declaring special values may be done using enums. Enums are a way to define +structured integer values which are related to each other. They are perfectly +suited for state machines. While the first element is always assigned the zero +value, not everybody knows that, especially people working with multiple +languages all the day. For this reason it is recommended to explicitly force +the first value even if it's zero. The last element should be followed by a +comma if it is planned that new elements might later be added, this will make +later patches shorter. Conversely, if the last element is placed in order to +get the number of possible values, it must not be followed by a comma and must +be preceded by a comment : + + | enum { + | first = 0, + | second, + | third, + | fourth, + | }; + + + | enum { + | first = 0, + | second, + | third, + | fourth, + | /* nbvalues must always be placed last */ + | nbvalues + | }; + +Structure names should be short enough not to mangle function declarations, +and explicit enough to avoid confusion (which is the most important thing). + +Wrong : + + | struct request_args { /* arguments on the query string */ + | char *name; + | char *value; + | struct misc_args *next; + | }; + +Right : + + | struct qs_args { /* arguments on the query string */ + | char *name; + | char *value; + | struct qs_args *next; + | } + + +When declaring new functions or structures, please do not use CamelCase, which +is a style where upper and lower case are mixed in a single word. It causes a +lot of confusion when words are composed from acronyms, because it's hard to +stick to a rule. For instance, a function designed to generate an ISN (initial +sequence number) for a TCP/IP connection could be called : + + - generateTcpipIsn() + - generateTcpIpIsn() + - generateTcpIpISN() + - generateTCPIPISN() + etc... + +None is right, none is wrong, these are just preferences which might change +along the code. Instead, please use an underscore to separate words. Lowercase +is preferred for the words, but if acronyms are upcased it's not dramatic. The +real advantage of this method is that it creates unambiguous levels even for +short names. + +Valid examples : + + - generate_tcpip_isn() + - generate_tcp_ip_isn() + - generate_TCPIP_ISN() + - generate_TCP_IP_ISN() + +Another example is easy to understand when 3 arguments are involved in naming +the function : + +Wrong (naming conflict) : + + | /* returns A + B * C */ + | int mulABC(int a, int b, int c) + | { + | return a + b * c; + | } + | + | /* returns (A + B) * C */ + | int mulABC(int a, int b, int c) + | { + | return (a + b) * c; + | } + +Right (unambiguous naming) : + + | /* returns A + B * C */ + | int mul_a_bc(int a, int b, int c) + | { + | return a + b * c; + | } + | + | /* returns (A + B) * C */ + | int mul_ab_c(int a, int b, int c) + | { + | return (a + b) * c; + | } + +Whenever you manipulate pointers, try to declare them as "const", as it will +save you from many accidental misuses and will only cause warnings to be +emitted when there is a real risk. In the examples below, it is possible to +call my_strcpy() with a const string only in the first declaration. Note that +people who ignore "const" are often the ones who cast a lot and who complain +from segfaults when using strtok() ! + +Right : + + | void my_strcpy(char *d, const char *s) + | { + | while ((*d++ = *s++)); + | } + | + | void say_hello(char *dest) + | { + | my_strcpy(dest, "hello\n"); + | } + +Wrong : + + | void my_strcpy(char *d, char *s) + | { + | while ((*d++ = *s++)); + | } + | + | void say_hello(char *dest) + | { + | my_strcpy(dest, "hello\n"); + | } + + +9) Getting macros right +----------------------- + +It is very common for macros to do the wrong thing when used in a way their +author did not have in mind. For this reason, macros must always be named with +uppercase letters only. This is the only way to catch the developer's eye when +using them, so that they double-check whether they are taking a risk or not. First, +macros must never ever be terminated by a semi-colon, or they will close the +wrong block once in a while. For instance, the following will cause a build +error before the "else" due to the double semi-colon : + +Wrong : + + | #define WARN printf("warning\n"); + | ... + | if (a < 0) + | WARN; + | else + | a--; + +Right : + + | #define WARN printf("warning\n") + +If multiple instructions are needed, then use a do { } while (0) block, which +is the only construct which respects *exactly* the semantics of a single +instruction : + + | #define WARN do { printf("warning\n"); log("warning\n"); } while (0) + | ... + | + | if (a < 0) + | WARN; + | else + | a--; + +Second, do not put unprotected control statements in macros, they will +definitely cause bugs : + +Wrong : + + | #define WARN if (verbose) printf("warning\n") + | ... + | if (a < 0) + | WARN; + | else + | a--; + +Which is equivalent to the undesired form below : + + | if (a < 0) + | if (verbose) + | printf("warning\n"); + | else + | a--; + +Right way to do it : + + | #define WARN do { if (verbose) printf("warning\n"); } while (0) + | ... + | if (a < 0) + | WARN; + | else + | a--; + +Which is equivalent to : + + | if (a < 0) + | do { if (verbose) printf("warning\n"); } while (0); + | else + | a--; + +Macro parameters must always be surrounded by parenthesis, and must never be +duplicated in the same macro unless explicitly stated. Also, macros must not be +defined with operators without surrounding parenthesis. The MIN/MAX macros are +a pretty common example of multiple misuses, but this happens as early as when +using bit masks. Most often, in case of any doubt, try to use inline functions +instead. + +Wrong : + + | #define MIN(a, b) a < b ? a : b + | + | /* returns 2 * min(a,b) + 1 */ + | int double_min_p1(int a, int b) + | { + | return 2 * MIN(a, b) + 1; + | } + +What this will do : + + | int double_min_p1(int a, int b) + | { + | return 2 * a < b ? a : b + 1; + | } + +Which is equivalent to : + + | int double_min_p1(int a, int b) + | { + | return (2 * a) < b ? a : (b + 1); + | } + +The first thing to fix is to surround the macro definition with parenthesis to +avoid this mistake : + + | #define MIN(a, b) (a < b ? a : b) + +But this is still not enough, as can be seen in this example : + + | /* compares either a or b with c */ + | int min_ab_c(int a, int b, int c) + | { + | return MIN(a ? a : b, c); + | } + +Which is equivalent to : + + | int min_ab_c(int a, int b, int c) + | { + | return (a ? a : b < c ? a ? a : b : c); + | } + +Which in turn means a totally different thing due to precedence : + + | int min_ab_c(int a, int b, int c) + | { + | return (a ? a : ((b < c) ? (a ? a : b) : c)); + | } + +This can be fixed by surrounding *each* argument in the macro with parenthesis: + + | #define MIN(a, b) ((a) < (b) ? (a) : (b)) + +But this is still not enough, as can be seen in this example : + + | int min_ap1_b(int a, int b) + | { + | return MIN(++a, b); + | } + +Which is equivalent to : + + | int min_ap1_b(int a, int b) + | { + | return ((++a) < (b) ? (++a) : (b)); + | } + +Again, this is wrong because "a" is incremented twice if below b. The only way +to fix this is to use a compound statement and to assign each argument exactly +once to a local variable of the same type : + + | #define MIN(a, b) ({ typeof(a) __a = (a); typeof(b) __b = (b); \ + | ((__a) < (__b) ? (__a) : (__b)); \ + | }) + +At this point, using static inline functions is much cleaner if a single type +is to be used : + + | static inline int min(int a, int b) + | { + | return a < b ? a : b; + | } + + +10) Includes +------------ + +Includes are as much as possible listed in alphabetically ordered groups : + - the includes more or less system-specific (sys/*, netinet/*, ...) + - the libc-standard includes (those without any path component) + - includes from the local "import" subdirectory + - includes from the local "haproxy" subdirectory + +Each section is just visually delimited from the other ones using an empty +line. The two first ones above may be merged into a single section depending on +developer's preference. Please do not copy-paste include statements from other +files. Having too many includes significantly increases build time and makes it +hard to find which ones are needed later. Just include what you need and if +possible in alphabetical order so that when something is missing, it becomes +obvious where to look for it and where to add it. + +All files should include <haproxy/api.h> because this is where build options +are prepared. + +HAProxy header files are split in two, those exporting the types only (named +with a trailing "-t") and those exporting variables, functions and inline +functions. Types, structures, enums and #defines must go into the types files +which are the only ones that may be included by othertype files. Function +prototypes and inlined functions must go into the main files. This split is +because of inlined functions which cross-reference types from other files, +which cause a chicken-and-egg problem if the functions and types are declared +at the same place. + +Include files must be protected against multiple inclusion using the common +#ifndef/#define/#endif trick with a tag derived from the include file and its +location. + + +11) Comments +------------ + +Comments are preferably of the standard 'C' form using /* */. The C++ form "//" +are tolerated for very short comments (eg: a word or two) but should be avoided +as much as possible. Multi-line comments are made with each intermediate line +starting with a star aligned with the first one, as in this example : + + | /* + | * This is a multi-line + | * comment. + | */ + +If multiple code lines need a short comment, try to align them so that you can +have multi-line sentences. This is rarely needed, only for really complex +constructs. + +Do not tell what you're doing in comments, but explain why you're doing it if +it seems not to be obvious. Also *do* indicate at the top of function what they +accept and what they don't accept. For instance, strcpy() only accepts output +buffers at least as large as the input buffer, and does not support any NULL +pointer. There is nothing wrong with that if the caller knows it. + +Wrong use of comments : + + | int flsnz8(unsigned int x) + | { + | int ret = 0; /* initialize ret */ + | if (x >> 4) { x >>= 4; ret += 4; } /* add 4 to ret if needed */ + | return ret + ((0xFFFFAA50U >> (x << 1)) & 3) + 1; /* add ??? */ + | } + | ... + | bit = ~len + (skip << 3) + 9; /* update bit */ + +Right use of comments : + + | /* This function returns the position of the highest bit set in the lowest + | * byte of <x>, between 0 and 7. It only works if <x> is non-null. It uses + | * a 32-bit value as a lookup table to return one of 4 values for the + | * highest 16 possible 4-bit values. + | */ + | int flsnz8(unsigned int x) + | { + | int ret = 0; + | if (x >> 4) { x >>= 4; ret += 4; } + | return ret + ((0xFFFFAA50U >> (x << 1)) & 3) + 1; + | } + | ... + | bit = ~len + (skip << 3) + 9; /* (skip << 3) + (8 - len), saves 1 cycle */ + + +12) Use of assembly +------------------- + +There are many projects where use of assembly code is not welcome. There is no +problem with use of assembly in haproxy, provided that : + + a) an alternate C-form is provided for architectures not covered + b) the code is small enough and well commented enough to be maintained + +It is important to take care of various incompatibilities between compiler +versions, for instance regarding output and cloberred registers. There are +a number of documentations on the subject on the net. Anyway if you are +fiddling with assembly, you probably know that already. + +Example : + | /* gcc does not know when it can safely divide 64 bits by 32 bits. Use this + | * function when you know for sure that the result fits in 32 bits, because + | * it is optimal on x86 and on 64bit processors. + | */ + | static inline unsigned int div64_32(unsigned long long o1, unsigned int o2) + | { + | unsigned int result; + | #ifdef __i386__ + | asm("divl %2" + | : "=a" (result) + | : "A"(o1), "rm"(o2)); + | #else + | result = o1 / o2; + | #endif + | return result; + | } + + +13) Pointers +------------ + +A lot could be said about pointers, there's enough to fill entire books. Misuse +of pointers is one of the primary reasons for bugs in haproxy, and this rate +has significantly increased with the use of threads. Moreover, bogus pointers +cause the hardest to analyse bugs, because usually they result in modifications +to reassigned areas or accesses to unmapped areas, and in each case, bugs that +strike very far away from where they were located. Some bugs have already taken +up to 3 weeks of full time analysis, which has a severe impact on the project's +ability to make forward progress on important features. For this reason, code +that doesn't look robust enough or that doesn't follow some of the rules below +will be rejected, and may even be reverted after being merged if the trouble is +detected late! + + +13.1) No test before freeing +---------------------------- + +All platforms where haproxy is supported have a well-defined and documented +behavior for free(NULL), which is to do nothing at all. In other words, free() +does test for the pointer's nullity. As such, there is no point in testing +if a pointer is NULL or not before calling free(). And further, you must not +do it, because it adds some confusion to the reader during debugging sessions, +making one think that the code's authors weren't very sure about what they +were doing. This will not cause a bug but will result in your code to get +rejected. + +Wrong call to free : + + | static inline int blah_free(struct blah *blah) + | { + | if (blah->str1) + | free(blah->str1); + | if (blah->str2) + | free(blah->str2); + | free(blah); + | } + +Correct call to free : + + | static inline int blah_free(struct blah *blah) + | { + | free(blah->str1); + | free(blah->str2); + | free(blah); + | } + + +13.2) No dangling pointers +-------------------------- + +Pointers are very commonly used as booleans: if they're not NULL, then the +area they point to is valid and may be used. This is convenient for many things +and is even emphasized with threads where they can atomically be swapped with +another value (even NULL), and as such provide guaranteed atomic resource +allocation and sharing. + +The problem with this is when someone forgets to delete a pointer when an area +is no longer valid, because this may result in the pointer being accessed later +and pointing to a wrong location, one that was reallocated for something else +and causing all sort of nastiness like crashes or memory corruption. Moreover, +thanks to the memory pools, it is extremely likely that a just released pointer +will be reassigned to a similar object with comparable values (flags etc) at +the same positions, making tests apparently succeed for a while. Some such bugs +have gone undetected for several years. + +The rule is pretty simple: + + +-----------------------------------------------------------------+ + | NO REACHABLE POINTER MAY EVER POINT TO AN UNREACHABLE LOCATION. | + +-----------------------------------------------------------------+ + +By "reachable pointer", here we mean a pointer that is accessible from a +reachable structure or a global variable. This means that any pointer found +anywhere in any structure in the code may always be dereferenced. This can +seem obvious but this is not always enforced. + +This means that when freeing an area, the pointer that was used to find that +area must be overwritten with NULL, and all other such pointers must as well +if any. It is one case where one can find more convenient to write the NULL +on the same line as the call to free() to make things easier to check. Be +careful about any potential "if" when doing this. + +Wrong use of free : + + | static inline int blah_recycle(struct blah *blah) + | { + | free(blah->str1); + | free(blah->str2); + | } + +Correct use of free : + + | static inline int blah_recycle(struct blah *blah) + | { + | free(blah->str1); blah->str1 = NULL; + | free(blah->str2); blah->str2 = NULL; + | } + +Sometimes the code doesn't permit this to be done. It is not a matter of code +but a matter of architecture. Example: + +Initialization: + + | static struct foo *foo_init() + | { + | struct foo *foo; + | struct bar *bar; + | + | foo = pool_alloc(foo_head); + | bar = pool_alloc(bar_head); + | if (!foo || !bar) + | goto fail; + | foo->bar = bar; + | ... + | } + +Scheduled task 1: + + | static inline int foo_timeout(struct foo *foo) + | { + | free(foo->bar); + | free(foo); + | } + +Scheduled task 2: + + | static inline int bar_timeout(struct bar *bar) + | { + | free(bar); + | } + +Here it's obvious that if "bar" times out, it will be freed but its pointer in +"foo" will remain here, and if foo times out just after, it will lead to a +double free. Or worse, if another instance allocates a pointer and receives bar +again, when foo times out, it will release the old bar pointer which now points +to a new object, and the code using that new object will crash much later, or +even worse, will share the same area as yet another instance having inherited +that pointer again. + +Here this simply means that the data model is wrong. If bar may be freed alone, +it MUST have a pointer to foo so that bar->foo->bar is set to NULL to let foo +finish its life peacefully. This also means that the code dealing with foo must +be written in a way to support bar's leaving. + + +13.3) Don't abuse pointers as booleans +-------------------------------------- + +Given the common use of a pointer to know if the area it points to is valid, +there is a big incentive in using such pointers as booleans to describe +something a bit higher level, like "is the user authenticated". This must not +be done. The reason stems from the points above. Initially this perfectly +matches and the code is simple. Then later some extra options need to be added, +and more pointers are needed, all allocated together. At this point they all +start to become their own booleans, supposedly always equivalent, but if that +were true, they would be a single area with a single pointer. And things start +to fall apart with some code areas relying on one pointer for the condition and +other ones relying on other pointers. Pointers may be substituted with "flags" +or "present in list" etc here. And from this point, things quickly degrade with +pointers needing to remain set even if pointing to wrong areas, just for the +sake of not being NULL and not breaking some assumptions. At this point the +bugs are already there and the code is not trustable anymore. + +The only way to avoid this is to strictly respect this rule: pointers do not +represent a functionality but a storage area. Of course it is very frequent to +consider that if an optional string is not set, a feature is not enabled. This +can be fine to some extents. But as soon as any slightest condition is added +anywhere into the mux, the code relying on the pointer must be replaced with +something else so that the pointer may live its own life and be released (and +reset) earlier if needed. + + +13.4) Mixing const and non-const +-------------------------------- + +Something often encountered, especially when assembling error messages, is +functions that collect strings, assemble them into larger messages and free +everything. The problem here is that if strings are defined as variables, there +will rightfully be build warnings when reporting string constants such as bare +keywords or messages, and if strings are defined as constants, it is not +possible to free them. The temptation is sometimes huge to force some free() +calls on casted strings. Do not do that! It will inevitably lead to someone +getting caught passing a constant string that will make the process crash (if +lucky). Document the expectations, indicate that all arguments must be freeable +and that the caller must be capable of strdup(), and make your function support +NULLs and document it (so that callers can deal with a failing strdup() on +allocation error). + +One valid alternative is to use a secondary channel to indicate whether the +message may be freed or not. A flag in a complex structure can be used for this +purpose, for example. If you are certain that your strings are aligned to a +certain number of bytes, it can be possible to instrument the code to use the +lowest bit to indicate the need to free (e.g. by always adding one to every +const string). But such a solution will require good enough instrumentation so +that it doesn't constitute a new set of traps. + + +13.5) No pointer casts +---------------------- + +Except in rare occasions caused by legacy APIs (e.g. sockaddr) or special cases +which explicitly require a form of aliasing, there is no valid reason for +casting pointers, and usually this is used to hide other problems that will +strike later. The only suitable type of cast is the cast from the generic void* +used to store a context for example. But in C, there is no need to cast to nor +from void*, so this is not required. However those coming from C++ tend to be +used to this practice, and others argue that it makes the intent more visible. + +As a corollary, do not abuse void*. Placing void* everywhere to avoid casting +is a bad practice as well. The use of void* is only for generic functions or +structures which do not have a limited set of types supported. When only a few +types are supported, generally their type can be passed using a side channel, +and the void* can be turned into a union that makes the code more readable and +more verifiable. + +An alternative in haproxy is to use a pointer to an obj_type enum. Usually it +is placed at the beginning of a structure. It works like a void* except that +the type is read directly from the object. This is convenient when a small set +of remote objects may be attached to another one because a single of them will +match a non-null pointer (e.g. a connection or an applet). + +Example: + + | static inline int blah_free(struct blah *blah) + | { + | /* only one of them (at most) will not be null */ + | pool_free(pool_head_connection, objt_conn(blah->target)); + | pool_free(pool_head_appctx, objt_appctx(blah->target)); + | pool_free(pool_head_stream, objt_stream(blah->target)); + | blah->target = NULL; + | } + + +13.6) Extreme caution when using non-canonical pointers +------------------------------------------------------- + +It can be particularly convenient to embed some logic in the unused bits or +code points of a pointer. Indeed, when it is known by design that a given +pointer will always follow a certain alignment, a few lower bits will always +remain zero, and as such may be used as optional flags. For example, the ebtree +code uses the lowest bit to differentiate left/right attachments to the parent +and node/leaf in branches. It is also known that values very close to NULL will +never represent a valid pointer, and the thread-safe MT_LIST code uses this to +lock visited pointers. + +There are a few rules to respect in order to do this: + - the deviations from the canonical pointers must be exhaustively documented + where the pointer type is defined, and the whole control logic with its + implications and possible and impossible cases must be enumerated as well ; + + - make sure that the operations will work on every supported platform, which + includes 32-bit platforms where structures may be aligned on as little as + 32-bit. 32-bit alignment leaves only two LSB available. When doing so, make + sure the target structures are not labelled with the "packed" attribute, or + that they're always perfectly aligned. All platforms where haproxy runs + have their NULL pointer mapped at address zero, and use page sizes at least + 4096 bytes large, leaving all values form 1 to 4095 unused. Anything + outside of this is unsafe. In particular, never use negative numbers to + represent a supposedly invalid address. On 32-bits platforms it will often + correspond to a system address or a special page. Always try a variety of + platforms when doing such a thing. + + - the code must not use such pointers as booleans anymore even if it is known + that "it works" because that keeps a doubt open for the reviewer. Only the + canonical pointer may be tested. There can be a rare exception which is if + this is on a critical path where severe performance degradation may result + from this. In this case, *each* of the checks must be duly documented and + the equivalent BUG_ON() instances must be placed to prove the claim. + + - some inline functions (or macros) must be used to turn the pointers to/from + their canonical form so that the regular code doesn't have to see the + operations, and so that the representation may be easily adjusted in the + future. A few comments indicating to a human how to turn a pointer back and + forth from inside a debugger will be appreciated, as macros often end up + not being trivially readable nor directly usable. + + - do not use int types to cast the pointers, this will only work on 32-bit + platforms. While "long" is usually fine, it is not recommended anymore due + to the Windows platform being LLP64 and having it set to 32 bits. And + "long long" isn't good either for always being 64 bits. More suitable types + are ptrdiff_t or size_t. Note that while those were not available everywhere + in the early days of hparoxy, size_t is now heavily used and known to work + everywhere. And do not perform the operations on the pointers, only on the + integer types (and cast back again). Some compilers such as gcc are + extremely picky about this and will often emit wrong code when they see + equality conditions they believe is impossible and decide to optimize them + away. + + +13.7) Pointers in unions +------------------------ + +Before placing multiple aliasing pointers inside a same union, there MUST be a +SINGLE well-defined way to figure them out from each other. It may be thanks to +a side-channel information (as done in the samples with a defined type), it may +be based on in-area information (as done using obj_types), or any other trusted +solution. In any case, if pointers are mixed with any other type (integer or +float) in a union, there must be a very simple way to distinguish them, and not +a platform-dependent nor compiler-dependent one. diff --git a/doc/configuration.txt b/doc/configuration.txt new file mode 100644 index 0000000..be81b9a --- /dev/null +++ b/doc/configuration.txt @@ -0,0 +1,23704 @@ + ---------------------- + HAProxy + Configuration Manual + ---------------------- + version 2.6 + 2023/03/28 + + +This document covers the configuration language as implemented in the version +specified above. It does not provide any hints, examples, or advice. For such +documentation, please refer to the Reference Manual or the Architecture Manual. +The summary below is meant to help you find sections by name and navigate +through the document. + +Note to documentation contributors : + This document is formatted with 80 columns per line, with even number of + spaces for indentation and without tabs. Please follow these rules strictly + so that it remains easily printable everywhere. If a line needs to be + printed verbatim and does not fit, please end each line with a backslash + ('\') and continue on next line, indented by two characters. It is also + sometimes useful to prefix all output lines (logs, console outputs) with 3 + closing angle brackets ('>>>') in order to emphasize the difference between + inputs and outputs when they may be ambiguous. If you add sections, + please update the summary below for easier searching. + + +Summary +------- + +1. Quick reminder about HTTP +1.1. The HTTP transaction model +1.2. HTTP request +1.2.1. The request line +1.2.2. The request headers +1.3. HTTP response +1.3.1. The response line +1.3.2. The response headers + +2. Configuring HAProxy +2.1. Configuration file format +2.2. Quoting and escaping +2.3. Environment variables +2.4. Conditional blocks +2.5. Time format +2.6. Examples + +3. Global parameters +3.1. Process management and security +3.2. Performance tuning +3.3. Debugging +3.4. Userlists +3.5. Peers +3.6. Mailers +3.7. Programs +3.8. HTTP-errors +3.9. Rings +3.10. Log forwarding + +4. Proxies +4.1. Proxy keywords matrix +4.2. Alphabetically sorted keywords reference + +5. Bind and server options +5.1. Bind options +5.2. Server and default-server options +5.3. Server DNS resolution +5.3.1. Global overview +5.3.2. The resolvers section + +6. Cache +6.1. Limitation +6.2. Setup +6.2.1. Cache section +6.2.2. Proxy section + +7. Using ACLs and fetching samples +7.1. ACL basics +7.1.1. Matching booleans +7.1.2. Matching integers +7.1.3. Matching strings +7.1.4. Matching regular expressions (regexes) +7.1.5. Matching arbitrary data blocks +7.1.6. Matching IPv4 and IPv6 addresses +7.2. Using ACLs to form conditions +7.3. Fetching samples +7.3.1. Converters +7.3.2. Fetching samples from internal states +7.3.3. Fetching samples at Layer 4 +7.3.4. Fetching samples at Layer 5 +7.3.5. Fetching samples from buffer contents (Layer 6) +7.3.6. Fetching HTTP samples (Layer 7) +7.3.7. Fetching samples for developers +7.4. Pre-defined ACLs + +8. Logging +8.1. Log levels +8.2. Log formats +8.2.1. Default log format +8.2.2. TCP log format +8.2.3. HTTP log format +8.2.4. HTTPS log format +8.2.5. Error log format +8.2.6. Custom log format +8.3. Advanced logging options +8.3.1. Disabling logging of external tests +8.3.2. Logging before waiting for the session to terminate +8.3.3. Raising log level upon errors +8.3.4. Disabling logging of successful connections +8.4. Timing events +8.5. Session state at disconnection +8.6. Non-printable characters +8.7. Capturing HTTP cookies +8.8. Capturing HTTP headers +8.9. Examples of logs + +9. Supported filters +9.1. Trace +9.2. HTTP compression +9.3. Stream Processing Offload Engine (SPOE) +9.4. Cache +9.5. fcgi-app +9.6. OpenTracing + +10. FastCGI applications +10.1. Setup +10.1.1. Fcgi-app section +10.1.2. Proxy section +10.1.3. Example +10.2. Default parameters +10.3. Limitations + +11. Address formats +11.1. Address family prefixes +11.2. Socket type prefixes +11.3. Protocol prefixes + +1. Quick reminder about HTTP +---------------------------- + +When HAProxy is running in HTTP mode, both the request and the response are +fully analyzed and indexed, thus it becomes possible to build matching criteria +on almost anything found in the contents. + +However, it is important to understand how HTTP requests and responses are +formed, and how HAProxy decomposes them. It will then become easier to write +correct rules and to debug existing configurations. + + +1.1. The HTTP transaction model +------------------------------- + +The HTTP protocol is transaction-driven. This means that each request will lead +to one and only one response. Traditionally, a TCP connection is established +from the client to the server, a request is sent by the client through the +connection, the server responds, and the connection is closed. A new request +will involve a new connection : + + [CON1] [REQ1] ... [RESP1] [CLO1] [CON2] [REQ2] ... [RESP2] [CLO2] ... + +In this mode, called the "HTTP close" mode, there are as many connection +establishments as there are HTTP transactions. Since the connection is closed +by the server after the response, the client does not need to know the content +length. + +Due to the transactional nature of the protocol, it was possible to improve it +to avoid closing a connection between two subsequent transactions. In this mode +however, it is mandatory that the server indicates the content length for each +response so that the client does not wait indefinitely. For this, a special +header is used: "Content-length". This mode is called the "keep-alive" mode : + + [CON] [REQ1] ... [RESP1] [REQ2] ... [RESP2] [CLO] ... + +Its advantages are a reduced latency between transactions, and less processing +power required on the server side. It is generally better than the close mode, +but not always because the clients often limit their concurrent connections to +a smaller value. + +Another improvement in the communications is the pipelining mode. It still uses +keep-alive, but the client does not wait for the first response to send the +second request. This is useful for fetching large number of images composing a +page : + + [CON] [REQ1] [REQ2] ... [RESP1] [RESP2] [CLO] ... + +This can obviously have a tremendous benefit on performance because the network +latency is eliminated between subsequent requests. Many HTTP agents do not +correctly support pipelining since there is no way to associate a response with +the corresponding request in HTTP. For this reason, it is mandatory for the +server to reply in the exact same order as the requests were received. + +The next improvement is the multiplexed mode, as implemented in HTTP/2 and HTTP/3. +This time, each transaction is assigned a single stream identifier, and all +streams are multiplexed over an existing connection. Many requests can be sent in +parallel by the client, and responses can arrive in any order since they also +carry the stream identifier. + + +HTTP/3 is implemented over QUIC, itself implemented over UDP. QUIC solves the +head of line blocking at transport level by means of independently treated +streams. Indeed, when experiencing loss, an impacted stream does not affect the +other streams. + +By default HAProxy operates in keep-alive mode with regards to persistent +connections: for each connection it processes each request and response, and +leaves the connection idle on both sides between the end of a response and the +start of a new request. When it receives HTTP/2 connections from a client, it +processes all the requests in parallel and leaves the connection idling, +waiting for new requests, just as if it was a keep-alive HTTP connection. + +HAProxy supports 4 connection modes : + - keep alive : all requests and responses are processed (default) + - tunnel : only the first request and response are processed, + everything else is forwarded with no analysis (deprecated). + - server close : the server-facing connection is closed after the response. + - close : the connection is actively closed after end of response. + + + +1.2. HTTP request +----------------- + +First, let's consider this HTTP request : + + Line Contents + number + 1 GET /serv/login.php?lang=en&profile=2 HTTP/1.1 + 2 Host: www.mydomain.com + 3 User-agent: my small browser + 4 Accept: image/jpeg, image/gif + 5 Accept: image/png + + +1.2.1. The Request line +----------------------- + +Line 1 is the "request line". It is always composed of 3 fields : + + - a METHOD : GET + - a URI : /serv/login.php?lang=en&profile=2 + - a version tag : HTTP/1.1 + +All of them are delimited by what the standard calls LWS (linear white spaces), +which are commonly spaces, but can also be tabs or line feeds/carriage returns +followed by spaces/tabs. The method itself cannot contain any colon (':') and +is limited to alphabetic letters. All those various combinations make it +desirable that HAProxy performs the splitting itself rather than leaving it to +the user to write a complex or inaccurate regular expression. + +The URI itself can have several forms : + + - A "relative URI" : + + /serv/login.php?lang=en&profile=2 + + It is a complete URL without the host part. This is generally what is + received by servers, reverse proxies and transparent proxies. + + - An "absolute URI", also called a "URL" : + + http://192.168.0.12:8080/serv/login.php?lang=en&profile=2 + + It is composed of a "scheme" (the protocol name followed by '://'), a host + name or address, optionally a colon (':') followed by a port number, then + a relative URI beginning at the first slash ('/') after the address part. + This is generally what proxies receive, but a server supporting HTTP/1.1 + must accept this form too. + + - a star ('*') : this form is only accepted in association with the OPTIONS + method and is not relayable. It is used to inquiry a next hop's + capabilities. + + - an address:port combination : 192.168.0.12:80 + This is used with the CONNECT method, which is used to establish TCP + tunnels through HTTP proxies, generally for HTTPS, but sometimes for + other protocols too. + +In a relative URI, two sub-parts are identified. The part before the question +mark is called the "path". It is typically the relative path to static objects +on the server. The part after the question mark is called the "query string". +It is mostly used with GET requests sent to dynamic scripts and is very +specific to the language, framework or application in use. + +HTTP/2 doesn't convey a version information with the request, so the version is +assumed to be the same as the one of the underlying protocol (i.e. "HTTP/2"). + + +1.2.2. The request headers +-------------------------- + +The headers start at the second line. They are composed of a name at the +beginning of the line, immediately followed by a colon (':'). Traditionally, +an LWS is added after the colon but that's not required. Then come the values. +Multiple identical headers may be folded into one single line, delimiting the +values with commas, provided that their order is respected. This is commonly +encountered in the "Cookie:" field. A header may span over multiple lines if +the subsequent lines begin with an LWS. In the example in 1.2, lines 4 and 5 +define a total of 3 values for the "Accept:" header. + +Contrary to a common misconception, header names are not case-sensitive, and +their values are not either if they refer to other header names (such as the +"Connection:" header). In HTTP/2, header names are always sent in lower case, +as can be seen when running in debug mode. Internally, all header names are +normalized to lower case so that HTTP/1.x and HTTP/2 use the exact same +representation, and they are sent as-is on the other side. This explains why an +HTTP/1.x request typed with camel case is delivered in lower case. + +The end of the headers is indicated by the first empty line. People often say +that it's a double line feed, which is not exact, even if a double line feed +is one valid form of empty line. + +Fortunately, HAProxy takes care of all these complex combinations when indexing +headers, checking values and counting them, so there is no reason to worry +about the way they could be written, but it is important not to accuse an +application of being buggy if it does unusual, valid things. + +Important note: + As suggested by RFC7231, HAProxy normalizes headers by replacing line breaks + in the middle of headers by LWS in order to join multi-line headers. This + is necessary for proper analysis and helps less capable HTTP parsers to work + correctly and not to be fooled by such complex constructs. + + +1.3. HTTP response +------------------ + +An HTTP response looks very much like an HTTP request. Both are called HTTP +messages. Let's consider this HTTP response : + + Line Contents + number + 1 HTTP/1.1 200 OK + 2 Content-length: 350 + 3 Content-Type: text/html + +As a special case, HTTP supports so called "Informational responses" as status +codes 1xx. These messages are special in that they don't convey any part of the +response, they're just used as sort of a signaling message to ask a client to +continue to post its request for instance. In the case of a status 100 response +the requested information will be carried by the next non-100 response message +following the informational one. This implies that multiple responses may be +sent to a single request, and that this only works when keep-alive is enabled +(1xx messages are HTTP/1.1 only). HAProxy handles these messages and is able to +correctly forward and skip them, and only process the next non-100 response. As +such, these messages are neither logged nor transformed, unless explicitly +state otherwise. Status 101 messages indicate that the protocol is changing +over the same connection and that HAProxy must switch to tunnel mode, just as +if a CONNECT had occurred. Then the Upgrade header would contain additional +information about the type of protocol the connection is switching to. + + +1.3.1. The response line +------------------------ + +Line 1 is the "response line". It is always composed of 3 fields : + + - a version tag : HTTP/1.1 + - a status code : 200 + - a reason : OK + +The status code is always 3-digit. The first digit indicates a general status : + - 1xx = informational message to be skipped (e.g. 100, 101) + - 2xx = OK, content is following (e.g. 200, 206) + - 3xx = OK, no content following (e.g. 302, 304) + - 4xx = error caused by the client (e.g. 401, 403, 404) + - 5xx = error caused by the server (e.g. 500, 502, 503) + +Please refer to RFC7231 for the detailed meaning of all such codes. The +"reason" field is just a hint, but is not parsed by clients. Anything can be +found there, but it's a common practice to respect the well-established +messages. It can be composed of one or multiple words, such as "OK", "Found", +or "Authentication Required". + +HAProxy may emit the following status codes by itself : + + Code When / reason + 200 access to stats page, and when replying to monitoring requests + 301 when performing a redirection, depending on the configured code + 302 when performing a redirection, depending on the configured code + 303 when performing a redirection, depending on the configured code + 307 when performing a redirection, depending on the configured code + 308 when performing a redirection, depending on the configured code + 400 for an invalid or too large request + 401 when an authentication is required to perform the action (when + accessing the stats page) + 403 when a request is forbidden by a "http-request deny" rule + 404 when the requested resource could not be found + 408 when the request timeout strikes before the request is complete + 410 when the requested resource is no longer available and will not + be available again + 500 when HAProxy encounters an unrecoverable internal error, such as a + memory allocation failure, which should never happen + 501 when HAProxy is unable to satisfy a client request because of an + unsupported feature + 502 when the server returns an empty, invalid or incomplete response, or + when an "http-response deny" rule blocks the response. + 503 when no server was available to handle the request, or in response to + monitoring requests which match the "monitor fail" condition + 504 when the response timeout strikes before the server responds + +The error 4xx and 5xx codes above may be customized (see "errorloc" in section +4.2). + + +1.3.2. The response headers +--------------------------- + +Response headers work exactly like request headers, and as such, HAProxy uses +the same parsing function for both. Please refer to paragraph 1.2.2 for more +details. + + +2. Configuring HAProxy +---------------------- + +2.1. Configuration file format +------------------------------ + +HAProxy's configuration process involves 3 major sources of parameters : + + - the arguments from the command-line, which always take precedence + - the configuration file(s), whose format is described here + - the running process's environment, in case some environment variables are + explicitly referenced + +The configuration file follows a fairly simple hierarchical format which obey +a few basic rules: + + 1. a configuration file is an ordered sequence of statements + + 2. a statement is a single non-empty line before any unprotected "#" (hash) + + 3. a line is a series of tokens or "words" delimited by unprotected spaces or + tab characters + + 4. the first word or sequence of words of a line is one of the keywords or + keyword sequences listed in this document + + 5. all other words are all arguments of the first one, some being well-known + keywords listed in this document, others being values, references to other + parts of the configuration, or expressions + + 6. certain keywords delimit a section inside which only a subset of keywords + are supported + + 7. a section ends at the end of a file or on a special keyword starting a new + section + +This is all that is needed to know to write a simple but reliable configuration +generator, but this is not enough to reliably parse any configuration nor to +figure how to deal with certain corner cases. + +First, there are a few consequences of the rules above. Rule 6 and 7 imply that +the keywords used to define a new section are valid everywhere and cannot have +a different meaning in a specific section. These keywords are always a single +word (as opposed to a sequence of words), and traditionally the section that +follows them is designated using the same name. For example when speaking about +the "global section", it designates the section of configuration that follows +the "global" keyword. This usage is used a lot in error messages to help locate +the parts that need to be addressed. + +A number of sections create an internal object or configuration space, which +requires to be distinguished from other ones. In this case they will take an +extra word which will set the name of this particular section. For some of them +the section name is mandatory. For example "frontend foo" will create a new +section of type "frontend" named "foo". Usually a name is specific to its +section and two sections of different types may use the same name, but this is +not recommended as it tends to complexify configuration management. + +A direct consequence of rule 7 is that when multiple files are read at once, +each of them must start with a new section, and the end of each file will end +a section. A file cannot contain sub-sections nor end an existing section and +start a new one. + +Rule 1 mentioned that ordering matters. Indeed, some keywords create directives +that can be repeated multiple times to create ordered sequences of rules to be +applied in a certain order. For example "tcp-request" can be used to alternate +"accept" and "reject" rules on varying criteria. As such, a configuration file +processor must always preserve a section's ordering when editing a file. The +ordering of sections usually does not matter except for the global section +which must be placed before other sections, but it may be repeated if needed. +In addition, some automatic identifiers may automatically be assigned to some +of the created objects (e.g. proxies), and by reordering sections, their +identifiers will change. These ones appear in the statistics for example. As +such, the configuration below will assign "foo" ID number 1 and "bar" ID number +2, which will be swapped if the two sections are reversed: + + listen foo + bind :80 + + listen bar + bind :81 + +Another important point is that according to rules 2 and 3 above, empty lines, +spaces, tabs, and comments following and unprotected "#" character are not part +of the configuration as they are just used as delimiters. This implies that the +following configurations are strictly equivalent: + + global#this is the global section + daemon#daemonize + frontend foo + mode http # or tcp + +and: + + global + daemon + + # this is the public web frontend + frontend foo + mode http + +The common practice is to align to the left only the keyword that initiates a +new section, and indent (i.e. prepend a tab character or a few spaces) all +other keywords so that it's instantly visible that they belong to the same +section (as done in the second example above). Placing comments before a new +section helps the reader decide if it's the desired one. Leaving a blank line +at the end of a section also visually helps spotting the end when editing it. + +Tabs are very convenient for indent but they do not copy-paste well. If spaces +are used instead, it is recommended to avoid placing too many (2 to 4) so that +editing in field doesn't become a burden with limited editors that do not +support automatic indent. + +In the early days it used to be common to see arguments split at fixed tab +positions because most keywords would not take more than two arguments. With +modern versions featuring complex expressions this practice does not stand +anymore, and is not recommended. + + +2.2. Quoting and escaping +------------------------- + +In modern configurations, some arguments require the use of some characters +that were previously considered as pure delimiters. In order to make this +possible, HAProxy supports character escaping by prepending a backslash ('\') +in front of the character to be escaped, weak quoting within double quotes +('"') and strong quoting within single quotes ("'"). + +This is pretty similar to what is done in a number of programming languages and +very close to what is commonly encountered in Bourne shell. The principle is +the following: while the configuration parser cuts the lines into words, it +also takes care of quotes and backslashes to decide whether a character is a +delimiter or is the raw representation of this character within the current +word. The escape character is then removed, the quotes are removed, and the +remaining word is used as-is as a keyword or argument for example. + +If a backslash is needed in a word, it must either be escaped using itself +(i.e. double backslash) or be strongly quoted. + +Escaping outside quotes is achieved by preceding a special character by a +backslash ('\'): + + \ to mark a space and differentiate it from a delimiter + \# to mark a hash and differentiate it from a comment + \\ to use a backslash + \' to use a single quote and differentiate it from strong quoting + \" to use a double quote and differentiate it from weak quoting + +In addition, a few non-printable characters may be emitted using their usual +C-language representation: + + \n to insert a line feed (LF, character \x0a or ASCII 10 decimal) + \r to insert a carriage return (CR, character \x0d or ASCII 13 decimal) + \t to insert a tab (character \x09 or ASCII 9 decimal) + \xNN to insert character having ASCII code hex NN (e.g \x0a for LF). + +Weak quoting is achieved by surrounding double quotes ("") around the character +or sequence of characters to protect. Weak quoting prevents the interpretation +of: + + space or tab as a word separator + ' single quote as a strong quoting delimiter + # hash as a comment start + +Weak quoting permits the interpretation of environment variables (which are not +evaluated outside of quotes) by preceding them with a dollar sign ('$'). If a +dollar character is needed inside double quotes, it must be escaped using a +backslash. + +Strong quoting is achieved by surrounding single quotes ('') around the +character or sequence of characters to protect. Inside single quotes, nothing +is interpreted, it's the efficient way to quote regular expressions. + +As a result, here is the matrix indicating how special characters can be +entered in different contexts (unprintable characters are replaced with their +name within angle brackets). Note that some characters that may only be +represented escaped have no possible representation inside single quotes, +hence the '-' there: + + Character | Unquoted | Weakly quoted | Strongly quoted + -----------+---------------+-----------------------------+----------------- + <TAB> | \<TAB>, \x09 | "<TAB>", "\<TAB>", "\x09" | '<TAB>' + <LF> | \n, \x0a | "\n", "\x0a" | - + <CR> | \r, \x0d | "\r", "\x0d" | - + <SPC> | \<SPC>, \x20 | "<SPC>", "\<SPC>", "\x20" | '<SPC>' + " | \", \x22 | "\"", "\x22" | '"' + # | \#, \x23 | "#", "\#", "\x23" | '#' + $ | $, \$, \x24 | "\$", "\x24" | '$' + ' | \', \x27 | "'", "\'", "\x27" | - + \ | \\, \x5c | "\\", "\x5c" | '\' + + Example: + # those are all strictly equivalent: + log-format %{+Q}o\ %t\ %s\ %{-Q}r + log-format "%{+Q}o %t %s %{-Q}r" + log-format '%{+Q}o %t %s %{-Q}r' + log-format "%{+Q}o %t"' %s %{-Q}r' + log-format "%{+Q}o %t"' %s'\ %{-Q}r + +There is one particular case where a second level of quoting or escaping may be +necessary. Some keywords take arguments within parenthesis, sometimes delimited +by commas. These arguments are commonly integers or predefined words, but when +they are arbitrary strings, it may be required to perform a separate level of +escaping to disambiguate the characters that belong to the argument from the +characters that are used to delimit the arguments themselves. A pretty common +case is the "regsub" converter. It takes a regular expression in argument, and +if a closing parenthesis is needed inside, this one will require to have its +own quotes. + +The keyword argument parser is exactly the same as the top-level one regarding +quotes, except that the \#, \$, and \xNN escapes are not processed. But what is +not always obvious is that the delimiters used inside must first be escaped or +quoted so that they are not resolved at the top level. + +Let's take this example making use of the "regsub" converter which takes 3 +arguments, one regular expression, one replacement string and one set of flags: + + # replace all occurrences of "foo" with "blah" in the path: + http-request set-path %[path,regsub(foo,blah,g)] + +Here no special quoting was necessary. But if now we want to replace either +"foo" or "bar" with "blah", we'll need the regular expression "(foo|bar)". We +cannot write: + + http-request set-path %[path,regsub((foo|bar),blah,g)] + +because we would like the string to cut like this: + + http-request set-path %[path,regsub((foo|bar),blah,g)] + |---------|----|-| + arg1 _/ / / + arg2 __________/ / + arg3 ______________/ + +but actually what is passed is a string between the opening and closing +parenthesis then garbage: + + http-request set-path %[path,regsub((foo|bar),blah,g)] + |--------|--------| + arg1=(foo|bar _/ / + trailing garbage _________/ + +The obvious solution here seems to be that the closing parenthesis needs to be +quoted, but alone this will not work, because as mentioned above, quotes are +processed by the top-level parser which will resolve them before processing +this word: + + http-request set-path %[path,regsub("(foo|bar)",blah,g)] + ------------ -------- ---------------------------------- + word1 word2 word3=%[path,regsub((foo|bar),blah,g)] + +So we didn't change anything for the argument parser at the second level which +still sees a truncated regular expression as the only argument, and garbage at +the end of the string. By escaping the quotes they will be passed unmodified to +the second level: + + http-request set-path %[path,regsub(\"(foo|bar)\",blah,g)] + ------------ -------- ------------------------------------ + word1 word2 word3=%[path,regsub("(foo|bar)",blah,g)] + |---------||----|-| + arg1=(foo|bar) _/ / / + arg2=blah ___________/ / + arg3=g _______________/ + +Another approach consists in using single quotes outside the whole string and +double quotes inside (so that the double quotes are not stripped again): + + http-request set-path '%[path,regsub("(foo|bar)",blah,g)]' + ------------ -------- ---------------------------------- + word1 word2 word3=%[path,regsub("(foo|bar)",blah,g)] + |---------||----|-| + arg1=(foo|bar) _/ / / + arg2 ___________/ / + arg3 _______________/ + +When using regular expressions, it can happen that the dollar ('$') character +appears in the expression or that a backslash ('\') is used in the replacement +string. In this case these ones will also be processed inside the double quotes +thus single quotes are preferred (or double escaping). Example: + + http-request set-path '%[path,regsub("^/(here)(/|$)","my/\1",g)]' + ------------ -------- ----------------------------------------- + word1 word2 word3=%[path,regsub("^/(here)(/|$)","my/\1",g)] + |-------------| |-----||-| + arg1=(here)(/|$) _/ / / + arg2=my/\1 ________________/ / + arg3 ______________________/ + +Remember that backslashes are not escape characters within single quotes and +that the whole word above is already protected against them using the single +quotes. Conversely, if double quotes had been used around the whole expression, +single the dollar character and the backslashes would have been resolved at top +level, breaking the argument contents at the second level. + +Unfortunately, since single quotes can't be escaped inside of strong quoting, +if you need to include single quotes in your argument, you will need to escape +or quote them twice. There are a few ways to do this: + + http-request set-var(txn.foo) str("\\'foo\\'") + http-request set-var(txn.foo) str(\"\'foo\'\") + http-request set-var(txn.foo) str(\\\'foo\\\') + +When in doubt, simply do not use quotes anywhere, and start to place single or +double quotes around arguments that require a comma or a closing parenthesis, +and think about escaping these quotes using a backslash if the string contains +a dollar or a backslash. Again, this is pretty similar to what is used under +a Bourne shell when double-escaping a command passed to "eval". For API writers +the best is probably to place escaped quotes around each and every argument, +regardless of their contents. Users will probably find that using single quotes +around the whole expression and double quotes around each argument provides +more readable configurations. + + +2.3. Environment variables +-------------------------- + +HAProxy's configuration supports environment variables. Those variables are +interpreted only within double quotes. Variables are expanded during the +configuration parsing. Variable names must be preceded by a dollar ("$") and +optionally enclosed with braces ("{}") similarly to what is done in Bourne +shell. Variable names can contain alphanumerical characters or the character +underscore ("_") but should not start with a digit. If the variable contains a +list of several values separated by spaces, it can be expanded as individual +arguments by enclosing the variable with braces and appending the suffix '[*]' +before the closing brace. It is also possible to specify a default value to +use when the variable is not set, by appending that value after a dash '-' +next to the variable name. Note that the default value only replaces non +existing variables, not empty ones. + + Example: + + bind "fd@${FD_APP1}" + + log "${LOCAL_SYSLOG-127.0.0.1}:514" local0 notice # send to local server + + user "$HAPROXY_USER" + +Some variables are defined by HAProxy, they can be used in the configuration +file, or could be inherited by a program (See 3.7. Programs): + +* HAPROXY_LOCALPEER: defined at the startup of the process which contains the + name of the local peer. (See "-L" in the management guide.) + +* HAPROXY_CFGFILES: list of the configuration files loaded by HAProxy, + separated by semicolons. Can be useful in the case you specified a + directory. + +* HAPROXY_MWORKER: In master-worker mode, this variable is set to 1. + +* HAPROXY_CLI: configured listeners addresses of the stats socket for every + processes, separated by semicolons. + +* HAPROXY_MASTER_CLI: In master-worker mode, listeners addresses of the master + CLI, separated by semicolons. + +* HAPROXY_STARTUP_VERSION: contains the version used to start, in master-worker + mode this is the version which was used to start the master, even after + updating the binary and reloading. + +In addition, some pseudo-variables are internally resolved and may be used as +regular variables. Pseudo-variables always start with a dot ('.'), and are the +only ones where the dot is permitted. The current list of pseudo-variables is: + +* .FILE: the name of the configuration file currently being parsed. + +* .LINE: the line number of the configuration file currently being parsed, + starting at one. + +* .SECTION: the name of the section currently being parsed, or its type if the + section doesn't have a name (e.g. "global"), or an empty string before the + first section. + +These variables are resolved at the location where they are parsed. For example +if a ".LINE" variable is used in a "log-format" directive located in a defaults +section, its line number will be resolved before parsing and compiling the +"log-format" directive, so this same line number will be reused by subsequent +proxies. + +This way it is possible to emit information to help locate a rule in variables, +logs, error statuses, health checks, header values, or even to use line numbers +to name some config objects like servers for example. + +See also "external-check command" for other variables. + + +2.4. Conditional blocks +----------------------- + +It may sometimes be convenient to be able to conditionally enable or disable +some arbitrary parts of the configuration, for example to enable/disable SSL or +ciphers, enable or disable some pre-production listeners without modifying the +configuration, or adjust the configuration's syntax to support two distinct +versions of HAProxy during a migration.. HAProxy brings a set of nestable +preprocessor-like directives which allow to integrate or ignore some blocks of +text. These directives must be placed on their own line and they act on the +lines that follow them. Two of them support an expression, the other ones only +switch to an alternate block or end a current level. The 4 following directives +are defined to form conditional blocks: + + - .if <condition> + - .elif <condition> + - .else + - .endif + +The ".if" directive nests a new level, ".elif" stays at the same level, ".else" +as well, and ".endif" closes a level. Each ".if" must be terminated by a +matching ".endif". The ".elif" may only be placed after ".if" or ".elif", and +there is no limit to the number of ".elif" that may be chained. There may be +only one ".else" per ".if" and it must always be after the ".if" or the last +".elif" of a block. + +Comments may be placed on the same line if needed after a '#', they will be +ignored. The directives are tokenized like other configuration directives, and +as such it is possible to use environment variables in conditions. + +Conditions can also be evaluated on startup with the -cc parameter. +See "3. Starting HAProxy" in the management doc. + +The conditions are either an empty string (which then returns false), or an +expression made of any combination of: + + - the integer zero ('0'), always returns "false" + - a non-nul integer (e.g. '1'), always returns "true". + - a predicate optionally followed by argument(s) in parenthesis. + - a condition placed between a pair of parenthesis '(' and ')' + - an exclamation mark ('!') preceding any of the non-empty elements above, + and which will negate its status. + - expressions combined with a logical AND ('&&'), which will be evaluated + from left to right until one returns false + - expressions combined with a logical OR ('||'), which will be evaluated + from right to left until one returns true + +Note that like in other languages, the AND operator has precedence over the OR +operator, so that "A && B || C && D" evalues as "(A && B) || (C && D)". + +The list of currently supported predicates is the following: + + - defined(<name>) : returns true if an environment variable <name> + exists, regardless of its contents + + - feature(<name>) : returns true if feature <name> is listed as present + in the features list reported by "haproxy -vv" + (which means a <name> appears after a '+') + + - streq(<str1>,<str2>) : returns true only if the two strings are equal + - strneq(<str1>,<str2>) : returns true only if the two strings differ + + - version_atleast(<ver>): returns true if the current haproxy version is + at least as recent as <ver> otherwise false. The + version syntax is the same as shown by "haproxy -v" + and missing components are assumed as being zero. + + - version_before(<ver>) : returns true if the current haproxy version is + strictly older than <ver> otherwise false. The + version syntax is the same as shown by "haproxy -v" + and missing components are assumed as being zero. + +Example: + + .if defined(HAPROXY_MWORKER) + listen mwcli_px + bind :1111 + ... + .endif + + .if strneq("$SSL_ONLY",yes) + bind :80 + .endif + + .if streq("$WITH_SSL",yes) + .if feature(OPENSSL) + bind :443 ssl crt ... + .endif + .endif + + .if feature(OPENSSL) && (streq("$WITH_SSL",yes) || streq("$SSL_ONLY",yes)) + bind :443 ssl crt ... + .endif + + .if version_atleast(2.4-dev19) + profiling.memory on + .endif + + .if !feature(OPENSSL) + .alert "SSL support is mandatory" + .endif + +Four other directives are provided to report some status: + + - .diag "message" : emit this message only when in diagnostic mode (-dD) + - .notice "message" : emit this message at level NOTICE + - .warning "message" : emit this message at level WARNING + - .alert "message" : emit this message at level ALERT + +Messages emitted at level WARNING may cause the process to fail to start if the +"strict-mode" is enabled. Messages emitted at level ALERT will always cause a +fatal error. These can be used to detect some inappropriate conditions and +provide advice to the user. + +Example: + + .if "${A}" + .if "${B}" + .notice "A=1, B=1" + .elif "${C}" + .notice "A=1, B=0, C=1" + .elif "${D}" + .warning "A=1, B=0, C=0, D=1" + .else + .alert "A=1, B=0, C=0, D=0" + .endif + .else + .notice "A=0" + .endif + + .diag "WTA/2021-05-07: replace 'redirect' with 'return' after switch to 2.4" + http-request redirect location /goaway if ABUSE + + +2.5. Time format +---------------- + +Some parameters involve values representing time, such as timeouts. These +values are generally expressed in milliseconds (unless explicitly stated +otherwise) but may be expressed in any other unit by suffixing the unit to the +numeric value. It is important to consider this because it will not be repeated +for every keyword. Supported units are : + + - us : microseconds. 1 microsecond = 1/1000000 second + - ms : milliseconds. 1 millisecond = 1/1000 second. This is the default. + - s : seconds. 1s = 1000ms + - m : minutes. 1m = 60s = 60000ms + - h : hours. 1h = 60m = 3600s = 3600000ms + - d : days. 1d = 24h = 1440m = 86400s = 86400000ms + + +2.6. Examples +------------- + + # Simple configuration for an HTTP proxy listening on port 80 on all + # interfaces and forwarding requests to a single backend "servers" with a + # single server "server1" listening on 127.0.0.1:8000 + global + daemon + maxconn 256 + + defaults + mode http + timeout connect 5000ms + timeout client 50000ms + timeout server 50000ms + + frontend http-in + bind *:80 + default_backend servers + + backend servers + server server1 127.0.0.1:8000 maxconn 32 + + + # The same configuration defined with a single listen block. Shorter but + # less expressive, especially in HTTP mode. + global + daemon + maxconn 256 + + defaults + mode http + timeout connect 5000ms + timeout client 50000ms + timeout server 50000ms + + listen http-in + bind *:80 + server server1 127.0.0.1:8000 maxconn 32 + + +Assuming haproxy is in $PATH, test these configurations in a shell with: + + $ sudo haproxy -f configuration.conf -c + + +3. Global parameters +-------------------- + +Parameters in the "global" section are process-wide and often OS-specific. They +are generally set once for all and do not need being changed once correct. Some +of them have command-line equivalents. + +The following keywords are supported in the "global" section : + + * Process management and security + - 51degrees-cache-size + - 51degrees-data-file + - 51degrees-property-name-list + - 51degrees-property-separator + - ca-base + - chroot + - cluster-secret + - cpu-map + - crt-base + - daemon + - default-path + - description + - deviceatlas-json-file + - deviceatlas-log-level + - deviceatlas-properties-cookie + - deviceatlas-separator + - expose-experimental-directives + - external-check + - fd-hard-limit + - gid + - grace + - group + - h1-accept-payload-with-any-method + - h1-case-adjust + - h1-case-adjust-file + - h2-workaround-bogus-websocket-clients + - hard-stop-after + - httpclient.resolvers.id + - httpclient.resolvers.prefer + - httpclient.ssl.ca-file + - httpclient.ssl.verify + - insecure-fork-wanted + - insecure-setuid-wanted + - issuers-chain-path + - localpeer + - log + - log-send-hostname + - log-tag + - lua-load + - lua-load-per-thread + - lua-prepend-path + - mworker-max-reloads + - nbthread + - node + - numa-cpu-mapping + - pidfile + - pp2-never-send-local + - presetenv + - resetenv + - set-dumpable + - set-var + - setenv + - ssl-default-bind-ciphers + - ssl-default-bind-ciphersuites + - ssl-default-bind-curves + - ssl-default-bind-options + - ssl-default-server-ciphers + - ssl-default-server-ciphersuites + - ssl-default-server-options + - ssl-dh-param-file + - ssl-propquery + - ssl-provider + - ssl-provider-path + - ssl-server-verify + - ssl-skip-self-issued-ca + - stats + - strict-limits + - uid + - ulimit-n + - unix-bind + - unsetenv + - user + - wurfl-cache-size + - wurfl-data-file + - wurfl-information-list + - wurfl-information-list-separator + + * Performance tuning + - busy-polling + - max-spread-checks + - maxcompcpuusage + - maxcomprate + - maxconn + - maxconnrate + - maxpipes + - maxsessrate + - maxsslconn + - maxsslrate + - maxzlibmem + - no-memory-trimming + - noepoll + - noevports + - nogetaddrinfo + - nokqueue + - nopoll + - noreuseport + - nosplice + - profiling.tasks + - server-state-base + - server-state-file + - spread-checks + - ssl-engine + - ssl-mode-async + - tune.buffers.limit + - tune.buffers.reserve + - tune.bufsize + - tune.comp.maxlevel + - tune.fail-alloc + - tune.fd.edge-triggered + - tune.h2.header-table-size + - tune.h2.initial-window-size + - tune.h2.max-concurrent-streams + - tune.http.cookielen + - tune.http.logurilen + - tune.http.maxhdr + - tune.idle-pool.shared + - tune.idletimer + - tune.lua.forced-yield + - tune.lua.maxmem + - tune.lua.service-timeout + - tune.lua.session-timeout + - tune.lua.task-timeout + - tune.maxaccept + - tune.maxpollevents + - tune.maxrewrite + - tune.pattern.cache-size + - tune.peers.max-updates-at-once + - tune.pipesize + - tune.pool-high-fd-ratio + - tune.pool-low-fd-ratio + - tune.quic.frontend.conn-tx-buffers.limit + - tune.quic.frontend.max-idle-timeout + - tune.quic.frontend.max-streams-bidi + - tune.quic.retry-threshold + - tune.rcvbuf.client + - tune.rcvbuf.server + - tune.recv_enough + - tune.runqueue-depth + - tune.sched.low-latency + - tune.sndbuf.client + - tune.sndbuf.server + - tune.ssl.cachesize + - tune.ssl.capture-buffer-size + - tune.ssl.capture-cipherlist-size (deprecated) + - tune.ssl.default-dh-param + - tune.ssl.force-private-cache + - tune.ssl.hard-maxrecord + - tune.ssl.keylog + - tune.ssl.lifetime + - tune.ssl.maxrecord + - tune.ssl.ssl-ctx-cache-size + - tune.vars.global-max-size + - tune.vars.proc-max-size + - tune.vars.reqres-max-size + - tune.vars.sess-max-size + - tune.vars.txn-max-size + - tune.zlib.memlevel + - tune.zlib.windowsize + + * Debugging + - quiet + - zero-warning + + +3.1. Process management and security +------------------------------------ + +51degrees-data-file <file path> + The path of the 51Degrees data file to provide device detection services. The + file should be unzipped and accessible by HAProxy with relevant permissions. + + Please note that this option is only available when HAProxy has been + compiled with USE_51DEGREES. + +51degrees-property-name-list [<string> ...] + A list of 51Degrees property names to be load from the dataset. A full list + of names is available on the 51Degrees website: + https://51degrees.com/resources/property-dictionary + + Please note that this option is only available when HAProxy has been + compiled with USE_51DEGREES. + +51degrees-property-separator <char> + A char that will be appended to every property value in a response header + containing 51Degrees results. If not set that will be set as ','. + + Please note that this option is only available when HAProxy has been + compiled with USE_51DEGREES. + +51degrees-cache-size <number> + Sets the size of the 51Degrees converter cache to <number> entries. This + is an LRU cache which reminds previous device detections and their results. + By default, this cache is disabled. + + Please note that this option is only available when HAProxy has been + compiled with USE_51DEGREES. + +ca-base <dir> + Assigns a default directory to fetch SSL CA certificates and CRLs from when a + relative path is used with "ca-file", "ca-verify-file" or "crl-file" + directives. Absolute locations specified in "ca-file", "ca-verify-file" and + "crl-file" prevail and ignore "ca-base". + +chroot <jail dir> + Changes current directory to <jail dir> and performs a chroot() there before + dropping privileges. This increases the security level in case an unknown + vulnerability would be exploited, since it would make it very hard for the + attacker to exploit the system. This only works when the process is started + with superuser privileges. It is important to ensure that <jail_dir> is both + empty and non-writable to anyone. + +close-spread-time <time> + Define a time window during which idle connections and active connections + closing is spread in case of soft-stop. After a SIGUSR1 is received and the + grace period is over (if any), the idle connections will all be closed at + once if this option is not set, and active HTTP or HTTP2 connections will be + ended after the next request is received, either by appending a "Connection: + close" line to the HTTP response, or by sending a GOAWAY frame in case of + HTTP2. When this option is set, connection closing will be spread over this + set <time>. + If the close-spread-time is set to "infinite", active connection closing + during a soft-stop will be disabled. The "Connection: close" header will not + be added to HTTP responses (or GOAWAY for HTTP2) anymore and idle connections + will only be closed once their timeout is reached (based on the various + timeouts set in the configuration). + + Arguments : + <time> is a time window (by default in milliseconds) during which + connection closing will be spread during a soft-stop operation, or + "infinite" if active connection closing should be disabled. + + It is recommended to set this setting to a value lower than the one used in + the "hard-stop-after" option if this one is used, so that all connections + have a chance to gracefully close before the process stops. + + See also: grace, hard-stop-after, idle-close-on-response + +cluster-secret <secret> + Define an ASCII string secret shared between several nodes belonging to the + same cluster. It could be used for different usages. It is at least used to + derive stateless reset tokens for all the QUIC connections instantiated by + this process. This is also the case to derive secrets used to encrypt Retry + tokens. If you do not set this parameter, the stateless reset and Retry QUIC + features will be both silently disabled. + +cpu-map [auto:]<thread-group>[/<thread-set>] <cpu-set>... + On some operating systems, it is possible to bind a thread group or a thread + to a specific CPU set. This means that the designated threads will never run + on other CPUs. The "cpu-map" directive specifies CPU sets for individual + threads or thread groups. The first argument is a thread group range, + optionally followed by a thread set. These ranges have the following format: + + all | odd | even | number[-[number]] + + <number> must be a number between 1 and 32 or 64, depending on the machine's + word size. Any process IDs above 1 and any thread IDs above nbthread are + ignored. It is possible to specify a range with two such number delimited by + a dash ('-'). It also is possible to specify all thraeds at once using + "all", only odd numbers using "odd" or even numbers using "even", just like + with the "bind-process" directive. The second and forthcoming arguments are + CPU sets. Each CPU set is either a unique number starting at 0 for the first + CPU or a range with two such numbers delimited by a dash ('-'). Outside of + Linux and BSDs, there may be a limitation on the maximum CPU index to either + 31 or 63. Multiple CPU numbers or ranges may be specified, and the processes + or threads will be allowed to bind to all of them. Obviously, multiple + "cpu-map" directives may be specified. Each "cpu-map" directive will replace + the previous ones when they overlap. A thread will be bound on the + intersection of its mapping and the one of the process on which it is + attached. If the intersection is null, no specific binding will be set for + the thread. + + Ranges can be partially defined. The higher bound can be omitted. In such + case, it is replaced by the corresponding maximum value, 32 or 64 depending + on the machine's word size. + + The prefix "auto:" can be added before the process set to let HAProxy + automatically bind a process or a thread to a CPU by incrementing threads and + CPU sets. To be valid, both sets must have the same size. No matter the + declaration order of the CPU sets, it will be bound from the lowest to the + highest bound. Having both a process and a thread range with the "auto:" + prefix is not supported. Only one range is supported, the other one must be + a fixed number. + + Note that process ranges are supported for historical reasons. Nowadays, a + lone number designates a process and must be 1, and specifying a thread range + or number requires to prepend "1/" in front of it. Finally, "1" is strictly + equivalent to "1/all" and designates all threads on the process. + + Examples: + cpu-map 1/all 0-3 # bind all threads of the first process on the + # first 4 CPUs + + cpu-map 1/1- 0- # will be replaced by "cpu-map 1/1-64 0-63" + # or "cpu-map 1/1-32 0-31" depending on the machine's + # word size. + + # all these lines bind the thread 1 to the cpu 0, the thread 2 to cpu 1 + # and so on. + cpu-map auto:1/1-4 0-3 + cpu-map auto:1/1-4 0-1 2-3 + cpu-map auto:1/1-4 3 2 1 0 + + # bind each thread to exactly one CPU using all/odd/even keyword + cpu-map auto:1/all 0-63 + cpu-map auto:1/even 0-31 + cpu-map auto:1/odd 32-63 + + # invalid cpu-map because thread and CPU sets have different sizes. + cpu-map auto:1/1-4 0 # invalid + cpu-map auto:1/1 0-3 # invalid + +crt-base <dir> + Assigns a default directory to fetch SSL certificates from when a relative + path is used with "crtfile" or "crt" directives. Absolute locations specified + prevail and ignore "crt-base". + +daemon + Makes the process fork into background. This is the recommended mode of + operation. It is equivalent to the command line "-D" argument. It can be + disabled by the command line "-db" argument. This option is ignored in + systemd mode. + +default-path { current | config | parent | origin <path> } + By default HAProxy loads all files designated by a relative path from the + location the process is started in. In some circumstances it might be + desirable to force all relative paths to start from a different location + just as if the process was started from such locations. This is what this + directive is made for. Technically it will perform a temporary chdir() to + the designated location while processing each configuration file, and will + return to the original directory after processing each file. It takes an + argument indicating the policy to use when loading files whose path does + not start with a slash ('/'): + - "current" indicates that all relative files are to be loaded from the + directory the process is started in ; this is the default. + + - "config" indicates that all relative files should be loaded from the + directory containing the configuration file. More specifically, if the + configuration file contains a slash ('/'), the longest part up to the + last slash is used as the directory to change to, otherwise the current + directory is used. This mode is convenient to bundle maps, errorfiles, + certificates and Lua scripts together as relocatable packages. When + multiple configuration files are loaded, the directory is updated for + each of them. + + - "parent" indicates that all relative files should be loaded from the + parent of the directory containing the configuration file. More + specifically, if the configuration file contains a slash ('/'), ".." + is appended to the longest part up to the last slash is used as the + directory to change to, otherwise the directory is "..". This mode is + convenient to bundle maps, errorfiles, certificates and Lua scripts + together as relocatable packages, but where each part is located in a + different subdirectory (e.g. "config/", "certs/", "maps/", ...). + + - "origin" indicates that all relative files should be loaded from the + designated (mandatory) path. This may be used to ease management of + different HAProxy instances running in parallel on a system, where each + instance uses a different prefix but where the rest of the sections are + made easily relocatable. + + Each "default-path" directive instantly replaces any previous one and will + possibly result in switching to a different directory. While this should + always result in the desired behavior, it is really not a good practice to + use multiple default-path directives, and if used, the policy ought to remain + consistent across all configuration files. + + Warning: some configuration elements such as maps or certificates are + uniquely identified by their configured path. By using a relocatable layout, + it becomes possible for several of them to end up with the same unique name, + making it difficult to update them at run time, especially when multiple + configuration files are loaded from different directories. It is essential to + observe a strict collision-free file naming scheme before adopting relative + paths. A robust approach could consist in prefixing all files names with + their respective site name, or in doing so at the directory level. + +description <text> + Add a text that describes the instance. + + Please note that it is required to escape certain characters (# for example) + and this text is inserted into a html page so you should avoid using + "<" and ">" characters. + +deviceatlas-json-file <path> + Sets the path of the DeviceAtlas JSON data file to be loaded by the API. + The path must be a valid JSON data file and accessible by HAProxy process. + +deviceatlas-log-level <value> + Sets the level of information returned by the API. This directive is + optional and set to 0 by default if not set. + +deviceatlas-properties-cookie <name> + Sets the client cookie's name used for the detection if the DeviceAtlas + Client-side component was used during the request. This directive is optional + and set to DAPROPS by default if not set. + +deviceatlas-separator <char> + Sets the character separator for the API properties results. This directive + is optional and set to | by default if not set. + +expose-experimental-directives + This statement must appear before using directives tagged as experimental or + the config file will be rejected. + +external-check + Allows the use of an external agent to perform health checks. This is + disabled by default as a security precaution, and even when enabled, checks + may still fail unless "insecure-fork-wanted" is enabled as well. If the + program launched makes use of a setuid executable (it should really not), + you may also need to set "insecure-setuid-wanted" in the global section. + See "option external-check", and "insecure-fork-wanted", and + "insecure-setuid-wanted". + +fd-hard-limit <number> + Sets an upper bound to the maximum number of file descriptors that the + process will use, regardless of system limits. While "ulimit-n" and "maxconn" + may be used to enforce a value, when they are not set, the process will be + limited to the hard limit of the RLIMIT_NOFILE setting as reported by + "ulimit -n -H". But some modern operating systems are now allowing extremely + large values here (in the order of 1 billion), which will consume way too + much RAM for regular usage. The fd-hard-limit setting is provided to enforce + a possibly lower bound to this limit. This means that it will always respect + the system-imposed limits when they are below <number> but the specified + value will be used if system-imposed limits are higher. In the example below, + no other setting is specified and the maxconn value will automatically adapt + to the lower of "fd-hard-limit" and the system-imposed limit: + + global + # use as many FDs as possible but no more than 50000 + fd-hard-limit 50000 + + See also: ulimit-n, maxconn + +gid <number> + Changes the process's group ID to <number>. It is recommended that the group + ID is dedicated to HAProxy or to a small set of similar daemons. HAProxy must + be started with a user belonging to this group, or with superuser privileges. + Note that if HAProxy is started from a user having supplementary groups, it + will only be able to drop these groups if started with superuser privileges. + See also "group" and "uid". + +grace <time> + Defines a delay between SIGUSR1 and real soft-stop. + + Arguments : + <time> is an extra delay (by default in milliseconds) after receipt of the + SIGUSR1 signal that will be waited for before proceeding with the + soft-stop operation. + + This is used for compatibility with legacy environments where the haproxy + process needs to be stopped but some external components need to detect the + status before listeners are unbound. The principle is that the internal + "stopping" variable (which is reported by the "stopping" sample fetch + function) will be turned to true, but listeners will continue to accept + connections undisturbed, until the delay expires, after what the regular + soft-stop will proceed. This must not be used with processes that are + reloaded, or this will prevent the old process from unbinding, and may + prevent the new one from starting, or simply cause trouble. + + Example: + + global + grace 10s + + # Returns 200 OK until stopping is set via SIGUSR1 + frontend ext-check + bind :9999 + monitor-uri /ext-check + monitor fail if { stopping } + + Please note that a more flexible and durable approach would instead consist + for an orchestration system in setting a global variable from the CLI, use + that variable to respond to external checks, then after a delay send the + SIGUSR1 signal. + + Example: + + # Returns 200 OK until proc.stopping is set to non-zero. May be done + # from HTTP using set-var(proc.stopping) or from the CLI using: + # > set var proc.stopping int(1) + frontend ext-check + bind :9999 + monitor-uri /ext-check + monitor fail if { var(proc.stopping) -m int gt 0 } + + See also: hard-stop-after, monitor + +group <group name> + Similar to "gid" but uses the GID of group name <group name> from /etc/group. + See also "gid" and "user". + +h1-accept-payload-with-any-method + Does not reject HTTP/1.0 GET/HEAD/DELETE requests with a payload. + + While It is explicitly allowed in HTTP/1.1, HTTP/1.0 is not clear on this + point and some old servers don't expect any payload and never look for body + length (via Content-Length or Transfer-Encoding headers). It means that some + intermediaries may properly handle the payload for HTTP/1.0 GET/HEAD/DELETE + requests, while some others may totally ignore it. That may lead to security + issues because a request smuggling attack is possible. Thus, by default, + HAProxy rejects HTTP/1.0 GET/HEAD/DELETE requests with a payload. + + However, it may be an issue with some old clients. In this case, this global + option may be set. + +h1-case-adjust <from> <to> + Defines the case adjustment to apply, when enabled, to the header name + <from>, to change it to <to> before sending it to HTTP/1 clients or + servers. <from> must be in lower case, and <from> and <to> must not differ + except for their case. It may be repeated if several header names need to be + adjusted. Duplicate entries are not allowed. If a lot of header names have to + be adjusted, it might be more convenient to use "h1-case-adjust-file". + Please note that no transformation will be applied unless "option + h1-case-adjust-bogus-client" or "option h1-case-adjust-bogus-server" is + specified in a proxy. + + There is no standard case for header names because, as stated in RFC7230, + they are case-insensitive. So applications must handle them in a case- + insensitive manner. But some bogus applications violate the standards and + erroneously rely on the cases most commonly used by browsers. This problem + becomes critical with HTTP/2 because all header names must be exchanged in + lower case, and HAProxy follows the same convention. All header names are + sent in lower case to clients and servers, regardless of the HTTP version. + + Applications which fail to properly process requests or responses may require + to temporarily use such workarounds to adjust header names sent to them for + the time it takes the application to be fixed. Please note that an + application which requires such workarounds might be vulnerable to content + smuggling attacks and must absolutely be fixed. + + Example: + global + h1-case-adjust content-length Content-Length + + See "h1-case-adjust-file", "option h1-case-adjust-bogus-client" and + "option h1-case-adjust-bogus-server". + +h1-case-adjust-file <hdrs-file> + Defines a file containing a list of key/value pairs used to adjust the case + of some header names before sending them to HTTP/1 clients or servers. The + file <hdrs-file> must contain 2 header names per line. The first one must be + in lower case and both must not differ except for their case. Lines which + start with '#' are ignored, just like empty lines. Leading and trailing tabs + and spaces are stripped. Duplicate entries are not allowed. Please note that + no transformation will be applied unless "option h1-case-adjust-bogus-client" + or "option h1-case-adjust-bogus-server" is specified in a proxy. + + If this directive is repeated, only the last one will be processed. It is an + alternative to the directive "h1-case-adjust" if a lot of header names need + to be adjusted. Please read the risks associated with using this. + + See "h1-case-adjust", "option h1-case-adjust-bogus-client" and + "option h1-case-adjust-bogus-server". + +h2-workaround-bogus-websocket-clients + This disables the announcement of the support for h2 websockets to clients. + This can be use to overcome clients which have issues when implementing the + relatively fresh RFC8441, such as Firefox 88. To allow clients to + automatically downgrade to http/1.1 for the websocket tunnel, specify h2 + support on the bind line using "alpn" without an explicit "proto" keyword. If + this statement was previously activated, this can be disabled by prefixing + the keyword with "no'. + +hard-stop-after <time> + Defines the maximum time allowed to perform a clean soft-stop. + + Arguments : + <time> is the maximum time (by default in milliseconds) for which the + instance will remain alive when a soft-stop is received via the + SIGUSR1 signal. + + This may be used to ensure that the instance will quit even if connections + remain opened during a soft-stop (for example with long timeouts for a proxy + in tcp mode). It applies both in TCP and HTTP mode. + + Example: + global + hard-stop-after 30s + + See also: grace + +httpclient.resolvers.id <resolvers id> + This option defines the resolvers section with which the httpclient will try + to resolve. + + Default option is the "default" resolvers ID. By default, if this option is + not used, it will simply disable the resolving if the section is not found. + + However, when this option is explicitly enabled it will trigger a + configuration error if it fails to load. + +httpclient.resolvers.prefer <ipv4|ipv6> + This option allows to chose which family of IP you want when resolving, + which is convenient when IPv6 is not available on your network. Default + option is "ipv6". + +httpclient.ssl.ca-file <cafile> + This option defines the ca-file which should be used to verify the server + certificate. It takes the same parameters as the "ca-file" option on the + server line. + + By default and when this option is not used, the value is + "@system-ca" which tries to load the CA of the system. If it fails the SSL + will be disabled for the httpclient. + + However, when this option is explicitly enabled it will trigger a + configuration error if it fails. + +httpclient.ssl.verify [none|required] + Works the same way as the verify option on server lines. If specified to 'none', + servers certificates are not verified. Default option is "required". + + By default and when this option is not used, the value is + "required". If it fails the SSL will be disabled for the httpclient. + + However, when this option is explicitly enabled it will trigger a + configuration error if it fails. + +insecure-fork-wanted + By default HAProxy tries hard to prevent any thread and process creation + after it starts. Doing so is particularly important when using Lua files of + uncertain origin, and when experimenting with development versions which may + still contain bugs whose exploitability is uncertain. And generally speaking + it's good hygiene to make sure that no unexpected background activity can be + triggered by traffic. But this prevents external checks from working, and may + break some very specific Lua scripts which actively rely on the ability to + fork. This option is there to disable this protection. Note that it is a bad + idea to disable it, as a vulnerability in a library or within HAProxy itself + will be easier to exploit once disabled. In addition, forking from Lua or + anywhere else is not reliable as the forked process may randomly embed a lock + set by another thread and never manage to finish an operation. As such it is + highly recommended that this option is never used and that any workload + requiring such a fork be reconsidered and moved to a safer solution (such as + agents instead of external checks). This option supports the "no" prefix to + disable it. + +insecure-setuid-wanted + HAProxy doesn't need to call executables at run time (except when using + external checks which are strongly recommended against), and is even expected + to isolate itself into an empty chroot. As such, there basically is no valid + reason to allow a setuid executable to be called without the user being fully + aware of the risks. In a situation where HAProxy would need to call external + checks and/or disable chroot, exploiting a vulnerability in a library or in + HAProxy itself could lead to the execution of an external program. On Linux + it is possible to lock the process so that any setuid bit present on such an + executable is ignored. This significantly reduces the risk of privilege + escalation in such a situation. This is what HAProxy does by default. In case + this causes a problem to an external check (for example one which would need + the "ping" command), then it is possible to disable this protection by + explicitly adding this directive in the global section. If enabled, it is + possible to turn it back off by prefixing it with the "no" keyword. + +issuers-chain-path <dir> + Assigns a directory to load certificate chain for issuer completion. All + files must be in PEM format. For certificates loaded with "crt" or "crt-list", + if certificate chain is not included in PEM (also commonly known as + intermediate certificate), HAProxy will complete chain if the issuer of the + certificate corresponds to the first certificate of the chain loaded with + "issuers-chain-path". + A "crt" file with PrivateKey+Certificate+IntermediateCA2+IntermediateCA1 + could be replaced with PrivateKey+Certificate. HAProxy will complete the + chain if a file with IntermediateCA2+IntermediateCA1 is present in + "issuers-chain-path" directory. All other certificates with the same issuer + will share the chain in memory. + +localpeer <name> + Sets the local instance's peer name. It will be ignored if the "-L" + command line argument is specified or if used after "peers" section + definitions. In such cases, a warning message will be emitted during + the configuration parsing. + + This option will also set the HAPROXY_LOCALPEER environment variable. + See also "-L" in the management guide and "peers" section below. + +log <address> [len <length>] [format <format>] [sample <ranges>:<sample_size>] + <facility> [max level [min level]] + Adds a global syslog server. Several global servers can be defined. They + will receive logs for starts and exits, as well as all logs from proxies + configured with "log global". + + <address> can be one of: + + - An IPv4 address optionally followed by a colon and a UDP port. If + no port is specified, 514 is used by default (the standard syslog + port). + + - An IPv6 address followed by a colon and optionally a UDP port. If + no port is specified, 514 is used by default (the standard syslog + port). + + - A filesystem path to a datagram UNIX domain socket, keeping in mind + considerations for chroot (be sure the path is accessible inside + the chroot) and uid/gid (be sure the path is appropriately + writable). + + - A file descriptor number in the form "fd@<number>", which may point + to a pipe, terminal, or socket. In this case unbuffered logs are used + and one writev() call per log is performed. This is a bit expensive + but acceptable for most workloads. Messages sent this way will not be + truncated but may be dropped, in which case the DroppedLogs counter + will be incremented. The writev() call is atomic even on pipes for + messages up to PIPE_BUF size, which POSIX recommends to be at least + 512 and which is 4096 bytes on most modern operating systems. Any + larger message may be interleaved with messages from other processes. + Exceptionally for debugging purposes the file descriptor may also be + directed to a file, but doing so will significantly slow HAProxy down + as non-blocking calls will be ignored. Also there will be no way to + purge nor rotate this file without restarting the process. Note that + the configured syslog format is preserved, so the output is suitable + for use with a TCP syslog server. See also the "short" and "raw" + format below. + + - "stdout" / "stderr", which are respectively aliases for "fd@1" and + "fd@2", see above. + + - A ring buffer in the form "ring@<name>", which will correspond to an + in-memory ring buffer accessible over the CLI using the "show events" + command, which will also list existing rings and their sizes. Such + buffers are lost on reload or restart but when used as a complement + this can help troubleshooting by having the logs instantly available. + + You may want to reference some environment variables in the address + parameter, see section 2.3 about environment variables. + + <length> is an optional maximum line length. Log lines larger than this value + will be truncated before being sent. The reason is that syslog + servers act differently on log line length. All servers support the + default value of 1024, but some servers simply drop larger lines + while others do log them. If a server supports long lines, it may + make sense to set this value here in order to avoid truncating long + lines. Similarly, if a server drops long lines, it is preferable to + truncate them before sending them. Accepted values are 80 to 65535 + inclusive. The default value of 1024 is generally fine for all + standard usages. Some specific cases of long captures or + JSON-formatted logs may require larger values. You may also need to + increase "tune.http.logurilen" if your request URIs are truncated. + + <format> is the log format used when generating syslog messages. It may be + one of the following : + + local Analog to rfc3164 syslog message format except that hostname + field is stripped. This is the default. + Note: option "log-send-hostname" switches the default to + rfc3164. + + rfc3164 The RFC3164 syslog message format. + (https://tools.ietf.org/html/rfc3164) + + rfc5424 The RFC5424 syslog message format. + (https://tools.ietf.org/html/rfc5424) + + priority A message containing only a level plus syslog facility between + angle brackets such as '<63>', followed by the text. The PID, + date, time, process name and system name are omitted. This is + designed to be used with a local log server. + + short A message containing only a level between angle brackets such as + '<3>', followed by the text. The PID, date, time, process name + and system name are omitted. This is designed to be used with a + local log server. This format is compatible with what the systemd + logger consumes. + + timed A message containing only a level between angle brackets such as + '<3>', followed by ISO date and by the text. The PID, process + name and system name are omitted. This is designed to be + used with a local log server. + + iso A message containing only the ISO date, followed by the text. + The PID, process name and system name are omitted. This is + designed to be used with a local log server. + + raw A message containing only the text. The level, PID, date, time, + process name and system name are omitted. This is designed to be + used in containers or during development, where the severity only + depends on the file descriptor used (stdout/stderr). + + <ranges> A list of comma-separated ranges to identify the logs to sample. + This is used to balance the load of the logs to send to the log + server. The limits of the ranges cannot be null. They are numbered + from 1. The size or period (in number of logs) of the sample must be + set with <sample_size> parameter. + + <sample_size> + The size of the sample in number of logs to consider when balancing + their logging loads. It is used to balance the load of the logs to + send to the syslog server. This size must be greater or equal to the + maximum of the high limits of the ranges. + (see also <ranges> parameter). + + <facility> must be one of the 24 standard syslog facilities : + + kern user mail daemon auth syslog lpr news + uucp cron auth2 ftp ntp audit alert cron2 + local0 local1 local2 local3 local4 local5 local6 local7 + + Note that the facility is ignored for the "short" and "raw" + formats, but still required as a positional field. It is + recommended to use "daemon" in this case to make it clear that + it's only supposed to be used locally. + + An optional level can be specified to filter outgoing messages. By default, + all messages are sent. If a maximum level is specified, only messages with a + severity at least as important as this level will be sent. An optional minimum + level can be specified. If it is set, logs emitted with a more severe level + than this one will be capped to this level. This is used to avoid sending + "emerg" messages on all terminals on some default syslog configurations. + Eight levels are known : + + emerg alert crit err warning notice info debug + +log-send-hostname [<string>] + Sets the hostname field in the syslog header. If optional "string" parameter + is set the header is set to the string contents, otherwise uses the hostname + of the system. Generally used if one is not relaying logs through an + intermediate syslog server or for simply customizing the hostname printed in + the logs. + +log-tag <string> + Sets the tag field in the syslog header to this string. It defaults to the + program name as launched from the command line, which usually is "haproxy". + Sometimes it can be useful to differentiate between multiple processes + running on the same host. See also the per-proxy "log-tag" directive. + +lua-load <file> + This global directive loads and executes a Lua file in the shared context + that is visible to all threads. Any variable set in such a context is visible + from any thread. This is the easiest and recommended way to load Lua programs + but it will not scale well if a lot of Lua calls are performed, as only one + thread may be running on the global state at a time. A program loaded this + way will always see 0 in the "core.thread" variable. This directive can be + used multiple times. + +lua-load-per-thread <file> + This global directive loads and executes a Lua file into each started thread. + Any global variable has a thread-local visibility so that each thread could + see a different value. As such it is strongly recommended not to use global + variables in programs loaded this way. An independent copy is loaded and + initialized for each thread, everything is done sequentially and in the + thread's numeric order from 1 to nbthread. If some operations need to be + performed only once, the program should check the "core.thread" variable to + figure what thread is being initialized. Programs loaded this way will run + concurrently on all threads and will be highly scalable. This is the + recommended way to load simple functions that register sample-fetches, + converters, actions or services once it is certain the program doesn't depend + on global variables. For the sake of simplicity, the directive is available + even if only one thread is used and even if threads are disabled (in which + case it will be equivalent to lua-load). This directive can be used multiple + times. + +lua-prepend-path <string> [<type>] + Prepends the given string followed by a semicolon to Lua's package.<type> + variable. + <type> must either be "path" or "cpath". If <type> is not given it defaults + to "path". + + Lua's paths are semicolon delimited lists of patterns that specify how the + `require` function attempts to find the source file of a library. Question + marks (?) within a pattern will be replaced by module name. The path is + evaluated left to right. This implies that paths that are prepended later + will be checked earlier. + + As an example by specifying the following path: + + lua-prepend-path /usr/share/haproxy-lua/?/init.lua + lua-prepend-path /usr/share/haproxy-lua/?.lua + + When `require "example"` is being called Lua will first attempt to load the + /usr/share/haproxy-lua/example.lua script, if that does not exist the + /usr/share/haproxy-lua/example/init.lua will be attempted and the default + paths if that does not exist either. + + See https://www.lua.org/pil/8.1.html for the details within the Lua + documentation. + +master-worker [no-exit-on-failure] + Master-worker mode. It is equivalent to the command line "-W" argument. + This mode will launch a "master" which will monitor the "workers". Using + this mode, you can reload HAProxy directly by sending a SIGUSR2 signal to + the master. The master-worker mode is compatible either with the foreground + or daemon mode. + + By default, if a worker exits with a bad return code, in the case of a + segfault for example, all workers will be killed, and the master will leave. + It is convenient to combine this behavior with Restart=on-failure in a + systemd unit file in order to relaunch the whole process. If you don't want + this behavior, you must use the keyword "no-exit-on-failure". + + See also "-W" in the management guide. + +mworker-max-reloads <number> + In master-worker mode, this option limits the number of time a worker can + survive to a reload. If the worker did not leave after a reload, once its + number of reloads is greater than this number, the worker will receive a + SIGTERM. This option helps to keep under control the number of workers. + See also "show proc" in the Management Guide. + +nbthread <number> + This setting is only available when support for threads was built in. It + makes HAProxy run on <number> threads. "nbthread" also works when HAProxy is + started in foreground. On some platforms supporting CPU affinity, the default + "nbthread" value is automatically set to the number of CPUs the process is + bound to upon startup. This means that the thread count can easily be + adjusted from the calling process using commands like "taskset" or "cpuset". + Otherwise, this value defaults to 1. The default value is reported in the + output of "haproxy -vv". + +numa-cpu-mapping + If running on a NUMA-aware platform, HAProxy inspects on startup the CPU + topology of the machine. If a multi-socket machine is detected, the affinity + is automatically calculated to run on the CPUs of a single node. This is done + in order to not suffer from the performance penalties caused by the + inter-socket bus latency. However, if the applied binding is non optimal on a + particular architecture, it can be disabled with the statement 'no + numa-cpu-mapping'. This automatic binding is also not applied if a nbthread + statement is present in the configuration, or the affinity of the process is + already specified, for example via the 'cpu-map' directive or the taskset + utility. + +pidfile <pidfile> + Writes PIDs of all daemons into file <pidfile> when daemon mode or writes PID + of master process into file <pidfile> when master-worker mode. This option is + equivalent to the "-p" command line argument. The file must be accessible to + the user starting the process. See also "daemon" and "master-worker". + +pp2-never-send-local + A bug in the PROXY protocol v2 implementation was present in HAProxy up to + version 2.1, causing it to emit a PROXY command instead of a LOCAL command + for health checks. This is particularly minor but confuses some servers' + logs. Sadly, the bug was discovered very late and revealed that some servers + which possibly only tested their PROXY protocol implementation against + HAProxy fail to properly handle the LOCAL command, and permanently remain in + the "down" state when HAProxy checks them. When this happens, it is possible + to enable this global option to revert to the older (bogus) behavior for the + time it takes to contact the affected components' vendors and get them fixed. + This option is disabled by default and acts on all servers having the + "send-proxy-v2" statement. + +presetenv <name> <value> + Sets environment variable <name> to value <value>. If the variable exists, it + is NOT overwritten. The changes immediately take effect so that the next line + in the configuration file sees the new value. See also "setenv", "resetenv", + and "unsetenv". + +resetenv [<name> ...] + Removes all environment variables except the ones specified in argument. It + allows to use a clean controlled environment before setting new values with + setenv or unsetenv. Please note that some internal functions may make use of + some environment variables, such as time manipulation functions, but also + OpenSSL or even external checks. This must be used with extreme care and only + after complete validation. The changes immediately take effect so that the + next line in the configuration file sees the new environment. See also + "setenv", "presetenv", and "unsetenv". + +stats bind-process [ all | odd | even | <process_num>[-[process_num>]] ] ... + Deprecated. Before threads were supported, this was used to force some stats + instances on certain processes only. The default and only accepted value is + "1" (along with "all" and "odd" which alias it). Do not use this setting. + +server-state-base <directory> + Specifies the directory prefix to be prepended in front of all servers state + file names which do not start with a '/'. See also "server-state-file", + "load-server-state-from-file" and "server-state-file-name". + +server-state-file <file> + Specifies the path to the file containing state of servers. If the path starts + with a slash ('/'), it is considered absolute, otherwise it is considered + relative to the directory specified using "server-state-base" (if set) or to + the current directory. Before reloading HAProxy, it is possible to save the + servers' current state using the stats command "show servers state". The + output of this command must be written in the file pointed by <file>. When + starting up, before handling traffic, HAProxy will read, load and apply state + for each server found in the file and available in its current running + configuration. See also "server-state-base" and "show servers state", + "load-server-state-from-file" and "server-state-file-name" + +set-dumpable + This option is better left disabled by default and enabled only upon a + developer's request. If it has been enabled, it may still be forcibly + disabled by prefixing it with the "no" keyword. It has no impact on + performance nor stability but will try hard to re-enable core dumps that were + possibly disabled by file size limitations (ulimit -f), core size limitations + (ulimit -c), or "dumpability" of a process after changing its UID/GID (such + as /proc/sys/fs/suid_dumpable on Linux). Core dumps might still be limited by + the current directory's permissions (check what directory the file is started + from), the chroot directory's permission (it may be needed to temporarily + disable the chroot directive or to move it to a dedicated writable location), + or any other system-specific constraint. For example, some Linux flavours are + notorious for replacing the default core file with a path to an executable + not even installed on the system (check /proc/sys/kernel/core_pattern). Often, + simply writing "core", "core.%p" or "/var/log/core/core.%p" addresses the + issue. When trying to enable this option waiting for a rare issue to + re-appear, it's often a good idea to first try to obtain such a dump by + issuing, for example, "kill -11" to the "haproxy" process and verify that it + leaves a core where expected when dying. + +set-var <var-name> <expr> + Sets the process-wide variable '<var-name>' to the result of the evaluation + of the sample expression <expr>. The variable '<var-name>' may only be a + process-wide variable (using the 'proc.' prefix). It works exactly like the + 'set-var' action in TCP or HTTP rules except that the expression is evaluated + at configuration parsing time and that the variable is instantly set. The + sample fetch functions and converters permitted in the expression are only + those using internal data, typically 'int(value)' or 'str(value)'. It is + possible to reference previously allocated variables as well. These variables + will then be readable (and modifiable) from the regular rule sets. + + Example: + global + set-var proc.current_state str(primary) + set-var proc.prio int(100) + set-var proc.threshold int(200),sub(proc.prio) + +set-var-fmt <var-name> <fmt> + Sets the process-wide variable '<var-name>' to the string resulting from the + evaluation of the log-format <fmt>. The variable '<var-name>' may only be a + process-wide variable (using the 'proc.' prefix). It works exactly like the + 'set-var-fmt' action in TCP or HTTP rules except that the expression is + evaluated at configuration parsing time and that the variable is instantly + set. The sample fetch functions and converters permitted in the expression + are only those using internal data, typically 'int(value)' or 'str(value)'. + It is possible to reference previously allocated variables as well. These + variables will then be readable (and modifiable) from the regular rule sets. + Please see section 8.2.4 for details on the log-format syntax. + + Example: + global + set-var-fmt proc.current_state "primary" + set-var-fmt proc.bootid "%pid|%t" + +setenv <name> <value> + Sets environment variable <name> to value <value>. If the variable exists, it + is overwritten. The changes immediately take effect so that the next line in + the configuration file sees the new value. See also "presetenv", "resetenv", + and "unsetenv". + +ssl-default-bind-ciphers <ciphers> + This setting is only available when support for OpenSSL was built in. It sets + the default string describing the list of cipher algorithms ("cipher suite") + that are negotiated during the SSL/TLS handshake up to TLSv1.2 for all + "bind" lines which do not explicitly define theirs. The format of the string + is defined in "man 1 ciphers" from OpenSSL man pages. For background + information and recommendations see e.g. + (https://wiki.mozilla.org/Security/Server_Side_TLS) and + (https://mozilla.github.io/server-side-tls/ssl-config-generator/). For TLSv1.3 + cipher configuration, please check the "ssl-default-bind-ciphersuites" keyword. + Please check the "bind" keyword for more information. + +ssl-default-bind-ciphersuites <ciphersuites> + This setting is only available when support for OpenSSL was built in and + OpenSSL 1.1.1 or later was used to build HAProxy. It sets the default string + describing the list of cipher algorithms ("cipher suite") that are negotiated + during the TLSv1.3 handshake for all "bind" lines which do not explicitly define + theirs. The format of the string is defined in + "man 1 ciphers" from OpenSSL man pages under the section "ciphersuites". For + cipher configuration for TLSv1.2 and earlier, please check the + "ssl-default-bind-ciphers" keyword. Please check the "bind" keyword for more + information. + +ssl-default-bind-curves <curves> + This setting is only available when support for OpenSSL was built in. It sets + the default string describing the list of elliptic curves algorithms ("curve + suite") that are negotiated during the SSL/TLS handshake with ECDHE. The format + of the string is a colon-delimited list of curve name. + Please check the "bind" keyword for more information. + +ssl-default-bind-options [<option>]... + This setting is only available when support for OpenSSL was built in. It sets + default ssl-options to force on all "bind" lines. Please check the "bind" + keyword to see available options. + + Example: + global + ssl-default-bind-options ssl-min-ver TLSv1.0 no-tls-tickets + +ssl-default-server-ciphers <ciphers> + This setting is only available when support for OpenSSL was built in. It + sets the default string describing the list of cipher algorithms that are + negotiated during the SSL/TLS handshake up to TLSv1.2 with the server, + for all "server" lines which do not explicitly define theirs. The format of + the string is defined in "man 1 ciphers" from OpenSSL man pages. For background + information and recommendations see e.g. + (https://wiki.mozilla.org/Security/Server_Side_TLS) and + (https://mozilla.github.io/server-side-tls/ssl-config-generator/). + For TLSv1.3 cipher configuration, please check the + "ssl-default-server-ciphersuites" keyword. Please check the "server" keyword + for more information. + +ssl-default-server-ciphersuites <ciphersuites> + This setting is only available when support for OpenSSL was built in and + OpenSSL 1.1.1 or later was used to build HAProxy. It sets the default + string describing the list of cipher algorithms that are negotiated during + the TLSv1.3 handshake with the server, for all "server" lines which do not + explicitly define theirs. The format of the string is defined in + "man 1 ciphers" from OpenSSL man pages under the section "ciphersuites". For + cipher configuration for TLSv1.2 and earlier, please check the + "ssl-default-server-ciphers" keyword. Please check the "server" keyword for + more information. + +ssl-default-server-options [<option>]... + This setting is only available when support for OpenSSL was built in. It sets + default ssl-options to force on all "server" lines. Please check the "server" + keyword to see available options. + +ssl-dh-param-file <file> + This setting is only available when support for OpenSSL was built in. It sets + the default DH parameters that are used during the SSL/TLS handshake when + ephemeral Diffie-Hellman (DHE) key exchange is used, for all "bind" lines + which do not explicitly define theirs. It will be overridden by custom DH + parameters found in a bind certificate file if any. If custom DH parameters + are not specified either by using ssl-dh-param-file or by setting them + directly in the certificate file, DHE ciphers will not be used, unless + tune.ssl.default-dh-param is set. In this latter case, pre-defined DH + parameters of the specified size will be used. Custom parameters are known to + be more secure and therefore their use is recommended. + Custom DH parameters may be generated by using the OpenSSL command + "openssl dhparam <size>", where size should be at least 2048, as 1024-bit DH + parameters should not be considered secure anymore. + +ssl-propquery <query> + This setting is only available when support for OpenSSL was built in and when + OpenSSL's version is at least 3.0. It allows to define a default property + string used when fetching algorithms in providers. It behave the same way as + the openssl propquery option and it follows the same syntax (described in + https://www.openssl.org/docs/man3.0/man7/property.html). For instance, if you + have two providers loaded, the foo one and the default one, the propquery + "?provider=foo" allows to pick the algorithm implementations provided by the + foo provider by default, and to fallback on the default provider's one if it + was not found. + +ssl-provider <name> + This setting is only available when support for OpenSSL was built in and when + OpenSSL's version is at least 3.0. It allows to load a provider during init. + If loading is successful, any capabilities provided by the loaded provider + might be used by HAProxy. Multiple 'ssl-provider' options can be specified in + a configuration file. The providers will be loaded in their order of + appearance. + + Please note that loading a provider explicitly prevents OpenSSL from loading + the 'default' provider automatically. OpenSSL also allows to define the + providers that should be loaded directly in its configuration file + (openssl.cnf for instance) so it is not necessary to use this 'ssl-provider' + option to load providers. The "show ssl providers" CLI command can be used to + show all the providers that were successfully loaded. + + The default search path of OpenSSL provider can be found in the output of the + "openssl version -a" command. If the provider is in another directory, you + can set the OPENSSL_MODULES environment variable, which takes the directory + where your provider can be found. + + See also "ssl-propquery" and "ssl-provider-path". + +ssl-provider-path <path> + This setting is only available when support for OpenSSL was built in and when + OpenSSL's version is at least 3.0. It allows to specify the search path that + is to be used by OpenSSL for looking for providers. It behaves the same way + as the OPENSSL_MODULES environment variable. It will be used for any + following 'ssl-provider' option or until a new 'ssl-provider-path' is + defined. + See also "ssl-provider". + +ssl-load-extra-del-ext + This setting allows to configure the way HAProxy does the lookup for the + extra SSL files. By default HAProxy adds a new extension to the filename. + (ex: with "foobar.crt" load "foobar.crt.key"). With this option enabled, + HAProxy removes the extension before adding the new one (ex: with + "foobar.crt" load "foobar.key"). + + Your crt file must have a ".crt" extension for this option to work. + + This option is not compatible with bundle extensions (.ecdsa, .rsa. .dsa) + and won't try to remove them. + + This option is disabled by default. See also "ssl-load-extra-files". + +ssl-load-extra-files <none|all|bundle|sctl|ocsp|issuer|key>* + This setting alters the way HAProxy will look for unspecified files during + the loading of the SSL certificates. This option applies to certificates + associated to "bind" lines as well as "server" lines but some of the extra + files will not have any functional impact for "server" line certificates. + + By default, HAProxy discovers automatically a lot of files not specified in + the configuration, and you may want to disable this behavior if you want to + optimize the startup time. + + "none": Only load the files specified in the configuration. Don't try to load + a certificate bundle if the file does not exist. In the case of a directory, + it won't try to bundle the certificates if they have the same basename. + + "all": This is the default behavior, it will try to load everything, + bundles, sctl, ocsp, issuer, key. + + "bundle": When a file specified in the configuration does not exist, HAProxy + will try to load a "cert bundle". Certificate bundles are only managed on the + frontend side and will not work for backend certificates. + + Starting from HAProxy 2.3, the bundles are not loaded in the same OpenSSL + certificate store, instead it will loads each certificate in a separate + store which is equivalent to declaring multiple "crt". OpenSSL 1.1.1 is + required to achieve this. Which means that bundles are now used only for + backward compatibility and are not mandatory anymore to do an hybrid RSA/ECC + bind configuration. + + To associate these PEM files into a "cert bundle" that is recognized by + HAProxy, they must be named in the following way: All PEM files that are to + be bundled must have the same base name, with a suffix indicating the key + type. Currently, three suffixes are supported: rsa, dsa and ecdsa. For + example, if www.example.com has two PEM files, an RSA file and an ECDSA + file, they must be named: "example.pem.rsa" and "example.pem.ecdsa". The + first part of the filename is arbitrary; only the suffix matters. To load + this bundle into HAProxy, specify the base name only: + + Example : bind :8443 ssl crt example.pem + + Note that the suffix is not given to HAProxy; this tells HAProxy to look for + a cert bundle. + + HAProxy will load all PEM files in the bundle as if they were configured + separately in several "crt". + + The bundle loading does not have an impact anymore on the directory loading + since files are loading separately. + + On the CLI, bundles are seen as separate files, and the bundle extension is + required to commit them. + + OCSP files (.ocsp), issuer files (.issuer), Certificate Transparency (.sctl) + as well as private keys (.key) are supported with multi-cert bundling. + + "sctl": Try to load "<basename>.sctl" for each crt keyword. If provided for + a backend certificate, it will be loaded but will not have any functional + impact. + + "ocsp": Try to load "<basename>.ocsp" for each crt keyword. If provided for + a backend certificate, it will be loaded but will not have any functional + impact. + + "issuer": Try to load "<basename>.issuer" if the issuer of the OCSP file is + not provided in the PEM file. If provided for a backend certificate, it will + be loaded but will not have any functional impact. + + "key": If the private key was not provided by the PEM file, try to load a + file "<basename>.key" containing a private key. + + The default behavior is "all". + + Example: + ssl-load-extra-files bundle sctl + ssl-load-extra-files sctl ocsp issuer + ssl-load-extra-files none + + See also: "crt", section 5.1 about bind options and section 5.2 about server + options. + +ssl-server-verify [none|required] + The default behavior for SSL verify on servers side. If specified to 'none', + servers certificates are not verified. The default is 'required' except if + forced using cmdline option '-dV'. + +ssl-skip-self-issued-ca + Self issued CA, aka x509 root CA, is the anchor for chain validation: as a + server is useless to send it, client must have it. Standard configuration + need to not include such CA in PEM file. This option allows you to keep such + CA in PEM file without sending it to the client. Use case is to provide + issuer for ocsp without the need for '.issuer' file and be able to share it + with 'issuers-chain-path'. This concerns all certificates without intermediate + certificates. It's useless for BoringSSL, .issuer is ignored because ocsp + bits does not need it. Requires at least OpenSSL 1.0.2. + +stats maxconn <connections> + By default, the stats socket is limited to 10 concurrent connections. It is + possible to change this value with "stats maxconn". + +stats socket [<address:port>|<path>] [param*] + Binds a UNIX socket to <path> or a TCPv4/v6 address to <address:port>. + Connections to this socket will return various statistics outputs and even + allow some commands to be issued to change some runtime settings. Please + consult section 9.3 "Unix Socket commands" of Management Guide for more + details. + + All parameters supported by "bind" lines are supported, for instance to + restrict access to some users or their access rights. Please consult + section 5.1 for more information. + +stats timeout <timeout, in milliseconds> + The default timeout on the stats socket is set to 10 seconds. It is possible + to change this value with "stats timeout". The value must be passed in + milliseconds, or be suffixed by a time unit among { us, ms, s, m, h, d }. + +strict-limits + Makes process fail at startup when a setrlimit fails. HAProxy tries to set the + best setrlimit according to what has been calculated. If it fails, it will + emit a warning. This option is here to guarantee an explicit failure of + HAProxy when those limits fail. It is enabled by default. It may still be + forcibly disabled by prefixing it with the "no" keyword. + +thread-group <group> [<thread-range>...] + This setting is only available when support for threads was built in. It + enumerates the list of threads that will compose thread group <group>. + Thread numbers and group numbers start at 1. Thread ranges are defined either + using a single thread number at once, or by specifying the lower and upper + bounds delimited by a dash '-' (e.g. "1-16"). Unassigned threads will be + automatically assigned to unassigned thread groups, and thread groups + defined with this directive will never receive more threads than those + defined. Defining the same group multiple times overrides previous + definitions with the new one. See also "nbthread" and "thread-groups". + +thread-groups <number> + This setting is only available when support for threads was built in. It + makes HAProxy split its threads into <number> independent groups. At the + moment, the limit is 1 and is also the default value. See also "nbthread". + +uid <number> + Changes the process's user ID to <number>. It is recommended that the user ID + is dedicated to HAProxy or to a small set of similar daemons. HAProxy must + be started with superuser privileges in order to be able to switch to another + one. See also "gid" and "user". + +ulimit-n <number> + Sets the maximum number of per-process file-descriptors to <number>. By + default, it is automatically computed, so it is recommended not to use this + option. If the intent is only to limit the number of file descriptors, better + use "fd-hard-limit" instead. + + Note that the dynamic servers are not taken into account in this automatic + resource calculation. If using a large number of them, it may be needed to + manually specify this value. + + See also: fd-hard-limit, maxconn + +unix-bind [ prefix <prefix> ] [ mode <mode> ] [ user <user> ] [ uid <uid> ] + [ group <group> ] [ gid <gid> ] + + Fixes common settings to UNIX listening sockets declared in "bind" statements. + This is mainly used to simplify declaration of those UNIX sockets and reduce + the risk of errors, since those settings are most commonly required but are + also process-specific. The <prefix> setting can be used to force all socket + path to be relative to that directory. This might be needed to access another + component's chroot. Note that those paths are resolved before HAProxy chroots + itself, so they are absolute. The <mode>, <user>, <uid>, <group> and <gid> + all have the same meaning as their homonyms used by the "bind" statement. If + both are specified, the "bind" statement has priority, meaning that the + "unix-bind" settings may be seen as process-wide default settings. + +unsetenv [<name> ...] + Removes environment variables specified in arguments. This can be useful to + hide some sensitive information that are occasionally inherited from the + user's environment during some operations. Variables which did not exist are + silently ignored so that after the operation, it is certain that none of + these variables remain. The changes immediately take effect so that the next + line in the configuration file will not see these variables. See also + "setenv", "presetenv", and "resetenv". + +user <user name> + Similar to "uid" but uses the UID of user name <user name> from /etc/passwd. + See also "uid" and "group". + +node <name> + Only letters, digits, hyphen and underscore are allowed, like in DNS names. + + This statement is useful in HA configurations where two or more processes or + servers share the same IP address. By setting a different node-name on all + nodes, it becomes easy to immediately spot what server is handling the + traffic. + +wurfl-cache-size <size> + Sets the WURFL Useragent cache size. For faster lookups, already processed user + agents are kept in a LRU cache : + - "0" : no cache is used. + - <size> : size of lru cache in elements. + + Please note that this option is only available when HAProxy has been compiled + with USE_WURFL=1. + +wurfl-data-file <file path> + The path of the WURFL data file to provide device detection services. The + file should be accessible by HAProxy with relevant permissions. + + Please note that this option is only available when HAProxy has been compiled + with USE_WURFL=1. + +wurfl-information-list [<capability>]* + A space-delimited list of WURFL capabilities, virtual capabilities, property + names we plan to use in injected headers. A full list of capability and + virtual capability names is available on the Scientiamobile website : + + https://www.scientiamobile.com/wurflCapability + + Valid WURFL properties are: + - wurfl_id Contains the device ID of the matched device. + + - wurfl_root_id Contains the device root ID of the matched + device. + + - wurfl_isdevroot Tells if the matched device is a root device. + Possible values are "TRUE" or "FALSE". + + - wurfl_useragent The original useragent coming with this + particular web request. + + - wurfl_api_version Contains a string representing the currently + used Libwurfl API version. + + - wurfl_info A string containing information on the parsed + wurfl.xml and its full path. + + - wurfl_last_load_time Contains the UNIX timestamp of the last time + WURFL has been loaded successfully. + + - wurfl_normalized_useragent The normalized useragent. + + Please note that this option is only available when HAProxy has been compiled + with USE_WURFL=1. + +wurfl-information-list-separator <char> + A char that will be used to separate values in a response header containing + WURFL results. If not set that a comma (',') will be used by default. + + Please note that this option is only available when HAProxy has been compiled + with USE_WURFL=1. + +wurfl-patch-file [<file path>] + A list of WURFL patch file paths. Note that patches are loaded during startup + thus before the chroot. + + Please note that this option is only available when HAProxy has been compiled + with USE_WURFL=1. + +3.2. Performance tuning +----------------------- + +busy-polling + In some situations, especially when dealing with low latency on processors + supporting a variable frequency or when running inside virtual machines, each + time the process waits for an I/O using the poller, the processor goes back + to sleep or is offered to another VM for a long time, and it causes + excessively high latencies. This option provides a solution preventing the + processor from sleeping by always using a null timeout on the pollers. This + results in a significant latency reduction (30 to 100 microseconds observed) + at the expense of a risk to overheat the processor. It may even be used with + threads, in which case improperly bound threads may heavily conflict, + resulting in a worse performance and high values for the CPU stolen fields + in "show info" output, indicating which threads are misconfigured. It is + important not to let the process run on the same processor as the network + interrupts when this option is used. It is also better to avoid using it on + multiple CPU threads sharing the same core. This option is disabled by + default. If it has been enabled, it may still be forcibly disabled by + prefixing it with the "no" keyword. It is ignored by the "select" and + "poll" pollers. + + This option is automatically disabled on old processes in the context of + seamless reload; it avoids too much cpu conflicts when multiple processes + stay around for some time waiting for the end of their current connections. + +max-spread-checks <delay in milliseconds> + By default, HAProxy tries to spread the start of health checks across the + smallest health check interval of all the servers in a farm. The principle is + to avoid hammering services running on the same server. But when using large + check intervals (10 seconds or more), the last servers in the farm take some + time before starting to be tested, which can be a problem. This parameter is + used to enforce an upper bound on delay between the first and the last check, + even if the servers' check intervals are larger. When servers run with + shorter intervals, their intervals will be respected though. + +maxcompcpuusage <number> + Sets the maximum CPU usage HAProxy can reach before stopping the compression + for new requests or decreasing the compression level of current requests. + It works like 'maxcomprate' but measures CPU usage instead of incoming data + bandwidth. The value is expressed in percent of the CPU used by HAProxy. A + value of 100 disable the limit. The default value is 100. Setting a lower + value will prevent the compression work from slowing the whole process down + and from introducing high latencies. + +maxcomprate <number> + Sets the maximum per-process input compression rate to <number> kilobytes + per second. For each session, if the maximum is reached, the compression + level will be decreased during the session. If the maximum is reached at the + beginning of a session, the session will not compress at all. If the maximum + is not reached, the compression level will be increased up to + tune.comp.maxlevel. A value of zero means there is no limit, this is the + default value. + +maxconn <number> + Sets the maximum per-process number of concurrent connections to <number>. It + is equivalent to the command-line argument "-n". Proxies will stop accepting + connections when this limit is reached. The "ulimit-n" parameter is + automatically adjusted according to this value. See also "ulimit-n". Note: + the "select" poller cannot reliably use more than 1024 file descriptors on + some platforms. If your platform only supports select and reports "select + FAILED" on startup, you need to reduce maxconn until it works (slightly + below 500 in general). If this value is not set, it will automatically be + calculated based on the current file descriptors limit reported by the + "ulimit -n" command, possibly reduced to a lower value if a memory limit + is enforced, based on the buffer size, memory allocated to compression, SSL + cache size, and use or not of SSL and the associated maxsslconn (which can + also be automatic). In any case, the fd-hard-limit applies if set. + + See also: fd-hard-limit, ulimit-n + +maxconnrate <number> + Sets the maximum per-process number of connections per second to <number>. + Proxies will stop accepting connections when this limit is reached. It can be + used to limit the global capacity regardless of each frontend capacity. It is + important to note that this can only be used as a service protection measure, + as there will not necessarily be a fair share between frontends when the + limit is reached, so it's a good idea to also limit each frontend to some + value close to its expected share. Also, lowering tune.maxaccept can improve + fairness. + +maxpipes <number> + Sets the maximum per-process number of pipes to <number>. Currently, pipes + are only used by kernel-based tcp splicing. Since a pipe contains two file + descriptors, the "ulimit-n" value will be increased accordingly. The default + value is maxconn/4, which seems to be more than enough for most heavy usages. + The splice code dynamically allocates and releases pipes, and can fall back + to standard copy, so setting this value too low may only impact performance. + +maxsessrate <number> + Sets the maximum per-process number of sessions per second to <number>. + Proxies will stop accepting connections when this limit is reached. It can be + used to limit the global capacity regardless of each frontend capacity. It is + important to note that this can only be used as a service protection measure, + as there will not necessarily be a fair share between frontends when the + limit is reached, so it's a good idea to also limit each frontend to some + value close to its expected share. Also, lowering tune.maxaccept can improve + fairness. + +maxsslconn <number> + Sets the maximum per-process number of concurrent SSL connections to + <number>. By default there is no SSL-specific limit, which means that the + global maxconn setting will apply to all connections. Setting this limit + avoids having openssl use too much memory and crash when malloc returns NULL + (since it unfortunately does not reliably check for such conditions). Note + that the limit applies both to incoming and outgoing connections, so one + connection which is deciphered then ciphered accounts for 2 SSL connections. + If this value is not set, but a memory limit is enforced, this value will be + automatically computed based on the memory limit, maxconn, the buffer size, + memory allocated to compression, SSL cache size, and use of SSL in either + frontends, backends or both. If neither maxconn nor maxsslconn are specified + when there is a memory limit, HAProxy will automatically adjust these values + so that 100% of the connections can be made over SSL with no risk, and will + consider the sides where it is enabled (frontend, backend, both). + +maxsslrate <number> + Sets the maximum per-process number of SSL sessions per second to <number>. + SSL listeners will stop accepting connections when this limit is reached. It + can be used to limit the global SSL CPU usage regardless of each frontend + capacity. It is important to note that this can only be used as a service + protection measure, as there will not necessarily be a fair share between + frontends when the limit is reached, so it's a good idea to also limit each + frontend to some value close to its expected share. It is also important to + note that the sessions are accounted before they enter the SSL stack and not + after, which also protects the stack against bad handshakes. Also, lowering + tune.maxaccept can improve fairness. + +maxzlibmem <number> + Sets the maximum amount of RAM in megabytes per process usable by the zlib. + When the maximum amount is reached, future sessions will not compress as long + as RAM is unavailable. When sets to 0, there is no limit. + The default value is 0. The value is available in bytes on the UNIX socket + with "show info" on the line "MaxZlibMemUsage", the memory used by zlib is + "ZlibMemUsage" in bytes. + +no-memory-trimming + Disables memory trimming ("malloc_trim") at a few moments where attempts are + made to reclaim lots of memory (on memory shortage or on reload). Trimming + memory forces the system's allocator to scan all unused areas and to release + them. This is generally seen as nice action to leave more available memory to + a new process while the old one is unlikely to make significant use of it. + But some systems dealing with tens to hundreds of thousands of concurrent + connections may experience a lot of memory fragmentation, that may render + this release operation extremely long. During this time, no more traffic + passes through the process, new connections are not accepted anymore, some + health checks may even fail, and the watchdog may even trigger and kill the + unresponsive process, leaving a huge core dump. If this ever happens, then it + is suggested to use this option to disable trimming and stop trying to be + nice with the new process. Note that advanced memory allocators usually do + not suffer from such a problem. + +noepoll + Disables the use of the "epoll" event polling system on Linux. It is + equivalent to the command-line argument "-de". The next polling system + used will generally be "poll". See also "nopoll". + +noevports + Disables the use of the event ports event polling system on SunOS systems + derived from Solaris 10 and later. It is equivalent to the command-line + argument "-dv". The next polling system used will generally be "poll". See + also "nopoll". + +nogetaddrinfo + Disables the use of getaddrinfo(3) for name resolving. It is equivalent to + the command line argument "-dG". Deprecated gethostbyname(3) will be used. + +nokqueue + Disables the use of the "kqueue" event polling system on BSD. It is + equivalent to the command-line argument "-dk". The next polling system + used will generally be "poll". See also "nopoll". + +nopoll + Disables the use of the "poll" event polling system. It is equivalent to the + command-line argument "-dp". The next polling system used will be "select". + It should never be needed to disable "poll" since it's available on all + platforms supported by HAProxy. See also "nokqueue", "noepoll" and + "noevports". + +noreuseport + Disables the use of SO_REUSEPORT - see socket(7). It is equivalent to the + command line argument "-dR". + +nosplice + Disables the use of kernel tcp splicing between sockets on Linux. It is + equivalent to the command line argument "-dS". Data will then be copied + using conventional and more portable recv/send calls. Kernel tcp splicing is + limited to some very recent instances of kernel 2.6. Most versions between + 2.6.25 and 2.6.28 are buggy and will forward corrupted data, so they must not + be used. This option makes it easier to globally disable kernel splicing in + case of doubt. See also "option splice-auto", "option splice-request" and + "option splice-response". + +profiling.memory { on | off } + Enables ('on') or disables ('off') per-function memory profiling. This will + keep usage statistics of malloc/calloc/realloc/free calls anywhere in the + process (including libraries) which will be reported on the CLI using the + "show profiling" command. This is essentially meant to be used when an + abnormal memory usage is observed that cannot be explained by the pools and + other info are required. The performance hit will typically be around 1%, + maybe a bit more on highly threaded machines, so it is normally suitable for + use in production. The same may be achieved at run time on the CLI using the + "set profiling memory" command, please consult the management manual. + +profiling.tasks { auto | on | off } + Enables ('on') or disables ('off') per-task CPU profiling. When set to 'auto' + the profiling automatically turns on a thread when it starts to suffer from + an average latency of 1000 microseconds or higher as reported in the + "avg_loop_us" activity field, and automatically turns off when the latency + returns below 990 microseconds (this value is an average over the last 1024 + loops so it does not vary quickly and tends to significantly smooth short + spikes). It may also spontaneously trigger from time to time on overloaded + systems, containers, or virtual machines, or when the system swaps (which + must absolutely never happen on a load balancer). + + CPU profiling per task can be very convenient to report where the time is + spent and which requests have what effect on which other request. Enabling + it will typically affect the overall's performance by less than 1%, thus it + is recommended to leave it to the default 'auto' value so that it only + operates when a problem is identified. This feature requires a system + supporting the clock_gettime(2) syscall with clock identifiers + CLOCK_MONOTONIC and CLOCK_THREAD_CPUTIME_ID, otherwise the reported time will + be zero. This option may be changed at run time using "set profiling" on the + CLI. + +spread-checks <0..50, in percent> + Sometimes it is desirable to avoid sending agent and health checks to + servers at exact intervals, for instance when many logical servers are + located on the same physical server. With the help of this parameter, it + becomes possible to add some randomness in the check interval between 0 + and +/- 50%. A value between 2 and 5 seems to show good results. The + default value remains at 0. + +ssl-engine <name> [algo <comma-separated list of algorithms>] + Sets the OpenSSL engine to <name>. List of valid values for <name> may be + obtained using the command "openssl engine". This statement may be used + multiple times, it will simply enable multiple crypto engines. Referencing an + unsupported engine will prevent HAProxy from starting. Note that many engines + will lead to lower HTTPS performance than pure software with recent + processors. The optional command "algo" sets the default algorithms an ENGINE + will supply using the OPENSSL function ENGINE_set_default_string(). A value + of "ALL" uses the engine for all cryptographic operations. If no list of + algo is specified then the value of "ALL" is used. A comma-separated list + of different algorithms may be specified, including: RSA, DSA, DH, EC, RAND, + CIPHERS, DIGESTS, PKEY, PKEY_CRYPTO, PKEY_ASN1. This is the same format that + openssl configuration file uses: + https://www.openssl.org/docs/man1.0.2/apps/config.html + +ssl-mode-async + Adds SSL_MODE_ASYNC mode to the SSL context. This enables asynchronous TLS + I/O operations if asynchronous capable SSL engines are used. The current + implementation supports a maximum of 32 engines. The Openssl ASYNC API + doesn't support moving read/write buffers and is not compliant with + HAProxy's buffer management. So the asynchronous mode is disabled on + read/write operations (it is only enabled during initial and renegotiation + handshakes). + +tune.buffers.limit <number> + Sets a hard limit on the number of buffers which may be allocated per process. + The default value is zero which means unlimited. The minimum non-zero value + will always be greater than "tune.buffers.reserve" and should ideally always + be about twice as large. Forcing this value can be particularly useful to + limit the amount of memory a process may take, while retaining a sane + behavior. When this limit is reached, sessions which need a buffer wait for + another one to be released by another session. Since buffers are dynamically + allocated and released, the waiting time is very short and not perceptible + provided that limits remain reasonable. In fact sometimes reducing the limit + may even increase performance by increasing the CPU cache's efficiency. Tests + have shown good results on average HTTP traffic with a limit to 1/10 of the + expected global maxconn setting, which also significantly reduces memory + usage. The memory savings come from the fact that a number of connections + will not allocate 2*tune.bufsize. It is best not to touch this value unless + advised to do so by an HAProxy core developer. + +tune.buffers.reserve <number> + Sets the number of buffers which are pre-allocated and reserved for use only + during memory shortage conditions resulting in failed memory allocations. The + minimum value is 2 and is also the default. There is no reason a user would + want to change this value, it's mostly aimed at HAProxy core developers. + +tune.bufsize <number> + Sets the buffer size to this size (in bytes). Lower values allow more + sessions to coexist in the same amount of RAM, and higher values allow some + applications with very large cookies to work. The default value is 16384 and + can be changed at build time. It is strongly recommended not to change this + from the default value, as very low values will break some services such as + statistics, and values larger than default size will increase memory usage, + possibly causing the system to run out of memory. At least the global maxconn + parameter should be decreased by the same factor as this one is increased. In + addition, use of HTTP/2 mandates that this value must be 16384 or more. If an + HTTP request is larger than (tune.bufsize - tune.maxrewrite), HAProxy will + return HTTP 400 (Bad Request) error. Similarly if an HTTP response is larger + than this size, HAProxy will return HTTP 502 (Bad Gateway). Note that the + value set using this parameter will automatically be rounded up to the next + multiple of 8 on 32-bit machines and 16 on 64-bit machines. + +tune.comp.maxlevel <number> + Sets the maximum compression level. The compression level affects CPU + usage during compression. This value affects CPU usage during compression. + Each session using compression initializes the compression algorithm with + this value. The default value is 1. + +tune.fail-alloc + If compiled with DEBUG_FAIL_ALLOC or started with "-dMfail", gives the + percentage of chances an allocation attempt fails. Must be between 0 (no + failure) and 100 (no success). This is useful to debug and make sure memory + failures are handled gracefully. + +tune.fd.edge-triggered { on | off } [ EXPERIMENTAL ] + Enables ('on') or disables ('off') the edge-triggered polling mode for FDs + that support it. This is currently only support with epoll. It may noticeably + reduce the number of epoll_ctl() calls and slightly improve performance in + certain scenarios. This is still experimental, it may result in frozen + connections if bugs are still present, and is disabled by default. + +tune.h2.header-table-size <number> + Sets the HTTP/2 dynamic header table size. It defaults to 4096 bytes and + cannot be larger than 65536 bytes. A larger value may help certain clients + send more compact requests, depending on their capabilities. This amount of + memory is consumed for each HTTP/2 connection. It is recommended not to + change it. + +tune.h2.initial-window-size <number> + Sets the HTTP/2 initial window size, which is the number of bytes the client + can upload before waiting for an acknowledgment from HAProxy. This setting + only affects payload contents (i.e. the body of POST requests), not headers. + The default value is 65535, which roughly allows up to 5 Mbps of upload + bandwidth per client over a network showing a 100 ms ping time, or 500 Mbps + over a 1-ms local network. It can make sense to increase this value to allow + faster uploads, or to reduce it to increase fairness when dealing with many + clients. It doesn't affect resource usage. + +tune.h2.max-concurrent-streams <number> + Sets the HTTP/2 maximum number of concurrent streams per connection (ie the + number of outstanding requests on a single connection). The default value is + 100. A larger one may slightly improve page load time for complex sites when + visited over high latency networks, but increases the amount of resources a + single client may allocate. A value of zero disables the limit so a single + client may create as many streams as allocatable by HAProxy. It is highly + recommended not to change this value. + +tune.h2.max-frame-size <number> + Sets the HTTP/2 maximum frame size that HAProxy announces it is willing to + receive to its peers. The default value is the largest between 16384 and the + buffer size (tune.bufsize). In any case, HAProxy will not announce support + for frame sizes larger than buffers. The main purpose of this setting is to + allow to limit the maximum frame size setting when using large buffers. Too + large frame sizes might have performance impact or cause some peers to + misbehave. It is highly recommended not to change this value. + +tune.http.cookielen <number> + Sets the maximum length of captured cookies. This is the maximum value that + the "capture cookie xxx len yyy" will be allowed to take, and any upper value + will automatically be truncated to this one. It is important not to set too + high a value because all cookie captures still allocate this size whatever + their configured value (they share a same pool). This value is per request + per response, so the memory allocated is twice this value per connection. + When not specified, the limit is set to 63 characters. It is recommended not + to change this value. + +tune.http.logurilen <number> + Sets the maximum length of request URI in logs. This prevents truncating long + request URIs with valuable query strings in log lines. This is not related + to syslog limits. If you increase this limit, you may also increase the + 'log ... len yyy' parameter. Your syslog daemon may also need specific + configuration directives too. + The default value is 1024. + +tune.http.maxhdr <number> + Sets the maximum number of headers in a request. When a request comes with a + number of headers greater than this value (including the first line), it is + rejected with a "400 Bad Request" status code. Similarly, too large responses + are blocked with "502 Bad Gateway". The default value is 101, which is enough + for all usages, considering that the widely deployed Apache server uses the + same limit. It can be useful to push this limit further to temporarily allow + a buggy application to work by the time it gets fixed. The accepted range is + 1..32767. Keep in mind that each new header consumes 32bits of memory for + each session, so don't push this limit too high. + +tune.idle-pool.shared { on | off } + Enables ('on') or disables ('off') sharing of idle connection pools between + threads for a same server. The default is to share them between threads in + order to minimize the number of persistent connections to a server, and to + optimize the connection reuse rate. But to help with debugging or when + suspecting a bug in HAProxy around connection reuse, it can be convenient to + forcefully disable this idle pool sharing between multiple threads, and force + this option to "off". The default is on. It is strongly recommended against + disabling this option without setting a conservative value on "pool-low-conn" + for all servers relying on connection reuse to achieve a high performance + level, otherwise connections might be closed very often as the thread count + increases. + +tune.idletimer <timeout> + Sets the duration after which HAProxy will consider that an empty buffer is + probably associated with an idle stream. This is used to optimally adjust + some packet sizes while forwarding large and small data alternatively. The + decision to use splice() or to send large buffers in SSL is modulated by this + parameter. The value is in milliseconds between 0 and 65535. A value of zero + means that HAProxy will not try to detect idle streams. The default is 1000, + which seems to correctly detect end user pauses (e.g. read a page before + clicking). There should be no reason for changing this value. Please check + tune.ssl.maxrecord below. + +tune.listener.multi-queue { on | off } + Enables ('on') or disables ('off') the listener's multi-queue accept which + spreads the incoming traffic to all threads a "bind" line is allowed to run + on instead of taking them for itself. This provides a smoother traffic + distribution and scales much better, especially in environments where threads + may be unevenly loaded due to external activity (network interrupts colliding + with one thread for example). This option is enabled by default, but it may + be forcefully disabled for troubleshooting or for situations where it is + estimated that the operating system already provides a good enough + distribution and connections are extremely short-lived. + +tune.lua.forced-yield <number> + This directive forces the Lua engine to execute a yield each <number> of + instructions executed. This permits interrupting a long script and allows the + HAProxy scheduler to process other tasks like accepting connections or + forwarding traffic. The default value is 10000 instructions. If HAProxy often + executes some Lua code but more responsiveness is required, this value can be + lowered. If the Lua code is quite long and its result is absolutely required + to process the data, the <number> can be increased. + +tune.lua.maxmem + Sets the maximum amount of RAM in megabytes per process usable by Lua. By + default it is zero which means unlimited. It is important to set a limit to + ensure that a bug in a script will not result in the system running out of + memory. + +tune.lua.session-timeout <timeout> + This is the execution timeout for the Lua sessions. This is useful for + preventing infinite loops or spending too much time in Lua. This timeout + counts only the pure Lua runtime. If the Lua does a sleep, the sleep is + not taken in account. The default timeout is 4s. + +tune.lua.service-timeout <timeout> + This is the execution timeout for the Lua services. This is useful for + preventing infinite loops or spending too much time in Lua. This timeout + counts only the pure Lua runtime. If the Lua does a sleep, the sleep is + not taken in account. The default timeout is 4s. + +tune.lua.task-timeout <timeout> + Purpose is the same as "tune.lua.session-timeout", but this timeout is + dedicated to the tasks. By default, this timeout isn't set because a task may + remain alive during of the lifetime of HAProxy. For example, a task used to + check servers. + +tune.maxaccept <number> + Sets the maximum number of consecutive connections a process may accept in a + row before switching to other work. In single process mode, higher numbers + used to give better performance at high connection rates, though this is not + the case anymore with the multi-queue. This value applies individually to + each listener, so that the number of processes a listener is bound to is + taken into account. This value defaults to 4 which showed best results. If a + significantly higher value was inherited from an ancient config, it might be + worth removing it as it will both increase performance and lower response + time. In multi-process mode, it is divided by twice the number of processes + the listener is bound to. Setting this value to -1 completely disables the + limitation. It should normally not be needed to tweak this value. + +tune.maxpollevents <number> + Sets the maximum amount of events that can be processed at once in a call to + the polling system. The default value is adapted to the operating system. It + has been noticed that reducing it below 200 tends to slightly decrease + latency at the expense of network bandwidth, and increasing it above 200 + tends to trade latency for slightly increased bandwidth. + +tune.maxrewrite <number> + Sets the reserved buffer space to this size in bytes. The reserved space is + used for header rewriting or appending. The first reads on sockets will never + fill more than bufsize-maxrewrite. Historically it has defaulted to half of + bufsize, though that does not make much sense since there are rarely large + numbers of headers to add. Setting it too high prevents processing of large + requests or responses. Setting it too low prevents addition of new headers + to already large requests or to POST requests. It is generally wise to set it + to about 1024. It is automatically readjusted to half of bufsize if it is + larger than that. This means you don't have to worry about it when changing + bufsize. + +tune.pattern.cache-size <number> + Sets the size of the pattern lookup cache to <number> entries. This is an LRU + cache which reminds previous lookups and their results. It is used by ACLs + and maps on slow pattern lookups, namely the ones using the "sub", "reg", + "dir", "dom", "end", "bin" match methods as well as the case-insensitive + strings. It applies to pattern expressions which means that it will be able + to memorize the result of a lookup among all the patterns specified on a + configuration line (including all those loaded from files). It automatically + invalidates entries which are updated using HTTP actions or on the CLI. The + default cache size is set to 10000 entries, which limits its footprint to + about 5 MB per process/thread on 32-bit systems and 8 MB per process/thread + on 64-bit systems, as caches are thread/process local. There is a very low + risk of collision in this cache, which is in the order of the size of the + cache divided by 2^64. Typically, at 10000 requests per second with the + default cache size of 10000 entries, there's 1% chance that a brute force + attack could cause a single collision after 60 years, or 0.1% after 6 years. + This is considered much lower than the risk of a memory corruption caused by + aging components. If this is not acceptable, the cache can be disabled by + setting this parameter to 0. + +tune.peers.max-updates-at-once <number> + Sets the maximum number of stick-table updates that haproxy will try to + process at once when sending messages. Retrieving the data for these updates + requires some locking operations which can be CPU intensive on highly + threaded machines if unbound, and may also increase the traffic latency + during the initial batched transfer between an older and a newer process. + Conversely low values may also incur higher CPU overhead, and take longer + to complete. The default value is 200 and it is suggested not to change it. + +tune.pipesize <number> + Sets the kernel pipe buffer size to this size (in bytes). By default, pipes + are the default size for the system. But sometimes when using TCP splicing, + it can improve performance to increase pipe sizes, especially if it is + suspected that pipes are not filled and that many calls to splice() are + performed. This has an impact on the kernel's memory footprint, so this must + not be changed if impacts are not understood. + +tune.pool-high-fd-ratio <number> + This setting sets the max number of file descriptors (in percentage) used by + HAProxy globally against the maximum number of file descriptors HAProxy can + use before we start killing idle connections when we can't reuse a connection + and we have to create a new one. The default is 25 (one quarter of the file + descriptor will mean that roughly half of the maximum front connections can + keep an idle connection behind, anything beyond this probably doesn't make + much sense in the general case when targeting connection reuse). + +tune.pool-low-fd-ratio <number> + This setting sets the max number of file descriptors (in percentage) used by + HAProxy globally against the maximum number of file descriptors HAProxy can + use before we stop putting connection into the idle pool for reuse. The + default is 20. + +tune.quic.frontend.conn-tx-buffers.limit <number> + Warning: QUIC support in HAProxy is currently experimental. Configuration may + change without deprecation in the future. + + This settings defines the maximum number of buffers allocated for a QUIC + connection on data emission. By default, it is set to 30. QUIC buffers are + drained on ACK reception. This setting has a direct impact on the throughput + and memory consumption and can be adjusted according to an estimated round + time-trip. Each buffer is tune.bufsize. + +tune.quic.frontend.max-idle-timeout <timeout> + Warning: QUIC support in HAProxy is currently experimental. Configuration may + change without deprecation in the future. + + Sets the QUIC max_idle_timeout transport parameters in milliseconds for + frontends which determines the period of time after which a connection silently + closes if it has remained inactive during an effective period of time deduced + from the two max_idle_timeout values announced by the two endpoints: + - the minimum of the two values if both are not null, + - the maximum if only one of them is not null, + - if both values are null, this feature is disabled. + + The default value is 30000. + +tune.quic.frontend.max-streams-bidi <number> + Warning: QUIC support in HAProxy is currently experimental. Configuration may + change without deprecation in the future. + + Sets the QUIC initial_max_streams_bidi transport parameter for frontends. + This is the initial maximum number of bidirectional streams the remote peer + will be authorized to open. This determines the number of concurrent client + requests. + + The default value is 100. + +tune.quic.retry-threshold <number> + Warning: QUIC support in HAProxy is currently experimental. Configuration may + change without deprecation in the future. + + Dynamically enables the Retry feature for all the configured QUIC listeners + as soon as this number of half open connections is reached. A half open + connection is a connection whose handshake has not already successfully + completed or failed. To be functional this setting needs a cluster secret to + be set, if not it will be silently ignored (see "cluster-secret" setting). + This setting will be also silently ignored if the use of QUIC Retry was + forced (see "quic-force-retry"). + + The default value is 100. + + See https://www.rfc-editor.org/rfc/rfc9000.html#section-8.1.2 for more + information about QUIC retry. + +tune.rcvbuf.client <number> +tune.rcvbuf.server <number> + Forces the kernel socket receive buffer size on the client or the server side + to the specified value in bytes. This value applies to all TCP/HTTP frontends + and backends. It should normally never be set, and the default size (0) lets + the kernel auto-tune this value depending on the amount of available memory. + However it can sometimes help to set it to very low values (e.g. 4096) in + order to save kernel memory by preventing it from buffering too large amounts + of received data. Lower values will significantly increase CPU usage though. + +tune.recv_enough <number> + HAProxy uses some hints to detect that a short read indicates the end of the + socket buffers. One of them is that a read returns more than <recv_enough> + bytes, which defaults to 10136 (7 segments of 1448 each). This default value + may be changed by this setting to better deal with workloads involving lots + of short messages such as telnet or SSH sessions. + +tune.runqueue-depth <number> + Sets the maximum amount of task that can be processed at once when running + tasks. The default value depends on the number of threads but sits between 35 + and 280, which tend to show the highest request rates and lowest latencies. + Increasing it may incur latency when dealing with I/Os, making it too small + can incur extra overhead. Higher thread counts benefit from lower values. + When experimenting with much larger values, it may be useful to also enable + tune.sched.low-latency and possibly tune.fd.edge-triggered to limit the + maximum latency to the lowest possible. + +tune.sched.low-latency { on | off } + Enables ('on') or disables ('off') the low-latency task scheduler. By default + HAProxy processes tasks from several classes one class at a time as this is + the most efficient. But when running with large values of tune.runqueue-depth + this can have a measurable effect on request or connection latency. When this + low-latency setting is enabled, tasks of lower priority classes will always + be executed before other ones if they exist. This will permit to lower the + maximum latency experienced by new requests or connections in the middle of + massive traffic, at the expense of a higher impact on this large traffic. + For regular usage it is better to leave this off. The default value is off. + +tune.sndbuf.client <number> +tune.sndbuf.server <number> + Forces the kernel socket send buffer size on the client or the server side to + the specified value in bytes. This value applies to all TCP/HTTP frontends + and backends. It should normally never be set, and the default size (0) lets + the kernel auto-tune this value depending on the amount of available memory. + However it can sometimes help to set it to very low values (e.g. 4096) in + order to save kernel memory by preventing it from buffering too large amounts + of received data. Lower values will significantly increase CPU usage though. + Another use case is to prevent write timeouts with extremely slow clients due + to the kernel waiting for a large part of the buffer to be read before + notifying HAProxy again. + +tune.ssl.cachesize <number> + Sets the size of the global SSL session cache, in a number of blocks. A block + is large enough to contain an encoded session without peer certificate. An + encoded session with peer certificate is stored in multiple blocks depending + on the size of the peer certificate. A block uses approximately 200 bytes of + memory (based on `sizeof(struct sh_ssl_sess_hdr) + SHSESS_BLOCK_MIN_SIZE` + calculation used for `shctx_init` function). The default value may be forced + at build time, otherwise defaults to 20000. When the cache is full, the most + idle entries are purged and reassigned. Higher values reduce the occurrence + of such a purge, hence the number of CPU-intensive SSL handshakes by ensuring + that all users keep their session as long as possible. All entries are + pre-allocated upon startup. Setting this value to 0 disables the SSL session + cache. + +tune.ssl.capture-buffer-size <number> +tune.ssl.capture-cipherlist-size <number> (deprecated) + Sets the maximum size of the buffer used for capturing client hello cipher + list, extensions list, elliptic curves list and elliptic curve point + formats. If the value is 0 (default value) the capture is disabled, + otherwise a buffer is allocated for each SSL/TLS connection. + +tune.ssl.default-dh-param <number> + Sets the maximum size of the Diffie-Hellman parameters used for generating + the ephemeral/temporary Diffie-Hellman key in case of DHE key exchange. The + final size will try to match the size of the server's RSA (or DSA) key (e.g, + a 2048 bits temporary DH key for a 2048 bits RSA key), but will not exceed + this maximum value. Only 1024 or higher values are allowed. Higher values + will increase the CPU load, and values greater than 1024 bits are not + supported by Java 7 and earlier clients. This value is not used if static + Diffie-Hellman parameters are supplied either directly in the certificate + file or by using the ssl-dh-param-file parameter. + If there is neither a default-dh-param nor a ssl-dh-param-file defined, and + if the server's PEM file of a given frontend does not specify its own DH + parameters, then DHE ciphers will be unavailable for this frontend. + +tune.ssl.force-private-cache + This option disables SSL session cache sharing between all processes. It + should normally not be used since it will force many renegotiations due to + clients hitting a random process. But it may be required on some operating + systems where none of the SSL cache synchronization method may be used. In + this case, adding a first layer of hash-based load balancing before the SSL + layer might limit the impact of the lack of session sharing. + +tune.ssl.hard-maxrecord <number> + Sets the maximum amount of bytes passed to SSL_write() at any time. Default + value 0 means there is no limit. In contrast to tune.ssl.maxrecord this + settings will not be adjusted dynamically. Smaller records may decrease + throughput, but may be required when dealing with low-footprint clients. + +tune.ssl.keylog { on | off } + This option activates the logging of the TLS keys. It should be used with + care as it will consume more memory per SSL session and could decrease + performances. This is disabled by default. + + These sample fetches should be used to generate the SSLKEYLOGFILE that is + required to decipher traffic with wireshark. + + https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/Key_Log_Format + + The SSLKEYLOG is a series of lines which are formatted this way: + + <Label> <space> <ClientRandom> <space> <Secret> + + The ClientRandom is provided by the %[ssl_fc_client_random,hex] sample + fetch, the secret and the Label could be find in the array below. You need + to generate a SSLKEYLOGFILE with all the labels in this array. + + The following sample fetches are hexadecimal strings and does not need to be + converted. + + SSLKEYLOGFILE Label | Sample fetches for the Secrets + --------------------------------|----------------------------------------- + CLIENT_EARLY_TRAFFIC_SECRET | %[ssl_fc_client_early_traffic_secret] + CLIENT_HANDSHAKE_TRAFFIC_SECRET | %[ssl_fc_client_handshake_traffic_secret] + SERVER_HANDSHAKE_TRAFFIC_SECRET | %[ssl_fc_server_handshake_traffic_secret] + CLIENT_TRAFFIC_SECRET_0 | %[ssl_fc_client_traffic_secret_0] + SERVER_TRAFFIC_SECRET_0 | %[ssl_fc_server_traffic_secret_0] + EXPORTER_SECRET | %[ssl_fc_exporter_secret] + EARLY_EXPORTER_SECRET | %[ssl_fc_early_exporter_secret] + + This is only available with OpenSSL 1.1.1, and useful with TLS1.3 session. + + If you want to generate the content of a SSLKEYLOGFILE with TLS < 1.3, you + only need this line: + + "CLIENT_RANDOM %[ssl_fc_client_random,hex] %[ssl_fc_session_key,hex]" + +tune.ssl.lifetime <timeout> + Sets how long a cached SSL session may remain valid. This time is expressed + in seconds and defaults to 300 (5 min). It is important to understand that it + does not guarantee that sessions will last that long, because if the cache is + full, the longest idle sessions will be purged despite their configured + lifetime. The real usefulness of this setting is to prevent sessions from + being used for too long. + +tune.ssl.maxrecord <number> + Sets the maximum amount of bytes passed to SSL_write() at the beginning of + the data transfer. Default value 0 means there is no limit. Over SSL/TLS, + the client can decipher the data only once it has received a full record. + With large records, it means that clients might have to download up to 16kB + of data before starting to process them. Limiting the value can improve page + load times on browsers located over high latency or low bandwidth networks. + It is suggested to find optimal values which fit into 1 or 2 TCP segments + (generally 1448 bytes over Ethernet with TCP timestamps enabled, or 1460 when + timestamps are disabled), keeping in mind that SSL/TLS add some overhead. + Typical values of 1419 and 2859 gave good results during tests. Use + "strace -e trace=write" to find the best value. HAProxy will automatically + switch to this setting after an idle stream has been detected (see + tune.idletimer above). See also tune.ssl.hard-maxrecord. + +tune.ssl.ssl-ctx-cache-size <number> + Sets the size of the cache used to store generated certificates to <number> + entries. This is a LRU cache. Because generating a SSL certificate + dynamically is expensive, they are cached. The default cache size is set to + 1000 entries. + +tune.vars.global-max-size <size> +tune.vars.proc-max-size <size> +tune.vars.reqres-max-size <size> +tune.vars.sess-max-size <size> +tune.vars.txn-max-size <size> + These five tunes help to manage the maximum amount of memory used by the + variables system. "global" limits the overall amount of memory available for + all scopes. "proc" limits the memory for the process scope, "sess" limits the + memory for the session scope, "txn" for the transaction scope, and "reqres" + limits the memory for each request or response processing. + Memory accounting is hierarchical, meaning more coarse grained limits include + the finer grained ones: "proc" includes "sess", "sess" includes "txn", and + "txn" includes "reqres". + + For example, when "tune.vars.sess-max-size" is limited to 100, + "tune.vars.txn-max-size" and "tune.vars.reqres-max-size" cannot exceed + 100 either. If we create a variable "txn.var" that contains 100 bytes, + all available space is consumed. + Notice that exceeding the limits at runtime will not result in an error + message, but values might be cut off or corrupted. So make sure to accurately + plan for the amount of space needed to store all your variables. + +tune.zlib.memlevel <number> + Sets the memLevel parameter in zlib initialization for each session. It + defines how much memory should be allocated for the internal compression + state. A value of 1 uses minimum memory but is slow and reduces compression + ratio, a value of 9 uses maximum memory for optimal speed. Can be a value + between 1 and 9. The default value is 8. + +tune.zlib.windowsize <number> + Sets the window size (the size of the history buffer) as a parameter of the + zlib initialization for each session. Larger values of this parameter result + in better compression at the expense of memory usage. Can be a value between + 8 and 15. The default value is 15. + +3.3. Debugging +-------------- + +quiet + Do not display any message during startup. It is equivalent to the command- + line argument "-q". + +zero-warning + When this option is set, HAProxy will refuse to start if any warning was + emitted while processing the configuration. It is highly recommended to set + this option on configurations that are not changed often, as it helps detect + subtle mistakes and keep the configuration clean and forward-compatible. Note + that "haproxy -c" will also report errors in such a case. This option is + equivalent to command line argument "-dW". + + +3.4. Userlists +-------------- +It is possible to control access to frontend/backend/listen sections or to +http stats by allowing only authenticated and authorized users. To do this, +it is required to create at least one userlist and to define users. + +userlist <listname> + Creates new userlist with name <listname>. Many independent userlists can be + used to store authentication & authorization data for independent customers. + +group <groupname> [users <user>,<user>,(...)] + Adds group <groupname> to the current userlist. It is also possible to + attach users to this group by using a comma separated list of names + proceeded by "users" keyword. + +user <username> [password|insecure-password <password>] + [groups <group>,<group>,(...)] + Adds user <username> to the current userlist. Both secure (encrypted) and + insecure (unencrypted) passwords can be used. Encrypted passwords are + evaluated using the crypt(3) function, so depending on the system's + capabilities, different algorithms are supported. For example, modern Glibc + based Linux systems support MD5, SHA-256, SHA-512, and, of course, the + classic DES-based method of encrypting passwords. + + Attention: Be aware that using encrypted passwords might cause significantly + increased CPU usage, depending on the number of requests, and the algorithm + used. For any of the hashed variants, the password for each request must + be processed through the chosen algorithm, before it can be compared to the + value specified in the config file. Most current algorithms are deliberately + designed to be expensive to compute to achieve resistance against brute + force attacks. They do not simply salt/hash the clear text password once, + but thousands of times. This can quickly become a major factor in HAProxy's + overall CPU consumption! + + Example: + userlist L1 + group G1 users tiger,scott + group G2 users xdb,scott + + user tiger password $6$k6y3o.eP$JlKBx9za9667qe4(...)xHSwRv6J.C0/D7cV91 + user scott insecure-password elgato + user xdb insecure-password hello + + userlist L2 + group G1 + group G2 + + user tiger password $6$k6y3o.eP$JlKBx(...)xHSwRv6J.C0/D7cV91 groups G1 + user scott insecure-password elgato groups G1,G2 + user xdb insecure-password hello groups G2 + + Please note that both lists are functionally identical. + + +3.5. Peers +---------- +It is possible to propagate entries of any data-types in stick-tables between +several HAProxy instances over TCP connections in a multi-master fashion. Each +instance pushes its local updates and insertions to remote peers. The pushed +values overwrite remote ones without aggregation. As an exception, the data +type "conn_cur" is never learned from peers, as it is supposed to reflect local +values. Earlier versions used to synchronize it and to cause negative values in +active-active setups, and always-growing values upon reloads or active-passive +switches because the local value would reflect more connections than locally +present. This information, however, is pushed so that monitoring systems can +watch it. + +Interrupted exchanges are automatically detected and recovered from the last +known point. In addition, during a soft restart, the old process connects to +the new one using such a TCP connection to push all its entries before the new +process tries to connect to other peers. That ensures very fast replication +during a reload, it typically takes a fraction of a second even for large +tables. + +Note that Server IDs are used to identify servers remotely, so it is important +that configurations look similar or at least that the same IDs are forced on +each server on all participants. + +peers <peersect> + Creates a new peer list with name <peersect>. It is an independent section, + which is referenced by one or more stick-tables. + +bind [<address>]:<port_range> [, ...] [param*] + Defines the binding parameters of the local peer of this "peers" section. + Such lines are not supported with "peer" line in the same "peers" section. + +disabled + Disables a peers section. It disables both listening and any synchronization + related to this section. This is provided to disable synchronization of stick + tables without having to comment out all "peers" references. + +default-bind [param*] + Defines the binding parameters for the local peer, excepted its address. + +default-server [param*] + Change default options for a server in a "peers" section. + + Arguments: + <param*> is a list of parameters for this server. The "default-server" + keyword accepts an important number of options and has a complete + section dedicated to it. In a peers section, the transport + parameters of a "default-server" line are supported. Please refer + to section 5 for more details, and the "server" keyword below in + this section for some of the restrictions. + + See also: "server" and section 5 about server options + +enabled + This re-enables a peers section which was previously disabled via the + "disabled" keyword. + +log <address> [len <length>] [format <format>] [sample <ranges>:<sample_size>] + <facility> [<level> [<minlevel>]] + "peers" sections support the same "log" keyword as for the proxies to + log information about the "peers" listener. See "log" option for proxies for + more details. + +peer <peername> <ip>:<port> [param*] + Defines a peer inside a peers section. + If <peername> is set to the local peer name (by default hostname, or forced + using "-L" command line option or "localpeer" global configuration setting), + HAProxy will listen for incoming remote peer connection on <ip>:<port>. + Otherwise, <ip>:<port> defines where to connect to in order to join the + remote peer, and <peername> is used at the protocol level to identify and + validate the remote peer on the server side. + + During a soft restart, local peer <ip>:<port> is used by the old instance to + connect the new one and initiate a complete replication (teaching process). + + It is strongly recommended to have the exact same peers declaration on all + peers and to only rely on the "-L" command line argument or the "localpeer" + global configuration setting to change the local peer name. This makes it + easier to maintain coherent configuration files across all peers. + + You may want to reference some environment variables in the address + parameter, see section 2.3 about environment variables. + + Note: "peer" keyword may transparently be replaced by "server" keyword (see + "server" keyword explanation below). + +server <peername> [<ip>:<port>] [param*] + As previously mentioned, "peer" keyword may be replaced by "server" keyword + with a support for all "server" parameters found in 5.2 paragraph that are + related to transport settings. If the underlying peer is local, <ip>:<port> + parameters must not be present; these parameters must be provided on a "bind" + line (see "bind" keyword of this "peers" section). + + A number of "server" parameters are irrelevant for "peers" sections. Peers by + nature do not support dynamic host name resolution nor health checks, hence + parameters like "init_addr", "resolvers", "check", "agent-check", or "track" + are not supported. Similarly, there is no load balancing nor stickiness, thus + parameters such as "weight" or "cookie" have no effect. + + Example: + # The old way. + peers mypeers + peer haproxy1 192.168.0.1:1024 + peer haproxy2 192.168.0.2:1024 + peer haproxy3 10.2.0.1:1024 + + backend mybackend + mode tcp + balance roundrobin + stick-table type ip size 20k peers mypeers + stick on src + + server srv1 192.168.0.30:80 + server srv2 192.168.0.31:80 + + Example: + peers mypeers + bind 192.168.0.1:1024 ssl crt mycerts/pem + default-server ssl verify none + server haproxy1 #local peer + server haproxy2 192.168.0.2:1024 + server haproxy3 10.2.0.1:1024 + + +table <tablename> type {ip | integer | string [len <length>] | binary [len <length>]} + size <size> [expire <expire>] [nopurge] [store <data_type>]* + + Configure a stickiness table for the current section. This line is parsed + exactly the same way as the "stick-table" keyword in others section, except + for the "peers" argument which is not required here and with an additional + mandatory first parameter to designate the stick-table. Contrary to others + sections, there may be several "table" lines in "peers" sections (see also + "stick-table" keyword). + + Also be aware of the fact that "peers" sections have their own stick-table + namespaces to avoid collisions between stick-table names identical in + different "peers" section. This is internally handled prepending the "peers" + sections names to the name of the stick-tables followed by a '/' character. + If somewhere else in the configuration file you have to refer to such + stick-tables declared in "peers" sections you must use the prefixed version + of the stick-table name as follows: + + peers mypeers + peer A ... + peer B ... + table t1 ... + + frontend fe1 + tcp-request content track-sc0 src table mypeers/t1 + + This is also this prefixed version of the stick-table names which must be + used to refer to stick-tables through the CLI. + + About "peers" protocol, as only "peers" belonging to the same section may + communicate with each others, there is no need to do such a distinction. + Several "peers" sections may declare stick-tables with the same name. + This is shorter version of the stick-table name which is sent over the network. + There is only a '/' character as prefix to avoid stick-table name collisions between + stick-tables declared as backends and stick-table declared in "peers" sections + as follows in this weird but supported configuration: + + peers mypeers + peer A ... + peer B ... + table t1 type string size 10m store gpc0 + + backend t1 + stick-table type string size 10m store gpc0 peers mypeers + + Here "t1" table declared in "mypeers" section has "mypeers/t1" as global name. + "t1" table declared as a backend as "t1" as global name. But at peer protocol + level the former table is named "/t1", the latter is again named "t1". + +3.6. Mailers +------------ +It is possible to send email alerts when the state of servers changes. +If configured email alerts are sent to each mailer that is configured +in a mailers section. Email is sent to mailers using SMTP. + +mailers <mailersect> + Creates a new mailer list with the name <mailersect>. It is an + independent section which is referenced by one or more proxies. + +mailer <mailername> <ip>:<port> + Defines a mailer inside a mailers section. + + Example: + mailers mymailers + mailer smtp1 192.168.0.1:587 + mailer smtp2 192.168.0.2:587 + + backend mybackend + mode tcp + balance roundrobin + + email-alert mailers mymailers + email-alert from test1@horms.org + email-alert to test2@horms.org + + server srv1 192.168.0.30:80 + server srv2 192.168.0.31:80 + +timeout mail <time> + Defines the time available for a mail/connection to be made and send to + the mail-server. If not defined the default value is 10 seconds. To allow + for at least two SYN-ACK packets to be send during initial TCP handshake it + is advised to keep this value above 4 seconds. + + Example: + mailers mymailers + timeout mail 20s + mailer smtp1 192.168.0.1:587 + +3.7. Programs +------------- +In master-worker mode, it is possible to launch external binaries with the +master, these processes are called programs. These programs are launched and +managed the same way as the workers. + +During a reload of HAProxy, those processes are dealing with the same +sequence as a worker: + + - the master is re-executed + - the master sends a SIGUSR1 signal to the program + - if "option start-on-reload" is not disabled, the master launches a new + instance of the program + +During a stop, or restart, a SIGTERM is sent to the programs. + +program <name> + This is a new program section, this section will create an instance <name> + which is visible in "show proc" on the master CLI. (See "9.4. Master CLI" in + the management guide). + +command <command> [arguments*] + Define the command to start with optional arguments. The command is looked + up in the current PATH if it does not include an absolute path. This is a + mandatory option of the program section. Arguments containing spaces must + be enclosed in quotes or double quotes or be prefixed by a backslash. + +user <user name> + Changes the executed command user ID to the <user name> from /etc/passwd. + See also "group". + +group <group name> + Changes the executed command group ID to the <group name> from /etc/group. + See also "user". + +option start-on-reload +no option start-on-reload + Start (or not) a new instance of the program upon a reload of the master. + The default is to start a new instance. This option may only be used in a + program section. + + +3.8. HTTP-errors +---------------- + +It is possible to globally declare several groups of HTTP errors, to be +imported afterwards in any proxy section. Same group may be referenced at +several places and can be fully or partially imported. + +http-errors <name> + Create a new http-errors group with the name <name>. It is an independent + section that may be referenced by one or more proxies using its name. + +errorfile <code> <file> + Associate a file contents to an HTTP error code + + Arguments : + <code> is the HTTP status code. Currently, HAProxy is capable of + generating codes 200, 400, 401, 403, 404, 405, 407, 408, 410, + 425, 429, 500, 501, 502, 503, and 504. + + <file> designates a file containing the full HTTP response. It is + recommended to follow the common practice of appending ".http" to + the filename so that people do not confuse the response with HTML + error pages, and to use absolute paths, since files are read + before any chroot is performed. + + Please referrers to "errorfile" keyword in section 4 for details. + + Example: + http-errors website-1 + errorfile 400 /etc/haproxy/errorfiles/site1/400.http + errorfile 404 /etc/haproxy/errorfiles/site1/404.http + errorfile 408 /dev/null # work around Chrome pre-connect bug + + http-errors website-2 + errorfile 400 /etc/haproxy/errorfiles/site2/400.http + errorfile 404 /etc/haproxy/errorfiles/site2/404.http + errorfile 408 /dev/null # work around Chrome pre-connect bug + +3.9. Rings +---------- + +It is possible to globally declare ring-buffers, to be used as target for log +servers or traces. + +ring <ringname> + Creates a new ring-buffer with name <ringname>. + +backing-file <path> + This replaces the regular memory allocation by a RAM-mapped file to store the + ring. This can be useful for collecting traces or logs for post-mortem + analysis, without having to attach a slow client to the CLI. Newer contents + will automatically replace older ones so that the latest contents are always + available. The contents written to the ring will be visible in that file once + the process stops (most often they will even be seen very soon after but + there is no such guarantee since writes are not synchronous). + + When this option is used, the total storage area is reduced by the size of + the "struct ring" that starts at the beginning of the area, and that is + required to recover the area's contents. The file will be created with the + starting user's ownership, with mode 0600 and will be of the size configured + by the "size" directive. When the directive is parsed (thus even during + config checks), any existing non-empty file will first be renamed with the + extra suffix ".bak", and any previously existing file with suffix ".bak" will + be removed. This ensures that instant reload or restart of the process will + not wipe precious debugging information, and will leave time for an admin to + spot this new ".bak" file and to archive it if needed. As such, after a crash + the file designated by <path> will contain the freshest information, and if + the service is restarted, the "<path>.bak" file will have it instead. This + means that the total storage capacity required will be double of the ring + size. Failures to rotate the file are silently ignored, so placing the file + into a directory without write permissions will be sufficient to avoid the + backup file if not desired. + + WARNING: there are stability and security implications in using this feature. + First, backing the ring to a slow device (e.g. physical hard drive) may cause + perceptible slowdowns during accesses, and possibly even panics if too many + threads compete for accesses. Second, an external process modifying the area + could cause the haproxy process to crash or to overwrite some of its own + memory with traces. Third, if the file system fills up before the ring, + writes to the ring may cause the process to crash. + + The information present in this ring are structured and are NOT directly + readable using a text editor (even though most of it looks barely readable). + The output of this file is only intended for developers. + +description <text> + The description is an optional description string of the ring. It will + appear on CLI. By default, <name> is reused to fill this field. + +format <format> + Format used to store events into the ring buffer. + + Arguments: + <format> is the log format used when generating syslog messages. It may be + one of the following : + + iso A message containing only the ISO date, followed by the text. + The PID, process name and system name are omitted. This is + designed to be used with a local log server. + + local Analog to rfc3164 syslog message format except that hostname + field is stripped. This is the default. + Note: option "log-send-hostname" switches the default to + rfc3164. + + raw A message containing only the text. The level, PID, date, time, + process name and system name are omitted. This is designed to be + used in containers or during development, where the severity + only depends on the file descriptor used (stdout/stderr). This + is the default. + + rfc3164 The RFC3164 syslog message format. + (https://tools.ietf.org/html/rfc3164) + + rfc5424 The RFC5424 syslog message format. + (https://tools.ietf.org/html/rfc5424) + + short A message containing only a level between angle brackets such as + '<3>', followed by the text. The PID, date, time, process name + and system name are omitted. This is designed to be used with a + local log server. This format is compatible with what the systemd + logger consumes. + + priority A message containing only a level plus syslog facility between angle + brackets such as '<63>', followed by the text. The PID, date, time, + process name and system name are omitted. This is designed to be used + with a local log server. + + timed A message containing only a level between angle brackets such as + '<3>', followed by ISO date and by the text. The PID, process + name and system name are omitted. This is designed to be + used with a local log server. + +maxlen <length> + The maximum length of an event message stored into the ring, + including formatted header. If an event message is longer than + <length>, it will be truncated to this length. + +server <name> <address> [param*] + Used to configure a syslog tcp server to forward messages from ring buffer. + This supports for all "server" parameters found in 5.2 paragraph. Some of + these parameters are irrelevant for "ring" sections. Important point: there + is little reason to add more than one server to a ring, because all servers + will receive the exact same copy of the ring contents, and as such the ring + will progress at the speed of the slowest server. If one server does not + respond, it will prevent old messages from being purged and may block new + messages from being inserted into the ring. The proper way to send messages + to multiple servers is to use one distinct ring per log server, not to + attach multiple servers to the same ring. Note that specific server directive + "log-proto" is used to set the protocol used to send messages. + +size <size> + This is the optional size in bytes for the ring-buffer. Default value is + set to BUFSIZE. + +timeout connect <timeout> + Set the maximum time to wait for a connection attempt to a server to succeed. + + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + +timeout server <timeout> + Set the maximum time for pending data staying into output buffer. + + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + Example: + global + log ring@myring local7 + + ring myring + description "My local buffer" + format rfc3164 + maxlen 1200 + size 32764 + timeout connect 5s + timeout server 10s + server mysyslogsrv 127.0.0.1:6514 log-proto octet-count + +3.10. Log forwarding +------------------- + +It is possible to declare one or multiple log forwarding section, +HAProxy will forward all received log messages to a log servers list. + +log-forward <name> + Creates a new log forwarder proxy identified as <name>. + +backlog <conns> + Give hints to the system about the approximate listen backlog desired size + on connections accept. + +bind <addr> [param*] + Used to configure a stream log listener to receive messages to forward. + This supports the "bind" parameters found in 5.1 paragraph including + those about ssl but some statements such as "alpn" may be irrelevant for + syslog protocol over TCP. + Those listeners support both "Octet Counting" and "Non-Transparent-Framing" + modes as defined in rfc-6587. + +dgram-bind <addr> [param*] + Used to configure a datagram log listener to receive messages to forward. + Addresses must be in IPv4 or IPv6 form,followed by a port. This supports + for some of the "bind" parameters found in 5.1 paragraph among which + "interface", "namespace" or "transparent", the other ones being + silently ignored as irrelevant for UDP/syslog case. + +log global +log <address> [len <length>] [format <format>] [sample <ranges>:<sample_size>] + <facility> [<level> [<minlevel>]] + Used to configure target log servers. See more details on proxies + documentation. + If no format specified, HAProxy tries to keep the incoming log format. + Configured facility is ignored, except if incoming message does not + present a facility but one is mandatory on the outgoing format. + If there is no timestamp available in the input format, but the field + exists in output format, HAProxy will use the local date. + + Example: + global + log stderr format iso local7 + + ring myring + description "My local buffer" + format rfc5424 + maxlen 1200 + size 32764 + timeout connect 5s + timeout server 10s + # syslog tcp server + server mysyslogsrv 127.0.0.1:514 log-proto octet-count + + log-forward sylog-loadb + dgram-bind 127.0.0.1:1514 + bind 127.0.0.1:1514 + # all messages on stderr + log global + # all messages on local tcp syslog server + log ring@myring local0 + # load balance messages on 4 udp syslog servers + log 127.0.0.1:10001 sample 1:4 local0 + log 127.0.0.1:10002 sample 2:4 local0 + log 127.0.0.1:10003 sample 3:4 local0 + log 127.0.0.1:10004 sample 4:4 local0 + +maxconn <conns> + Fix the maximum number of concurrent connections on a log forwarder. + 10 is the default. + +timeout client <timeout> + Set the maximum inactivity time on the client side. + +4. Proxies +---------- + +Proxy configuration can be located in a set of sections : + - defaults [<name>] [ from <defaults_name> ] + - frontend <name> [ from <defaults_name> ] + - backend <name> [ from <defaults_name> ] + - listen <name> [ from <defaults_name> ] + +A "frontend" section describes a set of listening sockets accepting client +connections. + +A "backend" section describes a set of servers to which the proxy will connect +to forward incoming connections. + +A "listen" section defines a complete proxy with its frontend and backend +parts combined in one section. It is generally useful for TCP-only traffic. + +A "defaults" section resets all settings to the documented ones and presets new +ones for use by subsequent sections. All of "frontend", "backend" and "listen" +sections always take their initial settings from a defaults section, by default +the latest one that appears before the newly created section. It is possible to +explicitly designate a specific "defaults" section to load the initial settings +from by indicating its name on the section line after the optional keyword +"from". While "defaults" section do not impose a name, this use is encouraged +for better readability. It is also the only way to designate a specific section +to use instead of the default previous one. Since "defaults" section names are +optional, by default a very permissive check is applied on their name and these +are even permitted to overlap. However if a "defaults" section is referenced by +any other section, its name must comply with the syntax imposed on all proxy +names, and this name must be unique among the defaults sections. Please note +that regardless of what is currently permitted, it is recommended to avoid +duplicate section names in general and to respect the same syntax as for proxy +names. This rule might be enforced in a future version. In addition, a warning +is emitted if a defaults section is explicitly used by a proxy while it is also +implicitly used by another one because it is the last one defined. It is highly +encouraged to not mix both usages by always using explicit references or by +adding a last common defaults section reserved for all implicit uses. + +Note that it is even possible for a defaults section to take its initial +settings from another one, and as such, inherit settings across multiple levels +of defaults sections. This can be convenient to establish certain configuration +profiles to carry groups of default settings (e.g. TCP vs HTTP or short vs long +timeouts) but can quickly become confusing to follow. + +All proxy names must be formed from upper and lower case letters, digits, +'-' (dash), '_' (underscore) , '.' (dot) and ':' (colon). ACL names are +case-sensitive, which means that "www" and "WWW" are two different proxies. + +Historically, all proxy names could overlap, it just caused troubles in the +logs. Since the introduction of content switching, it is mandatory that two +proxies with overlapping capabilities (frontend/backend) have different names. +However, it is still permitted that a frontend and a backend share the same +name, as this configuration seems to be commonly encountered. + +Right now, two major proxy modes are supported : "tcp", also known as layer 4, +and "http", also known as layer 7. In layer 4 mode, HAProxy simply forwards +bidirectional traffic between two sides. In layer 7 mode, HAProxy analyzes the +protocol, and can interact with it by allowing, blocking, switching, adding, +modifying, or removing arbitrary contents in requests or responses, based on +arbitrary criteria. + +In HTTP mode, the processing applied to requests and responses flowing over +a connection depends in the combination of the frontend's HTTP options and +the backend's. HAProxy supports 3 connection modes : + + - KAL : keep alive ("option http-keep-alive") which is the default mode : all + requests and responses are processed, and connections remain open but idle + between responses and new requests. + + - SCL: server close ("option http-server-close") : the server-facing + connection is closed after the end of the response is received, but the + client-facing connection remains open. + + - CLO: close ("option httpclose"): the connection is closed after the end of + the response and "Connection: close" appended in both directions. + +The effective mode that will be applied to a connection passing through a +frontend and a backend can be determined by both proxy modes according to the +following matrix, but in short, the modes are symmetric, keep-alive is the +weakest option and close is the strongest. + + Backend mode + + | KAL | SCL | CLO + ----+-----+-----+---- + KAL | KAL | SCL | CLO + ----+-----+-----+---- + mode SCL | SCL | SCL | CLO + ----+-----+-----+---- + CLO | CLO | CLO | CLO + +It is possible to chain a TCP frontend to an HTTP backend. It is pointless if +only HTTP traffic is handled. But it may be used to handle several protocols +within the same frontend. In this case, the client's connection is first handled +as a raw tcp connection before being upgraded to HTTP. Before the upgrade, the +content processings are performend on raw data. Once upgraded, data is parsed +and stored using an internal representation called HTX and it is no longer +possible to rely on raw representation. There is no way to go back. + +There are two kind of upgrades, in-place upgrades and destructive upgrades. The +first ones involves a TCP to HTTP/1 upgrade. In HTTP/1, the request +processings are serialized, thus the applicative stream can be preserved. The +second one involves a TCP to HTTP/2 upgrade. Because it is a multiplexed +protocol, the applicative stream cannot be associated to any HTTP/2 stream and +is destroyed. New applicative streams are then created when HAProxy receives +new HTTP/2 streams at the lower level, in the H2 multiplexer. It is important +to understand this difference because that drastically changes the way to +process data. When an HTTP/1 upgrade is performed, the content processings +already performed on raw data are neither lost nor reexecuted while for an +HTTP/2 upgrade, applicative streams are distinct and all frontend rules are +evaluated systematically on each one. And as said, the first stream, the TCP +one, is destroyed, but only after the frontend rules were evaluated. + +There is another importnat point to understand when HTTP processings are +performed from a TCP proxy. While HAProxy is able to parse HTTP/1 in-fly from +tcp-request content rules, it is not possible for HTTP/2. Only the HTTP/2 +preface can be parsed. This is a huge limitation regarding the HTTP content +analysis in TCP. Concretely it is only possible to know if received data are +HTTP. For instance, it is not possible to choose a backend based on the Host +header value while it is trivial in HTTP/1. Hopefully, there is a solution to +mitigate this drawback. + +There are two ways to perform an HTTP upgrade. The first one, the historical +method, is to select an HTTP backend. The upgrade happens when the backend is +set. Thus, for in-place upgrades, only the backend configuration is considered +in the HTTP data processing. For destructive upgrades, the applicative stream +is destroyed, thus its processing is stopped. With this method, possibilities +to choose a backend with an HTTP/2 connection are really limited, as mentioned +above, and a bit useless because the stream is destroyed. The second method is +to upgrade during the tcp-request content rules evaluation, thanks to the +"switch-mode http" action. In this case, the upgrade is performed in the +frontend context and it is possible to define HTTP directives in this +frontend. For in-place upgrades, it offers all the power of the HTTP analysis +as soon as possible. It is not that far from an HTTP frontend. For destructive +upgrades, it does not change anything except it is useless to choose a backend +on limited information. It is of course the recommended method. Thus, testing +the request protocol from the tcp-request content rules to perform an HTTP +upgrade is enough. All the remaining HTTP manipulation may be moved to the +frontend http-request ruleset. But keep in mind that tcp-request content rules +remains evaluated on each streams, that can't be changed. + +4.1. Proxy keywords matrix +-------------------------- + +The following list of keywords is supported. Most of them may only be used in a +limited set of section types. Some of them are marked as "deprecated" because +they are inherited from an old syntax which may be confusing or functionally +limited, and there are new recommended keywords to replace them. Keywords +marked with "(*)" can be optionally inverted using the "no" prefix, e.g. "no +option contstats". This makes sense when the option has been enabled by default +and must be disabled for a specific instance. Such options may also be prefixed +with "default" in order to restore default settings regardless of what has been +specified in a previous "defaults" section. Keywords supported in defaults +sections marked with "(!)" are only supported in named defaults sections, not +anonymous ones. + + + keyword defaults frontend listen backend +------------------------------------+----------+----------+---------+--------- +acl X (!) X X X +backlog X X X - +balance X - X X +bind - X X - +bind-process X X X X +capture cookie - X X - +capture request header - X X - +capture response header - X X - +clitcpka-cnt X X X - +clitcpka-idle X X X - +clitcpka-intvl X X X - +compression X X X X +cookie X - X X +declare capture - X X - +default-server X - X X +default_backend X X X - +description - X X X +disabled X X X X +dispatch - - X X +email-alert from X X X X +email-alert level X X X X +email-alert mailers X X X X +email-alert myhostname X X X X +email-alert to X X X X +enabled X X X X +errorfile X X X X +errorfiles X X X X +errorloc X X X X +errorloc302 X X X X +-- keyword -------------------------- defaults - frontend - listen -- backend - +errorloc303 X X X X +error-log-format X X X - +force-persist - - X X +filter - X X X +fullconn X - X X +hash-type X - X X +http-after-response X (!) X X X +http-check comment X - X X +http-check connect X - X X +http-check disable-on-404 X - X X +http-check expect X - X X +http-check send X - X X +http-check send-state X - X X +http-check set-var X - X X +http-check unset-var X - X X +http-error X X X X +http-request X (!) X X X +http-response X (!) X X X +http-reuse X - X X +http-send-name-header X - X X +id - X X X +ignore-persist - - X X +load-server-state-from-file X - X X +log (*) X X X X +log-format X X X - +log-format-sd X X X - +log-tag X X X X +max-keep-alive-queue X - X X +maxconn X X X - +mode X X X X +monitor fail - X X - +monitor-uri X X X - +option abortonclose (*) X - X X +option accept-invalid-http-request (*) X X X - +option accept-invalid-http-response (*) X - X X +option allbackups (*) X - X X +option checkcache (*) X - X X +option clitcpka (*) X X X - +option contstats (*) X X X - +option disable-h2-upgrade (*) X X X - +option dontlog-normal (*) X X X - +option dontlognull (*) X X X - +-- keyword -------------------------- defaults - frontend - listen -- backend - +option forwardfor X X X X +option h1-case-adjust-bogus-client (*) X X X - +option h1-case-adjust-bogus-server (*) X - X X +option http-buffer-request (*) X X X X +option http-ignore-probes (*) X X X - +option http-keep-alive (*) X X X X +option http-no-delay (*) X X X X +option http-pretend-keepalive (*) X - X X +option http-restrict-req-hdr-names X X X X +option http-server-close (*) X X X X +option http-use-proxy-header (*) X X X - +option httpchk X - X X +option httpclose (*) X X X X +option httplog X X X - +option httpslog X X X - +option independent-streams (*) X X X X +option ldap-check X - X X +option external-check X - X X +option log-health-checks (*) X - X X +option log-separate-errors (*) X X X - +option logasap (*) X X X - +option mysql-check X - X X +option nolinger (*) X X X X +option originalto X X X X +option persist (*) X - X X +option pgsql-check X - X X +option prefer-last-server (*) X - X X +option redispatch (*) X - X X +option redis-check X - X X +option smtpchk X - X X +option socket-stats (*) X X X - +option splice-auto (*) X X X X +option splice-request (*) X X X X +option splice-response (*) X X X X +option spop-check X - X X +option srvtcpka (*) X - X X +option ssl-hello-chk X - X X +-- keyword -------------------------- defaults - frontend - listen -- backend - +option tcp-check X - X X +option tcp-smart-accept (*) X X X - +option tcp-smart-connect (*) X - X X +option tcpka X X X X +option tcplog X X X X +option transparent (*) X - X X +option idle-close-on-response (*) X X X - +external-check command X - X X +external-check path X - X X +persist rdp-cookie X - X X +rate-limit sessions X X X - +redirect - X X X +-- keyword -------------------------- defaults - frontend - listen -- backend - +retries X - X X +retry-on X - X X +server - - X X +server-state-file-name X - X X +server-template - - X X +source X - X X +srvtcpka-cnt X - X X +srvtcpka-idle X - X X +srvtcpka-intvl X - X X +stats admin - X X X +stats auth X X X X +stats enable X X X X +stats hide-version X X X X +stats http-request - X X X +stats realm X X X X +stats refresh X X X X +stats scope X X X X +stats show-desc X X X X +stats show-legends X X X X +stats show-node X X X X +stats uri X X X X +-- keyword -------------------------- defaults - frontend - listen -- backend - +stick match - - X X +stick on - - X X +stick store-request - - X X +stick store-response - - X X +stick-table - X X X +tcp-check comment X - X X +tcp-check connect X - X X +tcp-check expect X - X X +tcp-check send X - X X +tcp-check send-lf X - X X +tcp-check send-binary X - X X +tcp-check send-binary-lf X - X X +tcp-check set-var X - X X +tcp-check unset-var X - X X +tcp-request connection X (!) X X - +tcp-request content X (!) X X X +tcp-request inspect-delay X (!) X X X +tcp-request session X (!) X X - +tcp-response content X (!) - X X +tcp-response inspect-delay X (!) - X X +timeout check X - X X +timeout client X X X - +timeout client-fin X X X - +timeout connect X - X X +timeout http-keep-alive X X X X +timeout http-request X X X X +timeout queue X - X X +timeout server X - X X +timeout server-fin X - X X +timeout tarpit X X X X +timeout tunnel X - X X +transparent (deprecated) X - X X +unique-id-format X X X - +unique-id-header X X X - +use_backend - X X - +use-fcgi-app - - X X +use-server - - X X +------------------------------------+----------+----------+---------+--------- + keyword defaults frontend listen backend + + +4.2. Alphabetically sorted keywords reference +--------------------------------------------- + +This section provides a description of each keyword and its usage. + + +acl <aclname> <criterion> [flags] [operator] <value> ... + Declare or complete an access list. + May be used in sections : defaults | frontend | listen | backend + yes(!) | yes | yes | yes + + This directive is only available from named defaults sections, not anonymous + ones. ACLs defined in a defaults section are not visible from other sections + using it. + + Example: + acl invalid_src src 0.0.0.0/7 224.0.0.0/3 + acl invalid_src src_port 0:1023 + acl local_dst hdr(host) -i localhost + + See section 7 about ACL usage. + + +backlog <conns> + Give hints to the system about the approximate listen backlog desired size + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <conns> is the number of pending connections. Depending on the operating + system, it may represent the number of already acknowledged + connections, of non-acknowledged ones, or both. + + In order to protect against SYN flood attacks, one solution is to increase + the system's SYN backlog size. Depending on the system, sometimes it is just + tunable via a system parameter, sometimes it is not adjustable at all, and + sometimes the system relies on hints given by the application at the time of + the listen() syscall. By default, HAProxy passes the frontend's maxconn value + to the listen() syscall. On systems which can make use of this value, it can + sometimes be useful to be able to specify a different value, hence this + backlog parameter. + + On Linux 2.4, the parameter is ignored by the system. On Linux 2.6, it is + used as a hint and the system accepts up to the smallest greater power of + two, and never more than some limits (usually 32768). + + See also : "maxconn" and the target operating system's tuning guide. + + +balance <algorithm> [ <arguments> ] +balance url_param <param> [check_post] + Define the load balancing algorithm to be used in a backend. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <algorithm> is the algorithm used to select a server when doing load + balancing. This only applies when no persistence information + is available, or when a connection is redispatched to another + server. <algorithm> may be one of the following : + + roundrobin Each server is used in turns, according to their weights. + This is the smoothest and fairest algorithm when the server's + processing time remains equally distributed. This algorithm + is dynamic, which means that server weights may be adjusted + on the fly for slow starts for instance. It is limited by + design to 4095 active servers per backend. Note that in some + large farms, when a server becomes up after having been down + for a very short time, it may sometimes take a few hundreds + requests for it to be re-integrated into the farm and start + receiving traffic. This is normal, though very rare. It is + indicated here in case you would have the chance to observe + it, so that you don't worry. + + static-rr Each server is used in turns, according to their weights. + This algorithm is as similar to roundrobin except that it is + static, which means that changing a server's weight on the + fly will have no effect. On the other hand, it has no design + limitation on the number of servers, and when a server goes + up, it is always immediately reintroduced into the farm, once + the full map is recomputed. It also uses slightly less CPU to + run (around -1%). + + leastconn The server with the lowest number of connections receives the + connection. Round-robin is performed within groups of servers + of the same load to ensure that all servers will be used. Use + of this algorithm is recommended where very long sessions are + expected, such as LDAP, SQL, TSE, etc... but is not very well + suited for protocols using short sessions such as HTTP. This + algorithm is dynamic, which means that server weights may be + adjusted on the fly for slow starts for instance. It will + also consider the number of queued connections in addition to + the established ones in order to minimize queuing. + + first The first server with available connection slots receives the + connection. The servers are chosen from the lowest numeric + identifier to the highest (see server parameter "id"), which + defaults to the server's position in the farm. Once a server + reaches its maxconn value, the next server is used. It does + not make sense to use this algorithm without setting maxconn. + The purpose of this algorithm is to always use the smallest + number of servers so that extra servers can be powered off + during non-intensive hours. This algorithm ignores the server + weight, and brings more benefit to long session such as RDP + or IMAP than HTTP, though it can be useful there too. In + order to use this algorithm efficiently, it is recommended + that a cloud controller regularly checks server usage to turn + them off when unused, and regularly checks backend queue to + turn new servers on when the queue inflates. Alternatively, + using "http-check send-state" may inform servers on the load. + + hash Takes a regular sample expression in argument. The expression + is evaluated for each request and hashed according to the + configured hash-type. The result of the hash is divided by + the total weight of the running servers to designate which + server will receive the request. This can be used in place of + "source", "uri", "hdr()", "url_param()", "rdp-cookie" to make + use of a converter, refine the evaluation, or be used to + extract data from local variables for example. When the data + is not available, round robin will apply. This algorithm is + static by default, which means that changing a server's + weight on the fly will have no effect, but this can be + changed using "hash-type". + + source The source IP address is hashed and divided by the total + weight of the running servers to designate which server will + receive the request. This ensures that the same client IP + address will always reach the same server as long as no + server goes down or up. If the hash result changes due to the + number of running servers changing, many clients will be + directed to a different server. This algorithm is generally + used in TCP mode where no cookie may be inserted. It may also + be used on the Internet to provide a best-effort stickiness + to clients which refuse session cookies. This algorithm is + static by default, which means that changing a server's + weight on the fly will have no effect, but this can be + changed using "hash-type". See also the "hash" option above. + + uri This algorithm hashes either the left part of the URI (before + the question mark) or the whole URI (if the "whole" parameter + is present) and divides the hash value by the total weight of + the running servers. The result designates which server will + receive the request. This ensures that the same URI will + always be directed to the same server as long as no server + goes up or down. This is used with proxy caches and + anti-virus proxies in order to maximize the cache hit rate. + Note that this algorithm may only be used in an HTTP backend. + This algorithm is static by default, which means that + changing a server's weight on the fly will have no effect, + but this can be changed using "hash-type". + + This algorithm supports two optional parameters "len" and + "depth", both followed by a positive integer number. These + options may be helpful when it is needed to balance servers + based on the beginning of the URI only. The "len" parameter + indicates that the algorithm should only consider that many + characters at the beginning of the URI to compute the hash. + Note that having "len" set to 1 rarely makes sense since most + URIs start with a leading "/". + + The "depth" parameter indicates the maximum directory depth + to be used to compute the hash. One level is counted for each + slash in the request. If both parameters are specified, the + evaluation stops when either is reached. + + A "path-only" parameter indicates that the hashing key starts + at the first '/' of the path. This can be used to ignore the + authority part of absolute URIs, and to make sure that HTTP/1 + and HTTP/2 URIs will provide the same hash. See also the + "hash" option above. + + url_param The URL parameter specified in argument will be looked up in + the query string of each HTTP GET request. + + If the modifier "check_post" is used, then an HTTP POST + request entity will be searched for the parameter argument, + when it is not found in a query string after a question mark + ('?') in the URL. The message body will only start to be + analyzed once either the advertised amount of data has been + received or the request buffer is full. In the unlikely event + that chunked encoding is used, only the first chunk is + scanned. Parameter values separated by a chunk boundary, may + be randomly balanced if at all. This keyword used to support + an optional <max_wait> parameter which is now ignored. + + If the parameter is found followed by an equal sign ('=') and + a value, then the value is hashed and divided by the total + weight of the running servers. The result designates which + server will receive the request. + + This is used to track user identifiers in requests and ensure + that a same user ID will always be sent to the same server as + long as no server goes up or down. If no value is found or if + the parameter is not found, then a round robin algorithm is + applied. Note that this algorithm may only be used in an HTTP + backend. This algorithm is static by default, which means + that changing a server's weight on the fly will have no + effect, but this can be changed using "hash-type". See also + the "hash" option above. + + hdr(<name>) The HTTP header <name> will be looked up in each HTTP + request. Just as with the equivalent ACL 'hdr()' function, + the header name in parenthesis is not case sensitive. If the + header is absent or if it does not contain any value, the + roundrobin algorithm is applied instead. + + An optional 'use_domain_only' parameter is available, for + reducing the hash algorithm to the main domain part with some + specific headers such as 'Host'. For instance, in the Host + value "haproxy.1wt.eu", only "1wt" will be considered. + + This algorithm is static by default, which means that + changing a server's weight on the fly will have no effect, + but this can be changed using "hash-type". See also the + "hash" option above. + + random + random(<draws>) + A random number will be used as the key for the consistent + hashing function. This means that the servers' weights are + respected, dynamic weight changes immediately take effect, as + well as new server additions. Random load balancing can be + useful with large farms or when servers are frequently added + or removed as it may avoid the hammering effect that could + result from roundrobin or leastconn in this situation. The + hash-balance-factor directive can be used to further improve + fairness of the load balancing, especially in situations + where servers show highly variable response times. When an + argument <draws> is present, it must be an integer value one + or greater, indicating the number of draws before selecting + the least loaded of these servers. It was indeed demonstrated + that picking the least loaded of two servers is enough to + significantly improve the fairness of the algorithm, by + always avoiding to pick the most loaded server within a farm + and getting rid of any bias that could be induced by the + unfair distribution of the consistent list. Higher values N + will take away N-1 of the highest loaded servers at the + expense of performance. With very high values, the algorithm + will converge towards the leastconn's result but much slower. + The default value is 2, which generally shows very good + distribution and performance. This algorithm is also known as + the Power of Two Random Choices and is described here : + http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf + + rdp-cookie + rdp-cookie(<name>) + The RDP cookie <name> (or "mstshash" if omitted) will be + looked up and hashed for each incoming TCP request. Just as + with the equivalent ACL 'req.rdp_cookie()' function, the name + is not case-sensitive. This mechanism is useful as a degraded + persistence mode, as it makes it possible to always send the + same user (or the same session ID) to the same server. If the + cookie is not found, the normal roundrobin algorithm is + used instead. + + Note that for this to work, the frontend must ensure that an + RDP cookie is already present in the request buffer. For this + you must use 'tcp-request content accept' rule combined with + a 'req.rdp_cookie_cnt' ACL. + + This algorithm is static by default, which means that + changing a server's weight on the fly will have no effect, + but this can be changed using "hash-type". See also the + "hash" option above. + + <arguments> is an optional list of arguments which may be needed by some + algorithms. Right now, only "url_param" and "uri" support an + optional argument. + + The load balancing algorithm of a backend is set to roundrobin when no other + algorithm, mode nor option have been set. The algorithm may only be set once + for each backend. + + With authentication schemes that require the same connection like NTLM, URI + based algorithms must not be used, as they would cause subsequent requests + to be routed to different backend servers, breaking the invalid assumptions + NTLM relies on. + + Examples : + balance roundrobin + balance url_param userid + balance url_param session_id check_post 64 + balance hdr(User-Agent) + balance hdr(host) + balance hdr(Host) use_domain_only + balance hash req.cookie(clientid) + balance hash var(req.client_id) + balance hash req.hdr_ip(x-forwarded-for,-1),ipmask(24) + + Note: the following caveats and limitations on using the "check_post" + extension with "url_param" must be considered : + + - all POST requests are eligible for consideration, because there is no way + to determine if the parameters will be found in the body or entity which + may contain binary data. Therefore another method may be required to + restrict consideration of POST requests that have no URL parameters in + the body. (see acl http_end) + + - using a <max_wait> value larger than the request buffer size does not + make sense and is useless. The buffer size is set at build time, and + defaults to 16 kB. + + - Content-Encoding is not supported, the parameter search will probably + fail; and load balancing will fall back to Round Robin. + + - Expect: 100-continue is not supported, load balancing will fall back to + Round Robin. + + - Transfer-Encoding (RFC7230 3.3.1) is only supported in the first chunk. + If the entire parameter value is not present in the first chunk, the + selection of server is undefined (actually, defined by how little + actually appeared in the first chunk). + + - This feature does not support generation of a 100, 411 or 501 response. + + - In some cases, requesting "check_post" MAY attempt to scan the entire + contents of a message body. Scanning normally terminates when linear + white space or control characters are found, indicating the end of what + might be a URL parameter list. This is probably not a concern with SGML + type message bodies. + + See also : "dispatch", "cookie", "transparent", "hash-type". + + +bind [<address>]:<port_range> [, ...] [param*] +bind /<path> [, ...] [param*] + Define one or several listening addresses and/or ports in a frontend. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | no + Arguments : + <address> is optional and can be a host name, an IPv4 address, an IPv6 + address, or '*'. It designates the address the frontend will + listen on. If unset, all IPv4 addresses of the system will be + listened on. The same will apply for '*' or the system's + special address "0.0.0.0". The IPv6 equivalent is '::'. Note + that if you bind a frontend to multiple UDP addresses you have + no guarantee about the address which will be used to respond. + This is why "0.0.0.0" addresses and lists of comma-separated + IP addresses have been forbidden to bind QUIC addresses. + Optionally, an address family prefix may be used before the + address to force the family regardless of the address format, + which can be useful to specify a path to a unix socket with + no slash ('/'). Currently supported prefixes are : + - 'ipv4@' -> address is always IPv4 + - 'ipv6@' -> address is always IPv6 + - 'udp@' -> address is resolved as IPv4 or IPv6 and + protocol UDP is used. Currently those listeners are + supported only in log-forward sections. + - 'udp4@' -> address is always IPv4 and protocol UDP + is used. Currently those listeners are supported + only in log-forward sections. + - 'udp6@' -> address is always IPv6 and protocol UDP + is used. Currently those listeners are supported + only in log-forward sections. + - 'unix@' -> address is a path to a local unix socket + - 'abns@' -> address is in abstract namespace (Linux only). + - 'fd@<n>' -> use file descriptor <n> inherited from the + parent. The fd must be bound and may or may not already + be listening. + - 'sockpair@<n>'-> like fd@ but you must use the fd of a + connected unix socket or of a socketpair. The bind waits + to receive a FD over the unix socket and uses it as if it + was the FD of an accept(). Should be used carefully. + - 'quic4@' -> address is resolved as IPv4 and protocol UDP + is used. Note that QUIC connections attached to a + listener will be multiplexed over the listener socket. + With a large traffic this has a noticeable impact on + performance and CPU consumption. To improve this, you + should duplicate QUIC listener instances over several + threads, for example using "shards" keyword. + - 'quic6@' -> address is resolved as IPv6 and protocol UDP + is used. The performance note for QUIC over IPv4 applies + as well. + + You may want to reference some environment variables in the + address parameter, see section 2.3 about environment + variables. + + <port_range> is either a unique TCP port, or a port range for which the + proxy will accept connections for the IP address specified + above. The port is mandatory for TCP listeners. Note that in + the case of an IPv6 address, the port is always the number + after the last colon (':'). A range can either be : + - a numerical port (ex: '80') + - a dash-delimited ports range explicitly stating the lower + and upper bounds (ex: '2000-2100') which are included in + the range. + + Particular care must be taken against port ranges, because + every <address:port> couple consumes one socket (= a file + descriptor), so it's easy to consume lots of descriptors + with a simple range, and to run out of sockets. Also, each + <address:port> couple must be used only once among all + instances running on a same system. Please note that binding + to ports lower than 1024 generally require particular + privileges to start the program, which are independent of + the 'uid' parameter. + + <path> is a UNIX socket path beginning with a slash ('/'). This is + alternative to the TCP listening port. HAProxy will then + receive UNIX connections on the socket located at this place. + The path must begin with a slash and by default is absolute. + It can be relative to the prefix defined by "unix-bind" in + the global section. Note that the total length of the prefix + followed by the socket path cannot exceed some system limits + for UNIX sockets, which commonly are set to 107 characters. + + <param*> is a list of parameters common to all sockets declared on the + same line. These numerous parameters depend on OS and build + options and have a complete section dedicated to them. Please + refer to section 5 to for more details. + + It is possible to specify a list of address:port combinations delimited by + commas. The frontend will then listen on all of these addresses. There is no + fixed limit to the number of addresses and ports which can be listened on in + a frontend, as well as there is no limit to the number of "bind" statements + in a frontend. + + Example : + listen http_proxy + bind :80,:443 + bind 10.0.0.1:10080,10.0.0.1:10443 + bind /var/run/ssl-frontend.sock user root mode 600 accept-proxy + + listen http_https_proxy + bind :80 + bind :443 ssl crt /etc/haproxy/site.pem + + listen http_https_proxy_explicit + bind ipv6@:80 + bind ipv4@public_ssl:443 ssl crt /etc/haproxy/site.pem + bind unix@ssl-frontend.sock user root mode 600 accept-proxy + + listen external_bind_app1 + bind "fd@${FD_APP1}" + + listen h3_quic_proxy + bind quic4@10.0.0.1:8888 ssl crt /etc/mycrt alpn h3 + + Note: regarding Linux's abstract namespace sockets, HAProxy uses the whole + sun_path length is used for the address length. Some other programs + such as socat use the string length only by default. Pass the option + ",unix-tightsocklen=0" to any abstract socket definition in socat to + make it compatible with HAProxy's. + + See also : "source", "option forwardfor", "unix-bind" and the PROXY protocol + documentation, and section 5 about bind options. + + +bind-process [ all | odd | even | <process_num>[-[<process_num>]] ] ... + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + + Deprecated. Before threads were supported, this was used to force some + frontends on certain processes only, or to adjust backends so that they + could match the frontends that used them. The default and only accepted + value is "1" (along with "all" and "odd" which alias it). Do not use this + setting. Threads can still be bound per-socket using the "process" bind + keyword. + + See also : "process" in section 5.1. + + +capture cookie <name> len <length> + Capture and log a cookie in the request and in the response. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | no + Arguments : + <name> is the beginning of the name of the cookie to capture. In order + to match the exact name, simply suffix the name with an equal + sign ('='). The full name will appear in the logs, which is + useful with application servers which adjust both the cookie name + and value (e.g. ASPSESSIONXXX). + + <length> is the maximum number of characters to report in the logs, which + include the cookie name, the equal sign and the value, all in the + standard "name=value" form. The string will be truncated on the + right if it exceeds <length>. + + Only the first cookie is captured. Both the "cookie" request headers and the + "set-cookie" response headers are monitored. This is particularly useful to + check for application bugs causing session crossing or stealing between + users, because generally the user's cookies can only change on a login page. + + When the cookie was not presented by the client, the associated log column + will report "-". When a request does not cause a cookie to be assigned by the + server, a "-" is reported in the response column. + + The capture is performed in the frontend only because it is necessary that + the log format does not change for a given frontend depending on the + backends. This may change in the future. Note that there can be only one + "capture cookie" statement in a frontend. The maximum capture length is set + by the global "tune.http.cookielen" setting and defaults to 63 characters. It + is not possible to specify a capture in a "defaults" section. + + Example: + capture cookie ASPSESSION len 32 + + See also : "capture request header", "capture response header" as well as + section 8 about logging. + + +capture request header <name> len <length> + Capture and log the last occurrence of the specified request header. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | no + Arguments : + <name> is the name of the header to capture. The header names are not + case-sensitive, but it is a common practice to write them as they + appear in the requests, with the first letter of each word in + upper case. The header name will not appear in the logs, only the + value is reported, but the position in the logs is respected. + + <length> is the maximum number of characters to extract from the value and + report in the logs. The string will be truncated on the right if + it exceeds <length>. + + The complete value of the last occurrence of the header is captured. The + value will be added to the logs between braces ('{}'). If multiple headers + are captured, they will be delimited by a vertical bar ('|') and will appear + in the same order they were declared in the configuration. Non-existent + headers will be logged just as an empty string. Common uses for request + header captures include the "Host" field in virtual hosting environments, the + "Content-length" when uploads are supported, "User-agent" to quickly + differentiate between real users and robots, and "X-Forwarded-For" in proxied + environments to find where the request came from. + + Note that when capturing headers such as "User-agent", some spaces may be + logged, making the log analysis more difficult. Thus be careful about what + you log if you know your log parser is not smart enough to rely on the + braces. + + There is no limit to the number of captured request headers nor to their + length, though it is wise to keep them low to limit memory usage per session. + In order to keep log format consistent for a same frontend, header captures + can only be declared in a frontend. It is not possible to specify a capture + in a "defaults" section. + + Example: + capture request header Host len 15 + capture request header X-Forwarded-For len 15 + capture request header Referer len 15 + + See also : "capture cookie", "capture response header" as well as section 8 + about logging. + + +capture response header <name> len <length> + Capture and log the last occurrence of the specified response header. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | no + Arguments : + <name> is the name of the header to capture. The header names are not + case-sensitive, but it is a common practice to write them as they + appear in the response, with the first letter of each word in + upper case. The header name will not appear in the logs, only the + value is reported, but the position in the logs is respected. + + <length> is the maximum number of characters to extract from the value and + report in the logs. The string will be truncated on the right if + it exceeds <length>. + + The complete value of the last occurrence of the header is captured. The + result will be added to the logs between braces ('{}') after the captured + request headers. If multiple headers are captured, they will be delimited by + a vertical bar ('|') and will appear in the same order they were declared in + the configuration. Non-existent headers will be logged just as an empty + string. Common uses for response header captures include the "Content-length" + header which indicates how many bytes are expected to be returned, the + "Location" header to track redirections. + + There is no limit to the number of captured response headers nor to their + length, though it is wise to keep them low to limit memory usage per session. + In order to keep log format consistent for a same frontend, header captures + can only be declared in a frontend. It is not possible to specify a capture + in a "defaults" section. + + Example: + capture response header Content-length len 9 + capture response header Location len 15 + + See also : "capture cookie", "capture request header" as well as section 8 + about logging. + + +clitcpka-cnt <count> + Sets the maximum number of keepalive probes TCP should send before dropping + the connection on the client side. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <count> is the maximum number of keepalive probes. + + This keyword corresponds to the socket option TCP_KEEPCNT. If this keyword + is not specified, system-wide TCP parameter (tcp_keepalive_probes) is used. + The availability of this setting depends on the operating system. It is + known to work on Linux. + + See also : "option clitcpka", "clitcpka-idle", "clitcpka-intvl". + + +clitcpka-idle <timeout> + Sets the time the connection needs to remain idle before TCP starts sending + keepalive probes, if enabled the sending of TCP keepalive packets on the + client side. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <timeout> is the time the connection needs to remain idle before TCP starts + sending keepalive probes. It is specified in seconds by default, + but can be in any other unit if the number is suffixed by the + unit, as explained at the top of this document. + + This keyword corresponds to the socket option TCP_KEEPIDLE. If this keyword + is not specified, system-wide TCP parameter (tcp_keepalive_time) is used. + The availability of this setting depends on the operating system. It is + known to work on Linux. + + See also : "option clitcpka", "clitcpka-cnt", "clitcpka-intvl". + + +clitcpka-intvl <timeout> + Sets the time between individual keepalive probes on the client side. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <timeout> is the time between individual keepalive probes. It is specified + in seconds by default, but can be in any other unit if the number + is suffixed by the unit, as explained at the top of this + document. + + This keyword corresponds to the socket option TCP_KEEPINTVL. If this keyword + is not specified, system-wide TCP parameter (tcp_keepalive_intvl) is used. + The availability of this setting depends on the operating system. It is + known to work on Linux. + + See also : "option clitcpka", "clitcpka-cnt", "clitcpka-idle". + + +compression algo <algorithm> ... +compression type <mime type> ... + Enable HTTP compression. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + algo is followed by the list of supported compression algorithms. + type is followed by the list of MIME types that will be compressed. + + The currently supported algorithms are : + identity this is mostly for debugging, and it was useful for developing + the compression feature. Identity does not apply any change on + data. + + gzip applies gzip compression. This setting is only available when + support for zlib or libslz was built in. + + deflate same as "gzip", but with deflate algorithm and zlib format. + Note that this algorithm has ambiguous support on many + browsers and no support at all from recent ones. It is + strongly recommended not to use it for anything else than + experimentation. This setting is only available when support + for zlib or libslz was built in. + + raw-deflate same as "deflate" without the zlib wrapper, and used as an + alternative when the browser wants "deflate". All major + browsers understand it and despite violating the standards, + it is known to work better than "deflate", at least on MSIE + and some versions of Safari. Do not use it in conjunction + with "deflate", use either one or the other since both react + to the same Accept-Encoding token. This setting is only + available when support for zlib or libslz was built in. + + Compression will be activated depending on the Accept-Encoding request + header. With identity, it does not take care of that header. + If backend servers support HTTP compression, these directives + will be no-op: HAProxy will see the compressed response and will not + compress again. If backend servers do not support HTTP compression and + there is Accept-Encoding header in request, HAProxy will compress the + matching response. + + Compression is disabled when: + * the request does not advertise a supported compression algorithm in the + "Accept-Encoding" header + * the response message is not HTTP/1.1 or above + * HTTP status code is not one of 200, 201, 202, or 203 + * response contain neither a "Content-Length" header nor a + "Transfer-Encoding" whose last value is "chunked" + * response contains a "Content-Type" header whose first value starts with + "multipart" + * the response contains the "no-transform" value in the "Cache-control" + header + * User-Agent matches "Mozilla/4" unless it is MSIE 6 with XP SP2, or MSIE 7 + and later + * The response contains a "Content-Encoding" header, indicating that the + response is already compressed (see compression offload) + * The response contains an invalid "ETag" header or multiple ETag headers + + Note: The compression does not emit the Warning header. + + Examples : + compression algo gzip + compression type text/html text/plain + + See also : "compression offload" + +compression offload + Makes HAProxy work as a compression offloader only. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | yes + + The "offload" setting makes HAProxy remove the Accept-Encoding header to + prevent backend servers from compressing responses. It is strongly + recommended not to do this because this means that all the compression work + will be done on the single point where HAProxy is located. However in some + deployment scenarios, HAProxy may be installed in front of a buggy gateway + with broken HTTP compression implementation which can't be turned off. + In that case HAProxy can be used to prevent that gateway from emitting + invalid payloads. In this case, simply removing the header in the + configuration does not work because it applies before the header is parsed, + so that prevents HAProxy from compressing. The "offload" setting should + then be used for such scenarios. + + If this setting is used in a defaults section, a warning is emitted and the + option is ignored. + + See also : "compression type", "compression algo" + +cookie <name> [ rewrite | insert | prefix ] [ indirect ] [ nocache ] + [ postonly ] [ preserve ] [ httponly ] [ secure ] + [ domain <domain> ]* [ maxidle <idle> ] [ maxlife <life> ] + [ dynamic ] [ attr <value> ]* + Enable cookie-based persistence in a backend. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <name> is the name of the cookie which will be monitored, modified or + inserted in order to bring persistence. This cookie is sent to + the client via a "Set-Cookie" header in the response, and is + brought back by the client in a "Cookie" header in all requests. + Special care should be taken to choose a name which does not + conflict with any likely application cookie. Also, if the same + backends are subject to be used by the same clients (e.g. + HTTP/HTTPS), care should be taken to use different cookie names + between all backends if persistence between them is not desired. + + rewrite This keyword indicates that the cookie will be provided by the + server and that HAProxy will have to modify its value to set the + server's identifier in it. This mode is handy when the management + of complex combinations of "Set-cookie" and "Cache-control" + headers is left to the application. The application can then + decide whether or not it is appropriate to emit a persistence + cookie. Since all responses should be monitored, this mode + doesn't work in HTTP tunnel mode. Unless the application + behavior is very complex and/or broken, it is advised not to + start with this mode for new deployments. This keyword is + incompatible with "insert" and "prefix". + + insert This keyword indicates that the persistence cookie will have to + be inserted by HAProxy in server responses if the client did not + + already have a cookie that would have permitted it to access this + server. When used without the "preserve" option, if the server + emits a cookie with the same name, it will be removed before + processing. For this reason, this mode can be used to upgrade + existing configurations running in the "rewrite" mode. The cookie + will only be a session cookie and will not be stored on the + client's disk. By default, unless the "indirect" option is added, + the server will see the cookies emitted by the client. Due to + caching effects, it is generally wise to add the "nocache" or + "postonly" keywords (see below). The "insert" keyword is not + compatible with "rewrite" and "prefix". + + prefix This keyword indicates that instead of relying on a dedicated + cookie for the persistence, an existing one will be completed. + This may be needed in some specific environments where the client + does not support more than one single cookie and the application + already needs it. In this case, whenever the server sets a cookie + named <name>, it will be prefixed with the server's identifier + and a delimiter. The prefix will be removed from all client + requests so that the server still finds the cookie it emitted. + Since all requests and responses are subject to being modified, + this mode doesn't work with tunnel mode. The "prefix" keyword is + not compatible with "rewrite" and "insert". Note: it is highly + recommended not to use "indirect" with "prefix", otherwise server + cookie updates would not be sent to clients. + + indirect When this option is specified, no cookie will be emitted to a + client which already has a valid one for the server which has + processed the request. If the server sets such a cookie itself, + it will be removed, unless the "preserve" option is also set. In + "insert" mode, this will additionally remove cookies from the + requests transmitted to the server, making the persistence + mechanism totally transparent from an application point of view. + Note: it is highly recommended not to use "indirect" with + "prefix", otherwise server cookie updates would not be sent to + clients. + + nocache This option is recommended in conjunction with the insert mode + when there is a cache between the client and HAProxy, as it + ensures that a cacheable response will be tagged non-cacheable if + a cookie needs to be inserted. This is important because if all + persistence cookies are added on a cacheable home page for + instance, then all customers will then fetch the page from an + outer cache and will all share the same persistence cookie, + leading to one server receiving much more traffic than others. + See also the "insert" and "postonly" options. + + postonly This option ensures that cookie insertion will only be performed + on responses to POST requests. It is an alternative to the + "nocache" option, because POST responses are not cacheable, so + this ensures that the persistence cookie will never get cached. + Since most sites do not need any sort of persistence before the + first POST which generally is a login request, this is a very + efficient method to optimize caching without risking to find a + persistence cookie in the cache. + See also the "insert" and "nocache" options. + + preserve This option may only be used with "insert" and/or "indirect". It + allows the server to emit the persistence cookie itself. In this + case, if a cookie is found in the response, HAProxy will leave it + untouched. This is useful in order to end persistence after a + logout request for instance. For this, the server just has to + emit a cookie with an invalid value (e.g. empty) or with a date in + the past. By combining this mechanism with the "disable-on-404" + check option, it is possible to perform a completely graceful + shutdown because users will definitely leave the server after + they logout. + + httponly This option tells HAProxy to add an "HttpOnly" cookie attribute + when a cookie is inserted. This attribute is used so that a + user agent doesn't share the cookie with non-HTTP components. + Please check RFC6265 for more information on this attribute. + + secure This option tells HAProxy to add a "Secure" cookie attribute when + a cookie is inserted. This attribute is used so that a user agent + never emits this cookie over non-secure channels, which means + that a cookie learned with this flag will be presented only over + SSL/TLS connections. Please check RFC6265 for more information on + this attribute. + + domain This option allows to specify the domain at which a cookie is + inserted. It requires exactly one parameter: a valid domain + name. If the domain begins with a dot, the browser is allowed to + use it for any host ending with that name. It is also possible to + specify several domain names by invoking this option multiple + times. Some browsers might have small limits on the number of + domains, so be careful when doing that. For the record, sending + 10 domains to MSIE 6 or Firefox 2 works as expected. + + maxidle This option allows inserted cookies to be ignored after some idle + time. It only works with insert-mode cookies. When a cookie is + sent to the client, the date this cookie was emitted is sent too. + Upon further presentations of this cookie, if the date is older + than the delay indicated by the parameter (in seconds), it will + be ignored. Otherwise, it will be refreshed if needed when the + response is sent to the client. This is particularly useful to + prevent users who never close their browsers from remaining for + too long on the same server (e.g. after a farm size change). When + this option is set and a cookie has no date, it is always + accepted, but gets refreshed in the response. This maintains the + ability for admins to access their sites. Cookies that have a + date in the future further than 24 hours are ignored. Doing so + lets admins fix timezone issues without risking kicking users off + the site. + + maxlife This option allows inserted cookies to be ignored after some life + time, whether they're in use or not. It only works with insert + mode cookies. When a cookie is first sent to the client, the date + this cookie was emitted is sent too. Upon further presentations + of this cookie, if the date is older than the delay indicated by + the parameter (in seconds), it will be ignored. If the cookie in + the request has no date, it is accepted and a date will be set. + Cookies that have a date in the future further than 24 hours are + ignored. Doing so lets admins fix timezone issues without risking + kicking users off the site. Contrary to maxidle, this value is + not refreshed, only the first visit date counts. Both maxidle and + maxlife may be used at the time. This is particularly useful to + prevent users who never close their browsers from remaining for + too long on the same server (e.g. after a farm size change). This + is stronger than the maxidle method in that it forces a + redispatch after some absolute delay. + + dynamic Activate dynamic cookies. When used, a session cookie is + dynamically created for each server, based on the IP and port + of the server, and a secret key, specified in the + "dynamic-cookie-key" backend directive. + The cookie will be regenerated each time the IP address change, + and is only generated for IPv4/IPv6. + + attr This option tells HAProxy to add an extra attribute when a + cookie is inserted. The attribute value can contain any + characters except control ones or ";". This option may be + repeated. + + There can be only one persistence cookie per HTTP backend, and it can be + declared in a defaults section. The value of the cookie will be the value + indicated after the "cookie" keyword in a "server" statement. If no cookie + is declared for a given server, the cookie is not set. + + Examples : + cookie JSESSIONID prefix + cookie SRV insert indirect nocache + cookie SRV insert postonly indirect + cookie SRV insert indirect nocache maxidle 30m maxlife 8h + + See also : "balance source", "capture cookie", "server" and "ignore-persist". + + +declare capture [ request | response ] len <length> + Declares a capture slot. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | no + Arguments: + <length> is the length allowed for the capture. + + This declaration is only available in the frontend or listen section, but the + reserved slot can be used in the backends. The "request" keyword allocates a + capture slot for use in the request, and "response" allocates a capture slot + for use in the response. + + See also: "capture-req", "capture-res" (sample converters), + "capture.req.hdr", "capture.res.hdr" (sample fetches), + "http-request capture" and "http-response capture". + + +default-server [param*] + Change default options for a server in a backend + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments: + <param*> is a list of parameters for this server. The "default-server" + keyword accepts an important number of options and has a complete + section dedicated to it. Please refer to section 5 for more + details. + + Example : + default-server inter 1000 weight 13 + + See also: "server" and section 5 about server options + + +default_backend <backend> + Specify the backend to use when no "use_backend" rule has been matched. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <backend> is the name of the backend to use. + + When doing content-switching between frontend and backends using the + "use_backend" keyword, it is often useful to indicate which backend will be + used when no rule has matched. It generally is the dynamic backend which + will catch all undetermined requests. + + Example : + + use_backend dynamic if url_dyn + use_backend static if url_css url_img extension_img + default_backend dynamic + + See also : "use_backend" + + +description <string> + Describe a listen, frontend or backend. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | yes + Arguments : string + + Allows to add a sentence to describe the related object in the HAProxy HTML + stats page. The description will be printed on the right of the object name + it describes. + No need to backslash spaces in the <string> arguments. + + +disabled + Disable a proxy, frontend or backend. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + The "disabled" keyword is used to disable an instance, mainly in order to + liberate a listening port or to temporarily disable a service. The instance + will still be created and its configuration will be checked, but it will be + created in the "stopped" state and will appear as such in the statistics. It + will not receive any traffic nor will it send any health-checks or logs. It + is possible to disable many instances at once by adding the "disabled" + keyword in a "defaults" section. + + See also : "enabled" + + +dispatch <address>:<port> + Set a default server address + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + Arguments : + + <address> is the IPv4 address of the default server. Alternatively, a + resolvable hostname is supported, but this name will be resolved + during start-up. + + <ports> is a mandatory port specification. All connections will be sent + to this port, and it is not permitted to use port offsets as is + possible with normal servers. + + The "dispatch" keyword designates a default server for use when no other + server can take the connection. In the past it was used to forward non + persistent connections to an auxiliary load balancer. Due to its simple + syntax, it has also been used for simple TCP relays. It is recommended not to + use it for more clarity, and to use the "server" directive instead. + + See also : "server" + + +dynamic-cookie-key <string> + Set the dynamic cookie secret key for a backend. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : The secret key to be used. + + When dynamic cookies are enabled (see the "dynamic" directive for cookie), + a dynamic cookie is created for each server (unless one is explicitly + specified on the "server" line), using a hash of the IP address of the + server, the TCP port, and the secret key. + That way, we can ensure session persistence across multiple load-balancers, + even if servers are dynamically added or removed. + +enabled + Enable a proxy, frontend or backend. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + The "enabled" keyword is used to explicitly enable an instance, when the + defaults has been set to "disabled". This is very rarely used. + + See also : "disabled" + + +errorfile <code> <file> + Return a file contents instead of errors generated by HAProxy + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <code> is the HTTP status code. Currently, HAProxy is capable of + generating codes 200, 400, 401, 403, 404, 405, 407, 408, 410, + 413, 425, 429, 500, 501, 502, 503, and 504. + + <file> designates a file containing the full HTTP response. It is + recommended to follow the common practice of appending ".http" to + the filename so that people do not confuse the response with HTML + error pages, and to use absolute paths, since files are read + before any chroot is performed. + + It is important to understand that this keyword is not meant to rewrite + errors returned by the server, but errors detected and returned by HAProxy. + This is why the list of supported errors is limited to a small set. + + Code 200 is emitted in response to requests matching a "monitor-uri" rule. + + The files are parsed when HAProxy starts and must be valid according to the + HTTP specification. They should not exceed the configured buffer size + (BUFSIZE), which generally is 16 kB, otherwise an internal error will be + returned. It is also wise not to put any reference to local contents + (e.g. images) in order to avoid loops between the client and HAProxy when all + servers are down, causing an error to be returned instead of an + image. Finally, The response cannot exceed (tune.bufsize - tune.maxrewrite) + so that "http-after-response" rules still have room to operate (see + "tune.maxrewrite"). + + The files are read at the same time as the configuration and kept in memory. + For this reason, the errors continue to be returned even when the process is + chrooted, and no file change is considered while the process is running. A + simple method for developing those files consists in associating them to the + 403 status code and interrogating a blocked URL. + + See also : "http-error", "errorloc", "errorloc302", "errorloc303" + + Example : + errorfile 400 /etc/haproxy/errorfiles/400badreq.http + errorfile 408 /dev/null # work around Chrome pre-connect bug + errorfile 403 /etc/haproxy/errorfiles/403forbid.http + errorfile 503 /etc/haproxy/errorfiles/503sorry.http + + +errorfiles <name> [<code> ...] + Import, fully or partially, the error files defined in the <name> http-errors + section. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <name> is the name of an existing http-errors section. + + <code> is a HTTP status code. Several status code may be listed. + Currently, HAProxy is capable of generating codes 200, 400, 401, + 403, 404, 405, 407, 408, 410, 413, 425, 429, 500, 501, 502, 503, + and 504. + + Errors defined in the http-errors section with the name <name> are imported + in the current proxy. If no status code is specified, all error files of the + http-errors section are imported. Otherwise, only error files associated to + the listed status code are imported. Those error files override the already + defined custom errors for the proxy. And they may be overridden by following + ones. Functionally, it is exactly the same as declaring all error files by + hand using "errorfile" directives. + + See also : "http-error", "errorfile", "errorloc", "errorloc302" , + "errorloc303" and section 3.8 about http-errors. + + Example : + errorfiles generic + errorfiles site-1 403 404 + + +errorloc <code> <url> +errorloc302 <code> <url> + Return an HTTP redirection to a URL instead of errors generated by HAProxy + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <code> is the HTTP status code. Currently, HAProxy is capable of + generating codes 200, 400, 401, 403, 404, 405, 407, 408, 410, + 413, 425, 429, 500, 501, 502, 503, and 504. + + <url> it is the exact contents of the "Location" header. It may contain + either a relative URI to an error page hosted on the same site, + or an absolute URI designating an error page on another site. + Special care should be given to relative URIs to avoid redirect + loops if the URI itself may generate the same error (e.g. 500). + + It is important to understand that this keyword is not meant to rewrite + errors returned by the server, but errors detected and returned by HAProxy. + This is why the list of supported errors is limited to a small set. + + Code 200 is emitted in response to requests matching a "monitor-uri" rule. + + Note that both keyword return the HTTP 302 status code, which tells the + client to fetch the designated URL using the same HTTP method. This can be + quite problematic in case of non-GET methods such as POST, because the URL + sent to the client might not be allowed for something other than GET. To + work around this problem, please use "errorloc303" which send the HTTP 303 + status code, indicating to the client that the URL must be fetched with a GET + request. + + See also : "http-error", "errorfile", "errorloc303" + + +errorloc303 <code> <url> + Return an HTTP redirection to a URL instead of errors generated by HAProxy + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <code> is the HTTP status code. Currently, HAProxy is capable of + generating codes 200, 400, 401, 403, 404, 405, 407, 408, 410, + 413, 425, 429, 500, 501, 502, 503, and 504. + + <url> it is the exact contents of the "Location" header. It may contain + either a relative URI to an error page hosted on the same site, + or an absolute URI designating an error page on another site. + Special care should be given to relative URIs to avoid redirect + loops if the URI itself may generate the same error (e.g. 500). + + It is important to understand that this keyword is not meant to rewrite + errors returned by the server, but errors detected and returned by HAProxy. + This is why the list of supported errors is limited to a small set. + + Code 200 is emitted in response to requests matching a "monitor-uri" rule. + + Note that both keyword return the HTTP 303 status code, which tells the + client to fetch the designated URL using the same HTTP GET method. This + solves the usual problems associated with "errorloc" and the 302 code. It is + possible that some very old browsers designed before HTTP/1.1 do not support + it, but no such problem has been reported till now. + + See also : "http-error", "errorfile", "errorloc", "errorloc302" + + +email-alert from <emailaddr> + Declare the from email address to be used in both the envelope and header + of email alerts. This is the address that email alerts are sent from. + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | yes + + Arguments : + + <emailaddr> is the from email address to use when sending email alerts + + Also requires "email-alert mailers" and "email-alert to" to be set + and if so sending email alerts is enabled for the proxy. + + See also : "email-alert level", "email-alert mailers", + "email-alert myhostname", "email-alert to", section 3.6 about + mailers. + + +email-alert level <level> + Declare the maximum log level of messages for which email alerts will be + sent. This acts as a filter on the sending of email alerts. + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | yes + + Arguments : + + <level> One of the 8 syslog levels: + emerg alert crit err warning notice info debug + The above syslog levels are ordered from lowest to highest. + + By default level is alert + + Also requires "email-alert from", "email-alert mailers" and + "email-alert to" to be set and if so sending email alerts is enabled + for the proxy. + + Alerts are sent when : + + * An un-paused server is marked as down and <level> is alert or lower + * A paused server is marked as down and <level> is notice or lower + * A server is marked as up or enters the drain state and <level> + is notice or lower + * "option log-health-checks" is enabled, <level> is info or lower, + and a health check status update occurs + + See also : "email-alert from", "email-alert mailers", + "email-alert myhostname", "email-alert to", + section 3.6 about mailers. + + +email-alert mailers <mailersect> + Declare the mailers to be used when sending email alerts + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | yes + + Arguments : + + <mailersect> is the name of the mailers section to send email alerts. + + Also requires "email-alert from" and "email-alert to" to be set + and if so sending email alerts is enabled for the proxy. + + See also : "email-alert from", "email-alert level", "email-alert myhostname", + "email-alert to", section 3.6 about mailers. + + +email-alert myhostname <hostname> + Declare the to hostname address to be used when communicating with + mailers. + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | yes + + Arguments : + + <hostname> is the hostname to use when communicating with mailers + + By default the systems hostname is used. + + Also requires "email-alert from", "email-alert mailers" and + "email-alert to" to be set and if so sending email alerts is enabled + for the proxy. + + See also : "email-alert from", "email-alert level", "email-alert mailers", + "email-alert to", section 3.6 about mailers. + + +email-alert to <emailaddr> + Declare both the recipient address in the envelope and to address in the + header of email alerts. This is the address that email alerts are sent to. + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | yes + + Arguments : + + <emailaddr> is the to email address to use when sending email alerts + + Also requires "email-alert mailers" and "email-alert to" to be set + and if so sending email alerts is enabled for the proxy. + + See also : "email-alert from", "email-alert level", "email-alert mailers", + "email-alert myhostname", section 3.6 about mailers. + + +error-log-format <string> + Specifies the log format string to use in case of connection error on the frontend side. + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | no + + This directive specifies the log format string that will be used for logs + containing information related to errors, timeouts, retries redispatches or + HTTP status code 5xx. This format will in short be used for every log line + that would be concerned by the "log-separate-errors" option, including + connection errors described in section 8.2.5. + + If the directive is used in a defaults section, all subsequent frontends will + use the same log format. Please see section 8.2.4 which covers the log format + string in depth. + + "error-log-format" directive overrides previous "error-log-format" + directives. + + +force-persist { if | unless } <condition> + Declare a condition to force persistence on down servers + May be used in sections: defaults | frontend | listen | backend + no | no | yes | yes + + By default, requests are not dispatched to down servers. It is possible to + force this using "option persist", but it is unconditional and redispatches + to a valid server if "option redispatch" is set. That leaves with very little + possibilities to force some requests to reach a server which is artificially + marked down for maintenance operations. + + The "force-persist" statement allows one to declare various ACL-based + conditions which, when met, will cause a request to ignore the down status of + a server and still try to connect to it. That makes it possible to start a + server, still replying an error to the health checks, and run a specially + configured browser to test the service. Among the handy methods, one could + use a specific source IP address, or a specific cookie. The cookie also has + the advantage that it can easily be added/removed on the browser from a test + page. Once the service is validated, it is then possible to open the service + to the world by returning a valid response to health checks. + + The forced persistence is enabled when an "if" condition is met, or unless an + "unless" condition is met. The final redispatch is always disabled when this + is used. + + See also : "option redispatch", "ignore-persist", "persist", + and section 7 about ACL usage. + + +filter <name> [param*] + Add the filter <name> in the filter list attached to the proxy. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | yes + Arguments : + <name> is the name of the filter. Officially supported filters are + referenced in section 9. + + <param*> is a list of parameters accepted by the filter <name>. The + parsing of these parameters are the responsibility of the + filter. Please refer to the documentation of the corresponding + filter (section 9) for all details on the supported parameters. + + Multiple occurrences of the filter line can be used for the same proxy. The + same filter can be referenced many times if needed. + + Example: + listen + bind *:80 + + filter trace name BEFORE-HTTP-COMP + filter compression + filter trace name AFTER-HTTP-COMP + + compression algo gzip + compression offload + + server srv1 192.168.0.1:80 + + See also : section 9. + + +fullconn <conns> + Specify at what backend load the servers will reach their maxconn + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <conns> is the number of connections on the backend which will make the + servers use the maximal number of connections. + + When a server has a "maxconn" parameter specified, it means that its number + of concurrent connections will never go higher. Additionally, if it has a + "minconn" parameter, it indicates a dynamic limit following the backend's + load. The server will then always accept at least <minconn> connections, + never more than <maxconn>, and the limit will be on the ramp between both + values when the backend has less than <conns> concurrent connections. This + makes it possible to limit the load on the servers during normal loads, but + push it further for important loads without overloading the servers during + exceptional loads. + + Since it's hard to get this value right, HAProxy automatically sets it to + 10% of the sum of the maxconns of all frontends that may branch to this + backend (based on "use_backend" and "default_backend" rules). That way it's + safe to leave it unset. However, "use_backend" involving dynamic names are + not counted since there is no way to know if they could match or not. + + Example : + # The servers will accept between 100 and 1000 concurrent connections each + # and the maximum of 1000 will be reached when the backend reaches 10000 + # connections. + backend dynamic + fullconn 10000 + server srv1 dyn1:80 minconn 100 maxconn 1000 + server srv2 dyn2:80 minconn 100 maxconn 1000 + + See also : "maxconn", "server" + + +hash-balance-factor <factor> + Specify the balancing factor for bounded-load consistent hashing + May be used in sections : defaults | frontend | listen | backend + yes | no | no | yes + Arguments : + <factor> is the control for the maximum number of concurrent requests to + send to a server, expressed as a percentage of the average number + of concurrent requests across all of the active servers. + + Specifying a "hash-balance-factor" for a server with "hash-type consistent" + enables an algorithm that prevents any one server from getting too many + requests at once, even if some hash buckets receive many more requests than + others. Setting <factor> to 0 (the default) disables the feature. Otherwise, + <factor> is a percentage greater than 100. For example, if <factor> is 150, + then no server will be allowed to have a load more than 1.5 times the average. + If server weights are used, they will be respected. + + If the first-choice server is disqualified, the algorithm will choose another + server based on the request hash, until a server with additional capacity is + found. A higher <factor> allows more imbalance between the servers, while a + lower <factor> means that more servers will be checked on average, affecting + performance. Reasonable values are from 125 to 200. + + This setting is also used by "balance random" which internally relies on the + consistent hashing mechanism. + + See also : "balance" and "hash-type". + + +hash-type <method> <function> <modifier> + Specify a method to use for mapping hashes to servers + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <method> is the method used to select a server from the hash computed by + the <function> : + + map-based the hash table is a static array containing all alive servers. + The hashes will be very smooth, will consider weights, but + will be static in that weight changes while a server is up + will be ignored. This means that there will be no slow start. + Also, since a server is selected by its position in the array, + most mappings are changed when the server count changes. This + means that when a server goes up or down, or when a server is + added to a farm, most connections will be redistributed to + different servers. This can be inconvenient with caches for + instance. + + consistent the hash table is a tree filled with many occurrences of each + server. The hash key is looked up in the tree and the closest + server is chosen. This hash is dynamic, it supports changing + weights while the servers are up, so it is compatible with the + slow start feature. It has the advantage that when a server + goes up or down, only its associations are moved. When a + server is added to the farm, only a few part of the mappings + are redistributed, making it an ideal method for caches. + However, due to its principle, the distribution will never be + very smooth and it may sometimes be necessary to adjust a + server's weight or its ID to get a more balanced distribution. + In order to get the same distribution on multiple load + balancers, it is important that all servers have the exact + same IDs. Note: consistent hash uses sdbm and avalanche if no + hash function is specified. + + <function> is the hash function to be used : + + sdbm this function was created initially for sdbm (a public-domain + reimplementation of ndbm) database library. It was found to do + well in scrambling bits, causing better distribution of the keys + and fewer splits. It also happens to be a good general hashing + function with good distribution, unless the total server weight + is a multiple of 64, in which case applying the avalanche + modifier may help. + + djb2 this function was first proposed by Dan Bernstein many years ago + on comp.lang.c. Studies have shown that for certain workload this + function provides a better distribution than sdbm. It generally + works well with text-based inputs though it can perform extremely + poorly with numeric-only input or when the total server weight is + a multiple of 33, unless the avalanche modifier is also used. + + wt6 this function was designed for HAProxy while testing other + functions in the past. It is not as smooth as the other ones, but + is much less sensible to the input data set or to the number of + servers. It can make sense as an alternative to sdbm+avalanche or + djb2+avalanche for consistent hashing or when hashing on numeric + data such as a source IP address or a visitor identifier in a URL + parameter. + + crc32 this is the most common CRC32 implementation as used in Ethernet, + gzip, PNG, etc. It is slower than the other ones but may provide + a better distribution or less predictable results especially when + used on strings. + + <modifier> indicates an optional method applied after hashing the key : + + avalanche This directive indicates that the result from the hash + function above should not be used in its raw form but that + a 4-byte full avalanche hash must be applied first. The + purpose of this step is to mix the resulting bits from the + previous hash in order to avoid any undesired effect when + the input contains some limited values or when the number of + servers is a multiple of one of the hash's components (64 + for SDBM, 33 for DJB2). Enabling avalanche tends to make the + result less predictable, but it's also not as smooth as when + using the original function. Some testing might be needed + with some workloads. This hash is one of the many proposed + by Bob Jenkins. + + The default hash type is "map-based" and is recommended for most usages. The + default function is "sdbm", the selection of a function should be based on + the range of the values being hashed. + + See also : "balance", "hash-balance-factor", "server" + + +http-after-response <action> <options...> [ { if | unless } <condition> ] + Access control for all Layer 7 responses (server, applet/service and internal + ones). + + May be used in sections: defaults | frontend | listen | backend + yes(!) | yes | yes | yes + + The http-after-response statement defines a set of rules which apply to layer + 7 processing. The rules are evaluated in their declaration order when they + are met in a frontend, listen or backend section. Any rule may optionally be + followed by an ACL-based condition, in which case it will only be evaluated + if the condition is true. Since these rules apply on responses, the backend + rules are applied first, followed by the frontend's rules. + + Unlike http-response rules, these ones are applied on all responses, the + server ones but also to all responses generated by HAProxy. These rules are + evaluated at the end of the responses analysis, before the data forwarding. + + The first keyword is the rule's action. Several types of actions are + supported: + - add-header <name> <fmt> + - allow + - capture <sample> id <id> + - del-header <name> [ -m <meth> ] + - replace-header <name> <regex-match> <replace-fmt> + - replace-value <name> <regex-match> <replace-fmt> + - set-header <name> <fmt> + - set-status <status> [reason <str>] + - set-var(<var-name>[,<cond>...]) <expr> + - set-var-fmt(<var-name>[,<cond>...]) <fmt> + - strict-mode { on | off } + - unset-var(<var-name>) + + The supported actions are described below. + + There is no limit to the number of http-after-response statements per + instance. + + This directive is only available from named defaults sections, not anonymous + ones. Rules defined in the defaults section are evaluated before ones in the + associated proxy section. To avoid ambiguities, in this case the same + defaults section cannot be used by proxies with the frontend capability and + by proxies with the backend capability. It means a listen section cannot use + a defaults section defining such rules. + + Note: Errors emitted in early stage of the request parsing are handled by the + multiplexer at a lower level, before any http analysis. Thus no + http-after-response ruleset is evaluated on these errors. + + Example: + http-after-response set-header Strict-Transport-Security "max-age=31536000" + http-after-response set-header Cache-Control "no-store,no-cache,private" + http-after-response set-header Pragma "no-cache" + +http-after-response add-header <name> <fmt> [ { if | unless } <condition> ] + + This appends an HTTP header field whose name is specified in <name> and whose + value is defined by <fmt>. Please refer to "http-request add-header" for a + complete description. + +http-after-response allow [ { if | unless } <condition> ] + + This stops the evaluation of the rules and lets the response pass the check. + No further "http-after-response" rules are evaluated for the current section. + +http-after-response capture <sample> id <id> [ { if | unless } <condition> ] + + This captures sample expression <sample> from the response buffer, and + converts it to a string. Please refer to "http-response capture" for a + complete description. + +http-after-response del-header <name> [ -m <meth> ] [ { if | unless } <condition> ] + + This removes all HTTP header fields whose name is specified in <name>. Please + refer to "http-request del-header" for a complete description. + +http-after-response replace-header <name> <regex-match> <replace-fmt> + [ { if | unless } <condition> ] + + This works like "http-response replace-header". + + Example: + http-after-response replace-header Set-Cookie (C=[^;]*);(.*) \1;ip=%bi;\2 + + # applied to: + Set-Cookie: C=1; expires=Tue, 14-Jun-2016 01:40:45 GMT + + # outputs: + Set-Cookie: C=1;ip=192.168.1.20; expires=Tue, 14-Jun-2016 01:40:45 GMT + + # assuming the backend IP is 192.168.1.20. + +http-after-response replace-value <name> <regex-match> <replace-fmt> + [ { if | unless } <condition> ] + + This works like "http-response replace-value". + + Example: + http-after-response replace-value Cache-control ^public$ private + + # applied to: + Cache-Control: max-age=3600, public + + # outputs: + Cache-Control: max-age=3600, private + +http-after-response set-header <name> <fmt> [ { if | unless } <condition> ] + + This does the same as "http-after-response add-header" except that the header + name is first removed if it existed. This is useful when passing security + information to the server, where the header must not be manipulated by + external users. + +http-after-response set-status <status> [reason <str>] + [ { if | unless } <condition> ] + + This replaces the response status code with <status> which must be an integer + between 100 and 999. Please refer to "http-response set-status" for a complete + description. + +http-after-response set-var(<var-name>[,<cond>...]) <expr> [ { if | unless } <condition> ] +http-after-response set-var-fmt(<var-name>[,<cond>...]) <fmt> [ { if | unless } <condition> ] + + This is used to set the contents of a variable. The variable is declared + inline. Please refer to "http-request set-var" and "http-request set-var-fmt" + for a complete description. + +http-after-response strict-mode { on | off } [ { if | unless } <condition> ] + + This enables or disables the strict rewriting mode for following + rules. Please refer to "http-request strict-mode" for a complete description. + +http-after-response unset-var(<var-name>) [ { if | unless } <condition> ] + + This is used to unset a variable. See "http-request set-var" for details + about <var-name>. + + +http-check comment <string> + Defines a comment for the following the http-check rule, reported in logs if + it fails. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + <string> is the comment message to add in logs if the following http-check + rule fails. + + It only works for connect, send and expect rules. It is useful to make + user-friendly error reporting. + + See also : "option httpchk", "http-check connect", "http-check send" and + "http-check expect". + + +http-check connect [default] [port <expr>] [addr <ip>] [send-proxy] + [via-socks4] [ssl] [sni <sni>] [alpn <alpn>] [linger] + [proto <name>] [comment <msg>] + Opens a new connection to perform an HTTP health check + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + comment <msg> defines a message to report if the rule evaluation fails. + + default Use default options of the server line to do the health + checks. The server options are used only if not redefined. + + port <expr> if not set, check port or server port is used. + It tells HAProxy where to open the connection to. + <port> must be a valid TCP port source integer, from 1 to + 65535 or an sample-fetch expression. + + addr <ip> defines the IP address to do the health check. + + send-proxy send a PROXY protocol string + + via-socks4 enables outgoing health checks using upstream socks4 proxy. + + ssl opens a ciphered connection + + sni <sni> specifies the SNI to use to do health checks over SSL. + + alpn <alpn> defines which protocols to advertise with ALPN. The protocol + list consists in a comma-delimited list of protocol names, + for instance: "h2,http/1.1". If it is not set, the server ALPN + is used. + + proto <name> forces the multiplexer's protocol to use for this connection. + It must be an HTTP mux protocol and it must be usable on the + backend side. The list of available protocols is reported in + haproxy -vv. + + linger cleanly close the connection instead of using a single RST. + + Just like tcp-check health checks, it is possible to configure the connection + to use to perform HTTP health check. This directive should also be used to + describe a scenario involving several request/response exchanges, possibly on + different ports or with different servers. + + When there are no TCP port configured on the server line neither server port + directive, then the first step of the http-check sequence must be to specify + the port with a "http-check connect". + + In an http-check ruleset a 'connect' is required, it is also mandatory to start + the ruleset with a 'connect' rule. Purpose is to ensure admin know what they + do. + + When a connect must start the ruleset, if may still be preceded by set-var, + unset-var or comment rules. + + Examples : + # check HTTP and HTTPs services on a server. + # first open port 80 thanks to server line port directive, then + # tcp-check opens port 443, ciphered and run a request on it: + option httpchk + + http-check connect + http-check send meth GET uri / ver HTTP/1.1 hdr host haproxy.1wt.eu + http-check expect status 200-399 + http-check connect port 443 ssl sni haproxy.1wt.eu + http-check send meth GET uri / ver HTTP/1.1 hdr host haproxy.1wt.eu + http-check expect status 200-399 + + server www 10.0.0.1 check port 80 + + See also : "option httpchk", "http-check send", "http-check expect" + + +http-check disable-on-404 + Enable a maintenance mode upon HTTP/404 response to health-checks + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + When this option is set, a server which returns an HTTP code 404 will be + excluded from further load-balancing, but will still receive persistent + connections. This provides a very convenient method for Web administrators + to perform a graceful shutdown of their servers. It is also important to note + that a server which is detected as failed while it was in this mode will not + generate an alert, just a notice. If the server responds 2xx or 3xx again, it + will immediately be reinserted into the farm. The status on the stats page + reports "NOLB" for a server in this mode. It is important to note that this + option only works in conjunction with the "httpchk" option. If this option + is used with "http-check expect", then it has precedence over it so that 404 + responses will still be considered as soft-stop. Note also that a stopped + server will stay stopped even if it replies 404s. This option is only + evaluated for running servers. + + See also : "option httpchk" and "http-check expect". + + +http-check expect [min-recv <int>] [comment <msg>] + [ok-status <st>] [error-status <st>] [tout-status <st>] + [on-success <fmt>] [on-error <fmt>] [status-code <expr>] + [!] <match> <pattern> + Make HTTP health checks consider response contents or specific status codes + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + comment <msg> defines a message to report if the rule evaluation fails. + + min-recv is optional and can define the minimum amount of data required to + evaluate the current expect rule. If the number of received bytes + is under this limit, the check will wait for more data. This + option can be used to resolve some ambiguous matching rules or to + avoid executing costly regex matches on content known to be still + incomplete. If an exact string is used, the minimum between the + string length and this parameter is used. This parameter is + ignored if it is set to -1. If the expect rule does not match, + the check will wait for more data. If set to 0, the evaluation + result is always conclusive. + + ok-status <st> is optional and can be used to set the check status if + the expect rule is successfully evaluated and if it is + the last rule in the tcp-check ruleset. "L7OK", "L7OKC", + "L6OK" and "L4OK" are supported : + - L7OK : check passed on layer 7 + - L7OKC : check conditionally passed on layer 7, set + server to NOLB state. + - L6OK : check passed on layer 6 + - L4OK : check passed on layer 4 + By default "L7OK" is used. + + error-status <st> is optional and can be used to set the check status if + an error occurred during the expect rule evaluation. + "L7OKC", "L7RSP", "L7STS", "L6RSP" and "L4CON" are + supported : + - L7OKC : check conditionally passed on layer 7, set + server to NOLB state. + - L7RSP : layer 7 invalid response - protocol error + - L7STS : layer 7 response error, for example HTTP 5xx + - L6RSP : layer 6 invalid response - protocol error + - L4CON : layer 1-4 connection problem + By default "L7RSP" is used. + + tout-status <st> is optional and can be used to set the check status if + a timeout occurred during the expect rule evaluation. + "L7TOUT", "L6TOUT", and "L4TOUT" are supported : + - L7TOUT : layer 7 (HTTP/SMTP) timeout + - L6TOUT : layer 6 (SSL) timeout + - L4TOUT : layer 1-4 timeout + By default "L7TOUT" is used. + + on-success <fmt> is optional and can be used to customize the + informational message reported in logs if the expect + rule is successfully evaluated and if it is the last rule + in the tcp-check ruleset. <fmt> is a log-format string. + + on-error <fmt> is optional and can be used to customize the + informational message reported in logs if an error + occurred during the expect rule evaluation. <fmt> is a + log-format string. + + <match> is a keyword indicating how to look for a specific pattern in the + response. The keyword may be one of "status", "rstatus", "hdr", + "fhdr", "string", or "rstring". The keyword may be preceded by an + exclamation mark ("!") to negate the match. Spaces are allowed + between the exclamation mark and the keyword. See below for more + details on the supported keywords. + + <pattern> is the pattern to look for. It may be a string, a regular + expression or a more complex pattern with several arguments. If + the string pattern contains spaces, they must be escaped with the + usual backslash ('\'). + + By default, "option httpchk" considers that response statuses 2xx and 3xx + are valid, and that others are invalid. When "http-check expect" is used, + it defines what is considered valid or invalid. Only one "http-check" + statement is supported in a backend. If a server fails to respond or times + out, the check obviously fails. The available matches are : + + status <codes> : test the status codes found parsing <codes> string. it + must be a comma-separated list of status codes or range + codes. A health check response will be considered as + valid if the response's status code matches any status + code or is inside any range of the list. If the "status" + keyword is prefixed with "!", then the response will be + considered invalid if the status code matches. + + rstatus <regex> : test a regular expression for the HTTP status code. + A health check response will be considered valid if the + response's status code matches the expression. If the + "rstatus" keyword is prefixed with "!", then the response + will be considered invalid if the status code matches. + This is mostly used to check for multiple codes. + + hdr { name | name-lf } [ -m <meth> ] <name> + [ { value | value-lf } [ -m <meth> ] <value> : + test the specified header pattern on the HTTP response + headers. The name pattern is mandatory but the value + pattern is optional. If not specified, only the header + presence is verified. <meth> is the matching method, + applied on the header name or the header value. Supported + matching methods are "str" (exact match), "beg" (prefix + match), "end" (suffix match), "sub" (substring match) or + "reg" (regex match). If not specified, exact matching + method is used. If the "name-lf" parameter is used, + <name> is evaluated as a log-format string. If "value-lf" + parameter is used, <value> is evaluated as a log-format + string. These parameters cannot be used with the regex + matching method. Finally, the header value is considered + as comma-separated list. Note that matchings are case + insensitive on the header names. + + fhdr { name | name-lf } [ -m <meth> ] <name> + [ { value | value-lf } [ -m <meth> ] <value> : + test the specified full header pattern on the HTTP + response headers. It does exactly the same than "hdr" + keyword, except the full header value is tested, commas + are not considered as delimiters. + + string <string> : test the exact string match in the HTTP response body. + A health check response will be considered valid if the + response's body contains this exact string. If the + "string" keyword is prefixed with "!", then the response + will be considered invalid if the body contains this + string. This can be used to look for a mandatory word at + the end of a dynamic page, or to detect a failure when a + specific error appears on the check page (e.g. a stack + trace). + + rstring <regex> : test a regular expression on the HTTP response body. + A health check response will be considered valid if the + response's body matches this expression. If the "rstring" + keyword is prefixed with "!", then the response will be + considered invalid if the body matches the expression. + This can be used to look for a mandatory word at the end + of a dynamic page, or to detect a failure when a specific + error appears on the check page (e.g. a stack trace). + + string-lf <fmt> : test a log-format string match in the HTTP response body. + A health check response will be considered valid if the + response's body contains the string resulting of the + evaluation of <fmt>, which follows the log-format rules. + If prefixed with "!", then the response will be + considered invalid if the body contains the string. + + It is important to note that the responses will be limited to a certain size + defined by the global "tune.bufsize" option, which defaults to 16384 bytes. + Thus, too large responses may not contain the mandatory pattern when using + "string" or "rstring". If a large response is absolutely required, it is + possible to change the default max size by setting the global variable. + However, it is worth keeping in mind that parsing very large responses can + waste some CPU cycles, especially when regular expressions are used, and that + it is always better to focus the checks on smaller resources. + + In an http-check ruleset, the last expect rule may be implicit. If no expect + rule is specified after the last "http-check send", an implicit expect rule + is defined to match on 2xx or 3xx status codes. It means this rule is also + defined if there is no "http-check" rule at all, when only "option httpchk" + is set. + + Last, if "http-check expect" is combined with "http-check disable-on-404", + then this last one has precedence when the server responds with 404. + + Examples : + # only accept status 200 as valid + http-check expect status 200,201,300-310 + + # be sure a sessid coookie is set + http-check expect header name "set-cookie" value -m beg "sessid=" + + # consider SQL errors as errors + http-check expect ! string SQL\ Error + + # consider status 5xx only as errors + http-check expect ! rstatus ^5 + + # check that we have a correct hexadecimal tag before /html + http-check expect rstring <!--tag:[0-9a-f]*--></html> + + See also : "option httpchk", "http-check connect", "http-check disable-on-404" + and "http-check send". + + +http-check send [meth <method>] [{ uri <uri> | uri-lf <fmt> }>] [ver <version>] + [hdr <name> <fmt>]* [{ body <string> | body-lf <fmt> }] + [comment <msg>] + Add a possible list of headers and/or a body to the request sent during HTTP + health checks. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + comment <msg> defines a message to report if the rule evaluation fails. + + meth <method> is the optional HTTP method used with the requests. When not + set, the "OPTIONS" method is used, as it generally requires + low server processing and is easy to filter out from the + logs. Any method may be used, though it is not recommended + to invent non-standard ones. + + uri <uri> is optional and set the URI referenced in the HTTP requests + to the string <uri>. It defaults to "/" which is accessible + by default on almost any server, but may be changed to any + other URI. Query strings are permitted. + + uri-lf <fmt> is optional and set the URI referenced in the HTTP requests + using the log-format string <fmt>. It defaults to "/" which + is accessible by default on almost any server, but may be + changed to any other URI. Query strings are permitted. + + ver <version> is the optional HTTP version string. It defaults to + "HTTP/1.0" but some servers might behave incorrectly in HTTP + 1.0, so turning it to HTTP/1.1 may sometimes help. Note that + the Host field is mandatory in HTTP/1.1, use "hdr" argument + to add it. + + hdr <name> <fmt> adds the HTTP header field whose name is specified in + <name> and whose value is defined by <fmt>, which follows + to the log-format rules. + + body <string> add the body defined by <string> to the request sent during + HTTP health checks. If defined, the "Content-Length" header + is thus automatically added to the request. + + body-lf <fmt> add the body defined by the log-format string <fmt> to the + request sent during HTTP health checks. If defined, the + "Content-Length" header is thus automatically added to the + request. + + In addition to the request line defined by the "option httpchk" directive, + this one is the valid way to add some headers and optionally a body to the + request sent during HTTP health checks. If a body is defined, the associate + "Content-Length" header is automatically added. Thus, this header or + "Transfer-encoding" header should not be present in the request provided by + "http-check send". If so, it will be ignored. The old trick consisting to add + headers after the version string on the "option httpchk" line is now + deprecated. + + Also "http-check send" doesn't support HTTP keep-alive. Keep in mind that it + will automatically append a "Connection: close" header, unless a Connection + header has already already been configured via a hdr entry. + + Note that the Host header and the request authority, when both defined, are + automatically synchronized. It means when the HTTP request is sent, when a + Host is inserted in the request, the request authority is accordingly + updated. Thus, don't be surprised if the Host header value overwrites the + configured request authority. + + Note also for now, no Host header is automatically added in HTTP/1.1 or above + requests. You should add it explicitly. + + See also : "option httpchk", "http-check send-state" and "http-check expect". + + +http-check send-state + Enable emission of a state header with HTTP health checks + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + When this option is set, HAProxy will systematically send a special header + "X-Haproxy-Server-State" with a list of parameters indicating to each server + how they are seen by HAProxy. This can be used for instance when a server is + manipulated without access to HAProxy and the operator needs to know whether + HAProxy still sees it up or not, or if the server is the last one in a farm. + + The header is composed of fields delimited by semi-colons, the first of which + is a word ("UP", "DOWN", "NOLB"), possibly followed by a number of valid + checks on the total number before transition, just as appears in the stats + interface. Next headers are in the form "<variable>=<value>", indicating in + no specific order some values available in the stats interface : + - a variable "address", containing the address of the backend server. + This corresponds to the <address> field in the server declaration. For + unix domain sockets, it will read "unix". + + - a variable "port", containing the port of the backend server. This + corresponds to the <port> field in the server declaration. For unix + domain sockets, it will read "unix". + + - a variable "name", containing the name of the backend followed by a slash + ("/") then the name of the server. This can be used when a server is + checked in multiple backends. + + - a variable "node" containing the name of the HAProxy node, as set in the + global "node" variable, otherwise the system's hostname if unspecified. + + - a variable "weight" indicating the weight of the server, a slash ("/") + and the total weight of the farm (just counting usable servers). This + helps to know if other servers are available to handle the load when this + one fails. + + - a variable "scur" indicating the current number of concurrent connections + on the server, followed by a slash ("/") then the total number of + connections on all servers of the same backend. + + - a variable "qcur" indicating the current number of requests in the + server's queue. + + Example of a header received by the application server : + >>> X-Haproxy-Server-State: UP 2/3; name=bck/srv2; node=lb1; weight=1/2; \ + scur=13/22; qcur=0 + + See also : "option httpchk", "http-check disable-on-404" and + "http-check send". + + +http-check set-var(<var-name>[,<cond>...]) <expr> +http-check set-var-fmt(<var-name>[,<cond>...]) <fmt> + This operation sets the content of a variable. The variable is declared inline. + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + <var-name> The name of the variable starts with an indication about its + scope. The scopes allowed for http-check are: + "proc" : the variable is shared with the whole process. + "sess" : the variable is shared with the tcp-check session. + "check": the variable is declared for the lifetime of the tcp-check. + This prefix is followed by a name. The separator is a '.'. + The name may only contain characters 'a-z', 'A-Z', '0-9', '.', + and '-'. + + <cond> A set of conditions that must all be true for the variable to + actually be set (such as "ifnotempty", "ifgt" ...). See the + set-var converter's description for a full list of possible + conditions. + + <expr> Is a sample-fetch expression potentially followed by converters. + + <fmt> This is the value expressed using log-format rules (see Custom + Log Format in section 8.2.4). + + Examples : + http-check set-var(check.port) int(1234) + http-check set-var-fmt(check.port) "name=%H" + + +http-check unset-var(<var-name>) + Free a reference to a variable within its scope. + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + <var-name> The name of the variable starts with an indication about its + scope. The scopes allowed for http-check are: + "proc" : the variable is shared with the whole process. + "sess" : the variable is shared with the tcp-check session. + "check": the variable is declared for the lifetime of the tcp-check. + This prefix is followed by a name. The separator is a '.'. + The name may only contain characters 'a-z', 'A-Z', '0-9', '.', + and '-'. + + Examples : + http-check unset-var(check.port) + + +http-error status <code> [content-type <type>] + [ { default-errorfiles | errorfile <file> | errorfiles <name> | + file <file> | lf-file <file> | string <str> | lf-string <fmt> } ] + [ hdr <name> <fmt> ]* + Defines a custom error message to use instead of errors generated by HAProxy. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + status <code> is the HTTP status code. It must be specified. + Currently, HAProxy is capable of generating codes + 200, 400, 401, 403, 404, 405, 407, 408, 410, 413, 425, + 429, 500, 501, 502, 503, and 504. + + content-type <type> is the response content type, for instance + "text/plain". This parameter is ignored and should be + omitted when an errorfile is configured or when the + payload is empty. Otherwise, it must be defined. + + default-errorfiles Reset the previously defined error message for current + proxy for the status <code>. If used on a backend, the + frontend error message is used, if defined. If used on + a frontend, the default error message is used. + + errorfile <file> designates a file containing the full HTTP response. + It is recommended to follow the common practice of + appending ".http" to the filename so that people do + not confuse the response with HTML error pages, and to + use absolute paths, since files are read before any + chroot is performed. + + errorfiles <name> designates the http-errors section to use to import + the error message with the status code <code>. If no + such message is found, the proxy's error messages are + considered. + + file <file> specifies the file to use as response payload. If the + file is not empty, its content-type must be set as + argument to "content-type", otherwise, any + "content-type" argument is ignored. <file> is + considered as a raw string. + + string <str> specifies the raw string to use as response payload. + The content-type must always be set as argument to + "content-type". + + lf-file <file> specifies the file to use as response payload. If the + file is not empty, its content-type must be set as + argument to "content-type", otherwise, any + "content-type" argument is ignored. <file> is + evaluated as a log-format string. + + lf-string <str> specifies the log-format string to use as response + payload. The content-type must always be set as + argument to "content-type". + + hdr <name> <fmt> adds to the response the HTTP header field whose name + is specified in <name> and whose value is defined by + <fmt>, which follows to the log-format rules. + This parameter is ignored if an errorfile is used. + + This directive may be used instead of "errorfile", to define a custom error + message. As "errorfile" directive, it is used for errors detected and + returned by HAProxy. If an errorfile is defined, it is parsed when HAProxy + starts and must be valid according to the HTTP standards. The generated + response must not exceed the configured buffer size (BUFFSIZE), otherwise an + internal error will be returned. Finally, if you consider to use some + http-after-response rules to rewrite these errors, the reserved buffer space + should be available (see "tune.maxrewrite"). + + The files are read at the same time as the configuration and kept in memory. + For this reason, the errors continue to be returned even when the process is + chrooted, and no file change is considered while the process is running. + + Note: 400/408/500 errors emitted in early stage of the request parsing are + handled by the multiplexer at a lower level. No custom formatting is + supported at this level. Thus only static error messages, defined with + "errorfile" directive, are supported. However, this limitation only + exists during the request headers parsing or between two transactions. + + See also : "errorfile", "errorfiles", "errorloc", "errorloc302", + "errorloc303" and section 3.8 about http-errors. + + +http-request <action> [options...] [ { if | unless } <condition> ] + Access control for Layer 7 requests + + May be used in sections: defaults | frontend | listen | backend + yes(!) | yes | yes | yes + + The http-request statement defines a set of rules which apply to layer 7 + processing. The rules are evaluated in their declaration order when they are + met in a frontend, listen or backend section. Any rule may optionally be + followed by an ACL-based condition, in which case it will only be evaluated + if the condition is true. + + The first keyword is the rule's action. Several types of actions are + supported: + - add-acl(<file-name>) <key fmt> + - add-header <name> <fmt> + - allow + - auth [realm <realm>] + - cache-use <name> + - capture <sample> [ len <length> | id <id> ] + - del-acl(<file-name>) <key fmt> + - del-header <name> [ -m <meth> ] + - del-map(<file-name>) <key fmt> + - deny [ { status | deny_status } <code>] ... + - disable-l7-retry + - do-resolve(<var>,<resolvers>,[ipv4,ipv6]) <expr> + - early-hint <name> <fmt> + - normalize-uri <normalizer> + - redirect <rule> + - reject + - replace-header <name> <match-regex> <replace-fmt> + - replace-path <match-regex> <replace-fmt> + - replace-pathq <match-regex> <replace-fmt> + - replace-uri <match-regex> <replace-fmt> + - replace-value <name> <match-regex> <replace-fmt> + - return [status <code>] [content-type <type>] ... + - sc-inc-gpc(<idx>,<sc-id>) + - sc-inc-gpc0(<sc-id>) + - sc-inc-gpc1(<sc-id>) + - sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + - sc-set-gpt0(<sc-id>) { <int> | <expr> } + - set-dst <expr> + - set-dst-port <expr> + - set-header <name> <fmt> + - set-log-level <level> + - set-map(<file-name>) <key fmt> <value fmt> + - set-mark <mark> + - set-method <fmt> + - set-nice <nice> + - set-path <fmt> + - set-pathq <fmt> + - set-priority-class <expr> + - set-priority-offset <expr> + - set-query <fmt> + - set-src <expr> + - set-src-port <expr> + - set-timeout { server | tunnel } { <timeout> | <expr> } + - set-tos <tos> + - set-uri <fmt> + - set-var(<var-name>[,<cond>...]) <expr> + - set-var-fmt(<var-name>[,<cond>...]) <fmt> + - send-spoe-group <engine-name> <group-name> + - silent-drop + - strict-mode { on | off } + - tarpit [ { status | deny_status } <code>] ... + - track-sc0 <key> [table <table>] + - track-sc1 <key> [table <table>] + - track-sc2 <key> [table <table>] + - unset-var(<var-name>) + - use-service <service-name> + - wait-for-body time <time> [ at-least <bytes> ] + - wait-for-handshake + - cache-use <name> + + The supported actions are described below. + + There is no limit to the number of http-request statements per instance. + + This directive is only available from named defaults sections, not anonymous + ones. Rules defined in the defaults section are evaluated before ones in the + associated proxy section. To avoid ambiguities, in this case the same + defaults section cannot be used by proxies with the frontend capability and + by proxies with the backend capability. It means a listen section cannot use + a defaults section defining such rules. + + Example: + acl nagios src 192.168.129.3 + acl local_net src 192.168.0.0/16 + acl auth_ok http_auth(L1) + + http-request allow if nagios + http-request allow if local_net auth_ok + http-request auth realm Gimme if local_net auth_ok + http-request deny + + Example: + acl key req.hdr(X-Add-Acl-Key) -m found + acl add path /addacl + acl del path /delacl + + acl myhost hdr(Host) -f myhost.lst + + http-request add-acl(myhost.lst) %[req.hdr(X-Add-Acl-Key)] if key add + http-request del-acl(myhost.lst) %[req.hdr(X-Add-Acl-Key)] if key del + + Example: + acl value req.hdr(X-Value) -m found + acl setmap path /setmap + acl delmap path /delmap + + use_backend bk_appli if { hdr(Host),map_str(map.lst) -m found } + + http-request set-map(map.lst) %[src] %[req.hdr(X-Value)] if setmap value + http-request del-map(map.lst) %[src] if delmap + + See also : "stats http-request", section 3.4 about userlists and section 7 + about ACL usage. + +http-request add-acl(<file-name>) <key fmt> [ { if | unless } <condition> ] + + This is used to add a new entry into an ACL. The ACL must be loaded from a + file (even a dummy empty file). The file name of the ACL to be updated is + passed between parentheses. It takes one argument: <key fmt>, which follows + log-format rules, to collect content of the new entry. It performs a lookup + in the ACL before insertion, to avoid duplicated (or more) values. This + lookup is done by a linear search and can be expensive with large lists! + It is the equivalent of the "add acl" command from the stats socket, but can + be triggered by an HTTP request. + +http-request add-header <name> <fmt> [ { if | unless } <condition> ] + + This appends an HTTP header field whose name is specified in <name> and + whose value is defined by <fmt> which follows the log-format rules (see + Custom Log Format in section 8.2.4). This is particularly useful to pass + connection-specific information to the server (e.g. the client's SSL + certificate), or to combine several headers into one. This rule is not + final, so it is possible to add other similar rules. Note that header + addition is performed immediately, so one rule might reuse the resulting + header from a previous rule. + +http-request allow [ { if | unless } <condition> ] + + This stops the evaluation of the rules and lets the request pass the check. + No further "http-request" rules are evaluated for the current section. + +http-request auth [realm <realm>] [ { if | unless } <condition> ] + + This stops the evaluation of the rules and immediately responds with an + HTTP 401 or 407 error code to invite the user to present a valid user name + and password. No further "http-request" rules are evaluated. An optional + "realm" parameter is supported, it sets the authentication realm that is + returned with the response (typically the application's name). + + The corresponding proxy's error message is used. It may be customized using + an "errorfile" or an "http-error" directive. For 401 responses, all + occurrences of the WWW-Authenticate header are removed and replaced by a new + one with a basic authentication challenge for realm "<realm>". For 407 + responses, the same is done on the Proxy-Authenticate header. If the error + message must not be altered, consider to use "http-request return" rule + instead. + + Example: + acl auth_ok http_auth_group(L1) G1 + http-request auth unless auth_ok + +http-request cache-use <name> [ { if | unless } <condition> ] + + See section 6.2 about cache setup. + +http-request capture <sample> [ len <length> | id <id> ] + [ { if | unless } <condition> ] + + This captures sample expression <sample> from the request buffer, and + converts it to a string of at most <len> characters. The resulting string is + stored into the next request "capture" slot, so it will possibly appear next + to some captured HTTP headers. It will then automatically appear in the logs, + and it will be possible to extract it using sample fetch rules to feed it + into headers or anything. The length should be limited given that this size + will be allocated for each capture during the whole session life. + Please check section 7.3 (Fetching samples) and "capture request header" for + more information. + + If the keyword "id" is used instead of "len", the action tries to store the + captured string in a previously declared capture slot. This is useful to run + captures in backends. The slot id can be declared by a previous directive + "http-request capture" or with the "declare capture" keyword. + + When using this action in a backend, double check that the relevant + frontend(s) have the required capture slots otherwise, this rule will be + ignored at run time. This can't be detected at configuration parsing time + due to HAProxy's ability to dynamically resolve backend name at runtime. + +http-request del-acl(<file-name>) <key fmt> [ { if | unless } <condition> ] + + This is used to delete an entry from an ACL. The ACL must be loaded from a + file (even a dummy empty file). The file name of the ACL to be updated is + passed between parentheses. It takes one argument: <key fmt>, which follows + log-format rules, to collect content of the entry to delete. + It is the equivalent of the "del acl" command from the stats socket, but can + be triggered by an HTTP request. + +http-request del-header <name> [ -m <meth> ] [ { if | unless } <condition> ] + + This removes all HTTP header fields whose name is specified in <name>. <meth> + is the matching method, applied on the header name. Supported matching methods + are "str" (exact match), "beg" (prefix match), "end" (suffix match), "sub" + (substring match) and "reg" (regex match). If not specified, exact matching + method is used. + +http-request del-map(<file-name>) <key fmt> [ { if | unless } <condition> ] + + This is used to delete an entry from a MAP. The MAP must be loaded from a + file (even a dummy empty file). The file name of the MAP to be updated is + passed between parentheses. It takes one argument: <key fmt>, which follows + log-format rules, to collect content of the entry to delete. + It takes one argument: "file name" It is the equivalent of the "del map" + command from the stats socket, but can be triggered by an HTTP request. + +http-request deny [deny_status <status>] [ { if | unless } <condition> ] +http-request deny [ { status | deny_status } <code>] [content-type <type>] + [ { default-errorfiles | errorfile <file> | errorfiles <name> | + file <file> | lf-file <file> | string <str> | lf-string <fmt> } ] + [ hdr <name> <fmt> ]* + [ { if | unless } <condition> ] + + This stops the evaluation of the rules and immediately rejects the request. + By default an HTTP 403 error is returned. But the response may be customized + using same syntax than "http-request return" rules. Thus, see "http-request + return" for details. For compatibility purpose, when no argument is defined, + or only "deny_status", the argument "default-errorfiles" is implied. It means + "http-request deny [deny_status <status>]" is an alias of + "http-request deny [status <status>] default-errorfiles". + No further "http-request" rules are evaluated. + See also "http-request return". + +http-request disable-l7-retry [ { if | unless } <condition> ] + This disables any attempt to retry the request if it fails for any other + reason than a connection failure. This can be useful for example to make + sure POST requests aren't retried on failure. + +http-request do-resolve(<var>,<resolvers>,[ipv4,ipv6]) <expr> + [ { if | unless } <condition> ] + + This action performs a DNS resolution of the output of <expr> and stores + the result in the variable <var>. It uses the DNS resolvers section + pointed by <resolvers>. + It is possible to choose a resolution preference using the optional + arguments 'ipv4' or 'ipv6'. + When performing the DNS resolution, the client side connection is on + pause waiting till the end of the resolution. + If an IP address can be found, it is stored into <var>. If any kind of + error occurs, then <var> is not set. + One can use this action to discover a server IP address at run time and + based on information found in the request (IE a Host header). + If this action is used to find the server's IP address (using the + "set-dst" action), then the server IP address in the backend must be set + to 0.0.0.0. The do-resolve action takes an host-only parameter, any port must + be removed from the string. + + Example: + resolvers mydns + nameserver local 127.0.0.53:53 + nameserver google 8.8.8.8:53 + timeout retry 1s + hold valid 10s + hold nx 3s + hold other 3s + hold obsolete 0s + accepted_payload_size 8192 + + frontend fe + bind 10.42.0.1:80 + http-request do-resolve(txn.myip,mydns,ipv4) hdr(Host),host_only + http-request capture var(txn.myip) len 40 + + # return 503 when the variable is not set, + # which mean DNS resolution error + use_backend b_503 unless { var(txn.myip) -m found } + + default_backend be + + backend b_503 + # dummy backend used to return 503. + # one can use the errorfile directive to send a nice + # 503 error page to end users + + backend be + # rule to prevent HAProxy from reconnecting to services + # on the local network (forged DNS name used to scan the network) + http-request deny if { var(txn.myip) -m ip 127.0.0.0/8 10.0.0.0/8 } + http-request set-dst var(txn.myip) + server clear 0.0.0.0:0 + + NOTE: Don't forget to set the "protection" rules to ensure HAProxy won't + be used to scan the network or worst won't loop over itself... + +http-request early-hint <name> <fmt> [ { if | unless } <condition> ] + + This is used to build an HTTP 103 Early Hints response prior to any other one. + This appends an HTTP header field to this response whose name is specified in + <name> and whose value is defined by <fmt> which follows the log-format rules + (see Custom Log Format in section 8.2.4). This is particularly useful to pass + to the client some Link headers to preload resources required to render the + HTML documents. + + See RFC 8297 for more information. + +http-request normalize-uri <normalizer> [ { if | unless } <condition> ] +http-request normalize-uri fragment-encode [ { if | unless } <condition> ] +http-request normalize-uri fragment-strip [ { if | unless } <condition> ] +http-request normalize-uri path-merge-slashes [ { if | unless } <condition> ] +http-request normalize-uri path-strip-dot [ { if | unless } <condition> ] +http-request normalize-uri path-strip-dotdot [ full ] [ { if | unless } <condition> ] +http-request normalize-uri percent-decode-unreserved [ strict ] [ { if | unless } <condition> ] +http-request normalize-uri percent-to-uppercase [ strict ] [ { if | unless } <condition> ] +http-request normalize-uri query-sort-by-name [ { if | unless } <condition> ] + + Performs normalization of the request's URI. + + URI normalization in HAProxy 2.4 is currently available as an experimental + technical preview. As such, it requires the global directive + 'expose-experimental-directives' first to be able to invoke it. You should be + prepared that the behavior of normalizers might change to fix possible + issues, possibly breaking proper request processing in your infrastructure. + + Each normalizer handles a single type of normalization to allow for a + fine-grained selection of the level of normalization that is appropriate for + the supported backend. + + As an example the "path-strip-dotdot" normalizer might be useful for a static + fileserver that directly maps the requested URI to the path within the local + filesystem. However it might break routing of an API that expects a specific + number of segments in the path. + + It is important to note that some normalizers might result in unsafe + transformations for broken URIs. It might also be possible that a combination + of normalizers that are safe by themselves results in unsafe transformations + when improperly combined. + + As an example the "percent-decode-unreserved" normalizer might result in + unexpected results when a broken URI includes bare percent characters. One + such a broken URI is "/%%36%36" which would be decoded to "/%66" which in + turn is equivalent to "/f". By specifying the "strict" option requests to + such a broken URI would safely be rejected. + + The following normalizers are available: + + - fragment-encode: Encodes "#" as "%23". + + The "fragment-strip" normalizer should be preferred, unless it is known + that broken clients do not correctly encode '#' within the path component. + + Example: + - /#foo -> /%23foo + + - fragment-strip: Removes the URI's "fragment" component. + + According to RFC 3986#3.5 the "fragment" component of an URI should not + be sent, but handled by the User Agent after retrieving a resource. + + This normalizer should be applied first to ensure that the fragment is + not interpreted as part of the request's path component. + + Example: + - /#foo -> / + + - path-strip-dot: Removes "/./" segments within the "path" component + (RFC 3986#6.2.2.3). + + Segments including percent encoded dots ("%2E") will not be detected. Use + the "percent-decode-unreserved" normalizer first if this is undesired. + + Example: + - /. -> / + - /./bar/ -> /bar/ + - /a/./a -> /a/a + - /.well-known/ -> /.well-known/ (no change) + + - path-strip-dotdot: Normalizes "/../" segments within the "path" component + (RFC 3986#6.2.2.3). + + This merges segments that attempt to access the parent directory with + their preceding segment. + + Empty segments do not receive special treatment. Use the "merge-slashes" + normalizer first if this is undesired. + + Segments including percent encoded dots ("%2E") will not be detected. Use + the "percent-decode-unreserved" normalizer first if this is undesired. + + Example: + - /foo/../ -> / + - /foo/../bar/ -> /bar/ + - /foo/bar/../ -> /foo/ + - /../bar/ -> /../bar/ + - /bar/../../ -> /../ + - /foo//../ -> /foo/ + - /foo/%2E%2E/ -> /foo/%2E%2E/ + + If the "full" option is specified then "../" at the beginning will be + removed as well: + + Example: + - /../bar/ -> /bar/ + - /bar/../../ -> / + + - path-merge-slashes: Merges adjacent slashes within the "path" component + into a single slash. + + Example: + - // -> / + - /foo//bar -> /foo/bar + + - percent-decode-unreserved: Decodes unreserved percent encoded characters to + their representation as a regular character (RFC 3986#6.2.2.2). + + The set of unreserved characters includes all letters, all digits, "-", + ".", "_", and "~". + + Example: + - /%61dmin -> /admin + - /foo%3Fbar=baz -> /foo%3Fbar=baz (no change) + - /%%36%36 -> /%66 (unsafe) + - /%ZZ -> /%ZZ + + If the "strict" option is specified then invalid sequences will result + in a HTTP 400 Bad Request being returned. + + Example: + - /%%36%36 -> HTTP 400 + - /%ZZ -> HTTP 400 + + - percent-to-uppercase: Uppercases letters within percent-encoded sequences + (RFC 3986#6.2.2.1). + + Example: + - /%6f -> /%6F + - /%zz -> /%zz + + If the "strict" option is specified then invalid sequences will result + in a HTTP 400 Bad Request being returned. + + Example: + - /%zz -> HTTP 400 + + - query-sort-by-name: Sorts the query string parameters by parameter name. + Parameters are assumed to be delimited by '&'. Shorter names sort before + longer names and identical parameter names maintain their relative order. + + Example: + - /?c=3&a=1&b=2 -> /?a=1&b=2&c=3 + - /?aaa=3&a=1&aa=2 -> /?a=1&aa=2&aaa=3 + - /?a=3&b=4&a=1&b=5&a=2 -> /?a=3&a=1&a=2&b=4&b=5 + +http-request redirect <rule> [ { if | unless } <condition> ] + + This performs an HTTP redirection based on a redirect rule. This is exactly + the same as the "redirect" statement except that it inserts a redirect rule + which can be processed in the middle of other "http-request" rules and that + these rules use the "log-format" strings. See the "redirect" keyword for the + rule's syntax. + +http-request reject [ { if | unless } <condition> ] + + This stops the evaluation of the rules and immediately closes the connection + without sending any response. It acts similarly to the + "tcp-request content reject" rules. It can be useful to force an immediate + connection closure on HTTP/2 connections. + +http-request replace-header <name> <match-regex> <replace-fmt> + [ { if | unless } <condition> ] + + This matches the value of all occurrences of header field <name> against + <match-regex>. Matching is performed case-sensitively. Matching values are + completely replaced by <replace-fmt>. Format characters are allowed in + <replace-fmt> and work like <fmt> arguments in "http-request add-header". + Standard back-references using the backslash ('\') followed by a number are + supported. + + This action acts on whole header lines, regardless of the number of values + they may contain. Thus it is well-suited to process headers naturally + containing commas in their value, such as If-Modified-Since. Headers that + contain a comma-separated list of values, such as Accept, should be processed + using "http-request replace-value". + + Example: + http-request replace-header Cookie foo=([^;]*);(.*) foo=\1;ip=%bi;\2 + + # applied to: + Cookie: foo=foobar; expires=Tue, 14-Jun-2016 01:40:45 GMT; + + # outputs: + Cookie: foo=foobar;ip=192.168.1.20; expires=Tue, 14-Jun-2016 01:40:45 GMT; + + # assuming the backend IP is 192.168.1.20 + + http-request replace-header User-Agent curl foo + + # applied to: + User-Agent: curl/7.47.0 + + # outputs: + User-Agent: foo + +http-request replace-path <match-regex> <replace-fmt> + [ { if | unless } <condition> ] + + This works like "replace-header" except that it works on the request's path + component instead of a header. The path component starts at the first '/' + after an optional scheme+authority and ends before the question mark. Thus, + the replacement does not modify the scheme, the authority and the + query-string. + + It is worth noting that regular expressions may be more expensive to evaluate + than certain ACLs, so rare replacements may benefit from a condition to avoid + performing the evaluation at all if it does not match. + + Example: + # prefix /foo : turn /bar?q=1 into /foo/bar?q=1 : + http-request replace-path (.*) /foo\1 + + # strip /foo : turn /foo/bar?q=1 into /bar?q=1 + http-request replace-path /foo/(.*) /\1 + # or more efficient if only some requests match : + http-request replace-path /foo/(.*) /\1 if { url_beg /foo/ } + +http-request replace-pathq <match-regex> <replace-fmt> + [ { if | unless } <condition> ] + + This does the same as "http-request replace-path" except that the path + contains the query-string if any is present. Thus, the path and the + query-string are replaced. + + Example: + # suffix /foo : turn /bar?q=1 into /bar/foo?q=1 : + http-request replace-pathq ([^?]*)(\?(.*))? \1/foo\2 + +http-request replace-uri <match-regex> <replace-fmt> + [ { if | unless } <condition> ] + + This works like "replace-header" except that it works on the request's URI part + instead of a header. The URI part may contain an optional scheme, authority or + query string. These are considered to be part of the value that is matched + against. + + It is worth noting that regular expressions may be more expensive to evaluate + than certain ACLs, so rare replacements may benefit from a condition to avoid + performing the evaluation at all if it does not match. + + IMPORTANT NOTE: historically in HTTP/1.x, the vast majority of requests sent + by browsers use the "origin form", which differs from the "absolute form" in + that they do not contain a scheme nor authority in the URI portion. Mostly + only requests sent to proxies, those forged by hand and some emitted by + certain applications use the absolute form. As such, "replace-uri" usually + works fine most of the time in HTTP/1.x with rules starting with a "/". But + with HTTP/2, clients are encouraged to send absolute URIs only, which look + like the ones HTTP/1 clients use to talk to proxies. Such partial replace-uri + rules may then fail in HTTP/2 when they work in HTTP/1. Either the rules need + to be adapted to optionally match a scheme and authority, or replace-path + should be used. + + Example: + # rewrite all "http" absolute requests to "https": + http-request replace-uri ^http://(.*) https://\1 + + # prefix /foo : turn /bar?q=1 into /foo/bar?q=1 : + http-request replace-uri ([^/:]*://[^/]*)?(.*) \1/foo\2 + +http-request replace-value <name> <match-regex> <replace-fmt> + [ { if | unless } <condition> ] + + This works like "replace-header" except that it matches the regex against + every comma-delimited value of the header field <name> instead of the + entire header. This is suited for all headers which are allowed to carry + more than one value. An example could be the Accept header. + + Example: + http-request replace-value X-Forwarded-For ^192\.168\.(.*)$ 172.16.\1 + + # applied to: + X-Forwarded-For: 192.168.10.1, 192.168.13.24, 10.0.0.37 + + # outputs: + X-Forwarded-For: 172.16.10.1, 172.16.13.24, 10.0.0.37 + +http-request return [status <code>] [content-type <type>] + [ { default-errorfiles | errorfile <file> | errorfiles <name> | + file <file> | lf-file <file> | string <str> | lf-string <fmt> } ] + [ hdr <name> <fmt> ]* + [ { if | unless } <condition> ] + + This stops the evaluation of the rules and immediately returns a response. The + default status code used for the response is 200. It can be optionally + specified as an arguments to "status". The response content-type may also be + specified as an argument to "content-type". Finally the response itself may + be defined. It can be a full HTTP response specifying the errorfile to use, + or the response payload specifying the file or the string to use. These rules + are followed to create the response : + + * If neither the errorfile nor the payload to use is defined, a dummy + response is returned. Only the "status" argument is considered. It can be + any code in the range [200, 599]. The "content-type" argument, if any, is + ignored. + + * If "default-errorfiles" argument is set, the proxy's errorfiles are + considered. If the "status" argument is defined, it must be one of the + status code handled by HAProxy (200, 400, 403, 404, 405, 408, 410, 413, + 425, 429, 500, 501, 502, 503, and 504). The "content-type" argument, if + any, is ignored. + + * If a specific errorfile is defined, with an "errorfile" argument, the + corresponding file, containing a full HTTP response, is returned. Only the + "status" argument is considered. It must be one of the status code handled + by HAProxy (200, 400, 403, 404, 405, 408, 410, 413, 425, 429, 500, 501, + 502, 503, and 504). The "content-type" argument, if any, is ignored. + + * If an http-errors section is defined, with an "errorfiles" argument, the + corresponding file in the specified http-errors section, containing a full + HTTP response, is returned. Only the "status" argument is considered. It + must be one of the status code handled by HAProxy (200, 400, 403, 404, 405, + 408, 410, 413, 425, 429, 500, 501, 502, 503, and 504). The "content-type" + argument, if any, is ignored. + + * If a "file" or a "lf-file" argument is specified, the file's content is + used as the response payload. If the file is not empty, its content-type + must be set as argument to "content-type". Otherwise, any "content-type" + argument is ignored. With a "lf-file" argument, the file's content is + evaluated as a log-format string. With a "file" argument, it is considered + as a raw content. + + * If a "string" or "lf-string" argument is specified, the defined string is + used as the response payload. The content-type must always be set as + argument to "content-type". With a "lf-string" argument, the string is + evaluated as a log-format string. With a "string" argument, it is + considered as a raw string. + + When the response is not based on an errorfile, it is possible to append HTTP + header fields to the response using "hdr" arguments. Otherwise, all "hdr" + arguments are ignored. For each one, the header name is specified in <name> + and its value is defined by <fmt> which follows the log-format rules. + + Note that the generated response must be smaller than a buffer. And to avoid + any warning, when an errorfile or a raw file is loaded, the buffer space + reserved for the headers rewriting should also be free. + + No further "http-request" rules are evaluated. + + Example: + http-request return errorfile /etc/haproxy/errorfiles/200.http \ + if { path /ping } + + http-request return content-type image/x-icon file /var/www/favicon.ico \ + if { path /favicon.ico } + + http-request return status 403 content-type text/plain \ + lf-string "Access denied. IP %[src] is blacklisted." \ + if { src -f /etc/haproxy/blacklist.lst } + +http-request sc-inc-gpc(<idx>,<sc-id>) [ { if | unless } <condition> ] + + This actions increments the General Purpose Counter at the index <idx> + of the array associated to the sticky counter designated by <sc-id>. + If an error occurs, this action silently fails and the actions evaluation + continues. <idx> is an integer between 0 and 99 and <sc-id> is an integer + between 0 and 2. It also silently fails if the there is no GPC stored + at this index. + This action applies only to the 'gpc' and 'gpc_rate' array data_types (and + not to the legacy 'gpc0', 'gpc1', 'gpc0_rate' nor 'gpc1_rate' data_types). + +http-request sc-inc-gpc0(<sc-id>) [ { if | unless } <condition> ] +http-request sc-inc-gpc1(<sc-id>) [ { if | unless } <condition> ] + + This actions increments the GPC0 or GPC1 counter according with the sticky + counter designated by <sc-id>. If an error occurs, this action silently fails + and the actions evaluation continues. + +http-request sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] + This action sets the 32-bit unsigned GPT at the index <idx> of the array + associated to the sticky counter designated by <sc-id> at the value of + <int>/<expr>. The expected result is a boolean. + If an error occurs, this action silently fails and the actions evaluation + continues. <idx> is an integer between 0 and 99 and <sc-id> is an integer + between 0 and 2. It also silently fails if the there is no GPT stored + at this index. + This action applies only to the 'gpt' array data_type (and not to the + legacy 'gpt0' data-type). + +http-request sc-set-gpt0(<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] + + This action sets the 32-bit unsigned GPT0 tag according to the sticky counter + designated by <sc-id> and the value of <int>/<expr>. The expected result is a + boolean. If an error occurs, this action silently fails and the actions + evaluation continues. + +http-request send-spoe-group <engine-name> <group-name> + [ { if | unless } <condition> ] + + This action is used to trigger sending of a group of SPOE messages. To do so, + the SPOE engine used to send messages must be defined, as well as the SPOE + group to send. Of course, the SPOE engine must refer to an existing SPOE + filter. If not engine name is provided on the SPOE filter line, the SPOE + agent name must be used. + + Arguments: + <engine-name> The SPOE engine name. + + <group-name> The SPOE group name as specified in the engine + configuration. + +http-request set-dst <expr> [ { if | unless } <condition> ] + + This is used to set the destination IP address to the value of specified + expression. Useful when a proxy in front of HAProxy rewrites destination IP, + but provides the correct IP in a HTTP header; or you want to mask the IP for + privacy. If you want to connect to the new address/port, use '0.0.0.0:0' as a + server address in the backend. + + Arguments: + <expr> Is a standard HAProxy expression formed by a sample-fetch followed + by some converters. + + Example: + http-request set-dst hdr(x-dst) + http-request set-dst dst,ipmask(24) + + When possible, set-dst preserves the original destination port as long as the + address family allows it, otherwise the destination port is set to 0. + +http-request set-dst-port <expr> [ { if | unless } <condition> ] + + This is used to set the destination port address to the value of specified + expression. If you want to connect to the new address/port, use '0.0.0.0:0' + as a server address in the backend. + + Arguments: + <expr> Is a standard HAProxy expression formed by a sample-fetch + followed by some converters. + + Example: + http-request set-dst-port hdr(x-port) + http-request set-dst-port int(4000) + + When possible, set-dst-port preserves the original destination address as + long as the address family supports a port, otherwise it forces the + destination address to IPv4 "0.0.0.0" before rewriting the port. + +http-request set-header <name> <fmt> [ { if | unless } <condition> ] + + This does the same as "http-request add-header" except that the header name + is first removed if it existed. This is useful when passing security + information to the server, where the header must not be manipulated by + external users. Note that the new value is computed before the removal so it + is possible to concatenate a value to an existing header. + + Example: + http-request set-header X-Haproxy-Current-Date %T + http-request set-header X-SSL %[ssl_fc] + http-request set-header X-SSL-Session_ID %[ssl_fc_session_id,hex] + http-request set-header X-SSL-Client-Verify %[ssl_c_verify] + http-request set-header X-SSL-Client-DN %{+Q}[ssl_c_s_dn] + http-request set-header X-SSL-Client-CN %{+Q}[ssl_c_s_dn(cn)] + http-request set-header X-SSL-Issuer %{+Q}[ssl_c_i_dn] + http-request set-header X-SSL-Client-NotBefore %{+Q}[ssl_c_notbefore] + http-request set-header X-SSL-Client-NotAfter %{+Q}[ssl_c_notafter] + +http-request set-log-level <level> [ { if | unless } <condition> ] + + This is used to change the log level of the current request when a certain + condition is met. Valid levels are the 8 syslog levels (see the "log" + keyword) plus the special level "silent" which disables logging for this + request. This rule is not final so the last matching rule wins. This rule + can be useful to disable health checks coming from another equipment. + +http-request set-map(<file-name>) <key fmt> <value fmt> + [ { if | unless } <condition> ] + + This is used to add a new entry into a MAP. The MAP must be loaded from a + file (even a dummy empty file). The file name of the MAP to be updated is + passed between parentheses. It takes 2 arguments: <key fmt>, which follows + log-format rules, used to collect MAP key, and <value fmt>, which follows + log-format rules, used to collect content for the new entry. + It performs a lookup in the MAP before insertion, to avoid duplicated (or + more) values. This lookup is done by a linear search and can be expensive + with large lists! It is the equivalent of the "set map" command from the + stats socket, but can be triggered by an HTTP request. + +http-request set-mark <mark> [ { if | unless } <condition> ] + + This is used to set the Netfilter/IPFW MARK on all packets sent to the client + to the value passed in <mark> on platforms which support it. This value is an + unsigned 32 bit value which can be matched by netfilter/ipfw and by the + routing table or monitoring the packets through DTrace. It can be expressed + both in decimal or hexadecimal format (prefixed by "0x"). + This can be useful to force certain packets to take a different route (for + example a cheaper network path for bulk downloads). This works on Linux + kernels 2.6.32 and above and requires admin privileges, as well on FreeBSD + and OpenBSD. + +http-request set-method <fmt> [ { if | unless } <condition> ] + + This rewrites the request method with the result of the evaluation of format + string <fmt>. There should be very few valid reasons for having to do so as + this is more likely to break something than to fix it. + +http-request set-nice <nice> [ { if | unless } <condition> ] + + This sets the "nice" factor of the current request being processed. It only + has effect against the other requests being processed at the same time. + The default value is 0, unless altered by the "nice" setting on the "bind" + line. The accepted range is -1024..1024. The higher the value, the nicest + the request will be. Lower values will make the request more important than + other ones. This can be useful to improve the speed of some requests, or + lower the priority of non-important requests. Using this setting without + prior experimentation can cause some major slowdown. + +http-request set-path <fmt> [ { if | unless } <condition> ] + + This rewrites the request path with the result of the evaluation of format + string <fmt>. The query string, if any, is left intact. If a scheme and + authority is found before the path, they are left intact as well. If the + request doesn't have a path ("*"), this one is replaced with the format. + This can be used to prepend a directory component in front of a path for + example. See also "http-request set-query" and "http-request set-uri". + + Example : + # prepend the host name before the path + http-request set-path /%[hdr(host)]%[path] + +http-request set-pathq <fmt> [ { if | unless } <condition> ] + + This does the same as "http-request set-path" except that the query-string is + also rewritten. It may be used to remove the query-string, including the + question mark (it is not possible using "http-request set-query"). + +http-request set-priority-class <expr> [ { if | unless } <condition> ] + + This is used to set the queue priority class of the current request. + The value must be a sample expression which converts to an integer in the + range -2047..2047. Results outside this range will be truncated. + The priority class determines the order in which queued requests are + processed. Lower values have higher priority. + +http-request set-priority-offset <expr> [ { if | unless } <condition> ] + + This is used to set the queue priority timestamp offset of the current + request. The value must be a sample expression which converts to an integer + in the range -524287..524287. Results outside this range will be truncated. + When a request is queued, it is ordered first by the priority class, then by + the current timestamp adjusted by the given offset in milliseconds. Lower + values have higher priority. + Note that the resulting timestamp is is only tracked with enough precision + for 524,287ms (8m44s287ms). If the request is queued long enough to where the + adjusted timestamp exceeds this value, it will be misidentified as highest + priority. Thus it is important to set "timeout queue" to a value, where when + combined with the offset, does not exceed this limit. + +http-request set-query <fmt> [ { if | unless } <condition> ] + + This rewrites the request's query string which appears after the first + question mark ("?") with the result of the evaluation of format string <fmt>. + The part prior to the question mark is left intact. If the request doesn't + contain a question mark and the new value is not empty, then one is added at + the end of the URI, followed by the new value. If a question mark was + present, it will never be removed even if the value is empty. This can be + used to add or remove parameters from the query string. + + See also "http-request set-query" and "http-request set-uri". + + Example: + # replace "%3D" with "=" in the query string + http-request set-query %[query,regsub(%3D,=,g)] + +http-request set-src <expr> [ { if | unless } <condition> ] + This is used to set the source IP address to the value of specified + expression. Useful when a proxy in front of HAProxy rewrites source IP, but + provides the correct IP in a HTTP header; or you want to mask source IP for + privacy. All subsequent calls to "src" fetch will return this value + (see example). + + Arguments : + <expr> Is a standard HAProxy expression formed by a sample-fetch followed + by some converters. + + See also "option forwardfor". + + Example: + http-request set-src hdr(x-forwarded-for) + http-request set-src src,ipmask(24) + + # After the masking this will track connections + # based on the IP address with the last byte zeroed out. + http-request track-sc0 src + + When possible, set-src preserves the original source port as long as the + address family allows it, otherwise the source port is set to 0. + +http-request set-src-port <expr> [ { if | unless } <condition> ] + + This is used to set the source port address to the value of specified + expression. + + Arguments: + <expr> Is a standard HAProxy expression formed by a sample-fetch followed + by some converters. + + Example: + http-request set-src-port hdr(x-port) + http-request set-src-port int(4000) + + When possible, set-src-port preserves the original source address as long as + the address family supports a port, otherwise it forces the source address to + IPv4 "0.0.0.0" before rewriting the port. + +http-request set-timeout { server | tunnel } { <timeout> | <expr> } + [ { if | unless } <condition> ] + + This action overrides the specified "server" or "tunnel" timeout for the + current stream only. The timeout can be specified in millisecond or with any + other unit if the number is suffixed by the unit as explained at the top of + this document. It is also possible to write an expression which must returns + a number interpreted as a timeout in millisecond. + + Note that the server/tunnel timeouts are only relevant on the backend side + and thus this rule is only available for the proxies with backend + capabilities. Also the timeout value must be non-null to obtain the expected + results. + + Example: + http-request set-timeout tunnel 5s + http-request set-timeout server req.hdr(host),map_int(host.lst) + +http-request set-tos <tos> [ { if | unless } <condition> ] + + This is used to set the TOS or DSCP field value of packets sent to the client + to the value passed in <tos> on platforms which support this. This value + represents the whole 8 bits of the IP TOS field, and can be expressed both in + decimal or hexadecimal format (prefixed by "0x"). Note that only the 6 higher + bits are used in DSCP or TOS, and the two lower bits are always 0. This can + be used to adjust some routing behavior on border routers based on some + information from the request. + + See RFC 2474, 2597, 3260 and 4594 for more information. + +http-request set-uri <fmt> [ { if | unless } <condition> ] + + This rewrites the request URI with the result of the evaluation of format + string <fmt>. The scheme, authority, path and query string are all replaced + at once. This can be used to rewrite hosts in front of proxies, or to perform + complex modifications to the URI such as moving parts between the path and + the query string. If an absolute URI is set, it will be sent as is to + HTTP/1.1 servers. If it is not the desired behavior, the host, the path + and/or the query string should be set separately. + See also "http-request set-path" and "http-request set-query". + +http-request set-var(<var-name>[,<cond>...]) <expr> [ { if | unless } <condition> ] +http-request set-var-fmt(<var-name>[,<cond>...]) <fmt> [ { if | unless } <condition> ] + + This is used to set the contents of a variable. The variable is declared + inline. + + Arguments: + <var-name> The name of the variable starts with an indication about its + scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction + (request and response) + "req" : the variable is shared only during request + processing + "res" : the variable is shared only during response + processing + This prefix is followed by a name. The separator is a '.'. + The name may only contain characters 'a-z', 'A-Z', '0-9' + and '_'. + + <cond> A set of conditions that must all be true for the variable to + actually be set (such as "ifnotempty", "ifgt" ...). See the + set-var converter's description for a full list of possible + conditions. + + <expr> Is a standard HAProxy expression formed by a sample-fetch + followed by some converters. + + <fmt> This is the value expressed using log-format rules (see Custom + Log Format in section 8.2.4). + + Example: + http-request set-var(req.my_var) req.fhdr(user-agent),lower + http-request set-var-fmt(txn.from) %[src]:%[src_port] + +http-request silent-drop [ { if | unless } <condition> ] + + This stops the evaluation of the rules and makes the client-facing connection + suddenly disappear using a system-dependent way that tries to prevent the + client from being notified. The effect it then that the client still sees an + established connection while there's none on HAProxy. The purpose is to + achieve a comparable effect to "tarpit" except that it doesn't use any local + resource at all on the machine running HAProxy. It can resist much higher + loads than "tarpit", and slow down stronger attackers. It is important to + understand the impact of using this mechanism. All stateful equipment placed + between the client and HAProxy (firewalls, proxies, load balancers) will also + keep the established connection for a long time and may suffer from this + action. + On modern Linux systems running with enough privileges, the TCP_REPAIR socket + option is used to block the emission of a TCP reset. On other systems, the + socket's TTL is reduced to 1 so that the TCP reset doesn't pass the first + router, though it's still delivered to local networks. Do not use it unless + you fully understand how it works. + +http-request strict-mode { on | off } [ { if | unless } <condition> ] + + This enables or disables the strict rewriting mode for following rules. It + does not affect rules declared before it and it is only applicable on rules + performing a rewrite on the requests. When the strict mode is enabled, any + rewrite failure triggers an internal error. Otherwise, such errors are + silently ignored. The purpose of the strict rewriting mode is to make some + rewrites optional while others must be performed to continue the request + processing. + + By default, the strict rewriting mode is enabled. Its value is also reset + when a ruleset evaluation ends. So, for instance, if you change the mode on + the frontend, the default mode is restored when HAProxy starts the backend + rules evaluation. + +http-request tarpit [deny_status <status>] [ { if | unless } <condition> ] +http-request tarpit [ { status | deny_status } <code>] [content-type <type>] + [ { default-errorfiles | errorfile <file> | errorfiles <name> | + file <file> | lf-file <file> | string <str> | lf-string <fmt> } ] + [ hdr <name> <fmt> ]* + [ { if | unless } <condition> ] + + This stops the evaluation of the rules and immediately blocks the request + without responding for a delay specified by "timeout tarpit" or + "timeout connect" if the former is not set. After that delay, if the client + is still connected, a response is returned so that the client does not + suspect it has been tarpitted. Logs will report the flags "PT". The goal of + the tarpit rule is to slow down robots during an attack when they're limited + on the number of concurrent requests. It can be very efficient against very + dumb robots, and will significantly reduce the load on firewalls compared to + a "deny" rule. But when facing "correctly" developed robots, it can make + things worse by forcing HAProxy and the front firewall to support insane + number of concurrent connections. By default an HTTP error 500 is returned. + But the response may be customized using same syntax than + "http-request return" rules. Thus, see "http-request return" for details. + For compatibility purpose, when no argument is defined, or only "deny_status", + the argument "default-errorfiles" is implied. It means + "http-request tarpit [deny_status <status>]" is an alias of + "http-request tarpit [status <status>] default-errorfiles". + No further "http-request" rules are evaluated. + See also "http-request return" and "http-request silent-drop". + +http-request track-sc0 <key> [table <table>] [ { if | unless } <condition> ] +http-request track-sc1 <key> [table <table>] [ { if | unless } <condition> ] +http-request track-sc2 <key> [table <table>] [ { if | unless } <condition> ] + + This enables tracking of sticky counters from current request. These rules do + not stop evaluation and do not change default action. The number of counters + that may be simultaneously tracked by the same connection is set in + MAX_SESS_STKCTR at build time (reported in haproxy -vv) which defaults to 3, + so the track-sc number is between 0 and (MAX_SESS_STKCTR-1). The first + "track-sc0" rule executed enables tracking of the counters of the specified + table as the first set. The first "track-sc1" rule executed enables tracking + of the counters of the specified table as the second set. The first + "track-sc2" rule executed enables tracking of the counters of the specified + table as the third set. It is a recommended practice to use the first set of + counters for the per-frontend counters and the second set for the per-backend + ones. But this is just a guideline, all may be used everywhere. + + Arguments : + <key> is mandatory, and is a sample expression rule as described in + section 7.3. It describes what elements of the incoming request or + connection will be analyzed, extracted, combined, and used to + select which table entry to update the counters. + + <table> is an optional table to be used instead of the default one, which + is the stick-table declared in the current proxy. All the counters + for the matches and updates for the key will then be performed in + that table until the session ends. + + Once a "track-sc*" rule is executed, the key is looked up in the table and if + it is not found, an entry is allocated for it. Then a pointer to that entry + is kept during all the session's life, and this entry's counters are updated + as often as possible, every time the session's counters are updated, and also + systematically when the session ends. Counters are only updated for events + that happen after the tracking has been started. As an exception, connection + counters and request counters are systematically updated so that they reflect + useful information. + + If the entry tracks concurrent connection counters, one connection is counted + for as long as the entry is tracked, and the entry will not expire during + that time. Tracking counters also provides a performance advantage over just + checking the keys, because only one table lookup is performed for all ACL + checks that make use of it. + +http-request unset-var(<var-name>) [ { if | unless } <condition> ] + + This is used to unset a variable. See above for details about <var-name>. + + Example: + http-request unset-var(req.my_var) + +http-request use-service <service-name> [ { if | unless } <condition> ] + + This directive executes the configured HTTP service to reply to the request + and stops the evaluation of the rules. An HTTP service may choose to reply by + sending any valid HTTP response or it may immediately close the connection + without sending any response. Outside natives services, for instance the + Prometheus exporter, it is possible to write your own services in Lua. No + further "http-request" rules are evaluated. + + Arguments : + <service-name> is mandatory. It is the service to call + + Example: + http-request use-service prometheus-exporter if { path /metrics } + +http-request wait-for-body time <time> [ at-least <bytes> ] + [ { if | unless } <condition> ] + + This will delay the processing of the request waiting for the payload for at + most <time> milliseconds. if "at-least" argument is specified, HAProxy stops + to wait the payload when the first <bytes> bytes are received. 0 means no + limit, it is the default value. Regardless the "at-least" argument value, + HAProxy stops to wait if the whole payload is received or if the request + buffer is full. This action may be used as a replacement to "option + http-buffer-request". + + Arguments : + + <time> is mandatory. It is the maximum time to wait for the body. It + follows the HAProxy time format and is expressed in milliseconds. + + <bytes> is optional. It is the minimum payload size to receive to stop to + wait. It follows the HAProxy size format and is expressed in + bytes. + + Example: + http-request wait-for-body time 1s at-least 1k if METH_POST + + See also : "option http-buffer-request" + +http-request wait-for-handshake [ { if | unless } <condition> ] + + This will delay the processing of the request until the SSL handshake + happened. This is mostly useful to delay processing early data until we're + sure they are valid. + + +http-response <action> <options...> [ { if | unless } <condition> ] + Access control for Layer 7 responses + + May be used in sections: defaults | frontend | listen | backend + yes(!) | yes | yes | yes + + The http-response statement defines a set of rules which apply to layer 7 + processing. The rules are evaluated in their declaration order when they are + met in a frontend, listen or backend section. Any rule may optionally be + followed by an ACL-based condition, in which case it will only be evaluated + if the condition is true. Since these rules apply on responses, the backend + rules are applied first, followed by the frontend's rules. + + The first keyword is the rule's action. Several types of actions are + supported: + - add-acl(<file-name>) <key fmt> + - add-header <name> <fmt> + - allow + - cache-store <name> + - capture <sample> id <id> + - del-acl(<file-name>) <key fmt> + - del-header <name> [ -m <meth> ] + - del-map(<file-name>) <key fmt> + - deny [ { status | deny_status } <code>] ... + - redirect <rule> + - replace-header <name> <regex-match> <replace-fmt> + - replace-value <name> <regex-match> <replace-fmt> + - return [status <code>] [content-type <type>] ... + - sc-inc-gpc(<idx>,<sc-id>) + - sc-inc-gpc0(<sc-id>) + - sc-inc-gpc1(<sc-id>) + - sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + - sc-set-gpt0(<sc-id>) { <int> | <expr> } + - send-spoe-group <engine-name> <group-name> + - set-header <name> <fmt> + - set-log-level <level> + - set-map(<file-name>) <key fmt> <value fmt> + - set-mark <mark> + - set-nice <nice> + - set-status <status> [reason <str>] + - set-tos <tos> + - set-var(<var-name>[,<cond>...]) <expr> + - set-var-fmt(<var-name>[,<cond>...]) <fmt> + - silent-drop + - strict-mode { on | off } + - track-sc0 <key> [table <table>] + - track-sc1 <key> [table <table>] + - track-sc2 <key> [table <table>] + - unset-var(<var-name>) + - wait-for-body time <time> [ at-least <bytes> ] + + The supported actions are described below. + + There is no limit to the number of http-response statements per instance. + + This directive is only available from named defaults sections, not anonymous + ones. Rules defined in the defaults section are evaluated before ones in the + associated proxy section. To avoid ambiguities, in this case the same + defaults section cannot be used by proxies with the frontend capability and + by proxies with the backend capability. It means a listen section cannot use + a defaults section defining such rules. + + Example: + acl key_acl res.hdr(X-Acl-Key) -m found + + acl myhost hdr(Host) -f myhost.lst + + http-response add-acl(myhost.lst) %[res.hdr(X-Acl-Key)] if key_acl + http-response del-acl(myhost.lst) %[res.hdr(X-Acl-Key)] if key_acl + + Example: + acl value res.hdr(X-Value) -m found + + use_backend bk_appli if { hdr(Host),map_str(map.lst) -m found } + + http-response set-map(map.lst) %[src] %[res.hdr(X-Value)] if value + http-response del-map(map.lst) %[src] if ! value + + See also : "http-request", section 3.4 about userlists and section 7 about + ACL usage. + +http-response add-acl(<file-name>) <key fmt> [ { if | unless } <condition> ] + + This is used to add a new entry into an ACL. Please refer to "http-request + add-acl" for a complete description. + +http-response add-header <name> <fmt> [ { if | unless } <condition> ] + + This appends an HTTP header field whose name is specified in <name> and whose + value is defined by <fmt>. Please refer to "http-request add-header" for a + complete description. + +http-response allow [ { if | unless } <condition> ] + + This stops the evaluation of the rules and lets the response pass the check. + No further "http-response" rules are evaluated for the current section. + +http-response cache-store <name> [ { if | unless } <condition> ] + + See section 6.2 about cache setup. + +http-response capture <sample> id <id> [ { if | unless } <condition> ] + + This captures sample expression <sample> from the response buffer, and + converts it to a string. The resulting string is stored into the next request + "capture" slot, so it will possibly appear next to some captured HTTP + headers. It will then automatically appear in the logs, and it will be + possible to extract it using sample fetch rules to feed it into headers or + anything. Please check section 7.3 (Fetching samples) and + "capture response header" for more information. + + The keyword "id" is the id of the capture slot which is used for storing the + string. The capture slot must be defined in an associated frontend. + This is useful to run captures in backends. The slot id can be declared by a + previous directive "http-response capture" or with the "declare capture" + keyword. + + When using this action in a backend, double check that the relevant + frontend(s) have the required capture slots otherwise, this rule will be + ignored at run time. This can't be detected at configuration parsing time + due to HAProxy's ability to dynamically resolve backend name at runtime. + +http-response del-acl(<file-name>) <key fmt> [ { if | unless } <condition> ] + + This is used to delete an entry from an ACL. Please refer to "http-request + del-acl" for a complete description. + +http-response del-header <name> [ -m <meth> ] [ { if | unless } <condition> ] + + This removes all HTTP header fields whose name is specified in <name>. Please + refer to "http-request del-header" for a complete description. + +http-response del-map(<file-name>) <key fmt> [ { if | unless } <condition> ] + + This is used to delete an entry from a MAP. Please refer to "http-request + del-map" for a complete description. + +http-response deny [deny_status <status>] [ { if | unless } <condition> ] +http-response deny [ { status | deny_status } <code>] [content-type <type>] + [ { default-errorfiles | errorfile <file> | errorfiles <name> | + file <file> | lf-file <file> | string <str> | lf-string <fmt> } ] + [ hdr <name> <fmt> ]* + [ { if | unless } <condition> ] + + This stops the evaluation of the rules and immediately rejects the response. + By default an HTTP 502 error is returned. But the response may be customized + using same syntax than "http-response return" rules. Thus, see + "http-response return" for details. For compatibility purpose, when no + argument is defined, or only "deny_status", the argument "default-errorfiles" + is implied. It means "http-response deny [deny_status <status>]" is an alias + of "http-response deny [status <status>] default-errorfiles". + No further "http-response" rules are evaluated. + See also "http-response return". + +http-response redirect <rule> [ { if | unless } <condition> ] + + This performs an HTTP redirection based on a redirect rule. + This supports a format string similarly to "http-request redirect" rules, + with the exception that only the "location" type of redirect is possible on + the response. See the "redirect" keyword for the rule's syntax. When a + redirect rule is applied during a response, connections to the server are + closed so that no data can be forwarded from the server to the client. + +http-response replace-header <name> <regex-match> <replace-fmt> + [ { if | unless } <condition> ] + + This works like "http-request replace-header" except that it works on the + server's response instead of the client's request. + + Example: + http-response replace-header Set-Cookie (C=[^;]*);(.*) \1;ip=%bi;\2 + + # applied to: + Set-Cookie: C=1; expires=Tue, 14-Jun-2016 01:40:45 GMT + + # outputs: + Set-Cookie: C=1;ip=192.168.1.20; expires=Tue, 14-Jun-2016 01:40:45 GMT + + # assuming the backend IP is 192.168.1.20. + +http-response replace-value <name> <regex-match> <replace-fmt> + [ { if | unless } <condition> ] + + This works like "http-request replace-value" except that it works on the + server's response instead of the client's request. + + Example: + http-response replace-value Cache-control ^public$ private + + # applied to: + Cache-Control: max-age=3600, public + + # outputs: + Cache-Control: max-age=3600, private + +http-response return [status <code>] [content-type <type>] + [ { default-errorfiles | errorfile <file> | errorfiles <name> | + file <file> | lf-file <file> | string <str> | lf-string <fmt> } ] + [ hdr <name> <value> ]* + [ { if | unless } <condition> ] + + This stops the evaluation of the rules and immediately returns a + response. Please refer to "http-request return" for a complete + description. No further "http-response" rules are evaluated. + +http-response sc-inc-gpc(<idx>,<sc-id>) [ { if | unless } <condition> ] +http-response sc-inc-gpc0(<sc-id>) [ { if | unless } <condition> ] +http-response sc-inc-gpc1(<sc-id>) [ { if | unless } <condition> ] + + These actions increment the General Purppose Counters according to the sticky + counter designated by <sc-id>. Please refer to "http-request sc-inc-gpc", + "http-request sc-inc-gpc0" and "http-request sc-inc-gpc1" for a complete + description. + +http-response sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] +http-response sc-set-gpt0(<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] + + These actions set the 32-bit unsigned General Purpose Tags according to the + sticky counter designated by <sc-id>. Please refer to "http-request + sc-inc-gpt" and "http-request sc-inc-gpt0" for a complete description. + +http-response send-spoe-group <engine-name> <group-name> + [ { if | unless } <condition> ] + + This action is used to trigger sending of a group of SPOE messages. Please + refer to "http-request send-spoe-group" for a complete description. + +http-response set-header <name> <fmt> [ { if | unless } <condition> ] + + This does the same as "http-response add-header" except that the header name + is first removed if it existed. This is useful when passing security + information to the server, where the header must not be manipulated by + external users. + +http-response set-log-level <level> [ { if | unless } <condition> ] + + This is used to change the log level of the current response. Please refer to + "http-request set-log-level" for a complete description. + +http-response set-map(<file-name>) <key fmt> <value fmt> + + This is used to add a new entry into a MAP. Please refer to "http-request + set-map" for a complete description. + +http-response set-mark <mark> [ { if | unless } <condition> ] + + This action is used to set the Netfilter/IPFW MARK in all packets sent to the + client to the value passed in <mark> on platforms which support it. Please + refer to "http-request set-mark" for a complete description. + +http-response set-nice <nice> [ { if | unless } <condition> ] + + This sets the "nice" factor of the current request being processed. Please + refer to "http-request set-nice" for a complete description. + +http-response set-status <status> [reason <str>] + [ { if | unless } <condition> ] + + This replaces the response status code with <status> which must be an integer + between 100 and 999. Optionally, a custom reason text can be provided defined + by <str>, or the default reason for the specified code will be used as a + fallback. + + Example: + # return "431 Request Header Fields Too Large" + http-response set-status 431 + # return "503 Slow Down", custom reason + http-response set-status 503 reason "Slow Down". + +http-response set-tos <tos> [ { if | unless } <condition> ] + + This is used to set the TOS or DSCP field value of packets sent to the client + to the value passed in <tos> on platforms which support this. Please refer to + "http-request set-tos" for a complete description. + +http-response set-var(<var-name>[,<cond>...]) <expr> [ { if | unless } <condition> ] +http-response set-var-fmt(<var-name>[,<cond>...]) <fmt> [ { if | unless } <condition> ] + + This is used to set the contents of a variable. The variable is declared + inline. Please refer to "http-request set-var" and "http-request set-var-fmt" + for a complete description. + +http-response silent-drop [ { if | unless } <condition> ] + + This stops the evaluation of the rules and makes the client-facing connection + suddenly disappear using a system-dependent way that tries to prevent the + client from being notified. Please refer to "http-request silent-drop" for a + complete description. + +http-response strict-mode { on | off } [ { if | unless } <condition> ] + + This enables or disables the strict rewriting mode for following + rules. Please refer to "http-request strict-mode" for a complete description. + +http-response track-sc0 <key> [table <table>] [ { if | unless } <condition> ] +http-response track-sc1 <key> [table <table>] [ { if | unless } <condition> ] +http-response track-sc2 <key> [table <table>] [ { if | unless } <condition> ] + + This enables tracking of sticky counters from current connection. Please + refer to "http-request track-sc0", "http-request track-sc1" and "http-request + track-sc2" for a complete description. + +http-response unset-var(<var-name>) [ { if | unless } <condition> ] + + This is used to unset a variable. See "http-request set-var" for details + about <var-name>. + +http-response wait-for-body time <time> [ at-least <bytes> ] + [ { if | unless } <condition> ] + + This will delay the processing of the response waiting for the payload for at + most <time> milliseconds. Please refer to "http-request wait-for-body" for a + complete description. + + +http-reuse { never | safe | aggressive | always } + Declare how idle HTTP connections may be shared between requests + + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + By default, a connection established between HAProxy and the backend server + which is considered safe for reuse is moved back to the server's idle + connections pool so that any other request can make use of it. This is the + "safe" strategy below. + + The argument indicates the desired connection reuse strategy : + + - "never" : idle connections are never shared between sessions. This mode + may be enforced to cancel a different strategy inherited from + a defaults section or for troubleshooting. For example, if an + old bogus application considers that multiple requests over + the same connection come from the same client and it is not + possible to fix the application, it may be desirable to + disable connection sharing in a single backend. An example of + such an application could be an old HAProxy using cookie + insertion in tunnel mode and not checking any request past the + first one. + + - "safe" : this is the default and the recommended strategy. The first + request of a session is always sent over its own connection, + and only subsequent requests may be dispatched over other + existing connections. This ensures that in case the server + closes the connection when the request is being sent, the + browser can decide to silently retry it. Since it is exactly + equivalent to regular keep-alive, there should be no side + effects. There is also a special handling for the connections + using protocols subject to Head-of-line blocking (backend with + h2 or fcgi). In this case, when at least one stream is + processed, the used connection is reserved to handle streams + of the same session. When no more streams are processed, the + connection is released and can be reused. + + - "aggressive" : this mode may be useful in webservices environments where + all servers are not necessarily known and where it would be + appreciable to deliver most first requests over existing + connections. In this case, first requests are only delivered + over existing connections that have been reused at least once, + proving that the server correctly supports connection reuse. + It should only be used when it's sure that the client can + retry a failed request once in a while and where the benefit + of aggressive connection reuse significantly outweighs the + downsides of rare connection failures. + + - "always" : this mode is only recommended when the path to the server is + known for never breaking existing connections quickly after + releasing them. It allows the first request of a session to be + sent to an existing connection. This can provide a significant + performance increase over the "safe" strategy when the backend + is a cache farm, since such components tend to show a + consistent behavior and will benefit from the connection + sharing. It is recommended that the "http-keep-alive" timeout + remains low in this mode so that no dead connections remain + usable. In most cases, this will lead to the same performance + gains as "aggressive" but with more risks. It should only be + used when it improves the situation over "aggressive". + + When http connection sharing is enabled, a great care is taken to respect the + connection properties and compatibility. Indeed, some properties are specific + and it is not possibly to reuse it blindly. Those are the SSL SNI, source + and destination address and proxy protocol block. A connection is reused only + if it shares the same set of properties with the request. + + Also note that connections with certain bogus authentication schemes (relying + on the connection) like NTLM are marked private and never shared. + + A connection pool is involved and configurable with "pool-max-conn". + + Note: connection reuse improves the accuracy of the "server maxconn" setting, + because almost no new connection will be established while idle connections + remain available. This is particularly true with the "always" strategy. + + The rules to decide to keep an idle connection opened or to close it after + processing are also governed by the "tune.pool-low-fd-ratio" (default: 20%) + and "tune.pool-high-fd-ratio" (default: 25%). These correspond to the + percentage of total file descriptors spent in idle connections above which + haproxy will respectively refrain from keeping a connection opened after a + response, and actively kill idle connections. Some setups using a very high + ratio of idle connections, either because of too low a global "maxconn", or + due to a lot of HTTP/2 or HTTP/3 traffic on the frontend (few connections) + but HTTP/1 connections on the backend, may observe a lower reuse rate because + too few connections are kept open. It may be desirable in this case to adjust + such thresholds or simply to increase the global "maxconn" value. + + Similarly, when thread groups are explicitly enabled, it is important to + understand that idle connections are only usable between threads from a same + group. As such it may happen that unfair load between groups leads to more + idle connections being needed, causing a lower reuse rate. The same solution + may then be applied (increase global "maxconn" or increase pool ratios). + + See also : "option http-keep-alive", "server maxconn", "thread-groups", + "tune.pool-high-fd-ratio", "tune.pool-low-fd-ratio" + + +http-send-name-header [<header>] + Add the server name to a request. Use the header string given by <header> + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <header> The header string to use to send the server name + + The "http-send-name-header" statement causes the header field named <header> + to be set to the name of the target server at the moment the request is about + to be sent on the wire. Any existing occurrences of this header are removed. + Upon retries and redispatches, the header field is updated to always reflect + the server being attempted to connect to. Given that this header is modified + very late in the connection setup, it may have unexpected effects on already + modified headers. For example using it with transport-level header such as + connection, content-length, transfer-encoding and so on will likely result in + invalid requests being sent to the server. Additionally it has been reported + that this directive is currently being used as a way to overwrite the Host + header field in outgoing requests; while this trick has been known to work + as a side effect of the feature for some time, it is not officially supported + and might possibly not work anymore in a future version depending on the + technical difficulties this feature induces. A long-term solution instead + consists in fixing the application which required this trick so that it binds + to the correct host name. + + See also : "server" + +id <value> + Set a persistent ID to a proxy. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | yes + Arguments : none + + Set a persistent ID for the proxy. This ID must be unique and positive. + An unused ID will automatically be assigned if unset. The first assigned + value will be 1. This ID is currently only returned in statistics. + + +ignore-persist { if | unless } <condition> + Declare a condition to ignore persistence + May be used in sections: defaults | frontend | listen | backend + no | no | yes | yes + + By default, when cookie persistence is enabled, every requests containing + the cookie are unconditionally persistent (assuming the target server is up + and running). + + The "ignore-persist" statement allows one to declare various ACL-based + conditions which, when met, will cause a request to ignore persistence. + This is sometimes useful to load balance requests for static files, which + often don't require persistence. This can also be used to fully disable + persistence for a specific User-Agent (for example, some web crawler bots). + + The persistence is ignored when an "if" condition is met, or unless an + "unless" condition is met. + + Example: + acl url_static path_beg /static /images /img /css + acl url_static path_end .gif .png .jpg .css .js + ignore-persist if url_static + + See also : "force-persist", "cookie", and section 7 about ACL usage. + +load-server-state-from-file { global | local | none } + Allow seamless reload of HAProxy + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + This directive points HAProxy to a file where server state from previous + running process has been saved. That way, when starting up, before handling + traffic, the new process can apply old states to servers exactly has if no + reload occurred. The purpose of the "load-server-state-from-file" directive is + to tell HAProxy which file to use. For now, only 2 arguments to either prevent + loading state or load states from a file containing all backends and servers. + The state file can be generated by running the command "show servers state" + over the stats socket and redirect output. + + The format of the file is versioned and is very specific. To understand it, + please read the documentation of the "show servers state" command (chapter + 9.3 of Management Guide). + + Arguments: + global load the content of the file pointed by the global directive + named "server-state-file". + + local load the content of the file pointed by the directive + "server-state-file-name" if set. If not set, then the backend + name is used as a file name. + + none don't load any stat for this backend + + Notes: + - server's IP address is preserved across reloads by default, but the + order can be changed thanks to the server's "init-addr" setting. This + means that an IP address change performed on the CLI at run time will + be preserved, and that any change to the local resolver (e.g. /etc/hosts) + will possibly not have any effect if the state file is in use. + + - server's weight is applied from previous running process unless it has + has changed between previous and new configuration files. + + Example: Minimal configuration + + global + stats socket /tmp/socket + server-state-file /tmp/server_state + + defaults + load-server-state-from-file global + + backend bk + server s1 127.0.0.1:22 check weight 11 + server s2 127.0.0.1:22 check weight 12 + + + Then one can run : + + socat /tmp/socket - <<< "show servers state" > /tmp/server_state + + Content of the file /tmp/server_state would be like this: + + 1 + # <field names skipped for the doc example> + 1 bk 1 s1 127.0.0.1 2 0 11 11 4 6 3 4 6 0 0 + 1 bk 2 s2 127.0.0.1 2 0 12 12 4 6 3 4 6 0 0 + + Example: Minimal configuration + + global + stats socket /tmp/socket + server-state-base /etc/haproxy/states + + defaults + load-server-state-from-file local + + backend bk + server s1 127.0.0.1:22 check weight 11 + server s2 127.0.0.1:22 check weight 12 + + + Then one can run : + + socat /tmp/socket - <<< "show servers state bk" > /etc/haproxy/states/bk + + Content of the file /etc/haproxy/states/bk would be like this: + + 1 + # <field names skipped for the doc example> + 1 bk 1 s1 127.0.0.1 2 0 11 11 4 6 3 4 6 0 0 + 1 bk 2 s2 127.0.0.1 2 0 12 12 4 6 3 4 6 0 0 + + See also: "server-state-file", "server-state-file-name", and + "show servers state" + + +log global +log <address> [len <length>] [format <format>] [sample <ranges>:<sample_size>] + <facility> [<level> [<minlevel>]] +no log + Enable per-instance logging of events and traffic. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + + Prefix : + no should be used when the logger list must be flushed. For example, + if you don't want to inherit from the default logger list. This + prefix does not allow arguments. + + Arguments : + global should be used when the instance's logging parameters are the + same as the global ones. This is the most common usage. "global" + replaces <address>, <facility> and <level> with those of the log + entries found in the "global" section. Only one "log global" + statement may be used per instance, and this form takes no other + parameter. + + <address> indicates where to send the logs. It takes the same format as + for the "global" section's logs, and can be one of : + + - An IPv4 address optionally followed by a colon (':') and a UDP + port. If no port is specified, 514 is used by default (the + standard syslog port). + + - An IPv6 address followed by a colon (':') and optionally a UDP + port. If no port is specified, 514 is used by default (the + standard syslog port). + + - A filesystem path to a UNIX domain socket, keeping in mind + considerations for chroot (be sure the path is accessible + inside the chroot) and uid/gid (be sure the path is + appropriately writable). + + - A file descriptor number in the form "fd@<number>", which may + point to a pipe, terminal, or socket. In this case unbuffered + logs are used and one writev() call per log is performed. This + is a bit expensive but acceptable for most workloads. Messages + sent this way will not be truncated but may be dropped, in + which case the DroppedLogs counter will be incremented. The + writev() call is atomic even on pipes for messages up to + PIPE_BUF size, which POSIX recommends to be at least 512 and + which is 4096 bytes on most modern operating systems. Any + larger message may be interleaved with messages from other + processes. Exceptionally for debugging purposes the file + descriptor may also be directed to a file, but doing so will + significantly slow HAProxy down as non-blocking calls will be + ignored. Also there will be no way to purge nor rotate this + file without restarting the process. Note that the configured + syslog format is preserved, so the output is suitable for use + with a TCP syslog server. See also the "short" and "raw" + formats below. + + - "stdout" / "stderr", which are respectively aliases for "fd@1" + and "fd@2", see above. + + - A ring buffer in the form "ring@<name>", which will correspond + to an in-memory ring buffer accessible over the CLI using the + "show events" command, which will also list existing rings and + their sizes. Such buffers are lost on reload or restart but + when used as a complement this can help troubleshooting by + having the logs instantly available. + + - An explicit stream address prefix such as "tcp@","tcp6@", + "tcp4@" or "uxst@" will allocate an implicit ring buffer with + a stream forward server targeting the given address. + + You may want to reference some environment variables in the + address parameter, see section 2.3 about environment variables. + + <length> is an optional maximum line length. Log lines larger than this + value will be truncated before being sent. The reason is that + syslog servers act differently on log line length. All servers + support the default value of 1024, but some servers simply drop + larger lines while others do log them. If a server supports long + lines, it may make sense to set this value here in order to avoid + truncating long lines. Similarly, if a server drops long lines, + it is preferable to truncate them before sending them. Accepted + values are 80 to 65535 inclusive. The default value of 1024 is + generally fine for all standard usages. Some specific cases of + long captures or JSON-formatted logs may require larger values. + + <ranges> A list of comma-separated ranges to identify the logs to sample. + This is used to balance the load of the logs to send to the log + server. The limits of the ranges cannot be null. They are numbered + from 1. The size or period (in number of logs) of the sample must + be set with <sample_size> parameter. + + <sample_size> + The size of the sample in number of logs to consider when balancing + their logging loads. It is used to balance the load of the logs to + send to the syslog server. This size must be greater or equal to the + maximum of the high limits of the ranges. + (see also <ranges> parameter). + + <format> is the log format used when generating syslog messages. It may be + one of the following : + + local Analog to rfc3164 syslog message format except that hostname + field is stripped. This is the default. + Note: option "log-send-hostname" switches the default to + rfc3164. + + rfc3164 The RFC3164 syslog message format. + (https://tools.ietf.org/html/rfc3164) + + rfc5424 The RFC5424 syslog message format. + (https://tools.ietf.org/html/rfc5424) + + priority A message containing only a level plus syslog facility between + angle brackets such as '<63>', followed by the text. The PID, + date, time, process name and system name are omitted. This is + designed to be used with a local log server. + + short A message containing only a level between angle brackets such as + '<3>', followed by the text. The PID, date, time, process name + and system name are omitted. This is designed to be used with a + local log server. This format is compatible with what the + systemd logger consumes. + + timed A message containing only a level between angle brackets such as + '<3>', followed by ISO date and by the text. The PID, process + name and system name are omitted. This is designed to be + used with a local log server. + + iso A message containing only the ISO date, followed by the text. + The PID, process name and system name are omitted. This is + designed to be used with a local log server. + + raw A message containing only the text. The level, PID, date, time, + process name and system name are omitted. This is designed to + be used in containers or during development, where the severity + only depends on the file descriptor used (stdout/stderr). + + <facility> must be one of the 24 standard syslog facilities : + + kern user mail daemon auth syslog lpr news + uucp cron auth2 ftp ntp audit alert cron2 + local0 local1 local2 local3 local4 local5 local6 local7 + + Note that the facility is ignored for the "short" and "raw" + formats, but still required as a positional field. It is + recommended to use "daemon" in this case to make it clear that + it's only supposed to be used locally. + + <level> is optional and can be specified to filter outgoing messages. By + default, all messages are sent. If a level is specified, only + messages with a severity at least as important as this level + will be sent. An optional minimum level can be specified. If it + is set, logs emitted with a more severe level than this one will + be capped to this level. This is used to avoid sending "emerg" + messages on all terminals on some default syslog configurations. + Eight levels are known : + + emerg alert crit err warning notice info debug + + It is important to keep in mind that it is the frontend which decides what to + log from a connection, and that in case of content switching, the log entries + from the backend will be ignored. Connections are logged at level "info". + + However, backend log declaration define how and where servers status changes + will be logged. Level "notice" will be used to indicate a server going up, + "warning" will be used for termination signals and definitive service + termination, and "alert" will be used for when a server goes down. + + Note : According to RFC3164, messages are truncated to 1024 bytes before + being emitted. + + Example : + log global + log stdout format short daemon # send log to systemd + log stdout format raw daemon # send everything to stdout + log stderr format raw daemon notice # send important events to stderr + log 127.0.0.1:514 local0 notice # only send important events + log tcp@127.0.0.1:514 local0 notice notice # same but limit output + # level and send in tcp + log "${LOCAL_SYSLOG}:514" local0 notice # send to local server + + +log-format <string> + Specifies the log format string to use for traffic logs + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | no + + This directive specifies the log format string that will be used for all logs + resulting from traffic passing through the frontend using this line. If the + directive is used in a defaults section, all subsequent frontends will use + the same log format. Please see section 8.2.4 which covers the log format + string in depth. + A specific log-format used only in case of connection error can also be + defined, see the "error-log-format" option. + + "log-format" directive overrides previous "option tcplog", "log-format", + "option httplog" and "option httpslog" directives. + +log-format-sd <string> + Specifies the RFC5424 structured-data log format string + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | no + + This directive specifies the RFC5424 structured-data log format string that + will be used for all logs resulting from traffic passing through the frontend + using this line. If the directive is used in a defaults section, all + subsequent frontends will use the same log format. Please see section 8.2.4 + which covers the log format string in depth. + + See https://tools.ietf.org/html/rfc5424#section-6.3 for more information + about the RFC5424 structured-data part. + + Note : This log format string will be used only for loggers that have set + log format to "rfc5424". + + Example : + log-format-sd [exampleSDID@1234\ bytes=\"%B\"\ status=\"%ST\"] + + +log-tag <string> + Specifies the log tag to use for all outgoing logs + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | yes + + Sets the tag field in the syslog header to this string. It defaults to the + log-tag set in the global section, otherwise the program name as launched + from the command line, which usually is "HAProxy". Sometimes it can be useful + to differentiate between multiple processes running on the same host, or to + differentiate customer instances running in the same process. In the backend, + logs about servers up/down will use this tag. As a hint, it can be convenient + to set a log-tag related to a hosted customer in a defaults section then put + all the frontends and backends for that customer, then start another customer + in a new defaults section. See also the global "log-tag" directive. + +max-keep-alive-queue <value> + Set the maximum server queue size for maintaining keep-alive connections + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + HTTP keep-alive tries to reuse the same server connection whenever possible, + but sometimes it can be counter-productive, for example if a server has a lot + of connections while other ones are idle. This is especially true for static + servers. + + The purpose of this setting is to set a threshold on the number of queued + connections at which HAProxy stops trying to reuse the same server and prefers + to find another one. The default value, -1, means there is no limit. A value + of zero means that keep-alive requests will never be queued. For very close + servers which can be reached with a low latency and which are not sensible to + breaking keep-alive, a low value is recommended (e.g. local static server can + use a value of 10 or less). For remote servers suffering from a high latency, + higher values might be needed to cover for the latency and/or the cost of + picking a different server. + + Note that this has no impact on responses which are maintained to the same + server consecutively to a 401 response. They will still go to the same server + even if they have to be queued. + + See also : "option http-server-close", "option prefer-last-server", server + "maxconn" and cookie persistence. + +max-session-srv-conns <nb> + Set the maximum number of outgoing connections we can keep idling for a given + client session. The default is 5 (it precisely equals MAX_SRV_LIST which is + defined at build time). + +maxconn <conns> + Fix the maximum number of concurrent connections on a frontend + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <conns> is the maximum number of concurrent connections the frontend will + accept to serve. Excess connections will be queued by the system + in the socket's listen queue and will be served once a connection + closes. + + If the system supports it, it can be useful on big sites to raise this limit + very high so that HAProxy manages connection queues, instead of leaving the + clients with unanswered connection attempts. This value should not exceed the + global maxconn. Also, keep in mind that a connection contains two buffers + of tune.bufsize (16kB by default) each, as well as some other data resulting + in about 33 kB of RAM being consumed per established connection. That means + that a medium system equipped with 1GB of RAM can withstand around + 20000-25000 concurrent connections if properly tuned. + + Also, when <conns> is set to large values, it is possible that the servers + are not sized to accept such loads, and for this reason it is generally wise + to assign them some reasonable connection limits. + + When this value is set to zero, which is the default, the global "maxconn" + value is used. + + See also : "server", global section's "maxconn", "fullconn" + + +mode { tcp|http } + Set the running mode or protocol of the instance + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + tcp The instance will work in pure TCP mode. A full-duplex connection + will be established between clients and servers, and no layer 7 + examination will be performed. This is the default mode. It + should be used for SSL, SSH, SMTP, ... + + http The instance will work in HTTP mode. The client request will be + analyzed in depth before connecting to any server. Any request + which is not RFC-compliant will be rejected. Layer 7 filtering, + processing and switching will be possible. This is the mode which + brings HAProxy most of its value. + + When doing content switching, it is mandatory that the frontend and the + backend are in the same mode (generally HTTP), otherwise the configuration + will be refused. + + Example : + defaults http_instances + mode http + + +monitor fail { if | unless } <condition> + Add a condition to report a failure to a monitor HTTP request. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | no + Arguments : + if <cond> the monitor request will fail if the condition is satisfied, + and will succeed otherwise. The condition should describe a + combined test which must induce a failure if all conditions + are met, for instance a low number of servers both in a + backend and its backup. + + unless <cond> the monitor request will succeed only if the condition is + satisfied, and will fail otherwise. Such a condition may be + based on a test on the presence of a minimum number of active + servers in a list of backends. + + This statement adds a condition which can force the response to a monitor + request to report a failure. By default, when an external component queries + the URI dedicated to monitoring, a 200 response is returned. When one of the + conditions above is met, HAProxy will return 503 instead of 200. This is + very useful to report a site failure to an external component which may base + routing advertisements between multiple sites on the availability reported by + HAProxy. In this case, one would rely on an ACL involving the "nbsrv" + criterion. Note that "monitor fail" only works in HTTP mode. Both status + messages may be tweaked using "errorfile" or "errorloc" if needed. + + Example: + frontend www + mode http + acl site_dead nbsrv(dynamic) lt 2 + acl site_dead nbsrv(static) lt 2 + monitor-uri /site_alive + monitor fail if site_dead + + See also : "monitor-uri", "errorfile", "errorloc" + + +monitor-uri <uri> + Intercept a URI used by external components' monitor requests + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <uri> is the exact URI which we want to intercept to return HAProxy's + health status instead of forwarding the request. + + When an HTTP request referencing <uri> will be received on a frontend, + HAProxy will not forward it nor log it, but instead will return either + "HTTP/1.0 200 OK" or "HTTP/1.0 503 Service unavailable", depending on failure + conditions defined with "monitor fail". This is normally enough for any + front-end HTTP probe to detect that the service is UP and running without + forwarding the request to a backend server. Note that the HTTP method, the + version and all headers are ignored, but the request must at least be valid + at the HTTP level. This keyword may only be used with an HTTP-mode frontend. + + Monitor requests are processed very early, just after the request is parsed + and even before any "http-request". The only rulesets applied before are the + tcp-request ones. They cannot be logged either, and it is the intended + purpose. Only one URI may be configured for monitoring; when multiple + "monitor-uri" statements are present, the last one will define the URI to + be used. They are only used to report HAProxy's health to an upper component, + nothing more. However, it is possible to add any number of conditions using + "monitor fail" and ACLs so that the result can be adjusted to whatever check + can be imagined (most often the number of available servers in a backend). + + Note: if <uri> starts by a slash ('/'), the matching is performed against the + request's path instead of the request's uri. It is a workaround to let + the HTTP/2 requests match the monitor-uri. Indeed, in HTTP/2, clients + are encouraged to send absolute URIs only. + + Example : + # Use /haproxy_test to report HAProxy's status + frontend www + mode http + monitor-uri /haproxy_test + + See also : "monitor fail" + + +option abortonclose +no option abortonclose + Enable or disable early dropping of aborted requests pending in queues. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + In presence of very high loads, the servers will take some time to respond. + The per-instance connection queue will inflate, and the response time will + increase respective to the size of the queue times the average per-session + response time. When clients will wait for more than a few seconds, they will + often hit the "STOP" button on their browser, leaving a useless request in + the queue, and slowing down other users, and the servers as well, because the + request will eventually be served, then aborted at the first error + encountered while delivering the response. + + As there is no way to distinguish between a full STOP and a simple output + close on the client side, HTTP agents should be conservative and consider + that the client might only have closed its output channel while waiting for + the response. However, this introduces risks of congestion when lots of users + do the same, and is completely useless nowadays because probably no client at + all will close the session while waiting for the response. Some HTTP agents + support this behavior (Squid, Apache, HAProxy), and others do not (TUX, most + hardware-based load balancers). So the probability for a closed input channel + to represent a user hitting the "STOP" button is close to 100%, and the risk + of being the single component to break rare but valid traffic is extremely + low, which adds to the temptation to be able to abort a session early while + still not served and not pollute the servers. + + In HAProxy, the user can choose the desired behavior using the option + "abortonclose". By default (without the option) the behavior is HTTP + compliant and aborted requests will be served. But when the option is + specified, a session with an incoming channel closed will be aborted while + it is still possible, either pending in the queue for a connection slot, or + during the connection establishment if the server has not yet acknowledged + the connection request. This considerably reduces the queue size and the load + on saturated servers when users are tempted to click on STOP, which in turn + reduces the response time for other users. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "timeout queue" and server's "maxconn" and "maxqueue" parameters + + +option accept-invalid-http-request +no option accept-invalid-http-request + Enable or disable relaxing of HTTP request parsing + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + By default, HAProxy complies with RFC7230 in terms of message parsing. This + means that invalid characters in header names are not permitted and cause an + error to be returned to the client. This is the desired behavior as such + forbidden characters are essentially used to build attacks exploiting server + weaknesses, and bypass security filtering. Sometimes, a buggy browser or + server will emit invalid header names for whatever reason (configuration, + implementation) and the issue will not be immediately fixed. In such a case, + it is possible to relax HAProxy's header name parser to accept any character + even if that does not make sense, by specifying this option. Similarly, the + list of characters allowed to appear in a URI is well defined by RFC3986, and + chars 0-31, 32 (space), 34 ('"'), 60 ('<'), 62 ('>'), 92 ('\'), 94 ('^'), 96 + ('`'), 123 ('{'), 124 ('|'), 125 ('}'), 127 (delete) and anything above are + not allowed at all. HAProxy always blocks a number of them (0..32, 127). The + remaining ones are blocked by default unless this option is enabled. This + option also relaxes the test on the HTTP version, it allows HTTP/0.9 requests + to pass through (no version specified), as well as different protocol names + (e.g. RTSP), and multiple digits for both the major and the minor version. + + This option should never be enabled by default as it hides application bugs + and open security breaches. It should only be deployed after a problem has + been confirmed. + + When this option is enabled, erroneous header names will still be accepted in + requests, but the complete request will be captured in order to permit later + analysis using the "show errors" request on the UNIX stats socket. Similarly, + requests containing invalid chars in the URI part will be logged. Doing this + also helps confirming that the issue has been solved. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option accept-invalid-http-response" and "show errors" on the + stats socket. + + +option accept-invalid-http-response +no option accept-invalid-http-response + Enable or disable relaxing of HTTP response parsing + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + By default, HAProxy complies with RFC7230 in terms of message parsing. This + means that invalid characters in header names are not permitted and cause an + error to be returned to the client. This is the desired behavior as such + forbidden characters are essentially used to build attacks exploiting server + weaknesses, and bypass security filtering. Sometimes, a buggy browser or + server will emit invalid header names for whatever reason (configuration, + implementation) and the issue will not be immediately fixed. In such a case, + it is possible to relax HAProxy's header name parser to accept any character + even if that does not make sense, by specifying this option. This option also + relaxes the test on the HTTP version format, it allows multiple digits for + both the major and the minor version. + + This option should never be enabled by default as it hides application bugs + and open security breaches. It should only be deployed after a problem has + been confirmed. + + When this option is enabled, erroneous header names will still be accepted in + responses, but the complete response will be captured in order to permit + later analysis using the "show errors" request on the UNIX stats socket. + Doing this also helps confirming that the issue has been solved. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option accept-invalid-http-request" and "show errors" on the + stats socket. + + +option allbackups +no option allbackups + Use either all backup servers at a time or only the first one + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + By default, the first operational backup server gets all traffic when normal + servers are all down. Sometimes, it may be preferred to use multiple backups + at once, because one will not be enough. When "option allbackups" is enabled, + the load balancing will be performed among all backup servers when all normal + ones are unavailable. The same load balancing algorithm will be used and the + servers' weights will be respected. Thus, there will not be any priority + order between the backup servers anymore. + + This option is mostly used with static server farms dedicated to return a + "sorry" page when an application is completely offline. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + +option checkcache +no option checkcache + Analyze all server responses and block responses with cacheable cookies + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + Some high-level frameworks set application cookies everywhere and do not + always let enough control to the developer to manage how the responses should + be cached. When a session cookie is returned on a cacheable object, there is a + high risk of session crossing or stealing between users traversing the same + caches. In some situations, it is better to block the response than to let + some sensitive session information go in the wild. + + The option "checkcache" enables deep inspection of all server responses for + strict compliance with HTTP specification in terms of cacheability. It + carefully checks "Cache-control", "Pragma" and "Set-cookie" headers in server + response to check if there's a risk of caching a cookie on a client-side + proxy. When this option is enabled, the only responses which can be delivered + to the client are : + - all those without "Set-Cookie" header; + - all those with a return code other than 200, 203, 204, 206, 300, 301, + 404, 405, 410, 414, 501, provided that the server has not set a + "Cache-control: public" header field; + - all those that result from a request using a method other than GET, HEAD, + OPTIONS, TRACE, provided that the server has not set a 'Cache-Control: + public' header field; + - those with a 'Pragma: no-cache' header + - those with a 'Cache-control: private' header + - those with a 'Cache-control: no-store' header + - those with a 'Cache-control: max-age=0' header + - those with a 'Cache-control: s-maxage=0' header + - those with a 'Cache-control: no-cache' header + - those with a 'Cache-control: no-cache="set-cookie"' header + - those with a 'Cache-control: no-cache="set-cookie,' header + (allowing other fields after set-cookie) + + If a response doesn't respect these requirements, then it will be blocked + just as if it was from an "http-response deny" rule, with an "HTTP 502 bad + gateway". The session state shows "PH--" meaning that the proxy blocked the + response during headers processing. Additionally, an alert will be sent in + the logs so that admins are informed that there's something to be fixed. + + Due to the high impact on the application, the application should be tested + in depth with the option enabled before going to production. It is also a + good practice to always activate it during tests, even if it is not used in + production, as it will report potentially dangerous application behaviors. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + +option clitcpka +no option clitcpka + Enable or disable the sending of TCP keepalive packets on the client side + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + When there is a firewall or any session-aware component between a client and + a server, and when the protocol involves very long sessions with long idle + periods (e.g. remote desktops), there is a risk that one of the intermediate + components decides to expire a session which has remained idle for too long. + + Enabling socket-level TCP keep-alives makes the system regularly send packets + to the other end of the connection, leaving it active. The delay between + keep-alive probes is controlled by the system only and depends both on the + operating system and its tuning parameters. + + It is important to understand that keep-alive packets are neither emitted nor + received at the application level. It is only the network stacks which sees + them. For this reason, even if one side of the proxy already uses keep-alives + to maintain its connection alive, those keep-alive packets will not be + forwarded to the other side of the proxy. + + Please note that this has nothing to do with HTTP keep-alive. + + Using option "clitcpka" enables the emission of TCP keep-alive probes on the + client side of a connection, which should help when session expirations are + noticed between HAProxy and a client. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option srvtcpka", "option tcpka" + + +option contstats + Enable continuous traffic statistics updates + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + By default, counters used for statistics calculation are incremented + only when a session finishes. It works quite well when serving small + objects, but with big ones (for example large images or archives) or + with A/V streaming, a graph generated from HAProxy counters looks like + a hedgehog. With this option enabled counters get incremented frequently + along the session, typically every 5 seconds, which is often enough to + produce clean graphs. Recounting touches a hotpath directly so it is not + not enabled by default, as it can cause a lot of wakeups for very large + session counts and cause a small performance drop. + +option disable-h2-upgrade +no option disable-h2-upgrade + Enable or disable the implicit HTTP/2 upgrade from an HTTP/1.x client + connection. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + By default, HAProxy is able to implicitly upgrade an HTTP/1.x client + connection to an HTTP/2 connection if the first request it receives from a + given HTTP connection matches the HTTP/2 connection preface (i.e. the string + "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"). This way, it is possible to support + HTTP/1.x and HTTP/2 clients on a non-SSL connections. This option must be + used to disable the implicit upgrade. Note this implicit upgrade is only + supported for HTTP proxies, thus this option too. Note also it is possible to + force the HTTP/2 on clear connections by specifying "proto h2" on the bind + line. Finally, this option is applied on all bind lines. To disable implicit + HTTP/2 upgrades for a specific bind line, it is possible to use "proto h1". + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + +option dontlog-normal +no option dontlog-normal + Enable or disable logging of normal, successful connections + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + There are large sites dealing with several thousand connections per second + and for which logging is a major pain. Some of them are even forced to turn + logs off and cannot debug production issues. Setting this option ensures that + normal connections, those which experience no error, no timeout, no retry nor + redispatch, will not be logged. This leaves disk space for anomalies. In HTTP + mode, the response status code is checked and return codes 5xx will still be + logged. + + It is strongly discouraged to use this option as most of the time, the key to + complex issues is in the normal logs which will not be logged here. If you + need to separate logs, see the "log-separate-errors" option instead. + + See also : "log", "dontlognull", "log-separate-errors" and section 8 about + logging. + + +option dontlognull +no option dontlognull + Enable or disable logging of null connections + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + In certain environments, there are components which will regularly connect to + various systems to ensure that they are still alive. It can be the case from + another load balancer as well as from monitoring systems. By default, even a + simple port probe or scan will produce a log. If those connections pollute + the logs too much, it is possible to enable option "dontlognull" to indicate + that a connection on which no data has been transferred will not be logged, + which typically corresponds to those probes. Note that errors will still be + returned to the client and accounted for in the stats. If this is not what is + desired, option http-ignore-probes can be used instead. + + It is generally recommended not to use this option in uncontrolled + environments (e.g. internet), otherwise scans and other malicious activities + would not be logged. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "log", "http-ignore-probes", "monitor-uri", and + section 8 about logging. + + +option forwardfor [ except <network> ] [ header <name> ] [ if-none ] + Enable insertion of the X-Forwarded-For header to requests sent to servers + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <network> is an optional argument used to disable this option for sources + matching <network> + <name> an optional argument to specify a different "X-Forwarded-For" + header name. + + Since HAProxy works in reverse-proxy mode, the servers see its IP address as + their client address. This is sometimes annoying when the client's IP address + is expected in server logs. To solve this problem, the well-known HTTP header + "X-Forwarded-For" may be added by HAProxy to all requests sent to the server. + This header contains a value representing the client's IP address. Since this + header is always appended at the end of the existing header list, the server + must be configured to always use the last occurrence of this header only. See + the server's manual to find how to enable use of this standard header. Note + that only the last occurrence of the header must be used, since it is really + possible that the client has already brought one. + + The keyword "header" may be used to supply a different header name to replace + the default "X-Forwarded-For". This can be useful where you might already + have a "X-Forwarded-For" header from a different application (e.g. stunnel), + and you need preserve it. Also if your backend server doesn't use the + "X-Forwarded-For" header and requires different one (e.g. Zeus Web Servers + require "X-Cluster-Client-IP"). + + Sometimes, a same HAProxy instance may be shared between a direct client + access and a reverse-proxy access (for instance when an SSL reverse-proxy is + used to decrypt HTTPS traffic). It is possible to disable the addition of the + header for a known source address or network by adding the "except" keyword + followed by the network address. In this case, any source IP matching the + network will not cause an addition of this header. Most common uses are with + private networks or 127.0.0.1. IPv4 and IPv6 are both supported. + + Alternatively, the keyword "if-none" states that the header will only be + added if it is not present. This should only be used in perfectly trusted + environment, as this might cause a security issue if headers reaching HAProxy + are under the control of the end-user. + + This option may be specified either in the frontend or in the backend. If at + least one of them uses it, the header will be added. Note that the backend's + setting of the header subargument takes precedence over the frontend's if + both are defined. In the case of the "if-none" argument, if at least one of + the frontend or the backend does not specify it, it wants the addition to be + mandatory, so it wins. + + Example : + # Public HTTP address also used by stunnel on the same machine + frontend www + mode http + option forwardfor except 127.0.0.1 # stunnel already adds the header + + # Those servers want the IP Address in X-Client + backend www + mode http + option forwardfor header X-Client + + See also : "option httpclose", "option http-server-close", + "option http-keep-alive" + + +option h1-case-adjust-bogus-client +no option h1-case-adjust-bogus-client + Enable or disable the case adjustment of HTTP/1 headers sent to bogus clients + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + There is no standard case for header names because, as stated in RFC7230, + they are case-insensitive. So applications must handle them in a case- + insensitive manner. But some bogus applications violate the standards and + erroneously rely on the cases most commonly used by browsers. This problem + becomes critical with HTTP/2 because all header names must be exchanged in + lower case, and HAProxy follows the same convention. All header names are + sent in lower case to clients and servers, regardless of the HTTP version. + + When HAProxy receives an HTTP/1 response, its header names are converted to + lower case and manipulated and sent this way to the clients. If a client is + known to violate the HTTP standards and to fail to process a response coming + from HAProxy, it is possible to transform the lower case header names to a + different format when the response is formatted and sent to the client, by + enabling this option and specifying the list of headers to be reformatted + using the global directives "h1-case-adjust" or "h1-case-adjust-file". This + must only be a temporary workaround for the time it takes the client to be + fixed, because clients which require such workarounds might be vulnerable to + content smuggling attacks and must absolutely be fixed. + + Please note that this option will not affect standards-compliant clients. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also: "option h1-case-adjust-bogus-server", "h1-case-adjust", + "h1-case-adjust-file". + + +option h1-case-adjust-bogus-server +no option h1-case-adjust-bogus-server + Enable or disable the case adjustment of HTTP/1 headers sent to bogus servers + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + There is no standard case for header names because, as stated in RFC7230, + they are case-insensitive. So applications must handle them in a case- + insensitive manner. But some bogus applications violate the standards and + erroneously rely on the cases most commonly used by browsers. This problem + becomes critical with HTTP/2 because all header names must be exchanged in + lower case, and HAProxy follows the same convention. All header names are + sent in lower case to clients and servers, regardless of the HTTP version. + + When HAProxy receives an HTTP/1 request, its header names are converted to + lower case and manipulated and sent this way to the servers. If a server is + known to violate the HTTP standards and to fail to process a request coming + from HAProxy, it is possible to transform the lower case header names to a + different format when the request is formatted and sent to the server, by + enabling this option and specifying the list of headers to be reformatted + using the global directives "h1-case-adjust" or "h1-case-adjust-file". This + must only be a temporary workaround for the time it takes the server to be + fixed, because servers which require such workarounds might be vulnerable to + content smuggling attacks and must absolutely be fixed. + + Please note that this option will not affect standards-compliant servers. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also: "option h1-case-adjust-bogus-client", "h1-case-adjust", + "h1-case-adjust-file". + + +option http-buffer-request +no option http-buffer-request + Enable or disable waiting for whole HTTP request body before proceeding + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + It is sometimes desirable to wait for the body of an HTTP request before + taking a decision. This is what is being done by "balance url_param" for + example. The first use case is to buffer requests from slow clients before + connecting to the server. Another use case consists in taking the routing + decision based on the request body's contents. This option placed in a + frontend or backend forces the HTTP processing to wait until either the whole + body is received or the request buffer is full. It can have undesired side + effects with some applications abusing HTTP by expecting unbuffered + transmissions between the frontend and the backend, so this should definitely + not be used by default. + + See also : "option http-no-delay", "timeout http-request", + "http-request wait-for-body" + + +option http-ignore-probes +no option http-ignore-probes + Enable or disable logging of null connections and request timeouts + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + Recently some browsers started to implement a "pre-connect" feature + consisting in speculatively connecting to some recently visited web sites + just in case the user would like to visit them. This results in many + connections being established to web sites, which end up in 408 Request + Timeout if the timeout strikes first, or 400 Bad Request when the browser + decides to close them first. These ones pollute the log and feed the error + counters. There was already "option dontlognull" but it's insufficient in + this case. Instead, this option does the following things : + - prevent any 400/408 message from being sent to the client if nothing + was received over a connection before it was closed; + - prevent any log from being emitted in this situation; + - prevent any error counter from being incremented + + That way the empty connection is silently ignored. Note that it is better + not to use this unless it is clear that it is needed, because it will hide + real problems. The most common reason for not receiving a request and seeing + a 408 is due to an MTU inconsistency between the client and an intermediary + element such as a VPN, which blocks too large packets. These issues are + generally seen with POST requests as well as GET with large cookies. The logs + are often the only way to detect them. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "log", "dontlognull", "errorfile", and section 8 about logging. + + +option http-keep-alive +no option http-keep-alive + Enable or disable HTTP keep-alive from client to server for HTTP/1.x + connections + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + By default HAProxy operates in keep-alive mode with regards to persistent + HTTP/1.x connections: for each connection it processes each request and + response, and leaves the connection idle on both sides. This mode may be + changed by several options such as "option http-server-close" or "option + httpclose". This option allows to set back the keep-alive mode, which can be + useful when another mode was used in a defaults section. + + Setting "option http-keep-alive" enables HTTP keep-alive mode on the client- + and server- sides. This provides the lowest latency on the client side (slow + network) and the fastest session reuse on the server side at the expense + of maintaining idle connections to the servers. In general, it is possible + with this option to achieve approximately twice the request rate that the + "http-server-close" option achieves on small objects. There are mainly two + situations where this option may be useful : + + - when the server is non-HTTP compliant and authenticates the connection + instead of requests (e.g. NTLM authentication) + + - when the cost of establishing the connection to the server is significant + compared to the cost of retrieving the associated object from the server. + + This last case can happen when the server is a fast static server of cache. + + At the moment, logs will not indicate whether requests came from the same + session or not. The accept date reported in the logs corresponds to the end + of the previous request, and the request time corresponds to the time spent + waiting for a new request. The keep-alive request time is still bound to the + timeout defined by "timeout http-keep-alive" or "timeout http-request" if + not set. + + This option disables and replaces any previous "option httpclose" or "option + http-server-close". + + See also : "option httpclose",, "option http-server-close", + "option prefer-last-server" and "option http-pretend-keepalive". + + +option http-no-delay +no option http-no-delay + Instruct the system to favor low interactive delays over performance in HTTP + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + In HTTP, each payload is unidirectional and has no notion of interactivity. + Any agent is expected to queue data somewhat for a reasonably low delay. + There are some very rare server-to-server applications that abuse the HTTP + protocol and expect the payload phase to be highly interactive, with many + interleaved data chunks in both directions within a single request. This is + absolutely not supported by the HTTP specification and will not work across + most proxies or servers. When such applications attempt to do this through + HAProxy, it works but they will experience high delays due to the network + optimizations which favor performance by instructing the system to wait for + enough data to be available in order to only send full packets. Typical + delays are around 200 ms per round trip. Note that this only happens with + abnormal uses. Normal uses such as CONNECT requests nor WebSockets are not + affected. + + When "option http-no-delay" is present in either the frontend or the backend + used by a connection, all such optimizations will be disabled in order to + make the exchanges as fast as possible. Of course this offers no guarantee on + the functionality, as it may break at any other place. But if it works via + HAProxy, it will work as fast as possible. This option should never be used + by default, and should never be used at all unless such a buggy application + is discovered. The impact of using this option is an increase of bandwidth + usage and CPU usage, which may significantly lower performance in high + latency environments. + + See also : "option http-buffer-request" + + +option http-pretend-keepalive +no option http-pretend-keepalive + Define whether HAProxy will announce keepalive for HTTP/1.x connection to the + server or not + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + When running with "option http-server-close" or "option httpclose", HAProxy + adds a "Connection: close" header to the HTTP/1.x request forwarded to the + server. Unfortunately, when some servers see this header, they automatically + refrain from using the chunked encoding for responses of unknown length, + while this is totally unrelated. The effect is that a client or a cache could + receive an incomplete response without being aware of it, and consider the + response complete. + + By setting "option http-pretend-keepalive", HAProxy will make the server + believe it will keep the connection alive. The server will then not fall back + to the abnormal undesired above. When HAProxy gets the whole response, it + will close the connection with the server just as it would do with the + "option httpclose". That way the client gets a normal response and the + connection is correctly closed on the server side. + + It is recommended not to enable this option by default, because most servers + will more efficiently close the connection themselves after the last packet, + and release its buffers slightly earlier. Also, the added packet on the + network could slightly reduce the overall peak performance. However it is + worth noting that when this option is enabled, HAProxy will have slightly + less work to do. So if HAProxy is the bottleneck on the whole architecture, + enabling this option might save a few CPU cycles. + + This option may be set in backend and listen sections. Using it in a frontend + section will be ignored and a warning will be reported during startup. It is + a backend related option, so there is no real reason to set it on a + frontend. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option httpclose", "option http-server-close", and + "option http-keep-alive" + +option http-restrict-req-hdr-names { preserve | delete | reject } + Set HAProxy policy about HTTP request header names containing characters + outside the "[a-zA-Z0-9-]" charset + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + preserve disable the filtering. It is the default mode for HTTP proxies + with no FastCGI application configured. + + delete remove request headers with a name containing a character + outside the "[a-zA-Z0-9-]" charset. It is the default mode for + HTTP backends with a configured FastCGI application. + + reject reject the request with a 403-Forbidden response if it contains a + header name with a character outside the "[a-zA-Z0-9-]" charset. + + This option may be used to restrict the request header names to alphanumeric + and hyphen characters ([A-Za-z0-9-]). This may be mandatory to interoperate + with non-HTTP compliant servers that fail to handle some characters in header + names. It may also be mandatory for FastCGI applications because all + non-alphanumeric characters in header names are replaced by an underscore + ('_'). Thus, it is easily possible to mix up header names and bypass some + rules. For instance, "X-Forwarded-For" and "X_Forwarded-For" headers are both + converted to "HTTP_X_FORWARDED_FOR" in FastCGI. + + Note this option is evaluated per proxy and after the http-request rules + evaluation. + +option http-server-close +no option http-server-close + Enable or disable HTTP/1.x connection closing on the server side + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + By default HAProxy operates in keep-alive mode with regards to persistent + HTTP/1.x connections: for each connection it processes each request and + response, and leaves the connection idle on both sides. This mode may be + changed by several options such as "option http-server-close" or "option + httpclose". Setting "option http-server-close" enables HTTP connection-close + mode on the server side while keeping the ability to support HTTP keep-alive + and pipelining on the client side. This provides the lowest latency on the + client side (slow network) and the fastest session reuse on the server side + to save server resources, similarly to "option httpclose". It also permits + non-keepalive capable servers to be served in keep-alive mode to the clients + if they conform to the requirements of RFC7230. Please note that some servers + do not always conform to those requirements when they see "Connection: close" + in the request. The effect will be that keep-alive will never be used. A + workaround consists in enabling "option http-pretend-keepalive". + + At the moment, logs will not indicate whether requests came from the same + session or not. The accept date reported in the logs corresponds to the end + of the previous request, and the request time corresponds to the time spent + waiting for a new request. The keep-alive request time is still bound to the + timeout defined by "timeout http-keep-alive" or "timeout http-request" if + not set. + + This option may be set both in a frontend and in a backend. It is enabled if + at least one of the frontend or backend holding a connection has it enabled. + It disables and replaces any previous "option httpclose" or "option + http-keep-alive". Please check section 4 ("Proxies") to see how this option + combines with others when frontend and backend options differ. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option httpclose", "option http-pretend-keepalive" and + "option http-keep-alive". + +option http-use-proxy-header +no option http-use-proxy-header + Make use of non-standard Proxy-Connection header instead of Connection + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + While RFC7230 explicitly states that HTTP/1.1 agents must use the + Connection header to indicate their wish of persistent or non-persistent + connections, both browsers and proxies ignore this header for proxied + connections and make use of the undocumented, non-standard Proxy-Connection + header instead. The issue begins when trying to put a load balancer between + browsers and such proxies, because there will be a difference between what + HAProxy understands and what the client and the proxy agree on. + + By setting this option in a frontend, HAProxy can automatically switch to use + that non-standard header if it sees proxied requests. A proxied request is + defined here as one where the URI begins with neither a '/' nor a '*'. This + is incompatible with the HTTP tunnel mode. Note that this option can only be + specified in a frontend and will affect the request along its whole life. + + Also, when this option is set, a request which requires authentication will + automatically switch to use proxy authentication headers if it is itself a + proxied request. That makes it possible to check or enforce authentication in + front of an existing proxy. + + This option should normally never be used, except in front of a proxy. + + See also : "option httpclose", and "option http-server-close". + +option httpchk +option httpchk <uri> +option httpchk <method> <uri> +option httpchk <method> <uri> <version> + Enables HTTP protocol to check on the servers health + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <method> is the optional HTTP method used with the requests. When not set, + the "OPTIONS" method is used, as it generally requires low server + processing and is easy to filter out from the logs. Any method + may be used, though it is not recommended to invent non-standard + ones. + + <uri> is the URI referenced in the HTTP requests. It defaults to " / " + which is accessible by default on almost any server, but may be + changed to any other URI. Query strings are permitted. + + <version> is the optional HTTP version string. It defaults to "HTTP/1.0" + but some servers might behave incorrectly in HTTP 1.0, so turning + it to HTTP/1.1 may sometimes help. Note that the Host field is + mandatory in HTTP/1.1, use "http-check send" directive to add it. + + By default, server health checks only consist in trying to establish a TCP + connection. When "option httpchk" is specified, a complete HTTP request is + sent once the TCP connection is established, and responses 2xx and 3xx are + considered valid, while all other ones indicate a server failure, including + the lack of any response. + + Combined with "http-check" directives, it is possible to customize the + request sent during the HTTP health checks or the matching rules on the + response. It is also possible to configure a send/expect sequence, just like + with the directive "tcp-check" for TCP health checks. + + The server configuration is used by default to open connections to perform + HTTP health checks. By it is also possible to overwrite server parameters + using "http-check connect" rules. + + "httpchk" option does not necessarily require an HTTP backend, it also works + with plain TCP backends. This is particularly useful to check simple scripts + bound to some dedicated ports using the inetd daemon. However, it will always + internally relies on an HTX multiplexer. Thus, it means the request + formatting and the response parsing will be strict. + + Note : For a while, there was no way to add headers or body in the request + used for HTTP health checks. So a workaround was to hide it at the end + of the version string with a "\r\n" after the version. It is now + deprecated. The directive "http-check send" must be used instead. + + Examples : + # Relay HTTPS traffic to Apache instance and check service availability + # using HTTP request "OPTIONS * HTTP/1.1" on port 80. + backend https_relay + mode tcp + option httpchk OPTIONS * HTTP/1.1 + http-check send hdr Host www + server apache1 192.168.1.1:443 check port 80 + + See also : "option ssl-hello-chk", "option smtpchk", "option mysql-check", + "option pgsql-check", "http-check" and the "check", "port" and + "inter" server options. + + +option httpclose +no option httpclose + Enable or disable HTTP/1.x connection closing + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + By default HAProxy operates in keep-alive mode with regards to persistent + HTTP/1.x connections: for each connection it processes each request and + response, and leaves the connection idle on both sides. This mode may be + changed by several options such as "option http-server-close" or "option + httpclose". + + If "option httpclose" is set, HAProxy will close the client or the server + connection, depending where the option is set. Only the frontend is + considered for client connections while the frontend and the backend are + considered for server ones. In this case the option is enabled if at least + one of the frontend or backend holding the connection has it enabled. If the + option is set on a listener, it is applied both on client and server + connections. It will check if a "Connection: close" header is already set in + each direction, and will add one if missing. + + This option may also be combined with "option http-pretend-keepalive", which + will disable sending of the "Connection: close" request header, but will + still cause the connection to be closed once the whole response is received. + + It disables and replaces any previous "option http-server-close" or "option + http-keep-alive". + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option http-server-close". + + +option httplog [ clf ] + Enable logging of HTTP request, session state and timers + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + clf if the "clf" argument is added, then the output format will be + the CLF format instead of HAProxy's default HTTP format. You can + use this when you need to feed HAProxy's logs through a specific + log analyzer which only support the CLF format and which is not + extensible. + + By default, the log output format is very poor, as it only contains the + source and destination addresses, and the instance name. By specifying + "option httplog", each log line turns into a much richer format including, + but not limited to, the HTTP request, the connection timers, the session + status, the connections numbers, the captured headers and cookies, the + frontend, backend and server name, and of course the source address and + ports. + + Specifying only "option httplog" will automatically clear the 'clf' mode + if it was set by default. + + "option httplog" overrides any previous "log-format" directive. + + See also : section 8 about logging. + +option httpslog + Enable logging of HTTPS request, session state and timers + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + + By default, the log output format is very poor, as it only contains the + source and destination addresses, and the instance name. By specifying + "option httpslog", each log line turns into a much richer format including, + but not limited to, the HTTP request, the connection timers, the session + status, the connections numbers, the captured headers and cookies, the + frontend, backend and server name, the SSL certificate verification and SSL + handshake statuses, and of course the source address and ports. + + "option httpslog" overrides any previous "log-format" directive. + + See also : section 8 about logging. + + +option independent-streams +no option independent-streams + Enable or disable independent timeout processing for both directions + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + By default, when data is sent over a socket, both the write timeout and the + read timeout for that socket are refreshed, because we consider that there is + activity on that socket, and we have no other means of guessing if we should + receive data or not. + + While this default behavior is desirable for almost all applications, there + exists a situation where it is desirable to disable it, and only refresh the + read timeout if there are incoming data. This happens on sessions with large + timeouts and low amounts of exchanged data such as telnet session. If the + server suddenly disappears, the output data accumulates in the system's + socket buffers, both timeouts are correctly refreshed, and there is no way + to know the server does not receive them, so we don't timeout. However, when + the underlying protocol always echoes sent data, it would be enough by itself + to detect the issue using the read timeout. Note that this problem does not + happen with more verbose protocols because data won't accumulate long in the + socket buffers. + + When this option is set on the frontend, it will disable read timeout updates + on data sent to the client. There probably is little use of this case. When + the option is set on the backend, it will disable read timeout updates on + data sent to the server. Doing so will typically break large HTTP posts from + slow lines, so use it with caution. + + See also : "timeout client", "timeout server" and "timeout tunnel" + + +option ldap-check + Use LDAPv3 health checks for server testing + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + It is possible to test that the server correctly talks LDAPv3 instead of just + testing that it accepts the TCP connection. When this option is set, an + LDAPv3 anonymous simple bind message is sent to the server, and the response + is analyzed to find an LDAPv3 bind response message. + + The server is considered valid only when the LDAP response contains success + resultCode (http://tools.ietf.org/html/rfc4511#section-4.1.9). + + Logging of bind requests is server dependent see your documentation how to + configure it. + + Example : + option ldap-check + + See also : "option httpchk" + + +option external-check + Use external processes for server health checks + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + + It is possible to test the health of a server using an external command. + This is achieved by running the executable set using "external-check + command". + + Requires the "external-check" global to be set. + + See also : "external-check", "external-check command", "external-check path" + + +option idle-close-on-response +no option idle-close-on-response + Avoid closing idle frontend connections if a soft stop is in progress + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + By default, idle connections will be closed during a soft stop. In some + environments, a client talking to the proxy may have prepared some idle + connections in order to send requests later. If there is no proper retry on + write errors, this can result in errors while haproxy is reloading. Even + though a proper implementation should retry on connection/write errors, this + option was introduced to support backwards compatibility with haproxy prior + to version 2.4. Indeed before v2.4, haproxy used to wait for a last request + and response to add a "connection: close" header before closing, thus + notifying the client that the connection would not be reusable. + + In a real life example, this behavior was seen in AWS using the ALB in front + of a haproxy. The end result was ALB sending 502 during haproxy reloads. + + Users are warned that using this option may increase the number of old + processes if connections remain idle for too long. Adjusting the client + timeouts and/or the "hard-stop-after" parameter accordingly might be + needed in case of frequent reloads. + + See also: "timeout client", "timeout client-fin", "timeout http-request", + "hard-stop-after" + + +option log-health-checks +no option log-health-checks + Enable or disable logging of health checks status updates + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + By default, failed health check are logged if server is UP and successful + health checks are logged if server is DOWN, so the amount of additional + information is limited. + + When this option is enabled, any change of the health check status or to + the server's health will be logged, so that it becomes possible to know + that a server was failing occasional checks before crashing, or exactly when + it failed to respond a valid HTTP status, then when the port started to + reject connections, then when the server stopped responding at all. + + Note that status changes not caused by health checks (e.g. enable/disable on + the CLI) are intentionally not logged by this option. + + See also: "option httpchk", "option ldap-check", "option mysql-check", + "option pgsql-check", "option redis-check", "option smtpchk", + "option tcp-check", "log" and section 8 about logging. + + +option log-separate-errors +no option log-separate-errors + Change log level for non-completely successful connections + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + Sometimes looking for errors in logs is not easy. This option makes HAProxy + raise the level of logs containing potentially interesting information such + as errors, timeouts, retries, redispatches, or HTTP status codes 5xx. The + level changes from "info" to "err". This makes it possible to log them + separately to a different file with most syslog daemons. Be careful not to + remove them from the original file, otherwise you would lose ordering which + provides very important information. + + Using this option, large sites dealing with several thousand connections per + second may log normal traffic to a rotating buffer and only archive smaller + error logs. + + See also : "log", "dontlognull", "dontlog-normal" and section 8 about + logging. + + +option logasap +no option logasap + Enable or disable early logging. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + By default, logs are emitted when all the log format variables and sample + fetches used in the definition of the log-format string return a value, or + when the session is terminated. This allows the built in log-format strings + to account for the transfer time, or the number of bytes in log messages. + + When handling long lived connections such as large file transfers or RDP, + it may take a while for the request or connection to appear in the logs. + Using "option logasap", the log message is created as soon as the server + connection is established in mode tcp, or as soon as the server sends the + complete headers in mode http. Missing information in the logs will be the + total number of bytes which will only indicate the amount of data transferred + before the message was created and the total time which will not take the + remainder of the connection life or transfer time into account. For the case + of HTTP, it is good practice to capture the Content-Length response header + so that the logs at least indicate how many bytes are expected to be + transferred. + + Examples : + listen http_proxy 0.0.0.0:80 + mode http + option httplog + option logasap + log 192.168.2.200 local3 + + >>> Feb 6 12:14:14 localhost \ + haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in \ + static/srv1 9/10/7/14/+30 200 +243 - - ---- 3/1/1/1/0 1/0 \ + "GET /image.iso HTTP/1.0" + + See also : "option httplog", "capture response header", and section 8 about + logging. + + +option mysql-check [ user <username> [ { post-41 | pre-41 } ] ] + Use MySQL health checks for server testing + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <username> This is the username which will be used when connecting to MySQL + server. + post-41 Send post v4.1 client compatible checks (the default) + pre-41 Send pre v4.1 client compatible checks + + If you specify a username, the check consists of sending two MySQL packet, + one Client Authentication packet, and one QUIT packet, to correctly close + MySQL session. We then parse the MySQL Handshake Initialization packet and/or + Error packet. It is a basic but useful test which does not produce error nor + aborted connect on the server. However, it requires an unlocked authorised + user without a password. To create a basic limited user in MySQL with + optional resource limits: + + CREATE USER '<username>'@'<ip_of_haproxy|network_of_haproxy/netmask>' + /*!50701 WITH MAX_QUERIES_PER_HOUR 1 MAX_UPDATES_PER_HOUR 0 */ + /*M!100201 MAX_STATEMENT_TIME 0.0001 */; + + If you don't specify a username (it is deprecated and not recommended), the + check only consists in parsing the Mysql Handshake Initialization packet or + Error packet, we don't send anything in this mode. It was reported that it + can generate lockout if check is too frequent and/or if there is not enough + traffic. In fact, you need in this case to check MySQL "max_connect_errors" + value as if a connection is established successfully within fewer than MySQL + "max_connect_errors" attempts after a previous connection was interrupted, + the error count for the host is cleared to zero. If HAProxy's server get + blocked, the "FLUSH HOSTS" statement is the only way to unblock it. + + Remember that this does not check database presence nor database consistency. + To do this, you can use an external check with xinetd for example. + + The check requires MySQL >=3.22, for older version, please use TCP check. + + Most often, an incoming MySQL server needs to see the client's IP address for + various purposes, including IP privilege matching and connection logging. + When possible, it is often wise to masquerade the client's IP address when + connecting to the server using the "usesrc" argument of the "source" keyword, + which requires the transparent proxy feature to be compiled in, and the MySQL + server to route the client via the machine hosting HAProxy. + + See also: "option httpchk" + + +option nolinger +no option nolinger + Enable or disable immediate session resource cleaning after close + May be used in sections: defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + When clients or servers abort connections in a dirty way (e.g. they are + physically disconnected), the session timeouts triggers and the session is + closed. But it will remain in FIN_WAIT1 state for some time in the system, + using some resources and possibly limiting the ability to establish newer + connections. + + When this happens, it is possible to activate "option nolinger" which forces + the system to immediately remove any socket's pending data on close. Thus, + a TCP RST is emitted, any pending data are truncated, and the session is + instantly purged from the system's tables. The generally visible effect for + a client is that responses are truncated if the close happens with a last + block of data (e.g. on a redirect or error response). On the server side, + it may help release the source ports immediately when forwarding a client + aborts in tunnels. In both cases, TCP resets are emitted and given that + the session is instantly destroyed, there will be no retransmit. On a lossy + network this can increase problems, especially when there is a firewall on + the lossy side, because the firewall might see and process the reset (hence + purge its session) and block any further traffic for this session,, including + retransmits from the other side. So if the other side doesn't receive it, + it will never receive any RST again, and the firewall might log many blocked + packets. + + For all these reasons, it is strongly recommended NOT to use this option, + unless absolutely needed as a last resort. In most situations, using the + "client-fin" or "server-fin" timeouts achieves similar results with a more + reliable behavior. On Linux it's also possible to use the "tcp-ut" bind or + server setting. + + This option may be used both on frontends and backends, depending on the side + where it is required. Use it on the frontend for clients, and on the backend + for servers. While this option is technically supported in "defaults" + sections, it must really not be used there as it risks to accidentally + propagate to sections that must no use it and to cause problems there. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also: "timeout client-fin", "timeout server-fin", "tcp-ut" bind or server + keywords. + +option originalto [ except <network> ] [ header <name> ] + Enable insertion of the X-Original-To header to requests sent to servers + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <network> is an optional argument used to disable this option for sources + matching <network> + <name> an optional argument to specify a different "X-Original-To" + header name. + + Since HAProxy can work in transparent mode, every request from a client can + be redirected to the proxy and HAProxy itself can proxy every request to a + complex SQUID environment and the destination host from SO_ORIGINAL_DST will + be lost. This is annoying when you want access rules based on destination ip + addresses. To solve this problem, a new HTTP header "X-Original-To" may be + added by HAProxy to all requests sent to the server. This header contains a + value representing the original destination IP address. Since this must be + configured to always use the last occurrence of this header only. Note that + only the last occurrence of the header must be used, since it is really + possible that the client has already brought one. + + The keyword "header" may be used to supply a different header name to replace + the default "X-Original-To". This can be useful where you might already + have a "X-Original-To" header from a different application, and you need + preserve it. Also if your backend server doesn't use the "X-Original-To" + header and requires different one. + + Sometimes, a same HAProxy instance may be shared between a direct client + access and a reverse-proxy access (for instance when an SSL reverse-proxy is + used to decrypt HTTPS traffic). It is possible to disable the addition of the + header for a known destination address or network by adding the "except" + keyword followed by the network address. In this case, any destination IP + matching the network will not cause an addition of this header. Most common + uses are with private networks or 127.0.0.1. IPv4 and IPv6 are both + supported. + + This option may be specified either in the frontend or in the backend. If at + least one of them uses it, the header will be added. Note that the backend's + setting of the header subargument takes precedence over the frontend's if + both are defined. + + Examples : + # Original Destination address + frontend www + mode http + option originalto except 127.0.0.1 + + # Those servers want the IP Address in X-Client-Dst + backend www + mode http + option originalto header X-Client-Dst + + See also : "option httpclose", "option http-server-close". + + +option persist +no option persist + Enable or disable forced persistence on down servers + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + When an HTTP request reaches a backend with a cookie which references a dead + server, by default it is redispatched to another server. It is possible to + force the request to be sent to the dead server first using "option persist" + if absolutely needed. A common use case is when servers are under extreme + load and spend their time flapping. In this case, the users would still be + directed to the server they opened the session on, in the hope they would be + correctly served. It is recommended to use "option redispatch" in conjunction + with this option so that in the event it would not be possible to connect to + the server at all (server definitely dead), the client would finally be + redirected to another valid server. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option redispatch", "retries", "force-persist" + + +option pgsql-check user <username> + Use PostgreSQL health checks for server testing + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <username> This is the username which will be used when connecting to + PostgreSQL server. + + The check sends a PostgreSQL StartupMessage and waits for either + Authentication request or ErrorResponse message. It is a basic but useful + test which does not produce error nor aborted connect on the server. + This check is identical with the "mysql-check". + + See also: "option httpchk" + + +option prefer-last-server +no option prefer-last-server + Allow multiple load balanced requests to remain on the same server + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + When the load balancing algorithm in use is not deterministic, and a previous + request was sent to a server to which HAProxy still holds a connection, it is + sometimes desirable that subsequent requests on a same session go to the same + server as much as possible. Note that this is different from persistence, as + we only indicate a preference which HAProxy tries to apply without any form + of warranty. The real use is for keep-alive connections sent to servers. When + this option is used, HAProxy will try to reuse the same connection that is + attached to the server instead of rebalancing to another server, causing a + close of the connection. This can make sense for static file servers. It does + not make much sense to use this in combination with hashing algorithms. Note, + HAProxy already automatically tries to stick to a server which sends a 401 or + to a proxy which sends a 407 (authentication required), when the load + balancing algorithm is not deterministic. This is mandatory for use with the + broken NTLM authentication challenge, and significantly helps in + troubleshooting some faulty applications. Option prefer-last-server might be + desirable in these environments as well, to avoid redistributing the traffic + after every other response. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also: "option http-keep-alive" + + +option redispatch +option redispatch <interval> +no option redispatch + Enable or disable session redistribution in case of connection failure + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <interval> The optional integer value that controls how often redispatches + occur when retrying connections. Positive value P indicates a + redispatch is desired on every Pth retry, and negative value + N indicate a redispatch is desired on the Nth retry prior to the + last retry. For example, the default of -1 preserves the + historical behavior of redispatching on the last retry, a + positive value of 1 would indicate a redispatch on every retry, + and a positive value of 3 would indicate a redispatch on every + third retry. You can disable redispatches with a value of 0. + + + In HTTP mode, if a server designated by a cookie is down, clients may + definitely stick to it because they cannot flush the cookie, so they will not + be able to access the service anymore. + + Specifying "option redispatch" will allow the proxy to break cookie or + consistent hash based persistence and redistribute them to a working server. + + Active servers are selected from a subset of the list of available + servers. Active servers that are not down or in maintenance (i.e., whose + health is not checked or that have been checked as "up"), are selected in the + following order: + + 1. Any active, non-backup server, if any, or, + + 2. If the "allbackups" option is not set, the first backup server in the + list, or + + 3. If the "allbackups" option is set, any backup server. + + When a retry occurs, HAProxy tries to select another server than the last + one. The new server is selected from the current list of servers. + + Sometimes, if the list is updated between retries (e.g., if numerous retries + occur and last longer than the time needed to check that a server is down, + remove it from the list and fall back on the list of backup servers), + connections may be redirected to a backup server, though. + + It also allows to retry connections to another server in case of multiple + connection failures. Of course, it requires having "retries" set to a nonzero + value. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "retries", "force-persist" + + +option redis-check + Use redis health checks for server testing + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + It is possible to test that the server correctly talks REDIS protocol instead + of just testing that it accepts the TCP connection. When this option is set, + a PING redis command is sent to the server, and the response is analyzed to + find the "+PONG" response message. + + Example : + option redis-check + + See also : "option httpchk", "option tcp-check", "tcp-check expect" + + +option smtpchk +option smtpchk <hello> <domain> + Use SMTP health checks for server testing + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <hello> is an optional argument. It is the "hello" command to use. It can + be either "HELO" (for SMTP) or "EHLO" (for ESMTP). All other + values will be turned into the default command ("HELO"). + + <domain> is the domain name to present to the server. It may only be + specified (and is mandatory) if the hello command has been + specified. By default, "localhost" is used. + + When "option smtpchk" is set, the health checks will consist in TCP + connections followed by an SMTP command. By default, this command is + "HELO localhost". The server's return code is analyzed and only return codes + starting with a "2" will be considered as valid. All other responses, + including a lack of response will constitute an error and will indicate a + dead server. + + This test is meant to be used with SMTP servers or relays. Depending on the + request, it is possible that some servers do not log each connection attempt, + so you may want to experiment to improve the behavior. Using telnet on port + 25 is often easier than adjusting the configuration. + + Most often, an incoming SMTP server needs to see the client's IP address for + various purposes, including spam filtering, anti-spoofing and logging. When + possible, it is often wise to masquerade the client's IP address when + connecting to the server using the "usesrc" argument of the "source" keyword, + which requires the transparent proxy feature to be compiled in. + + Example : + option smtpchk HELO mydomain.org + + See also : "option httpchk", "source" + + +option socket-stats +no option socket-stats + + Enable or disable collecting & providing separate statistics for each socket. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + + Arguments : none + + +option splice-auto +no option splice-auto + Enable or disable automatic kernel acceleration on sockets in both directions + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + When this option is enabled either on a frontend or on a backend, HAProxy + will automatically evaluate the opportunity to use kernel tcp splicing to + forward data between the client and the server, in either direction. HAProxy + uses heuristics to estimate if kernel splicing might improve performance or + not. Both directions are handled independently. Note that the heuristics used + are not much aggressive in order to limit excessive use of splicing. This + option requires splicing to be enabled at compile time, and may be globally + disabled with the global option "nosplice". Since splice uses pipes, using it + requires that there are enough spare pipes. + + Important note: kernel-based TCP splicing is a Linux-specific feature which + first appeared in kernel 2.6.25. It offers kernel-based acceleration to + transfer data between sockets without copying these data to user-space, thus + providing noticeable performance gains and CPU cycles savings. Since many + early implementations are buggy, corrupt data and/or are inefficient, this + feature is not enabled by default, and it should be used with extreme care. + While it is not possible to detect the correctness of an implementation, + 2.6.29 is the first version offering a properly working implementation. In + case of doubt, splicing may be globally disabled using the global "nosplice" + keyword. + + Example : + option splice-auto + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option splice-request", "option splice-response", and global + options "nosplice" and "maxpipes" + + +option splice-request +no option splice-request + Enable or disable automatic kernel acceleration on sockets for requests + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + When this option is enabled either on a frontend or on a backend, HAProxy + will use kernel tcp splicing whenever possible to forward data going from + the client to the server. It might still use the recv/send scheme if there + are no spare pipes left. This option requires splicing to be enabled at + compile time, and may be globally disabled with the global option "nosplice". + Since splice uses pipes, using it requires that there are enough spare pipes. + + Important note: see "option splice-auto" for usage limitations. + + Example : + option splice-request + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option splice-auto", "option splice-response", and global options + "nosplice" and "maxpipes" + + +option splice-response +no option splice-response + Enable or disable automatic kernel acceleration on sockets for responses + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + When this option is enabled either on a frontend or on a backend, HAProxy + will use kernel tcp splicing whenever possible to forward data going from + the server to the client. It might still use the recv/send scheme if there + are no spare pipes left. This option requires splicing to be enabled at + compile time, and may be globally disabled with the global option "nosplice". + Since splice uses pipes, using it requires that there are enough spare pipes. + + Important note: see "option splice-auto" for usage limitations. + + Example : + option splice-response + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option splice-auto", "option splice-request", and global options + "nosplice" and "maxpipes" + + +option spop-check + Use SPOP health checks for server testing + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + It is possible to test that the server correctly talks SPOP protocol instead + of just testing that it accepts the TCP connection. When this option is set, + a HELLO handshake is performed between HAProxy and the server, and the + response is analyzed to check no error is reported. + + Example : + option spop-check + + See also : "option httpchk" + + +option srvtcpka +no option srvtcpka + Enable or disable the sending of TCP keepalive packets on the server side + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + When there is a firewall or any session-aware component between a client and + a server, and when the protocol involves very long sessions with long idle + periods (e.g. remote desktops), there is a risk that one of the intermediate + components decides to expire a session which has remained idle for too long. + + Enabling socket-level TCP keep-alives makes the system regularly send packets + to the other end of the connection, leaving it active. The delay between + keep-alive probes is controlled by the system only and depends both on the + operating system and its tuning parameters. + + It is important to understand that keep-alive packets are neither emitted nor + received at the application level. It is only the network stacks which sees + them. For this reason, even if one side of the proxy already uses keep-alives + to maintain its connection alive, those keep-alive packets will not be + forwarded to the other side of the proxy. + + Please note that this has nothing to do with HTTP keep-alive. + + Using option "srvtcpka" enables the emission of TCP keep-alive probes on the + server side of a connection, which should help when session expirations are + noticed between HAProxy and a server. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option clitcpka", "option tcpka" + + +option ssl-hello-chk + Use SSLv3 client hello health checks for server testing + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + When some SSL-based protocols are relayed in TCP mode through HAProxy, it is + possible to test that the server correctly talks SSL instead of just testing + that it accepts the TCP connection. When "option ssl-hello-chk" is set, pure + SSLv3 client hello messages are sent once the connection is established to + the server, and the response is analyzed to find an SSL server hello message. + The server is considered valid only when the response contains this server + hello message. + + All servers tested till there correctly reply to SSLv3 client hello messages, + and most servers tested do not even log the requests containing only hello + messages, which is appreciable. + + Note that this check works even when SSL support was not built into HAProxy + because it forges the SSL message. When SSL support is available, it is best + to use native SSL health checks instead of this one. + + See also: "option httpchk", "check-ssl" + + +option tcp-check + Perform health checks using tcp-check send/expect sequences + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + This health check method is intended to be combined with "tcp-check" command + lists in order to support send/expect types of health check sequences. + + TCP checks currently support 4 modes of operations : + - no "tcp-check" directive : the health check only consists in a connection + attempt, which remains the default mode. + + - "tcp-check send" or "tcp-check send-binary" only is mentioned : this is + used to send a string along with a connection opening. With some + protocols, it helps sending a "QUIT" message for example that prevents + the server from logging a connection error for each health check. The + check result will still be based on the ability to open the connection + only. + + - "tcp-check expect" only is mentioned : this is used to test a banner. + The connection is opened and HAProxy waits for the server to present some + contents which must validate some rules. The check result will be based + on the matching between the contents and the rules. This is suited for + POP, IMAP, SMTP, FTP, SSH, TELNET. + + - both "tcp-check send" and "tcp-check expect" are mentioned : this is + used to test a hello-type protocol. HAProxy sends a message, the server + responds and its response is analyzed. the check result will be based on + the matching between the response contents and the rules. This is often + suited for protocols which require a binding or a request/response model. + LDAP, MySQL, Redis and SSL are example of such protocols, though they + already all have their dedicated checks with a deeper understanding of + the respective protocols. + In this mode, many questions may be sent and many answers may be + analyzed. + + A fifth mode can be used to insert comments in different steps of the script. + + For each tcp-check rule you create, you can add a "comment" directive, + followed by a string. This string will be reported in the log and stderr in + debug mode. It is useful to make user-friendly error reporting. The + "comment" is of course optional. + + During the execution of a health check, a variable scope is made available to + store data samples, using the "tcp-check set-var" operation. Freeing those + variable is possible using "tcp-check unset-var". + + + Examples : + # perform a POP check (analyze only server's banner) + option tcp-check + tcp-check expect string +OK\ POP3\ ready comment POP\ protocol + + # perform an IMAP check (analyze only server's banner) + option tcp-check + tcp-check expect string *\ OK\ IMAP4\ ready comment IMAP\ protocol + + # look for the redis master server after ensuring it speaks well + # redis protocol, then it exits properly. + # (send a command then analyze the response 3 times) + option tcp-check + tcp-check comment PING\ phase + tcp-check send PING\r\n + tcp-check expect string +PONG + tcp-check comment role\ check + tcp-check send info\ replication\r\n + tcp-check expect string role:master + tcp-check comment QUIT\ phase + tcp-check send QUIT\r\n + tcp-check expect string +OK + + forge a HTTP request, then analyze the response + (send many headers before analyzing) + option tcp-check + tcp-check comment forge\ and\ send\ HTTP\ request + tcp-check send HEAD\ /\ HTTP/1.1\r\n + tcp-check send Host:\ www.mydomain.com\r\n + tcp-check send User-Agent:\ HAProxy\ tcpcheck\r\n + tcp-check send \r\n + tcp-check expect rstring HTTP/1\..\ (2..|3..) comment check\ HTTP\ response + + + See also : "tcp-check connect", "tcp-check expect" and "tcp-check send". + + +option tcp-smart-accept +no option tcp-smart-accept + Enable or disable the saving of one ACK packet during the accept sequence + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + When an HTTP connection request comes in, the system acknowledges it on + behalf of HAProxy, then the client immediately sends its request, and the + system acknowledges it too while it is notifying HAProxy about the new + connection. HAProxy then reads the request and responds. This means that we + have one TCP ACK sent by the system for nothing, because the request could + very well be acknowledged by HAProxy when it sends its response. + + For this reason, in HTTP mode, HAProxy automatically asks the system to avoid + sending this useless ACK on platforms which support it (currently at least + Linux). It must not cause any problem, because the system will send it anyway + after 40 ms if the response takes more time than expected to come. + + During complex network debugging sessions, it may be desirable to disable + this optimization because delayed ACKs can make troubleshooting more complex + when trying to identify where packets are delayed. It is then possible to + fall back to normal behavior by specifying "no option tcp-smart-accept". + + It is also possible to force it for non-HTTP proxies by simply specifying + "option tcp-smart-accept". For instance, it can make sense with some services + such as SMTP where the server speaks first. + + It is recommended to avoid forcing this option in a defaults section. In case + of doubt, consider setting it back to automatic values by prepending the + "default" keyword before it, or disabling it using the "no" keyword. + + See also : "option tcp-smart-connect" + + +option tcp-smart-connect +no option tcp-smart-connect + Enable or disable the saving of one ACK packet during the connect sequence + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + On certain systems (at least Linux), HAProxy can ask the kernel not to + immediately send an empty ACK upon a connection request, but to directly + send the buffer request instead. This saves one packet on the network and + thus boosts performance. It can also be useful for some servers, because they + immediately get the request along with the incoming connection. + + This feature is enabled when "option tcp-smart-connect" is set in a backend. + It is not enabled by default because it makes network troubleshooting more + complex. + + It only makes sense to enable it with protocols where the client speaks first + such as HTTP. In other situations, if there is no data to send in place of + the ACK, a normal ACK is sent. + + If this option has been enabled in a "defaults" section, it can be disabled + in a specific instance by prepending the "no" keyword before it. + + See also : "option tcp-smart-accept" + + +option tcpka + Enable or disable the sending of TCP keepalive packets on both sides + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + When there is a firewall or any session-aware component between a client and + a server, and when the protocol involves very long sessions with long idle + periods (e.g. remote desktops), there is a risk that one of the intermediate + components decides to expire a session which has remained idle for too long. + + Enabling socket-level TCP keep-alives makes the system regularly send packets + to the other end of the connection, leaving it active. The delay between + keep-alive probes is controlled by the system only and depends both on the + operating system and its tuning parameters. + + It is important to understand that keep-alive packets are neither emitted nor + received at the application level. It is only the network stacks which sees + them. For this reason, even if one side of the proxy already uses keep-alives + to maintain its connection alive, those keep-alive packets will not be + forwarded to the other side of the proxy. + + Please note that this has nothing to do with HTTP keep-alive. + + Using option "tcpka" enables the emission of TCP keep-alive probes on both + the client and server sides of a connection. Note that this is meaningful + only in "defaults" or "listen" sections. If this option is used in a + frontend, only the client side will get keep-alives, and if this option is + used in a backend, only the server side will get keep-alives. For this + reason, it is strongly recommended to explicitly use "option clitcpka" and + "option srvtcpka" when the configuration is split between frontends and + backends. + + See also : "option clitcpka", "option srvtcpka" + + +option tcplog + Enable advanced logging of TCP connections with session state and timers + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : none + + By default, the log output format is very poor, as it only contains the + source and destination addresses, and the instance name. By specifying + "option tcplog", each log line turns into a much richer format including, but + not limited to, the connection timers, the session status, the connections + numbers, the frontend, backend and server name, and of course the source + address and ports. This option is useful for pure TCP proxies in order to + find which of the client or server disconnects or times out. For normal HTTP + proxies, it's better to use "option httplog" which is even more complete. + + "option tcplog" overrides any previous "log-format" directive. + + See also : "option httplog", and section 8 about logging. + + +option transparent +no option transparent + Enable client-side transparent proxying + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + This option was introduced in order to provide layer 7 persistence to layer 3 + load balancers. The idea is to use the OS's ability to redirect an incoming + connection for a remote address to a local process (here HAProxy), and let + this process know what address was initially requested. When this option is + used, sessions without cookies will be forwarded to the original destination + IP address of the incoming request (which should match that of another + equipment), while requests with cookies will still be forwarded to the + appropriate server. + + Note that contrary to a common belief, this option does NOT make HAProxy + present the client's IP to the server when establishing the connection. + + See also: the "usesrc" argument of the "source" keyword, and the + "transparent" option of the "bind" keyword. + + +external-check command <command> + Executable to run when performing an external-check + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + <command> is the external command to run + + The arguments passed to the to the command are: + + <proxy_address> <proxy_port> <server_address> <server_port> + + The <proxy_address> and <proxy_port> are derived from the first listener + that is either IPv4, IPv6 or a UNIX socket. In the case of a UNIX socket + listener the proxy_address will be the path of the socket and the + <proxy_port> will be the string "NOT_USED". In a backend section, it's not + possible to determine a listener, and both <proxy_address> and <proxy_port> + will have the string value "NOT_USED". + + Some values are also provided through environment variables. + + Environment variables : + HAPROXY_PROXY_ADDR The first bind address if available (or empty if not + applicable, for example in a "backend" section). + + HAPROXY_PROXY_ID The backend id. + + HAPROXY_PROXY_NAME The backend name. + + HAPROXY_PROXY_PORT The first bind port if available (or empty if not + applicable, for example in a "backend" section or + for a UNIX socket). + + HAPROXY_SERVER_ADDR The server address. + + HAPROXY_SERVER_CURCONN The current number of connections on the server. + + HAPROXY_SERVER_ID The server id. + + HAPROXY_SERVER_MAXCONN The server max connections. + + HAPROXY_SERVER_NAME The server name. + + HAPROXY_SERVER_PORT The server port if available (or empty for a UNIX + socket). + + HAPROXY_SERVER_SSL "0" when SSL is not used, "1" when it is used + + HAPROXY_SERVER_PROTO The protocol used by this server, which can be one + of "cli" (the haproxy CLI), "syslog" (syslog TCP + server), "peers" (peers TCP server), "h1" (HTTP/1.x + server), "h2" (HTTP/2 server), or "tcp" (any other + TCP server). + + PATH The PATH environment variable used when executing + the command may be set using "external-check path". + + See also "2.3. Environment variables" for other variables. + + If the command executed and exits with a zero status then the check is + considered to have passed, otherwise the check is considered to have + failed. + + Example : + external-check command /bin/true + + See also : "external-check", "option external-check", "external-check path" + + +external-check path <path> + The value of the PATH environment variable used when running an external-check + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + <path> is the path used when executing external command to run + + The default path is "". + + Example : + external-check path "/usr/bin:/bin" + + See also : "external-check", "option external-check", + "external-check command" + + +persist rdp-cookie +persist rdp-cookie(<name>) + Enable RDP cookie-based persistence + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <name> is the optional name of the RDP cookie to check. If omitted, the + default cookie name "msts" will be used. There currently is no + valid reason to change this name. + + This statement enables persistence based on an RDP cookie. The RDP cookie + contains all information required to find the server in the list of known + servers. So when this option is set in the backend, the request is analyzed + and if an RDP cookie is found, it is decoded. If it matches a known server + which is still UP (or if "option persist" is set), then the connection is + forwarded to this server. + + Note that this only makes sense in a TCP backend, but for this to work, the + frontend must have waited long enough to ensure that an RDP cookie is present + in the request buffer. This is the same requirement as with the "rdp-cookie" + load-balancing method. Thus it is highly recommended to put all statements in + a single "listen" section. + + Also, it is important to understand that the terminal server will emit this + RDP cookie only if it is configured for "token redirection mode", which means + that the "IP address redirection" option is disabled. + + Example : + listen tse-farm + bind :3389 + # wait up to 5s for an RDP cookie in the request + tcp-request inspect-delay 5s + tcp-request content accept if RDP_COOKIE + # apply RDP cookie persistence + persist rdp-cookie + # if server is unknown, let's balance on the same cookie. + # alternatively, "balance leastconn" may be useful too. + balance rdp-cookie + server srv1 1.1.1.1:3389 + server srv2 1.1.1.2:3389 + + See also : "balance rdp-cookie", "tcp-request" and the "req.rdp_cookie" ACL. + + +rate-limit sessions <rate> + Set a limit on the number of new sessions accepted per second on a frontend + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <rate> The <rate> parameter is an integer designating the maximum number + of new sessions per second to accept on the frontend. + + When the frontend reaches the specified number of new sessions per second, it + stops accepting new connections until the rate drops below the limit again. + During this time, the pending sessions will be kept in the socket's backlog + (in system buffers) and HAProxy will not even be aware that sessions are + pending. When applying very low limit on a highly loaded service, it may make + sense to increase the socket's backlog using the "backlog" keyword. + + This feature is particularly efficient at blocking connection-based attacks + or service abuse on fragile servers. Since the session rate is measured every + millisecond, it is extremely accurate. Also, the limit applies immediately, + no delay is needed at all to detect the threshold. + + Example : limit the connection rate on SMTP to 10 per second max + listen smtp + mode tcp + bind :25 + rate-limit sessions 10 + server smtp1 127.0.0.1:1025 + + Note : when the maximum rate is reached, the frontend's status is not changed + but its sockets appear as "WAITING" in the statistics if the + "socket-stats" option is enabled. + + See also : the "backlog" keyword and the "fe_sess_rate" ACL criterion. + + +redirect location <loc> [code <code>] <option> [{if | unless} <condition>] +redirect prefix <pfx> [code <code>] <option> [{if | unless} <condition>] +redirect scheme <sch> [code <code>] <option> [{if | unless} <condition>] + Return an HTTP redirection if/unless a condition is matched + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | yes + + If/unless the condition is matched, the HTTP request will lead to a redirect + response. If no condition is specified, the redirect applies unconditionally. + + Arguments : + <loc> With "redirect location", the exact value in <loc> is placed into + the HTTP "Location" header. When used in an "http-request" rule, + <loc> value follows the log-format rules and can include some + dynamic values (see Custom Log Format in section 8.2.4). + + <pfx> With "redirect prefix", the "Location" header is built from the + concatenation of <pfx> and the complete URI path, including the + query string, unless the "drop-query" option is specified (see + below). As a special case, if <pfx> equals exactly "/", then + nothing is inserted before the original URI. It allows one to + redirect to the same URL (for instance, to insert a cookie). When + used in an "http-request" rule, <pfx> value follows the log-format + rules and can include some dynamic values (see Custom Log Format + in section 8.2.4). + + <sch> With "redirect scheme", then the "Location" header is built by + concatenating <sch> with "://" then the first occurrence of the + "Host" header, and then the URI path, including the query string + unless the "drop-query" option is specified (see below). If no + path is found or if the path is "*", then "/" is used instead. If + no "Host" header is found, then an empty host component will be + returned, which most recent browsers interpret as redirecting to + the same host. This directive is mostly used to redirect HTTP to + HTTPS. When used in an "http-request" rule, <sch> value follows + the log-format rules and can include some dynamic values (see + Custom Log Format in section 8.2.4). + + <code> The code is optional. It indicates which type of HTTP redirection + is desired. Only codes 301, 302, 303, 307 and 308 are supported, + with 302 used by default if no code is specified. 301 means + "Moved permanently", and a browser may cache the Location. 302 + means "Moved temporarily" and means that the browser should not + cache the redirection. 303 is equivalent to 302 except that the + browser will fetch the location with a GET method. 307 is just + like 302 but makes it clear that the same method must be reused. + Likewise, 308 replaces 301 if the same method must be used. + + <option> There are several options which can be specified to adjust the + expected behavior of a redirection : + + - "drop-query" + When this keyword is used in a prefix-based redirection, then the + location will be set without any possible query-string, which is useful + for directing users to a non-secure page for instance. It has no effect + with a location-type redirect. + + - "append-slash" + This keyword may be used in conjunction with "drop-query" to redirect + users who use a URL not ending with a '/' to the same one with the '/'. + It can be useful to ensure that search engines will only see one URL. + For this, a return code 301 is preferred. + + - "ignore-empty" + This keyword only has effect when a location is produced using a log + format expression (i.e. when used in http-request or http-response). + It indicates that if the result of the expression is empty, the rule + should silently be skipped. The main use is to allow mass-redirects + of known paths using a simple map. + + - "set-cookie NAME[=value]" + A "Set-Cookie" header will be added with NAME (and optionally "=value") + to the response. This is sometimes used to indicate that a user has + been seen, for instance to protect against some types of DoS. No other + cookie option is added, so the cookie will be a session cookie. Note + that for a browser, a sole cookie name without an equal sign is + different from a cookie with an equal sign. + + - "clear-cookie NAME[=]" + A "Set-Cookie" header will be added with NAME (and optionally "="), but + with the "Max-Age" attribute set to zero. This will tell the browser to + delete this cookie. It is useful for instance on logout pages. It is + important to note that clearing the cookie "NAME" will not remove a + cookie set with "NAME=value". You have to clear the cookie "NAME=" for + that, because the browser makes the difference. + + Example: move the login URL only to HTTPS. + acl clear dst_port 80 + acl secure dst_port 8080 + acl login_page url_beg /login + acl logout url_beg /logout + acl uid_given url_reg /login?userid=[^&]+ + acl cookie_set hdr_sub(cookie) SEEN=1 + + redirect prefix https://mysite.com set-cookie SEEN=1 if !cookie_set + redirect prefix https://mysite.com if login_page !secure + redirect prefix http://mysite.com drop-query if login_page !uid_given + redirect location http://mysite.com/ if !login_page secure + redirect location / clear-cookie USERID= if logout + + Example: send redirects for request for articles without a '/'. + acl missing_slash path_reg ^/article/[^/]*$ + redirect code 301 prefix / drop-query append-slash if missing_slash + + Example: redirect all HTTP traffic to HTTPS when SSL is handled by HAProxy. + redirect scheme https if !{ ssl_fc } + + Example: append 'www.' prefix in front of all hosts not having it + http-request redirect code 301 location \ + http://www.%[hdr(host)]%[capture.req.uri] \ + unless { hdr_beg(host) -i www } + + Example: permanently redirect only old URLs to new ones + http-request redirect code 301 location \ + %[path,map_str(old-blog-articles.map)] ignore-empty + + See section 7 about ACL usage. + + +retries <value> + Set the number of retries to perform on a server after a failure + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <value> is the number of times a request or connection attempt should be + retried on a server after a failure. + + By default, retries apply only to new connection attempts. However, when + the "retry-on" directive is used, other conditions might trigger a retry + (e.g. empty response, undesired status code), and each of them will count + one attempt, and when the total number attempts reaches the value here, an + error will be returned. + + In order to avoid immediate reconnections to a server which is restarting, + a turn-around timer of min("timeout connect", one second) is applied before + a retry occurs on the same server. + + When "option redispatch" is set, some retries may be performed on another + server even if a cookie references a different server. By default this will + only be the last retry unless an argument is passed to "option redispatch". + + See also : "option redispatch" + + +retry-on [space-delimited list of keywords] + Specify when to attempt to automatically retry a failed request. + This setting is only valid when "mode" is set to http and is silently ignored + otherwise. + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <keywords> is a space-delimited list of keywords or HTTP status codes, each + representing a type of failure event on which an attempt to + retry the request is desired. Please read the notes at the + bottom before changing this setting. The following keywords are + supported : + + none never retry + + conn-failure retry when the connection or the SSL handshake failed + and the request could not be sent. This is the default. + + empty-response retry when the server connection was closed after part + of the request was sent, and nothing was received from + the server. This type of failure may be caused by the + request timeout on the server side, poor network + condition, or a server crash or restart while + processing the request. + + junk-response retry when the server returned something not looking + like a complete HTTP response. This includes partial + responses headers as well as non-HTTP contents. It + usually is a bad idea to retry on such events, which + may be caused a configuration issue (wrong server port) + or by the request being harmful to the server (buffer + overflow attack for example). + + response-timeout the server timeout stroke while waiting for the server + to respond to the request. This may be caused by poor + network condition, the reuse of an idle connection + which has expired on the path, or by the request being + extremely expensive to process. It generally is a bad + idea to retry on such events on servers dealing with + heavy database processing (full scans, etc) as it may + amplify denial of service attacks. + + 0rtt-rejected retry requests which were sent over early data and were + rejected by the server. These requests are generally + considered to be safe to retry. + + <status> any HTTP status code among "401" (Unauthorized), "403" + (Forbidden), "404" (Not Found), "408" (Request Timeout), + "425" (Too Early), "500" (Server Error), "501" (Not + Implemented), "502" (Bad Gateway), "503" (Service + Unavailable), "504" (Gateway Timeout). + + all-retryable-errors + retry request for any error that are considered + retryable. This currently activates "conn-failure", + "empty-response", "junk-response", "response-timeout", + "0rtt-rejected", "500", "502", "503", and "504". + + Using this directive replaces any previous settings with the new ones; it is + not cumulative. + + Please note that using anything other than "none" and "conn-failure" requires + to allocate a buffer and copy the whole request into it, so it has memory and + performance impacts. Requests not fitting in a single buffer will never be + retried (see the global tune.bufsize setting). + + You have to make sure the application has a replay protection mechanism built + in such as a unique transaction IDs passed in requests, or that replaying the + same request has no consequence, or it is very dangerous to use any retry-on + value beside "conn-failure" and "none". Static file servers and caches are + generally considered safe against any type of retry. Using a status code can + be useful to quickly leave a server showing an abnormal behavior (out of + memory, file system issues, etc), but in this case it may be a good idea to + immediately redispatch the connection to another server (please see "option + redispatch" for this). Last, it is important to understand that most causes + of failures are the requests themselves and that retrying a request causing a + server to misbehave will often make the situation even worse for this server, + or for the whole service in case of redispatch. + + Unless you know exactly how the application deals with replayed requests, you + should not use this directive. + + The default is "conn-failure". + + Example: + retry-on 503 504 + + See also: "retries", "option redispatch", "tune.bufsize" + +server <name> <address>[:[port]] [param*] + Declare a server in a backend + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + Arguments : + <name> is the internal name assigned to this server. This name will + appear in logs and alerts. If "http-send-name-header" is + set, it will be added to the request header sent to the server. + + <address> is the IPv4 or IPv6 address of the server. Alternatively, a + resolvable hostname is supported, but this name will be resolved + during start-up. Address "0.0.0.0" or "*" has a special meaning. + It indicates that the connection will be forwarded to the same IP + address as the one from the client connection. This is useful in + transparent proxy architectures where the client's connection is + intercepted and HAProxy must forward to the original destination + address. This is more or less what the "transparent" keyword does + except that with a server it's possible to limit concurrency and + to report statistics. Optionally, an address family prefix may be + used before the address to force the family regardless of the + address format, which can be useful to specify a path to a unix + socket with no slash ('/'). Currently supported prefixes are : + - 'ipv4@' -> address is always IPv4 + - 'ipv6@' -> address is always IPv6 + - 'unix@' -> address is a path to a local unix socket + - 'abns@' -> address is in abstract namespace (Linux only) + - 'sockpair@' -> address is the FD of a connected unix + socket or of a socketpair. During a connection, the + backend creates a pair of connected sockets, and passes + one of them over the FD. The bind part will use the + received socket as the client FD. Should be used + carefully. + You may want to reference some environment variables in the + address parameter, see section 2.3 about environment + variables. The "init-addr" setting can be used to modify the way + IP addresses should be resolved upon startup. + + <port> is an optional port specification. If set, all connections will + be sent to this port. If unset, the same port the client + connected to will be used. The port may also be prefixed by a "+" + or a "-". In this case, the server's port will be determined by + adding this value to the client's port. + + <param*> is a list of parameters for this server. The "server" keywords + accepts an important number of options and has a complete section + dedicated to it. Please refer to section 5 for more details. + + Examples : + server first 10.1.1.1:1080 cookie first check inter 1000 + server second 10.1.1.2:1080 cookie second check inter 1000 + server transp ipv4@ + server backup "${SRV_BACKUP}:1080" backup + server www1_dc1 "${LAN_DC1}.101:80" + server www1_dc2 "${LAN_DC2}.101:80" + + Note: regarding Linux's abstract namespace sockets, HAProxy uses the whole + sun_path length is used for the address length. Some other programs + such as socat use the string length only by default. Pass the option + ",unix-tightsocklen=0" to any abstract socket definition in socat to + make it compatible with HAProxy's. + + See also: "default-server", "http-send-name-header" and section 5 about + server options + +server-state-file-name [ { use-backend-name | <file> } ] + Set the server state file to read, load and apply to servers available in + this backend. + May be used in sections: defaults | frontend | listen | backend + no | no | yes | yes + + It only applies when the directive "load-server-state-from-file" is set to + "local". When <file> is not provided, if "use-backend-name" is used or if + this directive is not set, then backend name is used. If <file> starts with a + slash '/', then it is considered as an absolute path. Otherwise, <file> is + concatenated to the global directive "server-state-base". + + Example: the minimal configuration below would make HAProxy look for the + state server file '/etc/haproxy/states/bk': + + global + server-state-file-base /etc/haproxy/states + + backend bk + load-server-state-from-file + + See also: "server-state-base", "load-server-state-from-file", and + "show servers state" + +server-template <prefix> <num | range> <fqdn>[:<port>] [params*] + Set a template to initialize servers with shared parameters. + The names of these servers are built from <prefix> and <num | range> parameters. + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + + Arguments: + <prefix> A prefix for the server names to be built. + + <num | range> + If <num> is provided, this template initializes <num> servers + with 1 up to <num> as server name suffixes. A range of numbers + <num_low>-<num_high> may also be used to use <num_low> up to + <num_high> as server name suffixes. + + <fqdn> A FQDN for all the servers this template initializes. + + <port> Same meaning as "server" <port> argument (see "server" keyword). + + <params*> + Remaining server parameters among all those supported by "server" + keyword. + + Examples: + # Initializes 3 servers with srv1, srv2 and srv3 as names, + # google.com as FQDN, and health-check enabled. + server-template srv 1-3 google.com:80 check + + # or + server-template srv 3 google.com:80 check + + # would be equivalent to: + server srv1 google.com:80 check + server srv2 google.com:80 check + server srv3 google.com:80 check + + + +source <addr>[:<port>] [usesrc { <addr2>[:<port2>] | client | clientip } ] +source <addr>[:<port>] [usesrc { <addr2>[:<port2>] | hdr_ip(<hdr>[,<occ>]) } ] +source <addr>[:<port>] [interface <name>] + Set the source address for outgoing connections + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <addr> is the IPv4 address HAProxy will bind to before connecting to a + server. This address is also used as a source for health checks. + + The default value of 0.0.0.0 means that the system will select + the most appropriate address to reach its destination. Optionally + an address family prefix may be used before the address to force + the family regardless of the address format, which can be useful + to specify a path to a unix socket with no slash ('/'). Currently + supported prefixes are : + - 'ipv4@' -> address is always IPv4 + - 'ipv6@' -> address is always IPv6 + - 'unix@' -> address is a path to a local unix socket + - 'abns@' -> address is in abstract namespace (Linux only) + You may want to reference some environment variables in the + address parameter, see section 2.3 about environment variables. + + <port> is an optional port. It is normally not needed but may be useful + in some very specific contexts. The default value of zero means + the system will select a free port. Note that port ranges are not + supported in the backend. If you want to force port ranges, you + have to specify them on each "server" line. + + <addr2> is the IP address to present to the server when connections are + forwarded in full transparent proxy mode. This is currently only + supported on some patched Linux kernels. When this address is + specified, clients connecting to the server will be presented + with this address, while health checks will still use the address + <addr>. + + <port2> is the optional port to present to the server when connections + are forwarded in full transparent proxy mode (see <addr2> above). + The default value of zero means the system will select a free + port. + + <hdr> is the name of a HTTP header in which to fetch the IP to bind to. + This is the name of a comma-separated header list which can + contain multiple IP addresses. By default, the last occurrence is + used. This is designed to work with the X-Forwarded-For header + and to automatically bind to the client's IP address as seen + by previous proxy, typically Stunnel. In order to use another + occurrence from the last one, please see the <occ> parameter + below. When the header (or occurrence) is not found, no binding + is performed so that the proxy's default IP address is used. Also + keep in mind that the header name is case insensitive, as for any + HTTP header. + + <occ> is the occurrence number of a value to be used in a multi-value + header. This is to be used in conjunction with "hdr_ip(<hdr>)", + in order to specify which occurrence to use for the source IP + address. Positive values indicate a position from the first + occurrence, 1 being the first one. Negative values indicate + positions relative to the last one, -1 being the last one. This + is helpful for situations where an X-Forwarded-For header is set + at the entry point of an infrastructure and must be used several + proxy layers away. When this value is not specified, -1 is + assumed. Passing a zero here disables the feature. + + <name> is an optional interface name to which to bind to for outgoing + traffic. On systems supporting this features (currently, only + Linux), this allows one to bind all traffic to the server to + this interface even if it is not the one the system would select + based on routing tables. This should be used with extreme care. + Note that using this option requires root privileges. + + The "source" keyword is useful in complex environments where a specific + address only is allowed to connect to the servers. It may be needed when a + private address must be used through a public gateway for instance, and it is + known that the system cannot determine the adequate source address by itself. + + An extension which is available on certain patched Linux kernels may be used + through the "usesrc" optional keyword. It makes it possible to connect to the + servers with an IP address which does not belong to the system itself. This + is called "full transparent proxy mode". For this to work, the destination + servers have to route their traffic back to this address through the machine + running HAProxy, and IP forwarding must generally be enabled on this machine. + + In this "full transparent proxy" mode, it is possible to force a specific IP + address to be presented to the servers. This is not much used in fact. A more + common use is to tell HAProxy to present the client's IP address. For this, + there are two methods : + + - present the client's IP and port addresses. This is the most transparent + mode, but it can cause problems when IP connection tracking is enabled on + the machine, because a same connection may be seen twice with different + states. However, this solution presents the huge advantage of not + limiting the system to the 64k outgoing address+port couples, because all + of the client ranges may be used. + + - present only the client's IP address and select a spare port. This + solution is still quite elegant but slightly less transparent (downstream + firewalls logs will not match upstream's). It also presents the downside + of limiting the number of concurrent connections to the usual 64k ports. + However, since the upstream and downstream ports are different, local IP + connection tracking on the machine will not be upset by the reuse of the + same session. + + This option sets the default source for all servers in the backend. It may + also be specified in a "defaults" section. Finer source address specification + is possible at the server level using the "source" server option. Refer to + section 5 for more information. + + In order to work, "usesrc" requires root privileges. + + Examples : + backend private + # Connect to the servers using our 192.168.1.200 source address + source 192.168.1.200 + + backend transparent_ssl1 + # Connect to the SSL farm from the client's source address + source 192.168.1.200 usesrc clientip + + backend transparent_ssl2 + # Connect to the SSL farm from the client's source address and port + # not recommended if IP conntrack is present on the local machine. + source 192.168.1.200 usesrc client + + backend transparent_ssl3 + # Connect to the SSL farm from the client's source address. It + # is more conntrack-friendly. + source 192.168.1.200 usesrc clientip + + backend transparent_smtp + # Connect to the SMTP farm from the client's source address/port + # with Tproxy version 4. + source 0.0.0.0 usesrc clientip + + backend transparent_http + # Connect to the servers using the client's IP as seen by previous + # proxy. + source 0.0.0.0 usesrc hdr_ip(x-forwarded-for,-1) + + See also : the "source" server option in section 5, the Tproxy patches for + the Linux kernel on www.balabit.com, the "bind" keyword. + + +srvtcpka-cnt <count> + Sets the maximum number of keepalive probes TCP should send before dropping + the connection on the server side. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <count> is the maximum number of keepalive probes. + + This keyword corresponds to the socket option TCP_KEEPCNT. If this keyword + is not specified, system-wide TCP parameter (tcp_keepalive_probes) is used. + The availability of this setting depends on the operating system. It is + known to work on Linux. + + See also : "option srvtcpka", "srvtcpka-idle", "srvtcpka-intvl". + + +srvtcpka-idle <timeout> + Sets the time the connection needs to remain idle before TCP starts sending + keepalive probes, if enabled the sending of TCP keepalive packets on the + server side. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <timeout> is the time the connection needs to remain idle before TCP starts + sending keepalive probes. It is specified in seconds by default, + but can be in any other unit if the number is suffixed by the + unit, as explained at the top of this document. + + This keyword corresponds to the socket option TCP_KEEPIDLE. If this keyword + is not specified, system-wide TCP parameter (tcp_keepalive_time) is used. + The availability of this setting depends on the operating system. It is + known to work on Linux. + + See also : "option srvtcpka", "srvtcpka-cnt", "srvtcpka-intvl". + + +srvtcpka-intvl <timeout> + Sets the time between individual keepalive probes on the server side. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <timeout> is the time between individual keepalive probes. It is specified + in seconds by default, but can be in any other unit if the number + is suffixed by the unit, as explained at the top of this + document. + + This keyword corresponds to the socket option TCP_KEEPINTVL. If this keyword + is not specified, system-wide TCP parameter (tcp_keepalive_intvl) is used. + The availability of this setting depends on the operating system. It is + known to work on Linux. + + See also : "option srvtcpka", "srvtcpka-cnt", "srvtcpka-idle". + + +stats admin { if | unless } <cond> + Enable statistics admin level if/unless a condition is matched + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | yes + + This statement enables the statistics admin level if/unless a condition is + matched. + + The admin level allows to enable/disable servers from the web interface. By + default, statistics page is read-only for security reasons. + + Currently, the POST request is limited to the buffer size minus the reserved + buffer space, which means that if the list of servers is too long, the + request won't be processed. It is recommended to alter few servers at a + time. + + Example : + # statistics admin level only for localhost + backend stats_localhost + stats enable + stats admin if LOCALHOST + + Example : + # statistics admin level always enabled because of the authentication + backend stats_auth + stats enable + stats auth admin:AdMiN123 + stats admin if TRUE + + Example : + # statistics admin level depends on the authenticated user + userlist stats-auth + group admin users admin + user admin insecure-password AdMiN123 + group readonly users haproxy + user haproxy insecure-password haproxy + + backend stats_auth + stats enable + acl AUTH http_auth(stats-auth) + acl AUTH_ADMIN http_auth_group(stats-auth) admin + stats http-request auth unless AUTH + stats admin if AUTH_ADMIN + + See also : "stats enable", "stats auth", "stats http-request", section 3.4 + about userlists and section 7 about ACL usage. + + +stats auth <user>:<passwd> + Enable statistics with authentication and grant access to an account + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <user> is a user name to grant access to + + <passwd> is the cleartext password associated to this user + + This statement enables statistics with default settings, and restricts access + to declared users only. It may be repeated as many times as necessary to + allow as many users as desired. When a user tries to access the statistics + without a valid account, a "401 Forbidden" response will be returned so that + the browser asks the user to provide a valid user and password. The real + which will be returned to the browser is configurable using "stats realm". + + Since the authentication method is HTTP Basic Authentication, the passwords + circulate in cleartext on the network. Thus, it was decided that the + configuration file would also use cleartext passwords to remind the users + that those ones should not be sensitive and not shared with any other account. + + It is also possible to reduce the scope of the proxies which appear in the + report using "stats scope". + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. + + Example : + # public access (limited to this backend only) + backend public_www + server srv1 192.168.0.1:80 + stats enable + stats hide-version + stats scope . + stats uri /admin?stats + stats realm HAProxy\ Statistics + stats auth admin1:AdMiN123 + stats auth admin2:AdMiN321 + + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats uri /admin?stats + stats refresh 5s + + See also : "stats enable", "stats realm", "stats scope", "stats uri" + + +stats enable + Enable statistics reporting with default settings + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + This statement enables statistics reporting with default settings defined + at build time. Unless stated otherwise, these settings are used : + - stats uri : /haproxy?stats + - stats realm : "HAProxy Statistics" + - stats auth : no authentication + - stats scope : no restriction + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. + + Example : + # public access (limited to this backend only) + backend public_www + server srv1 192.168.0.1:80 + stats enable + stats hide-version + stats scope . + stats uri /admin?stats + stats realm HAProxy\ Statistics + stats auth admin1:AdMiN123 + stats auth admin2:AdMiN321 + + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats uri /admin?stats + stats refresh 5s + + See also : "stats auth", "stats realm", "stats uri" + + +stats hide-version + Enable statistics and hide HAProxy version reporting + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + By default, the stats page reports some useful status information along with + the statistics. Among them is HAProxy's version. However, it is generally + considered dangerous to report precise version to anyone, as it can help them + target known weaknesses with specific attacks. The "stats hide-version" + statement removes the version from the statistics report. This is recommended + for public sites or any site with a weak login/password. + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. + + Example : + # public access (limited to this backend only) + backend public_www + server srv1 192.168.0.1:80 + stats enable + stats hide-version + stats scope . + stats uri /admin?stats + stats realm HAProxy\ Statistics + stats auth admin1:AdMiN123 + stats auth admin2:AdMiN321 + + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats uri /admin?stats + stats refresh 5s + + See also : "stats auth", "stats enable", "stats realm", "stats uri" + + +stats http-request { allow | deny | auth [realm <realm>] } + [ { if | unless } <condition> ] + Access control for statistics + + May be used in sections: defaults | frontend | listen | backend + no | no | yes | yes + + As "http-request", these set of options allow to fine control access to + statistics. Each option may be followed by if/unless and acl. + First option with matched condition (or option without condition) is final. + For "deny" a 403 error will be returned, for "allow" normal processing is + performed, for "auth" a 401/407 error code is returned so the client + should be asked to enter a username and password. + + There is no fixed limit to the number of http-request statements per + instance. + + See also : "http-request", section 3.4 about userlists and section 7 + about ACL usage. + + +stats realm <realm> + Enable statistics and set authentication realm + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <realm> is the name of the HTTP Basic Authentication realm reported to + the browser. The browser uses it to display it in the pop-up + inviting the user to enter a valid username and password. + + The realm is read as a single word, so any spaces in it should be escaped + using a backslash ('\'). + + This statement is useful only in conjunction with "stats auth" since it is + only related to authentication. + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. + + Example : + # public access (limited to this backend only) + backend public_www + server srv1 192.168.0.1:80 + stats enable + stats hide-version + stats scope . + stats uri /admin?stats + stats realm HAProxy\ Statistics + stats auth admin1:AdMiN123 + stats auth admin2:AdMiN321 + + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats uri /admin?stats + stats refresh 5s + + See also : "stats auth", "stats enable", "stats uri" + + +stats refresh <delay> + Enable statistics with automatic refresh + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <delay> is the suggested refresh delay, specified in seconds, which will + be returned to the browser consulting the report page. While the + browser is free to apply any delay, it will generally respect it + and refresh the page this every seconds. The refresh interval may + be specified in any other non-default time unit, by suffixing the + unit after the value, as explained at the top of this document. + + This statement is useful on monitoring displays with a permanent page + reporting the load balancer's activity. When set, the HTML report page will + include a link "refresh"/"stop refresh" so that the user can select whether + they want automatic refresh of the page or not. + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. + + Example : + # public access (limited to this backend only) + backend public_www + server srv1 192.168.0.1:80 + stats enable + stats hide-version + stats scope . + stats uri /admin?stats + stats realm HAProxy\ Statistics + stats auth admin1:AdMiN123 + stats auth admin2:AdMiN321 + + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats uri /admin?stats + stats refresh 5s + + See also : "stats auth", "stats enable", "stats realm", "stats uri" + + +stats scope { <name> | "." } + Enable statistics and limit access scope + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <name> is the name of a listen, frontend or backend section to be + reported. The special name "." (a single dot) designates the + section in which the statement appears. + + When this statement is specified, only the sections enumerated with this + statement will appear in the report. All other ones will be hidden. This + statement may appear as many times as needed if multiple sections need to be + reported. Please note that the name checking is performed as simple string + comparisons, and that it is never checked that a give section name really + exists. + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. + + Example : + # public access (limited to this backend only) + backend public_www + server srv1 192.168.0.1:80 + stats enable + stats hide-version + stats scope . + stats uri /admin?stats + stats realm HAProxy\ Statistics + stats auth admin1:AdMiN123 + stats auth admin2:AdMiN321 + + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats uri /admin?stats + stats refresh 5s + + See also : "stats auth", "stats enable", "stats realm", "stats uri" + + +stats show-desc [ <desc> ] + Enable reporting of a description on the statistics page. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + + <desc> is an optional description to be reported. If unspecified, the + description from global section is automatically used instead. + + This statement is useful for users that offer shared services to their + customers, where node or description should be different for each customer. + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. By default description is not shown. + + Example : + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats show-desc Master node for Europe, Asia, Africa + stats uri /admin?stats + stats refresh 5s + + See also: "show-node", "stats enable", "stats uri" and "description" in + global section. + + +stats show-legends + Enable reporting additional information on the statistics page + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + Enable reporting additional information on the statistics page : + - cap: capabilities (proxy) + - mode: one of tcp, http or health (proxy) + - id: SNMP ID (proxy, socket, server) + - IP (socket, server) + - cookie (backend, server) + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. Default behavior is not to show this information. + + See also: "stats enable", "stats uri". + + +stats show-modules + Enable display of extra statistics module on the statistics page + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : none + + New columns are added at the end of the line containing the extra statistics + values as a tooltip. + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. Default behavior is not to show this information. + + See also: "stats enable", "stats uri". + + +stats show-node [ <name> ] + Enable reporting of a host name on the statistics page. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments: + <name> is an optional name to be reported. If unspecified, the + node name from global section is automatically used instead. + + This statement is useful for users that offer shared services to their + customers, where node or description might be different on a stats page + provided for each customer. Default behavior is not to show host name. + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. + + Example: + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats show-node Europe-1 + stats uri /admin?stats + stats refresh 5s + + See also: "show-desc", "stats enable", "stats uri", and "node" in global + section. + + +stats uri <prefix> + Enable statistics and define the URI prefix to access them + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <prefix> is the prefix of any URI which will be redirected to stats. This + prefix may contain a question mark ('?') to indicate part of a + query string. + + The statistics URI is intercepted on the relayed traffic, so it appears as a + page within the normal application. It is strongly advised to ensure that the + selected URI will never appear in the application, otherwise it will never be + possible to reach it in the application. + + The default URI compiled in HAProxy is "/haproxy?stats", but this may be + changed at build time, so it's better to always explicitly specify it here. + It is generally a good idea to include a question mark in the URI so that + intermediate proxies refrain from caching the results. Also, since any string + beginning with the prefix will be accepted as a stats request, the question + mark helps ensuring that no valid URI will begin with the same words. + + It is sometimes very convenient to use "/" as the URI prefix, and put that + statement in a "listen" instance of its own. That makes it easy to dedicate + an address or a port to statistics only. + + Though this statement alone is enough to enable statistics reporting, it is + recommended to set all other settings in order to avoid relying on default + unobvious parameters. + + Example : + # public access (limited to this backend only) + backend public_www + server srv1 192.168.0.1:80 + stats enable + stats hide-version + stats scope . + stats uri /admin?stats + stats realm HAProxy\ Statistics + stats auth admin1:AdMiN123 + stats auth admin2:AdMiN321 + + # internal monitoring access (unlimited) + backend private_monitoring + stats enable + stats uri /admin?stats + stats refresh 5s + + See also : "stats auth", "stats enable", "stats realm" + + +stick match <pattern> [table <table>] [{if | unless} <cond>] + Define a request pattern matching condition to stick a user to a server + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + + Arguments : + <pattern> is a sample expression rule as described in section 7.3. It + describes what elements of the incoming request or connection + will be analyzed in the hope to find a matching entry in a + stickiness table. This rule is mandatory. + + <table> is an optional stickiness table name. If unspecified, the same + backend's table is used. A stickiness table is declared using + the "stick-table" statement. + + <cond> is an optional matching condition. It makes it possible to match + on a certain criterion only when other conditions are met (or + not met). For instance, it could be used to match on a source IP + address except when a request passes through a known proxy, in + which case we'd match on a header containing that IP address. + + Some protocols or applications require complex stickiness rules and cannot + always simply rely on cookies nor hashing. The "stick match" statement + describes a rule to extract the stickiness criterion from an incoming request + or connection. See section 7 for a complete list of possible patterns and + transformation rules. + + The table has to be declared using the "stick-table" statement. It must be of + a type compatible with the pattern. By default it is the one which is present + in the same backend. It is possible to share a table with other backends by + referencing it using the "table" keyword. If another table is referenced, + the server's ID inside the backends are used. By default, all server IDs + start at 1 in each backend, so the server ordering is enough. But in case of + doubt, it is highly recommended to force server IDs using their "id" setting. + + It is possible to restrict the conditions where a "stick match" statement + will apply, using "if" or "unless" followed by a condition. See section 7 for + ACL based conditions. + + There is no limit on the number of "stick match" statements. The first that + applies and matches will cause the request to be directed to the same server + as was used for the request which created the entry. That way, multiple + matches can be used as fallbacks. + + The stick rules are checked after the persistence cookies, so they will not + affect stickiness if a cookie has already been used to select a server. That + way, it becomes very easy to insert cookies and match on IP addresses in + order to maintain stickiness between HTTP and HTTPS. + + Example : + # forward SMTP users to the same server they just used for POP in the + # last 30 minutes + backend pop + mode tcp + balance roundrobin + stick store-request src + stick-table type ip size 200k expire 30m + server s1 192.168.1.1:110 + server s2 192.168.1.1:110 + + backend smtp + mode tcp + balance roundrobin + stick match src table pop + server s1 192.168.1.1:25 + server s2 192.168.1.1:25 + + See also : "stick-table", "stick on", and section 7 about ACLs and samples + fetching. + + +stick on <pattern> [table <table>] [{if | unless} <condition>] + Define a request pattern to associate a user to a server + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + + Note : This form is exactly equivalent to "stick match" followed by + "stick store-request", all with the same arguments. Please refer + to both keywords for details. It is only provided as a convenience + for writing more maintainable configurations. + + Examples : + # The following form ... + stick on src table pop if !localhost + + # ...is strictly equivalent to this one : + stick match src table pop if !localhost + stick store-request src table pop if !localhost + + + # Use cookie persistence for HTTP, and stick on source address for HTTPS as + # well as HTTP without cookie. Share the same table between both accesses. + backend http + mode http + balance roundrobin + stick on src table https + cookie SRV insert indirect nocache + server s1 192.168.1.1:80 cookie s1 + server s2 192.168.1.1:80 cookie s2 + + backend https + mode tcp + balance roundrobin + stick-table type ip size 200k expire 30m + stick on src + server s1 192.168.1.1:443 + server s2 192.168.1.1:443 + + See also : "stick match", "stick store-request". + + +stick store-request <pattern> [table <table>] [{if | unless} <condition>] + Define a request pattern used to create an entry in a stickiness table + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + + Arguments : + <pattern> is a sample expression rule as described in section 7.3. It + describes what elements of the incoming request or connection + will be analyzed, extracted and stored in the table once a + server is selected. + + <table> is an optional stickiness table name. If unspecified, the same + backend's table is used. A stickiness table is declared using + the "stick-table" statement. + + <cond> is an optional storage condition. It makes it possible to store + certain criteria only when some conditions are met (or not met). + For instance, it could be used to store the source IP address + except when the request passes through a known proxy, in which + case we'd store a converted form of a header containing that IP + address. + + Some protocols or applications require complex stickiness rules and cannot + always simply rely on cookies nor hashing. The "stick store-request" statement + describes a rule to decide what to extract from the request and when to do + it, in order to store it into a stickiness table for further requests to + match it using the "stick match" statement. Obviously the extracted part must + make sense and have a chance to be matched in a further request. Storing a + client's IP address for instance often makes sense. Storing an ID found in a + URL parameter also makes sense. Storing a source port will almost never make + any sense because it will be randomly matched. See section 7 for a complete + list of possible patterns and transformation rules. + + The table has to be declared using the "stick-table" statement. It must be of + a type compatible with the pattern. By default it is the one which is present + in the same backend. It is possible to share a table with other backends by + referencing it using the "table" keyword. If another table is referenced, + the server's ID inside the backends are used. By default, all server IDs + start at 1 in each backend, so the server ordering is enough. But in case of + doubt, it is highly recommended to force server IDs using their "id" setting. + + It is possible to restrict the conditions where a "stick store-request" + statement will apply, using "if" or "unless" followed by a condition. This + condition will be evaluated while parsing the request, so any criteria can be + used. See section 7 for ACL based conditions. + + There is no limit on the number of "stick store-request" statements, but + there is a limit of 8 simultaneous stores per request or response. This + makes it possible to store up to 8 criteria, all extracted from either the + request or the response, regardless of the number of rules. Only the 8 first + ones which match will be kept. Using this, it is possible to feed multiple + tables at once in the hope to increase the chance to recognize a user on + another protocol or access method. Using multiple store-request rules with + the same table is possible and may be used to find the best criterion to rely + on, by arranging the rules by decreasing preference order. Only the first + extracted criterion for a given table will be stored. All subsequent store- + request rules referencing the same table will be skipped and their ACLs will + not be evaluated. + + The "store-request" rules are evaluated once the server connection has been + established, so that the table will contain the real server that processed + the request. + + Example : + # forward SMTP users to the same server they just used for POP in the + # last 30 minutes + backend pop + mode tcp + balance roundrobin + stick store-request src + stick-table type ip size 200k expire 30m + server s1 192.168.1.1:110 + server s2 192.168.1.1:110 + + backend smtp + mode tcp + balance roundrobin + stick match src table pop + server s1 192.168.1.1:25 + server s2 192.168.1.1:25 + + See also : "stick-table", "stick on", about ACLs and sample fetching. + + +stick-table type {ip | integer | string [len <length>] | binary [len <length>]} + size <size> [expire <expire>] [nopurge] [peers <peersect>] [srvkey <srvkey>] + [store <data_type>]* + Configure the stickiness table for the current section + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | yes + + Arguments : + ip a table declared with "type ip" will only store IPv4 addresses. + This form is very compact (about 50 bytes per entry) and allows + very fast entry lookup and stores with almost no overhead. This + is mainly used to store client source IP addresses. + + ipv6 a table declared with "type ipv6" will only store IPv6 addresses. + This form is very compact (about 60 bytes per entry) and allows + very fast entry lookup and stores with almost no overhead. This + is mainly used to store client source IP addresses. + + integer a table declared with "type integer" will store 32bit integers + which can represent a client identifier found in a request for + instance. + + string a table declared with "type string" will store substrings of up + to <len> characters. If the string provided by the pattern + extractor is larger than <len>, it will be truncated before + being stored. During matching, at most <len> characters will be + compared between the string in the table and the extracted + pattern. When not specified, the string is automatically limited + to 32 characters. + + binary a table declared with "type binary" will store binary blocks + of <len> bytes. If the block provided by the pattern + extractor is larger than <len>, it will be truncated before + being stored. If the block provided by the sample expression + is shorter than <len>, it will be padded by 0. When not + specified, the block is automatically limited to 32 bytes. + + <length> is the maximum number of characters that will be stored in a + "string" type table (See type "string" above). Or the number + of bytes of the block in "binary" type table. Be careful when + changing this parameter as memory usage will proportionally + increase. + + <size> is the maximum number of entries that can fit in the table. This + value directly impacts memory usage. Count approximately + 50 bytes per entry, plus the size of a string if any. The size + supports suffixes "k", "m", "g" for 2^10, 2^20 and 2^30 factors. + + [nopurge] indicates that we refuse to purge older entries when the table + is full. When not specified and the table is full when HAProxy + wants to store an entry in it, it will flush a few of the oldest + entries in order to release some space for the new ones. This is + most often the desired behavior. In some specific cases, it + be desirable to refuse new entries instead of purging the older + ones. That may be the case when the amount of data to store is + far above the hardware limits and we prefer not to offer access + to new clients than to reject the ones already connected. When + using this parameter, be sure to properly set the "expire" + parameter (see below). + + <peersect> is the name of the peers section to use for replication. Entries + which associate keys to server IDs are kept synchronized with + the remote peers declared in this section. All entries are also + automatically learned from the local peer (old process) during a + soft restart. + + <expire> defines the maximum duration of an entry in the table since it + was last created, refreshed using 'track-sc' or matched using + 'stick match' or 'stick on' rule. The expiration delay is + defined using the standard time format, similarly as the various + timeouts. The maximum duration is slightly above 24 days. See + section 2.5 for more information. If this delay is not specified, + the session won't automatically expire, but older entries will + be removed once full. Be sure not to use the "nopurge" parameter + if not expiration delay is specified. + Note: 'table_*' converters performs lookups but won't update touch + expire since they don't require 'track-sc'. + + <srvkey> specifies how each server is identified for the purposes of the + stick table. The valid values are "name" and "addr". If "name" is + given, then <name> argument for the server (may be generated by + a template). If "addr" is given, then the server is identified + by its current network address, including the port. "addr" is + especially useful if you are using service discovery to generate + the addresses for servers with peered stick-tables and want + to consistently use the same host across peers for a stickiness + token. + + <data_type> is used to store additional information in the stick-table. This + may be used by ACLs in order to control various criteria related + to the activity of the client matching the stick-table. For each + item specified here, the size of each entry will be inflated so + that the additional data can fit. Several data types may be + stored with an entry. Multiple data types may be specified after + the "store" keyword, as a comma-separated list. Alternatively, + it is possible to repeat the "store" keyword followed by one or + several data types. Except for the "server_id" type which is + automatically detected and enabled, all data types must be + explicitly declared to be stored. If an ACL references a data + type which is not stored, the ACL will simply not match. Some + data types require an argument which must be passed just after + the type between parenthesis. See below for the supported data + types and their arguments. + + The data types that can be stored with an entry are the following : + - server_id : this is an integer which holds the numeric ID of the server a + request was assigned to. It is used by the "stick match", "stick store", + and "stick on" rules. It is automatically enabled when referenced. + + - gpc(<nb>) : General Purpose Counters Array of <nb> elements. This is an + array of positive 32-bit integers which may be used to count anything. + Most of the time they will be used as a incremental counters on some + entries, for instance to note that a limit is reached and trigger some + actions. This array is limited to a maximum of 100 elements: + gpc0 to gpc99, to ensure that the build of a peer update + message can fit into the buffer. Users should take in consideration + that a large amount of counters will increase the data size and the + traffic load using peers protocol since all data/counters are pushed + each time any of them is updated. + This data_type will exclude the usage of the legacy data_types 'gpc0' + and 'gpc1' on the same table. Using the 'gpc' array data_type, all 'gpc0' + and 'gpc1' related fetches and actions will apply to the two first + elements of this array. + + - gpc_rate(<nb>,<period>) : Array of increment rates of General Purpose + Counters over a period. Those elements are positive 32-bit integers which + may be used for anything. Just like <gpc>, the count events, but instead + of keeping a cumulative number, they maintain the rate at which the + counter is incremented. Most of the time it will be used to measure the + frequency of occurrence of certain events (e.g. requests to a specific + URL). This array is limited to a maximum of 100 elements: gpt(100) + allowing the storage of gpc0 to gpc99, to ensure that the build of a peer + update message can fit into the buffer. + The array cannot contain less than 1 element: use gpc(1) if you want to + store only the counter gpc0. + Users should take in consideration that a large amount of + counters will increase the data size and the traffic load using peers + protocol since all data/counters are pushed each time any of them is + updated. + This data_type will exclude the usage of the legacy data_types + 'gpc0_rate' and 'gpc1_rate' on the same table. Using the 'gpc_rate' + array data_type, all 'gpc0' and 'gpc1' related fetches and actions + will apply to the two first elements of this array. + + - gpc0 : first General Purpose Counter. It is a positive 32-bit integer + integer which may be used for anything. Most of the time it will be used + to put a special tag on some entries, for instance to note that a + specific behavior was detected and must be known for future matches. + + - gpc0_rate(<period>) : increment rate of the first General Purpose Counter + over a period. It is a positive 32-bit integer integer which may be used + for anything. Just like <gpc0>, it counts events, but instead of keeping + a cumulative number, it maintains the rate at which the counter is + incremented. Most of the time it will be used to measure the frequency of + occurrence of certain events (e.g. requests to a specific URL). + + - gpc1 : second General Purpose Counter. It is a positive 32-bit integer + integer which may be used for anything. Most of the time it will be used + to put a special tag on some entries, for instance to note that a + specific behavior was detected and must be known for future matches. + + - gpc1_rate(<period>) : increment rate of the second General Purpose Counter + over a period. It is a positive 32-bit integer integer which may be used + for anything. Just like <gpc1>, it counts events, but instead of keeping + a cumulative number, it maintains the rate at which the counter is + incremented. Most of the time it will be used to measure the frequency of + occurrence of certain events (e.g. requests to a specific URL). + + - gpt(<nb>) : General Purpose Tags Array of <nb> elements. This is an array + of positive 32-bit integers which may be used for anything. + Most of the time they will be used to put a special tags on some entries, + for instance to note that a specific behavior was detected and must be + known for future matches. This array is limited to a maximum of 100 + elements: gpt(100) allowing the storage of gpt0 to gpt99, to ensure that + the build of a peer update message can fit into the buffer. + The array cannot contain less than 1 element: use gpt(1) if you want to + to store only the tag gpt0. + Users should take in consideration that a large amount of counters will + increase the data size and the traffic load using peers protocol since + all data/counters are pushed each time any of them is updated. + This data_type will exclude the usage of the legacy data_type 'gpt0' + on the same table. Using the 'gpt' array data_type, all 'gpt0' related + fetches and actions will apply to the first element of this array. + + - gpt0 : first General Purpose Tag. It is a positive 32-bit integer + integer which may be used for anything. Most of the time it will be used + to put a special tag on some entries, for instance to note that a + specific behavior was detected and must be known for future matches + + - conn_cnt : Connection Count. It is a positive 32-bit integer which counts + the absolute number of connections received from clients which matched + this entry. It does not mean the connections were accepted, just that + they were received. + + - conn_cur : Current Connections. It is a positive 32-bit integer which + stores the concurrent connection counts for the entry. It is incremented + once an incoming connection matches the entry, and decremented once the + connection leaves. That way it is possible to know at any time the exact + number of concurrent connections for an entry. + + - conn_rate(<period>) : frequency counter (takes 12 bytes). It takes an + integer parameter <period> which indicates in milliseconds the length + of the period over which the average is measured. It reports the average + incoming connection rate over that period, in connections per period. The + result is an integer which can be matched using ACLs. + + - sess_cnt : Session Count. It is a positive 32-bit integer which counts + the absolute number of sessions received from clients which matched this + entry. A session is a connection that was accepted by the layer 4 rules. + + - sess_rate(<period>) : frequency counter (takes 12 bytes). It takes an + integer parameter <period> which indicates in milliseconds the length + of the period over which the average is measured. It reports the average + incoming session rate over that period, in sessions per period. The + result is an integer which can be matched using ACLs. + + - http_req_cnt : HTTP request Count. It is a positive 32-bit integer which + counts the absolute number of HTTP requests received from clients which + matched this entry. It does not matter whether they are valid requests or + not. Note that this is different from sessions when keep-alive is used on + the client side. + + - http_req_rate(<period>) : frequency counter (takes 12 bytes). It takes an + integer parameter <period> which indicates in milliseconds the length + of the period over which the average is measured. It reports the average + HTTP request rate over that period, in requests per period. The result is + an integer which can be matched using ACLs. It does not matter whether + they are valid requests or not. Note that this is different from sessions + when keep-alive is used on the client side. + + - http_err_cnt : HTTP Error Count. It is a positive 32-bit integer which + counts the absolute number of HTTP requests errors induced by clients + which matched this entry. Errors are counted on invalid and truncated + requests, as well as on denied or tarpitted requests, and on failed + authentications. If the server responds with 4xx, then the request is + also counted as an error since it's an error triggered by the client + (e.g. vulnerability scan). + + - http_err_rate(<period>) : frequency counter (takes 12 bytes). It takes an + integer parameter <period> which indicates in milliseconds the length + of the period over which the average is measured. It reports the average + HTTP request error rate over that period, in requests per period (see + http_err_cnt above for what is accounted as an error). The result is an + integer which can be matched using ACLs. + + - http_fail_cnt : HTTP Failure Count. It is a positive 32-bit integer which + counts the absolute number of HTTP response failures induced by servers + which matched this entry. Errors are counted on invalid and truncated + responses, as well as any 5xx response other than 501 or 505. It aims at + being used combined with path or URI to detect service failures. + + - http_fail_rate(<period>) : frequency counter (takes 12 bytes). It takes + an integer parameter <period> which indicates in milliseconds the length + of the period over which the average is measured. It reports the average + HTTP response failure rate over that period, in requests per period (see + http_fail_cnt above for what is accounted as a failure). The result is an + integer which can be matched using ACLs. + + - bytes_in_cnt : client to server byte count. It is a positive 64-bit + integer which counts the cumulative number of bytes received from clients + which matched this entry. Headers are included in the count. This may be + used to limit abuse of upload features on photo or video servers. + + - bytes_in_rate(<period>) : frequency counter (takes 12 bytes). It takes an + integer parameter <period> which indicates in milliseconds the length + of the period over which the average is measured. It reports the average + incoming bytes rate over that period, in bytes per period. It may be used + to detect users which upload too much and too fast. Warning: with large + uploads, it is possible that the amount of uploaded data will be counted + once upon termination, thus causing spikes in the average transfer speed + instead of having a smooth one. This may partially be smoothed with + "option contstats" though this is not perfect yet. Use of byte_in_cnt is + recommended for better fairness. + + - bytes_out_cnt : server to client byte count. It is a positive 64-bit + integer which counts the cumulative number of bytes sent to clients which + matched this entry. Headers are included in the count. This may be used + to limit abuse of bots sucking the whole site. + + - bytes_out_rate(<period>) : frequency counter (takes 12 bytes). It takes + an integer parameter <period> which indicates in milliseconds the length + of the period over which the average is measured. It reports the average + outgoing bytes rate over that period, in bytes per period. It may be used + to detect users which download too much and too fast. Warning: with large + transfers, it is possible that the amount of transferred data will be + counted once upon termination, thus causing spikes in the average + transfer speed instead of having a smooth one. This may partially be + smoothed with "option contstats" though this is not perfect yet. Use of + byte_out_cnt is recommended for better fairness. + + There is only one stick-table per proxy. At the moment of writing this doc, + it does not seem useful to have multiple tables per proxy. If this happens + to be required, simply create a dummy backend with a stick-table in it and + reference it. + + It is important to understand that stickiness based on learning information + has some limitations, including the fact that all learned associations are + lost upon restart unless peers are properly configured to transfer such + information upon restart (recommended). In general it can be good as a + complement but not always as an exclusive stickiness. + + Last, memory requirements may be important when storing many data types. + Indeed, storing all indicators above at once in each entry requires 116 bytes + per entry, or 116 MB for a 1-million entries table. This is definitely not + something that can be ignored. + + Example: + # Keep track of counters of up to 1 million IP addresses over 5 minutes + # and store a general purpose counter and the average connection rate + # computed over a sliding window of 30 seconds. + stick-table type ip size 1m expire 5m store gpc0,conn_rate(30s) + + See also : "stick match", "stick on", "stick store-request", section 2.5 + about time format and section 7 about ACLs. + + +stick store-response <pattern> [table <table>] [{if | unless} <condition>] + Define a response pattern used to create an entry in a stickiness table + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + + Arguments : + <pattern> is a sample expression rule as described in section 7.3. It + describes what elements of the response or connection will + be analyzed, extracted and stored in the table once a + server is selected. + + <table> is an optional stickiness table name. If unspecified, the same + backend's table is used. A stickiness table is declared using + the "stick-table" statement. + + <cond> is an optional storage condition. It makes it possible to store + certain criteria only when some conditions are met (or not met). + For instance, it could be used to store the SSL session ID only + when the response is a SSL server hello. + + Some protocols or applications require complex stickiness rules and cannot + always simply rely on cookies nor hashing. The "stick store-response" + statement describes a rule to decide what to extract from the response and + when to do it, in order to store it into a stickiness table for further + requests to match it using the "stick match" statement. Obviously the + extracted part must make sense and have a chance to be matched in a further + request. Storing an ID found in a header of a response makes sense. + See section 7 for a complete list of possible patterns and transformation + rules. + + The table has to be declared using the "stick-table" statement. It must be of + a type compatible with the pattern. By default it is the one which is present + in the same backend. It is possible to share a table with other backends by + referencing it using the "table" keyword. If another table is referenced, + the server's ID inside the backends are used. By default, all server IDs + start at 1 in each backend, so the server ordering is enough. But in case of + doubt, it is highly recommended to force server IDs using their "id" setting. + + It is possible to restrict the conditions where a "stick store-response" + statement will apply, using "if" or "unless" followed by a condition. This + condition will be evaluated while parsing the response, so any criteria can + be used. See section 7 for ACL based conditions. + + There is no limit on the number of "stick store-response" statements, but + there is a limit of 8 simultaneous stores per request or response. This + makes it possible to store up to 8 criteria, all extracted from either the + request or the response, regardless of the number of rules. Only the 8 first + ones which match will be kept. Using this, it is possible to feed multiple + tables at once in the hope to increase the chance to recognize a user on + another protocol or access method. Using multiple store-response rules with + the same table is possible and may be used to find the best criterion to rely + on, by arranging the rules by decreasing preference order. Only the first + extracted criterion for a given table will be stored. All subsequent store- + response rules referencing the same table will be skipped and their ACLs will + not be evaluated. However, even if a store-request rule references a table, a + store-response rule may also use the same table. This means that each table + may learn exactly one element from the request and one element from the + response at once. + + The table will contain the real server that processed the request. + + Example : + # Learn SSL session ID from both request and response and create affinity. + backend https + mode tcp + balance roundrobin + # maximum SSL session ID length is 32 bytes. + stick-table type binary len 32 size 30k expire 30m + + acl clienthello req.ssl_hello_type 1 + acl serverhello rep.ssl_hello_type 2 + + # use tcp content accepts to detects ssl client and server hello. + tcp-request inspect-delay 5s + tcp-request content accept if clienthello + + # no timeout on response inspect delay by default. + tcp-response content accept if serverhello + + # SSL session ID (SSLID) may be present on a client or server hello. + # Its length is coded on 1 byte at offset 43 and its value starts + # at offset 44. + + # Match and learn on request if client hello. + stick on req.payload_lv(43,1) if clienthello + + # Learn on response if server hello. + stick store-response resp.payload_lv(43,1) if serverhello + + server s1 192.168.1.1:443 + server s2 192.168.1.1:443 + + See also : "stick-table", "stick on", and section 7 about ACLs and pattern + extraction. + + +tcp-check comment <string> + Defines a comment for the following the tcp-check rule, reported in logs if + it fails. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + <string> is the comment message to add in logs if the following tcp-check + rule fails. + + It only works for connect, send and expect rules. It is useful to make + user-friendly error reporting. + + See also : "option tcp-check", "tcp-check connect", "tcp-check send" and + "tcp-check expect". + + +tcp-check connect [default] [port <expr>] [addr <ip>] [send-proxy] [via-socks4] + [ssl] [sni <sni>] [alpn <alpn>] [linger] + [proto <name>] [comment <msg>] + Opens a new connection + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + comment <msg> defines a message to report if the rule evaluation fails. + + default Use default options of the server line to do the health + checks. The server options are used only if not redefined. + + port <expr> if not set, check port or server port is used. + It tells HAProxy where to open the connection to. + <port> must be a valid TCP port source integer, from 1 to + 65535 or an sample-fetch expression. + + addr <ip> defines the IP address to do the health check. + + send-proxy send a PROXY protocol string + + via-socks4 enables outgoing health checks using upstream socks4 proxy. + + ssl opens a ciphered connection + + sni <sni> specifies the SNI to use to do health checks over SSL. + + alpn <alpn> defines which protocols to advertise with ALPN. The protocol + list consists in a comma-delimited list of protocol names, + for instance: "http/1.1,http/1.0" (without quotes). + If it is not set, the server ALPN is used. + + proto <name> forces the multiplexer's protocol to use for this connection. + It must be a TCP mux protocol and it must be usable on the + backend side. The list of available protocols is reported in + haproxy -vv. + + linger cleanly close the connection instead of using a single RST. + + When an application lies on more than a single TCP port or when HAProxy + load-balance many services in a single backend, it makes sense to probe all + the services individually before considering a server as operational. + + When there are no TCP port configured on the server line neither server port + directive, then the 'tcp-check connect port <port>' must be the first step + of the sequence. + + In a tcp-check ruleset a 'connect' is required, it is also mandatory to start + the ruleset with a 'connect' rule. Purpose is to ensure admin know what they + do. + + When a connect must start the ruleset, if may still be preceded by set-var, + unset-var or comment rules. + + Examples : + # check HTTP and HTTPs services on a server. + # first open port 80 thanks to server line port directive, then + # tcp-check opens port 443, ciphered and run a request on it: + option tcp-check + tcp-check connect + tcp-check send GET\ /\ HTTP/1.0\r\n + tcp-check send Host:\ haproxy.1wt.eu\r\n + tcp-check send \r\n + tcp-check expect rstring (2..|3..) + tcp-check connect port 443 ssl + tcp-check send GET\ /\ HTTP/1.0\r\n + tcp-check send Host:\ haproxy.1wt.eu\r\n + tcp-check send \r\n + tcp-check expect rstring (2..|3..) + server www 10.0.0.1 check port 80 + + # check both POP and IMAP from a single server: + option tcp-check + tcp-check connect port 110 linger + tcp-check expect string +OK\ POP3\ ready + tcp-check connect port 143 + tcp-check expect string *\ OK\ IMAP4\ ready + server mail 10.0.0.1 check + + See also : "option tcp-check", "tcp-check send", "tcp-check expect" + + +tcp-check expect [min-recv <int>] [comment <msg>] + [ok-status <st>] [error-status <st>] [tout-status <st>] + [on-success <fmt>] [on-error <fmt>] [status-code <expr>] + [!] <match> <pattern> + Specify data to be collected and analyzed during a generic health check + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + comment <msg> defines a message to report if the rule evaluation fails. + + min-recv is optional and can define the minimum amount of data required to + evaluate the current expect rule. If the number of received bytes + is under this limit, the check will wait for more data. This + option can be used to resolve some ambiguous matching rules or to + avoid executing costly regex matches on content known to be still + incomplete. If an exact string (string or binary) is used, the + minimum between the string length and this parameter is used. + This parameter is ignored if it is set to -1. If the expect rule + does not match, the check will wait for more data. If set to 0, + the evaluation result is always conclusive. + + <match> is a keyword indicating how to look for a specific pattern in the + response. The keyword may be one of "string", "rstring", "binary" or + "rbinary". + The keyword may be preceded by an exclamation mark ("!") to negate + the match. Spaces are allowed between the exclamation mark and the + keyword. See below for more details on the supported keywords. + + ok-status <st> is optional and can be used to set the check status if + the expect rule is successfully evaluated and if it is + the last rule in the tcp-check ruleset. "L7OK", "L7OKC", + "L6OK" and "L4OK" are supported : + - L7OK : check passed on layer 7 + - L7OKC : check conditionally passed on layer 7, set + server to NOLB state. + - L6OK : check passed on layer 6 + - L4OK : check passed on layer 4 + By default "L7OK" is used. + + error-status <st> is optional and can be used to set the check status if + an error occurred during the expect rule evaluation. + "L7OKC", "L7RSP", "L7STS", "L6RSP" and "L4CON" are + supported : + - L7OKC : check conditionally passed on layer 7, set + server to NOLB state. + - L7RSP : layer 7 invalid response - protocol error + - L7STS : layer 7 response error, for example HTTP 5xx + - L6RSP : layer 6 invalid response - protocol error + - L4CON : layer 1-4 connection problem + By default "L7RSP" is used. + + tout-status <st> is optional and can be used to set the check status if + a timeout occurred during the expect rule evaluation. + "L7TOUT", "L6TOUT", and "L4TOUT" are supported : + - L7TOUT : layer 7 (HTTP/SMTP) timeout + - L6TOUT : layer 6 (SSL) timeout + - L4TOUT : layer 1-4 timeout + By default "L7TOUT" is used. + + on-success <fmt> is optional and can be used to customize the + informational message reported in logs if the expect + rule is successfully evaluated and if it is the last rule + in the tcp-check ruleset. <fmt> is a log-format string. + + on-error <fmt> is optional and can be used to customize the + informational message reported in logs if an error + occurred during the expect rule evaluation. <fmt> is a + log-format string. + + status-code <expr> is optional and can be used to set the check status code + reported in logs, on success or on error. <expr> is a + standard HAProxy expression formed by a sample-fetch + followed by some converters. + + <pattern> is the pattern to look for. It may be a string or a regular + expression. If the pattern contains spaces, they must be escaped + with the usual backslash ('\'). + If the match is set to binary, then the pattern must be passed as + a series of hexadecimal digits in an even number. Each sequence of + two digits will represent a byte. The hexadecimal digits may be + used upper or lower case. + + The available matches are intentionally similar to their http-check cousins : + + string <string> : test the exact string matches in the response buffer. + A health check response will be considered valid if the + response's buffer contains this exact string. If the + "string" keyword is prefixed with "!", then the response + will be considered invalid if the body contains this + string. This can be used to look for a mandatory pattern + in a protocol response, or to detect a failure when a + specific error appears in a protocol banner. + + rstring <regex> : test a regular expression on the response buffer. + A health check response will be considered valid if the + response's buffer matches this expression. If the + "rstring" keyword is prefixed with "!", then the response + will be considered invalid if the body matches the + expression. + + string-lf <fmt> : test a log-format string match in the response's buffer. + A health check response will be considered valid if the + response's buffer contains the string resulting of the + evaluation of <fmt>, which follows the log-format rules. + If prefixed with "!", then the response will be + considered invalid if the buffer contains the string. + + binary <hexstring> : test the exact string in its hexadecimal form matches + in the response buffer. A health check response will + be considered valid if the response's buffer contains + this exact hexadecimal string. + Purpose is to match data on binary protocols. + + rbinary <regex> : test a regular expression on the response buffer, like + "rstring". However, the response buffer is transformed + into its hexadecimal form, including NUL-bytes. This + allows using all regex engines to match any binary + content. The hexadecimal transformation takes twice the + size of the original response. As such, the expected + pattern should work on at-most half the response buffer + size. + + binary-lf <hexfmt> : test a log-format string in its hexadecimal form + match in the response's buffer. A health check response + will be considered valid if the response's buffer + contains the hexadecimal string resulting of the + evaluation of <fmt>, which follows the log-format + rules. If prefixed with "!", then the response will be + considered invalid if the buffer contains the + hexadecimal string. The hexadecimal string is converted + in a binary string before matching the response's + buffer. + + It is important to note that the responses will be limited to a certain size + defined by the global "tune.bufsize" option, which defaults to 16384 bytes. + Thus, too large responses may not contain the mandatory pattern when using + "string", "rstring" or binary. If a large response is absolutely required, it + is possible to change the default max size by setting the global variable. + However, it is worth keeping in mind that parsing very large responses can + waste some CPU cycles, especially when regular expressions are used, and that + it is always better to focus the checks on smaller resources. Also, in its + current state, the check will not find any string nor regex past a null + character in the response. Similarly it is not possible to request matching + the null character. + + Examples : + # perform a POP check + option tcp-check + tcp-check expect string +OK\ POP3\ ready + + # perform an IMAP check + option tcp-check + tcp-check expect string *\ OK\ IMAP4\ ready + + # look for the redis master server + option tcp-check + tcp-check send PING\r\n + tcp-check expect string +PONG + tcp-check send info\ replication\r\n + tcp-check expect string role:master + tcp-check send QUIT\r\n + tcp-check expect string +OK + + + See also : "option tcp-check", "tcp-check connect", "tcp-check send", + "tcp-check send-binary", "http-check expect", tune.bufsize + + +tcp-check send <data> [comment <msg>] +tcp-check send-lf <fmt> [comment <msg>] + Specify a string or a log-format string to be sent as a question during a + generic health check + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + comment <msg> defines a message to report if the rule evaluation fails. + + <data> is the string that will be sent during a generic health + check session. + + <fmt> is the log-format string that will be sent, once evaluated, + during a generic health check session. + + Examples : + # look for the redis master server + option tcp-check + tcp-check send info\ replication\r\n + tcp-check expect string role:master + + See also : "option tcp-check", "tcp-check connect", "tcp-check expect", + "tcp-check send-binary", tune.bufsize + + +tcp-check send-binary <hexstring> [comment <msg>] +tcp-check send-binary-lf <hexfmt> [comment <msg>] + Specify an hex digits string or an hex digits log-format string to be sent as + a binary question during a raw tcp health check + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + comment <msg> defines a message to report if the rule evaluation fails. + + <hexstring> is the hexadecimal string that will be send, once converted + to binary, during a generic health check session. + + <hexfmt> is the hexadecimal log-format string that will be send, once + evaluated and converted to binary, during a generic health + check session. + + Examples : + # redis check in binary + option tcp-check + tcp-check send-binary 50494e470d0a # PING\r\n + tcp-check expect binary 2b504F4e47 # +PONG + + + See also : "option tcp-check", "tcp-check connect", "tcp-check expect", + "tcp-check send", tune.bufsize + + +tcp-check set-var(<var-name>[,<cond>...]) <expr> +tcp-check set-var-fmt(<var-name>[,<cond>...]) <fmt> + This operation sets the content of a variable. The variable is declared inline. + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + <var-name> The name of the variable starts with an indication about its + scope. The scopes allowed for tcp-check are: + "proc" : the variable is shared with the whole process. + "sess" : the variable is shared with the tcp-check session. + "check": the variable is declared for the lifetime of the tcp-check. + This prefix is followed by a name. The separator is a '.'. + The name may only contain characters 'a-z', 'A-Z', '0-9', '.', + and '-'. + + <cond> A set of conditions that must all be true for the variable to + actually be set (such as "ifnotempty", "ifgt" ...). See the + set-var converter's description for a full list of possible + conditions. + + <expr> Is a sample-fetch expression potentially followed by converters. + + <fmt> This is the value expressed using log-format rules (see Custom + Log Format in section 8.2.4). + + Examples : + tcp-check set-var(check.port) int(1234) + tcp-check set-var-fmt(check.name) "%H" + + +tcp-check unset-var(<var-name>) + Free a reference to a variable within its scope. + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + + Arguments : + <var-name> The name of the variable starts with an indication about its + scope. The scopes allowed for tcp-check are: + "proc" : the variable is shared with the whole process. + "sess" : the variable is shared with the tcp-check session. + "check": the variable is declared for the lifetime of the tcp-check. + This prefix is followed by a name. The separator is a '.'. + The name may only contain characters 'a-z', 'A-Z', '0-9', '.', + and '-'. + + Examples : + tcp-check unset-var(check.port) + + +tcp-request connection <action> <options...> [ { if | unless } <condition> ] + Perform an action on an incoming connection depending on a layer 4 condition + May be used in sections : defaults | frontend | listen | backend + yes(!) | yes | yes | no + Arguments : + <action> defines the action to perform if the condition applies. See + below. + + <condition> is a standard layer4-only ACL-based condition (see section 7). + + Immediately after acceptance of a new incoming connection, it is possible to + evaluate some conditions to decide whether this connection must be accepted + or dropped or have its counters tracked. Those conditions cannot make use of + any data contents because the connection has not been read from yet, and the + buffers are not yet allocated. This is used to selectively and very quickly + accept or drop connections from various sources with a very low overhead. If + some contents need to be inspected in order to take the decision, the + "tcp-request content" statements must be used instead. + + The "tcp-request connection" rules are evaluated in their exact declaration + order. If no rule matches or if there is no rule, the default action is to + accept the incoming connection. There is no specific limit to the number of + rules which may be inserted. Any rule may optionally be followed by an + ACL-based condition, in which case it will only be evaluated if the condition + is true. + + The first keyword is the rule's action. Several types of actions are + supported: + - accept + - expect-netscaler-cip layer4 + - expect-proxy layer4 + - reject + - sc-inc-gpc(<idx>,<sc-id>) + - sc-inc-gpc0(<sc-id>) + - sc-inc-gpc1(<sc-id>) + - sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + - sc-set-gpt0(<sc-id>) { <int> | <expr> } + - set-dst <expr> + - set-dst-port <expr> + - set-mark <mark> + - set-src <expr> + - set-src-port <expr> + - set-tos <tos> + - set-var(<var-name>[,<cond>...]) <expr> + - set-var-fmt(<var-name>[,<cond>...]) <fmt> + - silent-drop + - track-sc0 <key> [table <table>] + - track-sc1 <key> [table <table>] + - track-sc2 <key> [table <table>] + - unset-var(<var-name>) + + The supported actions are described below. + + There is no limit to the number of "tcp-request connection" statements per + instance. + + This directive is only available from named defaults sections, not anonymous + ones. Rules defined in the defaults section are evaluated before ones in the + associated proxy section. To avoid ambiguities, in this case the same + defaults section cannot be used by proxies with the frontend capability and + by proxies with the backend capability. It means a listen section cannot use + a defaults section defining such rules. + + Note that the "if/unless" condition is optional. If no condition is set on + the action, it is simply performed unconditionally. That can be useful for + "track-sc*" actions as well as for changing the default action to a reject. + + Example: accept all connections from white-listed hosts, reject too fast + connection without counting them, and track accepted connections. + This results in connection rate being capped from abusive sources. + + tcp-request connection accept if { src -f /etc/haproxy/whitelist.lst } + tcp-request connection reject if { src_conn_rate gt 10 } + tcp-request connection track-sc0 src + + Example: accept all connections from white-listed hosts, count all other + connections and reject too fast ones. This results in abusive ones + being blocked as long as they don't slow down. + + tcp-request connection accept if { src -f /etc/haproxy/whitelist.lst } + tcp-request connection track-sc0 src + tcp-request connection reject if { sc0_conn_rate gt 10 } + + Example: enable the PROXY protocol for traffic coming from all known proxies. + + tcp-request connection expect-proxy layer4 if { src -f proxies.lst } + + See section 7 about ACL usage. + + See also : "tcp-request session", "tcp-request content", "stick-table" + +tcp-request connection accept [ { if | unless } <condition> ] + + This is used to accept the connection. No further "tcp-request connection" + rules are evaluated. + +tcp-request connection expect-netscaler-cip layer4 + [ { if | unless } <condition> ] + + This configures the client-facing connection to receive a NetScaler Client IP + insertion protocol header before any byte is read from the socket. This is + equivalent to having the "accept-netscaler-cip" keyword on the "bind" line, + except that using the TCP rule allows the PROXY protocol to be accepted only + for certain IP address ranges using an ACL. This is convenient when multiple + layers of load balancers are passed through by traffic coming from public + hosts. + +tcp-request connection expect-proxy layer4 [ { if | unless } <condition> ] + + This configures the client-facing connection to receive a PROXY protocol + header before any byte is read from the socket. This is equivalent to having + the "accept-proxy" keyword on the "bind" line, except that using the TCP rule + allows the PROXY protocol to be accepted only for certain IP address ranges + using an ACL. This is convenient when multiple layers of load balancers are + passed through by traffic coming from public hosts. + +tcp-request connection reject [ { if | unless } <condition> ] + + This is used to reject the connection. No further "tcp-request connection" + rules are evaluated. Rejected connections do not even become a session, which + is why they are accounted separately for in the stats, as "denied + connections". They are not considered for the session rate-limit and are not + logged either. The reason is that these rules should only be used to filter + extremely high connection rates such as the ones encountered during a massive + DDoS attack. Under these extreme conditions, the simple action of logging + each event would make the system collapse and would considerably lower the + filtering capacity. If logging is absolutely desired, then "tcp-request + content" rules should be used instead, as "tcp-request session" rules will + not log either. + +tcp-request connection sc-inc-gpc(<idx>,<sc-id>) [ { if | unless } <condition> ] +tcp-request connection sc-inc-gpc0(<sc-id>) [ { if | unless } <condition> ] +tcp-request connection sc-inc-gpc1(<sc-id>) [ { if | unless } <condition> ] + + These actions increment the General Purppose Counters according to the sticky + counter designated by <sc-id>. Please refer to "http-request sc-inc-gpc", + "http-request sc-inc-gpc0" and "http-request sc-inc-gpc1" for a complete + description. + +tcp-request connection sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] +tcp-request connection sc-set-gpt0(<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] + + These actions set the 32-bit unsigned General Purpose Tags according to the + sticky counter designated by <sc-id>. Please refer to "http-request + sc-inc-gpt" and "http-request sc-inc-gpt0" for a complete description. + +tcp-request connection set-dst <expr> [ { if | unless } <condition> ] +tcp-request connection set-dst-port <expr> [ { if | unless } <condition> ] + + These actions are used to set the destination IP/Port address to the value of + specified expression. Please refer to "http-request set-dst" and + "http-request set-dst-port" for a complete description. + +tcp-request connection set-mark <mark> [ { if | unless } <condition> ] + + This action is used to set the Netfilter/IPFW MARK in all packets sent to the + client to the value passed in <mark> on platforms which support it. Please + refer to "http-request set-mark" for a complete description. + +tcp-request connection set-src <expr> [ { if | unless } <condition> ] +tcp-request connection set-src-port <expr> [ { if | unless } <condition> ] + + These actions are used to set the source IP/Port address to the value of + specified expression. Please refer to "http-request set-src" and + "http-request set-src-port" for a complete description. + +tcp-request connection set-tos <tos> [ { if | unless } <condition> ] + + This is used to set the TOS or DSCP field value of packets sent to the client + to the value passed in <tos> on platforms which support this. Please refer to + "http-request set-tos" for a complete description. + +tcp-request connection set-var(<var-name>[,<cond>...]) <expr> [ { if | unless } <condition> ] +tcp-request connection set-var-fmt(<var-name>[,<cond>...]) <fmt> [ { if | unless } <condition> ] + + This is used to set the contents of a variable. The variable is declared + inline. "tcp-request connection" can set variables in the "proc" and "sess" + scopes. Please refer to "http-request set-var" and "http-request set-var-fmt" + for a complete description. + +tcp-request connection silent-drop [ { if | unless } <condition> ] + + This stops the evaluation of the rules and makes the client-facing connection + suddenly disappear using a system-dependent way that tries to prevent the + client from being notified. Please refer to "http-request silent-drop" for a + complete description. + +tcp-request connection track-sc0 <key> [table <table>] [ { if | unless } <condition> ] +tcp-request connection track-sc1 <key> [table <table>] [ { if | unless } <condition> ] +tcp-request connection track-sc2 <key> [table <table>] [ { if | unless } <condition> ] + + This enables tracking of sticky counters from current connection. Please + refer to "http-request track-sc0", "http-request track-sc1" and "http-request + track-sc2" for a complete description. + +tcp-request connection unset-var(<var-name>) [ { if | unless } <condition> ] + + This is used to unset a variable. Please refer to "http-request set-var" for + details about variables. + + +tcp-request content <action> [{if | unless} <condition>] + Perform an action on a new session depending on a layer 4-7 condition + May be used in sections : defaults | frontend | listen | backend + yes(!) | yes | yes | yes + Arguments : + <action> defines the action to perform if the condition applies. See + below. + + <condition> is a standard layer 4-7 ACL-based condition (see section 7). + + A request's contents can be analyzed at an early stage of request processing + called "TCP content inspection". During this stage, ACL-based rules are + evaluated every time the request contents are updated, until either an + "accept", a "reject" or a "switch-mode" rule matches, or the TCP request + inspection delay expires with no matching rule. + + The first difference between these rules and "tcp-request connection" rules + is that "tcp-request content" rules can make use of contents to take a + decision. Most often, these decisions will consider a protocol recognition or + validity. The second difference is that content-based rules can be used in + both frontends and backends. In case of HTTP keep-alive with the client, all + tcp-request content rules are evaluated again, so HAProxy keeps a record of + what sticky counters were assigned by a "tcp-request connection" versus a + "tcp-request content" rule, and flushes all the content-related ones after + processing an HTTP request, so that they may be evaluated again by the rules + being evaluated again for the next request. This is of particular importance + when the rule tracks some L7 information or when it is conditioned by an + L7-based ACL, since tracking may change between requests. + + Content-based rules are evaluated in their exact declaration order. If no + rule matches or if there is no rule, the default action is to accept the + contents. There is no specific limit to the number of rules which may be + inserted. + + The first keyword is the rule's action. Several types of actions are + supported: + - accept + - capture <sample> len <length> + - do-resolve(<var>,<resolvers>,[ipv4,ipv6]) <expr> + - reject + - sc-inc-gpc(<idx>,<sc-id>) + - sc-inc-gpc0(<sc-id>) + - sc-inc-gpc1(<sc-id>) + - sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + - sc-set-gpt0(<sc-id>) { <int> | <expr> } + - send-spoe-group <engine-name> <group-name> + - set-dst <expr> + - set-dst-port <expr> + - set-log-level <level> + - set-mark <mark> + - set-nice <nice> + - set-priority-class <expr> + - set-priority-offset <expr> + - set-src <expr> + - set-src-port <expr> + - set-tos <tos> + - set-var(<var-name>[,<cond>...]) <expr> + - set-var-fmt(<var-name>[,<cond>...]) <fmt> + - silent-drop + - switch-mode http [ proto <name> ] + - track-sc0 <key> [table <table>] + - track-sc1 <key> [table <table>] + - track-sc2 <key> [table <table>] + - unset-var(<var-name>) + - use-service <service-name> + + The supported actions are described below. + + While there is nothing mandatory about it, it is recommended to use the + track-sc0 in "tcp-request connection" rules, track-sc1 for "tcp-request + content" rules in the frontend, and track-sc2 for "tcp-request content" + rules in the backend, because that makes the configuration more readable + and easier to troubleshoot, but this is just a guideline and all counters + may be used everywhere. + + This directive is only available from named defaults sections, not anonymous + ones. Rules defined in the defaults section are evaluated before ones in the + associated proxy section. To avoid ambiguities, in this case the same + defaults section cannot be used by proxies with the frontend capability and + by proxies with the backend capability. It means a listen section cannot use + a defaults section defining such rules. + + Note that the "if/unless" condition is optional. If no condition is set on + the action, it is simply performed unconditionally. That can be useful for + "track-sc*" actions as well as for changing the default action to a reject. + + Note also that it is recommended to use a "tcp-request session" rule to track + information that does *not* depend on Layer 7 contents, especially for HTTP + frontends. Some HTTP processing are performed at the session level and may + lead to an early rejection of the requests. Thus, the tracking at the content + level may be disturbed in such case. A warning is emitted during startup to + prevent, as far as possible, such unreliable usage. + + It is perfectly possible to match layer 7 contents with "tcp-request content" + rules from a TCP proxy, since HTTP-specific ACL matches are able to + preliminarily parse the contents of a buffer before extracting the required + data. If the buffered contents do not parse as a valid HTTP message, then the + ACL does not match. The parser which is involved there is exactly the same + as for all other HTTP processing, so there is no risk of parsing something + differently. In an HTTP frontend or an HTTP backend, it is guaranteed that + HTTP contents will always be immediately present when the rule is evaluated + first because the HTTP parsing is performed in the early stages of the + connection processing, at the session level. But for such proxies, using + "http-request" rules is much more natural and recommended. + + Tracking layer7 information is also possible provided that the information + are present when the rule is processed. The rule processing engine is able to + wait until the inspect delay expires when the data to be tracked is not yet + available. + + Example: + tcp-request content use-service lua.deny if { src -f /etc/haproxy/blacklist.lst } + + Example: + + tcp-request content set-var(sess.my_var) src + tcp-request content set-var-fmt(sess.from) %[src]:%[src_port] + tcp-request content unset-var(sess.my_var2) + + Example: + # Accept HTTP requests containing a Host header saying "example.com" + # and reject everything else. (Only works for HTTP/1 connections) + acl is_host_com hdr(Host) -i example.com + tcp-request inspect-delay 30s + tcp-request content accept if is_host_com + tcp-request content reject + + # Accept HTTP requests containing a Host header saying "example.com" + # and reject everything else. (works for HTTP/1 and HTTP/2 connections) + acl is_host_com hdr(Host) -i example.com + tcp-request inspect-delay 5s + tcp-request switch-mode http if HTTP + tcp-request reject # non-HTTP traffic is implicit here + ... + http-request reject unless is_host_com + + Example: + # reject SMTP connection if client speaks first + tcp-request inspect-delay 30s + acl content_present req.len gt 0 + tcp-request content reject if content_present + + # Forward HTTPS connection only if client speaks + tcp-request inspect-delay 30s + acl content_present req.len gt 0 + tcp-request content accept if content_present + tcp-request content reject + + Example: + # Track the last IP(stick-table type string) from X-Forwarded-For + tcp-request inspect-delay 10s + tcp-request content track-sc0 hdr(x-forwarded-for,-1) + # Or track the last IP(stick-table type ip|ipv6) from X-Forwarded-For + tcp-request content track-sc0 req.hdr_ip(x-forwarded-for,-1) + + Example: + # track request counts per "base" (concatenation of Host+URL) + tcp-request inspect-delay 10s + tcp-request content track-sc0 base table req-rate + + Example: track per-frontend and per-backend counters, block abusers at the + frontend when the backend detects abuse(and marks gpc0). + + frontend http + # Use General Purpose Counter 0 in SC0 as a global abuse counter + # protecting all our sites + stick-table type ip size 1m expire 5m store gpc0 + tcp-request connection track-sc0 src + tcp-request connection reject if { sc0_get_gpc0 gt 0 } + ... + use_backend http_dynamic if { path_end .php } + + backend http_dynamic + # if a source makes too fast requests to this dynamic site (tracked + # by SC1), block it globally in the frontend. + stick-table type ip size 1m expire 5m store http_req_rate(10s) + acl click_too_fast sc1_http_req_rate gt 10 + acl mark_as_abuser sc0_inc_gpc0(http) gt 0 + tcp-request content track-sc1 src + tcp-request content reject if click_too_fast mark_as_abuser + + See section 7 about ACL usage. + + See also : "tcp-request connection", "tcp-request session", + "tcp-request inspect-delay", and "http-request". + +tcp-request content accept [ { if | unless } <condition> ] + + This is used to accept the connection. No further "tcp-request content" + rules are evaluated for the current section. + +tcp-request content capture <sample> len <length> + [ { if | unless } <condition> ] + + This captures sample expression <sample> from the request buffer, and + converts it to a string of at most <len> characters. The resulting string is + stored into the next request "capture" slot, so it will possibly appear next + to some captured HTTP headers. It will then automatically appear in the logs, + and it will be possible to extract it using sample fetch rules to feed it + into headers or anything. The length should be limited given that this size + will be allocated for each capture during the whole session life. Please + check section 7.3 (Fetching samples) and "capture request header" for more + information. + +tcp-request content do-resolve(<var>,<resolvers>,[ipv4,ipv6]) <expr> + + This action performs a DNS resolution of the output of <expr> and stores the + result in the variable <var>. Please refer to "http-request do-resolve" for a + complete description. + +tcp-request content reject [ { if | unless } <condition> ] + + This is used to reject the connection. No further "tcp-request content" rules + are evaluated. + +tcp-request content sc-inc-gpc(<idx>,<sc-id>) [ { if | unless } <condition> ] +tcp-request content sc-inc-gpc0(<sc-id>) [ { if | unless } <condition> ] +tcp-request content sc-inc-gpc1(<sc-id>) [ { if | unless } <condition> ] + + These actions increment the General Purppose Counters according to the sticky + counter designated by <sc-id>. Please refer to "http-request sc-inc-gpc", + "http-request sc-inc-gpc0" and "http-request sc-inc-gpc1" for a complete + description. + +tcp-request content sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] +tcp-request content sc-set-gpt0(<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] + + These actions set the 32-bit unsigned General Purpose Tags according to the + sticky counter designated by <sc-id>. Please refer to "http-request + sc-inc-gpt" and "http-request sc-inc-gpt0" for a complete description. + +tcp-request content send-spoe-group <engine-name> <group-name> + [ { if | unless } <condition> ] + + Thaction is is used to trigger sending of a group of SPOE messages. Please + refer to "http-request send-spoe-group" for a complete description. + +tcp-request content set-dst <expr> [ { if | unless } <condition> ] +tcp-request content set-dst-port <expr> [ { if | unless } <condition> ] + + These actions are used to set the destination IP/Port address to the value of + specified expression. Please refer to "http-request set-dst" and + "http-request set-dst-port" for a complete description. + +tcp-request content set-log-level <level> [ { if | unless } <condition> ] + + This action is used to set the log level of the current session. Please refer + to "http-request set-log-level". for a complete description. + +tcp-request content set-mark <mark> [ { if | unless } <condition> ] + + This action is used to set the Netfilter/IPFW MARK in all packets sent to the + client to the value passed in <mark> on platforms which support it. Please + refer to "http-request set-mark" for a complete description. + +tcp-request content set-nice <nice> [ { if | unless } <condition> ] + + This sets the "nice" factor of the current request being processed. Please + refer to "http-request set-nice" for a complete description. + +tcp-request content set-priority-class <expr> [ { if | unless } <condition> ] + + This is used to set the queue priority class of the current request. Please + refer to "http-request set-priority-class" for a complete description. + +tcp-request content set-priority-offset <expr> [ { if | unless } <condition> ] + + This is used to set the queue priority timestamp offset of the current + request. Please refer to "http-request set-priority-offset" for a complete + description. + +tcp-request content set-src <expr> [ { if | unless } <condition> ] +tcp-request content set-src-port <expr> [ { if | unless } <condition> ] + + These actions are used to set the source IP/Port address to the value of + specified expression. Please refer to "http-request set-src" and + "http-request set-src-port" for a complete description. + +tcp-request content set-tos <tos> [ { if | unless } <condition> ] + + This is used to set the TOS or DSCP field value of packets sent to the client + to the value passed in <tos> on platforms which support this. Please refer to + "http-request set-tos" for a complete description. + +tcp-request content set-var(<var-name>[,<cond>...]) <expr> [ { if | unless } <condition> ] +tcp-request content set-var-fmt(<var-name>[,<cond>...]) <fmt> [ { if | unless } <condition> ] + + This is used to set the contents of a variable. The variable is declared + inline. Please refer to "http-request set-var" and "http-request set-var-fmt" + for a complete description. + +tcp-request content silent-drop [ { if | unless } <condition> ] + + This stops the evaluation of the rules and makes the client-facing connection + suddenly disappear using a system-dependent way that tries to prevent the + client from being notified. Please refer to "http-request silent-drop" for a + complete description. + +tcp-request content switch-mode http [ proto <name> ] + [ { if | unless } <condition> ] + + This action is used to perform a connection upgrade. Only HTTP upgrades are + supported for now. The protocol may optionally be specified. This action is + only available for a proxy with the frontend capability. The connection + upgrade is immediately performed, following "tcp-request content" rules are + not evaluated. This upgrade method should be preferred to the implicit one + consisting to rely on the backend mode. When used, it is possible to set HTTP + directives in a frontend without any warning. These directives will be + conditionally evaluated if the HTTP upgrade is performed. However, an HTTP + backend must still be selected. It remains unsupported to route an HTTP + connection (upgraded or not) to a TCP server. + + See section 4 about Proxies for more details on HTTP upgrades. + +tcp-request content track-sc0 <key> [table <table>] [ { if | unless } <condition> ] +tcp-request content track-sc1 <key> [table <table>] [ { if | unless } <condition> ] +tcp-request content track-sc2 <key> [table <table>] [ { if | unless } <condition> ] + + This enables tracking of sticky counters from current connection. Please + refer to "http-request track-sc0", "http-request track-sc1" and "http-request + track-sc2" for a complete description. + +tcp-request content unset-var(<var-name>) [ { if | unless } <condition> ] + + This is used to unset a variable. Please refer to "http-request set-var" for + details about variables. + +tcp-request content use-service <service-name> [ { if | unless } <condition> ] + + This action is used to executes a TCP service which will reply to the request + and stop the evaluation of the rules. This service may choose to reply by + sending any valid response or it may immediately close the connection without + sending anything. Outside natives services, it is possible to write your own + services in Lua. No further "tcp-request content" rules are evaluated. + + +tcp-request inspect-delay <timeout> + Set the maximum allowed time to wait for data during content inspection + May be used in sections : defaults | frontend | listen | backend + yes(!) | yes | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + People using HAProxy primarily as a TCP relay are often worried about the + risk of passing any type of protocol to a server without any analysis. In + order to be able to analyze the request contents, we must first withhold + the data then analyze them. This statement simply enables withholding of + data for at most the specified amount of time. + + TCP content inspection applies very early when a connection reaches a + frontend, then very early when the connection is forwarded to a backend. This + means that a connection may experience a first delay in the frontend and a + second delay in the backend if both have tcp-request rules. + + Note that when performing content inspection, HAProxy will evaluate the whole + rules for every new chunk which gets in, taking into account the fact that + those data are partial. If no rule matches before the aforementioned delay, + a last check is performed upon expiration, this time considering that the + contents are definitive. If no delay is set, HAProxy will not wait at all + and will immediately apply a verdict based on the available information. + Obviously this is unlikely to be very useful and might even be racy, so such + setups are not recommended. + + As soon as a rule matches, the request is released and continues as usual. If + the timeout is reached and no rule matches, the default policy will be to let + it pass through unaffected. + + For most protocols, it is enough to set it to a few seconds, as most clients + send the full request immediately upon connection. Add 3 or more seconds to + cover TCP retransmits but that's all. For some protocols, it may make sense + to use large values, for instance to ensure that the client never talks + before the server (e.g. SMTP), or to wait for a client to talk before passing + data to the server (e.g. SSL). Note that the client timeout must cover at + least the inspection delay, otherwise it will expire first. If the client + closes the connection or if the buffer is full, the delay immediately expires + since the contents will not be able to change anymore. + + This directive is only available from named defaults sections, not anonymous + ones. Proxies inherit this value from their defaults section. + + See also : "tcp-request content accept", "tcp-request content reject", + "timeout client". + + +tcp-request session <action> [{if | unless} <condition>] + Perform an action on a validated session depending on a layer 5 condition + May be used in sections : defaults | frontend | listen | backend + yes(!) | yes | yes | no + Arguments : + <action> defines the action to perform if the condition applies. See + below. + + <condition> is a standard layer5-only ACL-based condition (see section 7). + + Once a session is validated, (i.e. after all handshakes have been completed), + it is possible to evaluate some conditions to decide whether this session + must be accepted or dropped or have its counters tracked. Those conditions + cannot make use of any data contents because no buffers are allocated yet and + the processing cannot wait at this stage. The main use case is to copy some + early information into variables (since variables are accessible in the + session), or to keep track of some information collected after the handshake, + such as SSL-level elements (SNI, ciphers, client cert's CN) or information + from the PROXY protocol header (e.g. track a source forwarded this way). The + extracted information can thus be copied to a variable or tracked using + "track-sc" rules. Of course it is also possible to decide to accept/reject as + with other rulesets. Most operations performed here could also be performed + in "tcp-request content" rules, except that in HTTP these rules are evaluated + for each new request, and that might not always be acceptable. For example a + rule might increment a counter on each evaluation. It would also be possible + that a country is resolved by geolocation from the source IP address, + assigned to a session-wide variable, then the source address rewritten from + an HTTP header for all requests. If some contents need to be inspected in + order to take the decision, the "tcp-request content" statements must be used + instead. + + The "tcp-request session" rules are evaluated in their exact declaration + order. If no rule matches or if there is no rule, the default action is to + accept the incoming session. There is no specific limit to the number of + rules which may be inserted. + + The first keyword is the rule's action. Several types of actions are + supported: + - accept + - reject + - sc-inc-gpc(<idx>,<sc-id>) + - sc-inc-gpc0(<sc-id>) + - sc-inc-gpc1(<sc-id>) + - sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + - sc-set-gpt0(<sc-id>) { <int> | <expr> } + - set-dst <expr> + - set-dst-port <expr> + - set-mark <mark> + - set-src <expr> + - set-src-port <expr> + - set-tos <tos> + - set-var(<var-name>[,<cond>...]) <expr> + - set-var-fmt(<var-name>[,<cond>...]) <fmt> + - silent-drop + - track-sc0 <key> [table <table>] + - track-sc1 <key> [table <table>] + - track-sc2 <key> [table <table>] + - unset-var(<var-name>) + + The supported actions are described below. + + This directive is only available from named defaults sections, not anonymous + ones. Rules defined in the defaults section are evaluated before ones in the + associated proxy section. To avoid ambiguities, in this case the same + defaults section cannot be used by proxies with the frontend capability and + by proxies with the backend capability. It means a listen section cannot use + a defaults section defining such rules. + + Note that the "if/unless" condition is optional. If no condition is set on + the action, it is simply performed unconditionally. That can be useful for + "track-sc*" actions as well as for changing the default action to a reject. + + Example: track the original source address by default, or the one advertised + in the PROXY protocol header for connection coming from the local + proxies. The first connection-level rule enables receipt of the + PROXY protocol for these ones, the second rule tracks whatever + address we decide to keep after optional decoding. + + tcp-request connection expect-proxy layer4 if { src -f proxies.lst } + tcp-request session track-sc0 src + + Example: accept all sessions from white-listed hosts, reject too fast + sessions without counting them, and track accepted sessions. + This results in session rate being capped from abusive sources. + + tcp-request session accept if { src -f /etc/haproxy/whitelist.lst } + tcp-request session reject if { src_sess_rate gt 10 } + tcp-request session track-sc0 src + + Example: accept all sessions from white-listed hosts, count all other + sessions and reject too fast ones. This results in abusive ones + being blocked as long as they don't slow down. + + tcp-request session accept if { src -f /etc/haproxy/whitelist.lst } + tcp-request session track-sc0 src + tcp-request session reject if { sc0_sess_rate gt 10 } + + See section 7 about ACL usage. + + See also : "tcp-request connection", "tcp-request content", "stick-table" + +tcp-request session accept [ { if | unless } <condition> ] + + This is used to accept the connection. No further "tcp-request session" + rules are evaluated. + +tcp-request session reject [ { if | unless } <condition> ] + + This is used to reject the connection. No further "tcp-request session" rules + are evaluated. + +tcp-request session sc-inc-gpc(<idx>,<sc-id>) [ { if | unless } <condition> ] +tcp-request session sc-inc-gpc0(<sc-id>) [ { if | unless } <condition> ] +tcp-request session sc-inc-gpc1(<sc-id>) [ { if | unless } <condition> ] + + These actions increment the General Purppose Counters according to the sticky + counter designated by <sc-id>. Please refer to "http-request sc-inc-gpc", + "http-request sc-inc-gpc0" and "http-request sc-inc-gpc1" for a complete + description. + +tcp-request session sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] +tcp-request session sc-set-gpt0(<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] + + These actions set the 32-bit unsigned General Purpose Tags according to the + sticky counter designated by <sc-id>. Please refer to "tcp-request connection + sc-inc-gpt" and "tcp-request connection sc-inc-gpt0" for a complete + description. + +tcp-request session set-dst <expr> [ { if | unless } <condition> ] +tcp-request session set-dst-port <expr> [ { if | unless } <condition> ] + + These actions are used to set the destination IP/Port address to the value of + specified expression. Please refer to "http-request set-dst" and + "http-request set-dst-port" for a complete description. + +tcp-request session set-mark <mark> [ { if | unless } <condition> ] + + This action is used to set the Netfilter/IPFW MARK in all packets sent to the + client to the value passed in <mark> on platforms which support it. Please + refer to "http-request set-mark" for a complete description. + +tcp-request session set-src <expr> [ { if | unless } <condition> ] +tcp-request session set-src-port <expr> [ { if | unless } <condition> ] + + These actions are used to set the source IP/Port address to the value of + specified expression. Please refer to "http-request set-src" and + "http-request set-src-port" for a complete description. + +tcp-request session set-tos <tos> [ { if | unless } <condition> ] + + This is used to set the TOS or DSCP field value of packets sent to the client + to the value passed in <tos> on platforms which support this. Please refer to + "http-request set-tos" for a complete description. + +tcp-request session set-var(<var-name>[,<cond>...]) <expr> [ { if | unless } <condition> ] +tcp-request session set-var-fmt(<var-name>[,<cond>...]) <fmt> [ { if | unless } <condition> ] + + This is used to set the contents of a variable. The variable is declared + inline. Please refer to "http-request set-var" and "http-request set-var-fmt" + for a complete description. + +tcp-request session silent-drop [ { if | unless } <condition> ] + + This stops the evaluation of the rules and makes the client-facing connection + suddenly disappear using a system-dependent way that tries to prevent the + client from being notified. Please refer to "http-request silent-drop" for a + complete description. + +tcp-request session track-sc0 <key> [table <table>] [ { if | unless } <condition> ] +tcp-request session track-sc1 <key> [table <table>] [ { if | unless } <condition> ] +tcp-request session track-sc2 <key> [table <table>] [ { if | unless } <condition> ] + + This enables tracking of sticky counters from current connection. Please + refer to "http-request track-sc0", "http-request track-sc1" and "http-request + track-sc2" for a complete description. + +tcp-request session unset-var(<var-name>) [ { if | unless } <condition> ] + + This is used to unset a variable. Please refer to "http-request set-var" for + details about variables. + + +tcp-response content <action> [{if | unless} <condition>] + Perform an action on a session response depending on a layer 4-7 condition + May be used in sections : defaults | frontend | listen | backend + yes(!) | no | yes | yes + Arguments : + <action> defines the action to perform if the condition applies. See + below. + + <condition> is a standard layer 4-7 ACL-based condition (see section 7). + + Response contents can be analyzed at an early stage of response processing + called "TCP content inspection". During this stage, ACL-based rules are + evaluated every time the response contents are updated, until either an + "accept", "close" or a "reject" rule matches, or a TCP response inspection + delay is set and expires with no matching rule. + + Most often, these decisions will consider a protocol recognition or validity. + + Content-based rules are evaluated in their exact declaration order. If no + rule matches or if there is no rule, the default action is to accept the + contents. There is no specific limit to the number of rules which may be + inserted. + + The first keyword is the rule's action. Several types of actions are + supported: + - accept + - close + - reject + - sc-inc-gpc(<idx>,<sc-id>) + - sc-inc-gpc0(<sc-id>) + - sc-inc-gpc1(<sc-id>) + - sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + - sc-set-gpt0(<sc-id>) { <int> | <expr> } + - send-spoe-group <engine-name> <group-name> + - set-log-level <level> + - set-mark <mark> + - set-nice <nice> + - set-tos <tos> + - set-var(<var-name>[,<cond>...]) <expr> + - set-var-fmt(<var-name>[,<cond>...]) <fmt> + - silent-drop + - unset-var(<var-name>) + + The supported actions are described below. + + This directive is only available from named defaults sections, not anonymous + ones. Rules defined in the defaults section are evaluated before ones in the + associated proxy section. To avoid ambiguities, in this case the same + defaults section cannot be used by proxies with the frontend capability and + by proxies with the backend capability. It means a listen section cannot use + a defaults section defining such rules. + + Note that the "if/unless" condition is optional. If no condition is set on + the action, it is simply performed unconditionally. That can be useful for + for changing the default action to a reject. + + Several types of actions are supported : + + It is perfectly possible to match layer 7 contents with "tcp-response + content" rules, but then it is important to ensure that a full response has + been buffered, otherwise no contents will match. In order to achieve this, + the best solution involves detecting the HTTP protocol during the inspection + period. + + See section 7 about ACL usage. + + See also : "tcp-request content", "tcp-response inspect-delay" + +tcp-response content accept [ { if | unless } <condition> ] + + This is used to accept the response. No further "tcp-response content" rules + are evaluated. + +tcp-response content close [ { if | unless } <condition> ] + + This is used to immediately closes the connection with the server. No further + "tcp-response content" rules are evaluated. The main purpose of this action + is to force a connection to be finished between a client and a server after + an exchange when the application protocol expects some long time outs to + elapse first. The goal is to eliminate idle connections which take + significant resources on servers with certain protocols. + +tcp-response content reject [ { if | unless } <condition> ] + + This is used to reject the response. No further "tcp-response content" rules + are evaluated. + +tcp-response content sc-inc-gpc(<idx>,<sc-id>) [ { if | unless } <condition> ] +tcp-response content sc-inc-gpc0(<sc-id>) [ { if | unless } <condition> ] +tcp-response content sc-inc-gpc1(<sc-id>) [ { if | unless } <condition> ] + + These actions increment the General Purppose Counters according to the sticky + counter designated by <sc-id>. Please refer to "http-request sc-inc-gpc", + "http-request sc-inc-gpc0" and "http-request sc-inc-gpc1" for a complete + description. + +tcp-response content sc-set-gpt(<idx>,<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] +tcp-resposne content sc-set-gpt0(<sc-id>) { <int> | <expr> } + [ { if | unless } <condition> ] + + These actions set the 32-bit unsigned General Purpose Tags according to the + sticky counter designated by <sc-id>. Please refer to "http-request + sc-inc-gpt" and "http-request sc-inc-gpt0" for a complete description. + +tcp-response content send-spoe-group <engine-name> <group-name> + [ { if | unless } <condition> ] + + Thaction is is used to trigger sending of a group of SPOE messages. Please + refer to "http-request send-spoe-group" for a complete description. + +tcp-response content set-log-level <level> [ { if | unless } <condition> ] + + This action is used to set the log level of the current session. Please refer + to "http-request set-log-level". for a complete description. + +tcp-response content set-mark <mark> [ { if | unless } <condition> ] + + This action is used to set the Netfilter/IPFW MARK in all packets sent to the + client to the value passed in <mark> on platforms which support it. Please + refer to "http-request set-mark" for a complete description. + +tcp-response content set-nice <nice> [ { if | unless } <condition> ] + + This sets the "nice" factor of the current request being processed. Please + refer to "http-request set-nice" for a complete description. + +tcp-response content set-tos <tos> [ { if | unless } <condition> ] + + This is used to set the TOS or DSCP field value of packets sent to the client + to the value passed in <tos> on platforms which support this. Please refer to + "http-request set-tos" for a complete description. + +tcp-response content set-var(<var-name>[,<cond>...]) <expr> [ { if | unless } <condition> ] +tcp-response content set-var-fmt(<var-name>[,<cond>...]) <fmt> [ { if | unless } <condition> ] + + This is used to set the contents of a variable. The variable is declared + inline. Please refer to "http-request set-var" and "http-request set-var-fmt" + for a complete description. + +tcp-response content silent-drop [ { if | unless } <condition> ] + + This stops the evaluation of the rules and makes the client-facing connection + suddenly disappear using a system-dependent way that tries to prevent the + client from being notified. Please refer to "http-request silent-drop" for a + complete description. + +tcp-response content unset-var(<var-name>) [ { if | unless } <condition> ] + + This is used to unset a variable. Please refer to "http-request set-var" for + details about variables. + + +tcp-response inspect-delay <timeout> + Set the maximum allowed time to wait for a response during content inspection + May be used in sections : defaults | frontend | listen | backend + yes(!) | no | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + This directive is only available from named defaults sections, not anonymous + ones. Proxies inherit this value from their defaults section. + + See also : "tcp-response content", "tcp-request inspect-delay". + + +timeout check <timeout> + Set additional check timeout, but only after a connection has been already + established. + + May be used in sections: defaults | frontend | listen | backend + yes | no | yes | yes + Arguments: + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + If set, HAProxy uses min("timeout connect", "inter") as a connect timeout + for check and "timeout check" as an additional read timeout. The "min" is + used so that people running with *very* long "timeout connect" (e.g. those + who needed this due to the queue or tarpit) do not slow down their checks. + (Please also note that there is no valid reason to have such long connect + timeouts, because "timeout queue" and "timeout tarpit" can always be used to + avoid that). + + If "timeout check" is not set HAProxy uses "inter" for complete check + timeout (connect + read) exactly like all <1.3.15 version. + + In most cases check request is much simpler and faster to handle than normal + requests and people may want to kick out laggy servers so this timeout should + be smaller than "timeout server". + + This parameter is specific to backends, but can be specified once for all in + "defaults" sections. This is in fact one of the easiest solutions not to + forget about it. + + This directive is only available from named defaults sections, not anonymous + ones. Proxies inherit this value from their defaults section. + + See also: "timeout connect", "timeout queue", "timeout server", + "timeout tarpit". + + +timeout client <timeout> + Set the maximum inactivity time on the client side. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + The inactivity timeout applies when the client is expected to acknowledge or + send data. In HTTP mode, this timeout is particularly important to consider + during the first phase, when the client sends the request, and during the + response while it is reading data sent by the server. That said, for the + first phase, it is preferable to set the "timeout http-request" to better + protect HAProxy from Slowloris like attacks. The value is specified in + milliseconds by default, but can be in any other unit if the number is + suffixed by the unit, as specified at the top of this document. In TCP mode + (and to a lesser extent, in HTTP mode), it is highly recommended that the + client timeout remains equal to the server timeout in order to avoid complex + situations to debug. It is a good practice to cover one or several TCP packet + losses by specifying timeouts that are slightly above multiples of 3 seconds + (e.g. 4 or 5 seconds). If some long-lived sessions are mixed with short-lived + sessions (e.g. WebSocket and HTTP), it's worth considering "timeout tunnel", + which overrides "timeout client" and "timeout server" for tunnels, as well as + "timeout client-fin" for half-closed connections. + + This parameter is specific to frontends, but can be specified once for all in + "defaults" sections. This is in fact one of the easiest solutions not to + forget about it. An unspecified timeout results in an infinite timeout, which + is not recommended. Such a usage is accepted and works but reports a warning + during startup because it may result in accumulation of expired sessions in + the system if the system's timeouts are not configured either. + + See also : "timeout server", "timeout tunnel", "timeout http-request". + + +timeout client-fin <timeout> + Set the inactivity timeout on the client side for half-closed connections. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + The inactivity timeout applies when the client is expected to acknowledge or + send data while one direction is already shut down. This timeout is different + from "timeout client" in that it only applies to connections which are closed + in one direction. This is particularly useful to avoid keeping connections in + FIN_WAIT state for too long when clients do not disconnect cleanly. This + problem is particularly common long connections such as RDP or WebSocket. + Note that this timeout can override "timeout tunnel" when a connection shuts + down in one direction. It is applied to idle HTTP/2 connections once a GOAWAY + frame was sent, often indicating an expectation that the connection quickly + ends. + + This parameter is specific to frontends, but can be specified once for all in + "defaults" sections. By default it is not set, so half-closed connections + will use the other timeouts (timeout.client or timeout.tunnel). + + See also : "timeout client", "timeout server-fin", and "timeout tunnel". + + +timeout connect <timeout> + Set the maximum time to wait for a connection attempt to a server to succeed. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + If the server is located on the same LAN as HAProxy, the connection should be + immediate (less than a few milliseconds). Anyway, it is a good practice to + cover one or several TCP packet losses by specifying timeouts that are + slightly above multiples of 3 seconds (e.g. 4 or 5 seconds). By default, the + connect timeout also presets both queue and tarpit timeouts to the same value + if these have not been specified. + + This parameter is specific to backends, but can be specified once for all in + "defaults" sections. This is in fact one of the easiest solutions not to + forget about it. An unspecified timeout results in an infinite timeout, which + is not recommended. Such a usage is accepted and works but reports a warning + during startup because it may result in accumulation of failed sessions in + the system if the system's timeouts are not configured either. + + See also: "timeout check", "timeout queue", "timeout server", "timeout tarpit". + + +timeout http-keep-alive <timeout> + Set the maximum allowed time to wait for a new HTTP request to appear + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + By default, the time to wait for a new request in case of keep-alive is set + by "timeout http-request". However this is not always convenient because some + people want very short keep-alive timeouts in order to release connections + faster, and others prefer to have larger ones but still have short timeouts + once the request has started to present itself. + + The "http-keep-alive" timeout covers these needs. It will define how long to + wait for a new HTTP request to start coming after a response was sent. Once + the first byte of request has been seen, the "http-request" timeout is used + to wait for the complete request to come. Note that empty lines prior to a + new request do not refresh the timeout and are not counted as a new request. + + There is also another difference between the two timeouts : when a connection + expires during timeout http-keep-alive, no error is returned, the connection + just closes. If the connection expires in "http-request" while waiting for a + connection to complete, a HTTP 408 error is returned. + + In general it is optimal to set this value to a few tens to hundreds of + milliseconds, to allow users to fetch all objects of a page at once but + without waiting for further clicks. Also, if set to a very small value (e.g. + 1 millisecond) it will probably only accept pipelined requests but not the + non-pipelined ones. It may be a nice trade-off for very large sites running + with tens to hundreds of thousands of clients. + + If this parameter is not set, the "http-request" timeout applies, and if both + are not set, "timeout client" still applies at the lower level. It should be + set in the frontend to take effect, unless the frontend is in TCP mode, in + which case the HTTP backend's timeout will be used. + + See also : "timeout http-request", "timeout client". + + +timeout http-request <timeout> + Set the maximum allowed time to wait for a complete HTTP request + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + In order to offer DoS protection, it may be required to lower the maximum + accepted time to receive a complete HTTP request without affecting the client + timeout. This helps protecting against established connections on which + nothing is sent. The client timeout cannot offer a good protection against + this abuse because it is an inactivity timeout, which means that if the + attacker sends one character every now and then, the timeout will not + trigger. With the HTTP request timeout, no matter what speed the client + types, the request will be aborted if it does not complete in time. When the + timeout expires, an HTTP 408 response is sent to the client to inform it + about the problem, and the connection is closed. The logs will report + termination codes "cR". Some recent browsers are having problems with this + standard, well-documented behavior, so it might be needed to hide the 408 + code using "option http-ignore-probes" or "errorfile 408 /dev/null". See + more details in the explanations of the "cR" termination code in section 8.5. + + By default, this timeout only applies to the header part of the request, + and not to any data. As soon as the empty line is received, this timeout is + not used anymore. When combined with "option http-buffer-request", this + timeout also applies to the body of the request.. + It is used again on keep-alive connections to wait for a second + request if "timeout http-keep-alive" is not set. + + Generally it is enough to set it to a few seconds, as most clients send the + full request immediately upon connection. Add 3 or more seconds to cover TCP + retransmits but that's all. Setting it to very low values (e.g. 50 ms) will + generally work on local networks as long as there are no packet losses. This + will prevent people from sending bare HTTP requests using telnet. + + If this parameter is not set, the client timeout still applies between each + chunk of the incoming request. It should be set in the frontend to take + effect, unless the frontend is in TCP mode, in which case the HTTP backend's + timeout will be used. + + See also : "errorfile", "http-ignore-probes", "timeout http-keep-alive", and + "timeout client", "option http-buffer-request". + + +timeout queue <timeout> + Set the maximum time to wait in the queue for a connection slot to be free + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + When a server's maxconn is reached, connections are left pending in a queue + which may be server-specific or global to the backend. In order not to wait + indefinitely, a timeout is applied to requests pending in the queue. If the + timeout is reached, it is considered that the request will almost never be + served, so it is dropped and a 503 error is returned to the client. + + The "timeout queue" statement allows to fix the maximum time for a request to + be left pending in a queue. If unspecified, the same value as the backend's + connection timeout ("timeout connect") is used, for backwards compatibility + with older versions with no "timeout queue" parameter. + + See also : "timeout connect". + + +timeout server <timeout> + Set the maximum inactivity time on the server side. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + The inactivity timeout applies when the server is expected to acknowledge or + send data. In HTTP mode, this timeout is particularly important to consider + during the first phase of the server's response, when it has to send the + headers, as it directly represents the server's processing time for the + request. To find out what value to put there, it's often good to start with + what would be considered as unacceptable response times, then check the logs + to observe the response time distribution, and adjust the value accordingly. + + The value is specified in milliseconds by default, but can be in any other + unit if the number is suffixed by the unit, as specified at the top of this + document. In TCP mode (and to a lesser extent, in HTTP mode), it is highly + recommended that the client timeout remains equal to the server timeout in + order to avoid complex situations to debug. Whatever the expected server + response times, it is a good practice to cover at least one or several TCP + packet losses by specifying timeouts that are slightly above multiples of 3 + seconds (e.g. 4 or 5 seconds minimum). If some long-lived sessions are mixed + with short-lived sessions (e.g. WebSocket and HTTP), it's worth considering + "timeout tunnel", which overrides "timeout client" and "timeout server" for + tunnels. + + This parameter is specific to backends, but can be specified once for all in + "defaults" sections. This is in fact one of the easiest solutions not to + forget about it. An unspecified timeout results in an infinite timeout, which + is not recommended. Such a usage is accepted and works but reports a warning + during startup because it may result in accumulation of expired sessions in + the system if the system's timeouts are not configured either. + + See also : "timeout client" and "timeout tunnel". + + +timeout server-fin <timeout> + Set the inactivity timeout on the server side for half-closed connections. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + The inactivity timeout applies when the server is expected to acknowledge or + send data while one direction is already shut down. This timeout is different + from "timeout server" in that it only applies to connections which are closed + in one direction. This is particularly useful to avoid keeping connections in + FIN_WAIT state for too long when a remote server does not disconnect cleanly. + This problem is particularly common long connections such as RDP or WebSocket. + Note that this timeout can override "timeout tunnel" when a connection shuts + down in one direction. This setting was provided for completeness, but in most + situations, it should not be needed. + + This parameter is specific to backends, but can be specified once for all in + "defaults" sections. By default it is not set, so half-closed connections + will use the other timeouts (timeout.server or timeout.tunnel). + + See also : "timeout client-fin", "timeout server", and "timeout tunnel". + + +timeout tarpit <timeout> + Set the duration for which tarpitted connections will be maintained + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | yes + Arguments : + <timeout> is the tarpit duration specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + When a connection is tarpitted using "http-request tarpit", it is maintained + open with no activity for a certain amount of time, then closed. "timeout + tarpit" defines how long it will be maintained open. + + The value is specified in milliseconds by default, but can be in any other + unit if the number is suffixed by the unit, as specified at the top of this + document. If unspecified, the same value as the backend's connection timeout + ("timeout connect") is used, for backwards compatibility with older versions + with no "timeout tarpit" parameter. + + See also : "timeout connect". + + +timeout tunnel <timeout> + Set the maximum inactivity time on the client and server side for tunnels. + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : + <timeout> is the timeout value specified in milliseconds by default, but + can be in any other unit if the number is suffixed by the unit, + as explained at the top of this document. + + The tunnel timeout applies when a bidirectional connection is established + between a client and a server, and the connection remains inactive in both + directions. This timeout supersedes both the client and server timeouts once + the connection becomes a tunnel. In TCP, this timeout is used as soon as no + analyzer remains attached to either connection (e.g. tcp content rules are + accepted). In HTTP, this timeout is used when a connection is upgraded (e.g. + when switching to the WebSocket protocol, or forwarding a CONNECT request + to a proxy), or after the first response when no keepalive/close option is + specified. + + Since this timeout is usually used in conjunction with long-lived connections, + it usually is a good idea to also set "timeout client-fin" to handle the + situation where a client suddenly disappears from the net and does not + acknowledge a close, or sends a shutdown and does not acknowledge pending + data anymore. This can happen in lossy networks where firewalls are present, + and is detected by the presence of large amounts of sessions in a FIN_WAIT + state. + + The value is specified in milliseconds by default, but can be in any other + unit if the number is suffixed by the unit, as specified at the top of this + document. Whatever the expected normal idle time, it is a good practice to + cover at least one or several TCP packet losses by specifying timeouts that + are slightly above multiples of 3 seconds (e.g. 4 or 5 seconds minimum). + + This parameter is specific to backends, but can be specified once for all in + "defaults" sections. This is in fact one of the easiest solutions not to + forget about it. + + Example : + defaults http + option http-server-close + timeout connect 5s + timeout client 30s + timeout client-fin 30s + timeout server 30s + timeout tunnel 1h # timeout to use with WebSocket and CONNECT + + See also : "timeout client", "timeout client-fin", "timeout server". + + +transparent (deprecated) + Enable client-side transparent proxying + May be used in sections : defaults | frontend | listen | backend + yes | no | yes | yes + Arguments : none + + This keyword was introduced in order to provide layer 7 persistence to layer + 3 load balancers. The idea is to use the OS's ability to redirect an incoming + connection for a remote address to a local process (here HAProxy), and let + this process know what address was initially requested. When this option is + used, sessions without cookies will be forwarded to the original destination + IP address of the incoming request (which should match that of another + equipment), while requests with cookies will still be forwarded to the + appropriate server. + + The "transparent" keyword is deprecated, use "option transparent" instead. + + Note that contrary to a common belief, this option does NOT make HAProxy + present the client's IP to the server when establishing the connection. + + See also: "option transparent" + +unique-id-format <string> + Generate a unique ID for each request. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <string> is a log-format string. + + This keyword creates a ID for each request using the custom log format. A + unique ID is useful to trace a request passing through many components of + a complex infrastructure. The newly created ID may also be logged using the + %ID tag the log-format string. + + The format should be composed from elements that are guaranteed to be + unique when combined together. For instance, if multiple HAProxy instances + are involved, it might be important to include the node name. It is often + needed to log the incoming connection's source and destination addresses + and ports. Note that since multiple requests may be performed over the same + connection, including a request counter may help differentiate them. + Similarly, a timestamp may protect against a rollover of the counter. + Logging the process ID will avoid collisions after a service restart. + + It is recommended to use hexadecimal notation for many fields since it + makes them more compact and saves space in logs. + + Example: + + unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid + + will generate: + + 7F000001:8296_7F00001E:1F90_4F7B0A69_0003:790A + + See also: "unique-id-header" + +unique-id-header <name> + Add a unique ID header in the HTTP request. + May be used in sections : defaults | frontend | listen | backend + yes | yes | yes | no + Arguments : + <name> is the name of the header. + + Add a unique-id header in the HTTP request sent to the server, using the + unique-id-format. It can't work if the unique-id-format doesn't exist. + + Example: + + unique-id-format %{+X}o\ %ci:%cp_%fi:%fp_%Ts_%rt:%pid + unique-id-header X-Unique-ID + + will generate: + + X-Unique-ID: 7F000001:8296_7F00001E:1F90_4F7B0A69_0003:790A + + See also: "unique-id-format" + +use_backend <backend> [{if | unless} <condition>] + Switch to a specific backend if/unless an ACL-based condition is matched. + May be used in sections : defaults | frontend | listen | backend + no | yes | yes | no + Arguments : + <backend> is the name of a valid backend or "listen" section, or a + "log-format" string resolving to a backend name. + + <condition> is a condition composed of ACLs, as described in section 7. If + it is omitted, the rule is unconditionally applied. + + When doing content-switching, connections arrive on a frontend and are then + dispatched to various backends depending on a number of conditions. The + relation between the conditions and the backends is described with the + "use_backend" keyword. While it is normally used with HTTP processing, it can + also be used in pure TCP, either without content using stateless ACLs (e.g. + source address validation) or combined with a "tcp-request" rule to wait for + some payload. + + There may be as many "use_backend" rules as desired. All of these rules are + evaluated in their declaration order, and the first one which matches will + assign the backend. + + In the first form, the backend will be used if the condition is met. In the + second form, the backend will be used if the condition is not met. If no + condition is valid, the backend defined with "default_backend" will be used. + If no default backend is defined, either the servers in the same section are + used (in case of a "listen" section) or, in case of a frontend, no server is + used and a 503 service unavailable response is returned. + + Note that it is possible to switch from a TCP frontend to an HTTP backend. In + this case, either the frontend has already checked that the protocol is HTTP, + and backend processing will immediately follow, or the backend will wait for + a complete HTTP request to get in. This feature is useful when a frontend + must decode several protocols on a unique port, one of them being HTTP. + + When <backend> is a simple name, it is resolved at configuration time, and an + error is reported if the specified backend does not exist. If <backend> is + a log-format string instead, no check may be done at configuration time, so + the backend name is resolved dynamically at run time. If the resulting + backend name does not correspond to any valid backend, no other rule is + evaluated, and the default_backend directive is applied instead. Note that + when using dynamic backend names, it is highly recommended to use a prefix + that no other backend uses in order to ensure that an unauthorized backend + cannot be forced from the request. + + It is worth mentioning that "use_backend" rules with an explicit name are + used to detect the association between frontends and backends to compute the + backend's "fullconn" setting. This cannot be done for dynamic names. + + See also: "default_backend", "tcp-request", "fullconn", "log-format", and + section 7 about ACLs. + +use-fcgi-app <name> + Defines the FastCGI application to use for the backend. + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + Arguments : + <name> is the name of the FastCGI application to use. + + See section 10.1 about FastCGI application setup for details. + +use-server <server> if <condition> +use-server <server> unless <condition> + Only use a specific server if/unless an ACL-based condition is matched. + May be used in sections : defaults | frontend | listen | backend + no | no | yes | yes + Arguments : + <server> is the name of a valid server in the same backend section + or a "log-format" string resolving to a server name. + + <condition> is a condition composed of ACLs, as described in section 7. + + By default, connections which arrive to a backend are load-balanced across + the available servers according to the configured algorithm, unless a + persistence mechanism such as a cookie is used and found in the request. + + Sometimes it is desirable to forward a particular request to a specific + server without having to declare a dedicated backend for this server. This + can be achieved using the "use-server" rules. These rules are evaluated after + the "redirect" rules and before evaluating cookies, and they have precedence + on them. There may be as many "use-server" rules as desired. All of these + rules are evaluated in their declaration order, and the first one which + matches will assign the server. + + If a rule designates a server which is down, and "option persist" is not used + and no force-persist rule was validated, it is ignored and evaluation goes on + with the next rules until one matches. + + In the first form, the server will be used if the condition is met. In the + second form, the server will be used if the condition is not met. If no + condition is valid, the processing continues and the server will be assigned + according to other persistence mechanisms. + + Note that even if a rule is matched, cookie processing is still performed but + does not assign the server. This allows prefixed cookies to have their prefix + stripped. + + The "use-server" statement works both in HTTP and TCP mode. This makes it + suitable for use with content-based inspection. For instance, a server could + be selected in a farm according to the TLS SNI field when using protocols with + implicit TLS (also see "req.ssl_sni"). And if these servers have their weight + set to zero, they will not be used for other traffic. + + Example : + # intercept incoming TLS requests based on the SNI field + use-server www if { req.ssl_sni -i www.example.com } + server www 192.168.0.1:443 weight 0 + use-server mail if { req.ssl_sni -i mail.example.com } + server mail 192.168.0.1:465 weight 0 + use-server imap if { req.ssl_sni -i imap.example.com } + server imap 192.168.0.1:993 weight 0 + # all the rest is forwarded to this server + server default 192.168.0.2:443 check + + When <server> is a simple name, it is checked against existing servers in the + configuration and an error is reported if the specified server does not exist. + If it is a log-format, no check is performed when parsing the configuration, + and if we can't resolve a valid server name at runtime but the use-server rule + was conditioned by an ACL returning true, no other use-server rule is applied + and we fall back to load balancing. + + See also: "use_backend", section 5 about server and section 7 about ACLs. + + +5. Bind and server options +-------------------------- + +The "bind", "server" and "default-server" keywords support a number of settings +depending on some build options and on the system HAProxy was built on. These +settings generally each consist in one word sometimes followed by a value, +written on the same line as the "bind" or "server" line. All these options are +described in this section. + + +5.1. Bind options +----------------- + +The "bind" keyword supports a certain number of settings which are all passed +as arguments on the same line. The order in which those arguments appear makes +no importance, provided that they appear after the bind address. All of these +parameters are optional. Some of them consist in a single words (booleans), +while other ones expect a value after them. In this case, the value must be +provided immediately after the setting name. + +The currently supported settings are the following ones. + +accept-netscaler-cip <magic number> + Enforces the use of the NetScaler Client IP insertion protocol over any + connection accepted by any of the TCP sockets declared on the same line. The + NetScaler Client IP insertion protocol dictates the layer 3/4 addresses of + the incoming connection to be used everywhere an address is used, with the + only exception of "tcp-request connection" rules which will only see the + real connection address. Logs will reflect the addresses indicated in the + protocol, unless it is violated, in which case the real address will still + be used. This keyword combined with support from external components can be + used as an efficient and reliable alternative to the X-Forwarded-For + mechanism which is not always reliable and not even always usable. See also + "tcp-request connection expect-netscaler-cip" for a finer-grained setting of + which client is allowed to use the protocol. + +accept-proxy + Enforces the use of the PROXY protocol over any connection accepted by any of + the sockets declared on the same line. Versions 1 and 2 of the PROXY protocol + are supported and correctly detected. The PROXY protocol dictates the layer + 3/4 addresses of the incoming connection to be used everywhere an address is + used, with the only exception of "tcp-request connection" rules which will + only see the real connection address. Logs will reflect the addresses + indicated in the protocol, unless it is violated, in which case the real + address will still be used. This keyword combined with support from external + components can be used as an efficient and reliable alternative to the + X-Forwarded-For mechanism which is not always reliable and not even always + usable. See also "tcp-request connection expect-proxy" for a finer-grained + setting of which client is allowed to use the protocol. + +allow-0rtt + Allow receiving early data when using TLSv1.3. This is disabled by default, + due to security considerations. Because it is vulnerable to replay attacks, + you should only allow if for requests that are safe to replay, i.e. requests + that are idempotent. You can use the "wait-for-handshake" action for any + request that wouldn't be safe with early data. + +alpn <protocols> + This enables the TLS ALPN extension and advertises the specified protocol + list as supported on top of ALPN. The protocol list consists in a comma- + delimited list of protocol names, for instance: "http/1.1,http/1.0" (without + quotes). This requires that the SSL library is built with support for TLS + extensions enabled (check with haproxy -vv). The ALPN extension replaces the + initial NPN extension. ALPN is required to enable HTTP/2 on an HTTP frontend. + Versions of OpenSSL prior to 1.0.2 didn't support ALPN and only supposed the + now obsolete NPN extension. At the time of writing this, most browsers still + support both ALPN and NPN for HTTP/2 so a fallback to NPN may still work for + a while. But ALPN must be used whenever possible. If both HTTP/2 and HTTP/1.1 + are expected to be supported, both versions can be advertised, in order of + preference, like below : + + bind :443 ssl crt pub.pem alpn h2,http/1.1 + + QUIC supports only h3 and hq-interop as ALPN. h3 is for HTTP/3 and hq-interop + is used for http/0.9 and QUIC interop runner (see https://interop.seemann.io). + +backlog <backlog> + Sets the socket's backlog to this value. If unspecified or 0, the frontend's + backlog is used instead, which generally defaults to the maxconn value. + +curves <curves> + This setting is only available when support for OpenSSL was built in. It sets + the string describing the list of elliptic curves algorithms ("curve suite") + that are negotiated during the SSL/TLS handshake with ECDHE. The format of the + string is a colon-delimited list of curve name. + Example: "X25519:P-256" (without quote) + When "curves" is set, "ecdhe" parameter is ignored. + +ecdhe <named curve> + This setting is only available when support for OpenSSL was built in. It sets + the named curve (RFC 4492) used to generate ECDH ephemeral keys. By default, + used named curve is prime256v1. + +ca-file <cafile> + This setting is only available when support for OpenSSL was built in. It + designates a PEM file from which to load CA certificates used to verify + client's certificate. It is possible to load a directory containing multiple + CAs, in this case HAProxy will try to load every ".pem", ".crt", ".cer", and + .crl" available in the directory, files starting with a dot are ignored. + + Warning: The "@system-ca" parameter could be used in place of the cafile + in order to use the trusted CAs of your system, like its done with the server + directive. But you mustn't use it unless you know what you are doing. + Configuring it this way basically mean that the bind will accept any client + certificate generated from one of the CA present on your system, which is + extremely unsecure. + +ca-ignore-err [all|<errorID>,...] + This setting is only available when support for OpenSSL was built in. + Sets a comma separated list of errorIDs to ignore during verify at depth > 0. + If set to 'all', all errors are ignored. SSL handshake is not aborted if an + error is ignored. + +ca-sign-file <cafile> + This setting is only available when support for OpenSSL was built in. It + designates a PEM file containing both the CA certificate and the CA private + key used to create and sign server's certificates. This is a mandatory + setting when the dynamic generation of certificates is enabled. See + 'generate-certificates' for details. + +ca-sign-pass <passphrase> + This setting is only available when support for OpenSSL was built in. It is + the CA private key passphrase. This setting is optional and used only when + the dynamic generation of certificates is enabled. See + 'generate-certificates' for details. + +ca-verify-file <cafile> + This setting designates a PEM file from which to load CA certificates used to + verify client's certificate. It designates CA certificates which must not be + included in CA names sent in server hello message. Typically, "ca-file" must + be defined with intermediate certificates, and "ca-verify-file" with + certificates to ending the chain, like root CA. + +ciphers <ciphers> + This setting is only available when support for OpenSSL was built in. It sets + the string describing the list of cipher algorithms ("cipher suite") that are + negotiated during the SSL/TLS handshake up to TLSv1.2. The format of the + string is defined in "man 1 ciphers" from OpenSSL man pages. For background + information and recommendations see e.g. + (https://wiki.mozilla.org/Security/Server_Side_TLS) and + (https://mozilla.github.io/server-side-tls/ssl-config-generator/). For TLSv1.3 + cipher configuration, please check the "ciphersuites" keyword. + +ciphersuites <ciphersuites> + This setting is only available when support for OpenSSL was built in and + OpenSSL 1.1.1 or later was used to build HAProxy. It sets the string describing + the list of cipher algorithms ("cipher suite") that are negotiated during the + TLSv1.3 handshake. The format of the string is defined in "man 1 ciphers" from + OpenSSL man pages under the "ciphersuites" section. For cipher configuration + for TLSv1.2 and earlier, please check the "ciphers" keyword. + +crl-file <crlfile> + This setting is only available when support for OpenSSL was built in. It + designates a PEM file from which to load certificate revocation list used + to verify client's certificate. You need to provide a certificate revocation + list for every certificate of your certificate authority chain. + +crt <cert> + This setting is only available when support for OpenSSL was built in. It + designates a PEM file containing both the required certificates and any + associated private keys. This file can be built by concatenating multiple + PEM files into one (e.g. cat cert.pem key.pem > combined.pem). If your CA + requires an intermediate certificate, this can also be concatenated into this + file. Intermediate certificate can also be shared in a directory via + "issuers-chain-path" directive. + + If the file does not contain a private key, HAProxy will try to load + the key at the same path suffixed by a ".key". + + If the OpenSSL used supports Diffie-Hellman, parameters present in this file + are loaded. + + If a directory name is used instead of a PEM file, then all files found in + that directory will be loaded in alphabetic order unless their name ends + with '.key', '.issuer', '.ocsp' or '.sctl' (reserved extensions). Files + starting with a dot are also ignored. This directive may be specified multiple + times in order to load certificates from multiple files or directories. The + certificates will be presented to clients who provide a valid TLS Server Name + Indication field matching one of their CN or alt subjects. Wildcards are + supported, where a wildcard character '*' is used instead of the first + hostname component (e.g. *.example.org matches www.example.org but not + www.sub.example.org). + + If no SNI is provided by the client or if the SSL library does not support + TLS extensions, or if the client provides an SNI hostname which does not + match any certificate, then the first loaded certificate will be presented. + This means that when loading certificates from a directory, it is highly + recommended to load the default one first as a file or to ensure that it will + always be the first one in the directory. + + Note that the same cert may be loaded multiple times without side effects. + + Some CAs (such as GoDaddy) offer a drop down list of server types that do not + include HAProxy when obtaining a certificate. If this happens be sure to + choose a web server that the CA believes requires an intermediate CA (for + GoDaddy, selection Apache Tomcat will get the correct bundle, but many + others, e.g. nginx, result in a wrong bundle that will not work for some + clients). + + For each PEM file, HAProxy checks for the presence of file at the same path + suffixed by ".ocsp". If such file is found, support for the TLS Certificate + Status Request extension (also known as "OCSP stapling") is automatically + enabled. The content of this file is optional. If not empty, it must contain + a valid OCSP Response in DER format. In order to be valid an OCSP Response + must comply with the following rules: it has to indicate a good status, + it has to be a single response for the certificate of the PEM file, and it + has to be valid at the moment of addition. If these rules are not respected + the OCSP Response is ignored and a warning is emitted. In order to identify + which certificate an OCSP Response applies to, the issuer's certificate is + necessary. If the issuer's certificate is not found in the PEM file, it will + be loaded from a file at the same path as the PEM file suffixed by ".issuer" + if it exists otherwise it will fail with an error. + + For each PEM file, HAProxy also checks for the presence of file at the same + path suffixed by ".sctl". If such file is found, support for Certificate + Transparency (RFC6962) TLS extension is enabled. The file must contain a + valid Signed Certificate Timestamp List, as described in RFC. File is parsed + to check basic syntax, but no signatures are verified. + + There are cases where it is desirable to support multiple key types, e.g. RSA + and ECDSA in the cipher suites offered to the clients. This allows clients + that support EC certificates to be able to use EC ciphers, while + simultaneously supporting older, RSA only clients. + + To achieve this, OpenSSL 1.1.1 is required, you can configure this behavior + by providing one crt entry per certificate type, or by configuring a "cert + bundle" like it was required before HAProxy 1.8. See "ssl-load-extra-files". + +crt-ignore-err <errors> + This setting is only available when support for OpenSSL was built in. Sets a + comma separated list of errorIDs to ignore during verify at depth == 0. If + set to 'all', all errors are ignored. SSL handshake is not aborted if an error + is ignored. + +crt-list <file> + This setting is only available when support for OpenSSL was built in. It + designates a list of PEM file with an optional ssl configuration and a SNI + filter per certificate, with the following format for each line : + + <crtfile> [\[<sslbindconf> ...\]] [[!]<snifilter> ...] + + sslbindconf supports "allow-0rtt", "alpn", "ca-file", "ca-verify-file", + "ciphers", "ciphersuites", "crl-file", "curves", "ecdhe", "no-ca-names", + "npn", "verify" configuration. With BoringSSL and Openssl >= 1.1.1 + "ssl-min-ver" and "ssl-max-ver" are also supported. It overrides the + configuration set in bind line for the certificate. + + Wildcards are supported in the SNI filter. Negative filter are also supported, + useful in combination with a wildcard filter to exclude a particular SNI, or + after the first certificate to exclude a pattern from its CN or Subject Alt + Name (SAN). The certificates will be presented to clients who provide a valid + TLS Server Name Indication field matching one of the SNI filters. If no SNI + filter is specified, the CN and SAN are used. This directive may be specified + multiple times. See the "crt" option for more information. The default + certificate is still needed to meet OpenSSL expectations. If it is not used, + the 'strict-sni' option may be used. + + Multi-cert bundling (see "ssl-load-extra-files") is supported with crt-list, + as long as only the base name is given in the crt-list. SNI filter will do + the same work on all bundled certificates. + + Empty lines as well as lines beginning with a hash ('#') will be ignored. + + The first declared certificate of a bind line is used as the default + certificate, either from crt or crt-list option, which HAProxy should use in + the TLS handshake if no other certificate matches. This certificate will also + be used if the provided SNI matches its CN or SAN, even if a matching SNI + filter is found on any crt-list. The SNI filter !* can be used after the first + declared certificate to not include its CN and SAN in the SNI tree, so it will + never match except if no other certificate matches. This way the first + declared certificate act as a fallback. + + crt-list file example: + cert1.pem !* + # comment + cert2.pem [alpn h2,http/1.1] + certW.pem *.domain.tld !secure.domain.tld + certS.pem [curves X25519:P-256 ciphers ECDHE-ECDSA-AES256-GCM-SHA384] secure.domain.tld + +defer-accept + Is an optional keyword which is supported only on certain Linux kernels. It + states that a connection will only be accepted once some data arrive on it, + or at worst after the first retransmit. This should be used only on protocols + for which the client talks first (e.g. HTTP). It can slightly improve + performance by ensuring that most of the request is already available when + the connection is accepted. On the other hand, it will not be able to detect + connections which don't talk. It is important to note that this option is + broken in all kernels up to 2.6.31, as the connection is never accepted until + the client talks. This can cause issues with front firewalls which would see + an established connection while the proxy will only see it in SYN_RECV. This + option is only supported on TCPv4/TCPv6 sockets and ignored by other ones. + +expose-fd listeners + This option is only usable with the stats socket. It gives your stats socket + the capability to pass listeners FD to another HAProxy process. + In master-worker mode, this is not required anymore, the listeners will be + passed using the internal socketpairs between the master and the workers. + See also "-x" in the management guide. + +force-sslv3 + This option enforces use of SSLv3 only on SSL connections instantiated from + this listener. SSLv3 is generally less expensive than the TLS counterparts + for high connection rates. This option is also available on global statement + "ssl-default-bind-options". See also "ssl-min-ver" and "ssl-max-ver". + +force-tlsv10 + This option enforces use of TLSv1.0 only on SSL connections instantiated from + this listener. This option is also available on global statement + "ssl-default-bind-options". See also "ssl-min-ver" and "ssl-max-ver". + +force-tlsv11 + This option enforces use of TLSv1.1 only on SSL connections instantiated from + this listener. This option is also available on global statement + "ssl-default-bind-options". See also "ssl-min-ver" and "ssl-max-ver". + +force-tlsv12 + This option enforces use of TLSv1.2 only on SSL connections instantiated from + this listener. This option is also available on global statement + "ssl-default-bind-options". See also "ssl-min-ver" and "ssl-max-ver". + +force-tlsv13 + This option enforces use of TLSv1.3 only on SSL connections instantiated from + this listener. This option is also available on global statement + "ssl-default-bind-options". See also "ssl-min-ver" and "ssl-max-ver". + +generate-certificates + This setting is only available when support for OpenSSL was built in. It + enables the dynamic SSL certificates generation. A CA certificate and its + private key are necessary (see 'ca-sign-file'). When HAProxy is configured as + a transparent forward proxy, SSL requests generate errors because of a common + name mismatch on the certificate presented to the client. With this option + enabled, HAProxy will try to forge a certificate using the SNI hostname + indicated by the client. This is done only if no certificate matches the SNI + hostname (see 'crt-list'). If an error occurs, the default certificate is + used, else the 'strict-sni' option is set. + It can also be used when HAProxy is configured as a reverse proxy to ease the + deployment of an architecture with many backends. + + Creating a SSL certificate is an expensive operation, so a LRU cache is used + to store forged certificates (see 'tune.ssl.ssl-ctx-cache-size'). It + increases the HAProxy's memory footprint to reduce latency when the same + certificate is used many times. + +gid <gid> + Sets the group of the UNIX sockets to the designated system gid. It can also + be set by default in the global section's "unix-bind" statement. Note that + some platforms simply ignore this. This setting is equivalent to the "group" + setting except that the group ID is used instead of its name. This setting is + ignored by non UNIX sockets. + +group <group> + Sets the group of the UNIX sockets to the designated system group. It can + also be set by default in the global section's "unix-bind" statement. Note + that some platforms simply ignore this. This setting is equivalent to the + "gid" setting except that the group name is used instead of its gid. This + setting is ignored by non UNIX sockets. + +id <id> + Fixes the socket ID. By default, socket IDs are automatically assigned, but + sometimes it is more convenient to fix them to ease monitoring. This value + must be strictly positive and unique within the listener/frontend. This + option can only be used when defining only a single socket. + +interface <interface> + Restricts the socket to a specific interface. When specified, only packets + received from that particular interface are processed by the socket. This is + currently only supported on Linux. The interface must be a primary system + interface, not an aliased interface. It is also possible to bind multiple + frontends to the same address if they are bound to different interfaces. Note + that binding to a network interface requires root privileges. This parameter + is only compatible with TCPv4/TCPv6 sockets. When specified, return traffic + uses the same interface as inbound traffic, and its associated routing table, + even if there are explicit routes through different interfaces configured. + This can prove useful to address asymmetric routing issues when the same + client IP addresses need to be able to reach frontends hosted on different + interfaces. + +level <level> + This setting is used with the stats sockets only to restrict the nature of + the commands that can be issued on the socket. It is ignored by other + sockets. <level> can be one of : + - "user" is the least privileged level; only non-sensitive stats can be + read, and no change is allowed. It would make sense on systems where it + is not easy to restrict access to the socket. + - "operator" is the default level and fits most common uses. All data can + be read, and only non-sensitive changes are permitted (e.g. clear max + counters). + - "admin" should be used with care, as everything is permitted (e.g. clear + all counters). + +severity-output <format> + This setting is used with the stats sockets only to configure severity + level output prepended to informational feedback messages. Severity + level of messages can range between 0 and 7, conforming to syslog + rfc5424. Valid and successful socket commands requesting data + (i.e. "show map", "get acl foo" etc.) will never have a severity level + prepended. It is ignored by other sockets. <format> can be one of : + - "none" (default) no severity level is prepended to feedback messages. + - "number" severity level is prepended as a number. + - "string" severity level is prepended as a string following the + rfc5424 convention. + +maxconn <maxconn> + Limits the sockets to this number of concurrent connections. Extraneous + connections will remain in the system's backlog until a connection is + released. If unspecified, the limit will be the same as the frontend's + maxconn. Note that in case of port ranges or multiple addresses, the same + value will be applied to each socket. This setting enables different + limitations on expensive sockets, for instance SSL entries which may easily + eat all memory. + +mode <mode> + Sets the octal mode used to define access permissions on the UNIX socket. It + can also be set by default in the global section's "unix-bind" statement. + Note that some platforms simply ignore this. This setting is ignored by non + UNIX sockets. + +mss <maxseg> + Sets the TCP Maximum Segment Size (MSS) value to be advertised on incoming + connections. This can be used to force a lower MSS for certain specific + ports, for instance for connections passing through a VPN. Note that this + relies on a kernel feature which is theoretically supported under Linux but + was buggy in all versions prior to 2.6.28. It may or may not work on other + operating systems. It may also not change the advertised value but change the + effective size of outgoing segments. The commonly advertised value for TCPv4 + over Ethernet networks is 1460 = 1500(MTU) - 40(IP+TCP). If this value is + positive, it will be used as the advertised MSS. If it is negative, it will + indicate by how much to reduce the incoming connection's advertised MSS for + outgoing segments. This parameter is only compatible with TCP v4/v6 sockets. + +name <name> + Sets an optional name for these sockets, which will be reported on the stats + page. + +namespace <name> + On Linux, it is possible to specify which network namespace a socket will + belong to. This directive makes it possible to explicitly bind a listener to + a namespace different from the default one. Please refer to your operating + system's documentation to find more details about network namespaces. + +nice <nice> + Sets the 'niceness' of connections initiated from the socket. Value must be + in the range -1024..1024 inclusive, and defaults to zero. Positive values + means that such connections are more friendly to others and easily offer + their place in the scheduler. On the opposite, negative values mean that + connections want to run with a higher priority than others. The difference + only happens under high loads when the system is close to saturation. + Negative values are appropriate for low-latency or administration services, + and high values are generally recommended for CPU intensive tasks such as SSL + processing or bulk transfers which are less sensible to latency. For example, + it may make sense to use a positive value for an SMTP socket and a negative + one for an RDP socket. + +no-ca-names + This setting is only available when support for OpenSSL was built in. It + prevents from send CA names in server hello message when ca-file is used. + Use "ca-verify-file" instead of "ca-file" with "no-ca-names". + +no-sslv3 + This setting is only available when support for OpenSSL was built in. It + disables support for SSLv3 on any sockets instantiated from the listener when + SSL is supported. Note that SSLv2 is forced disabled in the code and cannot + be enabled using any configuration option. This option is also available on + global statement "ssl-default-bind-options". Use "ssl-min-ver" and + "ssl-max-ver" instead. + +no-tls-tickets + This setting is only available when support for OpenSSL was built in. It + disables the stateless session resumption (RFC 5077 TLS Ticket + extension) and force to use stateful session resumption. Stateless + session resumption is more expensive in CPU usage. This option is also + available on global statement "ssl-default-bind-options". + The TLS ticket mechanism is only used up to TLS 1.2. + Forward Secrecy is compromised with TLS tickets, unless ticket keys + are periodically rotated (via reload or by using "tls-ticket-keys"). + +no-tlsv10 + This setting is only available when support for OpenSSL was built in. It + disables support for TLSv1.0 on any sockets instantiated from the listener + when SSL is supported. Note that SSLv2 is forced disabled in the code and + cannot be enabled using any configuration option. This option is also + available on global statement "ssl-default-bind-options". Use "ssl-min-ver" + and "ssl-max-ver" instead. + +no-tlsv11 + This setting is only available when support for OpenSSL was built in. It + disables support for TLSv1.1 on any sockets instantiated from the listener + when SSL is supported. Note that SSLv2 is forced disabled in the code and + cannot be enabled using any configuration option. This option is also + available on global statement "ssl-default-bind-options". Use "ssl-min-ver" + and "ssl-max-ver" instead. + +no-tlsv12 + This setting is only available when support for OpenSSL was built in. It + disables support for TLSv1.2 on any sockets instantiated from the listener + when SSL is supported. Note that SSLv2 is forced disabled in the code and + cannot be enabled using any configuration option. This option is also + available on global statement "ssl-default-bind-options". Use "ssl-min-ver" + and "ssl-max-ver" instead. + +no-tlsv13 + This setting is only available when support for OpenSSL was built in. It + disables support for TLSv1.3 on any sockets instantiated from the listener + when SSL is supported. Note that SSLv2 is forced disabled in the code and + cannot be enabled using any configuration option. This option is also + available on global statement "ssl-default-bind-options". Use "ssl-min-ver" + and "ssl-max-ver" instead. + +npn <protocols> + This enables the NPN TLS extension and advertises the specified protocol list + as supported on top of NPN. The protocol list consists in a comma-delimited + list of protocol names, for instance: "http/1.1,http/1.0" (without quotes). + This requires that the SSL library is built with support for TLS extensions + enabled (check with haproxy -vv). Note that the NPN extension has been + replaced with the ALPN extension (see the "alpn" keyword), though this one is + only available starting with OpenSSL 1.0.2. If HTTP/2 is desired on an older + version of OpenSSL, NPN might still be used as most clients still support it + at the time of writing this. It is possible to enable both NPN and ALPN + though it probably doesn't make any sense out of testing. + +prefer-client-ciphers + Use the client's preference when selecting the cipher suite, by default + the server's preference is enforced. This option is also available on + global statement "ssl-default-bind-options". + Note that with OpenSSL >= 1.1.1 ChaCha20-Poly1305 is reprioritized anyway + (without setting this option), if a ChaCha20-Poly1305 cipher is at the top of + the client cipher list. + +process <process-set>[/<thread-set>] + This restricts the list of threads on which this listener is allowed to run. + It does not enforce any of them but eliminates those which do not match. Note + that only process number 1 is permitted. If a thread set is specified, it + limits the threads allowed to process incoming connections for this listener. + For the unlikely case where several ranges are needed, this directive may be + repeated. <process-set> and <thread-set> must use the format + + all | odd | even | number[-[number]] + + Ranges can be partially defined. The higher bound can be omitted. In such a + case, it is replaced by the corresponding maximum value. The main purpose is + to have multiple bind lines sharing the same IP:port but not the same thread + in a listener, so that the system can distribute the incoming connections + into multiple queues, bypassing haproxy's internal queue load balancing. + Currently Linux 3.9 and above is known for supporting this. + + This directive is deprecated in favor of the more suited "thread" directive + below, and will be removed in 2.7. + +proto <name> + Forces the multiplexer's protocol to use for the incoming connections. It + must be compatible with the mode of the frontend (TCP or HTTP). It must also + be usable on the frontend side. The list of available protocols is reported + in haproxy -vv. The protocols properties are reported : the mode (TCP/HTTP), + the side (FE/BE), the mux name and its flags. + + Some protocols are subject to the head-of-line blocking on server side + (flag=HOL_RISK). Finally some protocols don't support upgrades (flag=NO_UPG). + The HTX compatibility is also reported (flag=HTX). + + Here are the protocols that may be used as argument to a "proto" directive on + a bind line : + + h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG + h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG + none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG + + Idea behind this option is to bypass the selection of the best multiplexer's + protocol for all connections instantiated from this listening socket. For + instance, it is possible to force the http/2 on clear TCP by specifying "proto + h2" on the bind line. + +quic-cc-algo [ cubic | newreno ] + Warning: QUIC support in HAProxy is currently experimental. Configuration may + + This is a QUIC specific setting to select the congestion control algorithm + for any connection attempts to the configured QUIC listeners. They are similar + to those used by TCP. + + Default value: cubic + +quic-force-retry + Warning: QUIC support in HAProxy is currently experimental. Configuration may + change without deprecation in the future. + + This is a QUIC specific setting which forces the use of the QUIC Retry feature + for all the connection attempts to the configured QUIC listeners. It consists + in veryfying the peers are able to receive packets at the transport address + they used to initiate a new connection, sending them a Retry packet which + contains a token. This token must be sent back to the Retry packet sender, + this latter being the only one to be able to validate the token. Note that QUIC + Retry will always be used even if a Retry threshold was set (see + "tune.quic.retry-threshold" setting). + + This setting requires the cluster secret to be set or else an error will be + reported on startup (see "cluster-secret"). + + See https://www.rfc-editor.org/rfc/rfc9000.html#section-8.1.2 for more + information about QUIC retry. + +shards <number> | by-thread + In multi-threaded mode, on operating systems supporting multiple listeners on + the same IP:port, this will automatically create this number of multiple + identical listeners for the same line, all bound to a fair share of the number + of the threads attached to this listener. This can sometimes be useful when + using very large thread counts where the in-kernel locking on a single socket + starts to cause a significant overhead. In this case the incoming traffic is + distributed over multiple sockets and the contention is reduced. Note that + doing this can easily increase the CPU usage by making more threads work a + little bit. + + If the number of shards is higher than the number of available threads, it + will automatically be trimmed to the number of threads (i.e. one shard per + thread). The special "by-thread" value also creates as many shards as there + are threads on the "bind" line. Since the system will evenly distribute the + incoming traffic between all these shards, it is important that this number + is an integral divisor of the number of threads. + +ssl + This setting is only available when support for OpenSSL was built in. It + enables SSL deciphering on connections instantiated from this listener. A + certificate is necessary (see "crt" above). All contents in the buffers will + appear in clear text, so that ACLs and HTTP processing will only have access + to deciphered contents. SSLv3 is disabled per default, use "ssl-min-ver SSLv3" + to enable it. + +ssl-max-ver [ SSLv3 | TLSv1.0 | TLSv1.1 | TLSv1.2 | TLSv1.3 ] + This option enforces use of <version> or lower on SSL connections instantiated + from this listener. Using this setting without "ssl-min-ver" can be + ambiguous because the default ssl-min-ver value could change in future HAProxy + versions. This option is also available on global statement + "ssl-default-bind-options". See also "ssl-min-ver". + +ssl-min-ver [ SSLv3 | TLSv1.0 | TLSv1.1 | TLSv1.2 | TLSv1.3 ] + This option enforces use of <version> or upper on SSL connections + instantiated from this listener. The default value is "TLSv1.2". This option + is also available on global statement "ssl-default-bind-options". + See also "ssl-max-ver". + +strict-sni + This setting is only available when support for OpenSSL was built in. The + SSL/TLS negotiation is allow only if the client provided an SNI which match + a certificate. The default certificate is not used. + See the "crt" option for more information. + +tcp-ut <delay> + Sets the TCP User Timeout for all incoming connections instantiated from this + listening socket. This option is available on Linux since version 2.6.37. It + allows HAProxy to configure a timeout for sockets which contain data not + receiving an acknowledgment for the configured delay. This is especially + useful on long-lived connections experiencing long idle periods such as + remote terminals or database connection pools, where the client and server + timeouts must remain high to allow a long period of idle, but where it is + important to detect that the client has disappeared in order to release all + resources associated with its connection (and the server's session). The + argument is a delay expressed in milliseconds by default. This only works + for regular TCP connections, and is ignored for other protocols. + +tfo + Is an optional keyword which is supported only on Linux kernels >= 3.7. It + enables TCP Fast Open on the listening socket, which means that clients which + support this feature will be able to send a request and receive a response + during the 3-way handshake starting from second connection, thus saving one + round-trip after the first connection. This only makes sense with protocols + that use high connection rates and where each round trip matters. This can + possibly cause issues with many firewalls which do not accept data on SYN + packets, so this option should only be enabled once well tested. This option + is only supported on TCPv4/TCPv6 sockets and ignored by other ones. You may + need to build HAProxy with USE_TFO=1 if your libc doesn't define + TCP_FASTOPEN. + +thread [<thread-group>/]<thread-set> + This restricts the list of threads on which this listener is allowed to run. + It does not enforce any of them but eliminates those which do not match. It + limits the threads allowed to process incoming connections for this listener. + + There are two numbering schemes. By default, thread numbers are absolute in + the process, comprised between 1 and the value specified in global.nbthread. + When thread groups are enabled, the number of a single desired thread group + (starting at 1) may be specified before a slash ('/') before the thread + range. In this case, the thread numbers in the range are relative to the + thread group instead, and start at 1 for each thread group. Absolute and + relative thread numbers may be used interchangeably but they must not be + mixed on a single "bind" line, as those not set will be resolved at the end + of the parsing. + + For the unlikely case where several ranges are needed, this directive may be + repeated. It is not permitted to use different thread groups even when using + multiple directives. The <thread-set> specification must use the format: + + all | odd | even | number[-[number]] + + Ranges can be partially defined. The higher bound can be omitted. In such a + case, it is replaced by the corresponding maximum value. The main purpose is + to have multiple bind lines sharing the same IP:port but not the same thread + in a listener, so that the system can distribute the incoming connections + into multiple queues, bypassing haproxy's internal queue load balancing. + Currently Linux 3.9 and above is known for supporting this. + +tls-ticket-keys <keyfile> + Sets the TLS ticket keys file to load the keys from. The keys need to be 48 + or 80 bytes long, depending if aes128 or aes256 is used, encoded with base64 + with one line per key (ex. openssl rand 80 | openssl base64 -A | xargs echo). + The first key determines the key length used for next keys: you can't mix + aes128 and aes256 keys. Number of keys is specified by the TLS_TICKETS_NO + build option (default 3) and at least as many keys need to be present in + the file. Last TLS_TICKETS_NO keys will be used for decryption and the + penultimate one for encryption. This enables easy key rotation by just + appending new key to the file and reloading the process. Keys must be + periodically rotated (ex. every 12h) or Perfect Forward Secrecy is + compromised. It is also a good idea to keep the keys off any permanent + storage such as hard drives (hint: use tmpfs and don't swap those files). + Lifetime hint can be changed using tune.ssl.timeout. + +transparent + Is an optional keyword which is supported only on certain Linux kernels. It + indicates that the addresses will be bound even if they do not belong to the + local machine, and that packets targeting any of these addresses will be + intercepted just as if the addresses were locally configured. This normally + requires that IP forwarding is enabled. Caution! do not use this with the + default address '*', as it would redirect any traffic for the specified port. + This keyword is available only when HAProxy is built with USE_LINUX_TPROXY=1. + This parameter is only compatible with TCPv4 and TCPv6 sockets, depending on + kernel version. Some distribution kernels include backports of the feature, + so check for support with your vendor. + +v4v6 + Is an optional keyword which is supported only on most recent systems + including Linux kernels >= 2.4.21. It is used to bind a socket to both IPv4 + and IPv6 when it uses the default address. Doing so is sometimes necessary + on systems which bind to IPv6 only by default. It has no effect on non-IPv6 + sockets, and is overridden by the "v6only" option. + +v6only + Is an optional keyword which is supported only on most recent systems + including Linux kernels >= 2.4.21. It is used to bind a socket to IPv6 only + when it uses the default address. Doing so is sometimes preferred to doing it + system-wide as it is per-listener. It has no effect on non-IPv6 sockets and + has precedence over the "v4v6" option. + +uid <uid> + Sets the owner of the UNIX sockets to the designated system uid. It can also + be set by default in the global section's "unix-bind" statement. Note that + some platforms simply ignore this. This setting is equivalent to the "user" + setting except that the user numeric ID is used instead of its name. This + setting is ignored by non UNIX sockets. + +user <user> + Sets the owner of the UNIX sockets to the designated system user. It can also + be set by default in the global section's "unix-bind" statement. Note that + some platforms simply ignore this. This setting is equivalent to the "uid" + setting except that the user name is used instead of its uid. This setting is + ignored by non UNIX sockets. + +verify [none|optional|required] + This setting is only available when support for OpenSSL was built in. If set + to 'none', client certificate is not requested. This is the default. In other + cases, a client certificate is requested. If the client does not provide a + certificate after the request and if 'verify' is set to 'required', then the + handshake is aborted, while it would have succeeded if set to 'optional'. The + certificate provided by the client is always verified using CAs from + 'ca-file' and optional CRLs from 'crl-file'. On verify failure the handshake + is aborted, regardless of the 'verify' option, unless the error code exactly + matches one of those listed with 'ca-ignore-err' or 'crt-ignore-err'. + +5.2. Server and default-server options +------------------------------------ + +The "server" and "default-server" keywords support a certain number of settings +which are all passed as arguments on the server line. The order in which those +arguments appear does not count, and they are all optional. Some of those +settings are single words (booleans) while others expect one or several values +after them. In this case, the values must immediately follow the setting name. +Except default-server, all those settings must be specified after the server's +address if they are used: + + server <name> <address>[:port] [settings ...] + default-server [settings ...] + +Note that all these settings are supported both by "server" and "default-server" +keywords, except "id" which is only supported by "server". + +The currently supported settings are the following ones. + +addr <ipv4|ipv6> + Using the "addr" parameter, it becomes possible to use a different IP address + to send health-checks or to probe the agent-check. On some servers, it may be + desirable to dedicate an IP address to specific component able to perform + complex tests which are more suitable to health-checks than the application. + This parameter is ignored if the "check" parameter is not set. See also the + "port" parameter. + +agent-check + Enable an auxiliary agent check which is run independently of a regular + health check. An agent health check is performed by making a TCP connection + to the port set by the "agent-port" parameter and reading an ASCII string + terminated by the first '\r' or '\n' met. The string is made of a series of + words delimited by spaces, tabs or commas in any order, each consisting of : + + - An ASCII representation of a positive integer percentage, e.g. "75%". + Values in this format will set the weight proportional to the initial + weight of a server as configured when HAProxy starts. Note that a zero + weight is reported on the stats page as "DRAIN" since it has the same + effect on the server (it's removed from the LB farm). + + - The string "maxconn:" followed by an integer (no space between). Values + in this format will set the maxconn of a server. The maximum number of + connections advertised needs to be multiplied by the number of load + balancers and different backends that use this health check to get the + total number of connections the server might receive. Example: maxconn:30 + + - The word "ready". This will turn the server's administrative state to the + READY mode, thus canceling any DRAIN or MAINT state + + - The word "drain". This will turn the server's administrative state to the + DRAIN mode, thus it will not accept any new connections other than those + that are accepted via persistence. + + - The word "maint". This will turn the server's administrative state to the + MAINT mode, thus it will not accept any new connections at all, and health + checks will be stopped. + + - The words "down", "fail", or "stopped", optionally followed by a + description string after a sharp ('#'). All of these mark the server's + operating state as DOWN, but since the word itself is reported on the stats + page, the difference allows an administrator to know if the situation was + expected or not : the service may intentionally be stopped, may appear up + but fail some validity tests, or may be seen as down (e.g. missing process, + or port not responding). + + - The word "up" sets back the server's operating state as UP if health checks + also report that the service is accessible. + + Parameters which are not advertised by the agent are not changed. For + example, an agent might be designed to monitor CPU usage and only report a + relative weight and never interact with the operating status. Similarly, an + agent could be designed as an end-user interface with 3 radio buttons + allowing an administrator to change only the administrative state. However, + it is important to consider that only the agent may revert its own actions, + so if a server is set to DRAIN mode or to DOWN state using the agent, the + agent must implement the other equivalent actions to bring the service into + operations again. + + Failure to connect to the agent is not considered an error as connectivity + is tested by the regular health check which is enabled by the "check" + parameter. Warning though, it is not a good idea to stop an agent after it + reports "down", since only an agent reporting "up" will be able to turn the + server up again. Note that the CLI on the Unix stats socket is also able to + force an agent's result in order to work around a bogus agent if needed. + + Requires the "agent-port" parameter to be set. See also the "agent-inter" + and "no-agent-check" parameters. + +agent-send <string> + If this option is specified, HAProxy will send the given string (verbatim) + to the agent server upon connection. You could, for example, encode + the backend name into this string, which would enable your agent to send + different responses based on the backend. Make sure to include a '\n' if + you want to terminate your request with a newline. + +agent-inter <delay> + The "agent-inter" parameter sets the interval between two agent checks + to <delay> milliseconds. If left unspecified, the delay defaults to 2000 ms. + + Just as with every other time-based parameter, it may be entered in any + other explicit unit among { us, ms, s, m, h, d }. The "agent-inter" + parameter also serves as a timeout for agent checks "timeout check" is + not set. In order to reduce "resonance" effects when multiple servers are + hosted on the same hardware, the agent and health checks of all servers + are started with a small time offset between them. It is also possible to + add some random noise in the agent and health checks interval using the + global "spread-checks" keyword. This makes sense for instance when a lot + of backends use the same servers. + + See also the "agent-check" and "agent-port" parameters. + +agent-addr <addr> + The "agent-addr" parameter sets address for agent check. + + You can offload agent-check to another target, so you can make single place + managing status and weights of servers defined in HAProxy in case you can't + make self-aware and self-managing services. You can specify both IP or + hostname, it will be resolved. + +agent-port <port> + The "agent-port" parameter sets the TCP port used for agent checks. + + See also the "agent-check" and "agent-inter" parameters. + +allow-0rtt + Allow sending early data to the server when using TLS 1.3. + Note that early data will be sent only if the client used early data, or + if the backend uses "retry-on" with the "0rtt-rejected" keyword. + +alpn <protocols> + This enables the TLS ALPN extension and advertises the specified protocol + list as supported on top of ALPN. The protocol list consists in a comma- + delimited list of protocol names, for instance: "http/1.1,http/1.0" (without + quotes). This requires that the SSL library is built with support for TLS + extensions enabled (check with haproxy -vv). The ALPN extension replaces the + initial NPN extension. ALPN is required to connect to HTTP/2 servers. + Versions of OpenSSL prior to 1.0.2 didn't support ALPN and only supposed the + now obsolete NPN extension. + If both HTTP/2 and HTTP/1.1 are expected to be supported, both versions can + be advertised, in order of preference, like below : + + server 127.0.0.1:443 ssl crt pub.pem alpn h2,http/1.1 + + See also "ws" to use an alternative ALPN for websocket streams. + +backup + When "backup" is present on a server line, the server is only used in load + balancing when all other non-backup servers are unavailable. Requests coming + with a persistence cookie referencing the server will always be served + though. By default, only the first operational backup server is used, unless + the "allbackups" option is set in the backend. See also the "no-backup" and + "allbackups" options. + +ca-file <cafile> + This setting is only available when support for OpenSSL was built in. It + designates a PEM file from which to load CA certificates used to verify + server's certificate. It is possible to load a directory containing multiple + CAs, in this case HAProxy will try to load every ".pem", ".crt", ".cer", and + .crl" available in the directory, files starting with a dot are ignored. + + In order to use the trusted CAs of your system, the "@system-ca" parameter + could be used in place of the cafile. The location of this directory could be + overwritten by setting the SSL_CERT_DIR environment variable. + +check + This option enables health checks on a server: + - when not set, no health checking is performed, and the server is always + considered available. + - when set and no other check method is configured, the server is considered + available when a connection can be established at the highest configured + transport layer. This means TCP by default, or SSL/TLS when "ssl" or + "check-ssl" are set, both possibly combined with connection prefixes such + as a PROXY protocol header when "send-proxy" or "check-send-proxy" are + set. This behavior is slightly different for dynamic servers, read the + following paragraphs for more details. + - when set and an application-level health check is defined, the + application-level exchanges are performed on top of the configured + transport layer and the server is considered available if all of the + exchanges succeed. + + By default, health checks are performed on the same address and port as + configured on the server, using the same encapsulation parameters (SSL/TLS, + proxy-protocol header, etc... ). It is possible to change the destination + address using "addr" and the port using "port". When done, it is assumed the + server isn't checked on the service port, and configured encapsulation + parameters are not reused. One must explicitly set "check-send-proxy" to send + connection headers, "check-ssl" to use SSL/TLS. + + Note that the implicit configuration of ssl and PROXY protocol is not + performed for dynamic servers. In this case, it is required to explicitly + use "check-ssl" and "check-send-proxy" when wanted, even if the check port is + not overridden. + + When "sni" or "alpn" are set on the server line, their value is not used for + health checks and one must use "check-sni" or "check-alpn". + + The default source address for health check traffic is the same as the one + defined in the backend. It can be changed with the "source" keyword. + + The interval between checks can be set using the "inter" keyword, and the + "rise" and "fall" keywords can be used to define how many successful or + failed health checks are required to flag a server available or not + available. + + Optional application-level health checks can be configured with "option + httpchk", "option mysql-check" "option smtpchk", "option pgsql-check", + "option ldap-check", or "option redis-check". + + Example: + # simple tcp check + backend foo + server s1 192.168.0.1:80 check + # this does a tcp connect + tls handshake + backend foo + server s1 192.168.0.1:443 ssl check + # simple tcp check is enough for check success + backend foo + option tcp-check + tcp-check connect + server s1 192.168.0.1:443 ssl check + +check-send-proxy + This option forces emission of a PROXY protocol line with outgoing health + checks, regardless of whether the server uses send-proxy or not for the + normal traffic. By default, the PROXY protocol is enabled for health checks + if it is already enabled for normal traffic and if no "port" nor "addr" + directive is present. However, if such a directive is present, the + "check-send-proxy" option needs to be used to force the use of the + protocol. See also the "send-proxy" option for more information. + +check-alpn <protocols> + Defines which protocols to advertise with ALPN. The protocol list consists in + a comma-delimited list of protocol names, for instance: "http/1.1,http/1.0" + (without quotes). If it is not set, the server ALPN is used. + +check-proto <name> + Forces the multiplexer's protocol to use for the server's health-check + connections. It must be compatible with the health-check type (TCP or + HTTP). It must also be usable on the backend side. The list of available + protocols is reported in haproxy -vv. The protocols properties are + reported : the mode (TCP/HTTP), the side (FE/BE), the mux name and its flags. + + Some protocols are subject to the head-of-line blocking on server side + (flag=HOL_RISK). Finally some protocols don't support upgrades (flag=NO_UPG). + The HTX compatibility is also reported (flag=HTX). + + Here are the protocols that may be used as argument to a "check-proto" + directive on a server line: + + h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG + fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG + h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG + none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG + + Idea behind this option is to bypass the selection of the best multiplexer's + protocol for health-check connections established to this server. + If not defined, the server one will be used, if set. + +check-sni <sni> + This option allows you to specify the SNI to be used when doing health checks + over SSL. It is only possible to use a string to set <sni>. If you want to + set a SNI for proxied traffic, see "sni". + +check-ssl + This option forces encryption of all health checks over SSL, regardless of + whether the server uses SSL or not for the normal traffic. This is generally + used when an explicit "port" or "addr" directive is specified and SSL health + checks are not inherited. It is important to understand that this option + inserts an SSL transport layer below the checks, so that a simple TCP connect + check becomes an SSL connect, which replaces the old ssl-hello-chk. The most + common use is to send HTTPS checks by combining "httpchk" with SSL checks. + All SSL settings are common to health checks and traffic (e.g. ciphers). + See the "ssl" option for more information and "no-check-ssl" to disable + this option. + +check-via-socks4 + This option enables outgoing health checks using upstream socks4 proxy. By + default, the health checks won't go through socks tunnel even it was enabled + for normal traffic. + +ciphers <ciphers> + This setting is only available when support for OpenSSL was built in. This + option sets the string describing the list of cipher algorithms that is + negotiated during the SSL/TLS handshake with the server. The format of the + string is defined in "man 1 ciphers" from OpenSSL man pages. For background + information and recommendations see e.g. + (https://wiki.mozilla.org/Security/Server_Side_TLS) and + (https://mozilla.github.io/server-side-tls/ssl-config-generator/). For TLSv1.3 + cipher configuration, please check the "ciphersuites" keyword. + +ciphersuites <ciphersuites> + This setting is only available when support for OpenSSL was built in and + OpenSSL 1.1.1 or later was used to build HAProxy. This option sets the string + describing the list of cipher algorithms that is negotiated during the TLS + 1.3 handshake with the server. The format of the string is defined in + "man 1 ciphers" from OpenSSL man pages under the "ciphersuites" section. + For cipher configuration for TLSv1.2 and earlier, please check the "ciphers" + keyword. + +cookie <value> + The "cookie" parameter sets the cookie value assigned to the server to + <value>. This value will be checked in incoming requests, and the first + operational server possessing the same value will be selected. In return, in + cookie insertion or rewrite modes, this value will be assigned to the cookie + sent to the client. There is nothing wrong in having several servers sharing + the same cookie value, and it is in fact somewhat common between normal and + backup servers. See also the "cookie" keyword in backend section. + +crl-file <crlfile> + This setting is only available when support for OpenSSL was built in. It + designates a PEM file from which to load certificate revocation list used + to verify server's certificate. + +crt <cert> + This setting is only available when support for OpenSSL was built in. + It designates a PEM file from which to load both a certificate and the + associated private key. This file can be built by concatenating both PEM + files into one. This certificate will be sent if the server send a client + certificate request. + + If the file does not contain a private key, HAProxy will try to load the key + at the same path suffixed by a ".key" (provided the "ssl-load-extra-files" + option is set accordingly). + +disabled + The "disabled" keyword starts the server in the "disabled" state. That means + that it is marked down in maintenance mode, and no connection other than the + ones allowed by persist mode will reach it. It is very well suited to setup + new servers, because normal traffic will never reach them, while it is still + possible to test the service by making use of the force-persist mechanism. + See also "enabled" setting. + +enabled + This option may be used as 'server' setting to reset any 'disabled' + setting which would have been inherited from 'default-server' directive as + default value. + It may also be used as 'default-server' setting to reset any previous + 'default-server' 'disabled' setting. + +error-limit <count> + If health observing is enabled, the "error-limit" parameter specifies the + number of consecutive errors that triggers event selected by the "on-error" + option. By default it is set to 10 consecutive errors. + + See also the "check", "error-limit" and "on-error". + +fall <count> + The "fall" parameter states that a server will be considered as dead after + <count> consecutive unsuccessful health checks. This value defaults to 3 if + unspecified. See also the "check", "inter" and "rise" parameters. + +force-sslv3 + This option enforces use of SSLv3 only when SSL is used to communicate with + the server. SSLv3 is generally less expensive than the TLS counterparts for + high connection rates. This option is also available on global statement + "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". + +force-tlsv10 + This option enforces use of TLSv1.0 only when SSL is used to communicate with + the server. This option is also available on global statement + "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". + +force-tlsv11 + This option enforces use of TLSv1.1 only when SSL is used to communicate with + the server. This option is also available on global statement + "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". + +force-tlsv12 + This option enforces use of TLSv1.2 only when SSL is used to communicate with + the server. This option is also available on global statement + "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". + +force-tlsv13 + This option enforces use of TLSv1.3 only when SSL is used to communicate with + the server. This option is also available on global statement + "ssl-default-server-options". See also "ssl-min-ver" and ssl-max-ver". + +id <value> + Set a persistent ID for the server. This ID must be positive and unique for + the proxy. An unused ID will automatically be assigned if unset. The first + assigned value will be 1. This ID is currently only returned in statistics. + +init-addr {last | libc | none | <ip>},[...]* + Indicate in what order the server's address should be resolved upon startup + if it uses an FQDN. Attempts are made to resolve the address by applying in + turn each of the methods mentioned in the comma-delimited list. The first + method which succeeds is used. If the end of the list is reached without + finding a working method, an error is thrown. Method "last" suggests to pick + the address which appears in the state file (see "server-state-file"). Method + "libc" uses the libc's internal resolver (gethostbyname() or getaddrinfo() + depending on the operating system and build options). Method "none" + specifically indicates that the server should start without any valid IP + address in a down state. It can be useful to ignore some DNS issues upon + startup, waiting for the situation to get fixed later. Finally, an IP address + (IPv4 or IPv6) may be provided. It can be the currently known address of the + server (e.g. filled by a configuration generator), or the address of a dummy + server used to catch old sessions and present them with a decent error + message for example. When the "first" load balancing algorithm is used, this + IP address could point to a fake server used to trigger the creation of new + instances on the fly. This option defaults to "last,libc" indicating that the + previous address found in the state file (if any) is used first, otherwise + the libc's resolver is used. This ensures continued compatibility with the + historic behavior. + + Example: + defaults + # never fail on address resolution + default-server init-addr last,libc,none + +inter <delay> +fastinter <delay> +downinter <delay> + The "inter" parameter sets the interval between two consecutive health checks + to <delay> milliseconds. If left unspecified, the delay defaults to 2000 ms. + It is also possible to use "fastinter" and "downinter" to optimize delays + between checks depending on the server state : + + Server state | Interval used + ----------------------------------------+---------------------------------- + UP 100% (non-transitional) | "inter" + ----------------------------------------+---------------------------------- + Transitionally UP (going down "fall"), | "fastinter" if set, + Transitionally DOWN (going up "rise"), | "inter" otherwise. + or yet unchecked. | + ----------------------------------------+---------------------------------- + DOWN 100% (non-transitional) | "downinter" if set, + | "inter" otherwise. + ----------------------------------------+---------------------------------- + + Just as with every other time-based parameter, they can be entered in any + other explicit unit among { us, ms, s, m, h, d }. The "inter" parameter also + serves as a timeout for health checks sent to servers if "timeout check" is + not set. In order to reduce "resonance" effects when multiple servers are + hosted on the same hardware, the agent and health checks of all servers + are started with a small time offset between them. It is also possible to + add some random noise in the agent and health checks interval using the + global "spread-checks" keyword. This makes sense for instance when a lot + of backends use the same servers. + +log-proto <logproto> + The "log-proto" specifies the protocol used to forward event messages to + a server configured in a ring section. Possible values are "legacy" + and "octet-count" corresponding respectively to "Non-transparent-framing" + and "Octet counting" in rfc6587. "legacy" is the default. + +maxconn <maxconn> + The "maxconn" parameter specifies the maximal number of concurrent + connections that will be sent to this server. If the number of incoming + concurrent connections goes higher than this value, they will be queued, + waiting for a slot to be released. This parameter is very important as it can + save fragile servers from going down under extreme loads. If a "minconn" + parameter is specified, the limit becomes dynamic. The default value is "0" + which means unlimited. See also the "minconn" and "maxqueue" parameters, and + the backend's "fullconn" keyword. + + In HTTP mode this parameter limits the number of concurrent requests instead + of the number of connections. Multiple requests might be multiplexed over a + single TCP connection to the server. As an example if you specify a maxconn + of 50 you might see between 1 and 50 actual server connections, but no more + than 50 concurrent requests. + +maxqueue <maxqueue> + The "maxqueue" parameter specifies the maximal number of connections which + will wait in the queue for this server. If this limit is reached, next + requests will be redispatched to other servers instead of indefinitely + waiting to be served. This will break persistence but may allow people to + quickly re-log in when the server they try to connect to is dying. Some load + balancing algorithms such as leastconn take this into account and accept to + add requests into a server's queue up to this value if it is explicitly set + to a value greater than zero, which often allows to better smooth the load + when dealing with single-digit maxconn values. The default value is "0" which + means the queue is unlimited. See also the "maxconn" and "minconn" parameters + and "balance leastconn". + +max-reuse <count> + The "max-reuse" argument indicates the HTTP connection processors that they + should not reuse a server connection more than this number of times to send + new requests. Permitted values are -1 (the default), which disables this + limit, or any positive value. Value zero will effectively disable keep-alive. + This is only used to work around certain server bugs which cause them to leak + resources over time. The argument is not necessarily respected by the lower + layers as there might be technical limitations making it impossible to + enforce. At least HTTP/2 connections to servers will respect it. + +minconn <minconn> + When the "minconn" parameter is set, the maxconn limit becomes a dynamic + limit following the backend's load. The server will always accept at least + <minconn> connections, never more than <maxconn>, and the limit will be on + the ramp between both values when the backend has less than <fullconn> + concurrent connections. This makes it possible to limit the load on the + server during normal loads, but push it further for important loads without + overloading the server during exceptional loads. See also the "maxconn" + and "maxqueue" parameters, as well as the "fullconn" backend keyword. + +namespace <name> + On Linux, it is possible to specify which network namespace a socket will + belong to. This directive makes it possible to explicitly bind a server to + a namespace different from the default one. Please refer to your operating + system's documentation to find more details about network namespaces. + +no-agent-check + This option may be used as "server" setting to reset any "agent-check" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "agent-check" setting. + +no-backup + This option may be used as "server" setting to reset any "backup" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "backup" setting. + +no-check + This option may be used as "server" setting to reset any "check" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "check" setting. + +no-check-ssl + This option may be used as "server" setting to reset any "check-ssl" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "check-ssl" setting. + +no-send-proxy + This option may be used as "server" setting to reset any "send-proxy" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "send-proxy" setting. + +no-send-proxy-v2 + This option may be used as "server" setting to reset any "send-proxy-v2" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "send-proxy-v2" setting. + +no-send-proxy-v2-ssl + This option may be used as "server" setting to reset any "send-proxy-v2-ssl" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "send-proxy-v2-ssl" setting. + +no-send-proxy-v2-ssl-cn + This option may be used as "server" setting to reset any "send-proxy-v2-ssl-cn" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "send-proxy-v2-ssl-cn" setting. + +no-ssl + This option may be used as "server" setting to reset any "ssl" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "ssl" setting. + + Note that using `default-server ssl` setting and `no-ssl` on server will + however init SSL connection, so it can be later be enabled through the + runtime API: see `set server` commands in management doc. + +no-ssl-reuse + This option disables SSL session reuse when SSL is used to communicate with + the server. It will force the server to perform a full handshake for every + new connection. It's probably only useful for benchmarking, troubleshooting, + and for paranoid users. + +no-sslv3 + This option disables support for SSLv3 when SSL is used to communicate with + the server. Note that SSLv2 is disabled in the code and cannot be enabled + using any configuration option. Use "ssl-min-ver" and "ssl-max-ver" instead. + + Supported in default-server: No + +no-tls-tickets + This setting is only available when support for OpenSSL was built in. It + disables the stateless session resumption (RFC 5077 TLS Ticket + extension) and force to use stateful session resumption. Stateless + session resumption is more expensive in CPU usage for servers. This option + is also available on global statement "ssl-default-server-options". + The TLS ticket mechanism is only used up to TLS 1.2. + Forward Secrecy is compromised with TLS tickets, unless ticket keys + are periodically rotated (via reload or by using "tls-ticket-keys"). + See also "tls-tickets". + +no-tlsv10 + This option disables support for TLSv1.0 when SSL is used to communicate with + the server. Note that SSLv2 is disabled in the code and cannot be enabled + using any configuration option. TLSv1 is more expensive than SSLv3 so it + often makes sense to disable it when communicating with local servers. This + option is also available on global statement "ssl-default-server-options". + Use "ssl-min-ver" and "ssl-max-ver" instead. + + Supported in default-server: No + +no-tlsv11 + This option disables support for TLSv1.1 when SSL is used to communicate with + the server. Note that SSLv2 is disabled in the code and cannot be enabled + using any configuration option. TLSv1 is more expensive than SSLv3 so it + often makes sense to disable it when communicating with local servers. This + option is also available on global statement "ssl-default-server-options". + Use "ssl-min-ver" and "ssl-max-ver" instead. + + Supported in default-server: No + +no-tlsv12 + This option disables support for TLSv1.2 when SSL is used to communicate with + the server. Note that SSLv2 is disabled in the code and cannot be enabled + using any configuration option. TLSv1 is more expensive than SSLv3 so it + often makes sense to disable it when communicating with local servers. This + option is also available on global statement "ssl-default-server-options". + Use "ssl-min-ver" and "ssl-max-ver" instead. + + Supported in default-server: No + +no-tlsv13 + This option disables support for TLSv1.3 when SSL is used to communicate with + the server. Note that SSLv2 is disabled in the code and cannot be enabled + using any configuration option. TLSv1 is more expensive than SSLv3 so it + often makes sense to disable it when communicating with local servers. This + option is also available on global statement "ssl-default-server-options". + Use "ssl-min-ver" and "ssl-max-ver" instead. + + Supported in default-server: No + +no-verifyhost + This option may be used as "server" setting to reset any "verifyhost" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "verifyhost" setting. + +no-tfo + This option may be used as "server" setting to reset any "tfo" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "tfo" setting. + +non-stick + Never add connections allocated to this sever to a stick-table. + This may be used in conjunction with backup to ensure that + stick-table persistence is disabled for backup servers. + +npn <protocols> + This enables the NPN TLS extension and advertises the specified protocol list + as supported on top of NPN. The protocol list consists in a comma-delimited + list of protocol names, for instance: "http/1.1,http/1.0" (without quotes). + This requires that the SSL library is built with support for TLS extensions + enabled (check with haproxy -vv). Note that the NPN extension has been + replaced with the ALPN extension (see the "alpn" keyword), though this one is + only available starting with OpenSSL 1.0.2. + +observe <mode> + This option enables health adjusting based on observing communication with + the server. By default this functionality is disabled and enabling it also + requires to enable health checks. There are two supported modes: "layer4" and + "layer7". In layer4 mode, only successful/unsuccessful tcp connections are + significant. In layer7, which is only allowed for http proxies, responses + received from server are verified, like valid/wrong http code, unparsable + headers, a timeout, etc. Valid status codes include 100 to 499, 501 and 505. + + See also the "check", "on-error" and "error-limit". + +on-error <mode> + Select what should happen when enough consecutive errors are detected. + Currently, four modes are available: + - fastinter: force fastinter + - fail-check: simulate a failed check, also forces fastinter (default) + - sudden-death: simulate a pre-fatal failed health check, one more failed + check will mark a server down, forces fastinter + - mark-down: mark the server immediately down and force fastinter + + See also the "check", "observe" and "error-limit". + +on-marked-down <action> + Modify what occurs when a server is marked down. + Currently one action is available: + - shutdown-sessions: Shutdown peer sessions. When this setting is enabled, + all connections to the server are immediately terminated when the server + goes down. It might be used if the health check detects more complex cases + than a simple connection status, and long timeouts would cause the service + to remain unresponsive for too long a time. For instance, a health check + might detect that a database is stuck and that there's no chance to reuse + existing connections anymore. Connections killed this way are logged with + a 'D' termination code (for "Down"). + + Actions are disabled by default + +on-marked-up <action> + Modify what occurs when a server is marked up. + Currently one action is available: + - shutdown-backup-sessions: Shutdown sessions on all backup servers. This is + done only if the server is not in backup state and if it is not disabled + (it must have an effective weight > 0). This can be used sometimes to force + an active server to take all the traffic back after recovery when dealing + with long sessions (e.g. LDAP, SQL, ...). Doing this can cause more trouble + than it tries to solve (e.g. incomplete transactions), so use this feature + with extreme care. Sessions killed because a server comes up are logged + with an 'U' termination code (for "Up"). + + Actions are disabled by default + +pool-low-conn <max> + Set a low threshold on the number of idling connections for a server, below + which a thread will not try to steal a connection from another thread. This + can be useful to improve CPU usage patterns in scenarios involving many very + fast servers, in order to ensure all threads will keep a few idle connections + all the time instead of letting them accumulate over one thread and migrating + them from thread to thread. Typical values of twice the number of threads + seem to show very good performance already with sub-millisecond response + times. The default is zero, indicating that any idle connection can be used + at any time. It is the recommended setting for normal use. This only applies + to connections that can be shared according to the same principles as those + applying to "http-reuse". In case connection sharing between threads would + be disabled via "tune.idle-pool.shared", it can become very important to use + this setting to make sure each thread always has a few connections, or the + connection reuse rate will decrease as thread count increases. + +pool-max-conn <max> + Set the maximum number of idling connections for a server. -1 means unlimited + connections, 0 means no idle connections. The default is -1. When idle + connections are enabled, orphaned idle connections which do not belong to any + client session anymore are moved to a dedicated pool so that they remain + usable by future clients. This only applies to connections that can be shared + according to the same principles as those applying to "http-reuse". + +pool-purge-delay <delay> + Sets the delay to start purging idle connections. Each <delay> interval, half + of the idle connections are closed. 0 means we don't keep any idle connection. + The default is 5s. + +port <port> + Using the "port" parameter, it becomes possible to use a different port to + send health-checks or to probe the agent-check. On some servers, it may be + desirable to dedicate a port to a specific component able to perform complex + tests which are more suitable to health-checks than the application. It is + common to run a simple script in inetd for instance. This parameter is + ignored if the "check" parameter is not set. See also the "addr" parameter. + +proto <name> + Forces the multiplexer's protocol to use for the outgoing connections to this + server. It must be compatible with the mode of the backend (TCP or HTTP). It + must also be usable on the backend side. The list of available protocols is + reported in haproxy -vv.The protocols properties are reported : the mode + (TCP/HTTP), the side (FE/BE), the mux name and its flags. + + Some protocols are subject to the head-of-line blocking on server side + (flag=HOL_RISK). Finally some protocols don't support upgrades (flag=NO_UPG). + The HTX compatibility is also reported (flag=HTX). + + Here are the protocols that may be used as argument to a "proto" directive on + a server line : + + h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG + fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG + h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG + none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG + + Idea behind this option is to bypass the selection of the best multiplexer's + protocol for all connections established to this server. + + See also "ws" to use an alternative protocol for websocket streams. + +redir <prefix> + The "redir" parameter enables the redirection mode for all GET and HEAD + requests addressing this server. This means that instead of having HAProxy + forward the request to the server, it will send an "HTTP 302" response with + the "Location" header composed of this prefix immediately followed by the + requested URI beginning at the leading '/' of the path component. That means + that no trailing slash should be used after <prefix>. All invalid requests + will be rejected, and all non-GET or HEAD requests will be normally served by + the server. Note that since the response is completely forged, no header + mangling nor cookie insertion is possible in the response. However, cookies in + requests are still analyzed, making this solution completely usable to direct + users to a remote location in case of local disaster. Main use consists in + increasing bandwidth for static servers by having the clients directly + connect to them. Note: never use a relative location here, it would cause a + loop between the client and HAProxy! + + Example : server srv1 192.168.1.1:80 redir http://image1.mydomain.com check + +rise <count> + The "rise" parameter states that a server will be considered as operational + after <count> consecutive successful health checks. This value defaults to 2 + if unspecified. See also the "check", "inter" and "fall" parameters. + +resolve-opts <option>,<option>,... + Comma separated list of options to apply to DNS resolution linked to this + server. + + Available options: + + * allow-dup-ip + By default, HAProxy prevents IP address duplication in a backend when DNS + resolution at runtime is in operation. + That said, for some cases, it makes sense that two servers (in the same + backend, being resolved by the same FQDN) have the same IP address. + For such case, simply enable this option. + This is the opposite of prevent-dup-ip. + + * ignore-weight + Ignore any weight that is set within an SRV record. This is useful when + you would like to control the weights using an alternate method, such as + using an "agent-check" or through the runtime api. + + * prevent-dup-ip + Ensure HAProxy's default behavior is enforced on a server: prevent re-using + an IP address already set to a server in the same backend and sharing the + same fqdn. + This is the opposite of allow-dup-ip. + + Example: + backend b_myapp + default-server init-addr none resolvers dns + server s1 myapp.example.com:80 check resolve-opts allow-dup-ip + server s2 myapp.example.com:81 check resolve-opts allow-dup-ip + + With the option allow-dup-ip set: + * if the nameserver returns a single IP address, then both servers will use + it + * If the nameserver returns 2 IP addresses, then each server will pick up a + different address + + Default value: not set + +resolve-prefer <family> + When DNS resolution is enabled for a server and multiple IP addresses from + different families are returned, HAProxy will prefer using an IP address + from the family mentioned in the "resolve-prefer" parameter. + Available families: "ipv4" and "ipv6" + + Default value: ipv6 + + Example: + + server s1 app1.domain.com:80 resolvers mydns resolve-prefer ipv6 + +resolve-net <network>[,<network[,...]] + This option prioritizes the choice of an ip address matching a network. This is + useful with clouds to prefer a local ip. In some cases, a cloud high + availability service can be announced with many ip addresses on many + different datacenters. The latency between datacenter is not negligible, so + this patch permits to prefer a local datacenter. If no address matches the + configured network, another address is selected. + + Example: + + server s1 app1.domain.com:80 resolvers mydns resolve-net 10.0.0.0/8 + +resolvers <id> + Points to an existing "resolvers" section to resolve current server's + hostname. + + Example: + + server s1 app1.domain.com:80 check resolvers mydns + + See also section 5.3 + +send-proxy + The "send-proxy" parameter enforces use of the PROXY protocol over any + connection established to this server. The PROXY protocol informs the other + end about the layer 3/4 addresses of the incoming connection, so that it can + know the client's address or the public address it accessed to, whatever the + upper layer protocol. For connections accepted by an "accept-proxy" or + "accept-netscaler-cip" listener, the advertised address will be used. Only + TCPv4 and TCPv6 address families are supported. Other families such as + Unix sockets, will report an UNKNOWN family. Servers using this option can + fully be chained to another instance of HAProxy listening with an + "accept-proxy" setting. This setting must not be used if the server isn't + aware of the protocol. When health checks are sent to the server, the PROXY + protocol is automatically used when this option is set, unless there is an + explicit "port" or "addr" directive, in which case an explicit + "check-send-proxy" directive would also be needed to use the PROXY protocol. + See also the "no-send-proxy" option of this section and "accept-proxy" and + "accept-netscaler-cip" option of the "bind" keyword. + +send-proxy-v2 + The "send-proxy-v2" parameter enforces use of the PROXY protocol version 2 + over any connection established to this server. The PROXY protocol informs + the other end about the layer 3/4 addresses of the incoming connection, so + that it can know the client's address or the public address it accessed to, + whatever the upper layer protocol. It also send ALPN information if an alpn + have been negotiated. This setting must not be used if the server isn't aware + of this version of the protocol. See also the "no-send-proxy-v2" option of + this section and send-proxy" option of the "bind" keyword. + +proxy-v2-options <option>[,<option>]* + The "proxy-v2-options" parameter add options to send in PROXY protocol + version 2 when "send-proxy-v2" is used. Options available are: + + - ssl : See also "send-proxy-v2-ssl". + - cert-cn : See also "send-proxy-v2-ssl-cn". + - ssl-cipher: Name of the used cipher. + - cert-sig : Signature algorithm of the used certificate. + - cert-key : Key algorithm of the used certificate + - authority : Host name value passed by the client (only SNI from a TLS + connection is supported). + - crc32c : Checksum of the PROXYv2 header. + - unique-id : Send a unique ID generated using the frontend's + "unique-id-format" within the PROXYv2 header. + This unique-id is primarily meant for "mode tcp". It can + lead to unexpected results in "mode http", because the + generated unique ID is also used for the first HTTP request + within a Keep-Alive connection. + +send-proxy-v2-ssl + The "send-proxy-v2-ssl" parameter enforces use of the PROXY protocol version + 2 over any connection established to this server. The PROXY protocol informs + the other end about the layer 3/4 addresses of the incoming connection, so + that it can know the client's address or the public address it accessed to, + whatever the upper layer protocol. In addition, the SSL information extension + of the PROXY protocol is added to the PROXY protocol header. This setting + must not be used if the server isn't aware of this version of the protocol. + See also the "no-send-proxy-v2-ssl" option of this section and the + "send-proxy-v2" option of the "bind" keyword. + +send-proxy-v2-ssl-cn + The "send-proxy-v2-ssl" parameter enforces use of the PROXY protocol version + 2 over any connection established to this server. The PROXY protocol informs + the other end about the layer 3/4 addresses of the incoming connection, so + that it can know the client's address or the public address it accessed to, + whatever the upper layer protocol. In addition, the SSL information extension + of the PROXY protocol, along along with the Common Name from the subject of + the client certificate (if any), is added to the PROXY protocol header. This + setting must not be used if the server isn't aware of this version of the + protocol. See also the "no-send-proxy-v2-ssl-cn" option of this section and + the "send-proxy-v2" option of the "bind" keyword. + +slowstart <start_time_in_ms> + The "slowstart" parameter for a server accepts a value in milliseconds which + indicates after how long a server which has just come back up will run at + full speed. Just as with every other time-based parameter, it can be entered + in any other explicit unit among { us, ms, s, m, h, d }. The speed grows + linearly from 0 to 100% during this time. The limitation applies to two + parameters : + + - maxconn: the number of connections accepted by the server will grow from 1 + to 100% of the usual dynamic limit defined by (minconn,maxconn,fullconn). + + - weight: when the backend uses a dynamic weighted algorithm, the weight + grows linearly from 1 to 100%. In this case, the weight is updated at every + health-check. For this reason, it is important that the "inter" parameter + is smaller than the "slowstart", in order to maximize the number of steps. + + The slowstart never applies when HAProxy starts, otherwise it would cause + trouble to running servers. It only applies when a server has been previously + seen as failed. + +sni <expression> + The "sni" parameter evaluates the sample fetch expression, converts it to a + string and uses the result as the host name sent in the SNI TLS extension to + the server. A typical use case is to send the SNI received from the client in + a bridged TCP/SSL scenario, using the "ssl_fc_sni" sample fetch for the + expression. THIS MUST NOT BE USED FOR HTTPS, where req.hdr(host) should be + used instead, since SNI in HTTPS must always match the Host field and clients + are allowed to use different host names over the same connection). If + "verify required" is set (which is the recommended setting), the resulting + name will also be matched against the server certificate's names. See the + "verify" directive for more details. If you want to set a SNI for health + checks, see the "check-sni" directive for more details. + +source <addr>[:<pl>[-<ph>]] [usesrc { <addr2>[:<port2>] | client | clientip } ] +source <addr>[:<port>] [usesrc { <addr2>[:<port2>] | hdr_ip(<hdr>[,<occ>]) } ] +source <addr>[:<pl>[-<ph>]] [interface <name>] ... + The "source" parameter sets the source address which will be used when + connecting to the server. It follows the exact same parameters and principle + as the backend "source" keyword, except that it only applies to the server + referencing it. Please consult the "source" keyword for details. + + Additionally, the "source" statement on a server line allows one to specify a + source port range by indicating the lower and higher bounds delimited by a + dash ('-'). Some operating systems might require a valid IP address when a + source port range is specified. It is permitted to have the same IP/range for + several servers. Doing so makes it possible to bypass the maximum of 64k + total concurrent connections. The limit will then reach 64k connections per + server. + + Since Linux 4.2/libc 2.23 IP_BIND_ADDRESS_NO_PORT is set for connections + specifying the source address without port(s). + +ssl + This option enables SSL ciphering on outgoing connections to the server. It + is critical to verify server certificates using "verify" when using SSL to + connect to servers, otherwise the communication is prone to trivial man in + the-middle attacks rendering SSL useless. When this option is used, health + checks are automatically sent in SSL too unless there is a "port" or an + "addr" directive indicating the check should be sent to a different location. + See the "no-ssl" to disable "ssl" option and "check-ssl" option to force + SSL health checks. + +ssl-max-ver [ SSLv3 | TLSv1.0 | TLSv1.1 | TLSv1.2 | TLSv1.3 ] + This option enforces use of <version> or lower when SSL is used to communicate + with the server. This option is also available on global statement + "ssl-default-server-options". See also "ssl-min-ver". + +ssl-min-ver [ SSLv3 | TLSv1.0 | TLSv1.1 | TLSv1.2 | TLSv1.3 ] + This option enforces use of <version> or upper when SSL is used to communicate + with the server. This option is also available on global statement + "ssl-default-server-options". See also "ssl-max-ver". + +ssl-reuse + This option may be used as "server" setting to reset any "no-ssl-reuse" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "no-ssl-reuse" setting. + +stick + This option may be used as "server" setting to reset any "non-stick" + setting which would have been inherited from "default-server" directive as + default value. + It may also be used as "default-server" setting to reset any previous + "default-server" "non-stick" setting. + +socks4 <addr>:<port> + This option enables upstream socks4 tunnel for outgoing connections to the + server. Using this option won't force the health check to go via socks4 by + default. You will have to use the keyword "check-via-socks4" to enable it. + +tcp-ut <delay> + Sets the TCP User Timeout for all outgoing connections to this server. This + option is available on Linux since version 2.6.37. It allows HAProxy to + configure a timeout for sockets which contain data not receiving an + acknowledgment for the configured delay. This is especially useful on + long-lived connections experiencing long idle periods such as remote + terminals or database connection pools, where the client and server timeouts + must remain high to allow a long period of idle, but where it is important to + detect that the server has disappeared in order to release all resources + associated with its connection (and the client's session). One typical use + case is also to force dead server connections to die when health checks are + too slow or during a soft reload since health checks are then disabled. The + argument is a delay expressed in milliseconds by default. This only works for + regular TCP connections, and is ignored for other protocols. + +tfo + This option enables using TCP fast open when connecting to servers, on + systems that support it (currently only the Linux kernel >= 4.11). + See the "tfo" bind option for more information about TCP fast open. + Please note that when using tfo, you should also use the "conn-failure", + "empty-response" and "response-timeout" keywords for "retry-on", or HAProxy + won't be able to retry the connection on failure. See also "no-tfo". + +track [<proxy>/]<server> + This option enables ability to set the current state of the server by tracking + another one. It is possible to track a server which itself tracks another + server, provided that at the end of the chain, a server has health checks + enabled. If <proxy> is omitted the current one is used. If disable-on-404 is + used, it has to be enabled on both proxies. + +tls-tickets + This option may be used as "server" setting to reset any "no-tls-tickets" + setting which would have been inherited from "default-server" directive as + default value. + The TLS ticket mechanism is only used up to TLS 1.2. + Forward Secrecy is compromised with TLS tickets, unless ticket keys + are periodically rotated (via reload or by using "tls-ticket-keys"). + It may also be used as "default-server" setting to reset any previous + "default-server" "no-tls-tickets" setting. + +verify [none|required] + This setting is only available when support for OpenSSL was built in. If set + to 'none', server certificate is not verified. In the other case, The + certificate provided by the server is verified using CAs from 'ca-file' and + optional CRLs from 'crl-file' after having checked that the names provided in + the certificate's subject and subjectAlternateNames attributes match either + the name passed using the "sni" directive, or if not provided, the static + host name passed using the "verifyhost" directive. When no name is found, the + certificate's names are ignored. For this reason, without SNI it's important + to use "verifyhost". On verification failure the handshake is aborted. It is + critically important to verify server certificates when using SSL to connect + to servers, otherwise the communication is prone to trivial man-in-the-middle + attacks rendering SSL totally useless. Unless "ssl_server_verify" appears in + the global section, "verify" is set to "required" by default. + +verifyhost <hostname> + This setting is only available when support for OpenSSL was built in, and + only takes effect if 'verify required' is also specified. This directive sets + a default static hostname to check the server's certificate against when no + SNI was used to connect to the server. If SNI is not used, this is the only + way to enable hostname verification. This static hostname, when set, will + also be used for health checks (which cannot provide an SNI value). If none + of the hostnames in the certificate match the specified hostname, the + handshake is aborted. The hostnames in the server-provided certificate may + include wildcards. See also "verify", "sni" and "no-verifyhost" options. + +weight <weight> + The "weight" parameter is used to adjust the server's weight relative to + other servers. All servers will receive a load proportional to their weight + relative to the sum of all weights, so the higher the weight, the higher the + load. The default weight is 1, and the maximal value is 256. A value of 0 + means the server will not participate in load-balancing but will still accept + persistent connections. If this parameter is used to distribute the load + according to server's capacity, it is recommended to start with values which + can both grow and shrink, for instance between 10 and 100 to leave enough + room above and below for later adjustments. + +ws { auto | h1 | h2 } + This option allows to configure the protocol used when relaying websocket + streams. This is most notably useful when using an HTTP/2 backend without the + support for H2 websockets through the RFC8441. + + The default mode is "auto". This will reuse the same protocol as the main + one. The only difference is when using ALPN. In this case, it can try to + downgrade the ALPN to "http/1.1" only for websocket streams if the configured + server ALPN contains it. + + The value "h1" is used to force HTTP/1.1 for websockets streams, through ALPN + if SSL ALPN is activated for the server. Similarly, "h2" can be used to + force HTTP/2.0 websockets. Use this value with care : the server must support + RFC8441 or an error will be reported by haproxy when relaying websockets. + + Note that NPN is not taken into account as its usage has been deprecated in + favor of the ALPN extension. + + See also "alpn" and "proto". + + +5.3. Server IP address resolution using DNS +------------------------------------------- + +HAProxy allows using a host name on the server line to retrieve its IP address +using name servers. By default, HAProxy resolves the name when parsing the +configuration file, at startup and cache the result for the process's life. +This is not sufficient in some cases, such as in Amazon where a server's IP +can change after a reboot or an ELB Virtual IP can change based on current +workload. +This chapter describes how HAProxy can be configured to process server's name +resolution at run time. +Whether run time server name resolution has been enable or not, HAProxy will +carry on doing the first resolution when parsing the configuration. + + +5.3.1. Global overview +---------------------- + +As we've seen in introduction, name resolution in HAProxy occurs at two +different steps of the process life: + + 1. when starting up, HAProxy parses the server line definition and matches a + host name. It uses libc functions to get the host name resolved. This + resolution relies on /etc/resolv.conf file. + + 2. at run time, HAProxy performs periodically name resolutions for servers + requiring DNS resolutions. + +A few other events can trigger a name resolution at run time: + - when a server's health check ends up in a connection timeout: this may be + because the server has a new IP address. So we need to trigger a name + resolution to know this new IP. + +When using resolvers, the server name can either be a hostname, or a SRV label. +HAProxy considers anything that starts with an underscore as a SRV label. If a +SRV label is specified, then the corresponding SRV records will be retrieved +from the DNS server, and the provided hostnames will be used. The SRV label +will be checked periodically, and if any server are added or removed, HAProxy +will automatically do the same. + +A few things important to notice: + - all the name servers are queried in the meantime. HAProxy will process the + first valid response. + + - a resolution is considered as invalid (NX, timeout, refused), when all the + servers return an error. + + +5.3.2. The resolvers section +---------------------------- + +This section is dedicated to host information related to name resolution in +HAProxy. There can be as many as resolvers section as needed. Each section can +contain many name servers. + +At startup, HAProxy tries to generate a resolvers section named "default", if +no section was named this way in the configuration. This section is used by +default by the httpclient and uses the parse-resolv-conf keyword. If HAProxy +failed to generate automatically this section, no error or warning are emitted. + +When multiple name servers are configured in a resolvers section, then HAProxy +uses the first valid response. In case of invalid responses, only the last one +is treated. Purpose is to give the chance to a slow server to deliver a valid +answer after a fast faulty or outdated server. + +When each server returns a different error type, then only the last error is +used by HAProxy. The following processing is applied on this error: + + 1. HAProxy retries the same DNS query with a new query type. The A queries are + switch to AAAA or the opposite. SRV queries are not concerned here. Timeout + errors are also excluded. + + 2. When the fallback on the query type was done (or not applicable), HAProxy + retries the original DNS query, with the preferred query type. + + 3. HAProxy retries previous steps <resolve_retries> times. If no valid + response is received after that, it stops the DNS resolution and reports + the error. + +For example, with 2 name servers configured in a resolvers section, the +following scenarios are possible: + + - First response is valid and is applied directly, second response is + ignored + + - First response is invalid and second one is valid, then second response is + applied + + - First response is a NX domain and second one a truncated response, then + HAProxy retries the query with a new type + + - First response is a NX domain and second one is a timeout, then HAProxy + retries the query with a new type + + - Query timed out for both name servers, then HAProxy retries it with the + same query type + +As a DNS server may not answer all the IPs in one DNS request, HAProxy keeps +a cache of previous answers, an answer will be considered obsolete after +<hold obsolete> seconds without the IP returned. + + +resolvers <resolvers id> + Creates a new name server list labeled <resolvers id> + +A resolvers section accept the following parameters: + +accepted_payload_size <nb> + Defines the maximum payload size accepted by HAProxy and announced to all the + name servers configured in this resolvers section. + <nb> is in bytes. If not set, HAProxy announces 512. (minimal value defined + by RFC 6891) + + Note: the maximum allowed value is 65535. Recommended value for UDP is + 4096 and it is not recommended to exceed 8192 except if you are sure + that your system and network can handle this (over 65507 makes no sense + since is the maximum UDP payload size). If you are using only TCP + nameservers to handle huge DNS responses, you should put this value + to the max: 65535. + +nameserver <name> <address>[:port] [param*] + Used to configure a nameserver. <name> of the nameserver should ne unique. + By default the <address> is considered of type datagram. This means if an + IPv4 or IPv6 is configured without special address prefixes (paragraph 11.) + the UDP protocol will be used. If an stream protocol address prefix is used, + the nameserver will be considered as a stream server (TCP for instance) and + "server" parameters found in 5.2 paragraph which are relevant for DNS + resolving will be considered. Note: currently, in TCP mode, 4 queries are + pipelined on the same connections. A batch of idle connections are removed + every 5 seconds. "maxconn" can be configured to limit the amount of those + concurrent connections and TLS should also usable if the server supports. + +parse-resolv-conf + Adds all nameservers found in /etc/resolv.conf to this resolvers nameservers + list. Ordered as if each nameserver in /etc/resolv.conf was individually + placed in the resolvers section in place of this directive. + +hold <status> <period> + Upon receiving the DNS response <status>, determines whether a server's state + should change from UP to DOWN. To make that determination, it checks whether + any valid status has been received during the past <period> in order to + counteract the just received invalid status. + + <status> : last name resolution status. + nx After receiving an NXDOMAIN status, check for any valid + status during the concluding period. + + refused After receiving a REFUSED status, check for any valid + status during the concluding period. + + timeout After the "timeout retry" has struck, check for any + valid status during the concluding period. + + other After receiving any other invalid status, check for any + valid status during the concluding period. + + valid Applies only to "http-request do-resolve" and + "tcp-request content do-resolve" actions. It defines the + period for which the server will maintain a valid response + before triggering another resolution. It does not affect + dynamic resolution of servers. + + obsolete Defines how long to wait before removing obsolete DNS + records after an updated answer record is received. It + applies to SRV records. + + <period> : Amount of time into the past during which a valid response must + have been received. It follows the HAProxy time format and is in + milliseconds by default. + + For a server that relies on dynamic DNS resolution to determine its IP + address, receiving an invalid DNS response, such as NXDOMAIN, will lead to + changing the server's state from UP to DOWN. The hold directives define how + far into the past to look for a valid response. If a valid response has been + received within <period>, the just received invalid status will be ignored. + + Unless a valid response has been receiving during the concluding period, the + server will be marked as DOWN. For example, if "hold nx 30s" is set and the + last received DNS response was NXDOMAIN, the server will be marked DOWN + unless a valid response has been received during the last 30 seconds. + + A server in the DOWN state will be marked UP immediately upon receiving a + valid status from the DNS server. + + A separate behavior exists for "hold valid" and "hold obsolete". + +resolve_retries <nb> + Defines the number <nb> of queries to send to resolve a server name before + giving up. + Default value: 3 + + A retry occurs on name server timeout or when the full sequence of DNS query + type failover is over and we need to start up from the default ANY query + type. + +timeout <event> <time> + Defines timeouts related to name resolution + <event> : the event on which the <time> timeout period applies to. + events available are: + - resolve : default time to trigger name resolutions when no + other time applied. + Default value: 1s + - retry : time between two DNS queries, when no valid response + have been received. + Default value: 1s + <time> : time related to the event. It follows the HAProxy time format. + <time> is expressed in milliseconds. + + Example: + + resolvers mydns + nameserver dns1 10.0.0.1:53 + nameserver dns2 10.0.0.2:53 + nameserver dns3 tcp@10.0.0.3:53 + parse-resolv-conf + resolve_retries 3 + timeout resolve 1s + timeout retry 1s + hold other 30s + hold refused 30s + hold nx 30s + hold timeout 30s + hold valid 10s + hold obsolete 30s + + +6. Cache +--------- + +HAProxy provides a cache, which was designed to perform cache on small objects +(favicon, css...). This is a minimalist low-maintenance cache which runs in +RAM. + +The cache is based on a memory area shared between all threads, and split in 1kB +blocks. + +If an object is not used anymore, it can be deleted to store a new object +independently of its expiration date. The oldest objects are deleted first +when we try to allocate a new one. + +The cache uses a hash of the host header and the URI as the key. + +It's possible to view the status of a cache using the Unix socket command +"show cache" consult section 9.3 "Unix Socket commands" of Management Guide +for more details. + +When an object is delivered from the cache, the server name in the log is +replaced by "<CACHE>". + + +6.1. Limitation +---------------- + +The cache won't store and won't deliver objects in these cases: + +- If the response is not a 200 +- If the response contains a Vary header and either the process-vary option is + disabled, or a currently unmanaged header is specified in the Vary value (only + accept-encoding and referer are managed for now) +- If the Content-Length + the headers size is greater than "max-object-size" +- If the response is not cacheable +- If the response does not have an explicit expiration time (s-maxage or max-age + Cache-Control directives or Expires header) or a validator (ETag or Last-Modified + headers) +- If the process-vary option is enabled and there are already max-secondary-entries + entries with the same primary key as the current response +- If the process-vary option is enabled and the response has an unknown encoding (not + mentioned in https://www.iana.org/assignments/http-parameters/http-parameters.xhtml) + while varying on the accept-encoding client header + +- If the request is not a GET +- If the HTTP version of the request is smaller than 1.1 +- If the request contains an Authorization header + + +6.2. Setup +----------- + +To setup a cache, you must define a cache section and use it in a proxy with +the corresponding http-request and response actions. + + +6.2.1. Cache section +--------------------- + +cache <name> + Declare a cache section, allocate a shared cache memory named <name>, the + size of cache is mandatory. + +total-max-size <megabytes> + Define the size in RAM of the cache in megabytes. This size is split in + blocks of 1kB which are used by the cache entries. Its maximum value is 4095. + +max-object-size <bytes> + Define the maximum size of the objects to be cached. Must not be greater than + an half of "total-max-size". If not set, it equals to a 256th of the cache size. + All objects with sizes larger than "max-object-size" will not be cached. + +max-age <seconds> + Define the maximum expiration duration. The expiration is set as the lowest + value between the s-maxage or max-age (in this order) directive in the + Cache-Control response header and this value. The default value is 60 + seconds, which means that you can't cache an object more than 60 seconds by + default. + +process-vary <on/off> + Enable or disable the processing of the Vary header. When disabled, a response + containing such a header will never be cached. When enabled, we need to calculate + a preliminary hash for a subset of request headers on all the incoming requests + (which might come with a cpu cost) which will be used to build a secondary key + for a given request (see RFC 7234#4.1). The default value is off (disabled). + +max-secondary-entries <number> + Define the maximum number of simultaneous secondary entries with the same primary + key in the cache. This needs the vary support to be enabled. Its default value is 10 + and should be passed a strictly positive integer. + + +6.2.2. Proxy section +--------------------- + +http-request cache-use <name> [ { if | unless } <condition> ] + Try to deliver a cached object from the cache <name>. This directive is also + mandatory to store the cache as it calculates the cache hash. If you want to + use a condition for both storage and delivering that's a good idea to put it + after this one. + +http-response cache-store <name> [ { if | unless } <condition> ] + Store an http-response within the cache. The storage of the response headers + is done at this step, which means you can use others http-response actions + to modify headers before or after the storage of the response. This action + is responsible for the setup of the cache storage filter. + + +Example: + + backend bck1 + mode http + + http-request cache-use foobar + http-response cache-store foobar + server srv1 127.0.0.1:80 + + cache foobar + total-max-size 4 + max-age 240 + + +7. Using ACLs and fetching samples +---------------------------------- + +HAProxy is capable of extracting data from request or response streams, from +client or server information, from tables, environmental information etc... +The action of extracting such data is called fetching a sample. Once retrieved, +these samples may be used for various purposes such as a key to a stick-table, +but most common usages consist in matching them against predefined constant +data called patterns. + + +7.1. ACL basics +--------------- + +The use of Access Control Lists (ACL) provides a flexible solution to perform +content switching and generally to take decisions based on content extracted +from the request, the response or any environmental status. The principle is +simple : + + - extract a data sample from a stream, table or the environment + - optionally apply some format conversion to the extracted sample + - apply one or multiple pattern matching methods on this sample + - perform actions only when a pattern matches the sample + +The actions generally consist in blocking a request, selecting a backend, or +adding a header. + +In order to define a test, the "acl" keyword is used. The syntax is : + + acl <aclname> <criterion> [flags] [operator] [<value>] ... + +This creates a new ACL <aclname> or completes an existing one with new tests. +Those tests apply to the portion of request/response specified in <criterion> +and may be adjusted with optional flags [flags]. Some criteria also support +an operator which may be specified before the set of values. Optionally some +conversion operators may be applied to the sample, and they will be specified +as a comma-delimited list of keywords just after the first keyword. The values +are of the type supported by the criterion, and are separated by spaces. + +ACL names must be formed from upper and lower case letters, digits, '-' (dash), +'_' (underscore) , '.' (dot) and ':' (colon). ACL names are case-sensitive, +which means that "my_acl" and "My_Acl" are two different ACLs. + +There is no enforced limit to the number of ACLs. The unused ones do not affect +performance, they just consume a small amount of memory. + +The criterion generally is the name of a sample fetch method, or one of its ACL +specific declinations. The default test method is implied by the output type of +this sample fetch method. The ACL declinations can describe alternate matching +methods of a same sample fetch method. The sample fetch methods are the only +ones supporting a conversion. + +Sample fetch methods return data which can be of the following types : + - boolean + - integer (signed or unsigned) + - IPv4 or IPv6 address + - string + - data block + +Converters transform any of these data into any of these. For example, some +converters might convert a string to a lower-case string while other ones +would turn a string to an IPv4 address, or apply a netmask to an IP address. +The resulting sample is of the type of the last converter applied to the list, +which defaults to the type of the sample fetch method. + +Each sample or converter returns data of a specific type, specified with its +keyword in this documentation. When an ACL is declared using a standard sample +fetch method, certain types automatically involved a default matching method +which are summarized in the table below : + + +---------------------+-----------------+ + | Sample or converter | Default | + | output type | matching method | + +---------------------+-----------------+ + | boolean | bool | + +---------------------+-----------------+ + | integer | int | + +---------------------+-----------------+ + | ip | ip | + +---------------------+-----------------+ + | string | str | + +---------------------+-----------------+ + | binary | none, use "-m" | + +---------------------+-----------------+ + +Note that in order to match a binary samples, it is mandatory to specify a +matching method, see below. + +The ACL engine can match these types against patterns of the following types : + - boolean + - integer or integer range + - IP address / network + - string (exact, substring, suffix, prefix, subdir, domain) + - regular expression + - hex block + +The following ACL flags are currently supported : + + -i : ignore case during matching of all subsequent patterns. + -f : load patterns from a file. + -m : use a specific pattern matching method + -n : forbid the DNS resolutions + -M : load the file pointed by -f like a map file. + -u : force the unique id of the ACL + -- : force end of flags. Useful when a string looks like one of the flags. + +The "-f" flag is followed by the name of a file from which all lines will be +read as individual values. It is even possible to pass multiple "-f" arguments +if the patterns are to be loaded from multiple files. Empty lines as well as +lines beginning with a sharp ('#') will be ignored. All leading spaces and tabs +will be stripped. If it is absolutely necessary to insert a valid pattern +beginning with a sharp, just prefix it with a space so that it is not taken for +a comment. Depending on the data type and match method, HAProxy may load the +lines into a binary tree, allowing very fast lookups. This is true for IPv4 and +exact string matching. In this case, duplicates will automatically be removed. + +The "-M" flag allows an ACL to use a map file. If this flag is set, the file is +parsed as two column file. The first column contains the patterns used by the +ACL, and the second column contain the samples. The sample can be used later by +a map. This can be useful in some rare cases where an ACL would just be used to +check for the existence of a pattern in a map before a mapping is applied. + +The "-u" flag forces the unique id of the ACL. This unique id is used with the +socket interface to identify ACL and dynamically change its values. Note that a +file is always identified by its name even if an id is set. + +Also, note that the "-i" flag applies to subsequent entries and not to entries +loaded from files preceding it. For instance : + + acl valid-ua hdr(user-agent) -f exact-ua.lst -i -f generic-ua.lst test + +In this example, each line of "exact-ua.lst" will be exactly matched against +the "user-agent" header of the request. Then each line of "generic-ua" will be +case-insensitively matched. Then the word "test" will be insensitively matched +as well. + +The "-m" flag is used to select a specific pattern matching method on the input +sample. All ACL-specific criteria imply a pattern matching method and generally +do not need this flag. However, this flag is useful with generic sample fetch +methods to describe how they're going to be matched against the patterns. This +is required for sample fetches which return data type for which there is no +obvious matching method (e.g. string or binary). When "-m" is specified and +followed by a pattern matching method name, this method is used instead of the +default one for the criterion. This makes it possible to match contents in ways +that were not initially planned, or with sample fetch methods which return a +string. The matching method also affects the way the patterns are parsed. + +The "-n" flag forbids the dns resolutions. It is used with the load of ip files. +By default, if the parser cannot parse ip address it considers that the parsed +string is maybe a domain name and try dns resolution. The flag "-n" disable this +resolution. It is useful for detecting malformed ip lists. Note that if the DNS +server is not reachable, the HAProxy configuration parsing may last many minutes +waiting for the timeout. During this time no error messages are displayed. The +flag "-n" disable this behavior. Note also that during the runtime, this +function is disabled for the dynamic acl modifications. + +There are some restrictions however. Not all methods can be used with all +sample fetch methods. Also, if "-m" is used in conjunction with "-f", it must +be placed first. The pattern matching method must be one of the following : + + - "found" : only check if the requested sample could be found in the stream, + but do not compare it against any pattern. It is recommended not + to pass any pattern to avoid confusion. This matching method is + particularly useful to detect presence of certain contents such + as headers, cookies, etc... even if they are empty and without + comparing them to anything nor counting them. + + - "bool" : check the value as a boolean. It can only be applied to fetches + which return a boolean or integer value, and takes no pattern. + Value zero or false does not match, all other values do match. + + - "int" : match the value as an integer. It can be used with integer and + boolean samples. Boolean false is integer 0, true is integer 1. + + - "ip" : match the value as an IPv4 or IPv6 address. It is compatible + with IP address samples only, so it is implied and never needed. + + - "bin" : match the contents against a hexadecimal string representing a + binary sequence. This may be used with binary or string samples. + + - "len" : match the sample's length as an integer. This may be used with + binary or string samples. + + - "str" : exact match : match the contents against a string. This may be + used with binary or string samples. + + - "sub" : substring match : check that the contents contain at least one of + the provided string patterns. This may be used with binary or + string samples. + + - "reg" : regex match : match the contents against a list of regular + expressions. This may be used with binary or string samples. + + - "beg" : prefix match : check that the contents begin like the provided + string patterns. This may be used with binary or string samples. + + - "end" : suffix match : check that the contents end like the provided + string patterns. This may be used with binary or string samples. + + - "dir" : subdir match : check that a slash-delimited portion of the + contents exactly matches one of the provided string patterns. + This may be used with binary or string samples. + + - "dom" : domain match : check that a dot-delimited portion of the contents + exactly match one of the provided string patterns. This may be + used with binary or string samples. + +For example, to quickly detect the presence of cookie "JSESSIONID" in an HTTP +request, it is possible to do : + + acl jsess_present req.cook(JSESSIONID) -m found + +In order to apply a regular expression on the 500 first bytes of data in the +buffer, one would use the following acl : + + acl script_tag req.payload(0,500) -m reg -i <script> + +On systems where the regex library is much slower when using "-i", it is +possible to convert the sample to lowercase before matching, like this : + + acl script_tag req.payload(0,500),lower -m reg <script> + +All ACL-specific criteria imply a default matching method. Most often, these +criteria are composed by concatenating the name of the original sample fetch +method and the matching method. For example, "hdr_beg" applies the "beg" match +to samples retrieved using the "hdr" fetch method. This matching method is only +usable when the keyword is used alone, without any converter. In case any such +converter were to be applied after such an ACL keyword, the default matching +method from the ACL keyword is simply ignored since what will matter for the +matching is the output type of the last converter. Since all ACL-specific +criteria rely on a sample fetch method, it is always possible instead to use +the original sample fetch method and the explicit matching method using "-m". + +If an alternate match is specified using "-m" on an ACL-specific criterion, +the matching method is simply applied to the underlying sample fetch method. +For example, all ACLs below are exact equivalent : + + acl short_form hdr_beg(host) www. + acl alternate1 hdr_beg(host) -m beg www. + acl alternate2 hdr_dom(host) -m beg www. + acl alternate3 hdr(host) -m beg www. + + +The table below summarizes the compatibility matrix between sample or converter +types and the pattern types to fetch against. It indicates for each compatible +combination the name of the matching method to be used, surrounded with angle +brackets ">" and "<" when the method is the default one and will work by +default without "-m". + + +-------------------------------------------------+ + | Input sample type | + +----------------------+---------+---------+---------+---------+---------+ + | pattern type | boolean | integer | ip | string | binary | + +----------------------+---------+---------+---------+---------+---------+ + | none (presence only) | found | found | found | found | found | + +----------------------+---------+---------+---------+---------+---------+ + | none (boolean value) |> bool <| bool | | bool | | + +----------------------+---------+---------+---------+---------+---------+ + | integer (value) | int |> int <| int | int | | + +----------------------+---------+---------+---------+---------+---------+ + | integer (length) | len | len | len | len | len | + +----------------------+---------+---------+---------+---------+---------+ + | IP address | | |> ip <| ip | ip | + +----------------------+---------+---------+---------+---------+---------+ + | exact string | str | str | str |> str <| str | + +----------------------+---------+---------+---------+---------+---------+ + | prefix | beg | beg | beg | beg | beg | + +----------------------+---------+---------+---------+---------+---------+ + | suffix | end | end | end | end | end | + +----------------------+---------+---------+---------+---------+---------+ + | substring | sub | sub | sub | sub | sub | + +----------------------+---------+---------+---------+---------+---------+ + | subdir | dir | dir | dir | dir | dir | + +----------------------+---------+---------+---------+---------+---------+ + | domain | dom | dom | dom | dom | dom | + +----------------------+---------+---------+---------+---------+---------+ + | regex | reg | reg | reg | reg | reg | + +----------------------+---------+---------+---------+---------+---------+ + | hex block | | | | bin | bin | + +----------------------+---------+---------+---------+---------+---------+ + + +7.1.1. Matching booleans +------------------------ + +In order to match a boolean, no value is needed and all values are ignored. +Boolean matching is used by default for all fetch methods of type "boolean". +When boolean matching is used, the fetched value is returned as-is, which means +that a boolean "true" will always match and a boolean "false" will never match. + +Boolean matching may also be enforced using "-m bool" on fetch methods which +return an integer value. Then, integer value 0 is converted to the boolean +"false" and all other values are converted to "true". + + +7.1.2. Matching integers +------------------------ + +Integer matching applies by default to integer fetch methods. It can also be +enforced on boolean fetches using "-m int". In this case, "false" is converted +to the integer 0, and "true" is converted to the integer 1. + +Integer matching also supports integer ranges and operators. Note that integer +matching only applies to positive values. A range is a value expressed with a +lower and an upper bound separated with a colon, both of which may be omitted. + +For instance, "1024:65535" is a valid range to represent a range of +unprivileged ports, and "1024:" would also work. "0:1023" is a valid +representation of privileged ports, and ":1023" would also work. + +As a special case, some ACL functions support decimal numbers which are in fact +two integers separated by a dot. This is used with some version checks for +instance. All integer properties apply to those decimal numbers, including +ranges and operators. + +For an easier usage, comparison operators are also supported. Note that using +operators with ranges does not make much sense and is strongly discouraged. +Similarly, it does not make much sense to perform order comparisons with a set +of values. + +Available operators for integer matching are : + + eq : true if the tested value equals at least one value + ge : true if the tested value is greater than or equal to at least one value + gt : true if the tested value is greater than at least one value + le : true if the tested value is less than or equal to at least one value + lt : true if the tested value is less than at least one value + +For instance, the following ACL matches any negative Content-Length header : + + acl negative-length req.hdr_val(content-length) lt 0 + +This one matches SSL versions between 3.0 and 3.1 (inclusive) : + + acl sslv3 req.ssl_ver 3:3.1 + + +7.1.3. Matching strings +----------------------- + +String matching applies to string or binary fetch methods, and exists in 6 +different forms : + + - exact match (-m str) : the extracted string must exactly match the + patterns; + + - substring match (-m sub) : the patterns are looked up inside the + extracted string, and the ACL matches if any of them is found inside; + + - prefix match (-m beg) : the patterns are compared with the beginning of + the extracted string, and the ACL matches if any of them matches. + + - suffix match (-m end) : the patterns are compared with the end of the + extracted string, and the ACL matches if any of them matches. + + - subdir match (-m dir) : the patterns are looked up anywhere inside the + extracted string, delimited with slashes ("/"), the beginning or the end + of the string. The ACL matches if any of them matches. As such, the string + "/images/png/logo/32x32.png", would match "/images", "/images/png", + "images/png", "/png/logo", "logo/32x32.png" or "32x32.png" but not "png" + nor "32x32". + + - domain match (-m dom) : the patterns are looked up anywhere inside the + extracted string, delimited with dots ("."), colons (":"), slashes ("/"), + question marks ("?"), the beginning or the end of the string. This is made + to be used with URLs. Leading and trailing delimiters in the pattern are + ignored. The ACL matches if any of them matches. As such, in the example + string "http://www1.dc-eu.example.com:80/blah", the patterns "http", + "www1", ".www1", "dc-eu", "example", "com", "80", "dc-eu.example", + "blah", ":www1:", "dc-eu.example:80" would match, but not "eu" nor "dc". + Using it to match domain suffixes for filtering or routing is generally + not a good idea, as the routing could easily be fooled by prepending the + matching prefix in front of another domain for example. + +String matching applies to verbatim strings as they are passed, with the +exception of the backslash ("\") which makes it possible to escape some +characters such as the space. If the "-i" flag is passed before the first +string, then the matching will be performed ignoring the case. In order +to match the string "-i", either set it second, or pass the "--" flag +before the first string. Same applies of course to match the string "--". + +Do not use string matches for binary fetches which might contain null bytes +(0x00), as the comparison stops at the occurrence of the first null byte. +Instead, convert the binary fetch to a hex string with the hex converter first. + +Example: + # matches if the string <tag> is present in the binary sample + acl tag_found req.payload(0,0),hex -m sub 3C7461673E + + +7.1.4. Matching regular expressions (regexes) +--------------------------------------------- + +Just like with string matching, regex matching applies to verbatim strings as +they are passed, with the exception of the backslash ("\") which makes it +possible to escape some characters such as the space. If the "-i" flag is +passed before the first regex, then the matching will be performed ignoring +the case. In order to match the string "-i", either set it second, or pass +the "--" flag before the first string. Same principle applies of course to +match the string "--". + + +7.1.5. Matching arbitrary data blocks +------------------------------------- + +It is possible to match some extracted samples against a binary block which may +not safely be represented as a string. For this, the patterns must be passed as +a series of hexadecimal digits in an even number, when the match method is set +to binary. Each sequence of two digits will represent a byte. The hexadecimal +digits may be used upper or lower case. + +Example : + # match "Hello\n" in the input stream (\x48 \x65 \x6c \x6c \x6f \x0a) + acl hello req.payload(0,6) -m bin 48656c6c6f0a + + +7.1.6. Matching IPv4 and IPv6 addresses +--------------------------------------- + +IPv4 addresses values can be specified either as plain addresses or with a +netmask appended, in which case the IPv4 address matches whenever it is +within the network. Plain addresses may also be replaced with a resolvable +host name, but this practice is generally discouraged as it makes it more +difficult to read and debug configurations. If hostnames are used, you should +at least ensure that they are present in /etc/hosts so that the configuration +does not depend on any random DNS match at the moment the configuration is +parsed. + +The dotted IPv4 address notation is supported in both regular as well as the +abbreviated form with all-0-octets omitted: + + +------------------+------------------+------------------+ + | Example 1 | Example 2 | Example 3 | + +------------------+------------------+------------------+ + | 192.168.0.1 | 10.0.0.12 | 127.0.0.1 | + | 192.168.1 | 10.12 | 127.1 | + | 192.168.0.1/22 | 10.0.0.12/8 | 127.0.0.1/8 | + | 192.168.1/22 | 10.12/8 | 127.1/8 | + +------------------+------------------+------------------+ + +Notice that this is different from RFC 4632 CIDR address notation in which +192.168.42/24 would be equivalent to 192.168.42.0/24. + +IPv6 may be entered in their usual form, with or without a netmask appended. +Only bit counts are accepted for IPv6 netmasks. In order to avoid any risk of +trouble with randomly resolved IP addresses, host names are never allowed in +IPv6 patterns. + +HAProxy is also able to match IPv4 addresses with IPv6 addresses in the +following situations : + - tested address is IPv4, pattern address is IPv4, the match applies + in IPv4 using the supplied mask if any. + - tested address is IPv6, pattern address is IPv6, the match applies + in IPv6 using the supplied mask if any. + - tested address is IPv6, pattern address is IPv4, the match applies in IPv4 + using the pattern's mask if the IPv6 address matches with 2002:IPV4::, + ::IPV4 or ::ffff:IPV4, otherwise it fails. + - tested address is IPv4, pattern address is IPv6, the IPv4 address is first + converted to IPv6 by prefixing ::ffff: in front of it, then the match is + applied in IPv6 using the supplied IPv6 mask. + + +7.2. Using ACLs to form conditions +---------------------------------- + +Some actions are only performed upon a valid condition. A condition is a +combination of ACLs with operators. 3 operators are supported : + + - AND (implicit) + - OR (explicit with the "or" keyword or the "||" operator) + - Negation with the exclamation mark ("!") + +A condition is formed as a disjunctive form: + + [!]acl1 [!]acl2 ... [!]acln { or [!]acl1 [!]acl2 ... [!]acln } ... + +Such conditions are generally used after an "if" or "unless" statement, +indicating when the condition will trigger the action. + +For instance, to block HTTP requests to the "*" URL with methods other than +"OPTIONS", as well as POST requests without content-length, and GET or HEAD +requests with a content-length greater than 0, and finally every request which +is not either GET/HEAD/POST/OPTIONS ! + + acl missing_cl req.hdr_cnt(Content-length) eq 0 + http-request deny if HTTP_URL_STAR !METH_OPTIONS || METH_POST missing_cl + http-request deny if METH_GET HTTP_CONTENT + http-request deny unless METH_GET or METH_POST or METH_OPTIONS + +To select a different backend for requests to static contents on the "www" site +and to every request on the "img", "video", "download" and "ftp" hosts : + + acl url_static path_beg /static /images /img /css + acl url_static path_end .gif .png .jpg .css .js + acl host_www hdr_beg(host) -i www + acl host_static hdr_beg(host) -i img. video. download. ftp. + + # now use backend "static" for all static-only hosts, and for static URLs + # of host "www". Use backend "www" for the rest. + use_backend static if host_static or host_www url_static + use_backend www if host_www + +It is also possible to form rules using "anonymous ACLs". Those are unnamed ACL +expressions that are built on the fly without needing to be declared. They must +be enclosed between braces, with a space before and after each brace (because +the braces must be seen as independent words). Example : + + The following rule : + + acl missing_cl req.hdr_cnt(Content-length) eq 0 + http-request deny if METH_POST missing_cl + + Can also be written that way : + + http-request deny if METH_POST { req.hdr_cnt(Content-length) eq 0 } + +It is generally not recommended to use this construct because it's a lot easier +to leave errors in the configuration when written that way. However, for very +simple rules matching only one source IP address for instance, it can make more +sense to use them than to declare ACLs with random names. Another example of +good use is the following : + + With named ACLs : + + acl site_dead nbsrv(dynamic) lt 2 + acl site_dead nbsrv(static) lt 2 + monitor fail if site_dead + + With anonymous ACLs : + + monitor fail if { nbsrv(dynamic) lt 2 } || { nbsrv(static) lt 2 } + +See section 4.2 for detailed help on the "http-request deny" and "use_backend" +keywords. + + +7.3. Fetching samples +--------------------- + +Historically, sample fetch methods were only used to retrieve data to match +against patterns using ACLs. With the arrival of stick-tables, a new class of +sample fetch methods was created, most often sharing the same syntax as their +ACL counterpart. These sample fetch methods are also known as "fetches". As +of now, ACLs and fetches have converged. All ACL fetch methods have been made +available as fetch methods, and ACLs may use any sample fetch method as well. + +This section details all available sample fetch methods and their output type. +Some sample fetch methods have deprecated aliases that are used to maintain +compatibility with existing configurations. They are then explicitly marked as +deprecated and should not be used in new setups. + +The ACL derivatives are also indicated when available, with their respective +matching methods. These ones all have a well defined default pattern matching +method, so it is never necessary (though allowed) to pass the "-m" option to +indicate how the sample will be matched using ACLs. + +As indicated in the sample type versus matching compatibility matrix above, +when using a generic sample fetch method in an ACL, the "-m" option is +mandatory unless the sample type is one of boolean, integer, IPv4 or IPv6. When +the same keyword exists as an ACL keyword and as a standard fetch method, the +ACL engine will automatically pick the ACL-only one by default. + +Some of these keywords support one or multiple mandatory arguments, and one or +multiple optional arguments. These arguments are strongly typed and are checked +when the configuration is parsed so that there is no risk of running with an +incorrect argument (e.g. an unresolved backend name). Fetch function arguments +are passed between parenthesis and are delimited by commas. When an argument +is optional, it will be indicated below between square brackets ('[ ]'). When +all arguments are optional, the parenthesis may be omitted. + +Thus, the syntax of a standard sample fetch method is one of the following : + - name + - name(arg1) + - name(arg1,arg2) + + +7.3.1. Converters +----------------- + +Sample fetch methods may be combined with transformations to be applied on top +of the fetched sample (also called "converters"). These combinations form what +is called "sample expressions" and the result is a "sample". Initially this +was only supported by "stick on" and "stick store-request" directives but this +has now be extended to all places where samples may be used (ACLs, log-format, +unique-id-format, add-header, ...). + +These transformations are enumerated as a series of specific keywords after the +sample fetch method. These keywords may equally be appended immediately after +the fetch keyword's argument, delimited by a comma. These keywords can also +support some arguments (e.g. a netmask) which must be passed in parenthesis. + +A certain category of converters are bitwise and arithmetic operators which +support performing basic operations on integers. Some bitwise operations are +supported (and, or, xor, cpl) and some arithmetic operations are supported +(add, sub, mul, div, mod, neg). Some comparators are provided (odd, even, not, +bool) which make it possible to report a match without having to write an ACL. + +The currently available list of transformation keywords include : + +51d.single(<prop>[,<prop>*]) + Returns values for the properties requested as a string, where values are + separated by the delimiter specified with "51degrees-property-separator". + The device is identified using the User-Agent header passed to the + converter. The function can be passed up to five property names, and if a + property name can't be found, the value "NoData" is returned. + + Example : + # Here the header "X-51D-DeviceTypeMobileTablet" is added to the request, + # containing values for the three properties requested by using the + # User-Agent passed to the converter. + frontend http-in + bind *:8081 + default_backend servers + http-request set-header X-51D-DeviceTypeMobileTablet \ + %[req.fhdr(User-Agent),51d.single(DeviceType,IsMobile,IsTablet)] + +add(<value>) + Adds <value> to the input value of type signed integer, and returns the + result as a signed integer. <value> can be a numeric value or a variable + name. The name of the variable starts with an indication about its scope. The + scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and response) + "req" : the variable is shared only during request processing + "res" : the variable is shared only during response processing + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +add_item(<delim>,[<var>][,<suff>]]) + Concatenates a minimum of 2 and up to 3 fields after the current sample which + is then turned into a string. The first one, <delim>, is a constant string, + that will be appended immediately after the existing sample if an existing + sample is not empty and either the <var> or the <suff> is not empty. The + second one, <var>, is a variable name. The variable will be looked up, its + contents converted to a string, and it will be appended immediately after + the <delim> part. If the variable is not found, nothing is appended. It is + optional and may optionally be followed by a constant string <suff>, however + if <var> is omitted, then <suff> is mandatory. This converter is similar to + the concat converter and can be used to build new variables made of a + succession of other variables but the main difference is that it does the + checks if adding a delimiter makes sense as wouldn't be the case if e.g. the + current sample is empty. That situation would require 2 separate rules using + concat converter where the first rule would have to check if the current + sample string is empty before adding a delimiter. If commas or closing + parenthesis are needed as delimiters, they must be protected by quotes or + backslashes, themselves protected so that they are not stripped by the first + level parser (please see section 2.2 for quoting and escaping). See examples + below. + + Example: + http-request set-var(req.tagged) 'var(req.tagged),add_item(",",req.score1,"(site1)") if src,in_table(site1)' + http-request set-var(req.tagged) 'var(req.tagged),add_item(",",req.score2,"(site2)") if src,in_table(site2)' + http-request set-var(req.tagged) 'var(req.tagged),add_item(",",req.score3,"(site3)") if src,in_table(site3)' + http-request set-header x-tagged %[var(req.tagged)] + + http-request set-var(req.tagged) 'var(req.tagged),add_item(",",req.score1),add_item(",",req.score2)' + http-request set-var(req.tagged) 'var(req.tagged),add_item(",",,(site1))' if src,in_table(site1) + +aes_gcm_dec(<bits>,<nonce>,<key>,<aead_tag>) + Decrypts the raw byte input using the AES128-GCM, AES192-GCM or + AES256-GCM algorithm, depending on the <bits> parameter. All other parameters + need to be base64 encoded and the returned result is in raw byte format. + If the <aead_tag> validation fails, the converter doesn't return any data. + The <nonce>, <key> and <aead_tag> can either be strings or variables. This + converter requires at least OpenSSL 1.0.1. + + Example: + http-response set-header X-Decrypted-Text %[var(txn.enc),\ + aes_gcm_dec(128,txn.nonce,Zm9vb2Zvb29mb29wZm9vbw==,txn.aead_tag)] + +and(<value>) + Performs a bitwise "AND" between <value> and the input value of type signed + integer, and returns the result as an signed integer. <value> can be a + numeric value or a variable name. The name of the variable starts with an + indication about its scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and response) + "req" : the variable is shared only during request processing + "res" : the variable is shared only during response processing + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +b64dec + Converts (decodes) a base64 encoded input string to its binary + representation. It performs the inverse operation of base64(). + For base64url("URL and Filename Safe Alphabet" (RFC 4648)) variant + see "ub64dec". + +base64 + Converts a binary input sample to a base64 string. It is used to log or + transfer binary content in a way that can be reliably transferred (e.g. + an SSL ID can be copied in a header). For base64url("URL and Filename + Safe Alphabet" (RFC 4648)) variant see "ub64enc". + +be2dec(<separator>,<chunk_size>,[<truncate>]) + Converts big-endian binary input sample to a string containing an unsigned + integer number per <chunk_size> input bytes. <separator> is put every + <chunk_size> binary input bytes if specified. <truncate> flag indicates + whatever binary input is truncated at <chunk_size> boundaries. <chunk_size> + maximum value is limited by the size of long long int (8 bytes). + + Example: + bin(01020304050607),be2dec(:,2) # 258:772:1286:7 + bin(01020304050607),be2dec(-,2,1) # 258-772-1286 + bin(01020304050607),be2dec(,2,1) # 2587721286 + bin(7f000001),be2dec(.,1) # 127.0.0.1 + +be2hex([<separator>],[<chunk_size>],[<truncate>]) + Converts big-endian binary input sample to a hex string containing two hex + digits per input byte. It is used to log or transfer hex dumps of some + binary input data in a way that can be reliably transferred (e.g. an SSL ID + can be copied in a header). <separator> is put every <chunk_size> binary + input bytes if specified. <truncate> flag indicates whatever binary input is + truncated at <chunk_size> boundaries. + + Example: + bin(01020304050607),be2hex # 01020304050607 + bin(01020304050607),be2hex(:,2) # 0102:0304:0506:07 + bin(01020304050607),be2hex(--,2,1) # 0102--0304--0506 + bin(0102030405060708),be2hex(,3,1) # 010203040506 + +bool + Returns a boolean TRUE if the input value of type signed integer is + non-null, otherwise returns FALSE. Used in conjunction with and(), it can be + used to report true/false for bit testing on input values (e.g. verify the + presence of a flag). + +bytes(<offset>[,<length>]) + Extracts some bytes from an input binary sample. The result is a binary + sample starting at an offset (in bytes) of the original sample and + optionally truncated at the given length. + +concat([<start>],[<var>],[<end>]) + Concatenates up to 3 fields after the current sample which is then turned to + a string. The first one, <start>, is a constant string, that will be appended + immediately after the existing sample. It may be omitted if not used. The + second one, <var>, is a variable name. The variable will be looked up, its + contents converted to a string, and it will be appended immediately after the + <first> part. If the variable is not found, nothing is appended. It may be + omitted as well. The third field, <end> is a constant string that will be + appended after the variable. It may also be omitted. Together, these elements + allow to concatenate variables with delimiters to an existing set of + variables. This can be used to build new variables made of a succession of + other variables, such as colon-delimited values. If commas or closing + parenthesis are needed as delimiters, they must be protected by quotes or + backslashes, themselves protected so that they are not stripped by the first + level parser. This is often used to build composite variables from other + ones, but sometimes using a format string with multiple fields may be more + convenient. See examples below. + + Example: + tcp-request session set-var(sess.src) src + tcp-request session set-var(sess.dn) ssl_c_s_dn + tcp-request session set-var(txn.sig) str(),concat(<ip=,sess.ip,>),concat(<dn=,sess.dn,>) + tcp-request session set-var(txn.ipport) "str(),concat('addr=(',sess.ip),concat(',',sess.port,')')" + tcp-request session set-var-fmt(txn.ipport) "addr=(%[sess.ip],%[sess.port])" ## does the same + http-request set-header x-hap-sig %[var(txn.sig)] + +cpl + Takes the input value of type signed integer, applies a ones-complement + (flips all bits) and returns the result as an signed integer. + +crc32([<avalanche>]) + Hashes a binary input sample into an unsigned 32-bit quantity using the CRC32 + hash function. Optionally, it is possible to apply a full avalanche hash + function to the output if the optional <avalanche> argument equals 1. This + converter uses the same functions as used by the various hash-based load + balancing algorithms, so it will provide exactly the same results. It is + provided for compatibility with other software which want a CRC32 to be + computed on some input keys, so it follows the most common implementation as + found in Ethernet, Gzip, PNG, etc... It is slower than the other algorithms + but may provide a better or at least less predictable distribution. It must + not be used for security purposes as a 32-bit hash is trivial to break. See + also "djb2", "sdbm", "wt6", "crc32c" and the "hash-type" directive. + +crc32c([<avalanche>]) + Hashes a binary input sample into an unsigned 32-bit quantity using the CRC32C + hash function. Optionally, it is possible to apply a full avalanche hash + function to the output if the optional <avalanche> argument equals 1. This + converter uses the same functions as described in RFC4960, Appendix B [8]. + It is provided for compatibility with other software which want a CRC32C to be + computed on some input keys. It is slower than the other algorithms and it must + not be used for security purposes as a 32-bit hash is trivial to break. See + also "djb2", "sdbm", "wt6", "crc32" and the "hash-type" directive. + +cut_crlf + Cuts the string representation of the input sample on the first carriage + return ('\r') or newline ('\n') character found. Only the string length is + updated. + +da-csv-conv(<prop>[,<prop>*]) + Asks the DeviceAtlas converter to identify the User Agent string passed on + input, and to emit a string made of the concatenation of the properties + enumerated in argument, delimited by the separator defined by the global + keyword "deviceatlas-property-separator", or by default the pipe character + ('|'). There's a limit of 12 different properties imposed by the HAProxy + configuration language. + + Example: + frontend www + bind *:8881 + default_backend servers + http-request set-header X-DeviceAtlas-Data %[req.fhdr(User-Agent),da-csv(primaryHardwareType,osName,osVersion,browserName,browserVersion,browserRenderingEngine)] + +debug([<prefix][,<destination>]) + This converter is used as debug tool. It takes a capture of the input sample + and sends it to event sink <destination>, which may designate a ring buffer + such as "buf0", as well as "stdout", or "stderr". Available sinks may be + checked at run time by issuing "show events" on the CLI. When not specified, + the output will be "buf0", which may be consulted via the CLI's "show events" + command. An optional prefix <prefix> may be passed to help distinguish + outputs from multiple expressions. It will then appear before the colon in + the output message. The input sample is passed as-is on the output, so that + it is safe to insert the debug converter anywhere in a chain, even with non- + printable sample types. + + Example: + tcp-request connection track-sc0 src,debug(track-sc) + +digest(<algorithm>) + Converts a binary input sample to a message digest. The result is a binary + sample. The <algorithm> must be an OpenSSL message digest name (e.g. sha256). + + Please note that this converter is only available when HAProxy has been + compiled with USE_OPENSSL. + +div(<value>) + Divides the input value of type signed integer by <value>, and returns the + result as an signed integer. If <value> is null, the largest unsigned + integer is returned (typically 2^63-1). <value> can be a numeric value or a + variable name. The name of the variable starts with an indication about its + scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and response) + "req" : the variable is shared only during request processing + "res" : the variable is shared only during response processing + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +djb2([<avalanche>]) + Hashes a binary input sample into an unsigned 32-bit quantity using the DJB2 + hash function. Optionally, it is possible to apply a full avalanche hash + function to the output if the optional <avalanche> argument equals 1. This + converter uses the same functions as used by the various hash-based load + balancing algorithms, so it will provide exactly the same results. It is + mostly intended for debugging, but can be used as a stick-table entry to + collect rough statistics. It must not be used for security purposes as a + 32-bit hash is trivial to break. See also "crc32", "sdbm", "wt6", "crc32c", + and the "hash-type" directive. + +even + Returns a boolean TRUE if the input value of type signed integer is even + otherwise returns FALSE. It is functionally equivalent to "not,and(1),bool". + +field(<index>,<delimiters>[,<count>]) + Extracts the substring at the given index counting from the beginning + (positive index) or from the end (negative index) considering given delimiters + from an input string. Indexes start at 1 or -1 and delimiters are a string + formatted list of chars. Optionally you can specify <count> of fields to + extract (default: 1). Value of 0 indicates extraction of all remaining + fields. + + Example : + str(f1_f2_f3__f5),field(5,_) # f5 + str(f1_f2_f3__f5),field(2,_,0) # f2_f3__f5 + str(f1_f2_f3__f5),field(2,_,2) # f2_f3 + str(f1_f2_f3__f5),field(-2,_,3) # f2_f3_ + str(f1_f2_f3__f5),field(-3,_,0) # f1_f2_f3 + +fix_is_valid + Parses a binary payload and performs sanity checks regarding FIX (Financial + Information eXchange): + + - checks that all tag IDs and values are not empty and the tags IDs are well + numeric + - checks the BeginString tag is the first tag with a valid FIX version + - checks the BodyLength tag is the second one with the right body length + - checks the MsgType tag is the third tag. + - checks that last tag in the message is the CheckSum tag with a valid + checksum + + Due to current HAProxy design, only the first message sent by the client and + the server can be parsed. + + This converter returns a boolean, true if the payload contains a valid FIX + message, false if not. + + See also the fix_tag_value converter. + + Example: + tcp-request inspect-delay 10s + tcp-request content reject unless { req.payload(0,0),fix_is_valid } + +fix_tag_value(<tag>) + Parses a FIX (Financial Information eXchange) message and extracts the value + from the tag <tag>. <tag> can be a string or an integer pointing to the + desired tag. Any integer value is accepted, but only the following strings + are translated into their integer equivalent: BeginString, BodyLength, + MsgType, SenderCompID, TargetCompID, CheckSum. More tag names can be easily + added. + + Due to current HAProxy design, only the first message sent by the client and + the server can be parsed. No message validation is performed by this + converter. It is highly recommended to validate the message first using + fix_is_valid converter. + + See also the fix_is_valid converter. + + Example: + tcp-request inspect-delay 10s + tcp-request content reject unless { req.payload(0,0),fix_is_valid } + # MsgType tag ID is 35, so both lines below will return the same content + tcp-request content set-var(txn.foo) req.payload(0,0),fix_tag_value(35) + tcp-request content set-var(txn.bar) req.payload(0,0),fix_tag_value(MsgType) + +hex + Converts a binary input sample to a hex string containing two hex digits per + input byte. It is used to log or transfer hex dumps of some binary input data + in a way that can be reliably transferred (e.g. an SSL ID can be copied in a + header). + +hex2i + Converts a hex string containing two hex digits per input byte to an + integer. If the input value cannot be converted, then zero is returned. + +htonl + Converts the input integer value to its 32-bit binary representation in the + network byte order. Because sample fetches own signed 64-bit integer, when + this converter is used, the input integer value is first casted to an + unsigned 32-bit integer. + +hmac(<algorithm>,<key>) + Converts a binary input sample to a message authentication code with the given + key. The result is a binary sample. The <algorithm> must be one of the + registered OpenSSL message digest names (e.g. sha256). The <key> parameter must + be base64 encoded and can either be a string or a variable. + + Please note that this converter is only available when HAProxy has been + compiled with USE_OPENSSL. + +host_only + Converts a string which contains a Host header value and removes its port. + The input must respect the format of the host header value + (rfc9110#section-7.2). It will support that kind of input: hostname, + hostname:80, 127.0.0.1, 127.0.0.1:80, [::1], [::1]:80. + + This converter also sets the string in lowercase. + + See also: "port_only" converter which will return the port. + +http_date([<offset],[<unit>]) + Converts an integer supposed to contain a date since epoch to a string + representing this date in a format suitable for use in HTTP header fields. If + an offset value is specified, then it is added to the date before the + conversion is operated. This is particularly useful to emit Date header fields, + Expires values in responses when combined with a positive offset, or + Last-Modified values when the offset is negative. + If a unit value is specified, then consider the timestamp as either + "s" for seconds (default behavior), "ms" for milliseconds, or "us" for + microseconds since epoch. Offset is assumed to have the same unit as + input timestamp. + +iif(<true>,<false>) + Returns the <true> string if the input value is true. Returns the <false> + string otherwise. + + Example: + http-request set-header x-forwarded-proto %[ssl_fc,iif(https,http)] + +in_table(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, a boolean false + is returned. Otherwise a boolean true is returned. This can be used to verify + the presence of a certain key in a table tracking some elements (e.g. whether + or not a source IP address or an Authorization header was already seen). + +ipmask(<mask4>,[<mask6>]) + Apply a mask to an IP address, and use the result for lookups and storage. + This can be used to make all hosts within a certain mask to share the same + table entries and as such use the same server. The mask4 can be passed in + dotted form (e.g. 255.255.255.0) or in CIDR form (e.g. 24). The mask6 can + be passed in quadruplet form (e.g. ffff:ffff::) or in CIDR form (e.g. 64). + If no mask6 is given IPv6 addresses will fail to convert for backwards + compatibility reasons. + +json([<input-code>]) + Escapes the input string and produces an ASCII output string ready to use as a + JSON string. The converter tries to decode the input string according to the + <input-code> parameter. It can be "ascii", "utf8", "utf8s", "utf8p" or + "utf8ps". The "ascii" decoder never fails. The "utf8" decoder detects 3 types + of errors: + - bad UTF-8 sequence (lone continuation byte, bad number of continuation + bytes, ...) + - invalid range (the decoded value is within a UTF-8 prohibited range), + - code overlong (the value is encoded with more bytes than necessary). + + The UTF-8 JSON encoding can produce a "too long value" error when the UTF-8 + character is greater than 0xffff because the JSON string escape specification + only authorizes 4 hex digits for the value encoding. The UTF-8 decoder exists + in 4 variants designated by a combination of two suffix letters : "p" for + "permissive" and "s" for "silently ignore". The behaviors of the decoders + are : + - "ascii" : never fails; + - "utf8" : fails on any detected errors; + - "utf8s" : never fails, but removes characters corresponding to errors; + - "utf8p" : accepts and fixes the overlong errors, but fails on any other + error; + - "utf8ps" : never fails, accepts and fixes the overlong errors, but removes + characters corresponding to the other errors. + + This converter is particularly useful for building properly escaped JSON for + logging to servers which consume JSON-formatted traffic logs. + + Example: + capture request header Host len 15 + capture request header user-agent len 150 + log-format '{"ip":"%[src]","user-agent":"%[capture.req.hdr(1),json(utf8s)]"}' + + Input request from client 127.0.0.1: + GET / HTTP/1.0 + User-Agent: Very "Ugly" UA 1/2 + + Output log: + {"ip":"127.0.0.1","user-agent":"Very \"Ugly\" UA 1\/2"} + +json_query(<json_path>,[<output_type>]) + The json_query converter supports the JSON types string, boolean and + number. Floating point numbers will be returned as a string. By + specifying the output_type 'int' the value will be converted to an + Integer. If conversion is not possible the json_query converter fails. + + <json_path> must be a valid JSON Path string as defined in + https://datatracker.ietf.org/doc/draft-ietf-jsonpath-base/ + + Example: + # get a integer value from the request body + # "{"integer":4}" => 5 + http-request set-var(txn.pay_int) req.body,json_query('$.integer','int'),add(1) + + # get a key with '.' in the name + # {"my.key":"myvalue"} => myvalue + http-request set-var(txn.pay_mykey) req.body,json_query('$.my\\.key') + + # {"boolean-false":false} => 0 + http-request set-var(txn.pay_boolean_false) req.body,json_query('$.boolean-false') + + # get the value of the key 'iss' from a JWT Bearer token + http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec,json_query('$.iss') + +jwt_header_query([<json_path>],[<output_type>]) + When given a JSON Web Token (JWT) in input, either returns the decoded header + part of the token (the first base64-url encoded part of the JWT) if no + parameter is given, or performs a json_query on the decoded header part of + the token. See "json_query" converter for details about the accepted + json_path and output_type parameters. + + Please note that this converter is only available when HAProxy has been + compiled with USE_OPENSSL. + +jwt_payload_query([<json_path>],[<output_type>]) + When given a JSON Web Token (JWT) in input, either returns the decoded + payload part of the token (the second base64-url encoded part of the JWT) if + no parameter is given, or performs a json_query on the decoded payload part + of the token. See "json_query" converter for details about the accepted + json_path and output_type parameters. + + Please note that this converter is only available when HAProxy has been + compiled with USE_OPENSSL. + +jwt_verify(<alg>,<key>) + Performs a signature verification for the JSON Web Token (JWT) given in input + by using the <alg> algorithm and the <key> parameter, which should either + hold a secret or a path to a public certificate. Returns 1 in case of + verification success, 0 in case of verification error and a strictly negative + value for any other error. Because of all those non-null error return values, + the result of this converter should never be converted to a boolean. See + below for a full list of the possible return values. + + For now, only JWS tokens using the Compact Serialization format can be + processed (three dot-separated base64-url encoded strings). Among the + accepted algorithms for a JWS (see section 3.1 of RFC7518), the PSXXX ones + are not managed yet. + + If the used algorithm is of the HMAC family, <key> should be the secret used + in the HMAC signature calculation. Otherwise, <key> should be the path to the + public certificate that can be used to validate the token's signature. All + the certificates that might be used to verify JWTs must be known during init + in order to be added into a dedicated certificate cache so that no disk + access is required during runtime. For this reason, any used certificate must + be mentioned explicitly at least once in a jwt_verify call. Passing an + intermediate variable as second parameter is then not advised. + + This converter only verifies the signature of the token and does not perform + a full JWT validation as specified in section 7.2 of RFC7519. We do not + ensure that the header and payload contents are fully valid JSON's once + decoded for instance, and no checks are performed regarding their respective + contents. + + The possible return values are the following : + + +----+----------------------------------------------------------------------+ + | ID | message | + +----+----------------------------------------------------------------------+ + | 0 | "Verification failure" | + | 1 | "Verification success" | + | -1 | "Unknown algorithm (not mentioned in RFC7518)" | + | -2 | "Unmanaged algorithm (PSXXX algorithm family)" | + | -3 | "Invalid token" | + | -4 | "Out of memory" | + | -5 | "Unknown certificate" | + +----+----------------------------------------------------------------------+ + + Please note that this converter is only available when HAProxy has been + compiled with USE_OPENSSL. + + Example: + # Get a JWT from the authorization header, extract the "alg" field of its + # JOSE header and use a public certificate to verify a signature + http-request set-var(txn.bearer) http_auth_bearer + http-request set-var(txn.jwt_alg) var(txn.bearer),jwt_header_query('$.alg') + http-request deny unless { var(txn.jwt_alg) "RS256" } + http-request deny unless { var(txn.bearer),jwt_verify(txn.jwt_alg,"/path/to/crt.pem") 1 } + +language(<value>[,<default>]) + Returns the value with the highest q-factor from a list as extracted from the + "accept-language" header using "req.fhdr". Values with no q-factor have a + q-factor of 1. Values with a q-factor of 0 are dropped. Only values which + belong to the list of semi-colon delimited <values> will be considered. The + argument <value> syntax is "lang[;lang[;lang[;...]]]". If no value matches the + given list and a default value is provided, it is returned. Note that language + names may have a variant after a dash ('-'). If this variant is present in the + list, it will be matched, but if it is not, only the base language is checked. + The match is case-sensitive, and the output string is always one of those + provided in arguments. The ordering of arguments is meaningless, only the + ordering of the values in the request counts, as the first value among + multiple sharing the same q-factor is used. + + Example : + + # this configuration switches to the backend matching a + # given language based on the request : + + acl es req.fhdr(accept-language),language(es;fr;en) -m str es + acl fr req.fhdr(accept-language),language(es;fr;en) -m str fr + acl en req.fhdr(accept-language),language(es;fr;en) -m str en + use_backend spanish if es + use_backend french if fr + use_backend english if en + default_backend choose_your_language + +length + Get the length of the string. This can only be placed after a string + sample fetch function or after a transformation keyword returning a string + type. The result is of type integer. + +lower + Convert a string sample to lower case. This can only be placed after a string + sample fetch function or after a transformation keyword returning a string + type. The result is of type string. + +ltime(<format>[,<offset>]) + Converts an integer supposed to contain a date since epoch to a string + representing this date in local time using a format defined by the <format> + string using strftime(3). The purpose is to allow any date format to be used + in logs. An optional <offset> in seconds may be applied to the input date + (positive or negative). See the strftime() man page for the format supported + by your operating system. See also the utime converter. + + Example : + + # Emit two colons, one with the local time and another with ip:port + # e.g. 20140710162350 127.0.0.1:57325 + log-format %[date,ltime(%Y%m%d%H%M%S)]\ %ci:%cp + +ltrim(<chars>) + Skips any characters from <chars> from the beginning of the string + representation of the input sample. + +map(<map_file>[,<default_value>]) +map_<match_type>(<map_file>[,<default_value>]) +map_<match_type>_<output_type>(<map_file>[,<default_value>]) + Search the input value from <map_file> using the <match_type> matching method, + and return the associated value converted to the type <output_type>. If the + input value cannot be found in the <map_file>, the converter returns the + <default_value>. If the <default_value> is not set, the converter fails and + acts as if no input value could be fetched. If the <match_type> is not set, it + defaults to "str". Likewise, if the <output_type> is not set, it defaults to + "str". For convenience, the "map" keyword is an alias for "map_str" and maps a + string to another string. + + It is important to avoid overlapping between the keys : IP addresses and + strings are stored in trees, so the first of the finest match will be used. + Other keys are stored in lists, so the first matching occurrence will be used. + + The following array contains the list of all map functions available sorted by + input type, match type and output type. + + input type | match method | output type str | output type int | output type ip + -----------+--------------+-----------------+-----------------+--------------- + str | str | map_str | map_str_int | map_str_ip + -----------+--------------+-----------------+-----------------+--------------- + str | beg | map_beg | map_beg_int | map_end_ip + -----------+--------------+-----------------+-----------------+--------------- + str | sub | map_sub | map_sub_int | map_sub_ip + -----------+--------------+-----------------+-----------------+--------------- + str | dir | map_dir | map_dir_int | map_dir_ip + -----------+--------------+-----------------+-----------------+--------------- + str | dom | map_dom | map_dom_int | map_dom_ip + -----------+--------------+-----------------+-----------------+--------------- + str | end | map_end | map_end_int | map_end_ip + -----------+--------------+-----------------+-----------------+--------------- + str | reg | map_reg | map_reg_int | map_reg_ip + -----------+--------------+-----------------+-----------------+--------------- + str | reg | map_regm | map_reg_int | map_reg_ip + -----------+--------------+-----------------+-----------------+--------------- + int | int | map_int | map_int_int | map_int_ip + -----------+--------------+-----------------+-----------------+--------------- + ip | ip | map_ip | map_ip_int | map_ip_ip + -----------+--------------+-----------------+-----------------+--------------- + + The special map called "map_regm" expect matching zone in the regular + expression and modify the output replacing back reference (like "\1") by + the corresponding match text. + + The file contains one key + value per line. Lines which start with '#' are + ignored, just like empty lines. Leading tabs and spaces are stripped. The key + is then the first "word" (series of non-space/tabs characters), and the value + is what follows this series of space/tab till the end of the line excluding + trailing spaces/tabs. + + Example : + + # this is a comment and is ignored + 2.22.246.0/23 United Kingdom \n + <-><-----------><--><------------><----> + | | | | `- trailing spaces ignored + | | | `---------- value + | | `-------------------- middle spaces ignored + | `---------------------------- key + `------------------------------------ leading spaces ignored + +mod(<value>) + Divides the input value of type signed integer by <value>, and returns the + remainder as an signed integer. If <value> is null, then zero is returned. + <value> can be a numeric value or a variable name. The name of the variable + starts with an indication about its scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and response) + "req" : the variable is shared only during request processing + "res" : the variable is shared only during response processing + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +mqtt_field_value(<packettype>,<fieldname_or_property_ID>) + Returns value of <fieldname> found in input MQTT payload of type + <packettype>. + <packettype> can be either a string (case insensitive matching) or a numeric + value corresponding to the type of packet we're supposed to extract data + from. + Supported string and integers can be found here: + https://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718021 + https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901022 + + <fieldname> depends on <packettype> and can be any of the following below. + (note that <fieldname> matching is case insensitive). + <property id> can only be found in MQTT v5.0 streams. check this table: + https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901029 + + - CONNECT (or 1): flags, protocol_name, protocol_version, client_identifier, + will_topic, will_payload, username, password, keepalive + OR any property ID as a numeric value (for MQTT v5.0 + packets only): + 17: Session Expiry Interval + 33: Receive Maximum + 39: Maximum Packet Size + 34: Topic Alias Maximum + 25: Request Response Information + 23: Request Problem Information + 21: Authentication Method + 22: Authentication Data + 18: Will Delay Interval + 1: Payload Format Indicator + 2: Message Expiry Interval + 3: Content Type + 8: Response Topic + 9: Correlation Data + Not supported yet: + 38: User Property + + - CONNACK (or 2): flags, protocol_version, reason_code + OR any property ID as a numeric value (for MQTT v5.0 + packets only): + 17: Session Expiry Interval + 33: Receive Maximum + 36: Maximum QoS + 37: Retain Available + 39: Maximum Packet Size + 18: Assigned Client Identifier + 34: Topic Alias Maximum + 31: Reason String + 40; Wildcard Subscription Available + 41: Subscription Identifiers Available + 42: Shared Subscription Available + 19: Server Keep Alive + 26: Response Information + 28: Server Reference + 21: Authentication Method + 22: Authentication Data + Not supported yet: + 38: User Property + + Due to current HAProxy design, only the first message sent by the client and + the server can be parsed. Thus this converter can extract data only from + CONNECT and CONNACK packet types. CONNECT is the first message sent by the + client and CONNACK is the first response sent by the server. + + Example: + + acl data_in_buffer req.len ge 4 + tcp-request content set-var(txn.username) \ + req.payload(0,0),mqtt_field_value(connect,protocol_name) \ + if data_in_buffer + # do the same as above + tcp-request content set-var(txn.username) \ + req.payload(0,0),mqtt_field_value(1,protocol_name) \ + if data_in_buffer + +mqtt_is_valid + Checks that the binary input is a valid MQTT packet. It returns a boolean. + + Due to current HAProxy design, only the first message sent by the client and + the server can be parsed. Thus this converter can extract data only from + CONNECT and CONNACK packet types. CONNECT is the first message sent by the + client and CONNACK is the first response sent by the server. + + Only MQTT 3.1, 3.1.1 and 5.0 are supported. + + Example: + + acl data_in_buffer req.len ge 4 + tcp-request content reject unless { req.payload(0,0),mqtt_is_valid } + +mul(<value>) + Multiplies the input value of type signed integer by <value>, and returns + the product as an signed integer. In case of overflow, the largest possible + value for the sign is returned so that the operation doesn't wrap around. + <value> can be a numeric value or a variable name. The name of the variable + starts with an indication about its scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and response) + "req" : the variable is shared only during request processing + "res" : the variable is shared only during response processing + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +nbsrv + Takes an input value of type string, interprets it as a backend name and + returns the number of usable servers in that backend. Can be used in places + where we want to look up a backend from a dynamic name, like a result of a + map lookup. + +neg + Takes the input value of type signed integer, computes the opposite value, + and returns the remainder as an signed integer. 0 is identity. This operator + is provided for reversed subtracts : in order to subtract the input from a + constant, simply perform a "neg,add(value)". + +not + Returns a boolean FALSE if the input value of type signed integer is + non-null, otherwise returns TRUE. Used in conjunction with and(), it can be + used to report true/false for bit testing on input values (e.g. verify the + absence of a flag). + +odd + Returns a boolean TRUE if the input value of type signed integer is odd + otherwise returns FALSE. It is functionally equivalent to "and(1),bool". + +or(<value>) + Performs a bitwise "OR" between <value> and the input value of type signed + integer, and returns the result as an signed integer. <value> can be a + numeric value or a variable name. The name of the variable starts with an + indication about its scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and response) + "req" : the variable is shared only during request processing + "res" : the variable is shared only during response processing + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +port_only + Converts a string which contains a Host header value into an integer by + returning its port. + The input must respect the format of the host header value + (rfc9110#section-7.2). It will support that kind of input: hostname, + hostname:80, 127.0.0.1, 127.0.0.1:80, [::1], [::1]:80. + + If no port were provided in the input, it will return 0. + + See also: "host_only" converter which will return the host. + +protobuf(<field_number>,[<field_type>]) + This extracts the protocol buffers message field in raw mode of an input binary + sample representation of a protocol buffer message with <field_number> as field + number (dotted notation) if <field_type> is not present, or as an integer sample + if this field is present (see also "ungrpc" below). + The list of the authorized types is the following one: "int32", "int64", "uint32", + "uint64", "sint32", "sint64", "bool", "enum" for the "varint" wire type 0 + "fixed64", "sfixed64", "double" for the 64bit wire type 1, "fixed32", "sfixed32", + "float" for the wire type 5. Note that "string" is considered as a length-delimited + type, so it does not require any <field_type> argument to be extracted. + More information may be found here about the protocol buffers message field types: + https://developers.google.com/protocol-buffers/docs/encoding + +regsub(<regex>,<subst>[,<flags>]) + Applies a regex-based substitution to the input string. It does the same + operation as the well-known "sed" utility with "s/<regex>/<subst>/". By + default it will replace in the input string the first occurrence of the + largest part matching the regular expression <regex> with the substitution + string <subst>. It is possible to replace all occurrences instead by adding + the flag "g" in the third argument <flags>. It is also possible to make the + regex case insensitive by adding the flag "i" in <flags>. Since <flags> is a + string, it is made up from the concatenation of all desired flags. Thus if + both "i" and "g" are desired, using "gi" or "ig" will have the same effect. + The first use of this converter is to replace certain characters or sequence + of characters with other ones. + + It is highly recommended to enclose the regex part using protected quotes to + improve clarity and never have a closing parenthesis from the regex mixed up + with the parenthesis from the function. Just like in Bourne shell, the first + level of quotes is processed when delimiting word groups on the line, a + second level is usable for argument. It is recommended to use single quotes + outside since these ones do not try to resolve backslashes nor dollar signs. + + Examples: + + # de-duplicate "/" in header "x-path". + # input: x-path: /////a///b/c/xzxyz/ + # output: x-path: /a/b/c/xzxyz/ + http-request set-header x-path "%[hdr(x-path),regsub('/+','/','g')]" + + # copy query string to x-query and drop all leading '?', ';' and '&' + http-request set-header x-query "%[query,regsub([?;&]*,'')]" + + # capture groups and backreferences + # both lines do the same. + http-request redirect location %[url,'regsub("(foo|bar)([0-9]+)?","\2\1",i)'] + http-request redirect location %[url,regsub(\"(foo|bar)([0-9]+)?\",\"\2\1\",i)] + +capture-req(<id>) + Capture the string entry in the request slot <id> and returns the entry as + is. If the slot doesn't exist, the capture fails silently. + + See also: "declare capture", "http-request capture", + "http-response capture", "capture.req.hdr" and + "capture.res.hdr" (sample fetches). + +capture-res(<id>) + Capture the string entry in the response slot <id> and returns the entry as + is. If the slot doesn't exist, the capture fails silently. + + See also: "declare capture", "http-request capture", + "http-response capture", "capture.req.hdr" and + "capture.res.hdr" (sample fetches). + +rtrim(<chars>) + Skips any characters from <chars> from the end of the string representation + of the input sample. + +sdbm([<avalanche>]) + Hashes a binary input sample into an unsigned 32-bit quantity using the SDBM + hash function. Optionally, it is possible to apply a full avalanche hash + function to the output if the optional <avalanche> argument equals 1. This + converter uses the same functions as used by the various hash-based load + balancing algorithms, so it will provide exactly the same results. It is + mostly intended for debugging, but can be used as a stick-table entry to + collect rough statistics. It must not be used for security purposes as a + 32-bit hash is trivial to break. See also "crc32", "djb2", "wt6", "crc32c", + and the "hash-type" directive. + +secure_memcmp(<var>) + Compares the contents of <var> with the input value. Both values are treated + as a binary string. Returns a boolean indicating whether both binary strings + match. + + If both binary strings have the same length then the comparison will be + performed in constant time. + + Please note that this converter is only available when HAProxy has been + compiled with USE_OPENSSL. + + Example : + + http-request set-var(txn.token) hdr(token) + # Check whether the token sent by the client matches the secret token + # value, without leaking the contents using a timing attack. + acl token_given str(my_secret_token),secure_memcmp(txn.token) + +set-var(<var>[,<cond>...]) + Sets a variable with the input content and returns the content on the output + as-is if all of the specified conditions are true (see below for a list of + possible conditions). The variable keeps the value and the associated input + type. The name of the variable starts with an indication about its scope. The + scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and + response), + "req" : the variable is shared only during request processing, + "res" : the variable is shared only during response processing. + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + + You can pass at most four conditions to the converter among the following + possible conditions : + - "ifexists"/"ifnotexists": + Checks if the variable already existed before the current set-var call. + A variable is usually created through a successful set-var call. + Note that variables of scope "proc" are created during configuration + parsing so the "ifexists" condition will always be true for them. + - "ifempty"/"ifnotempty": + Checks if the input is empty or not. + Scalar types are never empty so the ifempty condition will be false for + them regardless of the input's contents (integers, booleans, IPs ...). + - "ifset"/"ifnotset": + Checks if the variable was previously set or not, or if unset-var was + called on the variable. + A variable that does not exist yet is considered as not set. A "proc" + variable can exist while not being set since they are created during + configuration parsing. + - "ifgt"/"iflt": + Checks if the content of the variable is "greater than" or "less than" + the input. This check can only be performed if both the input and + the variable are of type integer. Otherwise, the check is considered as + true by default. + +sha1 + Converts a binary input sample to a SHA-1 digest. The result is a binary + sample with length of 20 bytes. + +sha2([<bits>]) + Converts a binary input sample to a digest in the SHA-2 family. The result + is a binary sample with length of <bits>/8 bytes. + + Valid values for <bits> are 224, 256, 384, 512, each corresponding to + SHA-<bits>. The default value is 256. + + Please note that this converter is only available when HAProxy has been + compiled with USE_OPENSSL. + +srv_queue + Takes an input value of type string, either a server name or <backend>/<server> + format and returns the number of queued sessions on that server. Can be used + in places where we want to look up queued sessions from a dynamic name, like a + cookie value (e.g. req.cook(SRVID),srv_queue) and then make a decision to break + persistence or direct a request elsewhere. + +strcmp(<var>) + Compares the contents of <var> with the input value of type string. Returns + the result as a signed integer compatible with strcmp(3): 0 if both strings + are identical. A value less than 0 if the left string is lexicographically + smaller than the right string or if the left string is shorter. A value greater + than 0 otherwise (right string greater than left string or the right string is + shorter). + + See also the secure_memcmp converter if you need to compare two binary + strings in constant time. + + Example : + + http-request set-var(txn.host) hdr(host) + # Check whether the client is attempting domain fronting. + acl ssl_sni_http_host_match ssl_fc_sni,strcmp(txn.host) eq 0 + + +sub(<value>) + Subtracts <value> from the input value of type signed integer, and returns + the result as an signed integer. Note: in order to subtract the input from + a constant, simply perform a "neg,add(value)". <value> can be a numeric value + or a variable name. The name of the variable starts with an indication about + its scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and + response), + "req" : the variable is shared only during request processing, + "res" : the variable is shared only during response processing. + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +table_bytes_in_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the average client-to-server + bytes rate associated with the input sample in the designated table, measured + in amount of bytes over the period configured in the table. See also the + sc_bytes_in_rate sample fetch keyword. + + +table_bytes_out_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the average server-to-client + bytes rate associated with the input sample in the designated table, measured + in amount of bytes over the period configured in the table. See also the + sc_bytes_out_rate sample fetch keyword. + +table_conn_cnt(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the cumulative number of incoming + connections associated with the input sample in the designated table. See + also the sc_conn_cnt sample fetch keyword. + +table_conn_cur(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the current amount of concurrent + tracked connections associated with the input sample in the designated table. + See also the sc_conn_cur sample fetch keyword. + +table_conn_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the average incoming connection + rate associated with the input sample in the designated table. See also the + sc_conn_rate sample fetch keyword. + +table_expire(<table>[,<default_value>]) + Uses the input sample to perform a look up in the specified table. If the key + is not found in the table, the converter fails except if <default_value> is + set: this makes the converter succeed and return <default_value>. If the key + is found the converter returns the key expiration delay associated with the + input sample in the designated table. + See also the table_idle sample fetch keyword. + +table_gpt(<idx>,<table>) + Uses the string representation of the input sample to perform a lookup in + the specified table. If the key is not found in the table, boolean value zero + is returned. Otherwise the converter returns the current value of the general + purpose tag at the index <idx> of the array associated to the input sample + in the designated <table>. <idx> is an integer between 0 and 99. + If there is no GPT stored at this index, it also returns the boolean value 0. + This applies only to the 'gpt' array data_type (and not on the legacy 'gpt0' + data-type). + See also the sc_get_gpt sample fetch keyword. + +table_gpt0(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, boolean value zero + is returned. Otherwise the converter returns the current value of the first + general purpose tag associated with the input sample in the designated table. + See also the sc_get_gpt0 sample fetch keyword. + +table_gpc(<idx>,<table>) + Uses the string representation of the input sample to perform a lookup in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the current value of the + General Purpose Counter at the index <idx> of the array associated + to the input sample in the designated <table>. <idx> is an integer + between 0 and 99. + If there is no GPC stored at this index, it also returns the boolean value 0. + This applies only to the 'gpc' array data_type (and not to the legacy + 'gpc0' nor 'gpc1' data_types). + See also the sc_get_gpc sample fetch keyword. + +table_gpc_rate(<idx>,<table>) + Uses the string representation of the input sample to perform a lookup in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the frequency which the Global + Purpose Counter at index <idx> of the array (associated to the input sample + in the designated stick-table <table>) was incremented over the + configured period. <idx> is an integer between 0 and 99. + If there is no gpc_rate stored at this index, it also returns the boolean + value 0. + This applies only to the 'gpc_rate' array data_type (and not to the + legacy 'gpc0_rate' nor 'gpc1_rate' data_types). + See also the sc_gpc_rate sample fetch keyword. + +table_gpc0(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the current value of the first + general purpose counter associated with the input sample in the designated + table. See also the sc_get_gpc0 sample fetch keyword. + +table_gpc0_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the frequency which the gpc0 + counter was incremented over the configured period in the table, associated + with the input sample in the designated table. See also the sc_get_gpc0_rate + sample fetch keyword. + +table_gpc1(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the current value of the second + general purpose counter associated with the input sample in the designated + table. See also the sc_get_gpc1 sample fetch keyword. + +table_gpc1_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the frequency which the gpc1 + counter was incremented over the configured period in the table, associated + with the input sample in the designated table. See also the sc_get_gpc1_rate + sample fetch keyword. + +table_http_err_cnt(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the cumulative number of HTTP + errors associated with the input sample in the designated table. See also the + sc_http_err_cnt sample fetch keyword. + +table_http_err_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the average rate of HTTP errors associated with the + input sample in the designated table, measured in amount of errors over the + period configured in the table. See also the sc_http_err_rate sample fetch + keyword. + +table_http_fail_cnt(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the cumulative number of HTTP + failures associated with the input sample in the designated table. See also + the sc_http_fail_cnt sample fetch keyword. + +table_http_fail_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the average rate of HTTP failures associated with the + input sample in the designated table, measured in amount of failures over the + period configured in the table. See also the sc_http_fail_rate sample fetch + keyword. + +table_http_req_cnt(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the cumulative number of HTTP + requests associated with the input sample in the designated table. See also + the sc_http_req_cnt sample fetch keyword. + +table_http_req_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the average rate of HTTP requests associated with the + input sample in the designated table, measured in amount of requests over the + period configured in the table. See also the sc_http_req_rate sample fetch + keyword. + +table_idle(<table>[,<default_value>]) + Uses the input sample to perform a look up in the specified table. If the key + is not found in the table, the converter fails except if <default_value> is + set: this makes the converter succeed and return <default_value>. If the key + is found the converter returns the time the key entry associated with the + input sample in the designated table remained idle since the last time it was + updated. + See also the table_expire sample fetch keyword. + +table_kbytes_in(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the cumulative number of client- + to-server data associated with the input sample in the designated table, + measured in kilobytes. The test is currently performed on 32-bit integers, + which limits values to 4 terabytes. See also the sc_kbytes_in sample fetch + keyword. + +table_kbytes_out(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the cumulative number of server- + to-client data associated with the input sample in the designated table, + measured in kilobytes. The test is currently performed on 32-bit integers, + which limits values to 4 terabytes. See also the sc_kbytes_out sample fetch + keyword. + +table_server_id(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the server ID associated with + the input sample in the designated table. A server ID is associated to a + sample by a "stick" rule when a connection to a server succeeds. A server ID + zero means that no server is associated with this key. + +table_sess_cnt(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the cumulative number of incoming + sessions associated with the input sample in the designated table. Note that + a session here refers to an incoming connection being accepted by the + "tcp-request connection" rulesets. See also the sc_sess_cnt sample fetch + keyword. + +table_sess_rate(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the average incoming session + rate associated with the input sample in the designated table. Note that a + session here refers to an incoming connection being accepted by the + "tcp-request connection" rulesets. See also the sc_sess_rate sample fetch + keyword. + +table_trackers(<table>) + Uses the string representation of the input sample to perform a look up in + the specified table. If the key is not found in the table, integer value zero + is returned. Otherwise the converter returns the current amount of concurrent + connections tracking the same key as the input sample in the designated + table. It differs from table_conn_cur in that it does not rely on any stored + information but on the table's reference count (the "use" value which is + returned by "show table" on the CLI). This may sometimes be more suited for + layer7 tracking. It can be used to tell a server how many concurrent + connections there are from a given address for example. See also the + sc_trackers sample fetch keyword. + +ub64dec + This converter is the base64url variant of b64dec converter. base64url + encoding is the "URL and Filename Safe Alphabet" variant of base64 encoding. + It is also the encoding used in JWT (JSON Web Token) standard. + + Example: + # Decoding a JWT payload: + http-request set-var(txn.token_payload) req.hdr(Authorization),word(2,.),ub64dec + +ub64enc + This converter is the base64url variant of base64 converter. + +upper + Convert a string sample to upper case. This can only be placed after a string + sample fetch function or after a transformation keyword returning a string + type. The result is of type string. + +url_dec([<in_form>]) + Takes an url-encoded string provided as input and returns the decoded version + as output. The input and the output are of type string. If the <in_form> + argument is set to a non-zero integer value, the input string is assumed to + be part of a form or query string and the '+' character will be turned into a + space (' '). Otherwise this will only happen after a question mark indicating + a query string ('?'). + +url_enc([<enc_type>]) + Takes a string provided as input and returns the encoded version as output. + The input and the output are of type string. By default the type of encoding + is meant for `query` type. There is no other type supported for now but the + optional argument is here for future changes. + +ungrpc(<field_number>,[<field_type>]) + This extracts the protocol buffers message field in raw mode of an input binary + sample representation of a gRPC message with <field_number> as field number + (dotted notation) if <field_type> is not present, or as an integer sample if this + field is present. + The list of the authorized types is the following one: "int32", "int64", "uint32", + "uint64", "sint32", "sint64", "bool", "enum" for the "varint" wire type 0 + "fixed64", "sfixed64", "double" for the 64bit wire type 1, "fixed32", "sfixed32", + "float" for the wire type 5. Note that "string" is considered as a length-delimited + type, so it does not require any <field_type> argument to be extracted. + More information may be found here about the protocol buffers message field types: + https://developers.google.com/protocol-buffers/docs/encoding + + Example: + // with such a protocol buffer .proto file content adapted from + // https://github.com/grpc/grpc/blob/master/examples/protos/route_guide.proto + + message Point { + int32 latitude = 1; + int32 longitude = 2; + } + + message PPoint { + Point point = 59; + } + + message Rectangle { + // One corner of the rectangle. + PPoint lo = 48; + // The other corner of the rectangle. + PPoint hi = 49; + } + + let's say a body request is made of a "Rectangle" object value (two PPoint + protocol buffers messages), the four protocol buffers fields could be + extracted with these "ungrpc" directives: + + req.body,ungrpc(48.59.1,int32) # "latitude" of "lo" first PPoint + req.body,ungrpc(48.59.2,int32) # "longitude" of "lo" first PPoint + req.body,ungrpc(49.59.1,int32) # "latitude" of "hi" second PPoint + req.body,ungrpc(49.59.2,int32) # "longitude" of "hi" second PPoint + + We could also extract the intermediary 48.59 field as a binary sample as follows: + + req.body,ungrpc(48.59) + + As a gRPC message is always made of a gRPC header followed by protocol buffers + messages, in the previous example the "latitude" of "lo" first PPoint + could be extracted with these equivalent directives: + + req.body,ungrpc(48.59),protobuf(1,int32) + req.body,ungrpc(48),protobuf(59.1,int32) + req.body,ungrpc(48),protobuf(59),protobuf(1,int32) + + Note that the first convert must be "ungrpc", the remaining ones must be + "protobuf" and only the last one may have or not a second argument to + interpret the previous binary sample. + + +unset-var(<var>) + Unsets a variable if the input content is defined. The name of the variable + starts with an indication about its scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and + response), + "req" : the variable is shared only during request processing, + "res" : the variable is shared only during response processing. + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +utime(<format>[,<offset>]) + Converts an integer supposed to contain a date since epoch to a string + representing this date in UTC time using a format defined by the <format> + string using strftime(3). The purpose is to allow any date format to be used + in logs. An optional <offset> in seconds may be applied to the input date + (positive or negative). See the strftime() man page for the format supported + by your operating system. See also the ltime converter. + + Example : + + # Emit two colons, one with the UTC time and another with ip:port + # e.g. 20140710162350 127.0.0.1:57325 + log-format %[date,utime(%Y%m%d%H%M%S)]\ %ci:%cp + +word(<index>,<delimiters>[,<count>]) + Extracts the nth word counting from the beginning (positive index) or from + the end (negative index) considering given delimiters from an input string. + Indexes start at 1 or -1 and delimiters are a string formatted list of chars. + Delimiters at the beginning or end of the input string are ignored. + Optionally you can specify <count> of words to extract (default: 1). + Value of 0 indicates extraction of all remaining words. + + Example : + str(f1_f2_f3__f5),word(4,_) # f5 + str(f1_f2_f3__f5),word(2,_,0) # f2_f3__f5 + str(f1_f2_f3__f5),word(3,_,2) # f3__f5 + str(f1_f2_f3__f5),word(-2,_,3) # f1_f2_f3 + str(f1_f2_f3__f5),word(-3,_,0) # f1_f2 + str(/f1/f2/f3/f4),word(1,/) # f1 + +wt6([<avalanche>]) + Hashes a binary input sample into an unsigned 32-bit quantity using the WT6 + hash function. Optionally, it is possible to apply a full avalanche hash + function to the output if the optional <avalanche> argument equals 1. This + converter uses the same functions as used by the various hash-based load + balancing algorithms, so it will provide exactly the same results. It is + mostly intended for debugging, but can be used as a stick-table entry to + collect rough statistics. It must not be used for security purposes as a + 32-bit hash is trivial to break. See also "crc32", "djb2", "sdbm", "crc32c", + and the "hash-type" directive. + +xor(<value>) + Performs a bitwise "XOR" (exclusive OR) between <value> and the input value + of type signed integer, and returns the result as an signed integer. + <value> can be a numeric value or a variable name. The name of the variable + starts with an indication about its scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and + response), + "req" : the variable is shared only during request processing, + "res" : the variable is shared only during response processing. + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +xxh3([<seed>]) + Hashes a binary input sample into a signed 64-bit quantity using the XXH3 + 64-bit variant of the XXhash hash function. This hash supports a seed which + defaults to zero but a different value maybe passed as the <seed> argument. + This hash is known to be very good and very fast so it can be used to hash + URLs and/or URL parameters for use as stick-table keys to collect statistics + with a low collision rate, though care must be taken as the algorithm is not + considered as cryptographically secure. + +xxh32([<seed>]) + Hashes a binary input sample into an unsigned 32-bit quantity using the 32-bit + variant of the XXHash hash function. This hash supports a seed which defaults + to zero but a different value maybe passed as the <seed> argument. This hash + is known to be very good and very fast so it can be used to hash URLs and/or + URL parameters for use as stick-table keys to collect statistics with a low + collision rate, though care must be taken as the algorithm is not considered + as cryptographically secure. + +xxh64([<seed>]) + Hashes a binary input sample into a signed 64-bit quantity using the 64-bit + variant of the XXHash hash function. This hash supports a seed which defaults + to zero but a different value maybe passed as the <seed> argument. This hash + is known to be very good and very fast so it can be used to hash URLs and/or + URL parameters for use as stick-table keys to collect statistics with a low + collision rate, though care must be taken as the algorithm is not considered + as cryptographically secure. + + +7.3.2. Fetching samples from internal states +-------------------------------------------- + +A first set of sample fetch methods applies to internal information which does +not even relate to any client information. These ones are sometimes used with +"monitor-fail" directives to report an internal status to external watchers. +The sample fetch methods described in this section are usable anywhere. + +always_false : boolean + Always returns the boolean "false" value. It may be used with ACLs as a + temporary replacement for another one when adjusting configurations. + +always_true : boolean + Always returns the boolean "true" value. It may be used with ACLs as a + temporary replacement for another one when adjusting configurations. + +avg_queue([<backend>]) : integer + Returns the total number of queued connections of the designated backend + divided by the number of active servers. The current backend is used if no + backend is specified. This is very similar to "queue" except that the size of + the farm is considered, in order to give a more accurate measurement of the + time it may take for a new connection to be processed. The main usage is with + ACL to return a sorry page to new users when it becomes certain they will get + a degraded service, or to pass to the backend servers in a header so that + they decide to work in degraded mode or to disable some functions to speed up + the processing a bit. Note that in the event there would not be any active + server anymore, twice the number of queued connections would be considered as + the measured value. This is a fair estimate, as we expect one server to get + back soon anyway, but we still prefer to send new traffic to another backend + if in better shape. See also the "queue", "be_conn", and "be_sess_rate" + sample fetches. + +be_conn([<backend>]) : integer + Applies to the number of currently established connections on the backend, + possibly including the connection being evaluated. If no backend name is + specified, the current one is used. But it is also possible to check another + backend. It can be used to use a specific farm when the nominal one is full. + See also the "fe_conn", "queue", "be_conn_free", and "be_sess_rate" criteria. + +be_conn_free([<backend>]) : integer + Returns an integer value corresponding to the number of available connections + across available servers in the backend. Queue slots are not included. Backup + servers are also not included, unless all other servers are down. If no + backend name is specified, the current one is used. But it is also possible + to check another backend. It can be used to use a specific farm when the + nominal one is full. See also the "be_conn", "connslots", and "srv_conn_free" + criteria. + + OTHER CAVEATS AND NOTES: if any of the server maxconn, or maxqueue is 0 + (meaning unlimited), then this fetch clearly does not make sense, in which + case the value returned will be -1. + +be_sess_rate([<backend>]) : integer + Returns an integer value corresponding to the sessions creation rate on the + backend, in number of new sessions per second. This is used with ACLs to + switch to an alternate backend when an expensive or fragile one reaches too + high a session rate, or to limit abuse of service (e.g. prevent sucking of an + online dictionary). It can also be useful to add this element to logs using a + log-format directive. + + Example : + # Redirect to an error page if the dictionary is requested too often + backend dynamic + mode http + acl being_scanned be_sess_rate gt 100 + redirect location /denied.html if being_scanned + +bin(<hex>) : bin + Returns a binary chain. The input is the hexadecimal representation + of the string. + +bool(<bool>) : bool + Returns a boolean value. <bool> can be 'true', 'false', '1' or '0'. + 'false' and '0' are the same. 'true' and '1' are the same. + +connslots([<backend>]) : integer + Returns an integer value corresponding to the number of connection slots + still available in the backend, by totaling the maximum amount of + connections on all servers and the maximum queue size. This is probably only + used with ACLs. + + The basic idea here is to be able to measure the number of connection "slots" + still available (connection + queue), so that anything beyond that (intended + usage; see "use_backend" keyword) can be redirected to a different backend. + + 'connslots' = number of available server connection slots, + number of + available server queue slots. + + Note that while "fe_conn" may be used, "connslots" comes in especially + useful when you have a case of traffic going to one single ip, splitting into + multiple backends (perhaps using ACLs to do name-based load balancing) and + you want to be able to differentiate between different backends, and their + available "connslots". Also, whereas "nbsrv" only measures servers that are + actually *down*, this fetch is more fine-grained and looks into the number of + available connection slots as well. See also "queue" and "avg_queue". + + OTHER CAVEATS AND NOTES: at this point in time, the code does not take care + of dynamic connections. Also, if any of the server maxconn, or maxqueue is 0, + then this fetch clearly does not make sense, in which case the value returned + will be -1. + +cpu_calls : integer + Returns the number of calls to the task processing the stream or current + request since it was allocated. This number is reset for each new request on + the same connections in case of HTTP keep-alive. This value should usually be + low and stable (around 2 calls for a typically simple request) but may become + high if some processing (compression, caching or analysis) is performed. This + is purely for performance monitoring purposes. + +cpu_ns_avg : integer + Returns the average number of nanoseconds spent in each call to the task + processing the stream or current request. This number is reset for each new + request on the same connections in case of HTTP keep-alive. This value + indicates the overall cost of processing the request or the connection for + each call. There is no good nor bad value but the time spent in a call + automatically causes latency for other processing (see lat_ns_avg below), + and may affect other connection's apparent response time. Certain operations + like compression, complex regex matching or heavy Lua operations may directly + affect this value, and having it in the logs will make it easier to spot the + faulty processing that needs to be fixed to recover decent performance. + Note: this value is exactly cpu_ns_tot divided by cpu_calls. + +cpu_ns_tot : integer + Returns the total number of nanoseconds spent in each call to the task + processing the stream or current request. This number is reset for each new + request on the same connections in case of HTTP keep-alive. This value + indicates the overall cost of processing the request or the connection for + each call. There is no good nor bad value but the time spent in a call + automatically causes latency for other processing (see lat_ns_avg below), + induces CPU costs on the machine, and may affect other connection's apparent + response time. Certain operations like compression, complex regex matching or + heavy Lua operations may directly affect this value, and having it in the + logs will make it easier to spot the faulty processing that needs to be fixed + to recover decent performance. The value may be artificially high due to a + high cpu_calls count, for example when processing many HTTP chunks, and for + this reason it is often preferred to log cpu_ns_avg instead. + +date([<offset>],[<unit>]) : integer + Returns the current date as the epoch (number of seconds since 01/01/1970). + + If an offset value is specified, then it is added to the current date before + returning the value. This is particularly useful to compute relative dates, + as both positive and negative offsets are allowed. + It is useful combined with the http_date converter. + + <unit> is facultative, and can be set to "s" for seconds (default behavior), + "ms" for milliseconds or "us" for microseconds. + If unit is set, return value is an integer reflecting either seconds, + milliseconds or microseconds since epoch, plus offset. + It is useful when a time resolution of less than a second is needed. + + Example : + + # set an expires header to now+1 hour in every response + http-response set-header Expires %[date(3600),http_date] + + # set an expires header to now+1 hour in every response, with + # millisecond granularity + http-response set-header Expires %[date(3600000,ms),http_date(0,ms)] + +date_us : integer + Return the microseconds part of the date (the "second" part is returned by + date sample). This sample is coherent with the date sample as it is comes + from the same timeval structure. + +env(<name>) : string + Returns a string containing the value of environment variable <name>. As a + reminder, environment variables are per-process and are sampled when the + process starts. This can be useful to pass some information to a next hop + server, or with ACLs to take specific action when the process is started a + certain way. + + Examples : + # Pass the Via header to next hop with the local hostname in it + http-request add-header Via 1.1\ %[env(HOSTNAME)] + + # reject cookie-less requests when the STOP environment variable is set + http-request deny if !{ req.cook(SESSIONID) -m found } { env(STOP) -m found } + +fe_conn([<frontend>]) : integer + Returns the number of currently established connections on the frontend, + possibly including the connection being evaluated. If no frontend name is + specified, the current one is used. But it is also possible to check another + frontend. It can be used to return a sorry page before hard-blocking, or to + use a specific backend to drain new requests when the farm is considered + full. This is mostly used with ACLs but can also be used to pass some + statistics to servers in HTTP headers. See also the "dst_conn", "be_conn", + "fe_sess_rate" fetches. + +fe_req_rate([<frontend>]) : integer + Returns an integer value corresponding to the number of HTTP requests per + second sent to a frontend. This number can differ from "fe_sess_rate" in + situations where client-side keep-alive is enabled. + +fe_sess_rate([<frontend>]) : integer + Returns an integer value corresponding to the sessions creation rate on the + frontend, in number of new sessions per second. This is used with ACLs to + limit the incoming session rate to an acceptable range in order to prevent + abuse of service at the earliest moment, for example when combined with other + layer 4 ACLs in order to force the clients to wait a bit for the rate to go + down below the limit. It can also be useful to add this element to logs using + a log-format directive. See also the "rate-limit sessions" directive for use + in frontends. + + Example : + # This frontend limits incoming mails to 10/s with a max of 100 + # concurrent connections. We accept any connection below 10/s, and + # force excess clients to wait for 100 ms. Since clients are limited to + # 100 max, there cannot be more than 10 incoming mails per second. + frontend mail + bind :25 + mode tcp + maxconn 100 + acl too_fast fe_sess_rate ge 10 + tcp-request inspect-delay 100ms + tcp-request content accept if ! too_fast + tcp-request content accept if WAIT_END + +hostname : string + Returns the system hostname. + +int(<integer>) : signed integer + Returns a signed integer. + +ipv4(<ipv4>) : ipv4 + Returns an ipv4. + +ipv6(<ipv6>) : ipv6 + Returns an ipv6. + +last_rule_file : string + This returns the name of the configuration file containing the last final + rule that was matched during stream analysis. A final rule is one that + terminates the evaluation of the rule set (like an "accept", "deny" or + "redirect"). This works for TCP request and response rules acting on the + "content" rulesets, and on HTTP rules from "http-request", "http-response" + and "http-after-response" rule sets. The legacy "redirect" rulesets are not + supported (such information is not stored there), and neither "tcp-request + connection" nor "tcp-request session" rulesets are supported because the + information is stored at the stream level and streams do not exist during + these rules. The main purpose of this function is to be able to report in + logs where was the rule that gave the final verdict, in order to help + figure why a request was denied for example. See also "last_rule_line". + +last_rule_line : integer + This returns the line number in the configuration file where is located the + last final rule that was matched during stream analysis. A final rule is one + that terminates the evaluation of the rule set (like an "accept", "deny" or + "redirect"). This works for TCP request and response rules acting on the + "content" rulesets, and on HTTP rules from "http-request", "http-response" + and "http-after-response" rule sets. The legacy "redirect" rulesets are not + supported (such information is not stored there), and neither "tcp-request + connection" nor "tcp-request session" rulesets are supported because the + information is stored at the stream level and streams do not exist during + these rules. The main purpose of this function is to be able to report in + logs where was the rule that gave the final verdict, in order to help + figure why a request was denied for example. See also "last_rule_file". + +lat_ns_avg : integer + Returns the average number of nanoseconds spent between the moment the task + handling the stream is woken up and the moment it is effectively called. This + number is reset for each new request on the same connections in case of HTTP + keep-alive. This value indicates the overall latency inflicted to the current + request by all other requests being processed in parallel, and is a direct + indicator of perceived performance due to noisy neighbours. In order to keep + the value low, it is possible to reduce the scheduler's run queue depth using + "tune.runqueue-depth", to reduce the number of concurrent events processed at + once using "tune.maxpollevents", to decrease the stream's nice value using + the "nice" option on the "bind" lines or in the frontend, to enable low + latency scheduling using "tune.sched.low-latency", or to look for other heavy + requests in logs (those exhibiting large values of "cpu_ns_avg"), whose + processing needs to be adjusted or fixed. Compression of large buffers could + be a culprit, like heavy regex or long lists of regex. Note: this value is + exactly lat_ns_tot divided by cpu_calls. + +lat_ns_tot : integer + Returns the total number of nanoseconds spent between the moment the task + handling the stream is woken up and the moment it is effectively called. This + number is reset for each new request on the same connections in case of HTTP + keep-alive. This value indicates the overall latency inflicted to the current + request by all other requests being processed in parallel, and is a direct + indicator of perceived performance due to noisy neighbours. In order to keep + the value low, it is possible to reduce the scheduler's run queue depth using + "tune.runqueue-depth", to reduce the number of concurrent events processed at + once using "tune.maxpollevents", to decrease the stream's nice value using + the "nice" option on the "bind" lines or in the frontend, to enable low + latency scheduling using "tune.sched.low-latency", or to look for other heavy + requests in logs (those exhibiting large values of "cpu_ns_avg"), whose + processing needs to be adjusted or fixed. Compression of large buffers could + be a culprit, like heavy regex or long lists of regex. Note: while it + may intuitively seem that the total latency adds to a transfer time, it is + almost never true because while a task waits for the CPU, network buffers + continue to fill up and the next call will process more at once. The value + may be artificially high due to a high cpu_calls count, for example when + processing many HTTP chunks, and for this reason it is often preferred to log + lat_ns_avg instead, which is a more relevant performance indicator. + +meth(<method>) : method + Returns a method. + +nbsrv([<backend>]) : integer + Returns an integer value corresponding to the number of usable servers of + either the current backend or the named backend. This is mostly used with + ACLs but can also be useful when added to logs. This is normally used to + switch to an alternate backend when the number of servers is too low to + to handle some load. It is useful to report a failure when combined with + "monitor fail". + +prio_class : integer + Returns the priority class of the current session for http mode or connection + for tcp mode. The value will be that set by the last call to "http-request + set-priority-class" or "tcp-request content set-priority-class". + +prio_offset : integer + Returns the priority offset of the current session for http mode or + connection for tcp mode. The value will be that set by the last call to + "http-request set-priority-offset" or "tcp-request content + set-priority-offset". + +proc : integer + Always returns value 1 (historically it would return the calling process + number). + +queue([<backend>]) : integer + Returns the total number of queued connections of the designated backend, + including all the connections in server queues. If no backend name is + specified, the current one is used, but it is also possible to check another + one. This is useful with ACLs or to pass statistics to backend servers. This + can be used to take actions when queuing goes above a known level, generally + indicating a surge of traffic or a massive slowdown on the servers. One + possible action could be to reject new users but still accept old ones. See + also the "avg_queue", "be_conn", and "be_sess_rate" fetches. + +rand([<range>]) : integer + Returns a random integer value within a range of <range> possible values, + starting at zero. If the range is not specified, it defaults to 2^32, which + gives numbers between 0 and 4294967295. It can be useful to pass some values + needed to take some routing decisions for example, or just for debugging + purposes. This random must not be used for security purposes. + +srv_conn([<backend>/]<server>) : integer + Returns an integer value corresponding to the number of currently established + connections on the designated server, possibly including the connection being + evaluated. If <backend> is omitted, then the server is looked up in the + current backend. It can be used to use a specific farm when one server is + full, or to inform the server about our view of the number of active + connections with it. See also the "fe_conn", "be_conn", "queue", and + "srv_conn_free" fetch methods. + +srv_conn_free([<backend>/]<server>) : integer + Returns an integer value corresponding to the number of available connections + on the designated server, possibly including the connection being evaluated. + The value does not include queue slots. If <backend> is omitted, then the + server is looked up in the current backend. It can be used to use a specific + farm when one server is full, or to inform the server about our view of the + number of active connections with it. See also the "be_conn_free" and + "srv_conn" fetch methods. + + OTHER CAVEATS AND NOTES: If the server maxconn is 0, then this fetch clearly + does not make sense, in which case the value returned will be -1. + +srv_is_up([<backend>/]<server>) : boolean + Returns true when the designated server is UP, and false when it is either + DOWN or in maintenance mode. If <backend> is omitted, then the server is + looked up in the current backend. It is mainly used to take action based on + an external status reported via a health check (e.g. a geographical site's + availability). Another possible use which is more of a hack consists in + using dummy servers as boolean variables that can be enabled or disabled from + the CLI, so that rules depending on those ACLs can be tweaked in realtime. + +srv_queue([<backend>/]<server>) : integer + Returns an integer value corresponding to the number of connections currently + pending in the designated server's queue. If <backend> is omitted, then the + server is looked up in the current backend. It can sometimes be used together + with the "use-server" directive to force to use a known faster server when it + is not much loaded. See also the "srv_conn", "avg_queue" and "queue" sample + fetch methods. + +srv_sess_rate([<backend>/]<server>) : integer + Returns an integer corresponding to the sessions creation rate on the + designated server, in number of new sessions per second. If <backend> is + omitted, then the server is looked up in the current backend. This is mostly + used with ACLs but can make sense with logs too. This is used to switch to an + alternate backend when an expensive or fragile one reaches too high a session + rate, or to limit abuse of service (e.g. prevent latent requests from + overloading servers). + + Example : + # Redirect to a separate back + acl srv1_full srv_sess_rate(be1/srv1) gt 50 + acl srv2_full srv_sess_rate(be1/srv2) gt 50 + use_backend be2 if srv1_full or srv2_full + +srv_iweight([<backend>/]<server>) : integer + Returns an integer corresponding to the server's initial weight. If <backend> + is omitted, then the server is looked up in the current backend. See also + "srv_weight" and "srv_uweight". + +srv_uweight([<backend>/]<server>) : integer + Returns an integer corresponding to the user visible server's weight. If + <backend> is omitted, then the server is looked up in the current + backend. See also "srv_weight" and "srv_iweight". + +srv_weight([<backend>/]<server>) : integer + Returns an integer corresponding to the current (or effective) server's + weight. If <backend> is omitted, then the server is looked up in the current + backend. See also "srv_iweight" and "srv_uweight". + +stopping : boolean + Returns TRUE if the process calling the function is currently stopping. This + can be useful for logging, or for relaxing certain checks or helping close + certain connections upon graceful shutdown. + +str(<string>) : string + Returns a string. + +table_avl([<table>]) : integer + Returns the total number of available entries in the current proxy's + stick-table or in the designated stick-table. See also table_cnt. + +table_cnt([<table>]) : integer + Returns the total number of entries currently in use in the current proxy's + stick-table or in the designated stick-table. See also src_conn_cnt and + table_avl for other entry counting methods. + +thread : integer + Returns an integer value corresponding to the position of the thread calling + the function, between 0 and (global.nbthread-1). This is useful for logging + and debugging purposes. + +uuid([<version>]) : string + Returns a UUID following the RFC4122 standard. If the version is not + specified, a UUID version 4 (fully random) is returned. + Currently, only version 4 is supported. + +var(<var-name>[,<default>]) : undefined + Returns a variable with the stored type. If the variable is not set, the + sample fetch fails, unless a default value is provided, in which case it will + return it as a string. Empty strings are permitted. The name of the variable + starts with an indication about its scope. The scopes allowed are: + "proc" : the variable is shared with the whole process + "sess" : the variable is shared with the whole session + "txn" : the variable is shared with the transaction (request and + response), + "req" : the variable is shared only during request processing, + "res" : the variable is shared only during response processing. + This prefix is followed by a name. The separator is a '.'. The name may only + contain characters 'a-z', 'A-Z', '0-9', '.' and '_'. + +7.3.3. Fetching samples at Layer 4 +---------------------------------- + +The layer 4 usually describes just the transport layer which in HAProxy is +closest to the connection, where no content is yet made available. The fetch +methods described here are usable as low as the "tcp-request connection" rule +sets unless they require some future information. Those generally include +TCP/IP addresses and ports, as well as elements from stick-tables related to +the incoming connection. For retrieving a value from a sticky counters, the +counter number can be explicitly set as 0, 1, or 2 using the pre-defined +"sc0_", "sc1_", or "sc2_" prefix. These three pre-defined prefixes can only be +used if MAX_SESS_STKCTR value does not exceed 3, otherwise the counter number +can be specified as the first integer argument when using the "sc_" prefix. +Starting from "sc_0" to "sc_N" where N is (MAX_SESS_STKCTR-1). An optional +table may be specified with the "sc*" form, in which case the currently +tracked key will be looked up into this alternate table instead of the table +currently being tracked. + +bc_dst : ip + This is the destination ip address of the connection on the server side, + which is the server address HAProxy connected to. It is of type IP and works + on both IPv4 and IPv6 tables. On IPv6 tables, IPv4 address is mapped to its + IPv6 equivalent, according to RFC 4291. + +bc_dst_port : integer + Returns an integer value corresponding to the destination TCP port of the + connection on the server side, which is the port HAProxy connected to. + +bc_err : integer + Returns the ID of the error that might have occurred on the current backend + connection. See the "fc_err_str" fetch for a full list of error codes + and their corresponding error message. + +bc_err_str : string + Returns an error message describing what problem happened on the current + backend connection, resulting in a connection failure. See the + "fc_err_str" fetch for a full list of error codes and their + corresponding error message. + +bc_http_major : integer + Returns the backend connection's HTTP major version encoding, which may be 1 + for HTTP/0.9 to HTTP/1.1 or 2 for HTTP/2. Note, this is based on the on-wire + encoding and not the version present in the request header. + +bc_src : ip + This is the source ip address of the connection on the server side, which is + the server address HAProxy connected from. It is of type IP and works on both + IPv4 and IPv6 tables. On IPv6 tables, IPv4 addresses are mapped to their IPv6 + equivalent, according to RFC 4291. + +bc_src_port : integer + Returns an integer value corresponding to the TCP source port of the + connection on the server side, which is the port HAProxy connected from. + +be_id : integer + Returns an integer containing the current backend's id. It can be used in + frontends with responses to check which backend processed the request. It can + also be used in a tcp-check or an http-check ruleset. + +be_name : string + Returns a string containing the current backend's name. It can be used in + frontends with responses to check which backend processed the request. It can + also be used in a tcp-check or an http-check ruleset. + +be_server_timeout : integer + Returns the configuration value in millisecond for the server timeout of the + current backend. This timeout can be overwritten by a "set-timeout" rule. See + also the "cur_server_timeout". + +be_tunnel_timeout : integer + Returns the configuration value in millisecond for the tunnel timeout of the + current backend. This timeout can be overwritten by a "set-timeout" rule. See + also the "cur_tunnel_timeout". + +cur_server_timeout : integer + Returns the currently applied server timeout in millisecond for the stream. + In the default case, this will be equal to be_server_timeout unless a + "set-timeout" rule has been applied. See also "be_server_timeout". + +cur_tunnel_timeout : integer + Returns the currently applied tunnel timeout in millisecond for the stream. + In the default case, this will be equal to be_tunnel_timeout unless a + "set-timeout" rule has been applied. See also "be_tunnel_timeout". + +dst : ip + This is the destination IP address of the connection on the client side, + which is the address the client connected to. Any tcp/http rules may alter + this address. It can be useful when running in transparent mode. It is of + type IP and works on both IPv4 and IPv6 tables. On IPv6 tables, IPv4 address + is mapped to its IPv6 equivalent, according to RFC 4291. When the incoming + connection passed through address translation or redirection involving + connection tracking, the original destination address before the redirection + will be reported. On Linux systems, the source and destination may seldom + appear reversed if the nf_conntrack_tcp_loose sysctl is set, because a late + response may reopen a timed out connection and switch what is believed to be + the source and the destination. + +dst_conn : integer + Returns an integer value corresponding to the number of currently established + connections on the same socket including the one being evaluated. It is + normally used with ACLs but can as well be used to pass the information to + servers in an HTTP header or in logs. It can be used to either return a sorry + page before hard-blocking, or to use a specific backend to drain new requests + when the socket is considered saturated. This offers the ability to assign + different limits to different listening ports or addresses. See also the + "fe_conn" and "be_conn" fetches. + +dst_is_local : boolean + Returns true if the destination address of the incoming connection is local + to the system, or false if the address doesn't exist on the system, meaning + that it was intercepted in transparent mode. It can be useful to apply + certain rules by default to forwarded traffic and other rules to the traffic + targeting the real address of the machine. For example the stats page could + be delivered only on this address, or SSH access could be locally redirected. + Please note that the check involves a few system calls, so it's better to do + it only once per connection. + +dst_port : integer + Returns an integer value corresponding to the destination TCP port of the + connection on the client side, which is the port the client connected to. + Any tcp/http rules may alter this address. This might be used when running in + transparent mode, when assigning dynamic ports to some clients for a whole + application session, to stick all users to a same server, or to pass the + destination port information to a server using an HTTP header. + +fc_dst : ip + This is the original destination IP address of the connection on the client + side. Only "tcp-request connection" rules may alter this address. See "dst" + for details. + +fc_dst_is_local : boolean + Returns true if the original destination address of the incoming connection + is local to the system, or false if the address doesn't exist on the + system. See "dst_is_local" for details. + +fc_dst_port : integer + Returns an integer value corresponding to the original destination TCP port + of the connection on the client side. Only "tcp-request connection" rules may + alter this address. See "dst-port" for details. + +fc_err : integer + Returns the ID of the error that might have occurred on the current + connection. Any strictly positive value of this fetch indicates that the + connection did not succeed and would result in an error log being output (as + described in section 8.2.6). See the "fc_err_str" fetch for a full list of + error codes and their corresponding error message. + +fc_err_str : string + Returns an error message describing what problem happened on the current + connection, resulting in a connection failure. This string corresponds to the + "message" part of the error log format (see section 8.2.6). See below for a + full list of error codes and their corresponding error messages : + + +----+---------------------------------------------------------------------------+ + | ID | message | + +----+---------------------------------------------------------------------------+ + | 0 | "Success" | + | 1 | "Reached configured maxconn value" | + | 2 | "Too many sockets on the process" | + | 3 | "Too many sockets on the system" | + | 4 | "Out of system buffers" | + | 5 | "Protocol or address family not supported" | + | 6 | "General socket error" | + | 7 | "Source port range exhausted" | + | 8 | "Can't bind to source address" | + | 9 | "Out of local source ports on the system" | + | 10 | "Local source address already in use" | + | 11 | "Connection closed while waiting for PROXY protocol header" | + | 12 | "Connection error while waiting for PROXY protocol header" | + | 13 | "Timeout while waiting for PROXY protocol header" | + | 14 | "Truncated PROXY protocol header received" | + | 15 | "Received something which does not look like a PROXY protocol header" | + | 16 | "Received an invalid PROXY protocol header" | + | 17 | "Received an unhandled protocol in the PROXY protocol header" | + | 18 | "Connection closed while waiting for NetScaler Client IP header" | + | 19 | "Connection error while waiting for NetScaler Client IP header" | + | 20 | "Timeout while waiting for a NetScaler Client IP header" | + | 21 | "Truncated NetScaler Client IP header received" | + | 22 | "Received an invalid NetScaler Client IP magic number" | + | 23 | "Received an unhandled protocol in the NetScaler Client IP header" | + | 24 | "Connection closed during SSL handshake" | + | 25 | "Connection error during SSL handshake" | + | 26 | "Timeout during SSL handshake" | + | 27 | "Too many SSL connections" | + | 28 | "Out of memory when initializing an SSL connection" | + | 29 | "Rejected a client-initiated SSL renegotiation attempt" | + | 30 | "SSL client CA chain cannot be verified" | + | 31 | "SSL client certificate not trusted" | + | 32 | "Server presented an SSL certificate different from the configured one" | + | 33 | "Server presented an SSL certificate different from the expected one" | + | 34 | "SSL handshake failure" | + | 35 | "SSL handshake failure after heartbeat" | + | 36 | "Stopped a TLSv1 heartbeat attack (CVE-2014-0160)" | + | 37 | "Attempt to use SSL on an unknown target (internal error)" | + | 38 | "Server refused early data" | + | 39 | "SOCKS4 Proxy write error during handshake" | + | 40 | "SOCKS4 Proxy read error during handshake" | + | 41 | "SOCKS4 Proxy deny the request" | + | 42 | "SOCKS4 Proxy handshake aborted by server" | + | 43 | "SSL fatal error" | + +----+---------------------------------------------------------------------------+ + +fc_fackets : integer + Returns the fack counter measured by the kernel for the client + connection. If the server connection is not established, if the connection is + not TCP or if the operating system does not support TCP_INFO, for example + Linux kernels before 2.4, the sample fetch fails. + +fc_http_major : integer + Reports the front connection's HTTP major version encoding, which may be 1 + for HTTP/0.9 to HTTP/1.1 or 2 for HTTP/2. Note, this is based on the on-wire + encoding and not on the version present in the request header. + +fc_lost : integer + Returns the lost counter measured by the kernel for the client + connection. If the server connection is not established, if the connection is + not TCP or if the operating system does not support TCP_INFO, for example + Linux kernels before 2.4, the sample fetch fails. + +fc_pp_authority : string + Returns the authority TLV sent by the client in the PROXY protocol header, + if any. + +fc_pp_unique_id : string + Returns the unique ID TLV sent by the client in the PROXY protocol header, + if any. + +fc_rcvd_proxy : boolean + Returns true if the client initiated the connection with a PROXY protocol + header. + +fc_reordering : integer + Returns the reordering counter measured by the kernel for the client + connection. If the server connection is not established, if the connection is + not TCP or if the operating system does not support TCP_INFO, for example + Linux kernels before 2.4, the sample fetch fails. + +fc_retrans : integer + Returns the retransmits counter measured by the kernel for the client + connection. If the server connection is not established, if the connection is + not TCP or if the operating system does not support TCP_INFO, for example + Linux kernels before 2.4, the sample fetch fails. + +fc_rtt(<unit>) : integer + Returns the Round Trip Time (RTT) measured by the kernel for the client + connection. <unit> is facultative, by default the unit is milliseconds. <unit> + can be set to "ms" for milliseconds or "us" for microseconds. If the server + connection is not established, if the connection is not TCP or if the + operating system does not support TCP_INFO, for example Linux kernels before + 2.4, the sample fetch fails. + +fc_rttvar(<unit>) : integer + Returns the Round Trip Time (RTT) variance measured by the kernel for the + client connection. <unit> is facultative, by default the unit is milliseconds. + <unit> can be set to "ms" for milliseconds or "us" for microseconds. If the + server connection is not established, if the connection is not TCP or if the + operating system does not support TCP_INFO, for example Linux kernels before + 2.4, the sample fetch fails. + +fc_sacked : integer + Returns the sacked counter measured by the kernel for the client connection. + If the server connection is not established, if the connection is not TCP or + if the operating system does not support TCP_INFO, for example Linux kernels + before 2.4, the sample fetch fails. + +fc_src : ip + This is the original destination IP address of the connection on the client + side. Only "tcp-request connection" rules may alter this address. See "src" + for details. + +fc_src_is_local : boolean + Returns true if the source address of incoming connection is local to the + system, or false if the address doesn't exist on the system. See + "src_is_local" for details. + +fc_src_port : integer + + Returns an integer value corresponding to the TCP source port of the + connection on the client side. Only "tcp-request connection" rules may alter + this address. See "src-port" for details. + + +fc_unacked : integer + Returns the unacked counter measured by the kernel for the client connection. + If the server connection is not established, if the connection is not TCP or + if the operating system does not support TCP_INFO, for example Linux kernels + before 2.4, the sample fetch fails. + +fe_defbe : string + Returns a string containing the frontend's default backend name. It can be + used in frontends to check which backend will handle requests by default. + +fe_id : integer + Returns an integer containing the current frontend's id. It can be used in + backends to check from which frontend it was called, or to stick all users + coming via a same frontend to the same server. + +fe_name : string + Returns a string containing the current frontend's name. It can be used in + backends to check from which frontend it was called, or to stick all users + coming via a same frontend to the same server. + +fe_client_timeout : integer + Returns the configuration value in millisecond for the client timeout of the + current frontend. + +sc_bytes_in_rate(<ctr>[,<table>]) : integer +sc0_bytes_in_rate([<table>]) : integer +sc1_bytes_in_rate([<table>]) : integer +sc2_bytes_in_rate([<table>]) : integer + Returns the average client-to-server bytes rate from the currently tracked + counters, measured in amount of bytes over the period configured in the + table. See also src_bytes_in_rate. + +sc_bytes_out_rate(<ctr>[,<table>]) : integer +sc0_bytes_out_rate([<table>]) : integer +sc1_bytes_out_rate([<table>]) : integer +sc2_bytes_out_rate([<table>]) : integer + Returns the average server-to-client bytes rate from the currently tracked + counters, measured in amount of bytes over the period configured in the + table. See also src_bytes_out_rate. + +sc_clr_gpc(<idx>,<ctr>[,<table>]) : integer + Clears the General Purpose Counter at the index <idx> of the array + associated to the designated tracked counter of ID <ctr> from current + proxy's stick table or from the designated stick-table <table>, and + returns its previous value. <idx> is an integer between 0 and 99 and + <ctr> an integer between 0 and 2. + Before the first invocation, the stored value is zero, so first invocation + will always return zero. + This fetch applies only to the 'gpc' array data_type (and not to the legacy + 'gpc0' nor 'gpc1' data_types). + +sc_clr_gpc0(<ctr>[,<table>]) : integer +sc0_clr_gpc0([<table>]) : integer +sc1_clr_gpc0([<table>]) : integer +sc2_clr_gpc0([<table>]) : integer + Clears the first General Purpose Counter associated to the currently tracked + counters, and returns its previous value. Before the first invocation, the + stored value is zero, so first invocation will always return zero. This is + typically used as a second ACL in an expression in order to mark a connection + when a first ACL was verified : + + Example: + # block if 5 consecutive requests continue to come faster than 10 sess + # per second, and reset the counter as soon as the traffic slows down. + acl abuse sc0_http_req_rate gt 10 + acl kill sc0_inc_gpc0 gt 5 + acl save sc0_clr_gpc0 ge 0 + tcp-request connection accept if !abuse save + tcp-request connection reject if abuse kill + +sc_clr_gpc1(<ctr>[,<table>]) : integer +sc0_clr_gpc1([<table>]) : integer +sc1_clr_gpc1([<table>]) : integer +sc2_clr_gpc1([<table>]) : integer + Clears the second General Purpose Counter associated to the currently tracked + counters, and returns its previous value. Before the first invocation, the + stored value is zero, so first invocation will always return zero. This is + typically used as a second ACL in an expression in order to mark a connection + when a first ACL was verified. + +sc_conn_cnt(<ctr>[,<table>]) : integer +sc0_conn_cnt([<table>]) : integer +sc1_conn_cnt([<table>]) : integer +sc2_conn_cnt([<table>]) : integer + Returns the cumulative number of incoming connections from currently tracked + counters. See also src_conn_cnt. + +sc_conn_cur(<ctr>[,<table>]) : integer +sc0_conn_cur([<table>]) : integer +sc1_conn_cur([<table>]) : integer +sc2_conn_cur([<table>]) : integer + Returns the current amount of concurrent connections tracking the same + tracked counters. This number is automatically incremented when tracking + begins and decremented when tracking stops. See also src_conn_cur. + +sc_conn_rate(<ctr>[,<table>]) : integer +sc0_conn_rate([<table>]) : integer +sc1_conn_rate([<table>]) : integer +sc2_conn_rate([<table>]) : integer + Returns the average connection rate from the currently tracked counters, + measured in amount of connections over the period configured in the table. + See also src_conn_rate. + +sc_get_gpc(<idx>,<ctr>[,<table>]) : integer + Returns the value of the General Purpose Counter at the index <idx> + in the GPC array and associated to the currently tracked counter of + ID <ctr> from the current proxy's stick-table or from the designated + stick-table <table>. <idx> is an integer between 0 and 99 and + <ctr> an integer between 0 and 2. If there is not gpc stored at this + index, zero is returned. + This fetch applies only to the 'gpc' array data_type (and not to the legacy + 'gpc0' nor 'gpc1' data_types). See also src_get_gpc and sc_inc_gpc. + +sc_get_gpc0(<ctr>[,<table>]) : integer +sc0_get_gpc0([<table>]) : integer +sc1_get_gpc0([<table>]) : integer +sc2_get_gpc0([<table>]) : integer + Returns the value of the first General Purpose Counter associated to the + currently tracked counters. See also src_get_gpc0 and sc/sc0/sc1/sc2_inc_gpc0. + +sc_get_gpc1(<ctr>[,<table>]) : integer +sc0_get_gpc1([<table>]) : integer +sc1_get_gpc1([<table>]) : integer +sc2_get_gpc1([<table>]) : integer + Returns the value of the second General Purpose Counter associated to the + currently tracked counters. See also src_get_gpc1 and sc/sc0/sc1/sc2_inc_gpc1. + +sc_get_gpt(<idx>,<ctr>[,<table>]) : integer + Returns the value of the first General Purpose Tag at the index <idx> of + the array associated to the tracked counter of ID <ctr> and from the + current proxy's sitck-table or the designated stick-table <table>. <idx> + is an integer between 0 and 99 and <ctr> an integer between 0 and 2. + If there is no GPT stored at this index, zero is returned. + This fetch applies only to the 'gpt' array data_type (and not on + the legacy 'gpt0' data-type). See also src_get_gpt. + +sc_get_gpt0(<ctr>[,<table>]) : integer +sc0_get_gpt0([<table>]) : integer +sc1_get_gpt0([<table>]) : integer +sc2_get_gpt0([<table>]) : integer + Returns the value of the first General Purpose Tag associated to the + currently tracked counters. See also src_get_gpt0. + +sc_gpc_rate(<idx>,<ctr>[,<table>]) : integer + Returns the average increment rate of the General Purpose Counter at the + index <idx> of the array associated to the tracked counter of ID <ctr> from + the current proxy's table or from the designated stick-table <table>. + It reports the frequency which the gpc counter was incremented over the + configured period. <idx> is an integer between 0 and 99 and <ctr> an integer + between 0 and 2. + Note that the 'gpc_rate' counter array must be stored in the stick-table + for a value to be returned, as 'gpc' only holds the event count. + This fetch applies only to the 'gpc_rate' array data_type (and not to + the legacy 'gpc0_rate' nor 'gpc1_rate' data_types). + See also src_gpc_rate, sc_get_gpc, and sc_inc_gpc. + +sc_gpc0_rate(<ctr>[,<table>]) : integer +sc0_gpc0_rate([<table>]) : integer +sc1_gpc0_rate([<table>]) : integer +sc2_gpc0_rate([<table>]) : integer + Returns the average increment rate of the first General Purpose Counter + associated to the currently tracked counters. It reports the frequency + which the gpc0 counter was incremented over the configured period. See also + src_gpc0_rate, sc/sc0/sc1/sc2_get_gpc0, and sc/sc0/sc1/sc2_inc_gpc0. Note + that the "gpc0_rate" counter must be stored in the stick-table for a value to + be returned, as "gpc0" only holds the event count. + +sc_gpc1_rate(<ctr>[,<table>]) : integer +sc0_gpc1_rate([<table>]) : integer +sc1_gpc1_rate([<table>]) : integer +sc2_gpc1_rate([<table>]) : integer + Returns the average increment rate of the second General Purpose Counter + associated to the currently tracked counters. It reports the frequency + which the gpc1 counter was incremented over the configured period. See also + src_gpcA_rate, sc/sc0/sc1/sc2_get_gpc1, and sc/sc0/sc1/sc2_inc_gpc1. Note + that the "gpc1_rate" counter must be stored in the stick-table for a value to + be returned, as "gpc1" only holds the event count. + +sc_http_err_cnt(<ctr>[,<table>]) : integer +sc0_http_err_cnt([<table>]) : integer +sc1_http_err_cnt([<table>]) : integer +sc2_http_err_cnt([<table>]) : integer + Returns the cumulative number of HTTP errors from the currently tracked + counters. This includes the both request errors and 4xx error responses. + See also src_http_err_cnt. + +sc_http_err_rate(<ctr>[,<table>]) : integer +sc0_http_err_rate([<table>]) : integer +sc1_http_err_rate([<table>]) : integer +sc2_http_err_rate([<table>]) : integer + Returns the average rate of HTTP errors from the currently tracked counters, + measured in amount of errors over the period configured in the table. This + includes the both request errors and 4xx error responses. See also + src_http_err_rate. + +sc_http_fail_cnt(<ctr>[,<table>]) : integer +sc0_http_fail_cnt([<table>]) : integer +sc1_http_fail_cnt([<table>]) : integer +sc2_http_fail_cnt([<table>]) : integer + Returns the cumulative number of HTTP response failures from the currently + tracked counters. This includes the both response errors and 5xx status codes + other than 501 and 505. See also src_http_fail_cnt. + +sc_http_fail_rate(<ctr>[,<table>]) : integer +sc0_http_fail_rate([<table>]) : integer +sc1_http_fail_rate([<table>]) : integer +sc2_http_fail_rate([<table>]) : integer + Returns the average rate of HTTP response failures from the currently tracked + counters, measured in amount of failures over the period configured in the + table. This includes the both response errors and 5xx status codes other than + 501 and 505. See also src_http_fail_rate. + +sc_http_req_cnt(<ctr>[,<table>]) : integer +sc0_http_req_cnt([<table>]) : integer +sc1_http_req_cnt([<table>]) : integer +sc2_http_req_cnt([<table>]) : integer + Returns the cumulative number of HTTP requests from the currently tracked + counters. This includes every started request, valid or not. See also + src_http_req_cnt. + +sc_http_req_rate(<ctr>[,<table>]) : integer +sc0_http_req_rate([<table>]) : integer +sc1_http_req_rate([<table>]) : integer +sc2_http_req_rate([<table>]) : integer + Returns the average rate of HTTP requests from the currently tracked + counters, measured in amount of requests over the period configured in + the table. This includes every started request, valid or not. See also + src_http_req_rate. + +sc_inc_gpc(<idx>,<ctr>[,<table>]) : integer + Increments the General Purpose Counter at the index <idx> of the array + associated to the designated tracked counter of ID <ctr> from current + proxy's stick table or from the designated stick-table <table>, and + returns its new value. <idx> is an integer between 0 and 99 and + <ctr> an integer between 0 and 2. + Before the first invocation, the stored value is zero, so first invocation + will increase it to 1 and will return 1. + This fetch applies only to the 'gpc' array data_type (and not to the legacy + 'gpc0' nor 'gpc1' data_types). + +sc_inc_gpc0(<ctr>[,<table>]) : integer +sc0_inc_gpc0([<table>]) : integer +sc1_inc_gpc0([<table>]) : integer +sc2_inc_gpc0([<table>]) : integer + Increments the first General Purpose Counter associated to the currently + tracked counters, and returns its new value. Before the first invocation, + the stored value is zero, so first invocation will increase it to 1 and will + return 1. This is typically used as a second ACL in an expression in order + to mark a connection when a first ACL was verified : + + Example: + acl abuse sc0_http_req_rate gt 10 + acl kill sc0_inc_gpc0 gt 0 + tcp-request connection reject if abuse kill + +sc_inc_gpc1(<ctr>[,<table>]) : integer +sc0_inc_gpc1([<table>]) : integer +sc1_inc_gpc1([<table>]) : integer +sc2_inc_gpc1([<table>]) : integer + Increments the second General Purpose Counter associated to the currently + tracked counters, and returns its new value. Before the first invocation, + the stored value is zero, so first invocation will increase it to 1 and will + return 1. This is typically used as a second ACL in an expression in order + to mark a connection when a first ACL was verified. + +sc_kbytes_in(<ctr>[,<table>]) : integer +sc0_kbytes_in([<table>]) : integer +sc1_kbytes_in([<table>]) : integer +sc2_kbytes_in([<table>]) : integer + Returns the total amount of client-to-server data from the currently tracked + counters, measured in kilobytes. The test is currently performed on 32-bit + integers, which limits values to 4 terabytes. See also src_kbytes_in. + +sc_kbytes_out(<ctr>[,<table>]) : integer +sc0_kbytes_out([<table>]) : integer +sc1_kbytes_out([<table>]) : integer +sc2_kbytes_out([<table>]) : integer + Returns the total amount of server-to-client data from the currently tracked + counters, measured in kilobytes. The test is currently performed on 32-bit + integers, which limits values to 4 terabytes. See also src_kbytes_out. + +sc_sess_cnt(<ctr>[,<table>]) : integer +sc0_sess_cnt([<table>]) : integer +sc1_sess_cnt([<table>]) : integer +sc2_sess_cnt([<table>]) : integer + Returns the cumulative number of incoming connections that were transformed + into sessions, which means that they were accepted by a "tcp-request + connection" rule, from the currently tracked counters. A backend may count + more sessions than connections because each connection could result in many + backend sessions if some HTTP keep-alive is performed over the connection + with the client. See also src_sess_cnt. + +sc_sess_rate(<ctr>[,<table>]) : integer +sc0_sess_rate([<table>]) : integer +sc1_sess_rate([<table>]) : integer +sc2_sess_rate([<table>]) : integer + Returns the average session rate from the currently tracked counters, + measured in amount of sessions over the period configured in the table. A + session is a connection that got past the early "tcp-request connection" + rules. A backend may count more sessions than connections because each + connection could result in many backend sessions if some HTTP keep-alive is + performed over the connection with the client. See also src_sess_rate. + +sc_tracked(<ctr>[,<table>]) : boolean +sc0_tracked([<table>]) : boolean +sc1_tracked([<table>]) : boolean +sc2_tracked([<table>]) : boolean + Returns true if the designated session counter is currently being tracked by + the current session. This can be useful when deciding whether or not we want + to set some values in a header passed to the server. + +sc_trackers(<ctr>[,<table>]) : integer +sc0_trackers([<table>]) : integer +sc1_trackers([<table>]) : integer +sc2_trackers([<table>]) : integer + Returns the current amount of concurrent connections tracking the same + tracked counters. This number is automatically incremented when tracking + begins and decremented when tracking stops. It differs from sc0_conn_cur in + that it does not rely on any stored information but on the table's reference + count (the "use" value which is returned by "show table" on the CLI). This + may sometimes be more suited for layer7 tracking. It can be used to tell a + server how many concurrent connections there are from a given address for + example. + +so_id : integer + Returns an integer containing the current listening socket's id. It is useful + in frontends involving many "bind" lines, or to stick all users coming via a + same socket to the same server. + +so_name : string + Returns a string containing the current listening socket's name, as defined + with name on a "bind" line. It can serve the same purposes as so_id but with + strings instead of integers. + +src : ip + This is the source IP address of the client of the session. Any tcp/http + rules may alter this address. It is of type IP and works on both IPv4 and + IPv6 tables. On IPv6 tables, IPv4 addresses are mapped to their IPv6 + equivalent, according to RFC 4291. Note that it is the TCP-level source + address which is used, and not the address of a client behind a + proxy. However if the "accept-proxy" or "accept-netscaler-cip" bind directive + is used, it can be the address of a client behind another PROXY-protocol + compatible component for all rule sets except "tcp-request connection" which + sees the real address. When the incoming connection passed through address + translation or redirection involving connection tracking, the original + destination address before the redirection will be reported. On Linux + systems, the source and destination may seldom appear reversed if the + nf_conntrack_tcp_loose sysctl is set, because a late response may reopen a + timed out connection and switch what is believed to be the source and the + destination. + + Example: + # add an HTTP header in requests with the originating address' country + http-request set-header X-Country %[src,map_ip(geoip.lst)] + +src_bytes_in_rate([<table>]) : integer + Returns the average bytes rate from the incoming connection's source address + in the current proxy's stick-table or in the designated stick-table, measured + in amount of bytes over the period configured in the table. If the address is + not found, zero is returned. See also sc/sc0/sc1/sc2_bytes_in_rate. + +src_bytes_out_rate([<table>]) : integer + Returns the average bytes rate to the incoming connection's source address in + the current proxy's stick-table or in the designated stick-table, measured in + amount of bytes over the period configured in the table. If the address is + not found, zero is returned. See also sc/sc0/sc1/sc2_bytes_out_rate. + +src_clr_gpc(<idx>,[<table>]) : integer + Clears the General Purpose Counter at the index <idx> of the array + associated to the incoming connection's source address in the current proxy's + stick-table or in the designated stick-table <table>, and returns its + previous value. <idx> is an integer between 0 and 99. + If the address is not found, an entry is created and 0 is returned. + This fetch applies only to the 'gpc' array data_type (and not to the legacy + 'gpc0' nor 'gpc1' data_types). + See also sc_clr_gpc. + +src_clr_gpc0([<table>]) : integer + Clears the first General Purpose Counter associated to the incoming + connection's source address in the current proxy's stick-table or in the + designated stick-table, and returns its previous value. If the address is not + found, an entry is created and 0 is returned. This is typically used as a + second ACL in an expression in order to mark a connection when a first ACL + was verified : + + Example: + # block if 5 consecutive requests continue to come faster than 10 sess + # per second, and reset the counter as soon as the traffic slows down. + acl abuse src_http_req_rate gt 10 + acl kill src_inc_gpc0 gt 5 + acl save src_clr_gpc0 ge 0 + tcp-request connection accept if !abuse save + tcp-request connection reject if abuse kill + +src_clr_gpc1([<table>]) : integer + Clears the second General Purpose Counter associated to the incoming + connection's source address in the current proxy's stick-table or in the + designated stick-table, and returns its previous value. If the address is not + found, an entry is created and 0 is returned. This is typically used as a + second ACL in an expression in order to mark a connection when a first ACL + was verified. + +src_conn_cnt([<table>]) : integer + Returns the cumulative number of connections initiated from the current + incoming connection's source address in the current proxy's stick-table or in + the designated stick-table. If the address is not found, zero is returned. + See also sc/sc0/sc1/sc2_conn_cnt. + +src_conn_cur([<table>]) : integer + Returns the current amount of concurrent connections initiated from the + current incoming connection's source address in the current proxy's + stick-table or in the designated stick-table. If the address is not found, + zero is returned. See also sc/sc0/sc1/sc2_conn_cur. + +src_conn_rate([<table>]) : integer + Returns the average connection rate from the incoming connection's source + address in the current proxy's stick-table or in the designated stick-table, + measured in amount of connections over the period configured in the table. If + the address is not found, zero is returned. See also sc/sc0/sc1/sc2_conn_rate. + +src_get_gpc(<idx>,[<table>]) : integer + Returns the value of the General Purpose Counter at the index <idx> of the + array associated to the incoming connection's source address in the + current proxy's stick-table or in the designated stick-table <table>. <idx> + is an integer between 0 and 99. + If the address is not found or there is no gpc stored at this index, zero + is returned. + This fetch applies only to the 'gpc' array data_type (and not on the legacy + 'gpc0' nor 'gpc1' data_types). + See also sc_get_gpc and src_inc_gpc. + +src_get_gpc0([<table>]) : integer + Returns the value of the first General Purpose Counter associated to the + incoming connection's source address in the current proxy's stick-table or in + the designated stick-table. If the address is not found, zero is returned. + See also sc/sc0/sc1/sc2_get_gpc0 and src_inc_gpc0. + +src_get_gpc1([<table>]) : integer + Returns the value of the second General Purpose Counter associated to the + incoming connection's source address in the current proxy's stick-table or in + the designated stick-table. If the address is not found, zero is returned. + See also sc/sc0/sc1/sc2_get_gpc1 and src_inc_gpc1. + +src_get_gpt(<idx>[,<table>]) : integer + Returns the value of the first General Purpose Tag at the index <idx> of + the array associated to the incoming connection's source address in the + current proxy's stick-table or in the designated stick-table <table>. + <idx> is an integer between 0 and 99. + If the address is not found or the GPT is not stored, zero is returned. + See also the sc_get_gpt sample fetch keyword. + +src_get_gpt0([<table>]) : integer + Returns the value of the first General Purpose Tag associated to the + incoming connection's source address in the current proxy's stick-table or in + the designated stick-table. If the address is not found, zero is returned. + See also sc/sc0/sc1/sc2_get_gpt0. + +src_gpc_rate(<idx>[,<table>]) : integer + Returns the average increment rate of the General Purpose Counter at the + index <idx> of the array associated to the incoming connection's + source address in the current proxy's stick-table or in the designated + stick-table <table>. It reports the frequency which the gpc counter was + incremented over the configured period. <idx> is an integer between 0 and 99. + Note that the 'gpc_rate' counter must be stored in the stick-table for a + value to be returned, as 'gpc' only holds the event count. + This fetch applies only to the 'gpc_rate' array data_type (and not to + the legacy 'gpc0_rate' nor 'gpc1_rate' data_types). + See also sc_gpc_rate, src_get_gpc, and sc_inc_gpc. + +src_gpc0_rate([<table>]) : integer + Returns the average increment rate of the first General Purpose Counter + associated to the incoming connection's source address in the current proxy's + stick-table or in the designated stick-table. It reports the frequency + which the gpc0 counter was incremented over the configured period. See also + sc/sc0/sc1/sc2_gpc0_rate, src_get_gpc0, and sc/sc0/sc1/sc2_inc_gpc0. Note + that the "gpc0_rate" counter must be stored in the stick-table for a value to + be returned, as "gpc0" only holds the event count. + +src_gpc1_rate([<table>]) : integer + Returns the average increment rate of the second General Purpose Counter + associated to the incoming connection's source address in the current proxy's + stick-table or in the designated stick-table. It reports the frequency + which the gpc1 counter was incremented over the configured period. See also + sc/sc0/sc1/sc2_gpc1_rate, src_get_gpc1, and sc/sc0/sc1/sc2_inc_gpc1. Note + that the "gpc1_rate" counter must be stored in the stick-table for a value to + be returned, as "gpc1" only holds the event count. + +src_http_err_cnt([<table>]) : integer + Returns the cumulative number of HTTP errors from the incoming connection's + source address in the current proxy's stick-table or in the designated + stick-table. This includes the both request errors and 4xx error responses. + See also sc/sc0/sc1/sc2_http_err_cnt. If the address is not found, zero is + returned. + +src_http_err_rate([<table>]) : integer + Returns the average rate of HTTP errors from the incoming connection's source + address in the current proxy's stick-table or in the designated stick-table, + measured in amount of errors over the period configured in the table. This + includes the both request errors and 4xx error responses. If the address is + not found, zero is returned. See also sc/sc0/sc1/sc2_http_err_rate. + +src_http_fail_cnt([<table>]) : integer + Returns the cumulative number of HTTP response failures triggered by the + incoming connection's source address in the current proxy's stick-table or in + the designated stick-table. This includes the both response errors and 5xx + status codes other than 501 and 505. See also sc/sc0/sc1/sc2_http_fail_cnt. + If the address is not found, zero is returned. + +src_http_fail_rate([<table>]) : integer + Returns the average rate of HTTP response failures triggered by the incoming + connection's source address in the current proxy's stick-table or in the + designated stick-table, measured in amount of failures over the period + configured in the table. This includes the both response errors and 5xx + status codes other than 501 and 505. If the address is not found, zero is + returned. See also sc/sc0/sc1/sc2_http_fail_rate. + +src_http_req_cnt([<table>]) : integer + Returns the cumulative number of HTTP requests from the incoming connection's + source address in the current proxy's stick-table or in the designated stick- + table. This includes every started request, valid or not. If the address is + not found, zero is returned. See also sc/sc0/sc1/sc2_http_req_cnt. + +src_http_req_rate([<table>]) : integer + Returns the average rate of HTTP requests from the incoming connection's + source address in the current proxy's stick-table or in the designated stick- + table, measured in amount of requests over the period configured in the + table. This includes every started request, valid or not. If the address is + not found, zero is returned. See also sc/sc0/sc1/sc2_http_req_rate. + +src_inc_gpc(<idx>,[<table>]) : integer + Increments the General Purpose Counter at index <idx> of the array + associated to the incoming connection's source address in the current proxy's + stick-table or in the designated stick-table <table>, and returns its new + value. <idx> is an integer between 0 and 99. + If the address is not found, an entry is created and 1 is returned. + This fetch applies only to the 'gpc' array data_type (and not to the legacy + 'gpc0' nor 'gpc1' data_types). + See also sc_inc_gpc. + +src_inc_gpc0([<table>]) : integer + Increments the first General Purpose Counter associated to the incoming + connection's source address in the current proxy's stick-table or in the + designated stick-table, and returns its new value. If the address is not + found, an entry is created and 1 is returned. See also sc0/sc2/sc2_inc_gpc0. + This is typically used as a second ACL in an expression in order to mark a + connection when a first ACL was verified : + + Example: + acl abuse src_http_req_rate gt 10 + acl kill src_inc_gpc0 gt 0 + tcp-request connection reject if abuse kill + +src_inc_gpc1([<table>]) : integer + Increments the second General Purpose Counter associated to the incoming + connection's source address in the current proxy's stick-table or in the + designated stick-table, and returns its new value. If the address is not + found, an entry is created and 1 is returned. See also sc0/sc2/sc2_inc_gpc1. + This is typically used as a second ACL in an expression in order to mark a + connection when a first ACL was verified. + +src_is_local : boolean + Returns true if the source address of the incoming connection is local to the + system, or false if the address doesn't exist on the system, meaning that it + comes from a remote machine. Note that UNIX addresses are considered local. + It can be useful to apply certain access restrictions based on where the + client comes from (e.g. require auth or https for remote machines). Please + note that the check involves a few system calls, so it's better to do it only + once per connection. + +src_kbytes_in([<table>]) : integer + Returns the total amount of data received from the incoming connection's + source address in the current proxy's stick-table or in the designated + stick-table, measured in kilobytes. If the address is not found, zero is + returned. The test is currently performed on 32-bit integers, which limits + values to 4 terabytes. See also sc/sc0/sc1/sc2_kbytes_in. + +src_kbytes_out([<table>]) : integer + Returns the total amount of data sent to the incoming connection's source + address in the current proxy's stick-table or in the designated stick-table, + measured in kilobytes. If the address is not found, zero is returned. The + test is currently performed on 32-bit integers, which limits values to 4 + terabytes. See also sc/sc0/sc1/sc2_kbytes_out. + +src_port : integer + Returns an integer value corresponding to the TCP source port of the + connection on the client side, which is the port the client connected + from. Any tcp/http rules may alter this address. Usage of this function is + very limited as modern protocols do not care much about source ports + nowadays. + +src_sess_cnt([<table>]) : integer + Returns the cumulative number of connections initiated from the incoming + connection's source IPv4 address in the current proxy's stick-table or in the + designated stick-table, that were transformed into sessions, which means that + they were accepted by "tcp-request" rules. If the address is not found, zero + is returned. See also sc/sc0/sc1/sc2_sess_cnt. + +src_sess_rate([<table>]) : integer + Returns the average session rate from the incoming connection's source + address in the current proxy's stick-table or in the designated stick-table, + measured in amount of sessions over the period configured in the table. A + session is a connection that went past the early "tcp-request" rules. If the + address is not found, zero is returned. See also sc/sc0/sc1/sc2_sess_rate. + +src_updt_conn_cnt([<table>]) : integer + Creates or updates the entry associated to the incoming connection's source + address in the current proxy's stick-table or in the designated stick-table. + This table must be configured to store the "conn_cnt" data type, otherwise + the match will be ignored. The current count is incremented by one, and the + expiration timer refreshed. The updated count is returned, so this match + can't return zero. This was used to reject service abusers based on their + source address. Note: it is recommended to use the more complete "track-sc*" + actions in "tcp-request" rules instead. + + Example : + # This frontend limits incoming SSH connections to 3 per 10 second for + # each source address, and rejects excess connections until a 10 second + # silence is observed. At most 20 addresses are tracked. + listen ssh + bind :22 + mode tcp + maxconn 100 + stick-table type ip size 20 expire 10s store conn_cnt + tcp-request content reject if { src_updt_conn_cnt gt 3 } + server local 127.0.0.1:22 + +srv_id : integer + Returns an integer containing the server's id when processing the response. + While it's almost only used with ACLs, it may be used for logging or + debugging. It can also be used in a tcp-check or an http-check ruleset. + +srv_name : string + Returns a string containing the server's name when processing the response. + While it's almost only used with ACLs, it may be used for logging or + debugging. It can also be used in a tcp-check or an http-check ruleset. + +7.3.4. Fetching samples at Layer 5 +---------------------------------- + +The layer 5 usually describes just the session layer which in HAProxy is +closest to the session once all the connection handshakes are finished, but +when no content is yet made available. The fetch methods described here are +usable as low as the "tcp-request content" rule sets unless they require some +future information. Those generally include the results of SSL negotiations. + +51d.all(<prop>[,<prop>*]) : string + Returns values for the properties requested as a string, where values are + separated by the delimiter specified with "51degrees-property-separator". + The device is identified using all the important HTTP headers from the + request. The function can be passed up to five property names, and if a + property name can't be found, the value "NoData" is returned. + + Example : + # Here the header "X-51D-DeviceTypeMobileTablet" is added to the request + # containing the three properties requested using all relevant headers from + # the request. + frontend http-in + bind *:8081 + default_backend servers + http-request set-header X-51D-DeviceTypeMobileTablet \ + %[51d.all(DeviceType,IsMobile,IsTablet)] + +ssl_bc : boolean + Returns true when the back connection was made via an SSL/TLS transport + layer and is locally deciphered. This means the outgoing connection was made + other a server with the "ssl" option. It can be used in a tcp-check or an + http-check ruleset. + +ssl_bc_alg_keysize : integer + Returns the symmetric cipher key size supported in bits when the outgoing + connection was made over an SSL/TLS transport layer. It can be used in a + tcp-check or an http-check ruleset. + +ssl_bc_alpn : string + This extracts the Application Layer Protocol Negotiation field from an + outgoing connection made via a TLS transport layer. + The result is a string containing the protocol name negotiated with the + server. The SSL library must have been built with support for TLS + extensions enabled (check haproxy -vv). Note that the TLS ALPN extension is + not advertised unless the "alpn" keyword on the "server" line specifies a + protocol list. Also, nothing forces the server to pick a protocol from this + list, any other one may be requested. The TLS ALPN extension is meant to + replace the TLS NPN extension. See also "ssl_bc_npn". It can be used in a + tcp-check or an http-check ruleset. + +ssl_bc_cipher : string + Returns the name of the used cipher when the outgoing connection was made + over an SSL/TLS transport layer. It can be used in a tcp-check or an + http-check ruleset. + +ssl_bc_client_random : binary + Returns the client random of the back connection when the incoming connection + was made over an SSL/TLS transport layer. It is useful to to decrypt traffic + sent using ephemeral ciphers. This requires OpenSSL >= 1.1.0, or BoringSSL. + It can be used in a tcp-check or an http-check ruleset. + +ssl_bc_err : integer + When the outgoing connection was made over an SSL/TLS transport layer, + returns the ID of the last error of the first error stack raised on the + backend side. It can raise handshake errors as well as other read or write + errors occurring during the connection's lifetime. In order to get a text + description of this error code, you can either use the "ssl_bc_err_str" + sample fetch or use the "openssl errstr" command (which takes an error code + in hexadecimal representation as parameter). Please refer to your SSL + library's documentation to find the exhaustive list of error codes. + +ssl_bc_err_str : string + When the outgoing connection was made over an SSL/TLS transport layer, + returns a string representation of the last error of the first error stack + that was raised on the connection from the backend's perspective. See also + "ssl_fc_err". + +ssl_bc_is_resumed : boolean + Returns true when the back connection was made over an SSL/TLS transport + layer and the newly created SSL session was resumed using a cached + session or a TLS ticket. It can be used in a tcp-check or an http-check + ruleset. + +ssl_bc_npn : string + This extracts the Next Protocol Negotiation field from an outgoing connection + made via a TLS transport layer. The result is a string containing the + protocol name negotiated with the server . The SSL library must have been + built with support for TLS extensions enabled (check haproxy -vv). Note that + the TLS NPN extension is not advertised unless the "npn" keyword on the + "server" line specifies a protocol list. Also, nothing forces the server to + pick a protocol from this list, any other one may be used. Please note that + the TLS NPN extension was replaced with ALPN. It can be used in a tcp-check + or an http-check ruleset. + +ssl_bc_protocol : string + Returns the name of the used protocol when the outgoing connection was made + over an SSL/TLS transport layer. It can be used in a tcp-check or an + http-check ruleset. + +ssl_bc_unique_id : binary + When the outgoing connection was made over an SSL/TLS transport layer, + returns the TLS unique ID as defined in RFC5929 section 3. The unique id + can be encoded to base64 using the converter: "ssl_bc_unique_id,base64". It + can be used in a tcp-check or an http-check ruleset. + +ssl_bc_server_random : binary + Returns the server random of the back connection when the incoming connection + was made over an SSL/TLS transport layer. It is useful to to decrypt traffic + sent using ephemeral ciphers. This requires OpenSSL >= 1.1.0, or BoringSSL. + It can be used in a tcp-check or an http-check ruleset. + +ssl_bc_session_id : binary + Returns the SSL ID of the back connection when the outgoing connection was + made over an SSL/TLS transport layer. It is useful to log if we want to know + if session was reused or not. It can be used in a tcp-check or an http-check + ruleset. + +ssl_bc_session_key : binary + Returns the SSL session master key of the back connection when the outgoing + connection was made over an SSL/TLS transport layer. It is useful to decrypt + traffic sent using ephemeral ciphers. This requires OpenSSL >= 1.1.0, or + BoringSSL. It can be used in a tcp-check or an http-check ruleset. + +ssl_bc_use_keysize : integer + Returns the symmetric cipher key size used in bits when the outgoing + connection was made over an SSL/TLS transport layer. It can be used in a + tcp-check or an http-check ruleset. + +ssl_c_ca_err : integer + When the incoming connection was made over an SSL/TLS transport layer, + returns the ID of the first error detected during verification of the client + certificate at depth > 0, or 0 if no error was encountered during this + verification process. Please refer to your SSL library's documentation to + find the exhaustive list of error codes. + +ssl_c_ca_err_depth : integer + When the incoming connection was made over an SSL/TLS transport layer, + returns the depth in the CA chain of the first error detected during the + verification of the client certificate. If no error is encountered, 0 is + returned. + +ssl_c_chain_der : binary + Returns the DER formatted chain certificate presented by the client when the + incoming connection was made over an SSL/TLS transport layer. When used for + an ACL, the value(s) to match against can be passed in hexadecimal form. One + can parse the result with any lib accepting ASN.1 DER data. It currently + does not support resumed sessions. + +ssl_c_der : binary + Returns the DER formatted certificate presented by the client when the + incoming connection was made over an SSL/TLS transport layer. When used for + an ACL, the value(s) to match against can be passed in hexadecimal form. + +ssl_c_err : integer + When the incoming connection was made over an SSL/TLS transport layer, + returns the ID of the first error detected during verification at depth 0, or + 0 if no error was encountered during this verification process. Please refer + to your SSL library's documentation to find the exhaustive list of error + codes. + +ssl_c_i_dn([<entry>[,<occ>[,<format>]]]) : string + When the incoming connection was made over an SSL/TLS transport layer, + returns the full distinguished name of the issuer of the certificate + presented by the client when no <entry> is specified, or the value of the + first given entry found from the beginning of the DN. If a positive/negative + occurrence number is specified as the optional second argument, it returns + the value of the nth given entry value from the beginning/end of the DN. + For instance, "ssl_c_i_dn(OU,2)" the second organization unit, and + "ssl_c_i_dn(CN)" retrieves the common name. + The <format> parameter allows you to receive the DN suitable for + consumption by different protocols. Currently supported is rfc2253 for + LDAP v3. + If you'd like to modify the format only you can specify an empty string + and zero for the first two parameters. Example: ssl_c_i_dn(,0,rfc2253) + +ssl_c_key_alg : string + Returns the name of the algorithm used to generate the key of the certificate + presented by the client when the incoming connection was made over an SSL/TLS + transport layer. + +ssl_c_notafter : string + Returns the end date presented by the client as a formatted string + YYMMDDhhmmss[Z] when the incoming connection was made over an SSL/TLS + transport layer. + +ssl_c_notbefore : string + Returns the start date presented by the client as a formatted string + YYMMDDhhmmss[Z] when the incoming connection was made over an SSL/TLS + transport layer. + +ssl_c_s_dn([<entry>[,<occ>[,<format>]]]) : string + When the incoming connection was made over an SSL/TLS transport layer, + returns the full distinguished name of the subject of the certificate + presented by the client when no <entry> is specified, or the value of the + first given entry found from the beginning of the DN. If a positive/negative + occurrence number is specified as the optional second argument, it returns + the value of the nth given entry value from the beginning/end of the DN. + For instance, "ssl_c_s_dn(OU,2)" the second organization unit, and + "ssl_c_s_dn(CN)" retrieves the common name. + The <format> parameter allows you to receive the DN suitable for + consumption by different protocols. Currently supported is rfc2253 for + LDAP v3. + If you'd like to modify the format only you can specify an empty string + and zero for the first two parameters. Example: ssl_c_s_dn(,0,rfc2253) + +ssl_c_serial : binary + Returns the serial of the certificate presented by the client when the + incoming connection was made over an SSL/TLS transport layer. When used for + an ACL, the value(s) to match against can be passed in hexadecimal form. + +ssl_c_sha1 : binary + Returns the SHA-1 fingerprint of the certificate presented by the client when + the incoming connection was made over an SSL/TLS transport layer. This can be + used to stick a client to a server, or to pass this information to a server. + Note that the output is binary, so if you want to pass that signature to the + server, you need to encode it in hex or base64, such as in the example below: + + Example: + http-request set-header X-SSL-Client-SHA1 %[ssl_c_sha1,hex] + +ssl_c_sig_alg : string + Returns the name of the algorithm used to sign the certificate presented by + the client when the incoming connection was made over an SSL/TLS transport + layer. + +ssl_c_used : boolean + Returns true if current SSL session uses a client certificate even if current + connection uses SSL session resumption. See also "ssl_fc_has_crt". + +ssl_c_verify : integer + Returns the verify result error ID when the incoming connection was made over + an SSL/TLS transport layer, otherwise zero if no error is encountered. Please + refer to your SSL library's documentation for an exhaustive list of error + codes. + +ssl_c_version : integer + Returns the version of the certificate presented by the client when the + incoming connection was made over an SSL/TLS transport layer. + +ssl_f_der : binary + Returns the DER formatted certificate presented by the frontend when the + incoming connection was made over an SSL/TLS transport layer. When used for + an ACL, the value(s) to match against can be passed in hexadecimal form. + +ssl_f_i_dn([<entry>[,<occ>[,<format>]]]) : string + When the incoming connection was made over an SSL/TLS transport layer, + returns the full distinguished name of the issuer of the certificate + presented by the frontend when no <entry> is specified, or the value of the + first given entry found from the beginning of the DN. If a positive/negative + occurrence number is specified as the optional second argument, it returns + the value of the nth given entry value from the beginning/end of the DN. + For instance, "ssl_f_i_dn(OU,2)" the second organization unit, and + "ssl_f_i_dn(CN)" retrieves the common name. + The <format> parameter allows you to receive the DN suitable for + consumption by different protocols. Currently supported is rfc2253 for + LDAP v3. + If you'd like to modify the format only you can specify an empty string + and zero for the first two parameters. Example: ssl_f_i_dn(,0,rfc2253) + +ssl_f_key_alg : string + Returns the name of the algorithm used to generate the key of the certificate + presented by the frontend when the incoming connection was made over an + SSL/TLS transport layer. + +ssl_f_notafter : string + Returns the end date presented by the frontend as a formatted string + YYMMDDhhmmss[Z] when the incoming connection was made over an SSL/TLS + transport layer. + +ssl_f_notbefore : string + Returns the start date presented by the frontend as a formatted string + YYMMDDhhmmss[Z] when the incoming connection was made over an SSL/TLS + transport layer. + +ssl_f_s_dn([<entry>[,<occ>[,<format>]]]) : string + When the incoming connection was made over an SSL/TLS transport layer, + returns the full distinguished name of the subject of the certificate + presented by the frontend when no <entry> is specified, or the value of the + first given entry found from the beginning of the DN. If a positive/negative + occurrence number is specified as the optional second argument, it returns + the value of the nth given entry value from the beginning/end of the DN. + For instance, "ssl_f_s_dn(OU,2)" the second organization unit, and + "ssl_f_s_dn(CN)" retrieves the common name. + The <format> parameter allows you to receive the DN suitable for + consumption by different protocols. Currently supported is rfc2253 for + LDAP v3. + If you'd like to modify the format only you can specify an empty string + and zero for the first two parameters. Example: ssl_f_s_dn(,0,rfc2253) + +ssl_f_serial : binary + Returns the serial of the certificate presented by the frontend when the + incoming connection was made over an SSL/TLS transport layer. When used for + an ACL, the value(s) to match against can be passed in hexadecimal form. + +ssl_f_sha1 : binary + Returns the SHA-1 fingerprint of the certificate presented by the frontend + when the incoming connection was made over an SSL/TLS transport layer. This + can be used to know which certificate was chosen using SNI. + +ssl_f_sig_alg : string + Returns the name of the algorithm used to sign the certificate presented by + the frontend when the incoming connection was made over an SSL/TLS transport + layer. + +ssl_f_version : integer + Returns the version of the certificate presented by the frontend when the + incoming connection was made over an SSL/TLS transport layer. + +ssl_fc : boolean + Returns true when the front connection was made via an SSL/TLS transport + layer and is locally deciphered. This means it has matched a socket declared + with a "bind" line having the "ssl" option. + + Example : + # This passes "X-Proto: https" to servers when client connects over SSL + listen http-https + bind :80 + bind :443 ssl crt /etc/haproxy.pem + http-request add-header X-Proto https if { ssl_fc } + +ssl_fc_alg_keysize : integer + Returns the symmetric cipher key size supported in bits when the incoming + connection was made over an SSL/TLS transport layer. + +ssl_fc_alpn : string + This extracts the Application Layer Protocol Negotiation field from an + incoming connection made via a TLS transport layer and locally deciphered by + HAProxy. The result is a string containing the protocol name advertised by + the client. The SSL library must have been built with support for TLS + extensions enabled (check haproxy -vv). Note that the TLS ALPN extension is + not advertised unless the "alpn" keyword on the "bind" line specifies a + protocol list. Also, nothing forces the client to pick a protocol from this + list, any other one may be requested. The TLS ALPN extension is meant to + replace the TLS NPN extension. See also "ssl_fc_npn". + +ssl_fc_cipher : string + Returns the name of the used cipher when the incoming connection was made + over an SSL/TLS transport layer. + +ssl_fc_cipherlist_bin([<filter_option>]) : binary + Returns the binary form of the client hello cipher list. The maximum + returned value length is limited by the shared capture buffer size + controlled by "tune.ssl.capture-buffer-size" setting. Setting + <filter_option> allows to filter returned data. Accepted values: + 0 : return the full list of ciphers (default) + 1 : exclude GREASE (RFC8701) values from the output + + Example: + http-request set-header X-SSL-JA3 %[ssl_fc_protocol_hello_id],\ + %[ssl_fc_cipherlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_extlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_eclist_bin(1),be2dec(-,2)],\ + %[ssl_fc_ecformats_bin,be2dec(-,1)] + acl is_malware req.fhdr(x-ssl-ja3),digest(md5),hex \ + -f /path/to/file/with/malware-ja3.lst + http-request set-header X-Malware True if is_malware + http-request set-header X-Malware False if !is_malware + +ssl_fc_cipherlist_hex([<filter_option>]) : string + Returns the binary form of the client hello cipher list encoded as + hexadecimal. The maximum returned value length is limited by the shared + capture buffer size controlled by "tune.ssl.capture-buffer-size" setting. + Setting <filter_option> allows to filter returned data. Accepted values: + 0 : return the full list of ciphers (default) + 1 : exclude GREASE (RFC8701) values from the output + +ssl_fc_cipherlist_str([<filter_option>]) : string + Returns the decoded text form of the client hello cipher list. The maximum + returned value length is limited by the shared capture buffer size + controlled by "tune.ssl.capture-buffer-size" setting. Setting + <filter_option> allows to filter returned data. Accepted values: + 0 : return the full list of ciphers (default) + 1 : exclude GREASE (RFC8701) values from the output + Note that this sample-fetch is only available with OpenSSL >= 1.0.2. If the + function is not enabled, this sample-fetch returns the hash like + "ssl_fc_cipherlist_xxh". + +ssl_fc_cipherlist_xxh : integer + Returns a xxh64 of the cipher list. This hash can return only if the value + "tune.ssl.capture-buffer-size" is set greater than 0, however the hash take + into account all the data of the cipher list. + +ssl_fc_ecformats_bin : binary + Return the binary form of the client hello supported elliptic curve point + formats. The maximum returned value length is limited by the shared capture + buffer size controlled by "tune.ssl.capture-buffer-size" setting. + + Example: + http-request set-header X-SSL-JA3 %[ssl_fc_protocol_hello_id],\ + %[ssl_fc_cipherlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_extlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_eclist_bin(1),be2dec(-,2)],\ + %[ssl_fc_ecformats_bin,be2dec(-,1)] + acl is_malware req.fhdr(x-ssl-ja3),digest(md5),hex \ + -f /path/to/file/with/malware-ja3.lst + http-request set-header X-Malware True if is_malware + http-request set-header X-Malware False if !is_malware + +ssl_fc_eclist_bin([<filter_option>]) : binary + Returns the binary form of the client hello supported elliptic curves. The + maximum returned value length is limited by the shared capture buffer size + controlled by "tune.ssl.capture-buffer-size" setting. Setting + <filter_option> allows to filter returned data. Accepted values: + 0 : return the full list of supported elliptic curves (default) + 1 : exclude GREASE (RFC8701) values from the output + + Example: + http-request set-header X-SSL-JA3 %[ssl_fc_protocol_hello_id],\ + %[ssl_fc_cipherlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_extlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_eclist_bin(1),be2dec(-,2)],\ + %[ssl_fc_ecformats_bin,be2dec(-,1)] + acl is_malware req.fhdr(x-ssl-ja3),digest(md5),hex \ + -f /path/to/file/with/malware-ja3.lst + http-request set-header X-Malware True if is_malware + http-request set-header X-Malware False if !is_malware + +ssl_fc_extlist_bin([<filter_option>]) : binary + Returns the binary form of the client hello extension list. The maximum + returned value length is limited by the shared capture buffer size + controlled by "tune.ssl.capture-buffer-size" setting. Setting + <filter_option> allows to filter returned data. Accepted values: + 0 : return the full list of extensions (default) + 1 : exclude GREASE (RFC8701) values from the output + + Example: + http-request set-header X-SSL-JA3 %[ssl_fc_protocol_hello_id],\ + %[ssl_fc_cipherlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_extlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_eclist_bin(1),be2dec(-,2)],\ + %[ssl_fc_ecformats_bin,be2dec(-,1)] + acl is_malware req.fhdr(x-ssl-ja3),digest(md5),hex \ + -f /path/to/file/with/malware-ja3.lst + http-request set-header X-Malware True if is_malware + http-request set-header X-Malware False if !is_malware + +ssl_fc_client_random : binary + Returns the client random of the front connection when the incoming connection + was made over an SSL/TLS transport layer. It is useful to to decrypt traffic + sent using ephemeral ciphers. This requires OpenSSL >= 1.1.0, or BoringSSL. + +ssl_fc_client_early_traffic_secret : string + Return the CLIENT_EARLY_TRAFFIC_SECRET as an hexadecimal string for the + front connection when the incoming connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_fc_client_handshake_traffic_secret : string + Return the CLIENT_HANDSHAKE_TRAFFIC_SECRET as an hexadecimal string for the + front connection when the incoming connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_fc_client_traffic_secret_0 : string + Return the CLIENT_TRAFFIC_SECRET_0 as an hexadecimal string for the + front connection when the incoming connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_fc_exporter_secret : string + Return the EXPORTER_SECRET as an hexadecimal string for the + front connection when the incoming connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_fc_early_exporter_secret : string + Return the EARLY_EXPORTER_SECRET as an hexadecimal string for the + front connection when the incoming connection was made over an TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_fc_err : integer + When the incoming connection was made over an SSL/TLS transport layer, + returns the ID of the last error of the first error stack raised on the + frontend side, or 0 if no error was encountered. It can be used to identify + handshake related errors other than verify ones (such as cipher mismatch), as + well as other read or write errors occurring during the connection's + lifetime. Any error happening during the client's certificate verification + process will not be raised through this fetch but via the existing + "ssl_c_err", "ssl_c_ca_err" and "ssl_c_ca_err_depth" fetches. In order to get + a text description of this error code, you can either use the + "ssl_fc_err_str" sample fetch or use the "openssl errstr" command (which + takes an error code in hexadecimal representation as parameter). Please refer + to your SSL library's documentation to find the exhaustive list of error + codes. + +ssl_fc_err_str : string + When the incoming connection was made over an SSL/TLS transport layer, + returns a string representation of the last error of the first error stack + that was raised on the frontend side. Any error happening during the client's + certificate verification process will not be raised through this fetch. See + also "ssl_fc_err". + +ssl_fc_has_crt : boolean + Returns true if a client certificate is present in an incoming connection over + SSL/TLS transport layer. Useful if 'verify' statement is set to 'optional'. + Note: on SSL session resumption with Session ID or TLS ticket, client + certificate is not present in the current connection but may be retrieved + from the cache or the ticket. So prefer "ssl_c_used" if you want to check if + current SSL session uses a client certificate. + +ssl_fc_has_early : boolean + Returns true if early data were sent, and the handshake didn't happen yet. As + it has security implications, it is useful to be able to refuse those, or + wait until the handshake happened. + +ssl_fc_has_sni : boolean + This checks for the presence of a Server Name Indication TLS extension (SNI) + in an incoming connection was made over an SSL/TLS transport layer. Returns + true when the incoming connection presents a TLS SNI field. This requires + that the SSL library is built with support for TLS extensions enabled (check + haproxy -vv). + +ssl_fc_is_resumed : boolean + Returns true if the SSL/TLS session has been resumed through the use of + SSL session cache or TLS tickets on an incoming connection over an SSL/TLS + transport layer. + +ssl_fc_npn : string + This extracts the Next Protocol Negotiation field from an incoming connection + made via a TLS transport layer and locally deciphered by HAProxy. The result + is a string containing the protocol name advertised by the client. The SSL + library must have been built with support for TLS extensions enabled (check + haproxy -vv). Note that the TLS NPN extension is not advertised unless the + "npn" keyword on the "bind" line specifies a protocol list. Also, nothing + forces the client to pick a protocol from this list, any other one may be + requested. Please note that the TLS NPN extension was replaced with ALPN. + +ssl_fc_protocol : string + Returns the name of the used protocol when the incoming connection was made + over an SSL/TLS transport layer. + +ssl_fc_protocol_hello_id : integer + The version of the TLS protocol by which the client wishes to communicate + during the session as indicated in client hello message. This value can + return only if the value "tune.ssl.capture-buffer-size" is set greater than + 0. + + Example: + http-request set-header X-SSL-JA3 %[ssl_fc_protocol_hello_id],\ + %[ssl_fc_cipherlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_extlist_bin(1),be2dec(-,2)],\ + %[ssl_fc_eclist_bin(1),be2dec(-,2)],\ + %[ssl_fc_ecformats_bin,be2dec(-,1)] + acl is_malware req.fhdr(x-ssl-ja3),digest(md5),hex \ + -f /path/to/file/with/malware-ja3.lst + http-request set-header X-Malware True if is_malware + http-request set-header X-Malware False if !is_malware + +ssl_fc_unique_id : binary + When the incoming connection was made over an SSL/TLS transport layer, + returns the TLS unique ID as defined in RFC5929 section 3. The unique id + can be encoded to base64 using the converter: "ssl_fc_unique_id,base64". + +ssl_fc_server_handshake_traffic_secret : string + Return the SERVER_HANDSHAKE_TRAFFIC_SECRET as an hexadecimal string for the + front connection when the incoming connection was made over a TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_fc_server_traffic_secret_0 : string + Return the SERVER_TRAFFIC_SECRET_0 as an hexadecimal string for the + front connection when the incoming connection was made over an TLS 1.3 + transport layer. + Require OpenSSL >= 1.1.1. This is one of the keys dumped by the OpenSSL + keylog callback to generate the SSLKEYLOGFILE. The SSL Key logging must be + activated with "tune.ssl.keylog on" in the global section. See also + "tune.ssl.keylog" + +ssl_fc_server_random : binary + Returns the server random of the front connection when the incoming connection + was made over an SSL/TLS transport layer. It is useful to to decrypt traffic + sent using ephemeral ciphers. This requires OpenSSL >= 1.1.0, or BoringSSL. + +ssl_fc_session_id : binary + Returns the SSL ID of the front connection when the incoming connection was + made over an SSL/TLS transport layer. It is useful to stick a given client to + a server. It is important to note that some browsers refresh their session ID + every few minutes. + +ssl_fc_session_key : binary + Returns the SSL session master key of the front connection when the incoming + connection was made over an SSL/TLS transport layer. It is useful to decrypt + traffic sent using ephemeral ciphers. This requires OpenSSL >= 1.1.0, or + BoringSSL. + + +ssl_fc_sni : string + This extracts the Server Name Indication TLS extension (SNI) field from an + incoming connection made via an SSL/TLS transport layer and locally + deciphered by HAProxy. The result (when present) typically is a string + matching the HTTPS host name (253 chars or less). The SSL library must have + been built with support for TLS extensions enabled (check haproxy -vv). + + This fetch is different from "req.ssl_sni" above in that it applies to the + connection being deciphered by HAProxy and not to SSL contents being blindly + forwarded. See also "ssl_fc_sni_end" and "ssl_fc_sni_reg" below. This + requires that the SSL library is built with support for TLS extensions + enabled (check haproxy -vv). + + CAUTION! Except under very specific conditions, it is normally not correct to + use this field as a substitute for the HTTP "Host" header field. For example, + when forwarding an HTTPS connection to a server, the SNI field must be set + from the HTTP Host header field using "req.hdr(host)" and not from the front + SNI value. The reason is that SNI is solely used to select the certificate + the server side will present, and that clients are then allowed to send + requests with different Host values as long as they match the names in the + certificate. As such, "ssl_fc_sni" should normally not be used as an argument + to the "sni" server keyword, unless the backend works in TCP mode. + + ACL derivatives : + ssl_fc_sni_end : suffix match + ssl_fc_sni_reg : regex match + +ssl_fc_use_keysize : integer + Returns the symmetric cipher key size used in bits when the incoming + connection was made over an SSL/TLS transport layer. + +ssl_s_der : binary + Returns the DER formatted certificate presented by the server when the + outgoing connection was made over an SSL/TLS transport layer. When used for + an ACL, the value(s) to match against can be passed in hexadecimal form. + +ssl_s_chain_der : binary + Returns the DER formatted chain certificate presented by the server when the + outgoing connection was made over an SSL/TLS transport layer. When used for + an ACL, the value(s) to match against can be passed in hexadecimal form. One + can parse the result with any lib accepting ASN.1 DER data. It currently + does not support resumed sessions. + +ssl_s_key_alg : string + Returns the name of the algorithm used to generate the key of the certificate + presented by the server when the outgoing connection was made over an + SSL/TLS transport layer. + +ssl_s_notafter : string + Returns the end date presented by the server as a formatted string + YYMMDDhhmmss[Z] when the outgoing connection was made over an SSL/TLS + transport layer. + +ssl_s_notbefore : string + Returns the start date presented by the server as a formatted string + YYMMDDhhmmss[Z] when the outgoing connection was made over an SSL/TLS + transport layer. + +ssl_s_i_dn([<entry>[,<occ>[,<format>]]]) : string + When the outgoing connection was made over an SSL/TLS transport layer, + returns the full distinguished name of the issuer of the certificate + presented by the server when no <entry> is specified, or the value of the + first given entry found from the beginning of the DN. If a positive/negative + occurrence number is specified as the optional second argument, it returns + the value of the nth given entry value from the beginning/end of the DN. + For instance, "ssl_s_i_dn(OU,2)" the second organization unit, and + "ssl_s_i_dn(CN)" retrieves the common name. + The <format> parameter allows you to receive the DN suitable for + consumption by different protocols. Currently supported is rfc2253 for + LDAP v3. + If you'd like to modify the format only you can specify an empty string + and zero for the first two parameters. Example: ssl_s_i_dn(,0,rfc2253) + +ssl_s_s_dn([<entry>[,<occ>[,<format>]]]) : string + When the outgoing connection was made over an SSL/TLS transport layer, + returns the full distinguished name of the subject of the certificate + presented by the server when no <entry> is specified, or the value of the + first given entry found from the beginning of the DN. If a positive/negative + occurrence number is specified as the optional second argument, it returns + the value of the nth given entry value from the beginning/end of the DN. + For instance, "ssl_s_s_dn(OU,2)" the second organization unit, and + "ssl_s_s_dn(CN)" retrieves the common name. + The <format> parameter allows you to receive the DN suitable for + consumption by different protocols. Currently supported is rfc2253 for + LDAP v3. + If you'd like to modify the format only you can specify an empty string + and zero for the first two parameters. Example: ssl_s_s_dn(,0,rfc2253) + +ssl_s_serial : binary + Returns the serial of the certificate presented by the server when the + outgoing connection was made over an SSL/TLS transport layer. When used for + an ACL, the value(s) to match against can be passed in hexadecimal form. + +ssl_s_sha1 : binary + Returns the SHA-1 fingerprint of the certificate presented by the server + when the outgoing connection was made over an SSL/TLS transport layer. This + can be used to know which certificate was chosen using SNI. + +ssl_s_sig_alg : string + Returns the name of the algorithm used to sign the certificate presented by + the server when the outgoing connection was made over an SSL/TLS transport + layer. + +ssl_s_version : integer + Returns the version of the certificate presented by the server when the + outgoing connection was made over an SSL/TLS transport layer. + +7.3.5. Fetching samples from buffer contents (Layer 6) +------------------------------------------------------ + +Fetching samples from buffer contents is a bit different from the previous +sample fetches above because the sampled data are ephemeral. These data can +only be used when they're available and will be lost when they're forwarded. +For this reason, samples fetched from buffer contents during a request cannot +be used in a response for example. Even while the data are being fetched, they +can change. Sometimes it is necessary to set some delays or combine multiple +sample fetch methods to ensure that the expected data are complete and usable, +for example through TCP request content inspection. Please see the "tcp-request +content" keyword for more detailed information on the subject. + +Warning : Following sample fetches are ignored if used from HTTP proxies. They + only deal with raw contents found in the buffers. On their side, + HTTP proxies use structured content. Thus raw representation of + these data are meaningless. A warning is emitted if an ACL relies on + one of the following sample fetches. But it is not possible to detect + all invalid usage (for instance inside a log-format string or a + sample expression). So be careful. + +distcc_body(<token>[,<occ>]) : binary + Parses a distcc message and returns the body associated to occurrence #<occ> + of the token <token>. Occurrences start at 1, and when unspecified, any may + match though in practice only the first one is checked for now. This can be + used to extract file names or arguments in files built using distcc through + HAProxy. Please refer to distcc's protocol documentation for the complete + list of supported tokens. + +distcc_param(<token>[,<occ>]) : integer + Parses a distcc message and returns the parameter associated to occurrence + #<occ> of the token <token>. Occurrences start at 1, and when unspecified, + any may match though in practice only the first one is checked for now. This + can be used to extract certain information such as the protocol version, the + file size or the argument in files built using distcc through HAProxy. + Another use case consists in waiting for the start of the preprocessed file + contents before connecting to the server to avoid keeping idle connections. + Please refer to distcc's protocol documentation for the complete list of + supported tokens. + + Example : + # wait up to 20s for the pre-processed file to be uploaded + tcp-request inspect-delay 20s + tcp-request content accept if { distcc_param(DOTI) -m found } + # send large files to the big farm + use_backend big_farm if { distcc_param(DOTI) gt 1000000 } + +payload(<offset>,<length>) : binary (deprecated) + This is an alias for "req.payload" when used in the context of a request (e.g. + "stick on", "stick match"), and for "res.payload" when used in the context of + a response such as in "stick store response". + +payload_lv(<offset1>,<length>[,<offset2>]) : binary (deprecated) + This is an alias for "req.payload_lv" when used in the context of a request + (e.g. "stick on", "stick match"), and for "res.payload_lv" when used in the + context of a response such as in "stick store response". + +req.len : integer +req_len : integer (deprecated) + Returns an integer value corresponding to the number of bytes present in the + request buffer. This is mostly used in ACL. It is important to understand + that this test does not return false as long as the buffer is changing. This + means that a check with equality to zero will almost always immediately match + at the beginning of the session, while a test for more data will wait for + that data to come in and return false only when HAProxy is certain that no + more data will come in. This test was designed to be used with TCP request + content inspection. + +req.payload(<offset>,<length>) : binary + This extracts a binary block of <length> bytes and starting at byte <offset> + in the request buffer. As a special case, if the <length> argument is zero, + the the whole buffer from <offset> to the end is extracted. This can be used + with ACLs in order to check for the presence of some content in a buffer at + any location. + + ACL derivatives : + req.payload(<offset>,<length>) : hex binary match + +req.payload_lv(<offset1>,<length>[,<offset2>]) : binary + This extracts a binary block whose size is specified at <offset1> for <length> + bytes, and which starts at <offset2> if specified or just after the length in + the request buffer. The <offset2> parameter also supports relative offsets if + prepended with a '+' or '-' sign. + + ACL derivatives : + req.payload_lv(<offset1>,<length>[,<offset2>]) : hex binary match + + Example : please consult the example from the "stick store-response" keyword. + +req.proto_http : boolean +req_proto_http : boolean (deprecated) + Returns true when data in the request buffer look like HTTP and correctly + parses as such. It is the same parser as the common HTTP request parser which + is used so there should be no surprises. The test does not match until the + request is complete, failed or timed out. This test may be used to report the + protocol in TCP logs, but the biggest use is to block TCP request analysis + until a complete HTTP request is present in the buffer, for example to track + a header. + + Example: + # track request counts per "base" (concatenation of Host+URL) + tcp-request inspect-delay 10s + tcp-request content reject if !HTTP + tcp-request content track-sc0 base table req-rate + +req.rdp_cookie([<name>]) : string +rdp_cookie([<name>]) : string (deprecated) + When the request buffer looks like the RDP protocol, extracts the RDP cookie + <name>, or any cookie if unspecified. The parser only checks for the first + cookie, as illustrated in the RDP protocol specification. The cookie name is + case insensitive. Generally the "MSTS" cookie name will be used, as it can + contain the user name of the client connecting to the server if properly + configured on the client. The "MSTSHASH" cookie is often used as well for + session stickiness to servers. + + This differs from "balance rdp-cookie" in that any balancing algorithm may be + used and thus the distribution of clients to backend servers is not linked to + a hash of the RDP cookie. It is envisaged that using a balancing algorithm + such as "balance roundrobin" or "balance leastconn" will lead to a more even + distribution of clients to backend servers than the hash used by "balance + rdp-cookie". + + ACL derivatives : + req.rdp_cookie([<name>]) : exact string match + + Example : + listen tse-farm + bind 0.0.0.0:3389 + # wait up to 5s for an RDP cookie in the request + tcp-request inspect-delay 5s + tcp-request content accept if RDP_COOKIE + # apply RDP cookie persistence + persist rdp-cookie + # Persist based on the mstshash cookie + # This is only useful makes sense if + # balance rdp-cookie is not used + stick-table type string size 204800 + stick on req.rdp_cookie(mstshash) + server srv1 1.1.1.1:3389 + server srv1 1.1.1.2:3389 + + See also : "balance rdp-cookie", "persist rdp-cookie", "tcp-request" and the + "req.rdp_cookie" ACL. + +req.rdp_cookie_cnt([name]) : integer +rdp_cookie_cnt([name]) : integer (deprecated) + Tries to parse the request buffer as RDP protocol, then returns an integer + corresponding to the number of RDP cookies found. If an optional cookie name + is passed, only cookies matching this name are considered. This is mostly + used in ACL. + + ACL derivatives : + req.rdp_cookie_cnt([<name>]) : integer match + +req.ssl_alpn : string + Returns a string containing the values of the Application-Layer Protocol + Negotiation (ALPN) TLS extension (RFC7301), sent by the client within the SSL + ClientHello message. Note that this only applies to raw contents found in the + request buffer and not to the contents deciphered via an SSL data layer, so + this will not work with "bind" lines having the "ssl" option. This is useful + in ACL to make a routing decision based upon the ALPN preferences of a TLS + client, like in the example below. See also "ssl_fc_alpn". + + Examples : + # Wait for a client hello for at most 5 seconds + tcp-request inspect-delay 5s + tcp-request content accept if { req.ssl_hello_type 1 } + use_backend bk_acme if { req.ssl_alpn acme-tls/1 } + default_backend bk_default + +req.ssl_ec_ext : boolean + Returns a boolean identifying if client sent the Supported Elliptic Curves + Extension as defined in RFC4492, section 5.1. within the SSL ClientHello + message. This can be used to present ECC compatible clients with EC + certificate and to use RSA for all others, on the same IP address. Note that + this only applies to raw contents found in the request buffer and not to + contents deciphered via an SSL data layer, so this will not work with "bind" + lines having the "ssl" option. + +req.ssl_hello_type : integer +req_ssl_hello_type : integer (deprecated) + Returns an integer value containing the type of the SSL hello message found + in the request buffer if the buffer contains data that parse as a complete + SSL (v3 or superior) client hello message. Note that this only applies to raw + contents found in the request buffer and not to contents deciphered via an + SSL data layer, so this will not work with "bind" lines having the "ssl" + option. This is mostly used in ACL to detect presence of an SSL hello message + that is supposed to contain an SSL session ID usable for stickiness. + +req.ssl_sni : string +req_ssl_sni : string (deprecated) + Returns a string containing the value of the Server Name TLS extension sent + by a client in a TLS stream passing through the request buffer if the buffer + contains data that parse as a complete SSL (v3 or superior) client hello + message. Note that this only applies to raw contents found in the request + buffer and not to contents deciphered via an SSL data layer, so this will not + work with "bind" lines having the "ssl" option. This will only work for actual + implicit TLS based protocols like HTTPS (443), IMAPS (993), SMTPS (465), + however it will not work for explicit TLS based protocols, like SMTP (25/587) + or IMAP (143). SNI normally contains the name of the host the client tries to + connect to (for recent browsers). SNI is useful for allowing or denying access + to certain hosts when SSL/TLS is used by the client. This test was designed to + be used with TCP request content inspection. If content switching is needed, + it is recommended to first wait for a complete client hello (type 1), like in + the example below. See also "ssl_fc_sni". + + ACL derivatives : + req.ssl_sni : exact string match + + Examples : + # Wait for a client hello for at most 5 seconds + tcp-request inspect-delay 5s + tcp-request content accept if { req.ssl_hello_type 1 } + use_backend bk_allow if { req.ssl_sni -f allowed_sites } + default_backend bk_sorry_page + +req.ssl_st_ext : integer + Returns 0 if the client didn't send a SessionTicket TLS Extension (RFC5077) + Returns 1 if the client sent SessionTicket TLS Extension + Returns 2 if the client also sent non-zero length TLS SessionTicket + Note that this only applies to raw contents found in the request buffer and + not to contents deciphered via an SSL data layer, so this will not work with + "bind" lines having the "ssl" option. This can for example be used to detect + whether the client sent a SessionTicket or not and stick it accordingly, if + no SessionTicket then stick on SessionID or don't stick as there's no server + side state is there when SessionTickets are in use. + +req.ssl_ver : integer +req_ssl_ver : integer (deprecated) + Returns an integer value containing the version of the SSL/TLS protocol of a + stream present in the request buffer. Both SSLv2 hello messages and SSLv3 + messages are supported. TLSv1 is announced as SSL version 3.1. The value is + composed of the major version multiplied by 65536, added to the minor + version. Note that this only applies to raw contents found in the request + buffer and not to contents deciphered via an SSL data layer, so this will not + work with "bind" lines having the "ssl" option. The ACL version of the test + matches against a decimal notation in the form MAJOR.MINOR (e.g. 3.1). This + fetch is mostly used in ACL. + + ACL derivatives : + req.ssl_ver : decimal match + +res.len : integer + Returns an integer value corresponding to the number of bytes present in the + response buffer. This is mostly used in ACL. It is important to understand + that this test does not return false as long as the buffer is changing. This + means that a check with equality to zero will almost always immediately match + at the beginning of the session, while a test for more data will wait for + that data to come in and return false only when HAProxy is certain that no + more data will come in. This test was designed to be used with TCP response + content inspection. But it may also be used in tcp-check based expect rules. + +res.payload(<offset>,<length>) : binary + This extracts a binary block of <length> bytes and starting at byte <offset> + in the response buffer. As a special case, if the <length> argument is zero, + the whole buffer from <offset> to the end is extracted. This can be used + with ACLs in order to check for the presence of some content in a buffer at + any location. It may also be used in tcp-check based expect rules. + +res.payload_lv(<offset1>,<length>[,<offset2>]) : binary + This extracts a binary block whose size is specified at <offset1> for <length> + bytes, and which starts at <offset2> if specified or just after the length in + the response buffer. The <offset2> parameter also supports relative offsets + if prepended with a '+' or '-' sign. It may also be used in tcp-check based + expect rules. + + Example : please consult the example from the "stick store-response" keyword. + +res.ssl_hello_type : integer +rep_ssl_hello_type : integer (deprecated) + Returns an integer value containing the type of the SSL hello message found + in the response buffer if the buffer contains data that parses as a complete + SSL (v3 or superior) hello message. Note that this only applies to raw + contents found in the response buffer and not to contents deciphered via an + SSL data layer, so this will not work with "server" lines having the "ssl" + option. This is mostly used in ACL to detect presence of an SSL hello message + that is supposed to contain an SSL session ID usable for stickiness. + +wait_end : boolean + This fetch either returns true when the inspection period is over, or does + not fetch. It is only used in ACLs, in conjunction with content analysis to + avoid returning a wrong verdict early. It may also be used to delay some + actions, such as a delayed reject for some special addresses. Since it either + stops the rules evaluation or immediately returns true, it is recommended to + use this acl as the last one in a rule. Please note that the default ACL + "WAIT_END" is always usable without prior declaration. This test was designed + to be used with TCP request content inspection. + + Examples : + # delay every incoming request by 2 seconds + tcp-request inspect-delay 2s + tcp-request content accept if WAIT_END + + # don't immediately tell bad guys they are rejected + tcp-request inspect-delay 10s + acl goodguys src 10.0.0.0/24 + acl badguys src 10.0.1.0/24 + tcp-request content accept if goodguys + tcp-request content reject if badguys WAIT_END + tcp-request content reject + + +7.3.6. Fetching HTTP samples (Layer 7) +-------------------------------------- + +It is possible to fetch samples from HTTP contents, requests and responses. +This application layer is also called layer 7. It is only possible to fetch the +data in this section when a full HTTP request or response has been parsed from +its respective request or response buffer. This is always the case with all +HTTP specific rules and for sections running with "mode http". When using TCP +content inspection, it may be necessary to support an inspection delay in order +to let the request or response come in first. These fetches may require a bit +more CPU resources than the layer 4 ones, but not much since the request and +response are indexed. + +Note : Regarding HTTP processing from the tcp-request content rules, everything + will work as expected from an HTTP proxy. However, from a TCP proxy, + without an HTTP upgrade, it will only work for HTTP/1 content. For + HTTP/2 content, only the preface is visible. Thus, it is only possible + to rely to "req.proto_http", "req.ver" and eventually "method" sample + fetches. All other L7 sample fetches will fail. After an HTTP upgrade, + they will work in the same manner than from an HTTP proxy. + +base : string + This returns the concatenation of the first Host header and the path part of + the request, which starts at the first slash and ends before the question + mark. It can be useful in virtual hosted environments to detect URL abuses as + well as to improve shared caches efficiency. Using this with a limited size + stick table also allows one to collect statistics about most commonly + requested objects by host/path. With ACLs it can allow simple content + switching rules involving the host and the path at the same time, such as + "www.example.com/favicon.ico". See also "path" and "uri". + + ACL derivatives : + base : exact string match + base_beg : prefix match + base_dir : subdir match + base_dom : domain match + base_end : suffix match + base_len : length match + base_reg : regex match + base_sub : substring match + +base32 : integer + This returns a 32-bit hash of the value returned by the "base" fetch method + above. This is useful to track per-URL activity on high traffic sites without + having to store all URLs. Instead a shorter hash is stored, saving a lot of + memory. The output type is an unsigned integer. The hash function used is + SDBM with full avalanche on the output. Technically, base32 is exactly equal + to "base,sdbm(1)". + +base32+src : binary + This returns the concatenation of the base32 fetch above and the src fetch + below. The resulting type is of type binary, with a size of 8 or 20 bytes + depending on the source address family. This can be used to track per-IP, + per-URL counters. + +baseq : string + This returns the concatenation of the first Host header and the path part of + the request with the query-string, which starts at the first slash. Using this + instead of "base" allows one to properly identify the target resource, for + statistics or caching use cases. See also "path", "pathq" and "base". + +capture.req.hdr(<idx>) : string + This extracts the content of the header captured by the "capture request + header", idx is the position of the capture keyword in the configuration. + The first entry is an index of 0. See also: "capture request header". + +capture.req.method : string + This extracts the METHOD of an HTTP request. It can be used in both request + and response. Unlike "method", it can be used in both request and response + because it's allocated. + +capture.req.uri : string + This extracts the request's URI, which starts at the first slash and ends + before the first space in the request (without the host part). Unlike "path" + and "url", it can be used in both request and response because it's + allocated. + +capture.req.ver : string + This extracts the request's HTTP version and returns either "HTTP/1.0" or + "HTTP/1.1". Unlike "req.ver", it can be used in both request, response, and + logs because it relies on a persistent flag. + +capture.res.hdr(<idx>) : string + This extracts the content of the header captured by the "capture response + header", idx is the position of the capture keyword in the configuration. + The first entry is an index of 0. + See also: "capture response header" + +capture.res.ver : string + This extracts the response's HTTP version and returns either "HTTP/1.0" or + "HTTP/1.1". Unlike "res.ver", it can be used in logs because it relies on a + persistent flag. + +req.body : binary + This returns the HTTP request's available body as a block of data. It is + recommended to use "option http-buffer-request" to be sure to wait, as much + as possible, for the request's body. + +req.body_param([<name>) : string + This fetch assumes that the body of the POST request is url-encoded. The user + can check if the "content-type" contains the value + "application/x-www-form-urlencoded". This extracts the first occurrence of the + parameter <name> in the body, which ends before '&'. The parameter name is + case-sensitive. If no name is given, any parameter will match, and the first + one will be returned. The result is a string corresponding to the value of the + parameter <name> as presented in the request body (no URL decoding is + performed). Note that the ACL version of this fetch iterates over multiple + parameters and will iteratively report all parameters values if no name is + given. + +req.body_len : integer + This returns the length of the HTTP request's available body in bytes. It may + be lower than the advertised length if the body is larger than the buffer. It + is recommended to use "option http-buffer-request" to be sure to wait, as + much as possible, for the request's body. + +req.body_size : integer + This returns the advertised length of the HTTP request's body in bytes. It + will represent the advertised Content-Length header, or the size of the + available data in case of chunked encoding. + +req.cook([<name>]) : string +cook([<name>]) : string (deprecated) + This extracts the last occurrence of the cookie name <name> on a "Cookie" + header line from the request, and returns its value as string. If no name is + specified, the first cookie value is returned. When used with ACLs, all + matching cookies are evaluated. Spaces around the name and the value are + ignored as requested by the Cookie header specification (RFC6265). The cookie + name is case-sensitive. Empty cookies are valid, so an empty cookie may very + well return an empty value if it is present. Use the "found" match to detect + presence. Use the res.cook() variant for response cookies sent by the server. + + ACL derivatives : + req.cook([<name>]) : exact string match + req.cook_beg([<name>]) : prefix match + req.cook_dir([<name>]) : subdir match + req.cook_dom([<name>]) : domain match + req.cook_end([<name>]) : suffix match + req.cook_len([<name>]) : length match + req.cook_reg([<name>]) : regex match + req.cook_sub([<name>]) : substring match + +req.cook_cnt([<name>]) : integer +cook_cnt([<name>]) : integer (deprecated) + Returns an integer value representing the number of occurrences of the cookie + <name> in the request, or all cookies if <name> is not specified. + +req.cook_val([<name>]) : integer +cook_val([<name>]) : integer (deprecated) + This extracts the last occurrence of the cookie name <name> on a "Cookie" + header line from the request, and converts its value to an integer which is + returned. If no name is specified, the first cookie value is returned. When + used in ACLs, all matching names are iterated over until a value matches. + +cookie([<name>]) : string (deprecated) + This extracts the last occurrence of the cookie name <name> on a "Cookie" + header line from the request, or a "Set-Cookie" header from the response, and + returns its value as a string. A typical use is to get multiple clients + sharing a same profile use the same server. This can be similar to what + "appsession" did with the "request-learn" statement, but with support for + multi-peer synchronization and state keeping across restarts. If no name is + specified, the first cookie value is returned. This fetch should not be used + anymore and should be replaced by req.cook() or res.cook() instead as it + ambiguously uses the direction based on the context where it is used. + +hdr([<name>[,<occ>]]) : string + This is equivalent to req.hdr() when used on requests, and to res.hdr() when + used on responses. Please refer to these respective fetches for more details. + In case of doubt about the fetch direction, please use the explicit ones. + Note that contrary to the hdr() sample fetch method, the hdr_* ACL keywords + unambiguously apply to the request headers. + +req.fhdr(<name>[,<occ>]) : string + This returns the full value of the last occurrence of header <name> in an + HTTP request. It differs from req.hdr() in that any commas present in the + value are returned and are not used as delimiters. This is sometimes useful + with headers such as User-Agent. + + When used from an ACL, all occurrences are iterated over until a match is + found. + + Optionally, a specific occurrence might be specified as a position number. + Positive values indicate a position from the first occurrence, with 1 being + the first one. Negative values indicate positions relative to the last one, + with -1 being the last one. + +req.fhdr_cnt([<name>]) : integer + Returns an integer value representing the number of occurrences of request + header field name <name>, or the total number of header fields if <name> is + not specified. Like req.fhdr() it differs from res.hdr_cnt() by not splitting + headers at commas. + +req.hdr([<name>[,<occ>]]) : string + This returns the last comma-separated value of the header <name> in an HTTP + request. The fetch considers any comma as a delimiter for distinct values. + This is useful if you need to process headers that are defined to be a list + of values, such as Accept, or X-Forwarded-For. If full-line headers are + desired instead, use req.fhdr(). Please carefully check RFC 7231 to know how + certain headers are supposed to be parsed. Also, some of them are case + insensitive (e.g. Connection). + + When used from an ACL, all occurrences are iterated over until a match is + found. + + Optionally, a specific occurrence might be specified as a position number. + Positive values indicate a position from the first occurrence, with 1 being + the first one. Negative values indicate positions relative to the last one, + with -1 being the last one. + + A typical use is with the X-Forwarded-For header once converted to IP, + associated with an IP stick-table. + + ACL derivatives : + hdr([<name>[,<occ>]]) : exact string match + hdr_beg([<name>[,<occ>]]) : prefix match + hdr_dir([<name>[,<occ>]]) : subdir match + hdr_dom([<name>[,<occ>]]) : domain match + hdr_end([<name>[,<occ>]]) : suffix match + hdr_len([<name>[,<occ>]]) : length match + hdr_reg([<name>[,<occ>]]) : regex match + hdr_sub([<name>[,<occ>]]) : substring match + +req.hdr_cnt([<name>]) : integer +hdr_cnt([<header>]) : integer (deprecated) + Returns an integer value representing the number of occurrences of request + header field name <name>, or the total number of header field values if + <name> is not specified. Like req.hdr() it counts each comma separated + part of the header's value. If counting of full-line headers is desired, + then req.fhdr_cnt() should be used instead. + + With ACLs, it can be used to detect presence, absence or abuse of a specific + header, as well as to block request smuggling attacks by rejecting requests + which contain more than one of certain headers. + + Refer to req.hdr() for more information on header matching. + +req.hdr_ip([<name>[,<occ>]]) : ip +hdr_ip([<name>[,<occ>]]) : ip (deprecated) + This extracts the last occurrence of header <name> in an HTTP request, + converts it to an IPv4 or IPv6 address and returns this address. When used + with ACLs, all occurrences are checked, and if <name> is omitted, every value + of every header is checked. The parser strictly adheres to the format + described in RFC7239, with the extension that IPv4 addresses may optionally + be followed by a colon (':') and a valid decimal port number (0 to 65535), + which will be silently dropped. All other forms will not match and will + cause the address to be ignored. + + The <occ> parameter is processed as with req.hdr(). + + A typical use is with the X-Forwarded-For and X-Client-IP headers. + +req.hdr_val([<name>[,<occ>]]) : integer +hdr_val([<name>[,<occ>]]) : integer (deprecated) + This extracts the last occurrence of header <name> in an HTTP request, and + converts it to an integer value. When used with ACLs, all occurrences are + checked, and if <name> is omitted, every value of every header is checked. + + The <occ> parameter is processed as with req.hdr(). + + A typical use is with the X-Forwarded-For header. + +req.hdrs : string + Returns the current request headers as string including the last empty line + separating headers from the request body. The last empty line can be used to + detect a truncated header block. This sample fetch is useful for some SPOE + headers analyzers and for advanced logging. + +req.hdrs_bin : binary + Returns the current request headers contained in preparsed binary form. This + is useful for offloading some processing with SPOE. Each string is described + by a length followed by the number of bytes indicated in the length. The + length is represented using the variable integer encoding detailed in the + SPOE documentation. The end of the list is marked by a couple of empty header + names and values (length of 0 for both). + + *(<str:header-name><str:header-value>)<empty string><empty string> + + int: refer to the SPOE documentation for the encoding + str: <int:length><bytes> + +http_auth(<userlist>) : boolean + Returns a boolean indicating whether the authentication data received from + the client match a username & password stored in the specified userlist. This + fetch function is not really useful outside of ACLs. Currently only http + basic auth is supported. + +http_auth_bearer([<header>]) : string + Returns the client-provided token found in the authorization data when the + Bearer scheme is used (to send JSON Web Tokens for instance). No check is + performed on the data sent by the client. + If a specific <header> is supplied, it will parse this header instead of the + Authorization one. + +http_auth_group(<userlist>) : string + Returns a string corresponding to the user name found in the authentication + data received from the client if both the user name and password are valid + according to the specified userlist. The main purpose is to use it in ACLs + where it is then checked whether the user belongs to any group within a list. + This fetch function is not really useful outside of ACLs. Currently only http + basic auth is supported. + + ACL derivatives : + http_auth_group(<userlist>) : group ... + Returns true when the user extracted from the request and whose password is + valid according to the specified userlist belongs to at least one of the + groups. + +http_auth_pass : string + Returns the user's password found in the authentication data received from + the client, as supplied in the Authorization header. Not checks are + performed by this sample fetch. Only Basic authentication is supported. + +http_auth_type : string + Returns the authentication method found in the authentication data received from + the client, as supplied in the Authorization header. Not checks are + performed by this sample fetch. Only Basic authentication is supported. + +http_auth_user : string + Returns the user name found in the authentication data received from the + client, as supplied in the Authorization header. Not checks are performed by + this sample fetch. Only Basic authentication is supported. + +http_first_req : boolean + Returns true when the request being processed is the first one of the + connection. This can be used to add or remove headers that may be missing + from some requests when a request is not the first one, or to help grouping + requests in the logs. + +method : integer + string + Returns an integer value corresponding to the method in the HTTP request. For + example, "GET" equals 1 (check sources to establish the matching). Value 9 + means "other method" and may be converted to a string extracted from the + stream. This should not be used directly as a sample, this is only meant to + be used from ACLs, which transparently convert methods from patterns to these + integer + string values. Some predefined ACL already check for most common + methods. + + ACL derivatives : + method : case insensitive method match + + Example : + # only accept GET and HEAD requests + acl valid_method method GET HEAD + http-request deny if ! valid_method + +path : string + This extracts the request's URL path, which starts at the first slash and + ends before the question mark (without the host part). A typical use is with + prefetch-capable caches, and with portals which need to aggregate multiple + information from databases and keep them in caches. Note that with outgoing + caches, it would be wiser to use "url" instead. With ACLs, it's typically + used to match exact file names (e.g. "/login.php"), or directory parts using + the derivative forms. See also the "url" and "base" fetch methods. + + ACL derivatives : + path : exact string match + path_beg : prefix match + path_dir : subdir match + path_dom : domain match + path_end : suffix match + path_len : length match + path_reg : regex match + path_sub : substring match + +pathq : string + This extracts the request's URL path with the query-string, which starts at + the first slash. This sample fetch is pretty handy to always retrieve a + relative URI, excluding the scheme and the authority part, if any. Indeed, + while it is the common representation for an HTTP/1.1 request target, in + HTTP/2, an absolute URI is often used. This sample fetch will return the same + result in both cases. + +query : string + This extracts the request's query string, which starts after the first + question mark. If no question mark is present, this fetch returns nothing. If + a question mark is present but nothing follows, it returns an empty string. + This means it's possible to easily know whether a query string is present + using the "found" matching method. This fetch is the complement of "path" + which stops before the question mark. + +req.hdr_names([<delim>]) : string + This builds a string made from the concatenation of all header names as they + appear in the request when the rule is evaluated. The default delimiter is + the comma (',') but it may be overridden as an optional argument <delim>. In + this case, only the first character of <delim> is considered. + +req.ver : string +req_ver : string (deprecated) + Returns the version string from the HTTP request, for example "1.1". This can + be useful for logs, but is mostly there for ACL. Some predefined ACL already + check for versions 1.0 and 1.1. + + ACL derivatives : + req.ver : exact string match + +res.body : binary + This returns the HTTP response's available body as a block of data. Unlike + the request side, there is no directive to wait for the response's body. This + sample fetch is really useful (and usable) in the health-check context. + + It may be used in tcp-check based expect rules. + +res.body_len : integer + This returns the length of the HTTP response available body in bytes. Unlike + the request side, there is no directive to wait for the response's body. This + sample fetch is really useful (and usable) in the health-check context. + + It may be used in tcp-check based expect rules. + +res.body_size : integer + This returns the advertised length of the HTTP response body in bytes. It + will represent the advertised Content-Length header, or the size of the + available data in case of chunked encoding. Unlike the request side, there is + no directive to wait for the response body. This sample fetch is really + useful (and usable) in the health-check context. + + It may be used in tcp-check based expect rules. + +res.cache_hit : boolean + Returns the boolean "true" value if the response has been built out of an + HTTP cache entry, otherwise returns boolean "false". + +res.cache_name : string + Returns a string containing the name of the HTTP cache that was used to + build the HTTP response if res.cache_hit is true, otherwise returns an + empty string. + +res.comp : boolean + Returns the boolean "true" value if the response has been compressed by + HAProxy, otherwise returns boolean "false". This may be used to add + information in the logs. + +res.comp_algo : string + Returns a string containing the name of the algorithm used if the response + was compressed by HAProxy, for example : "deflate". This may be used to add + some information in the logs. + +res.cook([<name>]) : string +scook([<name>]) : string (deprecated) + This extracts the last occurrence of the cookie name <name> on a "Set-Cookie" + header line from the response, and returns its value as string. If no name is + specified, the first cookie value is returned. + + It may be used in tcp-check based expect rules. + + ACL derivatives : + res.scook([<name>] : exact string match + +res.cook_cnt([<name>]) : integer +scook_cnt([<name>]) : integer (deprecated) + Returns an integer value representing the number of occurrences of the cookie + <name> in the response, or all cookies if <name> is not specified. This is + mostly useful when combined with ACLs to detect suspicious responses. + + It may be used in tcp-check based expect rules. + +res.cook_val([<name>]) : integer +scook_val([<name>]) : integer (deprecated) + This extracts the last occurrence of the cookie name <name> on a "Set-Cookie" + header line from the response, and converts its value to an integer which is + returned. If no name is specified, the first cookie value is returned. + + It may be used in tcp-check based expect rules. + +res.fhdr([<name>[,<occ>]]) : string + This fetch works like the req.fhdr() fetch with the difference that it acts + on the headers within an HTTP response. + + Like req.fhdr() the res.fhdr() fetch returns full values. If the header is + defined to be a list you should use res.hdr(). + + This fetch is sometimes useful with headers such as Date or Expires. + + It may be used in tcp-check based expect rules. + +res.fhdr_cnt([<name>]) : integer + This fetch works like the req.fhdr_cnt() fetch with the difference that it + acts on the headers within an HTTP response. + + Like req.fhdr_cnt() the res.fhdr_cnt() fetch acts on full values. If the + header is defined to be a list you should use res.hdr_cnt(). + + It may be used in tcp-check based expect rules. + +res.hdr([<name>[,<occ>]]) : string +shdr([<name>[,<occ>]]) : string (deprecated) + This fetch works like the req.hdr() fetch with the difference that it acts + on the headers within an HTTP response. + + Like req.hdr() the res.hdr() fetch considers the comma to be a delimiter. If + this is not desired res.fhdr() should be used. + + It may be used in tcp-check based expect rules. + + ACL derivatives : + res.hdr([<name>[,<occ>]]) : exact string match + res.hdr_beg([<name>[,<occ>]]) : prefix match + res.hdr_dir([<name>[,<occ>]]) : subdir match + res.hdr_dom([<name>[,<occ>]]) : domain match + res.hdr_end([<name>[,<occ>]]) : suffix match + res.hdr_len([<name>[,<occ>]]) : length match + res.hdr_reg([<name>[,<occ>]]) : regex match + res.hdr_sub([<name>[,<occ>]]) : substring match + +res.hdr_cnt([<name>]) : integer +shdr_cnt([<name>]) : integer (deprecated) + This fetch works like the req.hdr_cnt() fetch with the difference that it + acts on the headers within an HTTP response. + + Like req.hdr_cnt() the res.hdr_cnt() fetch considers the comma to be a + delimiter. If this is not desired res.fhdr_cnt() should be used. + + It may be used in tcp-check based expect rules. + +res.hdr_ip([<name>[,<occ>]]) : ip +shdr_ip([<name>[,<occ>]]) : ip (deprecated) + This fetch works like the req.hdr_ip() fetch with the difference that it + acts on the headers within an HTTP response. + + This can be useful to learn some data into a stick table. + + It may be used in tcp-check based expect rules. + +res.hdr_names([<delim>]) : string + This builds a string made from the concatenation of all header names as they + appear in the response when the rule is evaluated. The default delimiter is + the comma (',') but it may be overridden as an optional argument <delim>. In + this case, only the first character of <delim> is considered. + + It may be used in tcp-check based expect rules. + +res.hdr_val([<name>[,<occ>]]) : integer +shdr_val([<name>[,<occ>]]) : integer (deprecated) + This fetch works like the req.hdr_val() fetch with the difference that it + acts on the headers within an HTTP response. + + This can be useful to learn some data into a stick table. + + It may be used in tcp-check based expect rules. + +res.hdrs : string + Returns the current response headers as string including the last empty line + separating headers from the request body. The last empty line can be used to + detect a truncated header block. This sample fetch is useful for some SPOE + headers analyzers and for advanced logging. + + It may also be used in tcp-check based expect rules. + +res.hdrs_bin : binary + Returns the current response headers contained in preparsed binary form. This + is useful for offloading some processing with SPOE. It may be used in + tcp-check based expect rules. Each string is described by a length followed + by the number of bytes indicated in the length. The length is represented + using the variable integer encoding detailed in the SPOE documentation. The + end of the list is marked by a couple of empty header names and values + (length of 0 for both). + + *(<str:header-name><str:header-value>)<empty string><empty string> + + int: refer to the SPOE documentation for the encoding + str: <int:length><bytes> + +res.ver : string +resp_ver : string (deprecated) + Returns the version string from the HTTP response, for example "1.1". This + can be useful for logs, but is mostly there for ACL. + + It may be used in tcp-check based expect rules. + + ACL derivatives : + resp.ver : exact string match + +set-cookie([<name>]) : string (deprecated) + This extracts the last occurrence of the cookie name <name> on a "Set-Cookie" + header line from the response and uses the corresponding value to match. This + can be comparable to what "appsession" did with default options, but with + support for multi-peer synchronization and state keeping across restarts. + + This fetch function is deprecated and has been superseded by the "res.cook" + fetch. This keyword will disappear soon. + +status : integer + Returns an integer containing the HTTP status code in the HTTP response, for + example, 302. It is mostly used within ACLs and integer ranges, for example, + to remove any Location header if the response is not a 3xx. + + It may be used in tcp-check based expect rules. + +unique-id : string + Returns the unique-id attached to the request. The directive + "unique-id-format" must be set. If it is not set, the unique-id sample fetch + fails. Note that the unique-id is usually used with HTTP requests, however this + sample fetch can be used with other protocols. Obviously, if it is used with + other protocols than HTTP, the unique-id-format directive must not contain + HTTP parts. See: unique-id-format and unique-id-header + +url : string + This extracts the request's URL as presented in the request. A typical use is + with prefetch-capable caches, and with portals which need to aggregate + multiple information from databases and keep them in caches. With ACLs, using + "path" is preferred over using "url", because clients may send a full URL as + is normally done with proxies. The only real use is to match "*" which does + not match in "path", and for which there is already a predefined ACL. See + also "path" and "base". + + ACL derivatives : + url : exact string match + url_beg : prefix match + url_dir : subdir match + url_dom : domain match + url_end : suffix match + url_len : length match + url_reg : regex match + url_sub : substring match + +url_ip : ip + This extracts the IP address from the request's URL when the host part is + presented as an IP address. Its use is very limited. For instance, a + monitoring system might use this field as an alternative for the source IP in + order to test what path a given source address would follow, or to force an + entry in a table for a given source address. It may be used in combination + with 'http-request set-dst' to emulate the older 'option http_proxy'. + +url_port : integer + This extracts the port part from the request's URL. Note that if the port is + not specified in the request, port 80 is assumed.. + +urlp([<name>[,<delim>]]) : string +url_param([<name>[,<delim>]]) : string + This extracts the first occurrence of the parameter <name> in the query + string, which begins after either '?' or <delim>, and which ends before '&', + ';' or <delim>. The parameter name is case-sensitive. If no name is given, + any parameter will match, and the first one will be returned. The result is + a string corresponding to the value of the parameter <name> as presented in + the request (no URL decoding is performed). This can be used for session + stickiness based on a client ID, to extract an application cookie passed as a + URL parameter, or in ACLs to apply some checks. Note that the ACL version of + this fetch iterates over multiple parameters and will iteratively report all + parameters values if no name is given + + ACL derivatives : + urlp(<name>[,<delim>]) : exact string match + urlp_beg(<name>[,<delim>]) : prefix match + urlp_dir(<name>[,<delim>]) : subdir match + urlp_dom(<name>[,<delim>]) : domain match + urlp_end(<name>[,<delim>]) : suffix match + urlp_len(<name>[,<delim>]) : length match + urlp_reg(<name>[,<delim>]) : regex match + urlp_sub(<name>[,<delim>]) : substring match + + + Example : + # match http://example.com/foo?PHPSESSIONID=some_id + stick on urlp(PHPSESSIONID) + # match http://example.com/foo;JSESSIONID=some_id + stick on urlp(JSESSIONID,;) + +urlp_val([<name>[,<delim>]]) : integer + See "urlp" above. This one extracts the URL parameter <name> in the request + and converts it to an integer value. This can be used for session stickiness + based on a user ID for example, or with ACLs to match a page number or price. + +url32 : integer + This returns a 32-bit hash of the value obtained by concatenating the first + Host header and the whole URL including parameters (not only the path part of + the request, as in the "base32" fetch above). This is useful to track per-URL + activity. A shorter hash is stored, saving a lot of memory. The output type + is an unsigned integer. + +url32+src : binary + This returns the concatenation of the "url32" fetch and the "src" fetch. The + resulting type is of type binary, with a size of 8 or 20 bytes depending on + the source address family. This can be used to track per-IP, per-URL counters. + + +7.3.7. Fetching samples for developers +--------------------------------------- + +This set of sample fetch methods is reserved to developers and must never be +used on a production environment, except on developer demand, for debugging +purposes. Moreover, no special care will be taken on backwards compatibility. +There is no warranty the following sample fetches will never change, be renamed +or simply removed. So be really careful if you should use one of them. To avoid +any ambiguity, these sample fetches are placed in the dedicated scope "internal", +for instance "internal.strm.is_htx". + +internal.htx.data : integer + Returns the size in bytes used by data in the HTX message associated to a + channel. The channel is chosen depending on the sample direction. + +internal.htx.free : integer + Returns the free space (size - used) in bytes in the HTX message associated + to a channel. The channel is chosen depending on the sample direction. + +internal.htx.free_data : integer + Returns the free space for the data in bytes in the HTX message associated to + a channel. The channel is chosen depending on the sample direction. + +internal.htx.has_eom : boolean + Returns true if the HTX message associated to a channel contains the + end-of-message flag (EOM). Otherwise, it returns false. The channel is chosen + depending on the sample direction. + +internal.htx.nbblks : integer + Returns the number of blocks present in the HTX message associated to a + channel. The channel is chosen depending on the sample direction. + +internal.htx.size : integer + Returns the total size in bytes of the HTX message associated to a + channel. The channel is chosen depending on the sample direction. + +internal.htx.used : integer + Returns the total size used in bytes (data + metadata) in the HTX message + associated to a channel. The channel is chosen depending on the sample + direction. + +internal.htx_blk.size(<idx>) : integer + Returns the size of the block at the position <idx> in the HTX message + associated to a channel or 0 if it does not exist. The channel is chosen + depending on the sample direction. <idx> may be any positive integer or one + of the special value : + * head : The oldest inserted block + * tail : The newest inserted block + * first : The first block where to (re)start the analysis + +internal.htx_blk.type(<idx>) : string + Returns the type of the block at the position <idx> in the HTX message + associated to a channel or "HTX_BLK_UNUSED" if it does not exist. The channel + is chosen depending on the sample direction. <idx> may be any positive + integer or one of the special value : + * head : The oldest inserted block + * tail : The newest inserted block + * first : The first block where to (re)start the analysis + +internal.htx_blk.data(<idx>) : binary + Returns the value of the DATA block at the position <idx> in the HTX message + associated to a channel or an empty string if it does not exist or if it is + not a DATA block. The channel is chosen depending on the sample direction. + <idx> may be any positive integer or one of the special value : + + * head : The oldest inserted block + * tail : The newest inserted block + * first : The first block where to (re)start the analysis + +internal.htx_blk.hdrname(<idx>) : string + Returns the header name of the HEADER block at the position <idx> in the HTX + message associated to a channel or an empty string if it does not exist or if + it is not an HEADER block. The channel is chosen depending on the sample + direction. <idx> may be any positive integer or one of the special value : + + * head : The oldest inserted block + * tail : The newest inserted block + * first : The first block where to (re)start the analysis + +internal.htx_blk.hdrval(<idx>) : string + Returns the header value of the HEADER block at the position <idx> in the HTX + message associated to a channel or an empty string if it does not exist or if + it is not an HEADER block. The channel is chosen depending on the sample + direction. <idx> may be any positive integer or one of the special value : + + * head : The oldest inserted block + * tail : The newest inserted block + * first : The first block where to (re)start the analysis + +internal.htx_blk.start_line(<idx>) : string + Returns the value of the REQ_SL or RES_SL block at the position <idx> in the + HTX message associated to a channel or an empty string if it does not exist + or if it is not a SL block. The channel is chosen depending on the sample + direction. <idx> may be any positive integer or one of the special value : + + * head : The oldest inserted block + * tail : The newest inserted block + * first : The first block where to (re)start the analysis + +internal.strm.is_htx : boolean + Returns true if the current stream is an HTX stream. It means the data in the + channels buffers are stored using the internal HTX representation. Otherwise, + it returns false. + + +7.4. Pre-defined ACLs +--------------------- + +Some predefined ACLs are hard-coded so that they do not have to be declared in +every frontend which needs them. They all have their names in upper case in +order to avoid confusion. Their equivalence is provided below. + +ACL name Equivalent to Usage +---------------+----------------------------------+------------------------------------------------------ +FALSE always_false never match +HTTP req.proto_http match if request protocol is valid HTTP +HTTP_1.0 req.ver 1.0 match if HTTP request version is 1.0 +HTTP_1.1 req.ver 1.1 match if HTTP request version is 1.1 +HTTP_2.0 req.ver 2.0 match if HTTP request version is 2.0 +HTTP_CONTENT req.hdr_val(content-length) gt 0 match an existing content-length in the HTTP request +HTTP_URL_ABS url_reg ^[^/:]*:// match absolute URL with scheme +HTTP_URL_SLASH url_beg / match URL beginning with "/" +HTTP_URL_STAR url * match URL equal to "*" +LOCALHOST src 127.0.0.1/8 ::1 match connection from local host +METH_CONNECT method CONNECT match HTTP CONNECT method +METH_DELETE method DELETE match HTTP DELETE method +METH_GET method GET HEAD match HTTP GET or HEAD method +METH_HEAD method HEAD match HTTP HEAD method +METH_OPTIONS method OPTIONS match HTTP OPTIONS method +METH_POST method POST match HTTP POST method +METH_PUT method PUT match HTTP PUT method +METH_TRACE method TRACE match HTTP TRACE method +RDP_COOKIE req.rdp_cookie_cnt gt 0 match presence of an RDP cookie in the request buffer +REQ_CONTENT req.len gt 0 match data in the request buffer +TRUE always_true always match +WAIT_END wait_end wait for end of content analysis +---------------+----------------------------------+------------------------------------------------------ + + +8. Logging +---------- + +One of HAProxy's strong points certainly lies is its precise logs. It probably +provides the finest level of information available for such a product, which is +very important for troubleshooting complex environments. Standard information +provided in logs include client ports, TCP/HTTP state timers, precise session +state at termination and precise termination cause, information about decisions +to direct traffic to a server, and of course the ability to capture arbitrary +headers. + +In order to improve administrators reactivity, it offers a great transparency +about encountered problems, both internal and external, and it is possible to +send logs to different sources at the same time with different level filters : + + - global process-level logs (system errors, start/stop, etc..) + - per-instance system and internal errors (lack of resource, bugs, ...) + - per-instance external troubles (servers up/down, max connections) + - per-instance activity (client connections), either at the establishment or + at the termination. + - per-request control of log-level, e.g. + http-request set-log-level silent if sensitive_request + +The ability to distribute different levels of logs to different log servers +allow several production teams to interact and to fix their problems as soon +as possible. For example, the system team might monitor system-wide errors, +while the application team might be monitoring the up/down for their servers in +real time, and the security team might analyze the activity logs with one hour +delay. + + +8.1. Log levels +--------------- + +TCP and HTTP connections can be logged with information such as the date, time, +source IP address, destination address, connection duration, response times, +HTTP request, HTTP return code, number of bytes transmitted, conditions +in which the session ended, and even exchanged cookies values. For example +track a particular user's problems. All messages may be sent to up to two +syslog servers. Check the "log" keyword in section 4.2 for more information +about log facilities. + + +8.2. Log formats +---------------- + +HAProxy supports 5 log formats. Several fields are common between these formats +and will be detailed in the following sections. A few of them may vary +slightly with the configuration, due to indicators specific to certain +options. The supported formats are as follows : + + - the default format, which is very basic and very rarely used. It only + provides very basic information about the incoming connection at the moment + it is accepted : source IP:port, destination IP:port, and frontend-name. + This mode will eventually disappear so it will not be described to great + extents. + + - the TCP format, which is more advanced. This format is enabled when "option + tcplog" is set on the frontend. HAProxy will then usually wait for the + connection to terminate before logging. This format provides much richer + information, such as timers, connection counts, queue size, etc... This + format is recommended for pure TCP proxies. + + - the HTTP format, which is the most advanced for HTTP proxying. This format + is enabled when "option httplog" is set on the frontend. It provides the + same information as the TCP format with some HTTP-specific fields such as + the request, the status code, and captures of headers and cookies. This + format is recommended for HTTP proxies. + + - the CLF HTTP format, which is equivalent to the HTTP format, but with the + fields arranged in the same order as the CLF format. In this mode, all + timers, captures, flags, etc... appear one per field after the end of the + common fields, in the same order they appear in the standard HTTP format. + + - the custom log format, allows you to make your own log line. + +Next sections will go deeper into details for each of these formats. Format +specification will be performed on a "field" basis. Unless stated otherwise, a +field is a portion of text delimited by any number of spaces. Since syslog +servers are susceptible of inserting fields at the beginning of a line, it is +always assumed that the first field is the one containing the process name and +identifier. + +Note : Since log lines may be quite long, the log examples in sections below + might be broken into multiple lines. The example log lines will be + prefixed with 3 closing angle brackets ('>>>') and each time a log is + broken into multiple lines, each non-final line will end with a + backslash ('\') and the next line will start indented by two characters. + + +8.2.1. Default log format +------------------------- + +This format is used when no specific option is set. The log is emitted as soon +as the connection is accepted. One should note that this currently is the only +format which logs the request's destination IP and ports. + + Example : + listen www + mode http + log global + server srv1 127.0.0.1:8000 + + >>> Feb 6 12:12:09 localhost \ + haproxy[14385]: Connect from 10.0.1.2:33312 to 10.0.3.31:8012 \ + (www/HTTP) + + Field Format Extract from the example above + 1 process_name '[' pid ']:' haproxy[14385]: + 2 'Connect from' Connect from + 3 source_ip ':' source_port 10.0.1.2:33312 + 4 'to' to + 5 destination_ip ':' destination_port 10.0.3.31:8012 + 6 '(' frontend_name '/' mode ')' (www/HTTP) + +Detailed fields description : + - "source_ip" is the IP address of the client which initiated the connection. + - "source_port" is the TCP port of the client which initiated the connection. + - "destination_ip" is the IP address the client connected to. + - "destination_port" is the TCP port the client connected to. + - "frontend_name" is the name of the frontend (or listener) which received + and processed the connection. + - "mode is the mode the frontend is operating (TCP or HTTP). + +In case of a UNIX socket, the source and destination addresses are marked as +"unix:" and the ports reflect the internal ID of the socket which accepted the +connection (the same ID as reported in the stats). + +It is advised not to use this deprecated format for newer installations as it +will eventually disappear. + + +8.2.2. TCP log format +--------------------- + +The TCP format is used when "option tcplog" is specified in the frontend, and +is the recommended format for pure TCP proxies. It provides a lot of precious +information for troubleshooting. Since this format includes timers and byte +counts, the log is normally emitted at the end of the session. It can be +emitted earlier if "option logasap" is specified, which makes sense in most +environments with long sessions such as remote terminals. Sessions which match +the "monitor" rules are never logged. It is also possible not to emit logs for +sessions for which no data were exchanged between the client and the server, by +specifying "option dontlognull" in the frontend. Successful connections will +not be logged if "option dontlog-normal" is specified in the frontend. + +The TCP log format is internally declared as a custom log format based on the +exact following string, which may also be used as a basis to extend the format +if required. Refer to section 8.2.6 "Custom log format" to see how to use this: + + # strict equivalent of "option tcplog" + log-format "%ci:%cp [%t] %ft %b/%s %Tw/%Tc/%Tt %B %ts \ + %ac/%fc/%bc/%sc/%rc %sq/%bq" + +A few fields may slightly vary depending on some configuration options, those +are marked with a star ('*') after the field name below. + + Example : + frontend fnt + mode tcp + option tcplog + log global + default_backend bck + + backend bck + server srv1 127.0.0.1:8000 + + >>> Feb 6 12:12:56 localhost \ + haproxy[14387]: 10.0.1.2:33313 [06/Feb/2009:12:12:51.443] fnt \ + bck/srv1 0/0/5007 212 -- 0/0/0/0/3 0/0 + + Field Format Extract from the example above + 1 process_name '[' pid ']:' haproxy[14387]: + 2 client_ip ':' client_port 10.0.1.2:33313 + 3 '[' accept_date ']' [06/Feb/2009:12:12:51.443] + 4 frontend_name fnt + 5 backend_name '/' server_name bck/srv1 + 6 Tw '/' Tc '/' Tt* 0/0/5007 + 7 bytes_read* 212 + 8 termination_state -- + 9 actconn '/' feconn '/' beconn '/' srv_conn '/' retries* 0/0/0/0/3 + 10 srv_queue '/' backend_queue 0/0 + +Detailed fields description : + - "client_ip" is the IP address of the client which initiated the TCP + connection to HAProxy. If the connection was accepted on a UNIX socket + instead, the IP address would be replaced with the word "unix". Note that + when the connection is accepted on a socket configured with "accept-proxy" + and the PROXY protocol is correctly used, or with a "accept-netscaler-cip" + and the NetScaler Client IP insertion protocol is correctly used, then the + logs will reflect the forwarded connection's information. + + - "client_port" is the TCP port of the client which initiated the connection. + If the connection was accepted on a UNIX socket instead, the port would be + replaced with the ID of the accepting socket, which is also reported in the + stats interface. + + - "accept_date" is the exact date when the connection was received by HAProxy + (which might be very slightly different from the date observed on the + network if there was some queuing in the system's backlog). This is usually + the same date which may appear in any upstream firewall's log. When used in + HTTP mode, the accept_date field will be reset to the first moment the + connection is ready to receive a new request (end of previous response for + HTTP/1, immediately after previous request for HTTP/2). + + - "frontend_name" is the name of the frontend (or listener) which received + and processed the connection. + + - "backend_name" is the name of the backend (or listener) which was selected + to manage the connection to the server. This will be the same as the + frontend if no switching rule has been applied, which is common for TCP + applications. + + - "server_name" is the name of the last server to which the connection was + sent, which might differ from the first one if there were connection errors + and a redispatch occurred. Note that this server belongs to the backend + which processed the request. If the connection was aborted before reaching + a server, "<NOSRV>" is indicated instead of a server name. + + - "Tw" is the total time in milliseconds spent waiting in the various queues. + It can be "-1" if the connection was aborted before reaching the queue. + See "Timers" below for more details. + + - "Tc" is the total time in milliseconds spent waiting for the connection to + establish to the final server, including retries. It can be "-1" if the + connection was aborted before a connection could be established. See + "Timers" below for more details. + + - "Tt" is the total time in milliseconds elapsed between the accept and the + last close. It covers all possible processing. There is one exception, if + "option logasap" was specified, then the time counting stops at the moment + the log is emitted. In this case, a '+' sign is prepended before the value, + indicating that the final one will be larger. See "Timers" below for more + details. + + - "bytes_read" is the total number of bytes transmitted from the server to + the client when the log is emitted. If "option logasap" is specified, the + this value will be prefixed with a '+' sign indicating that the final one + may be larger. Please note that this value is a 64-bit counter, so log + analysis tools must be able to handle it without overflowing. + + - "termination_state" is the condition the session was in when the session + ended. This indicates the session state, which side caused the end of + session to happen, and for what reason (timeout, error, ...). The normal + flags should be "--", indicating the session was closed by either end with + no data remaining in buffers. See below "Session state at disconnection" + for more details. + + - "actconn" is the total number of concurrent connections on the process when + the session was logged. It is useful to detect when some per-process system + limits have been reached. For instance, if actconn is close to 512 when + multiple connection errors occur, chances are high that the system limits + the process to use a maximum of 1024 file descriptors and that all of them + are used. See section 3 "Global parameters" to find how to tune the system. + + - "feconn" is the total number of concurrent connections on the frontend when + the session was logged. It is useful to estimate the amount of resource + required to sustain high loads, and to detect when the frontend's "maxconn" + has been reached. Most often when this value increases by huge jumps, it is + because there is congestion on the backend servers, but sometimes it can be + caused by a denial of service attack. + + - "beconn" is the total number of concurrent connections handled by the + backend when the session was logged. It includes the total number of + concurrent connections active on servers as well as the number of + connections pending in queues. It is useful to estimate the amount of + additional servers needed to support high loads for a given application. + Most often when this value increases by huge jumps, it is because there is + congestion on the backend servers, but sometimes it can be caused by a + denial of service attack. + + - "srv_conn" is the total number of concurrent connections still active on + the server when the session was logged. It can never exceed the server's + configured "maxconn" parameter. If this value is very often close or equal + to the server's "maxconn", it means that traffic regulation is involved a + lot, meaning that either the server's maxconn value is too low, or that + there aren't enough servers to process the load with an optimal response + time. When only one of the server's "srv_conn" is high, it usually means + that this server has some trouble causing the connections to take longer to + be processed than on other servers. + + - "retries" is the number of connection retries experienced by this session + when trying to connect to the server. It must normally be zero, unless a + server is being stopped at the same moment the connection was attempted. + Frequent retries generally indicate either a network problem between + HAProxy and the server, or a misconfigured system backlog on the server + preventing new connections from being queued. This field may optionally be + prefixed with a '+' sign, indicating that the session has experienced a + redispatch after the maximal retry count has been reached on the initial + server. In this case, the server name appearing in the log is the one the + connection was redispatched to, and not the first one, though both may + sometimes be the same in case of hashing for instance. So as a general rule + of thumb, when a '+' is present in front of the retry count, this count + should not be attributed to the logged server. + + - "srv_queue" is the total number of requests which were processed before + this one in the server queue. It is zero when the request has not gone + through the server queue. It makes it possible to estimate the approximate + server's response time by dividing the time spent in queue by the number of + requests in the queue. It is worth noting that if a session experiences a + redispatch and passes through two server queues, their positions will be + cumulative. A request should not pass through both the server queue and the + backend queue unless a redispatch occurs. + + - "backend_queue" is the total number of requests which were processed before + this one in the backend's global queue. It is zero when the request has not + gone through the global queue. It makes it possible to estimate the average + queue length, which easily translates into a number of missing servers when + divided by a server's "maxconn" parameter. It is worth noting that if a + session experiences a redispatch, it may pass twice in the backend's queue, + and then both positions will be cumulative. A request should not pass + through both the server queue and the backend queue unless a redispatch + occurs. + + +8.2.3. HTTP log format +---------------------- + +The HTTP format is the most complete and the best suited for HTTP proxies. It +is enabled by when "option httplog" is specified in the frontend. It provides +the same level of information as the TCP format with additional features which +are specific to the HTTP protocol. Just like the TCP format, the log is usually +emitted at the end of the session, unless "option logasap" is specified, which +generally only makes sense for download sites. A session which matches the +"monitor" rules will never logged. It is also possible not to log sessions for +which no data were sent by the client by specifying "option dontlognull" in the +frontend. Successful connections will not be logged if "option dontlog-normal" +is specified in the frontend. + +The HTTP log format is internally declared as a custom log format based on the +exact following string, which may also be used as a basis to extend the format +if required. Refer to section 8.2.6 "Custom log format" to see how to use this: + + # strict equivalent of "option httplog" + log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC \ + %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r" + +And the CLF log format is internally declared as a custom log format based on +this exact string: + + # strict equivalent of "option httplog clf" + log-format "%{+Q}o %{-Q}ci - - [%trg] %r %ST %B \"\" \"\" %cp \ + %ms %ft %b %s %TR %Tw %Tc %Tr %Ta %tsc %ac %fc \ + %bc %sc %rc %sq %bq %CC %CS %hrl %hsl" + +Most fields are shared with the TCP log, some being different. A few fields may +slightly vary depending on some configuration options. Those ones are marked +with a star ('*') after the field name below. + + Example : + frontend http-in + mode http + option httplog + log global + default_backend bck + + backend static + server srv1 127.0.0.1:8000 + + >>> Feb 6 12:14:14 localhost \ + haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] http-in \ + static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 {1wt.eu} \ + {} "GET /index.html HTTP/1.1" + + Field Format Extract from the example above + 1 process_name '[' pid ']:' haproxy[14389]: + 2 client_ip ':' client_port 10.0.1.2:33317 + 3 '[' request_date ']' [06/Feb/2009:12:14:14.655] + 4 frontend_name http-in + 5 backend_name '/' server_name static/srv1 + 6 TR '/' Tw '/' Tc '/' Tr '/' Ta* 10/0/30/69/109 + 7 status_code 200 + 8 bytes_read* 2750 + 9 captured_request_cookie - + 10 captured_response_cookie - + 11 termination_state ---- + 12 actconn '/' feconn '/' beconn '/' srv_conn '/' retries* 1/1/1/1/0 + 13 srv_queue '/' backend_queue 0/0 + 14 '{' captured_request_headers* '}' {haproxy.1wt.eu} + 15 '{' captured_response_headers* '}' {} + 16 '"' http_request '"' "GET /index.html HTTP/1.1" + +Detailed fields description : + - "client_ip" is the IP address of the client which initiated the TCP + connection to HAProxy. If the connection was accepted on a UNIX socket + instead, the IP address would be replaced with the word "unix". Note that + when the connection is accepted on a socket configured with "accept-proxy" + and the PROXY protocol is correctly used, or with a "accept-netscaler-cip" + and the NetScaler Client IP insertion protocol is correctly used, then the + logs will reflect the forwarded connection's information. + + - "client_port" is the TCP port of the client which initiated the connection. + If the connection was accepted on a UNIX socket instead, the port would be + replaced with the ID of the accepting socket, which is also reported in the + stats interface. + + - "request_date" is the exact date when the first byte of the HTTP request + was received by HAProxy (log field %tr). + + - "frontend_name" is the name of the frontend (or listener) which received + and processed the connection. + + - "backend_name" is the name of the backend (or listener) which was selected + to manage the connection to the server. This will be the same as the + frontend if no switching rule has been applied. + + - "server_name" is the name of the last server to which the connection was + sent, which might differ from the first one if there were connection errors + and a redispatch occurred. Note that this server belongs to the backend + which processed the request. If the request was aborted before reaching a + server, "<NOSRV>" is indicated instead of a server name. If the request was + intercepted by the stats subsystem, "<STATS>" is indicated instead. + + - "TR" is the total time in milliseconds spent waiting for a full HTTP + request from the client (not counting body) after the first byte was + received. It can be "-1" if the connection was aborted before a complete + request could be received or a bad request was received. It should + always be very small because a request generally fits in one single packet. + Large times here generally indicate network issues between the client and + HAProxy or requests being typed by hand. See section 8.4 "Timing Events" + for more details. + + - "Tw" is the total time in milliseconds spent waiting in the various queues. + It can be "-1" if the connection was aborted before reaching the queue. + See section 8.4 "Timing Events" for more details. + + - "Tc" is the total time in milliseconds spent waiting for the connection to + establish to the final server, including retries. It can be "-1" if the + request was aborted before a connection could be established. See section + 8.4 "Timing Events" for more details. + + - "Tr" is the total time in milliseconds spent waiting for the server to send + a full HTTP response, not counting data. It can be "-1" if the request was + aborted before a complete response could be received. It generally matches + the server's processing time for the request, though it may be altered by + the amount of data sent by the client to the server. Large times here on + "GET" requests generally indicate an overloaded server. See section 8.4 + "Timing Events" for more details. + + - "Ta" is the time the request remained active in HAProxy, which is the total + time in milliseconds elapsed between the first byte of the request was + received and the last byte of response was sent. It covers all possible + processing except the handshake (see Th) and idle time (see Ti). There is + one exception, if "option logasap" was specified, then the time counting + stops at the moment the log is emitted. In this case, a '+' sign is + prepended before the value, indicating that the final one will be larger. + See section 8.4 "Timing Events" for more details. + + - "status_code" is the HTTP status code returned to the client. This status + is generally set by the server, but it might also be set by HAProxy when + the server cannot be reached or when its response is blocked by HAProxy. + + - "bytes_read" is the total number of bytes transmitted to the client when + the log is emitted. This does include HTTP headers. If "option logasap" is + specified, this value will be prefixed with a '+' sign indicating that + the final one may be larger. Please note that this value is a 64-bit + counter, so log analysis tools must be able to handle it without + overflowing. + + - "captured_request_cookie" is an optional "name=value" entry indicating that + the client had this cookie in the request. The cookie name and its maximum + length are defined by the "capture cookie" statement in the frontend + configuration. The field is a single dash ('-') when the option is not + set. Only one cookie may be captured, it is generally used to track session + ID exchanges between a client and a server to detect session crossing + between clients due to application bugs. For more details, please consult + the section "Capturing HTTP headers and cookies" below. + + - "captured_response_cookie" is an optional "name=value" entry indicating + that the server has returned a cookie with its response. The cookie name + and its maximum length are defined by the "capture cookie" statement in the + frontend configuration. The field is a single dash ('-') when the option is + not set. Only one cookie may be captured, it is generally used to track + session ID exchanges between a client and a server to detect session + crossing between clients due to application bugs. For more details, please + consult the section "Capturing HTTP headers and cookies" below. + + - "termination_state" is the condition the session was in when the session + ended. This indicates the session state, which side caused the end of + session to happen, for what reason (timeout, error, ...), just like in TCP + logs, and information about persistence operations on cookies in the last + two characters. The normal flags should begin with "--", indicating the + session was closed by either end with no data remaining in buffers. See + below "Session state at disconnection" for more details. + + - "actconn" is the total number of concurrent connections on the process when + the session was logged. It is useful to detect when some per-process system + limits have been reached. For instance, if actconn is close to 512 or 1024 + when multiple connection errors occur, chances are high that the system + limits the process to use a maximum of 1024 file descriptors and that all + of them are used. See section 3 "Global parameters" to find how to tune the + system. + + - "feconn" is the total number of concurrent connections on the frontend when + the session was logged. It is useful to estimate the amount of resource + required to sustain high loads, and to detect when the frontend's "maxconn" + has been reached. Most often when this value increases by huge jumps, it is + because there is congestion on the backend servers, but sometimes it can be + caused by a denial of service attack. + + - "beconn" is the total number of concurrent connections handled by the + backend when the session was logged. It includes the total number of + concurrent connections active on servers as well as the number of + connections pending in queues. It is useful to estimate the amount of + additional servers needed to support high loads for a given application. + Most often when this value increases by huge jumps, it is because there is + congestion on the backend servers, but sometimes it can be caused by a + denial of service attack. + + - "srv_conn" is the total number of concurrent connections still active on + the server when the session was logged. It can never exceed the server's + configured "maxconn" parameter. If this value is very often close or equal + to the server's "maxconn", it means that traffic regulation is involved a + lot, meaning that either the server's maxconn value is too low, or that + there aren't enough servers to process the load with an optimal response + time. When only one of the server's "srv_conn" is high, it usually means + that this server has some trouble causing the requests to take longer to be + processed than on other servers. + + - "retries" is the number of connection retries experienced by this session + when trying to connect to the server. It must normally be zero, unless a + server is being stopped at the same moment the connection was attempted. + Frequent retries generally indicate either a network problem between + HAProxy and the server, or a misconfigured system backlog on the server + preventing new connections from being queued. This field may optionally be + prefixed with a '+' sign, indicating that the session has experienced a + redispatch after the maximal retry count has been reached on the initial + server. In this case, the server name appearing in the log is the one the + connection was redispatched to, and not the first one, though both may + sometimes be the same in case of hashing for instance. So as a general rule + of thumb, when a '+' is present in front of the retry count, this count + should not be attributed to the logged server. + + - "srv_queue" is the total number of requests which were processed before + this one in the server queue. It is zero when the request has not gone + through the server queue. It makes it possible to estimate the approximate + server's response time by dividing the time spent in queue by the number of + requests in the queue. It is worth noting that if a session experiences a + redispatch and passes through two server queues, their positions will be + cumulative. A request should not pass through both the server queue and the + backend queue unless a redispatch occurs. + + - "backend_queue" is the total number of requests which were processed before + this one in the backend's global queue. It is zero when the request has not + gone through the global queue. It makes it possible to estimate the average + queue length, which easily translates into a number of missing servers when + divided by a server's "maxconn" parameter. It is worth noting that if a + session experiences a redispatch, it may pass twice in the backend's queue, + and then both positions will be cumulative. A request should not pass + through both the server queue and the backend queue unless a redispatch + occurs. + + - "captured_request_headers" is a list of headers captured in the request due + to the presence of the "capture request header" statement in the frontend. + Multiple headers can be captured, they will be delimited by a vertical bar + ('|'). When no capture is enabled, the braces do not appear, causing a + shift of remaining fields. It is important to note that this field may + contain spaces, and that using it requires a smarter log parser than when + it's not used. Please consult the section "Capturing HTTP headers and + cookies" below for more details. + + - "captured_response_headers" is a list of headers captured in the response + due to the presence of the "capture response header" statement in the + frontend. Multiple headers can be captured, they will be delimited by a + vertical bar ('|'). When no capture is enabled, the braces do not appear, + causing a shift of remaining fields. It is important to note that this + field may contain spaces, and that using it requires a smarter log parser + than when it's not used. Please consult the section "Capturing HTTP headers + and cookies" below for more details. + + - "http_request" is the complete HTTP request line, including the method, + request and HTTP version string. Non-printable characters are encoded (see + below the section "Non-printable characters"). This is always the last + field, and it is always delimited by quotes and is the only one which can + contain quotes. If new fields are added to the log format, they will be + added before this field. This field might be truncated if the request is + huge and does not fit in the standard syslog buffer (1024 characters). This + is the reason why this field must always remain the last one. + + +8.2.4. HTTPS log format +---------------------- + +The HTTPS format is the best suited for HTTP over SSL connections. It is an +extension of the HTTP format (see section 8.2.3) to which SSL related +information are added. It is enabled when "option httpslog" is specified in the +frontend. Just like the TCP and HTTP formats, the log is usually emitted at the +end of the session, unless "option logasap" is specified. A session which +matches the "monitor" rules will never logged. It is also possible not to log +sessions for which no data were sent by the client by specifying "option +dontlognull" in the frontend. Successful connections will not be logged if +"option dontlog-normal" is specified in the frontend. + +The HTTPS log format is internally declared as a custom log format based on the +exact following string, which may also be used as a basis to extend the format +if required. Refer to section 8.2.6 "Custom log format" to see how to use this: + + # strict equivalent of "option httpslog" + log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC \ + %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r \ + %[fc_err]/%[ssl_fc_err,hex]/%[ssl_c_err]/\ + %[ssl_c_ca_err]/%[ssl_fc_is_resumed] %[ssl_fc_sni]/%sslv/%sslc" + +This format is basically the HTTP one (see section 8.2.3) with new fields +appended to it. The new fields (lines 17 and 18) will be detailed here. For the +HTTP ones, refer to the HTTP section. + + Example : + frontend https-in + mode http + option httpslog + log global + bind *:443 ssl crt mycerts/srv.pem ... + default_backend bck + + backend static + server srv1 127.0.0.1:8000 ssl crt mycerts/clt.pem ... + + >>> Feb 6 12:14:14 localhost \ + haproxy[14389]: 10.0.1.2:33317 [06/Feb/2009:12:14:14.655] https-in \ + static/srv1 10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 {1wt.eu} \ + {} "GET /index.html HTTP/1.1" 0/0/0/0/0 \ + 1wt.eu/TLSv1.3/TLS_AES_256_GCM_SHA384 + + Field Format Extract from the example above + 1 process_name '[' pid ']:' haproxy[14389]: + 2 client_ip ':' client_port 10.0.1.2:33317 + 3 '[' request_date ']' [06/Feb/2009:12:14:14.655] + 4 frontend_name https-in + 5 backend_name '/' server_name static/srv1 + 6 TR '/' Tw '/' Tc '/' Tr '/' Ta* 10/0/30/69/109 + 7 status_code 200 + 8 bytes_read* 2750 + 9 captured_request_cookie - + 10 captured_response_cookie - + 11 termination_state ---- + 12 actconn '/' feconn '/' beconn '/' srv_conn '/' retries* 1/1/1/1/0 + 13 srv_queue '/' backend_queue 0/0 + 14 '{' captured_request_headers* '}' {haproxy.1wt.eu} + 15 '{' captured_response_headers* '}' {} + 16 '"' http_request '"' "GET /index.html HTTP/1.1" + 17 fc_err '/' ssl_fc_err '/' ssl_c_err + '/' ssl_c_ca_err '/' ssl_fc_is_resumed 0/0/0/0/0 + 18 ssl_fc_sni '/' ssl_version + '/' ssl_ciphers 1wt.eu/TLSv1.3/TLS_AES_256_GCM_SHA384 + +Detailed fields description : + - "fc_err" is the status of the connection on the frontend's side. It + corresponds to the "fc_err" sample fetch. See the "fc_err" and "fc_err_str" + sample fetch functions for more information. + + - "ssl_fc_err" is the last error of the first SSL error stack that was + raised on the connection from the frontend's perspective. It might be used + to detect SSL handshake errors for instance. It will be 0 if everything + went well. See the "ssl_fc_err" sample fetch's description for more + information. + + - "ssl_c_err" is the status of the client's certificate verification process. + The handshake might be successful while having a non-null verification + error code if it is an ignored one. See the "ssl_c_err" sample fetch and + the "crt-ignore-err" option. + + - "ssl_c_ca_err" is the status of the client's certificate chain verification + process. The handshake might be successful while having a non-null + verification error code if it is an ignored one. See the "ssl_c_ca_err" + sample fetch and the "ca-ignore-err" option. + + - "ssl_fc_is_resumed" is true if the incoming TLS session was resumed with + the stateful cache or a stateless ticket. Don't forgot that a TLS session + can be shared by multiple requests. + + - "ssl_fc_sni" is the SNI (Server Name Indication) presented by the client + to select the certificate to be used. It usually matches the host name for + the first request of a connection. An absence of this field may indicate + that the SNI was not sent by the client, and will lead haproxy to use the + default certificate, or to reject the connection in case of strict-sni. + + - "ssl_version" is the SSL version of the frontend. + + - "ssl_ciphers" is the SSL cipher used for the connection. + + +8.2.5. Error log format +----------------------- + +When an incoming connection fails due to an SSL handshake or an invalid PROXY +protocol header, HAProxy will log the event using a shorter, fixed line format, +unless a dedicated error log format is defined through an "error-log-format" +line. By default, logs are emitted at the LOG_INFO level, unless the option +"log-separate-errors" is set in the backend, in which case the LOG_ERR level +will be used. Connections on which no data are exchanged (e.g. probes) are not +logged if the "dontlognull" option is set. + +The default format looks like this : + + >>> Dec 3 18:27:14 localhost \ + haproxy[6103]: 127.0.0.1:56059 [03/Dec/2012:17:35:10.380] frt/f1: \ + Connection error during SSL handshake + + Field Format Extract from the example above + 1 process_name '[' pid ']:' haproxy[6103]: + 2 client_ip ':' client_port 127.0.0.1:56059 + 3 '[' accept_date ']' [03/Dec/2012:17:35:10.380] + 4 frontend_name "/" bind_name ":" frt/f1: + 5 message Connection error during SSL handshake + +These fields just provide minimal information to help debugging connection +failures. + +By using the "error-log-format" directive, the legacy log format described +above will not be used anymore, and all error log lines will follow the +defined format. + +An example of reasonably complete error-log-format follows, it will report the +source address and port, the connection accept() date, the frontend name, the +number of active connections on the process and on thit frontend, haproxy's +internal error identifier on the front connection, the hexadecimal OpenSSL +error number (that can be copy-pasted to "openssl errstr" for full decoding), +the client certificate extraction status (0 indicates no error), the client +certificate validation status using the CA (0 indicates no error), a boolean +indicating if the connection is new or was resumed, the optional server name +indication (SNI) provided by the client, the SSL version name and the SSL +ciphers used on the connection, if any. Note that backend connection errors +are never reported here since in order for a backend connection to fail, it +would have passed through a successful stream, hence will be available as +regular traffic log (see option httplog or option httpslog). + + # detailed frontend connection error log + error-log-format "%ci:%cp [%tr] %ft %ac/%fc %[fc_err]/\ + %[ssl_fc_err,hex]/%[ssl_c_err]/%[ssl_c_ca_err]/%[ssl_fc_is_resumed] \ + %[ssl_fc_sni]/%sslv/%sslc" + + +8.2.6. Custom log format +------------------------ + +When the default log formats are not sufficient, it is possible to define new +ones in very fine details. As creating a log-format from scratch is not always +a trivial task, it is strongly recommended to first have a look at the existing +formats ("option tcplog", "option httplog", "option httpslog"), pick the one +looking the closest to the expectation, copy its "log-format" equivalent string +and adjust it. + +HAProxy understands some log format variables. % precedes log format variables. +Variables can take arguments using braces ('{}'), and multiple arguments are +separated by commas within the braces. Flags may be added or removed by +prefixing them with a '+' or '-' sign. + +Special variable "%o" may be used to propagate its flags to all other +variables on the same format string. This is particularly handy with quoted +("Q") and escaped ("E") string formats. + +If a variable is named between square brackets ('[' .. ']') then it is used +as a sample expression rule (see section 7.3). This it useful to add some +less common information such as the client's SSL certificate's DN, or to log +the key that would be used to store an entry into a stick table. + +Note: spaces must be escaped. In configuration directives "log-format", +"log-format-sd" and "unique-id-format", spaces are considered as +delimiters and are merged. In order to emit a verbatim '%', it must be +preceded by another '%' resulting in '%%'. + +Note: when using the RFC5424 syslog message format, the characters '"', +'\' and ']' inside PARAM-VALUE should be escaped with '\' as prefix (see +https://tools.ietf.org/html/rfc5424#section-6.3.3 for more details). In +such cases, the use of the flag "E" should be considered. + +Flags are : + * Q: quote a string + * X: hexadecimal representation (IPs, Ports, %Ts, %rt, %pid) + * E: escape characters '"', '\' and ']' in a string with '\' as prefix + (intended purpose is for the RFC5424 structured-data log formats) + + Example: + + log-format %T\ %t\ Some\ Text + log-format %{+Q}o\ %t\ %s\ %{-Q}r + + log-format-sd %{+Q,+E}o\ [exampleSDID@1234\ header=%[capture.req.hdr(0)]] + +Please refer to the table below for currently defined variables : + + +---+------+-----------------------------------------------+-------------+ + | R | var | field name (8.2.2 and 8.2.3 for description) | type | + +---+------+-----------------------------------------------+-------------+ + | | %o | special variable, apply flags on all next var | | + +---+------+-----------------------------------------------+-------------+ + | | %B | bytes_read (from server to client) | numeric | + | H | %CC | captured_request_cookie | string | + | H | %CS | captured_response_cookie | string | + | | %H | hostname | string | + | H | %HM | HTTP method (ex: POST) | string | + | H | %HP | HTTP request URI without query string | string | + | H | %HPO | HTTP path only (without host nor query string)| string | + | H | %HQ | HTTP request URI query string (ex: ?bar=baz) | string | + | H | %HU | HTTP request URI (ex: /foo?bar=baz) | string | + | H | %HV | HTTP version (ex: HTTP/1.0) | string | + | | %ID | unique-id | string | + | | %ST | status_code | numeric | + | | %T | gmt_date_time | date | + | H | %Ta | Active time of the request (from TR to end) | numeric | + | | %Tc | Tc | numeric | + | | %Td | Td = Tt - (Tq + Tw + Tc + Tr) | numeric | + | | %Tl | local_date_time | date | + | | %Th | connection handshake time (SSL, PROXY proto) | numeric | + | H | %Ti | idle time before the HTTP request | numeric | + | H | %Tq | Th + Ti + TR | numeric | + | H | %TR | time to receive the full request from 1st byte| numeric | + | H | %Tr | Tr (response time) | numeric | + | | %Ts | timestamp | numeric | + | | %Tt | Tt | numeric | + | | %Tu | Tu | numeric | + | | %Tw | Tw | numeric | + | | %U | bytes_uploaded (from client to server) | numeric | + | | %ac | actconn | numeric | + | | %b | backend_name | string | + | | %bc | beconn (backend concurrent connections) | numeric | + | | %bi | backend_source_ip (connecting address) | IP | + | | %bp | backend_source_port (connecting address) | numeric | + | | %bq | backend_queue | numeric | + | | %ci | client_ip (accepted address) | IP | + | | %cp | client_port (accepted address) | numeric | + | | %f | frontend_name | string | + | | %fc | feconn (frontend concurrent connections) | numeric | + | | %fi | frontend_ip (accepting address) | IP | + | | %fp | frontend_port (accepting address) | numeric | + | | %ft | frontend_name_transport ('~' suffix for SSL) | string | + | | %lc | frontend_log_counter | numeric | + | | %hr | captured_request_headers default style | string | + | | %hrl | captured_request_headers CLF style | string list | + | | %hs | captured_response_headers default style | string | + | | %hsl | captured_response_headers CLF style | string list | + | | %ms | accept date milliseconds (left-padded with 0) | numeric | + | | %pid | PID | numeric | + | H | %r | http_request | string | + | | %rc | retries | numeric | + | | %rt | request_counter (HTTP req or TCP session) | numeric | + | | %s | server_name | string | + | | %sc | srv_conn (server concurrent connections) | numeric | + | | %si | server_IP (target address) | IP | + | | %sp | server_port (target address) | numeric | + | | %sq | srv_queue | numeric | + | S | %sslc| ssl_ciphers (ex: AES-SHA) | string | + | S | %sslv| ssl_version (ex: TLSv1) | string | + | | %t | date_time (with millisecond resolution) | date | + | H | %tr | date_time of HTTP request | date | + | H | %trg | gmt_date_time of start of HTTP request | date | + | H | %trl | local_date_time of start of HTTP request | date | + | | %ts | termination_state | string | + | H | %tsc | termination_state with cookie status | string | + +---+------+-----------------------------------------------+-------------+ + + R = Restrictions : H = mode http only ; S = SSL only + + +8.3. Advanced logging options +----------------------------- + +Some advanced logging options are often looked for but are not easy to find out +just by looking at the various options. Here is an entry point for the few +options which can enable better logging. Please refer to the keywords reference +for more information about their usage. + + +8.3.1. Disabling logging of external tests +------------------------------------------ + +It is quite common to have some monitoring tools perform health checks on +HAProxy. Sometimes it will be a layer 3 load-balancer such as LVS or any +commercial load-balancer, and sometimes it will simply be a more complete +monitoring system such as Nagios. When the tests are very frequent, users often +ask how to disable logging for those checks. There are three possibilities : + + - if connections come from everywhere and are just TCP probes, it is often + desired to simply disable logging of connections without data exchange, by + setting "option dontlognull" in the frontend. It also disables logging of + port scans, which may or may not be desired. + + - it is possible to use the "http-request set-log-level silent" action using + a variety of conditions (source networks, paths, user-agents, etc). + + - if the tests are performed on a known URI, use "monitor-uri" to declare + this URI as dedicated to monitoring. Any host sending this request will + only get the result of a health-check, and the request will not be logged. + + +8.3.2. Logging before waiting for the session to terminate +---------------------------------------------------------- + +The problem with logging at end of connection is that you have no clue about +what is happening during very long sessions, such as remote terminal sessions +or large file downloads. This problem can be worked around by specifying +"option logasap" in the frontend. HAProxy will then log as soon as possible, +just before data transfer begins. This means that in case of TCP, it will still +log the connection status to the server, and in case of HTTP, it will log just +after processing the server headers. In this case, the number of bytes reported +is the number of header bytes sent to the client. In order to avoid confusion +with normal logs, the total time field and the number of bytes are prefixed +with a '+' sign which means that real numbers are certainly larger. + + +8.3.3. Raising log level upon errors +------------------------------------ + +Sometimes it is more convenient to separate normal traffic from errors logs, +for instance in order to ease error monitoring from log files. When the option +"log-separate-errors" is used, connections which experience errors, timeouts, +retries, redispatches or HTTP status codes 5xx will see their syslog level +raised from "info" to "err". This will help a syslog daemon store the log in +a separate file. It is very important to keep the errors in the normal traffic +file too, so that log ordering is not altered. You should also be careful if +you already have configured your syslog daemon to store all logs higher than +"notice" in an "admin" file, because the "err" level is higher than "notice". + + +8.3.4. Disabling logging of successful connections +-------------------------------------------------- + +Although this may sound strange at first, some large sites have to deal with +multiple thousands of logs per second and are experiencing difficulties keeping +them intact for a long time or detecting errors within them. If the option +"dontlog-normal" is set on the frontend, all normal connections will not be +logged. In this regard, a normal connection is defined as one without any +error, timeout, retry nor redispatch. In HTTP, the status code is checked too, +and a response with a status 5xx is not considered normal and will be logged +too. Of course, doing is is really discouraged as it will remove most of the +useful information from the logs. Do this only if you have no other +alternative. + + +8.4. Timing events +------------------ + +Timers provide a great help in troubleshooting network problems. All values are +reported in milliseconds (ms). These timers should be used in conjunction with +the session termination flags. In TCP mode with "option tcplog" set on the +frontend, 3 control points are reported under the form "Tw/Tc/Tt", and in HTTP +mode, 5 control points are reported under the form "TR/Tw/Tc/Tr/Ta". In +addition, three other measures are provided, "Th", "Ti", and "Tq". + +Timings events in HTTP mode: + + first request 2nd request + |<-------------------------------->|<-------------- ... + t tr t tr ... + ---|----|----|----|----|----|----|----|----|-- + : Th Ti TR Tw Tc Tr Td : Ti ... + :<---- Tq ---->: : + :<-------------- Tt -------------->: + :<-- -----Tu--------------->: + :<--------- Ta --------->: + +Timings events in TCP mode: + + TCP session + |<----------------->| + t t + ---|----|----|----|----|--- + | Th Tw Tc Td | + |<------ Tt ------->| + + - Th: total time to accept tcp connection and execute handshakes for low level + protocols. Currently, these protocols are proxy-protocol and SSL. This may + only happen once during the whole connection's lifetime. A large time here + may indicate that the client only pre-established the connection without + speaking, that it is experiencing network issues preventing it from + completing a handshake in a reasonable time (e.g. MTU issues), or that an + SSL handshake was very expensive to compute. Please note that this time is + reported only before the first request, so it is safe to average it over + all request to calculate the amortized value. The second and subsequent + request will always report zero here. + + - Ti: is the idle time before the HTTP request (HTTP mode only). This timer + counts between the end of the handshakes and the first byte of the HTTP + request. When dealing with a second request in keep-alive mode, it starts + to count after the end of the transmission the previous response. When a + multiplexed protocol such as HTTP/2 is used, it starts to count immediately + after the previous request. Some browsers pre-establish connections to a + server in order to reduce the latency of a future request, and keep them + pending until they need it. This delay will be reported as the idle time. A + value of -1 indicates that nothing was received on the connection. + + - TR: total time to get the client request (HTTP mode only). It's the time + elapsed between the first bytes received and the moment the proxy received + the empty line marking the end of the HTTP headers. The value "-1" + indicates that the end of headers has never been seen. This happens when + the client closes prematurely or times out. This time is usually very short + since most requests fit in a single packet. A large time may indicate a + request typed by hand during a test. + + - Tq: total time to get the client request from the accept date or since the + emission of the last byte of the previous response (HTTP mode only). It's + exactly equal to Th + Ti + TR unless any of them is -1, in which case it + returns -1 as well. This timer used to be very useful before the arrival of + HTTP keep-alive and browsers' pre-connect feature. It's recommended to drop + it in favor of TR nowadays, as the idle time adds a lot of noise to the + reports. + + - Tw: total time spent in the queues waiting for a connection slot. It + accounts for backend queue as well as the server queues, and depends on the + queue size, and the time needed for the server to complete previous + requests. The value "-1" means that the request was killed before reaching + the queue, which is generally what happens with invalid or denied requests. + + - Tc: total time to establish the TCP connection to the server. It's the time + elapsed between the moment the proxy sent the connection request, and the + moment it was acknowledged by the server, or between the TCP SYN packet and + the matching SYN/ACK packet in return. The value "-1" means that the + connection never established. + + - Tr: server response time (HTTP mode only). It's the time elapsed between + the moment the TCP connection was established to the server and the moment + the server sent its complete response headers. It purely shows its request + processing time, without the network overhead due to the data transmission. + It is worth noting that when the client has data to send to the server, for + instance during a POST request, the time already runs, and this can distort + apparent response time. For this reason, it's generally wise not to trust + too much this field for POST requests initiated from clients behind an + untrusted network. A value of "-1" here means that the last the response + header (empty line) was never seen, most likely because the server timeout + stroke before the server managed to process the request. + + - Ta: total active time for the HTTP request, between the moment the proxy + received the first byte of the request header and the emission of the last + byte of the response body. The exception is when the "logasap" option is + specified. In this case, it only equals (TR+Tw+Tc+Tr), and is prefixed with + a '+' sign. From this field, we can deduce "Td", the data transmission time, + by subtracting other timers when valid : + + Td = Ta - (TR + Tw + Tc + Tr) + + Timers with "-1" values have to be excluded from this equation. Note that + "Ta" can never be negative. + + - Tt: total session duration time, between the moment the proxy accepted it + and the moment both ends were closed. The exception is when the "logasap" + option is specified. In this case, it only equals (Th+Ti+TR+Tw+Tc+Tr), and + is prefixed with a '+' sign. From this field, we can deduce "Td", the data + transmission time, by subtracting other timers when valid : + + Td = Tt - (Th + Ti + TR + Tw + Tc + Tr) + + Timers with "-1" values have to be excluded from this equation. In TCP + mode, "Ti", "Tq" and "Tr" have to be excluded too. Note that "Tt" can never + be negative and that for HTTP, Tt is simply equal to (Th+Ti+Ta). + + - Tu: total estimated time as seen from client, between the moment the proxy + accepted it and the moment both ends were closed, without idle time. + This is useful to roughly measure end-to-end time as a user would see it, + without idle time pollution from keep-alive time between requests. This + timer in only an estimation of time seen by user as it assumes network + latency is the same in both directions. The exception is when the "logasap" + option is specified. In this case, it only equals (Th+TR+Tw+Tc+Tr), and is + prefixed with a '+' sign. + +These timers provide precious indications on trouble causes. Since the TCP +protocol defines retransmit delays of 3, 6, 12... seconds, we know for sure +that timers close to multiples of 3s are nearly always related to lost packets +due to network problems (wires, negotiation, congestion). Moreover, if "Ta" or +"Tt" is close to a timeout value specified in the configuration, it often means +that a session has been aborted on timeout. + +Most common cases : + + - If "Th" or "Ti" are close to 3000, a packet has probably been lost between + the client and the proxy. This is very rare on local networks but might + happen when clients are on far remote networks and send large requests. It + may happen that values larger than usual appear here without any network + cause. Sometimes, during an attack or just after a resource starvation has + ended, HAProxy may accept thousands of connections in a few milliseconds. + The time spent accepting these connections will inevitably slightly delay + processing of other connections, and it can happen that request times in the + order of a few tens of milliseconds are measured after a few thousands of + new connections have been accepted at once. Using one of the keep-alive + modes may display larger idle times since "Ti" measures the time spent + waiting for additional requests. + + - If "Tc" is close to 3000, a packet has probably been lost between the + server and the proxy during the server connection phase. This value should + always be very low, such as 1 ms on local networks and less than a few tens + of ms on remote networks. + + - If "Tr" is nearly always lower than 3000 except some rare values which seem + to be the average majored by 3000, there are probably some packets lost + between the proxy and the server. + + - If "Ta" is large even for small byte counts, it generally is because + neither the client nor the server decides to close the connection while + HAProxy is running in tunnel mode and both have agreed on a keep-alive + connection mode. In order to solve this issue, it will be needed to specify + one of the HTTP options to manipulate keep-alive or close options on either + the frontend or the backend. Having the smallest possible 'Ta' or 'Tt' is + important when connection regulation is used with the "maxconn" option on + the servers, since no new connection will be sent to the server until + another one is released. + +Other noticeable HTTP log cases ('xx' means any value to be ignored) : + + TR/Tw/Tc/Tr/+Ta The "option logasap" is present on the frontend and the log + was emitted before the data phase. All the timers are valid + except "Ta" which is shorter than reality. + + -1/xx/xx/xx/Ta The client was not able to send a complete request in time + or it aborted too early. Check the session termination flags + then "timeout http-request" and "timeout client" settings. + + TR/-1/xx/xx/Ta It was not possible to process the request, maybe because + servers were out of order, because the request was invalid + or forbidden by ACL rules. Check the session termination + flags. + + TR/Tw/-1/xx/Ta The connection could not establish on the server. Either it + actively refused it or it timed out after Ta-(TR+Tw) ms. + Check the session termination flags, then check the + "timeout connect" setting. Note that the tarpit action might + return similar-looking patterns, with "Tw" equal to the time + the client connection was maintained open. + + TR/Tw/Tc/-1/Ta The server has accepted the connection but did not return + a complete response in time, or it closed its connection + unexpectedly after Ta-(TR+Tw+Tc) ms. Check the session + termination flags, then check the "timeout server" setting. + + +8.5. Session state at disconnection +----------------------------------- + +TCP and HTTP logs provide a session termination indicator in the +"termination_state" field, just before the number of active connections. It is +2-characters long in TCP mode, and is extended to 4 characters in HTTP mode, +each of which has a special meaning : + + - On the first character, a code reporting the first event which caused the + session to terminate : + + C : the TCP session was unexpectedly aborted by the client. + + S : the TCP session was unexpectedly aborted by the server, or the + server explicitly refused it. + + P : the session was prematurely aborted by the proxy, because of a + connection limit enforcement, because a DENY filter was matched, + because of a security check which detected and blocked a dangerous + error in server response which might have caused information leak + (e.g. cacheable cookie). + + L : the session was locally processed by HAProxy and was not passed to + a server. This is what happens for stats and redirects. + + R : a resource on the proxy has been exhausted (memory, sockets, source + ports, ...). Usually, this appears during the connection phase, and + system logs should contain a copy of the precise error. If this + happens, it must be considered as a very serious anomaly which + should be fixed as soon as possible by any means. + + I : an internal error was identified by the proxy during a self-check. + This should NEVER happen, and you are encouraged to report any log + containing this, because this would almost certainly be a bug. It + would be wise to preventively restart the process after such an + event too, in case it would be caused by memory corruption. + + D : the session was killed by HAProxy because the server was detected + as down and was configured to kill all connections when going down. + + U : the session was killed by HAProxy on this backup server because an + active server was detected as up and was configured to kill all + backup connections when going up. + + K : the session was actively killed by an admin operating on HAProxy. + + c : the client-side timeout expired while waiting for the client to + send or receive data. + + s : the server-side timeout expired while waiting for the server to + send or receive data. + + - : normal session completion, both the client and the server closed + with nothing left in the buffers. + + - on the second character, the TCP or HTTP session state when it was closed : + + R : the proxy was waiting for a complete, valid REQUEST from the client + (HTTP mode only). Nothing was sent to any server. + + Q : the proxy was waiting in the QUEUE for a connection slot. This can + only happen when servers have a 'maxconn' parameter set. It can + also happen in the global queue after a redispatch consecutive to + a failed attempt to connect to a dying server. If no redispatch is + reported, then no connection attempt was made to any server. + + C : the proxy was waiting for the CONNECTION to establish on the + server. The server might at most have noticed a connection attempt. + + H : the proxy was waiting for complete, valid response HEADERS from the + server (HTTP only). + + D : the session was in the DATA phase. + + L : the proxy was still transmitting LAST data to the client while the + server had already finished. This one is very rare as it can only + happen when the client dies while receiving the last packets. + + T : the request was tarpitted. It has been held open with the client + during the whole "timeout tarpit" duration or until the client + closed, both of which will be reported in the "Tw" timer. + + - : normal session completion after end of data transfer. + + - the third character tells whether the persistence cookie was provided by + the client (only in HTTP mode) : + + N : the client provided NO cookie. This is usually the case for new + visitors, so counting the number of occurrences of this flag in the + logs generally indicate a valid trend for the site frequentation. + + I : the client provided an INVALID cookie matching no known server. + This might be caused by a recent configuration change, mixed + cookies between HTTP/HTTPS sites, persistence conditionally + ignored, or an attack. + + D : the client provided a cookie designating a server which was DOWN, + so either "option persist" was used and the client was sent to + this server, or it was not set and the client was redispatched to + another server. + + V : the client provided a VALID cookie, and was sent to the associated + server. + + E : the client provided a valid cookie, but with a last date which was + older than what is allowed by the "maxidle" cookie parameter, so + the cookie is consider EXPIRED and is ignored. The request will be + redispatched just as if there was no cookie. + + O : the client provided a valid cookie, but with a first date which was + older than what is allowed by the "maxlife" cookie parameter, so + the cookie is consider too OLD and is ignored. The request will be + redispatched just as if there was no cookie. + + U : a cookie was present but was not used to select the server because + some other server selection mechanism was used instead (typically a + "use-server" rule). + + - : does not apply (no cookie set in configuration). + + - the last character reports what operations were performed on the persistence + cookie returned by the server (only in HTTP mode) : + + N : NO cookie was provided by the server, and none was inserted either. + + I : no cookie was provided by the server, and the proxy INSERTED one. + Note that in "cookie insert" mode, if the server provides a cookie, + it will still be overwritten and reported as "I" here. + + U : the proxy UPDATED the last date in the cookie that was presented by + the client. This can only happen in insert mode with "maxidle". It + happens every time there is activity at a different date than the + date indicated in the cookie. If any other change happens, such as + a redispatch, then the cookie will be marked as inserted instead. + + P : a cookie was PROVIDED by the server and transmitted as-is. + + R : the cookie provided by the server was REWRITTEN by the proxy, which + happens in "cookie rewrite" or "cookie prefix" modes. + + D : the cookie provided by the server was DELETED by the proxy. + + - : does not apply (no cookie set in configuration). + +The combination of the two first flags gives a lot of information about what +was happening when the session terminated, and why it did terminate. It can be +helpful to detect server saturation, network troubles, local system resource +starvation, attacks, etc... + +The most common termination flags combinations are indicated below. They are +alphabetically sorted, with the lowercase set just after the upper case for +easier finding and understanding. + + Flags Reason + + -- Normal termination. + + CC The client aborted before the connection could be established to the + server. This can happen when HAProxy tries to connect to a recently + dead (or unchecked) server, and the client aborts while HAProxy is + waiting for the server to respond or for "timeout connect" to expire. + + CD The client unexpectedly aborted during data transfer. This can be + caused by a browser crash, by an intermediate equipment between the + client and HAProxy which decided to actively break the connection, + by network routing issues between the client and HAProxy, or by a + keep-alive session between the server and the client terminated first + by the client. + + cD The client did not send nor acknowledge any data for as long as the + "timeout client" delay. This is often caused by network failures on + the client side, or the client simply leaving the net uncleanly. + + CH The client aborted while waiting for the server to start responding. + It might be the server taking too long to respond or the client + clicking the 'Stop' button too fast. + + cH The "timeout client" stroke while waiting for client data during a + POST request. This is sometimes caused by too large TCP MSS values + for PPPoE networks which cannot transport full-sized packets. It can + also happen when client timeout is smaller than server timeout and + the server takes too long to respond. + + CQ The client aborted while its session was queued, waiting for a server + with enough empty slots to accept it. It might be that either all the + servers were saturated or that the assigned server was taking too + long a time to respond. + + CR The client aborted before sending a full HTTP request. Most likely + the request was typed by hand using a telnet client, and aborted + too early. The HTTP status code is likely a 400 here. Sometimes this + might also be caused by an IDS killing the connection between HAProxy + and the client. "option http-ignore-probes" can be used to ignore + connections without any data transfer. + + cR The "timeout http-request" stroke before the client sent a full HTTP + request. This is sometimes caused by too large TCP MSS values on the + client side for PPPoE networks which cannot transport full-sized + packets, or by clients sending requests by hand and not typing fast + enough, or forgetting to enter the empty line at the end of the + request. The HTTP status code is likely a 408 here. Note: recently, + some browsers started to implement a "pre-connect" feature consisting + in speculatively connecting to some recently visited web sites just + in case the user would like to visit them. This results in many + connections being established to web sites, which end up in 408 + Request Timeout if the timeout strikes first, or 400 Bad Request when + the browser decides to close them first. These ones pollute the log + and feed the error counters. Some versions of some browsers have even + been reported to display the error code. It is possible to work + around the undesirable effects of this behavior by adding "option + http-ignore-probes" in the frontend, resulting in connections with + zero data transfer to be totally ignored. This will definitely hide + the errors of people experiencing connectivity issues though. + + CT The client aborted while its session was tarpitted. It is important to + check if this happens on valid requests, in order to be sure that no + wrong tarpit rules have been written. If a lot of them happen, it + might make sense to lower the "timeout tarpit" value to something + closer to the average reported "Tw" timer, in order not to consume + resources for just a few attackers. + + LR The request was intercepted and locally handled by HAProxy. Generally + it means that this was a redirect or a stats request. + + SC The server or an equipment between it and HAProxy explicitly refused + the TCP connection (the proxy received a TCP RST or an ICMP message + in return). Under some circumstances, it can also be the network + stack telling the proxy that the server is unreachable (e.g. no route, + or no ARP response on local network). When this happens in HTTP mode, + the status code is likely a 502 or 503 here. + + sC The "timeout connect" stroke before a connection to the server could + complete. When this happens in HTTP mode, the status code is likely a + 503 or 504 here. + + SD The connection to the server died with an error during the data + transfer. This usually means that HAProxy has received an RST from + the server or an ICMP message from an intermediate equipment while + exchanging data with the server. This can be caused by a server crash + or by a network issue on an intermediate equipment. + + sD The server did not send nor acknowledge any data for as long as the + "timeout server" setting during the data phase. This is often caused + by too short timeouts on L4 equipment before the server (firewalls, + load-balancers, ...), as well as keep-alive sessions maintained + between the client and the server expiring first on HAProxy. + + SH The server aborted before sending its full HTTP response headers, or + it crashed while processing the request. Since a server aborting at + this moment is very rare, it would be wise to inspect its logs to + control whether it crashed and why. The logged request may indicate a + small set of faulty requests, demonstrating bugs in the application. + Sometimes this might also be caused by an IDS killing the connection + between HAProxy and the server. + + sH The "timeout server" stroke before the server could return its + response headers. This is the most common anomaly, indicating too + long transactions, probably caused by server or database saturation. + The immediate workaround consists in increasing the "timeout server" + setting, but it is important to keep in mind that the user experience + will suffer from these long response times. The only long term + solution is to fix the application. + + sQ The session spent too much time in queue and has been expired. See + the "timeout queue" and "timeout connect" settings to find out how to + fix this if it happens too often. If it often happens massively in + short periods, it may indicate general problems on the affected + servers due to I/O or database congestion, or saturation caused by + external attacks. + + PC The proxy refused to establish a connection to the server because the + process's socket limit has been reached while attempting to connect. + The global "maxconn" parameter may be increased in the configuration + so that it does not happen anymore. This status is very rare and + might happen when the global "ulimit-n" parameter is forced by hand. + + PD The proxy blocked an incorrectly formatted chunked encoded message in + a request or a response, after the server has emitted its headers. In + most cases, this will indicate an invalid message from the server to + the client. HAProxy supports chunk sizes of up to 2GB - 1 (2147483647 + bytes). Any larger size will be considered as an error. + + PH The proxy blocked the server's response, because it was invalid, + incomplete, dangerous (cache control), or matched a security filter. + In any case, an HTTP 502 error is sent to the client. One possible + cause for this error is an invalid syntax in an HTTP header name + containing unauthorized characters. It is also possible but quite + rare, that the proxy blocked a chunked-encoding request from the + client due to an invalid syntax, before the server responded. In this + case, an HTTP 400 error is sent to the client and reported in the + logs. Finally, it may be due to an HTTP header rewrite failure on the + response. In this case, an HTTP 500 error is sent (see + "tune.maxrewrite" and "http-response strict-mode" for more + inforomation). + + PR The proxy blocked the client's HTTP request, either because of an + invalid HTTP syntax, in which case it returned an HTTP 400 error to + the client, or because a deny filter matched, in which case it + returned an HTTP 403 error. It may also be due to an HTTP header + rewrite failure on the request. In this case, an HTTP 500 error is + sent (see "tune.maxrewrite" and "http-request strict-mode" for more + inforomation). + + PT The proxy blocked the client's request and has tarpitted its + connection before returning it a 500 server error. Nothing was sent + to the server. The connection was maintained open for as long as + reported by the "Tw" timer field. + + RC A local resource has been exhausted (memory, sockets, source ports) + preventing the connection to the server from establishing. The error + logs will tell precisely what was missing. This is very rare and can + only be solved by proper system tuning. + +The combination of the two last flags gives a lot of information about how +persistence was handled by the client, the server and by HAProxy. This is very +important to troubleshoot disconnections, when users complain they have to +re-authenticate. The commonly encountered flags are : + + -- Persistence cookie is not enabled. + + NN No cookie was provided by the client, none was inserted in the + response. For instance, this can be in insert mode with "postonly" + set on a GET request. + + II A cookie designating an invalid server was provided by the client, + a valid one was inserted in the response. This typically happens when + a "server" entry is removed from the configuration, since its cookie + value can be presented by a client when no other server knows it. + + NI No cookie was provided by the client, one was inserted in the + response. This typically happens for first requests from every user + in "insert" mode, which makes it an easy way to count real users. + + VN A cookie was provided by the client, none was inserted in the + response. This happens for most responses for which the client has + already got a cookie. + + VU A cookie was provided by the client, with a last visit date which is + not completely up-to-date, so an updated cookie was provided in + response. This can also happen if there was no date at all, or if + there was a date but the "maxidle" parameter was not set, so that the + cookie can be switched to unlimited time. + + EI A cookie was provided by the client, with a last visit date which is + too old for the "maxidle" parameter, so the cookie was ignored and a + new cookie was inserted in the response. + + OI A cookie was provided by the client, with a first visit date which is + too old for the "maxlife" parameter, so the cookie was ignored and a + new cookie was inserted in the response. + + DI The server designated by the cookie was down, a new server was + selected and a new cookie was emitted in the response. + + VI The server designated by the cookie was not marked dead but could not + be reached. A redispatch happened and selected another one, which was + then advertised in the response. + + +8.6. Non-printable characters +----------------------------- + +In order not to cause trouble to log analysis tools or terminals during log +consulting, non-printable characters are not sent as-is into log files, but are +converted to the two-digits hexadecimal representation of their ASCII code, +prefixed by the character '#'. The only characters that can be logged without +being escaped are comprised between 32 and 126 (inclusive). Obviously, the +escape character '#' itself is also encoded to avoid any ambiguity ("#23"). It +is the same for the character '"' which becomes "#22", as well as '{', '|' and +'}' when logging headers. + +Note that the space character (' ') is not encoded in headers, which can cause +issues for tools relying on space count to locate fields. A typical header +containing spaces is "User-Agent". + +Last, it has been observed that some syslog daemons such as syslog-ng escape +the quote ('"') with a backslash ('\'). The reverse operation can safely be +performed since no quote may appear anywhere else in the logs. + + +8.7. Capturing HTTP cookies +--------------------------- + +Cookie capture simplifies the tracking a complete user session. This can be +achieved using the "capture cookie" statement in the frontend. Please refer to +section 4.2 for more details. Only one cookie can be captured, and the same +cookie will simultaneously be checked in the request ("Cookie:" header) and in +the response ("Set-Cookie:" header). The respective values will be reported in +the HTTP logs at the "captured_request_cookie" and "captured_response_cookie" +locations (see section 8.2.3 about HTTP log format). When either cookie is +not seen, a dash ('-') replaces the value. This way, it's easy to detect when a +user switches to a new session for example, because the server will reassign it +a new cookie. It is also possible to detect if a server unexpectedly sets a +wrong cookie to a client, leading to session crossing. + + Examples : + # capture the first cookie whose name starts with "ASPSESSION" + capture cookie ASPSESSION len 32 + + # capture the first cookie whose name is exactly "vgnvisitor" + capture cookie vgnvisitor= len 32 + + +8.8. Capturing HTTP headers +--------------------------- + +Header captures are useful to track unique request identifiers set by an upper +proxy, virtual host names, user-agents, POST content-length, referrers, etc. In +the response, one can search for information about the response length, how the +server asked the cache to behave, or an object location during a redirection. + +Header captures are performed using the "capture request header" and "capture +response header" statements in the frontend. Please consult their definition in +section 4.2 for more details. + +It is possible to include both request headers and response headers at the same +time. Non-existent headers are logged as empty strings, and if one header +appears more than once, only its last occurrence will be logged. Request headers +are grouped within braces '{' and '}' in the same order as they were declared, +and delimited with a vertical bar '|' without any space. Response headers +follow the same representation, but are displayed after a space following the +request headers block. These blocks are displayed just before the HTTP request +in the logs. + +As a special case, it is possible to specify an HTTP header capture in a TCP +frontend. The purpose is to enable logging of headers which will be parsed in +an HTTP backend if the request is then switched to this HTTP backend. + + Example : + # This instance chains to the outgoing proxy + listen proxy-out + mode http + option httplog + option logasap + log global + server cache1 192.168.1.1:3128 + + # log the name of the virtual server + capture request header Host len 20 + + # log the amount of data uploaded during a POST + capture request header Content-Length len 10 + + # log the beginning of the referrer + capture request header Referer len 20 + + # server name (useful for outgoing proxies only) + capture response header Server len 20 + + # logging the content-length is useful with "option logasap" + capture response header Content-Length len 10 + + # log the expected cache behavior on the response + capture response header Cache-Control len 8 + + # the Via header will report the next proxy's name + capture response header Via len 20 + + # log the URL location during a redirection + capture response header Location len 20 + + >>> Aug 9 20:26:09 localhost \ + haproxy[2022]: 127.0.0.1:34014 [09/Aug/2004:20:26:09] proxy-out \ + proxy-out/cache1 0/0/0/162/+162 200 +350 - - ---- 0/0/0/0/0 0/0 \ + {fr.adserver.yahoo.co||http://fr.f416.mail.} {|864|private||} \ + "GET http://fr.adserver.yahoo.com/" + + >>> Aug 9 20:30:46 localhost \ + haproxy[2022]: 127.0.0.1:34020 [09/Aug/2004:20:30:46] proxy-out \ + proxy-out/cache1 0/0/0/182/+182 200 +279 - - ---- 0/0/0/0/0 0/0 \ + {w.ods.org||} {Formilux/0.1.8|3495|||} \ + "GET http://trafic.1wt.eu/ HTTP/1.1" + + >>> Aug 9 20:30:46 localhost \ + haproxy[2022]: 127.0.0.1:34028 [09/Aug/2004:20:30:46] proxy-out \ + proxy-out/cache1 0/0/2/126/+128 301 +223 - - ---- 0/0/0/0/0 0/0 \ + {www.sytadin.equipement.gouv.fr||http://trafic.1wt.eu/} \ + {Apache|230|||http://www.sytadin.} \ + "GET http://www.sytadin.equipement.gouv.fr/ HTTP/1.1" + + +8.9. Examples of logs +--------------------- + +These are real-world examples of logs accompanied with an explanation. Some of +them have been made up by hand. The syslog part has been removed for better +reading. Their sole purpose is to explain how to decipher them. + + >>> haproxy[674]: 127.0.0.1:33318 [15/Oct/2003:08:31:57.130] px-http \ + px-http/srv1 6559/0/7/147/6723 200 243 - - ---- 5/3/3/1/0 0/0 \ + "HEAD / HTTP/1.0" + + => long request (6.5s) entered by hand through 'telnet'. The server replied + in 147 ms, and the session ended normally ('----') + + >>> haproxy[674]: 127.0.0.1:33319 [15/Oct/2003:08:31:57.149] px-http \ + px-http/srv1 6559/1230/7/147/6870 200 243 - - ---- 324/239/239/99/0 \ + 0/9 "HEAD / HTTP/1.0" + + => Idem, but the request was queued in the global queue behind 9 other + requests, and waited there for 1230 ms. + + >>> haproxy[674]: 127.0.0.1:33320 [15/Oct/2003:08:32:17.654] px-http \ + px-http/srv1 9/0/7/14/+30 200 +243 - - ---- 3/3/3/1/0 0/0 \ + "GET /image.iso HTTP/1.0" + + => request for a long data transfer. The "logasap" option was specified, so + the log was produced just before transferring data. The server replied in + 14 ms, 243 bytes of headers were sent to the client, and total time from + accept to first data byte is 30 ms. + + >>> haproxy[674]: 127.0.0.1:33320 [15/Oct/2003:08:32:17.925] px-http \ + px-http/srv1 9/0/7/14/30 502 243 - - PH-- 3/2/2/0/0 0/0 \ + "GET /cgi-bin/bug.cgi? HTTP/1.0" + + => the proxy blocked a server response either because of an "http-response + deny" rule, or because the response was improperly formatted and not + HTTP-compliant, or because it blocked sensitive information which risked + being cached. In this case, the response is replaced with a "502 bad + gateway". The flags ("PH--") tell us that it was HAProxy who decided to + return the 502 and not the server. + + >>> haproxy[18113]: 127.0.0.1:34548 [15/Oct/2003:15:18:55.798] px-http \ + px-http/<NOSRV> -1/-1/-1/-1/8490 -1 0 - - CR-- 2/2/2/0/0 0/0 "" + + => the client never completed its request and aborted itself ("C---") after + 8.5s, while the proxy was waiting for the request headers ("-R--"). + Nothing was sent to any server. + + >>> haproxy[18113]: 127.0.0.1:34549 [15/Oct/2003:15:19:06.103] px-http \ + px-http/<NOSRV> -1/-1/-1/-1/50001 408 0 - - cR-- 2/2/2/0/0 0/0 "" + + => The client never completed its request, which was aborted by the + time-out ("c---") after 50s, while the proxy was waiting for the request + headers ("-R--"). Nothing was sent to any server, but the proxy could + send a 408 return code to the client. + + >>> haproxy[18989]: 127.0.0.1:34550 [15/Oct/2003:15:24:28.312] px-tcp \ + px-tcp/srv1 0/0/5007 0 cD 0/0/0/0/0 0/0 + + => This log was produced with "option tcplog". The client timed out after + 5 seconds ("c----"). + + >>> haproxy[18989]: 10.0.0.1:34552 [15/Oct/2003:15:26:31.462] px-http \ + px-http/srv1 3183/-1/-1/-1/11215 503 0 - - SC-- 205/202/202/115/3 \ + 0/0 "HEAD / HTTP/1.0" + + => The request took 3s to complete (probably a network problem), and the + connection to the server failed ('SC--') after 4 attempts of 2 seconds + (config says 'retries 3'), and no redispatch (otherwise we would have + seen "/+3"). Status code 503 was returned to the client. There were 115 + connections on this server, 202 connections on this proxy, and 205 on + the global process. It is possible that the server refused the + connection because of too many already established. + + +9. Supported filters +-------------------- + +Here are listed officially supported filters with the list of parameters they +accept. Depending on compile options, some of these filters might be +unavailable. The list of available filters is reported in haproxy -vv. + +See also : "filter" + +9.1. Trace +---------- + +filter trace [name <name>] [random-forwarding] [hexdump] + + Arguments: + <name> is an arbitrary name that will be reported in + messages. If no name is provided, "TRACE" is used. + + <quiet> inhibits trace messages. + + <random-forwarding> enables the random forwarding of parsed data. By + default, this filter forwards all previously parsed + data. With this parameter, it only forwards a random + amount of the parsed data. + + <hexdump> dumps all forwarded data to the server and the client. + +This filter can be used as a base to develop new filters. It defines all +callbacks and print a message on the standard error stream (stderr) with useful +information for all of them. It may be useful to debug the activity of other +filters or, quite simply, HAProxy's activity. + +Using <random-parsing> and/or <random-forwarding> parameters is a good way to +tests the behavior of a filter that parses data exchanged between a client and +a server by adding some latencies in the processing. + + +9.2. HTTP compression +--------------------- + +filter compression + +The HTTP compression has been moved in a filter in HAProxy 1.7. "compression" +keyword must still be used to enable and configure the HTTP compression. And +when no other filter is used, it is enough. When used with the cache or the +fcgi-app enabled, it is also enough. In this case, the compression is always +done after the response is stored in the cache. But it is mandatory to +explicitly use a filter line to enable the HTTP compression when at least one +filter other than the cache or the fcgi-app is used for the same +listener/frontend/backend. This is important to know the filters evaluation +order. + +See also : "compression", section 9.4 about the cache filter and section 9.5 + about the fcgi-app filter. + + +9.3. Stream Processing Offload Engine (SPOE) +-------------------------------------------- + +filter spoe [engine <name>] config <file> + + Arguments : + + <name> is the engine name that will be used to find the right scope in + the configuration file. If not provided, all the file will be + parsed. + + <file> is the path of the engine configuration file. This file can + contain configuration of several engines. In this case, each + part must be placed in its own scope. + +The Stream Processing Offload Engine (SPOE) is a filter communicating with +external components. It allows the offload of some specifics processing on the +streams in tiered applications. These external components and information +exchanged with them are configured in dedicated files, for the main part. It +also requires dedicated backends, defined in HAProxy configuration. + +SPOE communicates with external components using an in-house binary protocol, +the Stream Processing Offload Protocol (SPOP). + +For all information about the SPOE configuration and the SPOP specification, see +"doc/SPOE.txt". + +9.4. Cache +---------- + +filter cache <name> + + Arguments : + + <name> is name of the cache section this filter will use. + +The cache uses a filter to store cacheable responses. The HTTP rules +"cache-store" and "cache-use" must be used to define how and when to use a +cache. By default the corresponding filter is implicitly defined. And when no +other filters than fcgi-app or compression are used, it is enough. In such +case, the compression filter is always evaluated after the cache filter. But it +is mandatory to explicitly use a filter line to use a cache when at least one +filter other than the compression or the fcgi-app is used for the same +listener/frontend/backend. This is important to know the filters evaluation +order. + +See also : section 9.2 about the compression filter, section 9.5 about the + fcgi-app filter and section 6 about cache. + + +9.5. Fcgi-app +------------- + +filter fcgi-app <name> + + Arguments : + + <name> is name of the fcgi-app section this filter will use. + +The FastCGI application uses a filter to evaluate all custom parameters on the +request path, and to process the headers on the response path. the <name> must +reference an existing fcgi-app section. The directive "use-fcgi-app" should be +used to define the application to use. By default the corresponding filter is +implicitly defined. And when no other filters than cache or compression are +used, it is enough. But it is mandatory to explicitly use a filter line to a +fcgi-app when at least one filter other than the compression or the cache is +used for the same backend. This is important to know the filters evaluation +order. + +See also: "use-fcgi-app", section 9.2 about the compression filter, section 9.4 + about the cache filter and section 10 about FastCGI application. + + +9.6. OpenTracing +---------------- + +The OpenTracing filter adds native support for using distributed tracing in +HAProxy. This is enabled by sending an OpenTracing compliant request to one +of the supported tracers such as Datadog, Jaeger, Lightstep and Zipkin tracers. +Please note: tracers are not listed by any preference, but alphabetically. + +This feature is only enabled when HAProxy was built with USE_OT=1. + +The OpenTracing filter activation is done explicitly by specifying it in the +HAProxy configuration. If this is not done, the OpenTracing filter in no way +participates in the work of HAProxy. + +filter opentracing [id <id>] config <file> + + Arguments : + + <id> is the OpenTracing filter id that will be used to find the + right scope in the configuration file. If no filter id is + specified, 'ot-filter' is used as default. If scope is not + specified in the configuration file, it applies to all defined + OpenTracing filters. + + <file> is the path of the OpenTracing configuration file. The same + file can contain configurations for multiple OpenTracing + filters simultaneously. In that case we do not need to define + scope so the same configuration applies to all filters or each + filter must have its own scope defined. + +More detailed documentation related to the operation, configuration and use +of the filter can be found in the addons/ot directory. + + +10. FastCGI applications +------------------------- + +HAProxy is able to send HTTP requests to Responder FastCGI applications. This +feature was added in HAProxy 2.1. To do so, servers must be configured to use +the FastCGI protocol (using the keyword "proto fcgi" on the server line) and a +FastCGI application must be configured and used by the backend managing these +servers (using the keyword "use-fcgi-app" into the proxy section). Several +FastCGI applications may be defined, but only one can be used at a time by a +backend. + +HAProxy implements all features of the FastCGI specification for Responder +application. Especially it is able to multiplex several requests on a simple +connection. + +10.1. Setup +----------- + +10.1.1. Fcgi-app section +-------------------------- + +fcgi-app <name> + Declare a FastCGI application named <name>. To be valid, at least the + document root must be defined. + +acl <aclname> <criterion> [flags] [operator] <value> ... + Declare or complete an access list. + + See "acl" keyword in section 4.2 and section 7 about ACL usage for + details. ACLs defined for a FastCGI application are private. They cannot be + used by any other application or by any proxy. In the same way, ACLs defined + in any other section are not usable by a FastCGI application. However, + Pre-defined ACLs are available. + +docroot <path> + Define the document root on the remote host. <path> will be used to build + the default value of FastCGI parameters SCRIPT_FILENAME and + PATH_TRANSLATED. It is a mandatory setting. + +index <script-name> + Define the script name that will be appended after an URI that ends with a + slash ("/") to set the default value of the FastCGI parameter SCRIPT_NAME. It + is an optional setting. + + Example : + index index.php + +log-stderr global +log-stderr <address> [len <length>] [format <format>] + [sample <ranges>:<sample_size>] <facility> [<level> [<minlevel>]] + Enable logging of STDERR messages reported by the FastCGI application. + + See "log" keyword in section 4.2 for details. It is an optional setting. By + default STDERR messages are ignored. + +pass-header <name> [ { if | unless } <condition> ] + Specify the name of a request header which will be passed to the FastCGI + application. It may optionally be followed by an ACL-based condition, in + which case it will only be evaluated if the condition is true. + + Most request headers are already available to the FastCGI application, + prefixed with "HTTP_". Thus, this directive is only required to pass headers + that are purposefully omitted. Currently, the headers "Authorization", + "Proxy-Authorization" and hop-by-hop headers are omitted. + + Note that the headers "Content-type" and "Content-length" are never passed to + the FastCGI application because they are already converted into parameters. + +path-info <regex> + Define a regular expression to extract the script-name and the path-info from + the URL-decoded path. Thus, <regex> may have two captures: the first one to + capture the script name and the second one to capture the path-info. The + first one is mandatory, the second one is optional. This way, it is possible + to extract the script-name from the path ignoring the path-info. It is an + optional setting. If it is not defined, no matching is performed on the + path. and the FastCGI parameters PATH_INFO and PATH_TRANSLATED are not + filled. + + For security reason, when this regular expression is defined, the newline and + the null characters are forbidden from the path, once URL-decoded. The reason + to such limitation is because otherwise the matching always fails (due to a + limitation one the way regular expression are executed in HAProxy). So if one + of these two characters is found in the URL-decoded path, an error is + returned to the client. The principle of least astonishment is applied here. + + Example : + path-info ^(/.+\.php)(/.*)?$ # both script-name and path-info may be set + path-info ^(/.+\.php) # the path-info is ignored + +option get-values +no option get-values + Enable or disable the retrieve of variables about connection management. + + HAProxy is able to send the record FCGI_GET_VALUES on connection + establishment to retrieve the value for following variables: + + * FCGI_MAX_REQS The maximum number of concurrent requests this + application will accept. + + * FCGI_MPXS_CONNS "0" if this application does not multiplex connections, + "1" otherwise. + + Some FastCGI applications does not support this feature. Some others close + the connection immediately after sending their response. So, by default, this + option is disabled. + + Note that the maximum number of concurrent requests accepted by a FastCGI + application is a connection variable. It only limits the number of streams + per connection. If the global load must be limited on the application, the + server parameters "maxconn" and "pool-max-conn" must be set. In addition, if + an application does not support connection multiplexing, the maximum number + of concurrent requests is automatically set to 1. + +option keep-conn +no option keep-conn + Instruct the FastCGI application to keep the connection open or not after + sending a response. + + If disabled, the FastCGI application closes the connection after responding + to this request. By default, this option is enabled. + +option max-reqs <reqs> + Define the maximum number of concurrent requests this application will + accept. + + This option may be overwritten if the variable FCGI_MAX_REQS is retrieved + during connection establishment. Furthermore, if the application does not + support connection multiplexing, this option will be ignored. By default set + to 1. + +option mpxs-conns +no option mpxs-conns + Enable or disable the support of connection multiplexing. + + This option may be overwritten if the variable FCGI_MPXS_CONNS is retrieved + during connection establishment. It is disabled by default. + +set-param <name> <fmt> [ { if | unless } <condition> ] + Set a FastCGI parameter that should be passed to this application. Its + value, defined by <fmt> must follows the log-format rules (see section 8.2.4 + "Custom Log format"). It may optionally be followed by an ACL-based + condition, in which case it will only be evaluated if the condition is true. + + With this directive, it is possible to overwrite the value of default FastCGI + parameters. If the value is evaluated to an empty string, the rule is + ignored. These directives are evaluated in their declaration order. + + Example : + # PHP only, required if PHP was built with --enable-force-cgi-redirect + set-param REDIRECT_STATUS 200 + + set-param PHP_AUTH_DIGEST %[req.hdr(Authorization)] + + +10.1.2. Proxy section +--------------------- + +use-fcgi-app <name> + Define the FastCGI application to use for the backend. + + Arguments : + <name> is the name of the FastCGI application to use. + + This keyword is only available for HTTP proxies with the backend capability + and with at least one FastCGI server. However, FastCGI servers can be mixed + with HTTP servers. But except there is a good reason to do so, it is not + recommended (see section 10.3 about the limitations for details). Only one + application may be defined at a time per backend. + + Note that, once a FastCGI application is referenced for a backend, depending + on the configuration some processing may be done even if the request is not + sent to a FastCGI server. Rules to set parameters or pass headers to an + application are evaluated. + + +10.1.3. Example +--------------- + + frontend front-http + mode http + bind *:80 + bind *: + + use_backend back-dynamic if { path_reg ^/.+\.php(/.*)?$ } + default_backend back-static + + backend back-static + mode http + server www A.B.C.D:80 + + backend back-dynamic + mode http + use-fcgi-app php-fpm + server php-fpm A.B.C.D:9000 proto fcgi + + fcgi-app php-fpm + log-stderr global + option keep-conn + + docroot /var/www/my-app + index index.php + path-info ^(/.+\.php)(/.*)?$ + + +10.2. Default parameters +------------------------ + +A Responder FastCGI application has the same purpose as a CGI/1.1 program. In +the CGI/1.1 specification (RFC3875), several variables must be passed to the +script. So HAProxy set them and some others commonly used by FastCGI +applications. All these variables may be overwritten, with caution though. + + +-------------------+-----------------------------------------------------+ + | AUTH_TYPE | Identifies the mechanism, if any, used by HAProxy | + | | to authenticate the user. Concretely, only the | + | | BASIC authentication mechanism is supported. | + | | | + +-------------------+-----------------------------------------------------+ + | CONTENT_LENGTH | Contains the size of the message-body attached to | + | | the request. It means only requests with a known | + | | size are considered as valid and sent to the | + | | application. | + | | | + +-------------------+-----------------------------------------------------+ + | CONTENT_TYPE | Contains the type of the message-body attached to | + | | the request. It may not be set. | + | | | + +-------------------+-----------------------------------------------------+ + | DOCUMENT_ROOT | Contains the document root on the remote host under | + | | which the script should be executed, as defined in | + | | the application's configuration. | + | | | + +-------------------+-----------------------------------------------------+ + | GATEWAY_INTERFACE | Contains the dialect of CGI being used by HAProxy | + | | to communicate with the FastCGI application. | + | | Concretely, it is set to "CGI/1.1". | + | | | + +-------------------+-----------------------------------------------------+ + | PATH_INFO | Contains the portion of the URI path hierarchy | + | | following the part that identifies the script | + | | itself. To be set, the directive "path-info" must | + | | be defined. | + | | | + +-------------------+-----------------------------------------------------+ + | PATH_TRANSLATED | If PATH_INFO is set, it is its translated version. | + | | It is the concatenation of DOCUMENT_ROOT and | + | | PATH_INFO. If PATH_INFO is not set, this parameters | + | | is not set too. | + | | | + +-------------------+-----------------------------------------------------+ + | QUERY_STRING | Contains the request's query string. It may not be | + | | set. | + | | | + +-------------------+-----------------------------------------------------+ + | REMOTE_ADDR | Contains the network address of the client sending | + | | the request. | + | | | + +-------------------+-----------------------------------------------------+ + | REMOTE_USER | Contains the user identification string supplied by | + | | client as part of user authentication. | + | | | + +-------------------+-----------------------------------------------------+ + | REQUEST_METHOD | Contains the method which should be used by the | + | | script to process the request. | + | | | + +-------------------+-----------------------------------------------------+ + | REQUEST_URI | Contains the request's URI. | + | | | + +-------------------+-----------------------------------------------------+ + | SCRIPT_FILENAME | Contains the absolute pathname of the script. it is | + | | the concatenation of DOCUMENT_ROOT and SCRIPT_NAME. | + | | | + +-------------------+-----------------------------------------------------+ + | SCRIPT_NAME | Contains the name of the script. If the directive | + | | "path-info" is defined, it is the first part of the | + | | URI path hierarchy, ending with the script name. | + | | Otherwise, it is the entire URI path. | + | | | + +-------------------+-----------------------------------------------------+ + | SERVER_NAME | Contains the name of the server host to which the | + | | client request is directed. It is the value of the | + | | header "Host", if defined. Otherwise, the | + | | destination address of the connection on the client | + | | side. | + | | | + +-------------------+-----------------------------------------------------+ + | SERVER_PORT | Contains the destination TCP port of the connection | + | | on the client side, which is the port the client | + | | connected to. | + | | | + +-------------------+-----------------------------------------------------+ + | SERVER_PROTOCOL | Contains the request's protocol. | + | | | + +-------------------+-----------------------------------------------------+ + | SERVER_SOFTWARE | Contains the string "HAProxy" followed by the | + | | current HAProxy version. | + | | | + +-------------------+-----------------------------------------------------+ + | HTTPS | Set to a non-empty value ("on") if the script was | + | | queried through the HTTPS protocol. | + | | | + +-------------------+-----------------------------------------------------+ + + +10.3. Limitations +------------------ + +The current implementation have some limitations. The first one is about the +way some request headers are hidden to the FastCGI applications. This happens +during the headers analysis, on the backend side, before the connection +establishment. At this stage, HAProxy know the backend is using a FastCGI +application but it don't know if the request will be routed to a FastCGI server +or not. But to hide request headers, it simply removes them from the HTX +message. So, if the request is finally routed to an HTTP server, it never see +these headers. For this reason, it is not recommended to mix FastCGI servers +and HTTP servers under the same backend. + +Similarly, the rules "set-param" and "pass-header" are evaluated during the +request headers analysis. So the evaluation is always performed, even if the +requests is finally forwarded to an HTTP server. + +About the rules "set-param", when a rule is applied, a pseudo header is added +into the HTX message. So, the same way than for HTTP header rewrites, it may +fail if the buffer is full. The rules "set-param" will compete with +"http-request" ones. + +Finally, all FastCGI params and HTTP headers are sent into a unique record +FCGI_PARAM. Encoding of this record must be done in one pass, otherwise a +processing error is returned. It means the record FCGI_PARAM, once encoded, +must not exceeds the size of a buffer. However, there is no reserve to respect +here. + + +11. Address formats +------------------- + +Several statements as "bind, "server", "nameserver" and "log" requires an +address. + +This address can be a host name, an IPv4 address, an IPv6 address, or '*'. +The '*' is equal to the special address "0.0.0.0" and can be used, in the case +of "bind" or "dgram-bind" to listen on all IPv4 of the system.The IPv6 +equivalent is '::'. + +Depending of the statement, a port or port range follows the IP address. This +is mandatory on 'bind' statement, optional on 'server'. + +This address can also begin with a slash '/'. It is considered as the "unix" +family, and '/' and following characters must be present the path. + +Default socket type or transport method "datagram" or "stream" depends on the +configuration statement showing the address. Indeed, 'bind' and 'server' will +use a "stream" socket type by default whereas 'log', 'nameserver' or +'dgram-bind' will use a "datagram". + +Optionally, a prefix could be used to force the address family and/or the +socket type and the transport method. + + +11.1. Address family prefixes +----------------------------- + +'abns@<name>' following <name> is an abstract namespace (Linux only). + +'fd@<n>' following address is a file descriptor <n> inherited from the + parent. The fd must be bound and may or may not already be + listening. + +'ip@<address>[:port1[-port2]]' following <address> is considered as an IPv4 or + IPv6 address depending on the syntax. Depending + on the statement using this address, a port or + a port range may or must be specified. + +'ipv4@<address>[:port1[-port2]]' following <address> is always considered as + an IPv4 address. Depending on the statement + using this address, a port or a port range + may or must be specified. + +'ipv6@<address>[:port1[-port2]]' following <address> is always considered as + an IPv6 address. Depending on the statement + using this address, a port or a port range + may or must be specified. + +'sockpair@<n>' following address is the file descriptor of a connected unix + socket or of a socketpair. During a connection, the initiator + creates a pair of connected sockets, and passes one of them + over the FD to the other end. The listener waits to receive + the FD from the unix socket and uses it as if it were the FD + of an accept(). Should be used carefully. + +'unix@<path>' following string is considered as a UNIX socket <path>. this + prefix is useful to declare an UNIX socket path which don't + start by slash '/'. + + +11.2. Socket type prefixes +-------------------------- + +Previous "Address family prefixes" can also be prefixed to force the socket +type and the transport method. The default depends of the statement using +this address but in some cases the user may force it to a different one. +This is the case for "log" statement where the default is syslog over UDP +but we could force to use syslog over TCP. + +Those prefixes were designed for internal purpose and users should +instead use aliases of the next section "11.3 Protocol prefixes". + +If users need one those prefixes to perform what they expect because +they can not configure the same using the protocol prefixes, they should +report this to the maintainers. + +'stream+<family>@<address>' forces socket type and transport method + to "stream" + +'dgram+<family>@<address>' forces socket type and transport method + to "datagram". + + +11.3. Protocol prefixes +----------------------- + +'quic4@<address>[:port1[-port2]]' following <address> is always considered as + an IPv4 address but socket type is forced to + "datagram" and the transport method is forced + to "stream". Depending on the statement using + this address, a UDP port or port range can or + must be specified. + +'quic6@<address>[:port1[-port2]]' following <address> is always considered as + an IPv6 address but socket type is forced to + "datagram" and the transport method is forced + to "stream". Depending on the statement using + this address, a UDP port or port range can or + must be specified. + +'tcp@<address>[:port1[-port2]]' following <address> is considered as an IPv4 + or IPv6 address depending of the syntax but + socket type and transport method is forced to + "stream". Depending on the statement using + this address, a port or a port range can or + must be specified. It is considered as an alias + of 'stream+ip@'. + +'tcp4@<address>[:port1[-port2]]' following <address> is always considered as + an IPv4 address but socket type and transport + method is forced to "stream". Depending on the + statement using this address, a port or port + range can or must be specified. + It is considered as an alias of 'stream+ipv4@'. + +'tcp6@<address>[:port1[-port2]]' following <address> is always considered as + an IPv6 address but socket type and transport + method is forced to "stream". Depending on the + statement using this address, a port or port + range can or must be specified. + It is considered as an alias of 'stream+ipv4@'. + +'udp@<address>[:port1[-port2]]' following <address> is considered as an IPv4 + or IPv6 address depending of the syntax but + socket type and transport method is forced to + "datagram". Depending on the statement using + this address, a port or a port range can or + must be specified. It is considered as an alias + of 'dgram+ip@'. + +'udp4@<address>[:port1[-port2]]' following <address> is always considered as + an IPv4 address but socket type and transport + method is forced to "datagram". Depending on + the statement using this address, a port or + port range can or must be specified. + It is considered as an alias of 'dgram+ipv4@'. + +'udp6@<address>[:port1[-port2]]' following <address> is always considered as + an IPv6 address but socket type and transport + method is forced to "datagram". Depending on + the statement using this address, a port or + port range can or must be specified. + It is considered as an alias of 'dgram+ipv4@'. + +'uxdg@<path>' following string is considered as a unix socket <path> but + transport method is forced to "datagram". It is considered as + an alias of 'dgram+unix@'. + +'uxst@<path>' following string is considered as a unix socket <path> but + transport method is forced to "stream". It is considered as + an alias of 'stream+unix@'. + +In future versions, other prefixes could be used to specify protocols like +QUIC which proposes stream transport based on socket of type "datagram". + +/* + * Local variables: + * fill-column: 79 + * End: + */ diff --git a/doc/cookie-options.txt b/doc/cookie-options.txt new file mode 100644 index 0000000..b3badf3 --- /dev/null +++ b/doc/cookie-options.txt @@ -0,0 +1,25 @@ +2011/04/13 : List of possible cookie settings with associated behaviours. + +PSV="preserve", PFX="prefix", INS="insert", REW="rewrite", IND="indirect" +0 = option not set +1 = option is set +* = option doesn't matter + +PSV PFX INS REW IND Behaviour + 0 0 0 0 0 passive mode + 0 0 0 0 1 passive + indirect : remove response if not needed + 0 0 0 1 0 always rewrite response + 0 0 1 0 0 always insert or replace response + 0 0 1 0 1 insert + indirect : remove req and also resp if not needed + * * 1 1 * [ forbidden ] + 0 1 0 0 0 prefix + 0 1 0 0 1 !! prefix on request, remove response cookie if not needed + * 1 * 1 * [ forbidden ] + * 1 1 * * [ forbidden ] + * * * 1 1 [ forbidden ] + 1 * 0 * 0 [ forbidden ] + 1 0 0 0 1 passive mode (alternate form) + 1 0 1 0 0 insert only, and preserve server response cookie if any + 1 0 1 0 1 conditional insert only for new requests + 1 1 0 0 1 prefix on requests only (passive prefix) + diff --git a/doc/design-thoughts/backends-v0.txt b/doc/design-thoughts/backends-v0.txt new file mode 100644 index 0000000..d350e22 --- /dev/null +++ b/doc/design-thoughts/backends-v0.txt @@ -0,0 +1,27 @@ +1 type générique "entité", avec les attributs suivants : + + - frontend *f + - l7switch *s + - backend *b + +des types spécifiques sont simplement des entités avec certains +de ces champs remplis et pas forcément tous : + + listen = f [s] b + frontend = f [s] + l7switch = s + backend = [s] b + +Ensuite, les traitements sont évalués dans l'ordre : + - listen -> s'il a des rčgles de l7, on les évalue, et potentiellement on branche vers d'autres listen, l7 ou back, ou on travaille avec le back local. + - frontend -> s'il a des rčgles de l7, on les évalue, et potentiellement on branche vers d'autres listen, l7 ou back + - l7switch -> on évalue ses rčgles, potentiellement on branche vers d'autres listen, l7 ou backends + - backend -> s'il a des rčgles l7, on les évalue (quitte ŕ changer encore de backend) puis on traite. + +Les requętes sont traitées dans l'ordre des chaînages f->s*->b, et les réponses doivent ętre +traitées dans l'ordre inverse b->s*->f. Penser aux réécritures de champs Host ŕ l'aller et +Location en retour. + +D'autre part, prévoir des "profils" plutôt que des blocs de nouveaux paramčtres par défaut. +Ca permettra d'avoir plein de jeux de paramčtres par défaut ŕ utiliser dans chacun de ces +types. diff --git a/doc/design-thoughts/backends.txt b/doc/design-thoughts/backends.txt new file mode 100644 index 0000000..d2474ce --- /dev/null +++ b/doc/design-thoughts/backends.txt @@ -0,0 +1,125 @@ +There has been a lot of confusion during the development because of the +backends and frontends. + +What we want : + +- being able to still use a listener as it has always been working + +- being able to write a rule stating that we will *change* the backend when we + match some pattern. Only one jump is allowed. + +- being able to write a "use_filters_from XXX" line stating that we will ignore + any filter in the current listener, and that those from XXX will be borrowed + instead. A warning would be welcome for options which will silently get + masked. This is used to factor configuration. + +- being able to write a "use_backend_from XXX" line stating that we will ignore + any server and timeout config in the current listener, and that those from + XXX will be borrowed instead. A warning would be welcome for options which + will silently get masked. This is used to factor configuration. + + + +Example : +--------- + + | # frontend HTTP/80 + | listen fe_http 1.1.1.1:80 + | use_filters_from default_http + | use_backend_from appli1 + | + | # frontend HTTPS/443 + | listen fe_https 1.1.1.1:443 + | use_filters_from default_https + | use_backend_from appli1 + | + | # frontend HTTP/8080 + | listen fe_http-dev 1.1.1.1:8080 + | reqadd "X-proto: http" + | reqisetbe "^Host: www1" appli1 + | reqisetbe "^Host: www2" appli2 + | reqisetbe "^Host: www3" appli-dev + | use_backend_from appli1 + | + | + | # filters default_http + | listen default_http + | reqadd "X-proto: http" + | reqisetbe "^Host: www1" appli1 + | reqisetbe "^Host: www2" appli2 + | + | # filters default_https + | listen default_https + | reqadd "X-proto: https" + | reqisetbe "^Host: www1" appli1 + | reqisetbe "^Host: www2" appli2 + | + | + | # backend appli1 + | listen appli1 + | reqidel "^X-appli1:.*" + | reqadd "Via: appli1" + | balance roundrobin + | cookie app1 + | server srv1 + | server srv2 + | + | # backend appli2 + | listen appli2 + | reqidel "^X-appli2:.*" + | reqadd "Via: appli2" + | balance roundrobin + | cookie app2 + | server srv1 + | server srv2 + | + | # backend appli-dev + | listen appli-dev + | reqadd "Via: appli-dev" + | use_backend_from appli2 + | + | + + +Now we clearly see multiple things : +------------------------------------ + + - a frontend can EITHER have filters OR reference a use_filter + + - a backend can EITHER have servers OR reference a use_backend + + - we want the evaluation to cross TWO levels per request. When a request is + being processed, it keeps track of its "frontend" side (where it came + from), and of its "backend" side (where the server-side parameters have + been found). + + - the use_{filters|backend} have nothing to do with how the request is + decomposed. + + +Conclusion : +------------ + + - a proxy is always its own frontend. It also has 2 parameters : + - "fi_prm" : pointer to the proxy holding the filters (itself by default) + - "be_prm" : pointer to the proxy holding the servers (itself by default) + + - a request has a frontend (fe) and a backend (be). By default, the backend + is initialized to the frontend. Everything related to the client side is + accessed through ->fe. Everything related to the server side is accessed + through ->be. + + - request filters are first called from ->fe then ->be. Since only the + filters can change ->be, it is possible to iterate the filters on ->be + only and stop when ->be does not change anymore. + + - response filters are first called from ->be then ->fe IF (fe != be). + + +When we parse the configuration, we immediately configure ->fi and ->be for +all proxies. + +Upon session creation, s->fe and s->be are initialized to the proxy. Filters +are executed via s->fe->fi_prm and s->be->fi_prm. Servers are found in +s->be->be_prm. + diff --git a/doc/design-thoughts/be-fe-changes.txt b/doc/design-thoughts/be-fe-changes.txt new file mode 100644 index 0000000..f242f8a --- /dev/null +++ b/doc/design-thoughts/be-fe-changes.txt @@ -0,0 +1,74 @@ +- PR_O_TRANSP => FE !!! devra peut-ętre changer vu que c'est un complément du mode dispatch. +- PR_O_NULLNOLOG => FE +- PR_O_HTTP_CLOSE => FE. !!! mettre BE aussi !!! +- PR_O_TCP_CLI_KA => FE + +- PR_O_FWDFOR => BE. FE aussi ? +- PR_O_FORCE_CLO => BE +- PR_O_PERSIST => BE +- PR_O_COOK_RW, PR_O_COOK_INS, PR_O_COOK_PFX, PR_O_COOK_POST => BE +- PR_O_COOK_NOC, PR_O_COOK_IND => BE +- PR_O_ABRT_CLOSE => BE +- PR_O_REDISP => BE +- PR_O_BALANCE, PR_O_BALANCE_RR, PR_O_BALANCE_SH => BE +- PR_O_CHK_CACHE => BE +- PR_O_TCP_SRV_KA => BE +- PR_O_BIND_SRC => BE +- PR_O_TPXY_MASK => BE + + +- PR_MODE_TCP : BE côté serveur, FE côté client + +- nbconn -> fe->nbconn, be->nbconn. + Pb: rendre impossible le fait que (fe == be) avant de faire ça, + sinon on va compter les connexions en double. Ce ne sera possible + que lorsque les FE et BE seront des entités distinctes. On va donc + commencer par laisser uniquement fe->nbconn (vu que le fe ne change + pas), et modifier ceci plus tard, ne serait-ce que pour prendre en + compte correctement les minconn/maxconn. + => solution : avoir beconn et feconn dans chaque proxy. + +- failed_conns, failed_secu (réponses bloquées), failed_resp... : be + Attention: voir les cas de ERR_SRVCL, il semble que parfois on + indique ça alors qu'il y a un write error côté client (ex: ligne + 2044 dans proto_http). + + => be et pas be->beprm + +- logs du backup : ->be (idem) + +- queue : be + +- logs/debug : srv toujours associé ŕ be (ex: proxy->id:srv->id). Rien + pour le client pour le moment. D'une maničre générale, les erreurs + provoquées côté serveur vont sur BE et celles côté client vont sur + FE. +- logswait & LW_BYTES : FE (puisqu'on veut savoir si on logue tout de suite) + +- messages d'erreurs personnalisés (errmsg, ...) -> fe + +- monitor_uri -> fe +- uri_auth -> (fe->firpm puis be->fiprm). Utilisation de ->be + +- req_add, req_exp => fe->fiprm, puis be->fiprm +- req_cap, rsp_cap -> fe->fiprm +- rsp_add, rsp_exp => be->fiprm, devrait ętre fait ensuite aussi sur fe->fiprm +- capture_name, capture_namelen : fe->fiprm + + Ce n'est pas la solution idéale, mais au moins la capture et configurable + par les filtres du FE et ne bouge pas lorsque le BE est réassigné. Cela + résoud aussi un pb d'allocation mémoire. + + +- persistance (appsessions, cookiename, ...) -> be +- stats:scope "." = fe (celui par lequel on arrive) + !!!ERREUR!!! => utiliser be pour avoir celui qui a été validé par + l'uri_auth. + + +--------- corrections ŕ effectuer --------- + +- remplacement de headers : parser le header et éventuellement le supprimer puis le(les) rajouter. +- session->proto.{l4state,l7state,l7substate} pour CLI et SRV +- errorloc : si définie dans backend, la prendre, sinon dans front. +- logs : faire be sinon fe. diff --git a/doc/design-thoughts/binding-possibilities.txt b/doc/design-thoughts/binding-possibilities.txt new file mode 100644 index 0000000..3f5e432 --- /dev/null +++ b/doc/design-thoughts/binding-possibilities.txt @@ -0,0 +1,167 @@ +2013/10/10 - possibilities for setting source and destination addresses + + +When establishing a connection to a remote device, this device is designated +as a target, which designates an entity defined in the configuration. A same +target appears only once in a configuration, and multiple targets may share +the same settings if needed. + +The following types of targets are currently supported : + + - listener : all connections with this type of target come from clients ; + - server : connections to such targets are for "server" lines ; + - peer : connections to such target address "peer" lines in "peers" + sections ; + - proxy : these targets are used by "dispatch", "option transparent" + or "option http_proxy" statements. + +A connection might not be reused between two different targets, even if all +parameters seem similar. One of the reason is that some parameters are specific +to the target and are not easy or not cheap to compare (eg: bind to interface, +mss, ...). + +A number of source and destination addresses may be set for a given target. + + - listener : + - the "from" address:port is set by accept() + + - the "to" address:port is set if conn_get_to_addr() is called + + - peer : + - the "from" address:port is not set + + - the "to" address:port is static and dependent only on the peer + + - server : + - the "from" address may be set alone when "source" is used with + a forced IP address, or when "usesrc clientip" is used. + + - the "from" port may be set only combined with the address when + "source" is used with IP:port, IP:port-range or "usesrc client" is + used. Note that in this case, both the address and the port may be + 0, meaning that the kernel will pick the address or port and that + the final value might not match the one explicitly set (eg: + important for logging). + + - the "from" address may be forced from a header which implies it + may change between two consecutive requests on the same connection. + + - the "to" address and port are set together when connecting to a + regular server, or by copying the client's IP address when + "server 0.0.0.0" is used. Note that the destination port may be + an offset applied to the original destination port. + + - proxy : + - the "from" address may be set alone when "source" is used with a + forced IP address or when "usesrc clientip" is used. + + - the "from" port may be set only combined with the address when + "source" is used with IP:port or with "usesrc client". There is + no ip:port range for a proxy as of now. Same comment applies as + above when port and/or address are 0. + + - the "from" address may be forced from a header which implies it + may change between two consecutive requests on the same connection. + + - the "to" address and port are set together, either by configuration + when "dispatch" is used, or dynamically when "transparent" is used + (1:1 with client connection) or "option http_proxy" is used, where + each client request may lead to a different destination address. + + +At the moment, there are some limits in what might happen between multiple +concurrent requests to a same target. + + - peers parameter do not change, so no problem. + + - server parameters may change in this way : + - a connection may require a source bound to an IP address found in a + header, which will fall back to the "source" settings if the address + is not found in this header. This means that the source address may + switch between a dynamically forced IP address and another forced + IP and/or port range. + + - if the element is not found (eg: header), the remaining "forced" + source address might very well be empty (unset), so the connection + reuse is acceptable when switching in that direction. + + - it is not possible to switch between client and clientip or any of + these and hdr_ip() because they're exclusive. + + - using a source address/port belonging to a port range is compatible + with connection reuse because there is a single range per target, so + switching from a range to another range means we remain in the same + range. + + - destination address may currently not change since the only possible + case for dynamic destination address setting is the transparent mode, + reproducing the client's destination address. + + - proxy parameters may change in this way : + - a connection may require a source bound to an IP address found in a + header, which will fall back to the "source" settings if the address + is not found in this header. This means that the source address may + switch between a dynamically forced IP address and another forced + IP and/or port range. + + - if the element is not found (eg: header), the remaining "forced" + source address might very well be empty (unset), so the connection + reuse is acceptable when switching in that direction. + + - it is not possible to switch between client and clientip or any of + these and hdr_ip() because they're exclusive. + + - proxies do not support port ranges at the moment. + + - destination address might change in the case where "option http_proxy" + is used. + +So, for each source element (IP, port), we want to know : + - if the element was assigned by static configuration (eg: ":80") + - if the element was assigned from a connection-specific value (eg: usesrc clientip) + - if the element was assigned from a configuration-specific range (eg: 1024-65535) + - if the element was assigned from a request-specific value (eg: hdr_ip(xff)) + - if the element was not assigned at all + +For the destination, we want to know : + - if the element was assigned by static configuration (eg: ":80") + - if the element was assigned from a connection-specific value (eg: transparent) + - if the element was assigned from a request-specific value (eg: http_proxy) + +We don't need to store the information about the origin of the dynamic value +since we have the value itself. So in practice we have : + - default value, unknown (not yet checked with getsockname/getpeername) + - default value, known (check done) + - forced value (known) + - forced range (known) + +We can't do that on an ip:port basis because the port may be fixed regardless +of the address and conversely. + +So that means : + + enum { + CO_ADDR_NONE = 0, /* not set, unknown value */ + CO_ADDR_KNOWN = 1, /* not set, known value */ + CO_ADDR_FIXED = 2, /* fixed value, known */ + CO_ADDR_RANGE = 3, /* from assigned range, known */ + } conn_addr_values; + + unsigned int new_l3_src_status:2; + unsigned int new_l4_src_status:2; + unsigned int new_l3_dst_status:2; + unsigned int new_l4_dst_status:2; + + unsigned int cur_l3_src_status:2; + unsigned int cur_l4_src_status:2; + unsigned int cur_l3_dsp_status:2; + unsigned int cur_l4_dst_status:2; + + unsigned int new_family:2; + unsigned int cur_family:2; + +Note: this obsoletes CO_FL_ADDR_FROM_SET and CO_FL_ADDR_TO_SET. These flags +must be changed to individual l3+l4 checks ORed between old and new values, +or better, set to cur only which will inherit new. + +In the connection, these values may be merged in the same word as err_code. diff --git a/doc/design-thoughts/config-language.txt b/doc/design-thoughts/config-language.txt new file mode 100644 index 0000000..20c4fbd --- /dev/null +++ b/doc/design-thoughts/config-language.txt @@ -0,0 +1,262 @@ +Prévoir des commandes en plusieurs mots clés. +Par exemple : + + timeout connection XXX + connection scale XXX + +On doit aussi accepter les préfixes : + + tim co XXX + co sca XXX + +Prévoir de ranger les combinaisons dans un tableau. On doit męme +pouvoir effectuer un mapping simplifiant le parseur. + + +Pour les filtres : + + + <direction> <where> <what> <operator> <pattern> <action> [ <args>* ] + + <direction> = [ req | rsp ] + <where> = [ in | out ] + <what> = [ line | LINE | METH | URI | h(hdr) | H(hdr) | c(cookie) | C(cookie) ] + <operator> = [ == | =~ | =* | =^ | =/ | != | !~ | !* | !^ | !/ ] + <pattern> = "<string>" + <action> = [ allow | permit | deny | delete | replace | switch | add | set | redir ] + <args> = optional action args + + examples: + + req in URI =^ "/images" switch images + req in h(host) =* ".mydomain.com" switch mydomain + req in h(host) =~ "localhost(.*)" replace "www\1" + + alternative : + + <direction> <where> <action> [not] <what> [<operator> <pattern> [ <args>* ]] + + req in switch URI =^ "/images" images + req in switch h(host) =* ".mydomain.com" mydomain + req in replace h(host) =~ "localhost(.*)" "www\1" + req in delete h(Connection) + req in deny not line =~ "((GET|HEAD|POST|OPTIONS) /)|(OPTIONS *)" + req out set h(Connection) "close" + req out add line "Server: truc" + + + <direction> <action> <where> [not] <what> [<operator> <pattern> [ <args>* ]] ';' <action2> <what2> + + req in switch URI =^ "/images/" images ; replace "/" + req in switch h(host) =* ".mydomain.com" mydomain + req in replace h(host) =~ "localhost(.*)" "www\1" + req in delete h(Connection) + req in deny not line =~ "((GET|HEAD|POST|OPTIONS) /)|(OPTIONS *)" + req out set h(Connection) "close" + req out add line == "Server: truc" + + +Extension avec des ACL : + + req in acl(meth_valid) METH =~ "(GET|POST|HEAD|OPTIONS)" + req in acl(meth_options) METH == "OPTIONS" + req in acl(uri_slash) URI =^ "/" + req in acl(uri_star) URI == "*" + + req in deny acl !(meth_options && uri_star || meth_valid && uri_slash) + +Peut-ętre plus simplement : + + acl meth_valid METH =~ "(GET|POST|HEAD|OPTIONS)" + acl meth_options METH == "OPTIONS" + acl uri_slash URI =^ "/" + acl uri_star URI == "*" + + req in deny not acl(meth_options uri_star, meth_valid uri_slash) + + req in switch URI =^ "/images/" images ; replace "/" + req in switch h(host) =* ".mydomain.com" mydomain + req in replace h(host) =~ "localhost(.*)" "www\1" + req in delete h(Connection) + req in deny not line =~ "((GET|HEAD|POST|OPTIONS) /)|(OPTIONS *)" + req out set h(Connection) "close" + req out add line == "Server: truc" + +Prévoir le cas du "if" pour exécuter plusieurs actions : + + req in if URI =^ "/images/" then replace "/" ; switch images + +Utiliser les noms en majuscules/minuscules pour indiquer si on veut prendre +en compte la casse ou non : + + if uri =^ "/watch/" setbe watch rebase "/watch/" "/" + if uri =* ".jpg" setbe images + if uri =~ ".*dll.*" deny + if HOST =* ".mydomain.com" setbe mydomain + etc... + +Another solution would be to have a dedicated keyword to URI remapping. It +would both rewrite the URI and optionally switch to another backend. + + uriremap "/watch/" "/" watch + uriremap "/chat/" "/" chat + uriremap "/event/" "/event/" event + +Or better : + + uriremap "/watch/" watch "/" + uriremap "/chat/" chat "/" + uriremap "/event/" event + +For the URI, using a regex is sometimes useful (eg: providing a set of possible prefixes. + + +Sinon, peut-ętre que le "switch" peut prendre un paramčtre de mapping pour la partie matchée : + + req in switch URI =^ "/images/" images:"/" + + +2007/03/31 - Besoins plus précis. + +1) aucune extension de branchement ou autre dans les "listen", c'est trop complexe. + +Distinguer les données entrantes (in) et sortantes (out). + +Le frontend ne voit que les requetes entrantes et les réponses sortantes. +Le backend voir les requętes in/out et les réponses in/out. +Le frontend permet les branchements d'ensembles de filtres de requętes vers +d'autres. Le frontend et les ensembles de filtres de requętes peuvent brancher +vers un backend. + +-----------+--------+----------+----------+---------+----------+ + \ Where | | | | | | + \______ | Listen | Frontend | ReqRules | Backend | RspRules | + \| | | | | | +Capability | | | | | | +-----------+--------+----------+----------+---------+----------+ +Frontend | X | X | | | | +-----------+--------+----------+----------+---------+----------+ +FiltReqIn | X | X | X | X | | +-----------+--------+----------+----------+---------+----------+ +JumpFiltReq| X | X | X | | | \ +-----------+--------+----------+----------+---------+----------+ > = ReqJump +SetBackend | X | X | X | | | / +-----------+--------+----------+----------+---------+----------+ +FiltReqOut | | | | X | | +-----------+--------+----------+----------+---------+----------+ +FiltRspIn | X | | | X | X | +-----------+--------+----------+----------+---------+----------+ +JumpFiltRsp| | | | X | X | +-----------+--------+----------+----------+---------+----------+ +FiltRspOut | | X | | X | X | +-----------+--------+----------+----------+---------+----------+ +Backend | X | | | X | | +-----------+--------+----------+----------+---------+----------+ + +En conclusion +------------- + +Il y a au moins besoin de distinguer 8 fonctionnalités de base : + - capacité ŕ recevoir des connexions (frontend) + - capacité ŕ filtrer les requętes entrantes + - capacité ŕ brancher vers un backend ou un ensemble de rčgles de requętes + - capacité ŕ filtrer les requętes sortantes + - capacité ŕ filtrer les réponses entrantes + - capacité ŕ brancher vers un autre ensemble de rčgles de réponses + - capacité ŕ filtrer la réponse sortante + - capacité ŕ gérer des serveurs (backend) + +Remarque +-------- + - on a souvent besoin de pouvoir appliquer un petit traitement sur un ensemble + host/uri/autre. Le petit traitement peut consister en quelques filtres ainsi + qu'une réécriture du couple (host,uri). + + +Proposition : ACL + +Syntaxe : +--------- + + acl <name> <what> <operator> <value> ... + +Ceci créera une acl référencée sous le nom <name> qui sera validée si +l'application d'au moins une des valeurs <value> avec l'opérateur <operator> +sur le sujet <what> est validée. + +Opérateurs : +------------ + +Toujours 2 caractčres : + + [=!][~=*^%/.] + +Premier caractčre : + '=' : OK si test valide + '!' : OK si test échoué. + +Second caractčre : + '~' : compare avec une regex + '=' : compare chaîne ŕ chaîne + '*' : compare la fin de la chaîne (ex: =* ".mydomain.com") + '^' : compare le début de la chaîne (ex: =^ "/images/") + '%' : recherche une sous-chaîne + '/' : compare avec un mot entier en acceptant le '/' comme délimiteur + '.' : compare avec un mot entier en acceptant el '.' comme délimiteur + +Ensuite on exécute une action de maničre conditionnelle si l'ensemble des ACLs +mentionnées sont validées (ou invalidées pour celles précédées d'un "!") : + + <what> <where> <action> on [!]<aclname> ... + + +Exemple : +--------- + + acl www_pub host =. www www01 dev preprod + acl imghost host =. images + acl imgdir uri =/ img + acl imagedir uri =/ images + acl msie h(user-agent) =% "MSIE" + + set_host "images" on www_pub imgdir + remap_uri "/img" "/" on www_pub imgdir + remap_uri "/images" "/" on www_pub imagedir + setbe images on imghost + reqdel "Cookie" on all + + + +Actions possibles : + + req {in|out} {append|delete|rem|add|set|rep|mapuri|rewrite|reqline|deny|allow|setbe|tarpit} + resp {in|out} {append|delete|rem|add|set|rep|maploc|rewrite|stsline|deny|allow} + + req in append <line> + req in delete <line_regex> + req in rem <header> + req in add <header> <new_value> + req in set <header> <new_value> + req in rep <header> <old_value> <new_value> + req in mapuri <old_uri_prefix> <new_uri_prefix> + req in rewrite <old_uri_regex> <new_uri> + req in reqline <old_req_regex> <new_req> + req in deny + req in allow + req in tarpit + req in setbe <backend> + + resp out maploc <old_location_prefix> <new_loc_prefix> + resp out stsline <old_sts_regex> <new_sts_regex> + +Les chaînes doivent ętre délimitées par un męme caractčre au début et ŕ la fin, +qui doit ętre échappé s'il est présent dans la chaîne. Tout ce qui se trouve +entre le caractčre de fin et les premiers espace est considéré comme des +options passées au traitement. Par exemple : + + req in rep host /www/i /www/ + req in rep connection /keep-alive/i "close" + +Il serait pratique de pouvoir effectuer un remap en męme temps qu'un setbe. + +Captures: les séparer en in/out. Les rendre conditionnelles ? diff --git a/doc/design-thoughts/connection-reuse.txt b/doc/design-thoughts/connection-reuse.txt new file mode 100644 index 0000000..4eb22f6 --- /dev/null +++ b/doc/design-thoughts/connection-reuse.txt @@ -0,0 +1,224 @@ +2015/08/06 - server connection sharing + +Improvements on the connection sharing strategies +------------------------------------------------- + +4 strategies are currently supported : + - never + - safe + - aggressive + - always + +The "aggressive" and "always" strategies take into account the fact that the +connection has already been reused at least once or not. The principle is that +second requests can be used to safely "validate" connection reuse on newly +added connections, and that such validated connections may be used even by +first requests from other sessions. A validated connection is a connection +which has already been reused, hence proving that it definitely supports +multiple requests. Such connections are easy to verify : after processing the +response, if the txn already had the TX_NOT_FIRST flag, then it was not the +first request over that connection, and it is validated as safe for reuse. +Validated connections are put into a distinct list : server->safe_conns. + +Incoming requests with TX_NOT_FIRST first pick from the regular idle_conns +list so that any new idle connection is validated as soon as possible. + +Incoming requests without TX_NOT_FIRST only pick from the safe_conns list for +strategy "aggressive", guaranteeing that the server properly supports connection +reuse, or first from the safe_conns list, then from the idle_conns list for +strategy "always". + +Connections are always stacked into the list (LIFO) so that there are higher +changes to convert recent connections and to use them. This will first optimize +the likeliness that the connection works, and will avoid TCP metrics from being +lost due to an idle state, and/or the congestion window to drop and the +connection going to slow start mode. + + +Handling connections in pools +----------------------------- + +A per-server "pool-max" setting should be added to permit disposing unused idle +connections not attached anymore to a session for use by future requests. The +principle will be that attached connections are queued from the front of the +list while the detached connections will be queued from the tail of the list. + +This way, most reused connections will be fairly recent and detached connections +will most often be ignored. The number of detached idle connections in the lists +should be accounted for (pool_used) and limited (pool_max). + +After some time, a part of these detached idle connections should be killed. +For this, the list is walked from tail to head and connections without an owner +may be evicted. It may be useful to have a per-server pool_min setting +indicating how many idle connections should remain in the pool, ready for use +by new requests. Conversely, a pool_low metric should be kept between eviction +runs, to indicate the lowest amount of detached connections that were found in +the pool. + +For eviction, the principle of a half-life is appealing. The principle is +simple : over a period of time, half of the connections between pool_min and +pool_low should be gone. Since pool_low indicates how many connections were +remaining unused over a period, it makes sense to kill some of them. + +In order to avoid killing thousands of connections in one run, the purge +interval should be split into smaller batches. Let's call N the ratio of the +half-life interval and the effective interval. + +The algorithm consists in walking over them from the end every interval and +killing ((pool_low - pool_min) + 2 * N - 1) / (2 * N). It ensures that half +of the unused connections are killed over the half-life period, in N batches +of population/2N entries at most. + +Unsafe connections should be evicted first. There should be quite few of them +since most of them are probed and become safe. Since detached connections are +quickly recycled and attached to a new session, there should not be too many +detached connections in the pool, and those present there may be killed really +quickly. + +Another interesting point of pools is that when a pool-max is not null, then it +makes sense to automatically enable pretend-keep-alive on non-private connections +going to the server in order to be able to feed them back into the pool. With +the "aggressive" or "always" strategies, it can allow clients making a single +request over their connection to share persistent connections to the servers. + + + +2013/10/17 - server connection management and reuse + +Current state +------------- + +At the moment, a connection entity is needed to carry any address +information. This means in the following situations, we need a server +connection : + +- server is elected and the server's destination address is set + +- transparent mode is elected and the destination address is set from + the incoming connection + +- proxy mode is enabled, and the destination's address is set during + the parsing of the HTTP request + +- connection to the server fails and must be retried on the same + server using the same parameters, especially the destination + address (SN_ADDR_SET not removed) + + +On the accepting side, we have further requirements : + +- allocate a clean connection without a stream interface + +- incrementally set the accepted connection's parameters without + clearing it, and keep track of what is set (eg: getsockname). + +- initialize a stream interface in established mode + +- attach the accepted connection to a stream interface + + +This means several things : + +- the connection has to be allocated on the fly the first time it is + needed to store the source or destination address ; + +- the connection has to be attached to the stream interface at this + moment ; + +- it must be possible to incrementally set some settings on the + connection's addresses regardless of the connection's current state + +- the connection must not be released across connection retries ; + +- it must be possible to clear a connection's parameters for a + redispatch without having to detach/attach the connection ; + +- we need to allocate a connection without an existing stream interface + +So on the accept() side, it looks like this : + + fd = accept(); + conn = new_conn(); + get_some_addr_info(&conn->addr); + ... + si = new_si(); + si_attach_conn(si, conn); + si_set_state(si, SI_ST_EST); + ... + get_more_addr_info(&conn->addr); + +On the connect() side, it looks like this : + + si = new_si(); + while (!properly_connected) { + if (!(conn = si->end)) { + conn = new_conn(); + conn_clear(conn); + si_attach_conn(si, conn); + } + else { + if (connected) { + f = conn->flags & CO_FL_XPRT_TRACKED; + conn->flags &= ~CO_FL_XPRT_TRACKED; + conn_close(conn); + conn->flags |= f; + } + if (!correct_dest) + conn_clear(conn); + } + set_some_addr_info(&conn->addr); + si_set_state(si, SI_ST_CON); + ... + set_more_addr_info(&conn->addr); + conn->connect(); + if (must_retry) { + close_conn(conn); + } + } + +Note: we need to be able to set the control and transport protocols. +On outgoing connections, this is set once we know the destination address. +On incoming connections, this is set the earliest possible (once we know +the source address). + +The problem analysed below was solved on 2013/10/22 + +| ==> the real requirement is to know whether a connection is still valid or not +| before deciding to close it. CO_FL_CONNECTED could be enough, though it +| will not indicate connections that are still waiting for a connect to occur. +| This combined with CO_FL_WAIT_L4_CONN and CO_FL_WAIT_L6_CONN should be OK. +| +| Alternatively, conn->xprt could be used for this, but needs some careful checks +| (it's used by conn_full_close at least). +| +| Right now, conn_xprt_close() checks conn->xprt and sets it to NULL. +| conn_full_close() also checks conn->xprt and sets it to NULL, except +| that the check on ctrl is performed within xprt. So conn_xprt_close() +| followed by conn_full_close() will not close the file descriptor. +| Note that conn_xprt_close() is never called, maybe we should kill it ? +| +| Note: at the moment, it's problematic to leave conn->xprt to NULL before doing +| xprt_init() because we might end up with a pending file descriptor. Or at +| least with some transport not de-initialized. We might thus need +| conn_xprt_close() when conn_xprt_init() fails. +| +| The fd should be conditioned by ->ctrl only, and the transport layer by ->xprt. +| +| - conn_prepare_ctrl(conn, ctrl) +| - conn_prepare_xprt(conn, xprt) +| - conn_prepare_data(conn, data) +| +| Note: conn_xprt_init() needs conn->xprt so it's not a problem to set it early. +| +| One problem might be with conn_xprt_close() not being able to know if xprt_init() +| was called or not. That's where it might make sense to only set ->xprt during init. +| Except that it does not fly with outgoing connections (xprt_init is called after +| connect()). +| +| => currently conn_xprt_close() is only used by ssl_sock.c and decides whether +| to do something based on ->xprt_ctx which is set by ->init() from xprt_init(). +| So there is nothing to worry about. We just need to restore conn_xprt_close() +| and rely on ->ctrl to close the fd instead of ->xprt. +| +| => we have the same issue with conn_ctrl_close() : when is the fd supposed to be +| valid ? On outgoing connections, the control is set much before the fd... diff --git a/doc/design-thoughts/connection-sharing.txt b/doc/design-thoughts/connection-sharing.txt new file mode 100644 index 0000000..99be1cd --- /dev/null +++ b/doc/design-thoughts/connection-sharing.txt @@ -0,0 +1,31 @@ +2014/10/28 - Server connection sharing + +For HTTP/2 we'll have to use multiplexed connections to the servers and to +share them between multiple streams. We'll also have to do this for H/1, but +with some variations since H1 doesn't offer connection status verification. + +In order to validate that an idle connection is still usable, it is desirable +to periodically send health checks over it. Normally, idle connections are +meant to be heavily used, so there is no reason for having them idle for a long +time. Thus we have two possibilities : + + - either we time them out after some inactivity, this saves server resources ; + - or we check them after some inactivity. For this we can send the server- + side HTTP health check (only when the server uses HTTP checks), and avoid + using that to mark the server down, and instead consider the connection as + dead. + +For HTTP/2 we'll have to send pings periodically over these connections, so +it's worth considering a per-connection task to validate that the channel still +works. + +In the current model, a connection necessarily belongs to a session, so it's +not really possible to share them, at best they can be exchanged, but that +doesn't make much sense as it means that it could disturb parallel traffic. + +Thus we need to have a per-server list of idle connections and a max-idle-conn +setting to kill them when there are too many. In the case of H/1 it is also +advisable to consider that if a connection was created to pass a first non- +idempotent request while other idle connections were still existing, then a +connection will have to be killed in order not to exceed the limit. + diff --git a/doc/design-thoughts/dynamic-buffers.txt b/doc/design-thoughts/dynamic-buffers.txt new file mode 100644 index 0000000..564d868 --- /dev/null +++ b/doc/design-thoughts/dynamic-buffers.txt @@ -0,0 +1,41 @@ +2014/10/30 - dynamic buffer management + +Since HTTP/2 processing will significantly increase the need for buffering, it +becomes mandatory to be able to support dynamic buffer allocation. This also +means that at any moment some buffer allocation will fail and that a task or an +I/O operation will have to be paused for the time needed to allocate a buffer. + +There are 3 places where buffers are needed : + + - receive side of a stream interface. A connection notifies about a pending + recv() and the SI calls the receive function to put the data into a buffer. + Here the buffer will have to be picked from a pool first, and if the + allocation fails, the I/O will have to temporarily be disabled, the + connection will have to subscribe for buffer release notification to be + woken up once a buffer is available again. It's important to keep in mind + that buffer availability doesn't necessarily mean a desire to enable recv + again, just that recv is not paused anymore for resource reasons. + + - receive side of a stream interface when the other end point is an applet. + The applet wants to write into the buffer and for this the buffer needs to + be allocated as well. It is the same as above except that it is the applet + which is put to a pause. Since the applet might be at the core of the task + itself, it could become tricky to handle the situation correctly. Stats and + peers are in this situation. + + - Tx of a task : some tasks perform spontaneous writes to a buffer. Checks + are an example of this. The checks will have to be able to sleep while a + buffer is being awaited. + +One important point is that such pauses must not prevent the task from timing +out. There it becomes difficult because in the case of a time out, we could +want to emit a timeout error message and for this, require a buffer. So it is +important to keep the ability not to send messages upon error processing, and +to be able to give up and stop waiting for buffers. + +The refill mechanism needs to be designed in a thread-safe way because this +will become one of the rare cases of inter-task activity. Thus it is important +to ensure that checking the state of the task and passing of the freshly +released buffer are performed atomically, and that in case the task doesn't +want it anymore, it is responsible for passing it to the next one. + diff --git a/doc/design-thoughts/entities-v2.txt b/doc/design-thoughts/entities-v2.txt new file mode 100644 index 0000000..91c4fa9 --- /dev/null +++ b/doc/design-thoughts/entities-v2.txt @@ -0,0 +1,276 @@ +2012/07/05 - Connection layering and sequencing + + +An FD has a state : + - CLOSED + - READY + - ERROR (?) + - LISTEN (?) + +A connection has a state : + - CLOSED + - ACCEPTED + - CONNECTING + - ESTABLISHED + - ERROR + +A stream interface has a state : + - INI, REQ, QUE, TAR, ASS, CON, CER, EST, DIS, CLO + +Note that CON and CER might be replaced by EST if the connection state is used +instead. CON might even be more suited than EST to indicate that a connection +is known. + + +si_shutw() must do : + + data_shutw() + if (shutr) { + data_close() + ctrl_shutw() + ctrl_close() + } + +si_shutr() must do : + data_shutr() + if (shutw) { + data_close() + ctrl_shutr() + ctrl_close() + } + +Each of these steps may fail, in which case the step must be retained and the +operations postponed in an asynchronous task. + +The first asynchronous data_shut() might already fail so it is mandatory to +save the other side's status with the connection in order to let the async task +know whether the 3 next steps must be performed. + +The connection (or perhaps the FD) needs to know : + - the desired close operations : DSHR, DSHW, CSHR, CSHW + - the completed close operations : DSHR, DSHW, CSHR, CSHW + + +On the accept() side, we probably need to know : + - if a header is expected (eg: accept-proxy) + - if this header is still being waited for + => maybe both info might be combined into one bit + + - if a data-layer accept() is expected + - if a data-layer accept() has been started + - if a data-layer accept() has been performed + => possibly 2 bits, to indicate the need to free() + +On the connect() side, we need to know : + - the desire to send a header (eg: send-proxy) + - if this header has been sent + => maybe both info might be combined + + - if a data-layer connect() is expected + - if a data-layer connect() has been started + - if a data-layer connect() has been completed + => possibly 2 bits, to indicate the need to free() + +On the response side, we also need to know : + - the desire to send a header (eg: health check response for monitor-net) + - if this header was sent + => might be the same as sending a header over a new connection + +Note: monitor-net has precedence over proxy proto and data layers. Same for + health mode. + +For multi-step operations, use 2 bits : + 00 = operation not desired, not performed + 10 = operation desired, not started + 11 = operation desired, started but not completed + 01 = operation desired, started and completed + + => X != 00 ==> operation desired + X & 01 ==> operation at least started + X & 10 ==> operation not completed + +Note: no way to store status information for error reporting. + +Note2: it would be nice if "tcp-request connection" rules could work at the +connection level, just after headers ! This means support for tracking stick +tables, possibly not too much complicated. + + +Proposal for incoming connection sequence : + +- accept() +- if monitor-net matches or if mode health => try to send response +- if accept-proxy, wait for proxy request +- if tcp-request connection, process tcp rules and possibly keep the + pointer to stick-table +- if SSL is enabled, switch to SSL handshake +- then switch to DATA state and instantiate a session + +We just need a map of handshake handlers on the connection. They all manage the +FD status themselves and set the callbacks themselves. If their work succeeds, +they remove themselves from the list. If it fails, they remain subscribed and +enable the required polling until they are woken up again or the timeout strikes. + +Identified handshake handlers for incoming connections : + - HH_HEALTH (tries to send OK and dies) + - HH_MONITOR_IN (matches src IP and adds/removes HH_SEND_OK/HH_SEND_HTTP_OK) + - HH_SEND_OK (tries to send "OK" and dies) + - HH_SEND_HTTP_OK (tries to send "HTTP/1.0 200 OK" and dies) + - HH_ACCEPT_PROXY (waits for PROXY line and parses it) + - HH_TCP_RULES (processes TCP rules) + - HH_SSL_HS (starts SSL handshake) + - HH_ACCEPT_SESSION (instantiates a session) + +Identified handshake handlers for outgoing connections : + - HH_SEND_PROXY (tries to build and send the PROXY line) + - HH_SSL_HS (starts SSL handshake) + +For the pollers, we could check that handshake handlers are not 0 and decide to +call a generic connection handshake handler instead of usual callbacks. Problem +is that pollers don't know connections, they know fds. So entities which manage +handlers should update change the FD callbacks accordingly. + +With a bit of care, we could have : + - HH_SEND_LAST_CHUNK (sends the chunk pointed to by a pointer and dies) + => merges HEALTH, SEND_OK and SEND_HTTP_OK + +It sounds like the ctrl vs data state for the connection are per-direction +(eg: support an async ctrl shutw while still reading data). + +Also support shutr/shutw status at L4/L7. + +In practice, what we really need is : + +shutdown(conn) = + conn.data.shut() + conn.ctrl.shut() + conn.fd.shut() + +close(conn) = + conn.data.close() + conn.ctrl.close() + conn.fd.close() + +With SSL over Remote TCP (RTCP + RSSL) to reach the server, we would have : + + HTTP -> RTCP+RSSL connection <-> RTCP+RRAW connection -> TCP+SSL connection + +The connection has to be closed at 3 places after a successful response : + - DATA (RSSL over RTCP) + - CTRL (RTCP to close connection to server) + - SOCK (FD to close connection to second process) + +Externally, the connection is seen with very few flags : + - SHR + - SHW + - ERR + +We don't need a CLOSED flag as a connection must always be detached when it's closed. + +The internal status doesn't need to be exposed : + - FD allocated (Y/N) + - CTRL initialized (Y/N) + - CTRL connected (Y/N) + - CTRL handlers done (Y/N) + - CTRL failed (Y/N) + - CTRL shutr (Y/N) + - CTRL shutw (Y/N) + - DATA initialized (Y/N) + - DATA connected (Y/N) + - DATA handlers done (Y/N) + - DATA failed (Y/N) + - DATA shutr (Y/N) + - DATA shutw (Y/N) + +(note that having flags for operations needing to be completed might be easier) +-------------- + +Maybe we need to be able to call conn->fdset() and conn->fdclr() but it sounds +very unlikely since the only functions manipulating this are in the code of +the data/ctrl handlers. + +FDSET/FDCLR cannot be directly controlled by the stream interface since it also +depends on the DATA layer (WANT_READ/WANT_WRITE). + +But FDSET/FDCLR is probably controlled by who owns the connection (eg: DATA). + +Example: an SSL conn relies on an FD. The buffer is full, and wants the conn to +stop reading. It must not stop the FD itself. It is the read function which +should notice that it has nothing to do with a read wake-up, which needs to +disable reading. + +Conversely, when calling conn->chk_rcv(), the reader might get a WANT_READ or +even WANT_WRITE and adjust the FDs accordingly. + +------------------------ + +OK, the problem is simple : we don't manipulate the FD at the right level. +We should have : + ->connect(), ->chk_snd(), ->chk_rcv(), ->shutw(), ->shutr() which are + called from the upper layer (buffer) + ->recv(), ->send(), called from the lower layer + +Note that the SHR is *reported* by lower layer but can be forced by upper +layer. In this case it's like a delayed abort. The difficulty consists in +knowing the output data were correctly read. Probably we'd need to drain +incoming data past the active shutr(). + +The only four purposes of the top-down shutr() call are : + - acknowledge a shut read report : could probably be done better + - read timeout => disable reading : it's a delayed abort. We want to + report that the buffer is SHR, maybe even the connection, but the + FD clearly isn't. + - read abort due to error on the other side or desire to close (eg: + http-server-close) : delayed abort + - complete abort + +The active shutr() is problematic as we can't disable reading if we expect some +exchanges for data acknowledgement. We probably need to drain data only until +the shutw() has been performed and ACKed. + +A connection shut down for read would behave like this : + + 1) bidir exchanges + + 2) shutr() => read_abort_pending=1 + + 3) drain input, still send output + + 4) shutw() + + 5) drain input, wait for read0 or ack(shutw) + + 6) close() + +--------------------- 2012/07/05 ------------------- + +Communications must be performed this way : + + connection <-> channel <-> connection + +A channel is composed of flags and stats, and may store data in either a buffer +or a pipe. We need low-layer operations between sockets and buffers or pipes. +Right now we only support sockets, but later we might support remote sockets +and maybe pipes or shared memory segments. + +So we need : + + - raw_sock_to_buf() => receive raw data from socket into buffer + - raw_sock_to_pipe => receive raw data from socket into pipe (splice in) + - raw_sock_from_buf() => send raw data from buffer to socket + - raw_sock_from_pipe => send raw data from pipe to socket (splice out) + + - ssl_sock_to_buf() => receive ssl data from socket into buffer + - ssl_sock_to_pipe => receive ssl data from socket into a pipe (NULL) + - ssl_sock_from_buf() => send ssl data from buffer to socket + - ssl_sock_from_pipe => send ssl data from pipe to socket (NULL) + +These functions should set such status flags : + +#define ERR_IN 0x01 +#define ERR_OUT 0x02 +#define SHUT_IN 0x04 +#define SHUT_OUT 0x08 +#define EMPTY_IN 0x10 +#define FULL_OUT 0x20 + diff --git a/doc/design-thoughts/how-it-works.txt b/doc/design-thoughts/how-it-works.txt new file mode 100644 index 0000000..2d1cb89 --- /dev/null +++ b/doc/design-thoughts/how-it-works.txt @@ -0,0 +1,60 @@ +How it works ? (unfinished and inexact) + +For TCP and HTTP : + +- listeners create listening sockets with a READ callback pointing to the + protocol-specific accept() function. + +- the protocol-specific accept() function then accept()'s the connection and + instantiates a "server TCP socket" (which is dedicated to the client side), + and configures it (non_block, get_original_dst, ...). + +For TCP : +- in case of pure TCP, a request buffer is created, as well as a "client TCP + socket", which tries to connect to the server. + +- once the connection is established, the response buffer is allocated and + connected to both ends. + +- both sockets are set to "autonomous mode" so that they only wake up their + supervising session when they encounter a special condition (error or close). + + +For HTTP : +- in case of HTTP, a request buffer is created with the "HOLD" flag set and + a read limit to support header rewriting (may be this one will be removed + eventually because it's better to limit only to the buffer size and report + an error when rewritten data overflows) + +- a "flow analyzer" is attached to the buffer (or possibly multiple flow + analyzers). For the request, the flow analyzer is "http_lb_req". The flow + analyzer is a function which gets called when new data is present and + blocked. It has a timeout (request timeout). It can also be bypassed on + demand. + +- when the "http_lb_req" has received the whole request, it creates a client + socket with all the parameters needed to try to connect to the server. When + the connection establishes, the response buffer is allocated on the fly, + put to HOLD mode, and a an "http_lb_resp" flow analyzer is attached to the + buffer. + + +For client-side HTTPS : + +- the accept() function must completely instantiate a TCP socket + an SSL + reader. It is when the SSL session is complete that we call the + protocol-specific accept(), and create its buffer. + + + + +Conclusions +----------- + +- we need a generic TCP accept() function with a lot of flags set by the + listener, to tell it what info we need to get at the accept() time, and + what flags will have to be set on the socket. + +- once the TCP accept() function ends, it wakes up the protocol supervisor + which is in charge of creating the buffers, etc, switch states, etc... + diff --git a/doc/design-thoughts/http2.txt b/doc/design-thoughts/http2.txt new file mode 100644 index 0000000..c21ac10 --- /dev/null +++ b/doc/design-thoughts/http2.txt @@ -0,0 +1,277 @@ +2014/10/23 - design thoughts for HTTP/2 + +- connections : HTTP/2 depends a lot more on a connection than HTTP/1 because a + connection holds a compression context (headers table, etc...). We probably + need to have an h2_conn struct. + +- multiple transactions will be handled in parallel for a given h2_conn. They + are called streams in HTTP/2 terminology. + +- multiplexing : for a given client-side h2 connection, we can have multiple + server-side h2 connections. And for a server-side h2 connection, we can have + multiple client-side h2 connections. Streams circulate in N-to-N fashion. + +- flow control : flow control will be applied between multiple streams. Special + care must be taken so that an H2 client cannot block some H2 servers by + sending requests spread over multiple servers to the point where one server + response is blocked and prevents other responses from the same server from + reaching their clients. H2 connection buffers must always be empty or nearly + empty. The per-stream flow control needs to be respected as well as the + connection's buffers. It is important to implement some fairness between all + the streams so that it's not always the same which gets the bandwidth when + the connection is congested. + +- some clients can be H1 with an H2 server (is this really needed ?). Most of + the initial use case will be H2 clients to H1 servers. It is important to keep + in mind that H1 servers do not do flow control and that we don't want them to + block transfers (eg: post upload). + +- internal tasks : some H2 clients will be internal tasks (eg: health checks). + Some H2 servers will be internal tasks (eg: stats, cache). The model must be + compatible with this use case. + +- header indexing : headers are transported compressed, with a reference to a + static or a dynamic header, or a literal, possibly huffman-encoded. Indexing + is specific to the H2 connection. This means there is no way any binary data + can flow between both sides, headers will have to be decoded according to the + incoming connection's context and re-encoded according to the outgoing + connection's context, which can significantly differ. In order to avoid the + parsing trouble we currently face, headers will have to be clearly split + between name and value. It is worth noting that neither the incoming nor the + outgoing connections' contexts will be of any use while processing the + headers. At best we can have some shortcuts for well-known names that map + well to the static ones (eg: use the first static entry with same name), and + maybe have a few special cases for static name+value as well. Probably we can + classify headers in such categories : + + - static name + value + - static name + other value + - dynamic name + other value + + This will allow for better processing in some specific cases. Headers + supporting a single value (:method, :status, :path, ...) should probably + be stored in a single location with a direct access. That would allow us + to retrieve a method using hdr[METHOD]. All such indexing must be performed + while parsing. That also means that HTTP/1 will have to be converted to this + representation very early in the parser and possibly converted back to H/1 + after processing. + + Header names/values will have to be placed in a small memory area that will + inevitably get fragmented as headers are rewritten. An automatic packing + mechanism must be implemented so that when there's no more room, headers are + simply defragmented/packet to a new table and the old one is released. Just + like for the static chunks, we need to have a few such tables pre-allocated + and ready to be swapped at any moment. Repacking must not change any index + nor affect the way headers are compressed so that it can happen late after a + retry (send-name-header for example). + +- header processing : can still happen on a (header, value) basis. Reqrep/ + rsprep completely disappear and will have to be replaced with something else + to support renaming headers and rewriting url/path/... + +- push_promise : servers can push dummy requests+responses. They advertise + the stream ID in the push_promise frame indicating the associated stream ID. + This means that it is possible to initiate a client-server stream from the + information coming from the server and make the data flow as if the client + had made it. It's likely that we'll have to support two types of server + connections: those which support push and those which do not. That way client + streams will be distributed to existing server connections based on their + capabilities. It's important to keep in mind that PUSH will not be rewritten + in responses. + +- stream ID mapping : since the stream ID is per H2 connection, stream IDs will + have to be mapped. Thus a given stream is an entity with two IDs (one per + side). Or more precisely a stream has two end points, each one carrying an ID + when it ends on an HTTP2 connection. Also, for each stream ID we need to + quickly find the associated transaction in progress. Using a small quick + unique tree seems indicated considering the wide range of valid values. + +- frame sizes : frame have to be remapped between both sides as multiplexed + connections won't always have the same characteristics. Thus some frames + might be spliced and others will be sliced. + +- error processing : care must be taken to never break a connection unless it + is dead or corrupt at the protocol level. Stats counter must exist to observe + the causes. Timeouts are a great problem because silent connections might + die out of inactivity. Ping frames should probably be scheduled a few seconds + before the connection timeout so that an unused connection is verified before + being killed. Abnormal requests must be dealt with using RST_STREAM. + +- ALPN : ALPN must be observed on the client side, and transmitted to the server + side. + +- proxy protocol : proxy protocol makes little to no sense in a multiplexed + protocol. A per-stream equivalent will surely be needed if implementations + do not quickly generalize the use of Forward. + +- simplified protocol for local devices (eg: haproxy->varnish in clear and + without handshake, and possibly even with splicing if the connection's + settings are shared) + +- logging : logging must report a number of extra information such as the + stream ID, and whether the transaction was initiated by the client or by the + server (which can be deduced from the stream ID's parity). In case of push, + the number of the associated stream must also be reported. + +- memory usage : H2 increases memory usage by mandating use of 16384 bytes + frame size minimum. That means slightly more than 16kB of buffer in each + direction to process any frame. It will definitely have an impact on the + deployed maxconn setting in places using less than this (4..8kB are common). + Also, the header list is persistent per connection, so if we reach the same + size as the request, that's another 16kB in each direction, resulting in + about 48kB of memory where 8 were previously used. A more careful encoder + can work with a much smaller set even if that implies evicting entries + between multiple headers of the same message. + +- HTTP/1.0 should very carefully be transported over H2. Since there's no way + to pass version information in the protocol, the server could use some + features of HTTP/1.1 that are unsafe in HTTP/1.0 (compression, trailers, + ...). + +- host / :authority : ":authority" is the norm, and "host" will be absent when + H2 clients generate :authority. This probably means that a dummy Host header + will have to be produced internally from :authority and removed when passing + to H2 behind. This can cause some trouble when passing H2 requests to H1 + proxies, because there's no way to know if the request should contain scheme + and authority in H1 or not based on the H2 request. Thus a "proxy" option + will have to be explicitly mentioned on HTTP/1 server lines. One of the + problem that it creates is that it's not longer possible to pass H/1 requests + to H/1 proxies without an explicit configuration. Maybe a table of the + various combinations is needed. + + :scheme :authority host + HTTP/2 request present present absent + HTTP/1 server req absent absent present + HTTP/1 proxy req present present present + + So in the end the issue is only with H/2 requests passed to H/1 proxies. + +- ping frames : they don't indicate any stream ID so by definition they cannot + be forwarded to any server. The H2 connection should deal with them only. + +There's a layering problem with H2. The framing layer has to be aware of the +upper layer semantics. We can't simply re-encode HTTP/1 to HTTP/2 then pass +it over a framing layer to mux the streams, the frame type must be passed below +so that frames are properly arranged. Header encoding is connection-based and +all streams using the same connection will interact in the way their headers +are encoded. Thus the encoder *has* to be placed in the h2_conn entity, and +this entity has to know for each stream what its headers are. + +Probably that we should remove *all* headers from transported data and move +them on the fly to a parallel structure that can be shared between H1 and H2 +and consumed at the appropriate level. That means buffers only transport data. +Trailers have to be dealt with differently. + +So if we consider an H1 request being forwarded between a client and a server, +it would look approximately like this : + + - request header + body land into a stream's receive buffer + - headers are indexed and stripped out so that only the body and whatever + follows remain in the buffer + - both the header index and the buffer with the body stay attached to the + stream + - the sender can rebuild the whole headers. Since they're found in a table + supposed to be stable, it can rebuild them as many times as desired and + will always get the same result, so it's safe to build them into the trash + buffer for immediate sending, just as we do for the PROXY protocol. + - the upper protocol should probably provide a build_hdr() callback which + when called by the socket layer, builds this header block based on the + current stream's header list, ready to be sent. + - the socket layer has to know how many bytes from the headers are left to be + forwarded prior to processing the body. + - the socket layer needs to consume only the acceptable part of the body and + must not release the buffer if any data remains in it (eg: pipelining over + H1). This is already handled by channel->o and channel->to_forward. + - we could possibly have another optional callback to send a preamble before + data, that could be used to send chunk sizes in H1. The danger is that it + absolutely needs to be stable if it has to be retried. But it could + considerably simplify de-chunking. + +When the request is sent to an H2 server, an H2 stream request must be made +to the server, we find an existing connection whose settings are compatible +with our needs (eg: tls/clear, push/no-push), and with a spare stream ID. If +none is found, a new connection must be established, unless maxconn is reached. + +Servers must have a maxstream setting just like they have a maxconn. The same +queue may be used for that. + +The "tcp-request content" ruleset must apply to the TCP layer. But with HTTP/2 +that becomes impossible (and useless). We still need something like the +"tcp-request session" hook to apply just after the SSL handshake is done. + +It is impossible to defragment the body on the fly in HTTP/2. Since multiple +messages are interleaved, we cannot wait for all of them and block the head of +line. Thus if body analysis is required, it will have to use the stream's +buffer, which necessarily implies a copy. That means that with each H2 end we +necessarily have at least one copy. Sometimes we might be able to "splice" some +bytes from one side to the other without copying into the stream buffer (same +rules as for TCP splicing). + +In theory, only data should flow through the channel buffer, so each side's +connector is responsible for encoding data (H1: linear/chunks, H2: frames). +Maybe the same mechanism could be extrapolated to tunnels / TCP. + +Since we'd use buffers only for data (and for receipt of headers), we need to +have dynamic buffer allocation. + +Thus : +- Tx buffers do not exist. We allocate a buffer on the fly when we're ready to + send something that we need to build and that needs to be persistent in case + of partial send. H1 headers are built on the fly from the header table to a + temporary buffer that is immediately sent and whose amount of sent bytes is + the only information kept (like for PROXY protocol). H2 headers are more + complex since the encoding depends on what was successfully sent. Thus we + need to build them and put them into a temporary buffer that remains + persistent in case send() fails. It is possible to have a limited pool of + Tx buffers and refrain from sending if there is no more buffer available in + the pool. In that case we need a wake-up mechanism once a buffer is + available. Once the data are sent, the Tx buffer is then immediately recycled + in its pool. Note that no tx buffer being used (eg: for hdr or control) means + that we have to be able to serialize access to the connection and retry with + the same stream. It also means that a stream that times out while waiting for + the connector to read the second half of its request has to stay there, or at + least needs to be handled gracefully. However if the connector cannot read + the data to be sent, it means that the buffer is congested and the connection + is dead, so that probably means it can be killed. + +- Rx buffers have to be pre-allocated just before calling recv(). A connection + will first try to pick a buffer and disable reception if it fails, then + subscribe to the list of tasks waiting for an Rx buffer. + +- full Rx buffers might sometimes be moved around to the next buffer instead of + experiencing a copy. That means that channels and connectors must use the + same format of buffer, and that only the channel will have to see its + pointers adjusted. + +- Tx of data should be made as much as possible without copying. That possibly + means by directly looking into the connection buffer on the other side if + the local Tx buffer does not exist and the stream buffer is not allocated, or + even performing a splice() call between the two sides. One of the problem in + doing this is that it requires proper ordering of the operations (eg: when + multiple readers are attached to a same buffer). If the splitting occurs upon + receipt, there's no problem. If we expect to retrieve data directly from the + original buffer, it's harder since it contains various things in an order + which does not even indicate what belongs to whom. Thus possibly the only + mechanism to implement is the buffer permutation which guarantees zero-copy + and only in the 100% safe case. Also it's atomic and does not cause HOL + blocking. + +It makes sense to chose the frontend_accept() function right after the +handshake ended. It is then possible to check the ALPN, the SNI, the ciphers +and to accept to switch to the h2_conn_accept handler only if everything is OK. +The h2_conn_accept handler will have to deal with the connection setup, +initialization of the header table, exchange of the settings frames and +preparing whatever is needed to fire new streams upon receipt of unknown +stream IDs. Note: most of the time it will not be possible to splice() because +we need to know in advance the amount of bytes to write the header, and here it +will not be possible. + +H2 health checks must be seen as regular transactions/streams. The check runs a +normal client which seeks an available stream from a server. The server then +finds one on an existing connection or initiates a new H2 connection. The H2 +checks will have to be configurable for sharing streams or not. Another option +could be to specify how many requests can be made over existing connections +before insisting on getting a separate connection. Note that such separate +connections might end up stacking up once released. So probably that they need +to be recycled very quickly (eg: fix how many unused ones can exist max). + diff --git a/doc/design-thoughts/http_load_time.url b/doc/design-thoughts/http_load_time.url new file mode 100644 index 0000000..f178e46 --- /dev/null +++ b/doc/design-thoughts/http_load_time.url @@ -0,0 +1,5 @@ +Excellent paper about page load time for keepalive on/off, pipelining, +multiple host names, etc... + +http://www.die.net/musings/page_load_time/ + diff --git a/doc/design-thoughts/pool-debugging.txt b/doc/design-thoughts/pool-debugging.txt new file mode 100644 index 0000000..106e41c --- /dev/null +++ b/doc/design-thoughts/pool-debugging.txt @@ -0,0 +1,243 @@ +2022-02-22 - debugging options with pools + +Two goals: + - help developers spot bugs as early as possible + + - make the process more reliable in field, by killing sick ones as soon as + possible instead of letting them corrupt data, cause trouble, or even be + exploited. + +An allocated object may exist in 5 forms: + - in use: currently referenced and used by haproxy, 100% of its size are + dedicated to the application which can do absolutely anything with it, + but it may never touch anything before nor after that area. + + - in cache: the object is neither referenced nor used anymore, but it sits + in a thread's cache. The application may not touch it at all anymore, and + some parts of it could even be unmapped. Only the current thread may safely + reach it, though others might find/release it when under thread isolation. + The thread cache needs some LRU linking that may be stored anywhere, either + inside the area, or outside. The parts surrounding the <size> parts remain + invisible to the application layer, and can serve as a protection. + + - in shared cache: the object is neither referenced nor used anymore, but it + may be reached by any thread. Some parts of it could be unmapped. Any + thread may pick it but only one may find it, hence once grabbed, it is + guaranteed no other one will find it. The shared cache needs to set up a + linked list and a single pointer needs to be stored anywhere, either inside + or outside the area. The parts surrounding the <size> parts remain + invisible to the application layer, and can serve as a protection. + + - in the system's memory allocator: the object is not known anymore from + haproxy. It may be reassigned in parts or totally to other pools or other + subsystems (e.g. crypto library). Some or all of it may be unmapped. The + areas surrounding the <size> parts are also part of the object from the + library's point of view and may be delivered to other areas. Tampering + with these may cause any other part to malfunction in dirty ways. + + - in the OS only: the memory allocator gave it back to the OS. + +The following options need to be configurable: + - detect improper initialization: this is done by poisonning objects before + delivering them to the application. + + - help figure where an object was allocated when in use: a pointer to the + call place will help. Pointing to the last pool_free() as well for the + same reasons when dealing with a UAF. + + - detection of wrong pointer/pool when in use: a pointer to the pool before + or after the area will definitely help. + + - detection of overflows when in use: a canary at the end of the area + (closest possible to <size>) will definitely help. The pool above can do + that job. Ideally, we should fill some data at the end so that even + unaligned sizes can be checked (e.g. a buffer that gets a zero appended). + If we just align on 2 pointers, writing the same pointer twice at the end + may do the job, but we won't necessarily have our bytes. Thus a particular + end-of-string pattern would be useful (e.g. ff55aa01) to fill it. + + - detection of double free when in cache: similar to detection of wrong + pointer/pool when in use: the pointer at the end may simply be changed so + that it cannot match the pool anymore. By using a pointer to the caller of + the previous free() operation, we have the guarantee to see different + pointers, and this pointer can be inspected to figure where the object was + previously freed. An extra check may even distinguish a perfect double-free + (same caller) from just a wrong free (pointer differs from pool). + + - detection of late corruption when in cache: keeping a copy of the + checksum of the whole area upon free() will do the job, but requires one + extra storage area for the checksum. Filling the area with a pattern also + does the job and doesn't require extra storage, but it loses the contents + and can be a bit slower. Sometimes losing the contents can be a feature, + especially when trying to detect late reads. Probably that both need to + be implemented. Note that if contents are not strictly needed, storing a + checksum inside the area does the job. + + - preserve total contents in cache for debugging: losing some precious + information can be a problem. + + - pattern filling of the area helps detect use-after-free in read-only mode. + + - allocate cold first helps with both cases above. + +Uncovered: + - overflow/underflow when in cache/shared/libc: it belongs to use-after-free + pattern and such an error during regular use ought to be caught while the + object was still in use. + + - integrity when in libc: not under our control anymore, this is a libc + problem. + +Arbitrable: + - integrity when in shared cache: unlikely to happen only then if it could + have happened in the local cache. Shared cache not often used anymore, thus + probably not worth the effort + + - protection against double-free when in shared cache/libc: might be done for + a cheap price, probably worth being able to quickly tell that such an + object left the local cache (e.g. the mark points to the caller, but could + possibly just be incremented, hence still point to the same code location+1 + byte when released. Calls are 4 bytes min on RISC, 5 on x86 so we do have + some margin by having a caller's location be +0,+1,+2 or +3. + + - underflow when in use: hasn't been really needed over time but may change. + + - detection of late corruption when in shared cache: checksum or area filling + are possible, but is this as relevant as it used to considering the less + common use of the shared cache ? + +Design considerations: + - object allocation when in use must remain minimal + + - when in cache, there are 2 lists which the compiler expect to be at least + aligned each (e.g. if/when we start to use DWCAS). + + - the original "pool debugging" feature covers both pool tracking, double- + free detection, overflow detection and caller info at the cost of a single + pointer placed immediately after the area. + + - preserving the contents might be done by placing the cache links and the + shared cache's list outside of the area (either before or after). Placing + it before has the merit that the allocated object preserves the 4-ptr + alignment. But when a larger alignment is desired this often does not work + anymore. Placing it after requires some dynamic adjustment depending on the + object's size. If any protection is installed, this protection must be + placed before the links so that the list doesn't get randomly corrupted and + corrupts adjacent elements. Note that if protection is desired, the extra + waste is probably less critical. + + - a link to the last caller might have to be stored somewhere. Without + preservation the free() caller may be placed anywhere while the alloc() + caller may only be placed outside. With preservation, again the free() + caller may be placed either before the object or after the mark at the end. + There is no particular need that both share the same location though it may + help. Note that when debugging is enabled, the free() caller doesn't need + to be duplicated and can continue to serve as the double-free detection. + Thus maybe in the end we only need to store the caller to the last alloc() + but not the free() since if we want it it's available via the pool debug. + + - use-after-free detection: contents may be erased on free() and checked on + alloc(), but they can also be checksummed on free() and rechecked on + alloc(). In the latter case we need to store a checksum somewhere. Note + that with pure checksum we don't know what part was modified, but seeing + previous contents can be useful. + +Possibilities: + +1) Linked lists inside the area: + + V size alloc + ---+------------------------------+-----------------+-- + in use |##############################| (Pool) (Tracer) | + ---+------------------------------+-----------------+-- + + ---+--+--+------------------------+-----------------+-- + in cache |L1|L2|########################| (Caller) (Sum) | + ---+--+--+------------------------+-----------------+-- +or: + ---+--+--+------------------------+-----------------+-- + in cache |L1|L2|###################(sum)| (Caller) | + ---+--+--+------------------------+-----------------+-- + + ---+-+----------------------------+-----------------+-- + in global |N|XXXX########################| (Caller) | + ---+-+----------------------------+-----------------+-- + + +2) Linked lists before the the area leave room for tracer and pool before + the area, but the canary must remain at the end, however the area will + be more difficult to keep aligned: + + V head size alloc + ----+-+-+------------------------------+-----------------+-- + in use |T|P|##############################| (canary) | + ----+-+-+------------------------------+-----------------+-- + + --+-----+------------------------------+-----------------+-- + in cache |L1|L2|##############################| (Caller) (Sum) | + --+-----+------------------------------+-----------------+-- + + ------+-+------------------------------+-----------------+-- + in global |N|##############################| (Caller) | + ------+-+------------------------------+-----------------+-- + + +3) Linked lists at the end of the area, might be shared with extra data + depending on the state: + + V size alloc + ---+------------------------------+-----------------+-- + in use |##############################| (Pool) (Tracer) | + ---+------------------------------+-----------------+-- + + ---+------------------------------+--+--+-----------+-- + in cache |##############################|L1|L2| (Caller) (Sum) + ---+------------------------------+--+--+-----------+-- + + ---+------------------------------+-+---------------+-- + in global |##############################|N| (Caller) | + ---+------------------------------+-+---------------+-- + +This model requires a little bit of alignment at the end of the area, which is +not incompatible with pattern filling and/or checksumming: + - preserving the area for post-mortem analysis means nothing may be placed + inside. In this case it could make sense to always store the last releaser. + - detecting late corruption may be done either with filling or checksumming, + but the simple fact of assuming a risk of corruption that needs to be + chased means we must not store the lists nor caller inside the area. + +Some models imply dedicating some place when in cache: + - preserving contents forces the lists to be prefixed or appended, which + leaves unused places when in use. Thus we could systematically place the + pool pointer and the caller in this case. + + - if preserving contents is not desired, almost everything can be stored + inside when not in use. Then each situation's size should be calculated + so that the allocated size is known, and entries are filled from the + beginning while not in use, or after the size when in use. + + - if poisonning is requested, late corruption might be detected but then we + don't want the list to be stored inside at the risk of being corrupted. + +Maybe just implement a few models: + - compact/optimal: put l1/l2 inside + - detect late corruption: fill/sum, put l1/l2 out + - preserve contents: put l1/l2 out + - corruption+preserve: do not fill, sum out + - poisonning: not needed on free if pattern filling is done. + +try2: + - poison on alloc to detect missing initialization: yes/no + (note: nothing to do if filling done) + - poison on free to detect use-after-free: yes/no + (note: nothing to do if filling done) + - check on alloc for corruption-after-free: yes/no + If content-preserving => sum, otherwise pattern filling; in + any case, move L1/L2 out. + - check for overflows: yes/no: use a canary after the area. The + canary can be the pointer to the pool. + - check for alloc caller: yes/no => always after the area + - content preservation: yes/no + (disables filling, moves lists out) + - improved caller tracking: used to detect double-free, may benefit + from content-preserving but not only. diff --git a/doc/design-thoughts/rate-shaping.txt b/doc/design-thoughts/rate-shaping.txt new file mode 100644 index 0000000..ca09408 --- /dev/null +++ b/doc/design-thoughts/rate-shaping.txt @@ -0,0 +1,90 @@ +2010/01/24 - Design of multi-criteria request rate shaping. + +We want to be able to rate-shape traffic on multiple cirteria. For instance, we +may want to support shaping of per-host header requests, as well as per source. + +In order to achieve this, we will use checkpoints, one per criterion. Each of +these checkpoints will consist in a test, a rate counter and a queue. + +A request reaches the checkpoint and checks the counter. If the counter is +below the limit, it is updated and the request continues. If the limit is +reached, the request attaches itself into the queue and sleeps. The sleep time +is computed from the queue status, and updates the queue status. + +A task is dedicated to each queue. Its sole purpose is to be woken up when the +next task may wake up, to check the frequency counter, wake as many requests as +possible and update the counter. All the woken up requests are detached from +the queue. Maybe the task dedicated to the queue can be avoided and replaced +with all queued tasks's sleep counters, though this looks tricky. Or maybe it's +just the first request in the queue that should be responsible for waking up +other tasks, and not to forget to pass on this responsibility to next tasks if +it leaves the queue. + +The woken up request then goes on evaluating other criteria and possibly sleeps +again on another one. In the end, the task will have waited the amount of time +required to pass all checkpoints, and all checkpoints will be able to maintain +a permanent load of exactly their limit if enough streams flow through them. + +Since a request can only sleep in one queue at a time, it makes sense to use a +linked list element in each session to attach it to any queue. It could very +well be shared with the pendconn hooks which could then be part of the session. + +This mechanism could be used to rate-shape sessions and requests per backend +and per server. + +When rate-shaping on dynamic criteria, such as the source IP address, we have +to first extract the data pattern, then look it up in a table very similar to +the stickiness tables, but with a frequency counter. At the checkpoint, the +pattern is looked up, the entry created or refreshed, and the frequency counter +updated and checked. Then the request either goes on or sleeps as described +above, but if it sleeps, it's still in the checkpoint's queue, but with a date +computed from the criterion's status. + +This means that we need 3 distinct features : + + - optional pattern extraction + - per-pattern or per-queue frequency counter + - time-ordered queue with a task + +Based on past experiences with frequency counters, it does not appear very easy +to exactly compute sleep delays in advance for multiple requests. So most +likely we'll have to run per-criterion queues too, with only the head of the +queue holding a wake-up timeout. + +This finally leads us to the following : + + - optional pattern extraction + - per-pattern or per-queue frequency counter + - per-frequency counter queue + - head of the queue serves as a global queue timer. + +This brings us to a very flexible architecture : + - 1 list of rule-based checkpoints per frontend + - 1 list of rule-based checkpoints per backend + - 1 list of rule-based checkpoints per server + +Each of these lists have a lot of rules conditioned by ACLs, just like the +use-backend rules, except that all rules are evaluated in turn. + +Since we might sometimes just want to enable that without setting any limit and +just for enabling control in ACLs (or logging ?), we should probably try to +find a flexible way of declaring just a counter without a queue. + +These checkpoints could be of two types : + - rate-limit (described here) + - concurrency-limit (very similar with the counter and no timer). This + feature would require to keep track of all accounted criteria in a + request so that they can be released upon request completion. + +It should be possible to define a max of requests in the queue, above which a +503 is returned. The same applies for the max delay in the queue. We could have +it per-task (currently it's the connection timeout) and abort tasks with a 503 +when the delay is exceeded. + +Per-server connection concurrency could be converted to use this mechanism +which is very similar. + +The construct should be flexible enough so that the counters may be checked +from ACLs. That would allow to reject connections or switch to an alternate +backend when some limits are reached. + diff --git a/doc/design-thoughts/sess_par_sec.txt b/doc/design-thoughts/sess_par_sec.txt new file mode 100644 index 0000000..e936374 --- /dev/null +++ b/doc/design-thoughts/sess_par_sec.txt @@ -0,0 +1,13 @@ +Graphe des nombres de traitements par seconde unité de temps avec + - un algo linéaire et trčs peu coűteux unitairement (0.01 ut) + - un algo en log(2) et 5 fois plus coűteux (0.05 ut) + +set yrange [0:1] +plot [0:1000] 1/(1+0.01*x), 1/(1+0.05*log(x+1)/log(2)) + +Graphe de la latence induite par ces traitements en unités de temps : + +set yrange [0:1000] +plot [0:1000] x/(1+0.01*x), x/(1+0.05*log(x+1)/log(2)) + + diff --git a/doc/design-thoughts/thread-group.txt b/doc/design-thoughts/thread-group.txt new file mode 100644 index 0000000..f1aad7b --- /dev/null +++ b/doc/design-thoughts/thread-group.txt @@ -0,0 +1,528 @@ +Thread groups +############# + +2021-07-13 - first draft +========== + +Objective +--------- +- support multi-socket systems with limited cache-line bouncing between + physical CPUs and/or L3 caches + +- overcome the 64-thread limitation + +- Support a reasonable number of groups. I.e. if modern CPUs arrive with + core complexes made of 8 cores, with 8 CC per chip and 2 chips in a + system, it makes sense to support 16 groups. + + +Non-objective +------------- +- no need to optimize to the last possible cycle. I.e. some algos like + leastconn will remain shared across all threads, servers will keep a + single queue, etc. Global information remains global. + +- no stubborn enforcement of FD sharing. Per-server idle connection lists + can become per-group; listeners can (and should probably) be per-group. + Other mechanisms (like SO_REUSEADDR) can already overcome this. + +- no need to go beyond 64 threads per group. + + +Identified tasks +================ + +General +------- +Everywhere tid_bit is used we absolutely need to find a complement using +either the current group or a specific one. Thread debugging will need to +be extended as masks are extensively used. + + +Scheduler +--------- +The global run queue and global wait queue must become per-group. This +means that a task may only be queued into one of them at a time. It +sounds like tasks may only belong to a given group, but doing so would +bring back the original issue that it's impossible to perform remote wake +ups. + +We could probably ignore the group if we don't need to set the thread mask +in the task. the task's thread_mask is never manipulated using atomics so +it's safe to complement it with a group. + +The sleeping_thread_mask should become per-group. Thus possibly that a +wakeup may only be performed on the assigned group, meaning that either +a task is not assigned, in which case it be self-assigned (like today), +otherwise the tg to be woken up will be retrieved from the task itself. + +Task creation currently takes a thread mask of either tid_bit, a specific +mask, or MAX_THREADS_MASK. How to create a task able to run anywhere +(checks, Lua, ...) ? + +Profiling +--------- +There should be one task_profiling_mask per thread group. Enabling or +disabling profiling should be made per group (possibly by iterating). + +Thread isolation +---------------- +Thread isolation is difficult as we solely rely on atomic ops to figure +who can complete. Such operation is rare, maybe we could have a global +read_mostly flag containing a mask of the groups that require isolation. +Then the threads_want_rdv_mask etc can become per-group. However setting +and clearing the bits will become problematic as this will happen in two +steps hence will require careful ordering. + +FD +-- +Tidbit is used in a number of atomic ops on the running_mask. If we have +one fdtab[] per group, the mask implies that it's within the group. +Theoretically we should never face a situation where an FD is reported nor +manipulated for a remote group. + +There will still be one poller per thread, except that this time all +operations will be related to the current thread_group. No fd may appear +in two thread_groups at once, but we can probably not prevent that (e.g. +delayed close and reopen). Should we instead have a single shared fdtab[] +(less memory usage also) ? Maybe adding the group in the fdtab entry would +work, but when does a thread know it can leave it ? Currently this is +solved by running_mask and by update_mask. Having two tables could help +with this (each table sees the FD in a different group with a different +mask) but this looks overkill. + +There's polled_mask[] which needs to be decided upon. Probably that it +should be doubled as well. Note, polled_mask left fdtab[] for cacheline +alignment reasons in commit cb92f5cae4. + +If we have one fdtab[] per group, what *really* prevents from using the +same FD in multiple groups ? _fd_delete_orphan() and fd_update_events() +need to check for no-thread usage before closing the FD. This could be +a limiting factor. Enabling could require to wake every poller. + +Shouldn't we remerge fdinfo[] with fdtab[] (one pointer + one int/short, +used only during creation and close) ? + +Other problem, if we have one fdtab[] per TG, disabling/enabling an FD +(e.g. pause/resume on listener) can become a problem if it's not necessarily +on the current TG. We'll then need a way to figure that one. It sounds like +FDs from listeners and receivers are very specific and suffer from problems +all other ones under high load do not suffer from. Maybe something specific +ought to be done for them, if we can guarantee there is no risk of accidental +reuse (e.g. locate the TG info in the receiver and have a "MT" bit in the +FD's flags). The risk is always that a close() can result in instant pop-up +of the same FD on any other thread of the same process. + +Observations: right now fdtab[].thread_mask more or less corresponds to a +declaration of interest, it's very close to meaning "active per thread". It is +in fact located in the FD while it ought to do nothing there, as it should be +where the FD is used as it rules accesses to a shared resource that is not +the FD but what uses it. Indeed, if neither polled_mask nor running_mask have +a thread's bit, the FD is unknown to that thread and the element using it may +only be reached from above and not from the FD. As such we ought to have a +thread_mask on a listener and another one on connections. These ones will +indicate who uses them. A takeover could then be simplified (atomically set +exclusivity on the FD's running_mask, upon success, takeover the connection, +clear the running mask). Probably that the change ought to be performed on +the connection level first, not the FD level by the way. But running and +polled are the two relevant elements, one indicates userland knowledge, +the other one kernel knowledge. For listeners there's no exclusivity so it's +a bit different but the rule remains the same that we don't have to know +what threads are *interested* in the FD, only its holder. + +Not exact in fact, see FD notes below. + +activity +-------- +There should be one activity array per thread group. The dump should +simply scan them all since the cumuled values are not very important +anyway. + +applets +------- +They use tid_bit only for the task. It looks like the appctx's thread_mask +is never used (now removed). Furthermore, it looks like the argument is +*always* tid_bit. + +CPU binding +----------- +This is going to be tough. It will be needed to detect that threads overlap +and are not bound (i.e. all threads on same mask). In this case, if the number +of threads is higher than the number of threads per physical socket, one must +try hard to evenly spread them among physical sockets (e.g. one thread group +per physical socket) and start as many threads as needed on each, bound to +all threads/cores of each socket. If there is a single socket, the same job +may be done based on L3 caches. Maybe it could always be done based on L3 +caches. The difficulty behind this is the number of sockets to be bound: it +is not possible to bind several FDs per listener. Maybe with a new bind +keyword we can imagine to automatically duplicate listeners ? In any case, +the initially bound cpumap (via taskset) must always be respected, and +everything should probably start from there. + +Frontend binding +---------------- +We'll have to define a list of threads and thread-groups per frontend. +Probably that having a group mask and a same thread-mask for each group +would suffice. + +Threads should have two numbers: + - the per-process number (e.g. 1..256) + - the per-group number (1..64) + +The "bind-thread" lines ought to use the following syntax: + - bind 45 ## bind to process' thread 45 + - bind 1/45 ## bind to group 1's thread 45 + - bind all/45 ## bind to thread 45 in each group + - bind 1/all ## bind to all threads in group 1 + - bind all ## bind to all threads + - bind all/all ## bind to all threads in all groups (=all) + - bind 1/65 ## rejected + - bind 65 ## OK if there are enough + - bind 35-45 ## depends. Rejected if it crosses a group boundary. + +The global directive "nbthread 28" means 28 total threads for the process. The +number of groups will sub-divide this. E.g. 4 groups will very likely imply 7 +threads per group. At the beginning, the nbgroup should be manual since it +implies config adjustments to bind lines. + +There should be a trivial way to map a global thread to a group and local ID +and to do the opposite. + + +Panic handler + watchdog +------------------------ +Will probably depend on what's done for thread_isolate + +Per-thread arrays inside structures +----------------------------------- +- listeners have a thr_conn[] array, currently limited to MAX_THREADS. Should + we simply bump the limit ? +- same for servers with idle connections. +=> doesn't seem very practical. +- another solution might be to point to dynamically allocated arrays of + arrays (e.g. nbthread * nbgroup) or a first level per group and a second + per thread. +=> dynamic allocation based on the global number + +Other +----- +- what about dynamic thread start/stop (e.g. for containers/VMs) ? + E.g. if we decide to start $MANY threads in 4 groups, and only use + one, in the end it will not be possible to use less than one thread + per group, and at most 64 will be present in each group. + + +FD Notes +-------- + - updt_fd_polling() uses thread_mask to figure where to send the update, + the local list or a shared list, and which bits to set in update_mask. + This could be changed so that it takes the update mask in argument. The + call from the poller's fork would just have to broadcast everywhere. + + - pollers use it to figure whether they're concerned or not by the activity + update. This looks important as otherwise we could re-enable polling on + an FD that changed to another thread. + + - thread_mask being a per-thread active mask looks more exact and is + precisely used this way by _update_fd(). In this case using it instead + of running_mask to gauge a change or temporarily lock it during a + removal could make sense. + + - running should be conditioned by thread. Polled not (since deferred + or migrated). In this case testing thread_mask can be enough most of + the time, but this requires synchronization that will have to be + extended to tgid.. But migration seems a different beast that we shouldn't + care about here: if first performed at the higher level it ought to + be safe. + +In practice the update_mask can be dropped to zero by the first fd_delete() +as the only authority allowed to fd_delete() is *the* owner, and as soon as +all running_mask are gone, the FD will be closed, hence removed from all +pollers. This will be the only way to make sure that update_mask always +refers to the current tgid. + +However, it may happen that a takeover within the same group causes a thread +to read the update_mask late, while the FD is being wiped by another thread. +That other thread may close it, causing another thread in another group to +catch it, and change the tgid and start to update the update_mask. This means +that it would be possible for a thread entering do_poll() to see the correct +tgid, then the fd would be closed, reopened and reassigned to another tgid, +and the thread would see its bit in the update_mask, being confused. Right +now this should already happen when the update_mask is not cleared, except +that upon wakeup a migration would be detected and that would be all. + +Thus we might need to set the running bit to prevent the FD from migrating +before reading update_mask, which also implies closing on fd_clr_running() == 0 :-( + +Also even fd_update_events() leaves a risk of updating update_mask after +clearing running, thus affecting the wrong one. Probably that update_mask +should be updated before clearing running_mask there. Also, how about not +creating an update on a close ? Not trivial if done before running, unless +thread_mask==0. + +########################################################### + +Current state: + + +Mux / takeover / fd_delete() code ||| poller code +-------------------------------------------------|||--------------------------------------------------- + \|/ +mux_takeover(): | fd_set_running(): + if (fd_takeover()<0) | old = {running, thread}; + return fail; | new = {tid_bit, tid_bit}; + ... | +fd_takeover(): | do { + atomic_or(running, tid_bit); | if (!(old.thread & tid_bit)) + old = {running, thread}; | return -1; + new = {tid_bit, tid_bit}; | new = { running | tid_bit, old.thread } + if (owner != expected) { | } while (!dwcas({running, thread}, &old, &new)); + atomic_and(runnning, ~tid_bit); | + return -1; // fail | fd_clr_running(): + } | return atomic_and_fetch(running, ~tid_bit); + | + while (old == {tid_bit, !=0 }) | poll(): + if (dwcas({running, thread}, &old, &new)) { | if (!owner) + atomic_and(runnning, ~tid_bit); | continue; + return 0; // success | + } | if (!(thread_mask & tid_bit)) { + } | epoll_ctl_del(); + | continue; + atomic_and(runnning, ~tid_bit); | } + return -1; // fail | + | // via fd_update_events() +fd_delete(): | if (fd_set_running() != -1) { + atomic_or(running, tid_bit); | iocb(); + atomic_store(thread, 0); | if (fd_clr_running() == 0 && !thread_mask) + if (fd_clr_running(fd) = 0) | fd_delete_orphan(); + fd_delete_orphan(); | } + + +The idle_conns_lock prevents the connection from being *picked* and released +while someone else is reading it. What it does is guarantee that on idle +connections, the caller of the IOCB will not dereference the task's context +while the connection is still in the idle list, since it might be picked then +freed at the same instant by another thread. As soon as the IOCB manages to +get that lock, it removes the connection from the list so that it cannot be +taken over anymore. Conversely, the mux's takeover() code runs under that +lock so that if it frees the connection and task, this will appear atomic +to the IOCB. The timeout task (which is another entry point for connection +deletion) does the same. Thus, when coming from the low-level (I/O or timeout): + - task always exists, but ctx checked under lock validates; conn removal + from list prevents takeover(). + - t->context is stable, except during changes under takeover lock. So + h2_timeout_task may well run on a different thread than h2_io_cb(). + +Coming from the top: + - takeover() done under lock() clears task's ctx and possibly closes the FD + (unless some running remains present). + +Unlikely but currently possible situations: + - multiple pollers (up to N) may have an idle connection's FD being + polled, if the connection was passed from thread to thread. The first + event on the connection would wake all of them. Most of them would + see fdtab[].owner set (the late ones might miss it). All but one would + see that their bit is missing from fdtab[].thread_mask and give up. + However, just after this test, others might take over the connection, + so in practice if terribly unlucky, all but 1 could see their bit in + thread_mask just before it gets removed, all of them set their bit + in running_mask, and all of them call iocb() (sock_conn_iocb()). + Thus all of them dereference the connection and touch the subscriber + with no protection, then end up in conn_notify_mux() that will call + the mux's wake(). + + - multiple pollers (up to N-1) might still be in fd_update_events() + manipulating fdtab[].state. The cause is that the "locked" variable + is determined by atleast2(thread_mask) but that thread_mask is read + at a random instant (i.e. it may be stolen by another one during a + takeover) since we don't yet hold running to prevent this from being + done. Thus we can arrive here with thread_mask==something_else (1bit), + locked==0 and fdtab[].state assigned non-atomically. + + - it looks like nothing prevents h2_release() from being called on a + thread (e.g. from the top or task timeout) while sock_conn_iocb() + dereferences the connection on another thread. Those killing the + connection don't yet consider the fact that it's an FD that others + might currently be waking up on. + +################### + +pb with counter: + +users count doesn't say who's using the FD and two users can do the same +close in turn. The thread_mask should define who's responsible for closing +the FD, and all those with a bit in it ought to do it. + + +2021-08-25 - update with minimal locking on tgid value +========== + + - tgid + refcount at once using CAS + - idle_conns lock during updates + - update: + if tgid differs => close happened, thus drop update + otherwise normal stuff. Lock tgid until running if needed. + - poll report: + if tgid differs => closed + if thread differs => stop polling (migrated) + keep tgid lock until running + - test on thread_id: + if (xadd(&tgid,65536) != my_tgid) { + // was closed + sub(&tgid, 65536) + return -1 + } + if !(thread_id & tidbit) => migrated/closed + set_running() + sub(tgid,65536) + - note: either fd_insert() or the final close() ought to set + polled and update to 0. + +2021-09-13 - tid / tgroups etc. +========== + + - tid currently is the thread's global ID. It's essentially used as an index + for arrays. It must be clearly stated that it works this way. + + - tid_bit makes no sense process-wide, so it must be redefined to represent + the thread's tid within its group. The name is not much welcome though, but + there are 286 of it that are not going to be changed that fast. + + - just like "ti" is the thread_info, we need to have "tg" pointing to the + thread_group. + + - other less commonly used elements should be retrieved from ti->xxx. E.g. + the thread's local ID. + + - lock debugging must reproduce tgid + + - an offset might be placed in the tgroup so that even with 64 threads max + we could have completely separate tid_bits over several groups. + +2021-09-15 - bind + listen() + rx +========== + + - thread_mask (in bind_conf->rx_settings) should become an array of + MAX_TGROUP longs. + - when parsing "thread 123" or "thread 2/37", the proper bit is set, + assuming the array is either a contiguous bitfield or a tgroup array. + An option RX_O_THR_PER_GRP or RX_O_THR_PER_PROC is set depending on + how the thread num was parsed, so that we reject mixes. + - end of parsing: entries translated to the cleanest form (to be determined) + - binding: for each socket()/bind()/listen()... just perform one extra dup() + for each tgroup and store the multiple FDs into an FD array indexed on + MAX_TGROUP. => allows to use one FD per tgroup for the same socket, hence + to have multiple entries in all tgroup pollers without requiring the user + to duplicate the bind line. + +2021-09-15 - global thread masks +========== + +Some global variables currently expect to know about thread IDs and it's +uncertain what must be done with them: + - global_tasks_mask /* Mask of threads with tasks in the global runqueue */ + => touched under the rq lock. Change it per-group ? What exact use is made ? + + - sleeping_thread_mask /* Threads that are about to sleep in poll() */ + => seems that it can be made per group + + - all_threads_mask: a bit complicated, derived from nbthread and used with + masks and with my_ffsl() to wake threads up. Should probably be per-group + but we might miss something for global. + + - stopping_thread_mask: used in combination with all_threads_mask, should + move per-group. + + - threads_harmless_mask: indicates all threads that are currently harmless in + that they promise not to access a shared resource. Must be made per-group + but then we'll likely need a second stage to have the harmless groups mask. + threads_idle_mask, threads_sync_mask, threads_want_rdv_mask go with the one + above. Maybe the right approach will be to request harmless on a group mask + so that we can detect collisions and arbiter them like today, but on top of + this it becomes possible to request harmless only on the local group if + desired. The subtlety is that requesting harmless at the group level does + not mean it's achieved since the requester cannot vouch for the other ones + in the same group. + +In addition, some variables are related to the global runqueue: + __decl_aligned_spinlock(rq_lock); /* spin lock related to run queue */ + struct eb_root rqueue; /* tree constituting the global run queue, accessed under rq_lock */ + unsigned int grq_total; /* total number of entries in the global run queue, atomic */ + static unsigned int global_rqueue_ticks; /* insertion count in the grq, use rq_lock */ + +And others to the global wait queue: + struct eb_root timers; /* sorted timers tree, global, accessed under wq_lock */ + __decl_aligned_rwlock(wq_lock); /* RW lock related to the wait queue */ + struct eb_root timers; /* sorted timers tree, global, accessed under wq_lock */ + + +2021-09-29 - group designation and masks +========== + +Neither FDs nor tasks will belong to incomplete subsets of threads spanning +over multiple thread groups. In addition there may be a difference between +configuration and operation (for FDs). This allows to fix the following rules: + + group mask description + 0 0 bind_conf: groups & thread not set. bind to any/all + task: it would be nice to mean "run on the same as the caller". + + 0 xxx bind_conf: thread set but not group: thread IDs are global + FD/task: group 0, mask xxx + + G>0 0 bind_conf: only group is set: bind to all threads of group G + FD/task: mask 0 not permitted (= not owned). May be used to + mention "any thread of this group", though already covered by + G/xxx like today. + + G>0 xxx bind_conf: Bind to these threads of this group + FD/task: group G, mask xxx + +It looks like keeping groups starting at zero internally complicates everything +though. But forcing it to start at 1 might also require that we rescan all tasks +to replace 0 with 1 upon startup. This would also allow group 0 to be special and +be used as the default group for any new thread creation, so that group0.count +would keep the number of unassigned threads. Let's try: + + group mask description + 0 0 bind_conf: groups & thread not set. bind to any/all + task: "run on the same group & thread as the caller". + + 0 xxx bind_conf: thread set but not group: thread IDs are global + FD/task: invalid. Or maybe for a task we could use this to + mean "run on current group, thread XXX", which would cover + the need for health checks (g/t 0/0 while sleeping, 0/xxx + while running) and have wake_expired_tasks() detect 0/0 and + wake them up to a random group. + + G>0 0 bind_conf: only group is set: bind to all threads of group G + FD/task: mask 0 not permitted (= not owned). May be used to + mention "any thread of this group", though already covered by + G/xxx like today. + + G>0 xxx bind_conf: Bind to these threads of this group + FD/task: group G, mask xxx + +With a single group declared in the config, group 0 would implicitly find the +first one. + + +The problem with the approach above is that a task queued in one group+thread's +wait queue could very well receive a signal from another thread and/or group, +and that there is no indication about where the task is queued, nor how to +dequeue it. Thus it seems that it's up to the application itself to unbind/ +rebind a task. This contradicts the principle of leaving a task waiting in a +wait queue and waking it anywhere. + +Another possibility might be to decide that a task having a defined group but +a mask of zero is shared and will always be queued into its group's wait queue. +However, upon expiry, the scheduler would notice the thread-mask 0 and would +broadcast it to any group. + +Right now in the code we have: + - 18 calls of task_new(tid_bit) + - 18 calls of task_new(MAX_THREADS_MASK) + - 2 calls with a single bit + +Thus it looks like "task_new_anywhere()", "task_new_on()" and +"task_new_here()" would be sufficient. diff --git a/doc/gpl.txt b/doc/gpl.txt new file mode 100644 index 0000000..f90922e --- /dev/null +++ b/doc/gpl.txt @@ -0,0 +1,340 @@ + GNU GENERAL PUBLIC LICENSE + Version 2, June 1991 + + Copyright (C) 1989, 1991 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software--to make sure the software is free for all its users. This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Lesser General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + + GNU GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The "Program", below, +refers to any such program or work, and a "work based on the Program" +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term "modification".) Each licensee is addressed as "you". + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. + + 1. You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + + 2. You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) You must cause the modified files to carry prominent notices + stating that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any + part thereof, to be licensed as a whole at no charge to all third + parties under the terms of this License. + + c) If the modified program normally reads commands interactively + when run, you must cause it, when started running for such + interactive use in the most ordinary way, to print or display an + announcement including an appropriate copyright notice and a + notice that there is no warranty (or else, saying that you provide + a warranty) and that users may redistribute the program under + these conditions, and telling the user how to view a copy of this + License. (Exception: if the Program itself is interactive but + does not normally print such an announcement, your work based on + the Program is not required to print an announcement.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable + source code, which must be distributed under the terms of Sections + 1 and 2 above on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three + years, to give any third party, for a charge no more than your + cost of physically performing source distribution, a complete + machine-readable copy of the corresponding source code, to be + distributed under the terms of Sections 1 and 2 above on a medium + customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer + to distribute corresponding source code. (This alternative is + allowed only for noncommercial distribution and only if you + received the program in object code or executable form with such + an offer, in accord with Subsection b above.) + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + + 4. You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + + 5. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + + 6. Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + + 7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 8. If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + + 9. The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any +later version", you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. + + 10. If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + + NO WARRANTY + + 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + + 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the "copyright" line and a pointer to where the full notice is found. + + <one line to give the program's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author + Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. + This is free software, and you are welcome to redistribute it + under certain conditions; type `show c' for details. + +The hypothetical commands `show w' and `show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may +be called something other than `show w' and `show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + `Gnomovision' (which makes passes at compilers) written by James Hacker. + + <signature of Ty Coon>, 1 April 1989 + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Lesser General +Public License instead of this License. diff --git a/doc/haproxy.1 b/doc/haproxy.1 new file mode 100644 index 0000000..4c2d786 --- /dev/null +++ b/doc/haproxy.1 @@ -0,0 +1,227 @@ +.TH HAPROXY 1 "17 August 2007" + +.SH NAME + +HAProxy \- fast and reliable http reverse proxy and load balancer + +.SH SYNOPSIS + +haproxy \-f <configuration\ file|dir> [\-L\ <name>] [\-n\ maxconn] [\-N\ maxconn] [\-C\ <dir>] [\-v|\-vv] [\-d] [\-D] [\-W] [\-Ws] [\-q] [\-V] [\-c] [\-p\ <pidfile>] [\-dk] [\-ds] [\-de] [\-dp] [\-db] [\-dM[<byte>]] [\-m\ <megs>] [\-x <unix_socket>] [{\-sf|\-st}\ pidlist...] + +.SH DESCRIPTION + +HAProxy is a TCP/HTTP reverse proxy which is particularly suited for +high availability environments. Indeed, it can: + \- route HTTP requests depending on statically assigned cookies ; + \- spread the load among several servers while assuring server + persistence through the use of HTTP cookies ; + \- switch to backup servers in the event a main one fails ; + \- accept connections to special ports dedicated to service + monitoring ; + \- stop accepting connections without breaking existing ones ; + \- add/modify/delete HTTP headers both ways ; + \- block requests matching a particular pattern ; + \- hold clients to the right application server depending on + application cookies + \- report detailed status as HTML pages to authenticated users from an + URI intercepted from the application. + +It needs very little resource. Its event-driven architecture allows it +to easily handle thousands of simultaneous connections on hundreds of +instances without risking the system's stability. + +.SH OPTIONS + +.TP +\fB\-f <configuration file|dir>\fP +Specify configuration file or directory path. If the argument is a directory +the files (and only files) it contains are added in lexical order (using +LC_COLLATE=C) ; only non hidden files with ".cfg" extension are added. + +.TP +\fB\-L <name>\fP +Set the local instance's peer name. Peers are defined in the \fBpeers\fP +configuration section and used for syncing stick tables between different +instances. If this option is not specified, the local hostname is used as peer +name. This name is exported in the $HAPROXY_LOCALPEER environment variable and +can be used in the configuration file. + +.TP +\fB\-n <maxconn>\fP +Set the high limit for the total number of simultaneous connections. + +.TP +\fB\-N <maxconn>\fP +Set the high limit for the per-listener number of simultaneous connections. + +.TP +\fB\-C <dir>\fP +Change directory to <\fIdir\fP> before loading any files. + +.TP +\fB\-v\fP +Display HAProxy's version. + +.TP +\fB\-vv\fP +Display HAProxy's version and all build options. + +.TP +\fB\-d\fP +Start in foreground with debugging mode enabled. +When the proxy runs in this mode, it dumps every connections, +disconnections, timestamps, and HTTP headers to stdout. This should +NEVER be used in an init script since it will prevent the system from +starting up. + +.TP +\fB\-D\fP +Start in daemon mode. + +.TP +\fB\-W\fP +Start in master-worker mode. Could be used either with foreground or daemon +mode. + +.TP +\fB\-Ws\fP +Start in master-worker mode with systemd notify support. It tells systemd when +the process is ready. This mode forces foreground. + +.TP +\fB\-q\fP +Disable messages on output. + +.TP +\fB\-V\fP +Displays messages on output even when \-q or 'quiet' are specified. Some +information about pollers and config file are displayed during startup. + +.TP +\fB\-c\fP +Only checks config file and exits with code 0 if no error was found, or +exits with code 1 if a syntax error was found. + +.TP +\fB\-p <pidfile>\fP +Ask the process to write down each of its children's pids to this file +in daemon mode or ask the process to write down its master's pid to +this file in master-worker mode. + +.TP +\fB\-dk\fP +Disable use of \fBkqueue\fP(2). \fBkqueue\fP(2) is available only on BSD systems. + +.TP +\fB\-dv\fP +Disable use of event ports. Event ports are available only on SunOS systems +derived from Solaris 10 and later (including illumos systems). + +.TP +\fB\-ds\fP +Disable use of speculative \fBepoll\fP(7). \fBepoll\fP(7) is available only on +Linux 2.6 and some custom Linux 2.4 systems. + +.TP +\fB\-de\fP +Disable use of \fBepoll\fP(7). \fBepoll\fP(7) is available only on Linux 2.6 +and some custom Linux 2.4 systems. + +.TP +\fB\-dp\fP +Disables use of \fBpoll\fP(2). \fBselect\fP(2) might be used instead. + +.TP +\fB\-dS\fP +Disables use of \fBsplice\fP(2), which is broken on older kernels. + +.TP +\fB\-db\fP +Disables background mode (stays in foreground, useful for debugging). +For debugging, the '\-db' option is very useful as it temporarily +disables daemon mode and multi-process mode. The service can then be +stopped by simply pressing Ctrl-C, without having to edit the config nor +run full debug. + +.TP +\fB\-dM[<byte>]\fP +Initializes all allocated memory areas with the given <\fIbyte\fP>. This makes +it easier to detect bugs resulting from uninitialized memory accesses, at the +expense of touching all allocated memory once. If <\fIbyte\fP> is not +specified, it defaults to 0x50 (ASCII 'P'). + +.TP +\fB\-m <megs>\fP +Enforce a memory usage limit to a maximum of <megs> megabytes. + +.TP +\fB\-sf <pidlist>\fP +Send FINISH signal to the pids in pidlist after startup. The processes +which receive this signal will wait for all sessions to finish before +exiting. This option must be specified last, followed by any number of +PIDs. Technically speaking, \fBSIGTTOU\fP and \fBSIGUSR1\fP are sent. + +.TP +\fB\-st <pidlist>\fP +Send TERMINATE signal to the pids in pidlist after startup. The processes +which receive this signal will terminate immediately, closing all active +sessions. This option must be specified last, followed by any number of +PIDs. Technically speaking, \fBSIGTTOU\fP and \fBSIGTERM\fP are sent. + +.TP +\f8\-x <unix_socket>\fP +Attempt to connect to the unix socket, and retrieve all the listening sockets +from the old process. Those sockets will then be used if possible instead of +binding new ones. + +.TP +\fB\-S <bind>[,<bind options>...]\fP +In master-worker mode, create a master CLI. This CLI will enable access to the +CLI of every worker. Useful for debugging, it's a convenient way of accessing a +leaving process. + +.SH LOGGING +Since HAProxy can run inside a chroot, it cannot reliably access /dev/log. +For this reason, it uses the UDP protocol to send its logs to the server, +even if it is the local server. People who experience trouble receiving +logs should ensure that their syslog daemon listens to the UDP socket. +Several Linux distributions which ship with syslogd from the sysklogd +package have UDP disabled by default. The \fB\-r\fP option must be passed +to the daemon in order to enable UDP. + +.SH SIGNALS +Some signals have a special meaning for the haproxy daemon. Generally, they are used between daemons and need not be used by the administrator. +.TP +\- \fBSIGUSR1\fP +Tells the daemon to stop all proxies and exit once all sessions are closed. It is often referred to as the "soft-stop" signal. +.TP +\- \fBSIGUSR2\fP +In master-worker mode, reloads the configuration and sends a soft-stop signal to old processes. +.TP +\- \fBSIGTTOU\fP +Tells the daemon to stop listening to all sockets. Used internally by \fB\-sf\fP and \fB\-st\fP. +.TP +\- \fBSIGTTIN\fP +Tells the daemon to restart listening to all sockets after a \fBSIGTTOU\fP. Used internally when there was a problem during hot reconfiguration. +.TP +\- \fBSIGINT\fP and \fBSIGTERM\fP +Both signals can be used to quickly stop the daemon. +.TP +\- \fBSIGHUP\fP +Dumps the status of all proxies and servers into the logs. Mostly used for trouble-shooting purposes. +.TP +\- \fBSIGQUIT\fP +Dumps information about memory pools on stderr. Mostly used for debugging purposes. +.TP +\- \fBSIGPIPE\fP +This signal is intercepted and ignored on systems without \fBMSG_NOSIGNAL\fP. + +.SH SEE ALSO + +A much better documentation can be found in configuration.txt. On Debian +systems, you can find this file in /usr/share/doc/haproxy/configuration.txt.gz. + +.SH AUTHOR + +HAProxy was written by Willy Tarreau. This man page was written by Arnaud Cornet and Willy Tarreau. + diff --git a/doc/internals/acl.txt b/doc/internals/acl.txt new file mode 100644 index 0000000..0379331 --- /dev/null +++ b/doc/internals/acl.txt @@ -0,0 +1,82 @@ +2011/12/16 - How ACLs work internally in haproxy - w@1wt.eu + +An ACL is declared by the keyword "acl" followed by a name, followed by a +matching method, followed by one or multiple pattern values : + + acl internal src 127.0.0.0/8 10.0.0.0/8 192.168.0.0/16 + +In the statement above, "internal" is the ACL's name (acl->name), "src" is the +ACL keyword defining the matching method (acl_expr->kw) and the IP addresses +are patterns of type acl_pattern to match against the source address. + +The acl_pattern struct may define one single pattern, a range of values or a +tree of values to match against. The type of the patterns is implied by the +ACL keyword. For instance, the "src" keyword implies IPv4 patterns. + +The line above constitutes an ACL expression (acl_expr). ACL expressions are +formed of a keyword, an optional argument for the keyword, and a list of +patterns (in fact, both a list and a root tree). + +Dynamic values are extracted according to a fetch function defined by the ACL +keyword. This fetch function fills or updates a struct acl_test with all the +extracted information so that a match function can compare it against all the +patterns. The fetch function is called iteratively by the ACL engine until it +reports no more value. This makes sense for instance when checking IP addresses +found in HTTP headers, which can appear multiple times. The acl_test is kept +intact between calls and even holds a context so that the fetch function knows +where to start from for subsequent calls. The match function may also use the +context even though it was not designed for that purpose. + +An ACL is defined only by its name and can be a series of ACL expressions. The +ACL is deemed true when any of its expressions is true. They are evaluated in +the declared order and can involve multiple matching methods. + +So in summary : + + - an ACL is a series of tests to perform on a stream, any of which is enough + to validate the result. + + - each test is defined by an expression associating a keyword and a series of + patterns. + + - a keyword implies several things at once : + - the type of the patterns and how to parse them + - the method to fetch the required information from the stream + - the method to match the fetched information against the patterns + + - a fetch function fills an acl_test struct which is passed to the match + function defined by the keyword + + - the match function tries to match the value in the acl_test against the + pattern list declared in the expression which involved its acl_keyword. + + +ACLs are used by conditional processing rules. A rule generally uses an "if" or +"unless" keyword followed by an ACL condition (acl_cond). This condition is a +series of term suites which are ORed together. Each term suite is a series of +terms which are ANDed together. Terms may be negated before being evaluated in +a suite. A term simply is a pointer to an ACL. + +We could then represent a rule by the following BNF : + + rule = if-cond + | unless-cond + + if-cond (struct acl_cond with ->pol = ACL_COND_IF) + = "if" condition + + unless-cond (struct acl_cond with ->pol = ACL_COND_UNLESS) + = "unless" condition + + condition + = term-suite + | term-suite "||" term-suite + | term-suite "or" term-suite + + term-suite (struct acl_term_suite) + = term + | term term + + term = acl + | "!" acl + diff --git a/doc/internals/api/appctx.txt b/doc/internals/api/appctx.txt new file mode 100644 index 0000000..137ec7b --- /dev/null +++ b/doc/internals/api/appctx.txt @@ -0,0 +1,142 @@ +Instantiation of applet contexts (appctx) in 2.6. + + +1. Background + +Most applets are in fact simplified services that are called by the CLI when a +registered keyword is matched. Some of them only have a ->parse() function +which immediately returns with a final result, while others will return zero +asking for the->io_handler() one to be called till the end. For these ones, a +context is generally needed between calls to know where to restart from. + +Other applets are completely autonomous applets with their init function and +an I/O handler, and these ones also need a persistent context between calls to +the I/O handler. These ones are typically instantiated by "use-service" or by +other means. + +Originally a few integers were provided to keep a trivial state (st0, st1, st2) +and these ones progressively proved insufficient, leading to a "ctx.cli" sub- +context that was allowed to use extra fields of various types. Other applets +preferred to use their own context definition. + +All this resulted in the appctx->ctx to contain a myriad of definitions of +various service contexts, and in some services abusing other services' +definitions by laziness, and others being extended to use their own definition +after having run for a long time on the generic types, some of which were not +noticed and mistakenly used the same storage locations by accident. A massive +cleanup was needed. + + +2. New approach in 2.6 + +In 2.6, there's an "svcctx" pointer that's initialized to NULL before any +instantiation of an applet or of a CLI keyword's function. Applets and keyword +handlers are free to make it point wherever they want, and to find it unaltered +between subsequent calls, including up to the ->release() call. The "st2" state +that was totally abused with random enums is not used anymore and was marked as +deprecated. It's still initialized to zero before the first call though. + +One special area, "svc.storage[]", is large enough to contain any of the +contexts that used to be present under "appctx->ctx". The "svcctx" may be set +to point to this area so that a small structure can be allocated for free and +without requiring error checking. In order to make this easier, a specially +purposed function is provided: "applet_reserve_svcctx()". This function will +require the caller to indicate how large an area it needs, and will return a +pointer to this area after checking that it fits. If it does not, haproxy will +crash. This is purposely done so that it's known during development that if a +small structure doesn't fit, a different approach is required. + +As such, for the vast majority of commands, the process is the following one: + + struct foo_ctx { + int myfield1; + int myfield2; + char *myfield3; + }; + + int io_handler(struct appctx *appctx) + { + struct foo_ctx *ctx = applet_reserve_svcctx(appctx, sizeof(*ctx)); + + if (!ctx->myfield1) { + /* first call */ + ctx->myfield1++; + } + ... + } + +The pointer may be directly accessed from the I/O handler if it's known that it +was already reserved by the init handler or parsing function. Otherwise it's +guaranteed to be NULL so that can also serve as a test for a first call: + + int parse_handler(struct appctx *appctx) + { + struct foo_ctx *ctx = applet_reserve_svcctx(appctx, sizeof(*ctx)); + ctx->myfield1 = 12; + return 0; + } + + int io_handler(struct appctx *appctx) + { + struct foo_ctx *ctx = appctx->svcctx; + + for (; !ctx->myfield1; ctx->myfield1--) { + do_something(); + } + ... + } + +There is no need to free anything because that space is not allocated but just +points to a reserved area. + +If it is too small (its size is APPLET_MAX_SVCCTX bytes), it is preferable to +use it with dynamically allocated structures (pools, malloc, etc). For example: + + int io_handler(struct appctx *appctx) + { + struct foo_ctx *ctx = appctx->svcctx; + + if (!ctx) { + /* first call */ + ctx = pool_alloc(pool_foo_ctx); + if (!ctx) + return 1; + } + ... + } + + void io_release(struct appctx *appctx) + { + pool_free(pool_foo_ctx, appctx->svcctx); + } + +The CLI code itself uses this mechanism for the cli_print_*() functions. Since +these functions are terminal (i.e. not meant to be used in the middle of an I/O +handler as they share the same contextual space), they always reset the svcctx +pointer to place it to the "cli_print_ctx" mapped in ->svc.storage. + + +3. Transition for old code + +A lot of care was taken to make the transition as smooth as possible for +out-of-tree code since that's an API change. A dummy "ctx.cli" struct still +exists in the appctx struct, and it happens to map perfectly to the one set by +cli_print_*, so that if some code uses a mix of both, it will still work. +However, it will build with "deprecated" warnings allowing to spot the +remaining places. It's a good exercise to rename "ctx.cli" in "appctx" and see +if the code still compiles. + +Regarding the "st2" sub-state, it will disappear as well after 2.6, but is +still provided and initialized so that code relying on it will still work even +if it builds with deprecation warnings. The correct approach is to move this +state into the newly defined applet's context, and to stop using the stats +enums STAT_ST_* that often barely match the needs and result in code that is +more complicated than desired (the STAT_ST_* enum values have also been marked +as deprecated). + +The code dealing with "show fd", "show sess" and the peers applet show good +examples of how to convert a registered keyword or an applet. + +All this transition code requires complex layouts that will be removed during +2.7-dev so there is no other long-term option but to update the code (or better +get it merged if it can be useful to other users). diff --git a/doc/internals/api/buffer-api.txt b/doc/internals/api/buffer-api.txt new file mode 100644 index 0000000..ac35300 --- /dev/null +++ b/doc/internals/api/buffer-api.txt @@ -0,0 +1,653 @@ +2018-07-13 - HAProxy Internal Buffer API + + +1. Background + +HAProxy uses a "struct buffer" internally to store data received from external +agents, as well as data to be sent to external agents. These buffers are also +used during data transformation such as compression, header insertion or +defragmentation, and are used to carry intermediary representations between the +various internal layers. They support wrapping at the end, and they carry their +own size information so that in theory it would be possible to use different +buffer sizes in parallel even though this is not currently implemented. + +The format of this structure has evolved over time, to reach a point where it +is convenient and versatile enough to have permitted to make several internal +types converge into a single one (specifically the struct chunk disappeared). + + +2. Representation as of 1.9-dev1 + +The current buffer representation consists in a linear storage area of known +size, with a head position indicating the oldest data, and a total data count +expressed in bytes. The head position, data count and size are expressed as +integers and are positive or null. By convention, the head position is strictly +smaller than the buffer size and the data count is smaller than or equal to the +size, so that wrapping can be resolved with a single subtract. A buffer not +respecting these rules is said to be degenerate. Unless specified otherwise, +the various API functions will adopt an undefined behaviour when passed such a +degenerate buffer. + + Buffer declaration : + + struct buffer { + size_t size; // size of the storage area (wrapping point) + char *area; // start of the storage area + size_t data; // contents length after head + size_t head; // start offset of remaining data relative to area + }; + + + Linear buffer representation : + + area + | + V<--------------------------------------------------------->| size + +-----------+---------------------------------+-------------+ + | |/////////////////////////////////| | + +-----------+---------------------------------+-------------+ + |<--------->|<------------------------------->| + head data ^ + | + tail + + + Wrapping buffer representation : + + area + | + V<--------------------------------------------------------->| size + +---------------+------------------------+------------------+ + |///////////////| |//////////////////| + +---------------+------------------------+------------------+ + |<-------------------------------------->| head + |-------------->| ...data data...|<-----------------| + ^ + | + tail + + +3. Terminology + +Manipulating a buffer just based on a head and a wrapping data count is not +very convenient, so we define a certain number of terms for important elements +characterizing a buffer : + + - origin : pointer to relative position 0 in the storage area. Undefined + when the buffer is not allocated. + + - size : the allocated size of the storage area starting at the origin, + expressed in bytes. A buffer whose size is zero is said not to + be allocated, and its origin in this case is undefined. + + - data : the amount of data the buffer contains, in bytes. It is always + lower than or equal to the buffer's size, hence it is always 0 + for an unallocated buffer. + + - emptiness : a buffer is said to be empty when it contains no data, hence + data == 0. It is possible for such buffers not to be allocated + and to have size == 0 as well. + + - room : the available space in the buffer. This is its size minus data. + + - head : position relative to origin where the oldest data byte is found + (it typically is what send() uses to pick outgoing data). The + head is strictly smaller than the size. + + - tail : position relative to origin where the first spare byte is found + (it typically is what recv() uses to store incoming data). It + is always equal to the buffer's data added to its head modulo + the buffer's size. + + - wrapping : the byte following the last one of the storage area loops back + to position 0. This is called wrapping. The wrapping point is + the first position relative to origin which doesn't belong to + the storage area. There is no wrapping when a buffer is not + allocated. Wrapping requires special care and means that the + regular string manipulation functions are not usable on most + buffers, unless it is known that no wrapping happens. Free + space may wrap as well if the buffer only contains data in the + middle. + + - alignment : a buffer is said to be aligned if its data do not wrap. That + is, its head is strictly before the tail, or the buffer is + empty and the head is null. Aligning a buffer may be required + to use regular string manipulation functions which have no + support for wrapping. + + +A buffer may be in three different states : + - unallocated : size == 0, area == 0 (b_is_null() is true) + - waiting : size == 0, area != 0 + - allocated : size > 0, area > 0 + +It is not permitted to have area == 0 with a non-null size. In addition, the +waiting state may also be used to indicate a read-only buffer which does not +wrap and which must not be freed (e.g. for use with error messages). + +The basic API only covers allocated buffers. Switching to/from the other states +is covered by the management API since it requires specific allocation and free +calls. + + +4. Using buffers + +Buffers are defined in a few files : + - include/common/buf.h : structure definition, and manipulation functions + - include/common/buffer.h : resource management (alloc/free/wait lists) + - include/common/istbuf.h : advanced string manipulation + + +4.1. Basic API + +The basic API is made of the functions which abstract accesses to the buffers +and which help calculating their state, free space or used space. + +====================+==================+======================================= +Function | Arguments/Return | Description +--------------------+------------------+--------------------------------------- +b_is_null() | const buffer *buf| returns true if (and only if) the + | ret: int | buffer is not yet allocated and thus + | | points to a NULL area +--------------------+------------------+--------------------------------------- +b_orig() | const buffer *buf| returns the pointer to the origin of + | ret: char * | the storage, which is the location of + | | byte at offset zero. This is mostly + | | used by functions which handle the + | | wrapping by themselves +--------------------+------------------+--------------------------------------- +b_size() | const buffer *buf| returns the size of the buffer + | ret: size_t | +--------------------+------------------+--------------------------------------- +b_wrap() | const buffer *buf| returns the pointer to the wrapping + | ret: char * | position of the buffer area, which is + | | by definition the first byte not part + | | of the buffer +--------------------+------------------+--------------------------------------- +b_data() | const buffer *buf| returns the number of bytes present in + | ret: size_t | the buffer +--------------------+------------------+--------------------------------------- +b_room() | const buffer *buf| returns the amount of room left in the + | ret: size_t | buffer +--------------------+------------------+--------------------------------------- +b_full() | const buffer *buf| returns true if the buffer is full + | ret: int | +--------------------+------------------+--------------------------------------- +__b_stop() | const buffer *buf| returns a pointer to the byte + | ret: char * | following the end of the buffer, which + | | may be out of the buffer if the buffer + | | ends on the last byte of the area. It + | | is the caller's responsibility to + | | either know that the buffer does not + | | wrap or to check that the result does + | | not wrap +--------------------+------------------+--------------------------------------- +__b_stop_ofs() | const buffer *buf| returns an origin-relative offset + | ret: size_t | pointing to the byte following the end + | | of the buffer, which may be out of the + | | buffer if the buffer ends on the last + | | byte of the area. It's the caller's + | | responsibility to either know that the + | | buffer does not wrap or to check that + | | the result does not wrap +--------------------+------------------+--------------------------------------- +b_stop() | const buffer *buf| returns the pointer to the byte + | ret: char * | following the end of the buffer, which + | | may be out of the buffer if the buffer + | | ends on the last byte of the area +--------------------+------------------+--------------------------------------- +b_stop_ofs() | const buffer *buf| returns an origin-relative offset + | ret: size_t | pointing to the byte following the end + | | of the buffer, which may be out of the + | | buffer if the buffer ends on the last + | | byte of the area +--------------------+------------------+--------------------------------------- +__b_peek() | const buffer *buf| returns a pointer to the data at + | size_t ofs | position <ofs> relative to the head of + | ret: char * | the buffer. Will typically point to + | | input data if called with the amount + | | of output data. It's the caller's + | | responsibility to either know that the + | | buffer does not wrap or to check that + | | the result does not wrap +--------------------+------------------+--------------------------------------- +__b_peek_ofs() | const buffer *buf| returns an origin-relative offset + | size_t ofs | pointing to the data at position <ofs> + | ret: size_t | relative to the head of the + | | buffer. Will typically point to input + | | data if called with the amount of + | | output data. It's the caller's + | | responsibility to either know that the + | | buffer does not wrap or to check that + | | the result does not wrap +--------------------+------------------+--------------------------------------- +b_peek() | const buffer *buf| returns a pointer to the data at + | size_t ofs | position <ofs> relative to the head of + | ret: char * | the buffer. Will typically point to + | | input data if called with the amount + | | of output data. If applying <ofs> to + | | the buffers' head results in a + | | position between <size> and 2*>size>-1 + | | included, a wrapping compensation is + | | applied to the result +--------------------+------------------+--------------------------------------- +b_peek_ofs() | const buffer *buf| returns an origin-relative offset + | size_t ofs | pointing to the data at position <ofs> + | ret: size_t | relative to the head of the + | | buffer. Will typically point to input + | | data if called with the amount of + | | output data. If applying <ofs> to the + | | buffers' head results in a position + | | between <size> and 2*>size>-1 + | | included, a wrapping compensation is + | | applied to the result +--------------------+------------------+--------------------------------------- +__b_head() | const buffer *buf| returns the pointer to the buffer's + | ret: char * | head, which is the location of the + | | next byte to be dequeued. The result + | | is undefined for unallocated buffers +--------------------+------------------+--------------------------------------- +__b_head_ofs() | const buffer *buf| returns an origin-relative offset + | ret: size_t | pointing to the buffer's head, which + | | is the location of the next byte to be + | | dequeued. The result is undefined for + | | unallocated buffers +--------------------+------------------+--------------------------------------- +b_head() | const buffer *buf| returns the pointer to the buffer's + | ret: char * | head, which is the location of the + | | next byte to be dequeued. The result + | | is undefined for unallocated + | | buffers. If applying <ofs> to the + | | buffers' head results in a position + | | between <size> and 2*>size>-1 + | | included, a wrapping compensation is + | | applied to the result +--------------------+------------------+--------------------------------------- +b_head_ofs() | const buffer *buf| returns an origin-relative offset + | ret: size_t | pointing to the buffer's head, which + | | is the location of the next byte to be + | | dequeued. The result is undefined for + | | unallocated buffers. If applying + | | <ofs> to the buffers' head results in + | | a position between <size> and + | | 2*>size>-1 included, a wrapping + | | compensation is applied to the result +--------------------+------------------+--------------------------------------- +__b_tail() | const buffer *buf| returns the pointer to the tail of the + | ret: char * | buffer, which is the location of the + | | first byte where it is possible to + | | enqueue new data. The result is + | | undefined for unallocated buffers +--------------------+------------------+--------------------------------------- +__b_tail_ofs() | const buffer *buf| returns an origin-relative offset + | ret: size_t | pointing to the tail of the buffer, + | | which is the location of the first + | | byte where it is possible to enqueue + | | new data. The result is undefined for + | | unallocated buffers +--------------------+------------------+--------------------------------------- +b_tail() | const buffer *buf| returns the pointer to the tail of the + | ret: char * | buffer, which is the location of the + | | first byte where it is possible to + | | enqueue new data. The result is + | | undefined for unallocated buffers +--------------------+------------------+--------------------------------------- +b_tail_ofs() | const buffer *buf| returns an origin-relative offset + | ret: size_t | pointing to the tail of the buffer, + | | which is the location of the first + | | byte where it is possible to enqueue + | | new data. The result is undefined for + | | unallocated buffers +--------------------+------------------+--------------------------------------- +b_next() | const buffer *buf| for an absolute pointer <p> pointing + | const char *p | to a valid location within buffer <b>, + | ret: char * | returns the absolute pointer to the + | | next byte, which usually is at (p + 1) + | | unless p reaches the wrapping point + | | and wrapping is needed +--------------------+------------------+--------------------------------------- +b_next_ofs() | const buffer *buf| for an origin-relative offset <o> + | size_t o | pointing to a valid location within + | ret: size_t | buffer <b>, returns either the + | | relative offset pointing to the next + | | byte, which usually is at (o + 1) + | | unless o reaches the wrapping point + | | and wrapping is needed +--------------------+------------------+--------------------------------------- +b_dist() | const buffer *buf| returns the distance between two + | const char *from | pointers, taking into account the + | const char *to | ability to wrap around the buffer's + | ret: size_t | end. The operation is not defined if + | | either of the pointers does not belong + | | to the buffer or if their distance is + | | greater than the buffer's size +--------------------+------------------+--------------------------------------- +b_almost_full() | const buffer *buf| returns 1 if the buffer uses at least + | ret: int | 3/4 of its capacity, otherwise + | | zero. Buffers of size zero are + | | considered full +--------------------+------------------+--------------------------------------- +b_space_wraps() | const buffer *buf| returns non-zero only if the buffer's + | ret: int | free space wraps, which means that the + | | buffer contains data that are not + | | touching at least one edge +--------------------+------------------+--------------------------------------- +b_contig_data() | const buffer *buf| returns the amount of data that can + | size_t start | contiguously be read at once starting + | ret: size_t | from a relative offset <start> (which + | | allows to easily pre-compute blocks + | | for memcpy). The start point will + | | typically contain the amount of past + | | data already returned by a previous + | | call to this function +--------------------+------------------+--------------------------------------- +b_contig_space() | const buffer *buf| returns the amount of bytes that can + | ret: size_t | be appended to the buffer at once +--------------------+------------------+--------------------------------------- +b_getblk() | const buffer *buf| gets one full block of data at once + | char *blk | from a buffer, starting from offset + | size_t len | <offset> after the buffer's head, and + | size_t offset | limited to no more than <len> bytes. + | ret: size_t | The caller is responsible for ensuring + | | that neither <offset> nor <offset> + + | | <len> exceed the total number of bytes + | | available in the buffer. Return zero + | | if not enough data was available, in + | | which case blk is left undefined, or + | | the number of bytes read which is + | | equal to the requested size +--------------------+------------------+--------------------------------------- +b_getblk_nc() | const buffer *buf| gets one or two blocks of data at once + | const char **blk1| from a buffer, starting from offset + | size_t *len1 | <ofs> after the beginning of its + | const char **blk2| output, and limited to no more than + | size_t *len2 | <max> bytes. The caller is responsible + | size_t ofs | for ensuring that neither <ofs> nor + | size_t max | <ofs>+<max> exceed the total number of + | ret: int | bytes available in the buffer. Returns + | | 0 if not enough data were available, + | | or the number of blocks filled (1 or + | | 2). <blk1> is always filled before + | | <blk2>. The unused blocks are left + | | undefined, and the buffer is left + | | unaffected. Unused buffers are left in + | | an undefined state +--------------------+------------------+--------------------------------------- +b_reset() | buffer *buf | resets a buffer. The size is not + | ret: void | touched. In practice it resets the + | | head and the data length +--------------------+------------------+--------------------------------------- +b_sub() | buffer *buf | decreases the buffer length by <count> + | size_t count | without touching the head position + | ret: void | (only the tail moves). this may mostly + | | be used to trim pending data before + | | reusing a buffer. The caller is + | | responsible for not removing more than + | | the available data +--------------------+------------------+--------------------------------------- +b_add() | buffer *buf | increase the buffer length by <count> + | size_t count | without touching the head position + | ret: void | (only the tail moves). This is used + | | when adding data at the tail of a + | | buffer. The caller is responsible for + | | not adding more than the available + | | room +--------------------+------------------+--------------------------------------- +b_set_data() | buffer *buf | sets the buffer's length, by adjusting + | size_t len | the buffer's tail only. The caller is + | ret: void | responsible for passing a valid length +--------------------+------------------+--------------------------------------- +b_del() | buffer *buf | deletes <del> bytes at the head of + | size_t del | buffer <b> and updates the head. The + | ret: void | caller is responsible for not removing + | | more than the available data. This is + | | used after sending data from the + | | buffer +--------------------+------------------+--------------------------------------- +b_realign_if_empty()| buffer *buf | realigns a buffer if it's empty, does + | ret: void | nothing otherwise. This is mostly used + | | after b_del() to make an empty + | | buffer's free space contiguous +--------------------+------------------+--------------------------------------- +b_slow_realign() | buffer *buf | realigns a possibly wrapping buffer so + | size_t output | that the part remaining to be parsed + | ret: void | is contiguous and starts at the + | | beginning of the buffer and the + | | already parsed output part ends at the + | | end of the buffer. This provides the + | | best conditions since it allows the + | | largest inputs to be processed at once + | | and ensures that once the output data + | | leaves, the whole buffer is available + | | at once. The number of output bytes + | | supposedly present at the beginning of + | | the buffer and which need to be moved + | | to the end must be passed in <output>. + | | It will effectively make this offset + | | the new wrapping point. A temporary + | | swap area at least as large as b->size + | | must be provided in <swap>. It's up + | | to the caller to ensure <output> is no + | | larger than the difference between the + | | whole buffer's length and its input +--------------------+------------------+--------------------------------------- +b_putchar() | buffer *buf | tries to append char <c> at the end of + | char c | buffer <b>. Supports wrapping. New + | ret: void | data are silently discarded if the + | | buffer is already full +--------------------+------------------+--------------------------------------- +b_putblk() | buffer *buf | tries to append block <blk> at the end + | const char *blk | of buffer <b>. Supports wrapping. Data + | size_t len | are truncated if the buffer is too + | ret: size_t | short or if not enough space is + | | available. It returns the number of + | | bytes really copied +--------------------+------------------+--------------------------------------- +b_move() | buffer *buf | moves block (src,len) left or right + | size_t src | by <shift> bytes, supporting wrapping + | size_t len | and overlapping. + | size_t shift | +--------------------+------------------+--------------------------------------- +b_rep_blk() | buffer *buf | writes the block <blk> at position + | char *pos | <pos> which must be in buffer <b>, and + | char *end | moves the part between <end> and the + | const char *blk | buffer's tail just after the end of + | size_t len | the copy of <blk>. This effectively + | ret: int | replaces the part located between + | | <pos> and <end> with a copy of <blk> + | | of length <len>. The buffer's length + | | is automatically updated. This is used + | | to replace a block with another one + | | inside a buffer. The shift value + | | (positive or negative) is returned. If + | | there's no space left, the move is not + | | done. If <len> is null, the <blk> + | | pointer is allowed to be null, in + | | order to erase a block +--------------------+------------------+--------------------------------------- +b_xfer() | buffer *src | transfers at most <count> bytes from + | buffer *dst | buffer <src> to buffer <dst> and + | size_t cout | returns the number of bytes copied. + | ret: size_t | The bytes are removed from <src> and + | | added to <dst>. The caller guarantees + | | that <count> is <= b_room(dst) +====================+==================+======================================= + + +4.2. String API + +The string API aims at providing both convenient and efficient ways to read and +write to/from buffers using indirect strings (ist). These strings and some +associated functions are defined in ist.h. + +====================+==================+======================================= +Function | Arguments/Return | Description +--------------------+------------------+--------------------------------------- +b_isteq() | const buffer *b | b_isteq() : returns > 0 if the first + | size_t o | <n> characters of buffer <b> starting + | size_t n | at offset <o> relative to the buffer's + | const ist ist | head match <ist>. (empty strings do + | ret: int | match). It is designed to be used with + | | reasonably small strings (it matches a + | | single byte per loop iteration). It is + | | expected to be used with an offset to + | | skip old data. Return value number of + | | matching bytes if >0, not enough bytes + | | or empty string if 0, or non-matching + | | byte found if <0. +--------------------+------------------+--------------------------------------- +b_isteat | struct buffer *b | b_isteat() : "eats" string <ist> from + | const ist ist | the head of buffer <b>. Wrapping data + | ret: ssize_t | is explicitly supported. It matches a + | | single byte per iteration so strings + | | should remain reasonably small. + | | Returns the number of bytes matched + | | and eaten if >0, not enough bytes or + | | matched empty string if 0, or non + | | matching byte found if <0. +--------------------+------------------+--------------------------------------- +b_istput | struct buffer *b | b_istput() : injects string <ist> at + | const ist ist | the tail of output buffer <b> provided + | ret: ssize_t | that it fits. Wrapping is supported. + | | It's designed for small strings as it + | | only writes a single byte per + | | iteration. Returns the number of + | | characters copied (ist.len), 0 if it + | | temporarily does not fit, or -1 if it + | | will never fit. It will only modify + | | the buffer upon success. In all cases, + | | the contents are copied prior to + | | reporting an error, so that the + | | destination at least contains a valid + | | but truncated string. +--------------------+------------------+--------------------------------------- +b_putist | struct buffer *b | b_putist() : tries to copy as much as + | const ist ist | possible of string <ist> into buffer + | ret: size_t | <b> and returns the number of bytes + | | copied (truncation is possible). It + | | uses b_putblk() and is suitable for + | | large blocks. +====================+==================+======================================= + + +4.3. Management API + +The management API makes a distinction between an empty buffer, which by +definition is not allocated but is ready to be allocated at any time, and a +buffer which failed an allocation and is waiting for an available area to be +offered. The functions allow to register on a list to be notified about buffer +availability, to notify others of a number of buffers just released, and to be +and to be notified of buffer availability. All allocations are made through the +standard buffer pools. + +====================+==================+======================================= +Function | Arguments/Return | Description +--------------------+------------------+--------------------------------------- +buffer_almost_full | const buffer *buf| returns true if the buffer is not null + | ret: int | and at least 3/4 of the buffer's space + | | are used. A waiting buffer will match. +--------------------+------------------+--------------------------------------- +b_alloc | buffer *buf | ensures that <buf> is allocated or + | ret: buffer * | allocates a buffer and assigns it to + | | *buf. If no memory is available, (1) + | | is assigned instead with a zero size. + | | The allocated buffer is returned, or + | | NULL in case no memory is available +--------------------+------------------+--------------------------------------- +__b_free | buffer *buf | releases <buf> which must be allocated + | ret: void | and marks it empty +--------------------+------------------+--------------------------------------- +b_free | buffer *buf | releases <buf> only if it is allocated + | ret: void | and marks it empty +--------------------+------------------+--------------------------------------- +offer_buffers() | void *from | offer a buffer currently belonging to + | uint threshold | target <from> to whoever needs + | ret: void | one. Any pointer is valid for <from>, + | | including NULL. Its purpose is to + | | avoid passing a buffer to oneself in + | | case of failed allocations (e.g. need + | | two buffers, get one, fail, release it + | | and wake up self again). In case of + | | normal buffer release where it is + | | expected that the caller is not + | | waiting for a buffer, NULL is fine +====================+==================+======================================= + + +5. Porting code from older versions + +The previous buffer API introduced in 1.5-dev9 (May 2012) used to look like the +following (with the struct renamed to old_buffer here to avoid confusion during +quick lookups at the doc). It's worth noting that the "data" field used to be +part of the struct but with a different type and meaning. It's important to be +careful about potential code making use of &b->data as it will silently compile +but fail. + + Previous buffer declaration : + + struct old_buffer { + char *p; /* buffer's start pointer, separates in and out data */ + unsigned int size; /* buffer size in bytes */ + unsigned int i; /* number of input bytes pending for analysis in the buffer */ + unsigned int o; /* number of out bytes the sender can consume from this buffer */ + char data[0]; /* <size> bytes */ + }; + + Previous linear buffer representation : + + data p + | | + V V + +-----------+--------------------+------------+-------------+ + | |////////////////////|////////////| | + +-----------+--------------------+------------+-------------+ + <---------------------------------------------------------> size + <------------------> <----------> + o i + +There is this correspondence between old and new fields (some will involve a +knowledge of a channel when the output byte count is required) : + + Old | New + --------+---------------------------------------------------- + p | data + head + co_data(channel) // ci_head(channel) + size | size + i | data - co_data(channel) // ci_data(channel) + o | co_data(channel) // channel->output + data | area + --------+----------------------------------------------------- + +Then some common expressions can be mapped like this : + + Old | New + -----------------------+--------------------------------------- + b->data | b_orig(b) + &b->data | b_orig(b) + bi_ptr(b) | ci_head(channel) + bi_end(b) | b_tail(b) + bo_ptr(b) | b_head(b) + bo_end(b) | co_tail(channel) + bi_putblk(b,s,l) | b_putblk(b,s,l) + bo_getblk(b,s,l,o) | b_getblk(b,s,l,o) + bo_getblk_nc(b,s,l,o) | b_getblk_nc(b,s,l,o,0,co_data(channel)) + b->i + b->o | b_data(b) + b->data + b->size | b_wrap(b) + b->i += len | b_add(b, len) + b->i -= len | b_sub(b, len) + b->i = len | b_set_data(b, co_data(channel) + len) + b->o += len | b_add(b, len); channel->output += len + b->o -= len | b_del(b, len); channel->output -= len + -----------------------+--------------------------------------- + +The buffer modification functions are less straightforward and depend a lot on +the context where they are used. It is strongly advised to figure in the list +of functions above what is available based on what is attempted to be done in +the existing code. + +Note that it is very likely that any out-of-tree code relying on buffers will +not use both ->i and ->o but instead will use exclusively ->i on the side +producing data and use exclusively ->o on the side consuming data (such as in a +mux or in an applet). In both cases, it should be assumed that the other side +is always zero and that either ->i or ->o is replaced with ->data, making the +remaining code much simpler (no more code duplication based on the data +direction). diff --git a/doc/internals/api/filters.txt b/doc/internals/api/filters.txt new file mode 100644 index 0000000..eee74cf --- /dev/null +++ b/doc/internals/api/filters.txt @@ -0,0 +1,1186 @@ + ----------------------------------------- + Filters Guide - version 2.5 + ( Last update: 2021-02-24 ) + ------------------------------------------ + Author : Christopher Faulet + Contact : christopher dot faulet at capflam dot org + + +ABSTRACT +-------- + +The filters support is a new feature of HAProxy 1.7. It is a way to extend +HAProxy without touching its core code and, in certain extent, without knowing +its internals. This feature will ease contributions, reducing impact of +changes. Another advantage will be to simplify HAProxy by replacing some parts +by filters. As we will see, and as an example, the HTTP compression is the first +feature moved in a filter. + +This document describes how to write a filter and what to keep in mind to do +so. It also talks about the known limits and the pitfalls to avoid. + +As said, filters are quite new for now. The API is not freezed and will be +updated/modified/improved/extended as needed. + + + +SUMMARY +------- + + 1. Filters introduction + 2. How to use filters + 3. How to write a new filter + 3.1. API Overview + 3.2. Defining the filter name and its configuration + 3.3. Managing the filter lifecycle + 3.3.1. Dealing with threads + 3.4. Handling the streams activity + 3.5. Analyzing the channels activity + 3.6. Filtering the data exchanged + 4. FAQ + + + +1. FILTERS INTRODUCTION +----------------------- + +First of all, to fully understand how filters work and how to create one, it is +best to know, at least from a distance, what is a proxy (frontend/backend), a +stream and a channel in HAProxy and how these entities are linked to each other. +doc/internals/entities.pdf is a good overview. + +Then, to support filters, many callbacks has been added to HAProxy at different +places, mainly around channel analyzers. Their purpose is to allow filters to +be involved in the data processing, from the stream creation/destruction to +the data forwarding. Depending of what it should do, a filter can implement all +or part of these callbacks. For now, existing callbacks are focused on +streams. But future improvements could enlarge filters scope. For instance, it +could be useful to handle events at the connection level. + +In HAProxy configuration file, a filter is declared in a proxy section, except +default. So the configuration corresponding to a filter declaration is attached +to a specific proxy, and will be shared by all its instances. it is opaque from +the HAProxy point of view, this is the filter responsibility to manage it. For +each filter declaration matches a uniq configuration. Several declarations of +the same filter in the same proxy will be handle as different filters by +HAProxy. + +A filter instance is represented by a partially opaque context (or a state) +attached to a stream and passed as arguments to callbacks. Through this context, +filter instances are stateful. Depending the filter is declared in a frontend or +a backend section, its instances will be created, respectively, when a stream is +created or when a backend is selected. Their behaviors will also be +different. Only instances of filters declared in a frontend section will be +aware of the creation and the destruction of the stream, and will take part in +the channels analyzing before the backend is defined. + +It is important to remember the configuration of a filter is shared by all its +instances, while the context of an instance is owned by a uniq stream. + +Filters are designed to be chained. It is possible to declare several filters in +the same proxy section. The declaration order is important because filters will +be called one after the other respecting this order. Frontend and backend +filters are also chained, frontend ones called first. Even if the filters +processing is serialized, each filter will bahave as it was alone (unless it was +developed to be aware of other filters). For all that, some constraints are +imposed to filters, especially when data exchanged between the client and the +server are processed. We will discuss again these constraints when we will tackle +the subject of writing a filter. + + + +2. HOW TO USE FILTERS +--------------------- + +To use a filter, the parameter 'filter' should be used, followed by the filter +name and, optionally, its configuration in the desired listen, frontend or +backend section. For instance : + + listen test + ... + filter trace name TST + ... + + +See doc/configuration.txt for a formal definition of the parameter 'filter'. +Note that additional parameters on the filter line must be parsed by the filter +itself. + +The list of available filters is reported by 'haproxy -vv' : + + $> haproxy -vv + HAProxy version 1.7-dev2-3a1d4a-33 2016/03/21 + Copyright 2000-2016 Willy Tarreau <willy@haproxy.org> + + [...] + + Available filters : + [COMP] compression + [TRACE] trace + + +Multiple filter lines can be used in a proxy section to chain filters. Filters +will be called in the declaration order. + +Some filters can support implicit declarations in certain circumstances +(without the filter line). This is not recommended for new features but are +useful for existing ones moved in a filter, for backward compatibility +reasons. Implicit declarations are supported when there is only one filter used +on a proxy. When several filters are used, explicit declarations are mandatory. +The HTTP compression filter is one of these filters. Alone, using 'compression' +keywords is enough to use it. But when at least a second filter is used, a +filter line must be added. + + # filter line is optional + listen t1 + bind *:80 + compression algo gzip + compression offload + server srv x.x.x.x:80 + + # filter line is mandatory for the compression filter + listen t2 + bind *:81 + filter trace name T2 + filter compression + compression algo gzip + compression offload + server srv x.x.x.x:80 + + + + +3. HOW TO WRITE A NEW FILTER +---------------------------- + +To write a filter, there are 2 header files to explore : + + * include/haproxy/filters-t.h : This is the main header file, containing all + important structures to use. It represents the + filter API. + + * include/haproxy/filters.h : This header file contains helper functions that + may be used. It also contains the internal API + used by HAProxy to handle filters. + +To ease the filters integration, it is better to follow some conventions : + + * Use 'flt_' prefix to name the filter (e.g flt_http_comp or flt_trace). + + * Keep everything related to the filter in a same file. + +The filter 'trace' can be used as a template to write new filter. It is a good +start to see how filters really work. + +3.1 API OVERVIEW +---------------- + +Writing a filter can be summarized to write functions and attach them to the +existing callbacks. Available callbacks are listed in the following structure : + + struct flt_ops { + /* + * Callbacks to manage the filter lifecycle + */ + int (*init) (struct proxy *p, struct flt_conf *fconf); + void (*deinit) (struct proxy *p, struct flt_conf *fconf); + int (*check) (struct proxy *p, struct flt_conf *fconf); + int (*init_per_thread) (struct proxy *p, struct flt_conf *fconf); + void (*deinit_per_thread)(struct proxy *p, struct flt_conf *fconf); + + /* + * Stream callbacks + */ + int (*attach) (struct stream *s, struct filter *f); + int (*stream_start) (struct stream *s, struct filter *f); + int (*stream_set_backend)(struct stream *s, struct filter *f, struct proxy *be); + void (*stream_stop) (struct stream *s, struct filter *f); + void (*detach) (struct stream *s, struct filter *f); + void (*check_timeouts) (struct stream *s, struct filter *f); + + /* + * Channel callbacks + */ + int (*channel_start_analyze)(struct stream *s, struct filter *f, + struct channel *chn); + int (*channel_pre_analyze) (struct stream *s, struct filter *f, + struct channel *chn, + unsigned int an_bit); + int (*channel_post_analyze) (struct stream *s, struct filter *f, + struct channel *chn, + unsigned int an_bit); + int (*channel_end_analyze) (struct stream *s, struct filter *f, + struct channel *chn); + + /* + * HTTP callbacks + */ + int (*http_headers) (struct stream *s, struct filter *f, + struct http_msg *msg); + int (*http_payload) (struct stream *s, struct filter *f, + struct http_msg *msg, unsigned int offset, + unsigned int len); + int (*http_end) (struct stream *s, struct filter *f, + struct http_msg *msg); + + void (*http_reset) (struct stream *s, struct filter *f, + struct http_msg *msg); + void (*http_reply) (struct stream *s, struct filter *f, + short status, + const struct buffer *msg); + + /* + * TCP callbacks + */ + int (*tcp_payload) (struct stream *s, struct filter *f, + struct channel *chn, unsigned int offset, + unsigned int len); + }; + + +We will explain in following parts when these callbacks are called and what they +should do. + +Filters are declared in proxy sections. So each proxy have an ordered list of +filters, possibly empty if no filter is used. When the configuration of a proxy +is parsed, each filter line represents an entry in this list. In the structure +'proxy', the filters configurations are stored in the field 'filter_configs', +each one of type 'struct flt_conf *' : + + /* + * Structure representing the filter configuration, attached to a proxy and + * accessible from a filter when instantiated in a stream + */ + struct flt_conf { + const char *id; /* The filter id */ + struct flt_ops *ops; /* The filter callbacks */ + void *conf; /* The filter configuration */ + struct list list; /* Next filter for the same proxy */ + unsigned int flags; /* FLT_CFG_FL_* */ + }; + + * 'flt_conf.id' is an identifier, defined by the filter. It can be + NULL. HAProxy does not use this field. Filters can use it in log messages or + as a uniq identifier to check multiple declarations. It is the filter + responsibility to free it, if necessary. + + * 'flt_conf.conf' is opaque. It is the internal configuration of a filter, + generally allocated and filled by its parsing function (See § 3.2). It is + the filter responsibility to free it. + + * 'flt_conf.ops' references the callbacks implemented by the filter. This + field must be set during the parsing phase (See § 3.2) and can be refine + during the initialization phase (See § 3.3). If it is dynamically allocated, + it is the filter responsibility to free it. + + * 'flt_conf.flags' is a bitfield to specify the filter capabilities. For now, + only FLT_CFG_FL_HTX may be set when a filter is able to process HTX + streams. If not set, the filter is excluded from the HTTP filtering. + + +The filter configuration is global and shared by all its instances. A filter +instance is created in the context of a stream and attached to this stream. in +the structure 'stream', the field 'strm_flt' is the state of all filter +instances attached to a stream : + + /* + * Structure representing the "global" state of filters attached to a + * stream. + */ + struct strm_flt { + struct list filters; /* List of filters attached to a stream */ + struct filter *current[2]; /* From which filter resume processing, for a specific channel. + * This is used for resumable callbacks only, + * If NULL, we start from the first filter. + * 0: request channel, 1: response channel */ + unsigned short flags; /* STRM_FL_* */ + unsigned char nb_req_data_filters; /* Number of data filters registered on the request channel */ + unsigned char nb_rsp_data_filters; /* Number of data filters registered on the response channel */ + unsigned long long offset[2]; /* gloal offset of input data already filtered for a specific channel + * 0: request channel, 1: response channel */ + }; + + +Filter instances attached to a stream are stored in the field +'strm_flt.filters', each instance is of type 'struct filter *' : + + /* + * Structure representing a filter instance attached to a stream + * + * 2D-Array fields are used to store info per channel. The first index + * stands for the request channel, and the second one for the response + * channel. Especially, <next> and <fwd> are offsets representing amount of + * data that the filter are, respectively, parsed and forwarded on a + * channel. Filters can access these values using FLT_NXT and FLT_FWD + * macros. + */ + struct filter { + struct flt_conf *config; /* the filter's configuration */ + void *ctx; /* The filter context (opaque) */ + unsigned short flags; /* FLT_FL_* */ + unsigned long long offset[2]; /* Offset of input data already filtered for a specific channel + * 0: request channel, 1: response channel */ + unsigned int pre_analyzers; /* bit field indicating analyzers to + * pre-process */ + unsigned int post_analyzers; /* bit field indicating analyzers to + * post-process */ + struct list list; /* Next filter for the same proxy/stream */ + }; + + * 'filter.config' is the filter configuration previously described. All + instances of a filter share it. + + * 'filter.ctx' is an opaque context. It is managed by the filter, so it is its + responsibility to free it. + + * 'filter.pre_analyzers and 'filter.post_analyzers will be described later + (See § 3.5). + + * 'filter.offset' will be described later (See § 3.6). + + +3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION +--------------------------------------------------- + +During the filter development, the first thing to do is to add it in the +supported filters. To do so, its name must be registered as a valid keyword on +the filter line : + + /* Declare the filter parser for "my_filter" keyword */ + static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, { + { "my_filter", parse_my_filter_cfg, NULL /* private data */ }, + { NULL, NULL, NULL }, + } + }; + INITCALL1(STG_REGISTER, flt_register_keywords, &flt_kws); + + +Then the filter internal configuration must be defined. For instance : + + struct my_filter_config { + struct proxy *proxy; + char *name; + /* ... */ + }; + + +All callbacks implemented by the filter must then be declared. Here, a global +variable is used : + + struct flt_ops my_filter_ops { + .init = my_filter_init, + .deinit = my_filter_deinit, + .check = my_filter_config_check, + + /* ... */ + }; + + +Finally, the function to parse the filter configuration must be written, here +'parse_my_filter_cfg'. This function must parse all remaining keywords on the +filter line : + + /* Return -1 on error, else 0 */ + static int + parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px, + struct flt_conf *flt_conf, char **err, void *private) + { + struct my_filter_config *my_conf; + int pos = *cur_arg; + + /* Allocate the internal configuration used by the filter */ + my_conf = calloc(1, sizeof(*my_conf)); + if (!my_conf) { + memprintf(err, "%s : out of memory", args[*cur_arg]); + return -1; + } + my_conf->proxy = px; + + /* ... */ + + /* Parse all keywords supported by the filter and fill the internal + * configuration */ + pos++; /* Skip the filter name */ + while (*args[pos]) { + if (!strcmp(args[pos], "name")) { + if (!*args[pos + 1]) { + memprintf(err, "'%s' : '%s' option without value", + args[*cur_arg], args[pos]); + goto error; + } + my_conf->name = strdup(args[pos + 1]); + if (!my_conf->name) { + memprintf(err, "%s : out of memory", args[*cur_arg]); + goto error; + } + pos += 2; + } + + /* ... parse other keywords ... */ + } + *cur_arg = pos; + + /* Set callbacks supported by the filter */ + flt_conf->ops = &my_filter_ops; + + /* Last, save the internal configuration */ + flt_conf->conf = my_conf; + return 0; + + error: + if (my_conf->name) + free(my_conf->name); + free(my_conf); + return -1; + } + + +WARNING : In this parsing function, 'flt_conf->ops' must be initialized. All + arguments of the filter line must also be parsed. This is mandatory. + +In the previous example, the filter lne should be read as follows : + + filter my_filter name MY_NAME ... + + +Optionally, by implementing the 'flt_ops.check' callback, an extra set is added +to check the internal configuration of the filter after the parsing phase, when +the HAProxy configuration is fully defined. For instance : + + /* Check configuration of a trace filter for a specified proxy. + * Return 1 on error, else 0. */ + static int + my_filter_config_check(struct proxy *px, struct flt_conf *my_conf) + { + if (px->mode != PR_MODE_HTTP) { + Alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n"); + return 1; + } + + /* ... */ + + return 0; + } + + + +3.3. MANAGING THE FILTER LIFECYCLE +---------------------------------- + +Once the configuration parsed and checked, filters are ready to by used. There +are two main callbacks to manage the filter lifecycle : + + * 'flt_ops.init' : It initializes the filter for a proxy. This callback may be + defined to finish the filter configuration. + + * 'flt_ops.deinit' : It cleans up what the parsing function and the init + callback have done. This callback is useful to release + memory allocated for the filter configuration. + +Here is an example : + + /* Initialize the filter. Returns -1 on error, else 0. */ + static int + my_filter_init(struct proxy *px, struct flt_conf *fconf) + { + struct my_filter_config *my_conf = fconf->conf; + + /* ... */ + + return 0; + } + + /* Free resources allocated by the trace filter. */ + static void + my_filter_deinit(struct proxy *px, struct flt_conf *fconf) + { + struct my_filter_config *my_conf = fconf->conf; + + if (my_conf) { + free(my_conf->name); + /* ... */ + free(my_conf); + } + fconf->conf = NULL; + } + + +3.3.1 DEALING WITH THREADS +-------------------------- + +When HAProxy is compiled with the threads support and started with more that one +thread (global.nbthread > 1), then it is possible to manage the filter per +thread with following callbacks : + + * 'flt_ops.init_per_thread': It initializes the filter for each thread. It + works the same way than 'flt_ops.init' but in the + context of a thread. This callback is called + after the thread creation. + + * 'flt_ops.deinit_per_thread': It cleans up what the init_per_thread callback + have done. It is called in the context of a + thread, before exiting it. + +It is the filter responsibility to deal with concurrency. check, init and deinit +callbacks are called on the main thread. All others are called on a "worker" +thread (not always the same). It is also the filter responsibility to know if +HAProxy is started with more than one thread. If it is started with one thread +(or compiled without the threads support), these callbacks will be silently +ignored (in this case, global.nbthread will be always equal to one). + + +3.4. HANDLING THE STREAMS ACTIVITY +----------------------------------- + +It may be interesting to handle streams activity. For now, there is three +callbacks that should define to do so : + + * 'flt_ops.stream_start' : It is called when a stream is started. This + callback can fail by returning a negative value. It + will be considered as a critical error by HAProxy + which disabled the listener for a short time. + + * 'flt_ops.stream_set_backend' : It is called when a backend is set for a + stream. This callbacks will be called for all + filters attached to a stream (frontend and + backend). Note this callback is not called if + the frontend and the backend are the same. + + * 'flt_ops.stream_stop' : It is called when a stream is stopped. This callback + always succeed. Anyway, it is too late to return an + error. + +For instance : + + /* Called when a stream is created. Returns -1 on error, else 0. */ + static int + my_filter_stream_start(struct stream *s, struct filter *filter) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + /* ... */ + + return 0; + } + + /* Called when a backend is set for a stream */ + static int + my_filter_stream_set_backend(struct stream *s, struct filter *filter, + struct proxy *be) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + /* ... */ + + return 0; + } + + /* Called when a stream is destroyed */ + static void + my_filter_stream_stop(struct stream *s, struct filter *filter) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + /* ... */ + } + + +WARNING : Handling the streams creation and destruction is only possible for + filters defined on proxies with the frontend capability. + +In addition, it is possible to handle creation and destruction of filter +instances using following callbacks: + + * 'flt_ops.attach' : It is called after a filter instance creation, when it is + attached to a stream. This happens when the stream is + started for filters defined on the stream's frontend and + when the backend is set for filters declared on the + stream's backend. It is possible to ignore the filter, if + needed, by returning 0. This could be useful to have + conditional filtering. + + * 'flt_ops.detach' : It is called when a filter instance is detached from a + stream, before its destruction. This happens when the + stream is stopped for filters defined on the stream's + frontend and when the analyze ends for filters defined on + the stream's backend. + +For instance : + + /* Called when a filter instance is created and attach to a stream */ + static int + my_filter_attach(struct stream *s, struct filter *filter) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + if (/* ... */) + return 0; /* Ignore the filter here */ + return 1; + } + + /* Called when a filter instance is detach from a stream, just before its + * destruction */ + static void + my_filter_detach(struct stream *s, struct filter *filter) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + /* ... */ + } + +Finally, it may be interesting to notify the filter when the stream is woken up +because of an expired timer. This could let a chance to check some internal +timeouts, if any. To do so the following callback must be used : + + * 'flt_opt.check_timeouts' : It is called when a stream is woken up because of + an expired timer. + +For instance : + + /* Called when a stream is woken up because of an expired timer */ + static void + my_filter_check_timeouts(struct stream *s, struct filter *filter) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + /* ... */ + } + + +3.5. ANALYZING THE CHANNELS ACTIVITY +------------------------------------ + +The main purpose of filters is to take part in the channels analyzing. To do so, +there is 2 callbacks, 'flt_ops.channel_pre_analyze' and +'flt_ops.channel_post_analyze', called respectively before and after each +analyzer attached to a channel, except analyzers responsible for the data +forwarding (TCP or HTTP). Concretely, on the request channel, these callbacks +could be called before following analyzers : + + * tcp_inspect_request (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE) + * http_wait_for_request (AN_REQ_WAIT_HTTP) + * http_wait_for_request_body (AN_REQ_HTTP_BODY) + * http_process_req_common (AN_REQ_HTTP_PROCESS_FE) + * process_switching_rules (AN_REQ_SWITCHING_RULES) + * http_process_req_ common (AN_REQ_HTTP_PROCESS_BE) + * http_process_tarpit (AN_REQ_HTTP_TARPIT) + * process_server_rules (AN_REQ_SRV_RULES) + * http_process_request (AN_REQ_HTTP_INNER) + * tcp_persist_rdp_cookie (AN_REQ_PRST_RDP_COOKIE) + * process_sticking_rules (AN_REQ_STICKING_RULES) + +And on the response channel : + + * tcp_inspect_response (AN_RES_INSPECT) + * http_wait_for_response (AN_RES_WAIT_HTTP) + * process_store_rules (AN_RES_STORE_RULES) + * http_process_res_common (AN_RES_HTTP_PROCESS_BE) + +Unlike the other callbacks previously seen before, 'flt_ops.channel_pre_analyze' +can interrupt the stream processing. So a filter can decide to not execute the +analyzer that follows and wait the next iteration. If there are more than one +filter, following ones are skipped. On the next iteration, the filtering resumes +where it was stopped, i.e. on the filter that has previously stopped the +processing. So it is possible for a filter to stop the stream processing on a +specific analyzer for a while before continuing. Moreover, this callback can be +called many times for the same analyzer, until it finishes its processing. For +instance : + + /* Called before a processing happens on a given channel. + * Returns a negative value if an error occurs, 0 if it needs to wait, + * any other value otherwise. */ + static int + my_filter_chn_pre_analyze(struct stream *s, struct filter *filter, + struct channel *chn, unsigned an_bit) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + switch (an_bit) { + case AN_REQ_WAIT_HTTP: + if (/* wait that a condition is verified before continuing */) + return 0; + break; + /* ... * / + } + return 1; + } + + * 'an_bit' is the analyzer id. All analyzers are listed in + 'include/haproxy/channels-t.h'. + + * 'chn' is the channel on which the analyzing is done. It is possible to + determine if it is the request or the response channel by testing if + CF_ISRESP flag is set : + + │ ((chn->flags & CF_ISRESP) == CF_ISRESP) + + +In previous example, the stream processing is blocked before receipt of the HTTP +request until a condition is verified. + +'flt_ops.channel_post_analyze', for its part, is not resumable. It returns a +negative value if an error occurs, any other value otherwise. It is called when +a filterable analyzer finishes its processing, so once for the same analyzer. +For instance : + + /* Called after a processing happens on a given channel. + * Returns a negative value if an error occurs, any other + * value otherwise. */ + static int + my_filter_chn_post_analyze(struct stream *s, struct filter *filter, + struct channel *chn, unsigned an_bit) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + struct http_msg *msg; + + switch (an_bit) { + case AN_REQ_WAIT_HTTP: + if (/* A test on received headers before any other treatment */) { + msg = ((chn->flags & CF_ISRESP) ? &s->txn->rsp : &s->txn->req); + txn->status = 400; + msg->msg_state = HTTP_MSG_ERROR; + http_reply_and_close(s, s->txn->status, http_error_message(s)); + return -1; /* This is an error ! */ + } + break; + /* ... * / + } + return 1; + } + + +Pre and post analyzer callbacks of a filter are not automatically called. They +must be regiesterd explicitly on analyzers, updating the value of +'filter.pre_analyzers' and 'filter.post_analyzers' bit fields. All analyzer bits +are listed in 'include/types/channels.h'. Here is an example : + + static int + my_filter_stream_start(struct stream *s, struct filter *filter) + { + /* ... * / + + /* Register the pre analyzer callback on all request and response + * analyzers */ + filter->pre_analyzers |= (AN_REQ_ALL | AN_RES_ALL) + + /* Register the post analyzer callback of only on AN_REQ_WAIT_HTTP and + * AN_RES_WAIT_HTTP analyzers */ + filter->post_analyzers |= (AN_REQ_WAIT_HTTP | AN_RES_WAIT_HTTP) + + /* ... * / + return 0; + } + + +To surround activity of a filter during the channel analyzing, two new analyzers +has been added : + + * 'flt_start_analyze' (AN_REQ/RES_FLT_START_FE/AN_REQ_RES_FLT_START_BE) : For + a specific filter, this analyzer is called before any call to the + 'channel_analyze' callback. From the filter point of view, it calls the + 'flt_ops.channel_start_analyze' callback. + + * 'flt_end_analyze' (AN_REQ/RES_FLT_END) : For a specific filter, this + analyzer is called when all other analyzers have finished their + processing. From the filter point of view, it calls the + 'flt_ops.channel_end_analyze' callback. + +These analyzers are called only once per streams. + +'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can +interrupt the stream processing, as 'flt_ops.channel_analyze'. Here is an +example : + + /* Called when analyze starts for a given channel + * Returns a negative value if an error occurs, 0 if it needs to wait, + * any other value otherwise. */ + static int + my_filter_chn_start_analyze(struct stream *s, struct filter *filter, + struct channel *chn) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + /* ... TODO ... */ + + return 1; + } + + /* Called when analyze ends for a given channel + * Returns a negative value if an error occurs, 0 if it needs to wait, + * any other value otherwise. */ + static int + my_filter_chn_end_analyze(struct stream *s, struct filter *filter, + struct channel *chn) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + /* ... TODO ... */ + + return 1; + } + + +Workflow on channels can be summarized as following : + + FE: Called for filters defined on the stream's frontend + BE: Called for filters defined on the stream's backend + + +------->---------+ + | | | + +----------------------+ | +----------------------+ + | flt_ops.attach (FE) | | | flt_ops.attach (BE) | + +----------------------+ | +----------------------+ + | | | + V | V + +--------------------------+ | +------------------------------------+ + | flt_ops.stream_start (FE)| | | flt_ops.stream_set_backend (FE+BE) | + +--------------------------+ | +------------------------------------+ + | | | + ... | ... + | | | + | ^ | + | --+ | | --+ + +------<----------+ | | +--------<--------+ | + | | | | | | | + V | | | V | | ++-------------------------------+ | | | +-------------------------------+ | | +| flt_start_analyze (FE) +-+ | | | flt_start_analyze (BE) +-+ | +|(flt_ops.channel_start_analyze)| | F | |(flt_ops.channel_start_analyze)| | ++---------------+---------------+ | R | +-------------------------------+ | + | | O | | | + +------<---------+ | N ^ +--------<-------+ | B + | | | T | | | | A ++---------------|------------+ | | E | +---------------|------------+ | | C +|+--------------V-------------+ | | N | |+--------------V-------------+ | | K +||+----------------------------+ | | D | ||+----------------------------+ | | E +|||flt_ops.channel_pre_analyze | | | | |||flt_ops.channel_pre_analyze | | | N +||| V | | | | ||| V | | | D +||| analyzer (FE) +-+ | | ||| analyzer (FE+BE) +-+ | ++|| V | | | +|| V | | + +|flt_ops.channel_post_analyze| | | +|flt_ops.channel_post_analyze| | + +----------------------------+ | | +----------------------------+ | + | --+ | | | + +------------>------------+ ... | + | | + [ data filtering (see below) ] | + | | + ... | + | | + +--------<--------+ | + | | | + V | | + +-------------------------------+ | | + | flt_end_analyze (FE+BE) +-+ | + | (flt_ops.channel_end_analyze) | | + +---------------+---------------+ | + | --+ + V + +----------------------+ + | flt_ops.detach (BE) | + +----------------------+ + | + V + +--------------------------+ + | flt_ops.stream_stop (FE) | + +--------------------------+ + | + V + +----------------------+ + | flt_ops.detach (FE) | + +----------------------+ + | + V + +By zooming on an analyzer box we have: + + ... + | + V + | + +-----------<-----------+ + | | + +-----------------+--------------------+ | + | | | | + | +--------<---------+ | | + | | | | | + | V | | | + | flt_ops.channel_pre_analyze ->-+ | ^ + | | | | + | | | | + | V | | + | analyzer --------->-----+--+ + | | | + | | | + | V | + | flt_ops.channel_post_analyze | + | | | + | | | + +-----------------+--------------------+ + | + V + ... + + + 3.6. FILTERING THE DATA EXCHANGED +----------------------------------- + +WARNING : To fully understand this part, it is important to be aware on how the + buffers work in HAProxy. For the HTTP part, it is also important to + understand how data are parsed and structured, and how the internal + representation, called HTX, works. See doc/internals/buffer-api.txt + and doc/internals/htx-api.txt for details. + +An extended feature of the filters is the data filtering. By default a filter +does not look into data exchanged between the client and the server because it +is expensive. Indeed, instead of forwarding data without any processing, each +byte need to be buffered. + +So, to enable the data filtering on a channel, at any time, in one of previous +callbacks, 'register_data_filter' function must be called. And conversely, to +disable it, 'unregister_data_filter' function must be called. For instance : + + my_filter_http_headers(struct stream *s, struct filter *filter, + struct http_msg *msg) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + + /* 'chn' must be the request channel */ + if (!(msg->chn->flags & CF_ISRESP)) { + struct htx *htx; + struct ist hdr; + struct http_hdr_ctx ctx; + + htx = htxbuf(msg->chn->buf); + + /* Enable the data filtering for the request if 'X-Filter' header + * is set to 'true'. */ + hdr = ist("X-Filter); + ctx.blk = NULL; + if (http_find_header(htx, hdr, &ctx, 0) && + ctx.value.len >= 4 && memcmp(ctx.value.ptr, "true", 4) == 0) + register_data_filter(s, chn, filter); + } + + return 1; + } + +Here, the data filtering is enabled if the HTTP header 'X-Filter' is found and +set to 'true'. + +If several filters are declared, the evaluation order remains the same, +regardless the order of the registrations to the data filtering. Data +registrations must be performed before the data forwarding step. However, a +filter may be unregistered from the data filtering at any time. + +Depending on the stream type, TCP or HTTP, the way to handle data filtering is +different. HTTP data are structured while TCP data are raw. And there are more +callbacks for HTTP streams to fully handle all steps of an HTTP transaction. But +the main part is the same. The data filtering is performed in one callback, +called in loop on input data starting at a specific offset for a given +length. Data analyzed by a filter are considered as forwarded from its point of +view. Because filters are chained, a filter never analyzes more data than its +predecessors. Thus only data analyzed by the last filter are effectively +forwarded. This means, at any time, any filter may choose to not analyze all +available data (available from its point of view), blocking the data forwarding. + +Internally, filters own 2 offsets representing the number of bytes already +analyzed in the available input data, one per channel. There is also an offset +couple at the stream level, in the strm_flt object, representing the total +number of bytes already forwarded. These offsets may be retrieved and updated +using following macros : + + * FLT_OFF(flt, chn) + + * FLT_STRM_OFF(s, chn) + +where 'flt' is the 'struct filter' passed as argument in all callbacks, 's' the +filtered stream and 'chn' is the considered channel. However, there is no reason +for a filter to use these macros or take care of these offsets. + + +3.6.1 FILTERING DATA ON TCP STREAMS +----------------------------------- + +The TCP data filtering for TCP streams is the easy case, because HAProxy do not +parse these data. Data are stored in raw in the buffer. So there is only one +callback to consider: + + * 'flt_ops.tcp_payload : This callback is called when input data are + available. If not defined, all available data will be considered as analyzed + and forwarded from the filter point of view. + +This callback is called only if the filter is registered to analyze TCP +data. Here is an example : + + /* Returns a negative value if an error occurs, else the number of + * consumed bytes. */ + static int + my_filter_tcp_payload(struct stream *s, struct filter *filter, + struct channel *chn, unsigned int offset, + unsigned int len) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + int ret = len; + + /* Do not parse more than 'my_conf->max_parse' bytes at a time */ + if (my_conf->max_parse != 0 && ret > my_conf->max_parse) + ret = my_conf->max_parse; + + /* if available data are not completely parsed, wake up the stream to + * be sure to not freeze it. The best is probably to set a + * chn->analyse_exp timer */ + if (ret != len) + task_wakeup(s->task, TASK_WOKEN_MSG); + return ret; + } + +But it is important to note that tunnelled data of an HTTP stream may also be +filtered via this callback. Tunnelled data are data exchange after an HTTP tunnel +is established between the client and the server, via an HTTP CONNECT or via a +protocol upgrade. In this case, the data are structured. Of course, to do so, +the filter must be able to parse HTX data and must have the FLT_CFG_FL_HTX flag +set. At any time, the IS_HTX_STRM() macros may be used on the stream to know if +it is an HTX stream or a TCP stream. + + +3.6.2 FILTERING DATA ON HTTP STREAMS +------------------------------------ + +The HTTP data filtering is a bit more complex because HAProxy data are +structutred and represented to an internal format, called HTX. So basically +there is the HTTP counterpart to the previous callback : + + * 'flt_ops.http_payload' : This callback is called when input data are + available. If not defined, all available data will be considered as analyzed + and forwarded for the filter. + +But the prototype for this callbacks is slightly different. Instead of having +the channel as parameter, we have the HTTP message (struct http_msg). This +callback is called only if the filter is registered to analyze TCP data. Here is +an example : + + /* Returns a negative value if an error occurs, else the number of + * consumed bytes. */ + static int + my_filter_http_payload(struct stream *s, struct filter *filter, + struct http_msg *msg, unsigned int offset, + unsigned int len) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + struct htx *htx = htxbuf(&msg->chn->buf); + struct htx_ret htxret = htx_find_offset(htx, offset); + struct htx_blk *blk; + + blk = htxret.blk; + offset = htxret.ret; + for (; blk; blk = htx_get_next_blk(blk, htx)) { + enum htx_blk_type type = htx_get_blk_type(blk); + + if (type == HTX_BLK_UNUSED) + continue; + else if (type == HTX_BLK_DATA) { + /* filter data */ + } + else + break; + } + + return len; + } + +In addition, there are two others callbacks : + + * 'flt_ops.http_headers' : This callback is called just before the HTTP body + forwarding and after any processing on the request/response HTTP + headers. When defined, this callback is always called for HTTP streams + (i.e. without needs of a registration on data filtering). + Here is an example : + + + /* Returns a negative value if an error occurs, 0 if it needs to wait, + * any other value otherwise. */ + static int + my_filter_http_headers(struct stream *s, struct filter *filter, + struct http_msg *msg) + { + struct my_filter_config *my_conf = FLT_CONF(filter); + struct htx *htx = htxbuf(&msg->chn->buf); + struct htx_sl *sl = http_get_stline(htx); + int32_t pos; + + for (pos = htx_get_first(htx); pos != -1; pos = htx_get_next(htx, pos)) { + struct htx_blk *blk = htx_get_blk(htx, pos); + enum htx_blk_type type = htx_get_blk_type(blk); + struct ist n, v; + + if (type == HTX_BLK_EOH) + break; + if (type != HTX_BLK_HDR) + continue; + + n = htx_get_blk_name(htx, blk); + v = htx_get_blk_value(htx, blk); + /* Do something on the header name/value */ + } + + return 1; + } + + * 'flt_ops.http_end' : This callback is called when the whole HTTP message was + processed. It may interrupt the stream processing. So, it could be used to + synchronize the HTTP request with the HTTP response, for instance : + + /* Returns a negative value if an error occurs, 0 if it needs to wait, + * any other value otherwise. */ + static int + my_filter_http_end(struct stream *s, struct filter *filter, + struct http_msg *msg) + { + struct my_filter_ctx *my_ctx = filter->ctx; + + + if (!(msg->chn->flags & CF_ISRESP)) /* The request */ + my_ctx->end_of_req = 1; + else /* The response */ + my_ctx->end_of_rsp = 1; + + /* Both the request and the response are finished */ + if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1) + return 1; + + /* Wait */ + return 0; + } + +Then, to finish, there are 2 informational callbacks : + + * 'flt_ops.http_reset' : This callback is called when an HTTP message is + reset. This happens either when a 1xx informational response is received, or + if we're retrying to send the request to the server after it failed. It + could be useful to reset the filter context before receiving the true + response. + By checking s->txn->status, it is possible to know why this callback is + called. If it's a 1xx, we're called because of an informational + message. Otherwise, it is a L7 retry. + + * 'flt_ops.http_reply' : This callback is called when, at any time, HAProxy + decides to stop the processing on a HTTP message and to send an internal + response to the client. This mainly happens when an error or a redirect + occurs. + + +3.6.3 REWRITING DATA +-------------------- + +The last part, and the trickiest one about the data filtering, is about the data +rewriting. For now, the filter API does not offer a lot of functions to handle +it. There are only functions to notify HAProxy that the data size has changed to +let it update internal state of filters. This is the developer responsibility to +update data itself, i.e. the buffer offsets, using following function : + + * 'flt_update_offsets()' : This function must be called when a filter alter + incoming data. It updates offsets of the stream and of all filters + preceding the calling one. Do not call this function when a filter change + the size of incoming data leads to an undefined behavior. + +A good example of filter changing the data size is the HTTP compression filter. diff --git a/doc/internals/api/htx-api.txt b/doc/internals/api/htx-api.txt new file mode 100644 index 0000000..971328b --- /dev/null +++ b/doc/internals/api/htx-api.txt @@ -0,0 +1,570 @@ + ----------------------------------------------- + HTX API + Version 1.1 + ( Last update: 2021-02-24 ) + ----------------------------------------------- + Author : Christopher Faulet + Contact : cfaulet at haproxy dot com + +1. Background + +Historically, HAProxy stored HTTP messages in a raw fashion in buffers, keeping +parsing information separately in a "struct http_msg" owned by the stream. It was +optimized to the data transfer, but not so much for rewrites. It was also HTTP/1 +centered. While it was the only HTTP version supported, it was not a +problem. But with the rise of HTTP/2, it starts to be hard to still use this +representation. + +At the first age of the HTTP/2 in HAProxy, H2 messages were converted into +H1. This was terribly unefficient because it required two parsing passes, a +first one in H2 and a second one in H1, with a conversion in the middle. And of +course, the same was also true in the opposite direction. outgoing H1 messages +had to be converted back in H2 to be sent. Even worse, because the H2->H1 +conversion, only client H2 connections were supported. + +So, to address all these problems, we decided to replace the old raw +representation by a version-agnostic and self-structured internal HTTP +representation, the HTX. As an additional benefit, with this new representation, +the message parsing and its processing are now separated, making all the HTTP +analysis simpler and cleaner. The parsing of HTTP messages is now handled by +the multiplexers (h1 or h2). + + +2. The HTX message + +The HTX is a structure containing useful information about an HTTP message +followed by a contiguous array with some parts of the message. These parts are +called blocks. A block is composed of metadata (htx_blk) and an associated +payload. Blocks' metadata are stored starting from the end of the array while +their payload are stored at the beginning. Blocks' metadata are often simply +called blocks. it is a misuse of language that's simplify explanations. + +Internally, this structure is "hidden" in a buffer. This way, there are few +changes into intermediate layers (stream-interface and channels). They still +manipulate buffers. Only the multiplexer and the stream have to know how data +are really stored. From the HTX perspective, a buffer is just a memory +area. When an HTX message is stored in a buffer, this one appears as full. + + * General view of an HTX message : + + + buffer->area + | + |<------------ buffer->size == buffer->data ----------------------| + | | + | |<------------- Blocks array (htx->size) ------------------>| + V | | + +-----+-----------------+-------------------------+---------------+ + | HTX | PAYLOADS ==> | | <== HTX_BLKs | + +-----+-----------------+-------------------------+---------------+ + | | | | + |<-payloads part->|<----- free space ------>|<-blocks part->| + (htx->data) + + +The blocks part remains linear and sorted. It may be see as an array with +negative indexes. But, instead of using negative indexes, we use positive +positions to identify a block. This position is then converted to an address +relatively to the beginning of the blocks array. + + tail head + | | + V V + .....--+----+-----------------------+------+------+ + | Bn | ... | B1 | B0 | + .....--+----+-----------------------+------+------+ + ^ ^ ^ + Addr of the block Addr of the block Addr of the block + at the position N at the position 1 at the position 0 + + +In the HTX structure, 3 "special" positions are stored : + + - tail : Position of the newest inserted block + - head : Position of the oldest inserted block + - first : Position of the first block to (re)start the analyse + +The blocks part never wrap. If we have no space to allocate a new block and if +there is a hole at the beginning of the blocks part (so at the end of the blocks +array), we move back all blocks. + + + tail head tail head + | | | | + V V V V + ...+--------------+---------+ blocks ...----------+--------------+ + | X== HTX_BLKS | | defrag | <== HTX_BLKS | + ...+--------------+---------+ =====> ...----------+--------------+ + + +The payloads part is a raw space that may wrap. A block's payload must never be +accessed directly. Instead a block must be selected to retrieve the address of +its payload. + + + +------------------------( B0.addr )--------------------------+ + | +-------------------( B1.addr )----------------------+ | + | | +-----------( B2.addr )----------------+ | | + V V V | | | + +-----+----+-------+----+--------+-------------+-------+----+----+----+ + | HTX | P0 | P1 | P2 | ...==> | | <=... | B2 | B1 | B0 | + +-----+----+-------+----+--------+-------------+-------+----+----+----+ + + +Because the payloads part may wrap, there are 2 usable free spaces : + + - The free space in front of the blocks part. This one is used if and only if + the other one was not used yet. + + - The free space at the beginning of the message. Once this one is used, the + other one is never used again, until a message defragmentation. + + + * Linear payloads part : + + + head_addr end_addr tail_addr + | | | + V V V + +-----+--------------------+-------------+--------------------+-------... + | HTX | | PAYLOADS | | HTX_BLKs + +-----+--------------------+-------------+--------------------+-------... + |<-- free space 2 -->| |<-- free space 1 -->| + (used if the other is too small) (used in priority) + + + * Wrapping payloads part : + + + head_addr end_addr tail_addr + | | | + V V V + +-----+----+----------------+--------+----------------+-------+-------... + | HTX | | PAYLOADS part2 | | PAYLOADS part1 | | HTX_BLKs + +-----+----+----------------+--------+----------------+-------+-------... + |<-->| |<------>| |<----->| + unusable free space unusable + free space free space + + +Finally, when the usable free space is not enough to store a new block, unusable +parts may be get back with a full defragmentation. The payloads part is then +realigned at the beginning of the blocks array and the free space becomes +continuous again. + + +3. The HTX blocks + +An HTX block can be as well a start-line as a header, a body part or a +trailer. For all these types of block, a payload is attached to the block. It +can also be a marker, the end-of-headers or end-of-trailers. For these blocks, +there is no payload but it counts for a byte. It is important to not skip it +when data are forwarded. + +As already said, a block is composed of metadata and a payload. Metadata are +stored in the blocks part and are composed of 2 fields : + + - info : It a 32 bits field containing the block's type on 4 bits followed + by the payload length. See below for details. + + - addr : The payload's address, if any, relatively to the beginning the + array used to store part of the HTTP message itself. + + + * Block's info representation : + + 0b 0000 0000 0000 0000 0000 0000 0000 0000 + ---- ------------------------ --------- + type value (1 MB max) name length (header/trailer - 256B max) + ---------------------------------- + data length (256 MB max) + (body, method, path, version, status, reason) + + +Supported types are : + + - 0000 (0) : The request start-line + - 0001 (1) : The response start-line + - 0010 (2) : A header block + - 0011 (3) : The end-of-headers marker + - 0100 (4) : A data block + - 0101 (5) : A trailer block + - 0110 (6) : The end-of-trailers marker + - 1111 (15) : An unused block + +Other types are unused for now and reserved for futur extensions. + +An HTX message is typically composed of following blocks, in this order : + + - a start-line + - zero or more header blocks + - an end-of-headers marker + - zero or more data blocks + - zero or more trailer blocks (optional) + - an end-of-trailers marker (optional but always set if there is at least + one trailer block) + +Only one HTTP request at a time can be stored in an HTX message. For HTTP +response, it is more complicated. Only one "final" response can be stored in an +HTX message. It is a response with status-code 101 or greater or equal to +200. But it may be preceded by several 1xx informational responses. Such +responses are part of the same HTX message. + +When the end of the message is reached a special flag is set on the message +(HTX_FL_EOM). It means no more data are expected for this message, except +tunneled data. But tunneled data will never be mixed with message data to avoid +ambiguities. Thus once the flag marking the end of the message is set, it is +easy to know the message ends. The end is reached if the HTX message is empty or +on the tail HTX block in the HTX message. Once all blocks of the HTX message are +consumed, tunneled data, if any, may be transferred. + + +3.1. The start-line + +Every HTX message starts with a start-line. Its payload is a "struct htx_sl". In +addition to the parts of the HTTP start-line, this structure contains some +information about the represented HTTP message, mainly in the form of flags +(HTX_SL_F_*). For instance, if an HTTP message contains the header +"conten-length", then the flag HTX_SL_F_CLEN is set. + +Each HTTP message has its own start-line. So an HTX request has one and only one +start-line because it must contain only one HTTP request at a time. But an HTX +response may have more than one start-line if the final HTTP response is +precedeed by some 1xx informational responses. + +In HTTP/2, there is no start-line. So the H2 multiplexer must create one when it +converts an H2 message to HTX : + + - For the request, it uses the pseudo headers ":method", ":path" or + ":authority" depending on the method and the hardcoded version "HTTP/2.0". + + - For the response, it used the hardcoded version "HTTP/2.0", the + pseudo-header ":status" and an empty reason. + + +3.2. The headers and trailers + +HTX Headers and trailers are quite similar. Different types are used to simplify +headers processing. But from the HTX point of view, there is no real difference, +except their position in the HTX message. The header blocks always follow an HTX +start-line while trailer blocks come after the data. If there is no data, they +follow the end-of-headers marker. + +Headers and trailers are the only blocks containing a Key/Value payload. The +corresponding end-of marker must always be placed after each group to mark, as +it name suggests, the end. + +In HTTP/1, trailers are only present on chunked messages. But chunked messages +do not always have trailers. In this case, the end-of-trailers block may or may +not be present. Multiplexers must be able to handle both situations. In HTTP/2, +trailers are only present if a HEADERS frame is sent after DATA frames. + + +3.3. The data + +The payload body of an HTTP message is stored as DATA blocks in the HTX +message. For HTTP/1 messages, it is the message body without the chunks +formatting, if any. For HTTP/2, it is the payload of DATA frames. + +The DATA blocks are the only HTX blocks that may be partially processed (copied +or removed). All other types of block must be entierly processed. This means +DATA blocks can be resized. + + +3.4. The end-of markers + +These blocks are used to delimit parts of an HTX message. It exists two +markers : + + - end-of-headers (EOH) + - end-of-trailers (EOT) + +EOH is always present in an HTX message. EOT is optional. + + +4. The HTX API + + +4.1. Get/set HTX message from/to the underlying buffer + +The first thing to do to process an HTX message is to get it from the underlying +buffer. There are 2 functions to do so, the second one relying on the first : + + - htxbuf() returns an HTX message from a buffer. It does not modify the + buffer. It only initialize the HTX message if the buffer is empty. + + - htx_from_buf() uses htxbuf(). But it also updates the underlying buffer so + that it appears as full. + +Both functions return a "zero-sized" HTX message if the buffer is null. This +way, the HTX message is always valid. The first function is the default function +to use. The second one is only useful when some content will be added. For +instance, it used by the HTX analyzers when HAProxy generates a response. Thus, +the buffer is in a right state. + +Once the processing done, if the HTX message has been modified, the underlying +buffer must be also updated, except htx_from_buf() was used _AND_ data was only +added. For all other cases, the function htx_to_buf() must be called. + +Finally, the function htx_reset() may be called at any time to reset an HTX +message. And the function buf_room_for_htx_data() may be called to know if a raw +buffer is full from the HTX perspective. It is used during conversion from/to +the HTX. + + +4.2. Helpers to deal with free space in an HTX message + +Once with an HTX message, following functions may help to process it : + + - htx_used_space() and htx_meta_space() return, respectively, the total + space used in an HTX message and the space used by block's metadata only. + + - htx_free_space() and htx_free_data_space() return, respectively, the total + free space in an HTX message and the free space available for the payload + if a new HTX block is stored (so it is the total free space minus the size + of an HTX block). + + - htx_is_empty() and htx_is_not_empty() are boolean functions to know if an + HTX message is empty or not. + + - htx_get_max_blksz() returns the maximum size available for the payload, + not exceeding a maximum, metadata included. + + - htx_almost_full() should be used to know if an HTX message uses at least + 3/4 of its capacity. + + +4.3. HTX Blocks manipulations + +Once the available sapce in an HTX message is known, the next step is to add HTX +blocks. First of all the function htx_nbblks() returns the number of blocks +allocated in an HTX message. Then, there is an add function per block's type : + + - htx_add_stline() adds a start-line. The type (request or response) and the + flags of the start-line must be provided, as well as its three parts + (method,uri,version or version,status-code,reason). + + - htx_add_header() and htx_add_trailers() are similar. The name and the + value must be provided. The inserted HTX block is returned on success or + NULL if an error occurred. + + - htx_add_endof() must be used to add any end-of marker. The block's type + (EOH or EOT) must be specified. The inserted HTX block is returned on + success or NULL if an error occurred. + + - htx_add_all_headers() and htx_add_all_trailers() add, respectively, a list + of headers and a list of trailers, followed by the appropriate end-of + marker. On success, this marker is returned. Otherwise, NULL is + returned. Note there is no rollback on the HTX message when an error + occurred. Some headers or trailers may have been added. So it is the + caller responsibility to take care of that. + + - htx_add_data() must be used to add a DATA block. Unlike previous + functions, this one returns the number of bytes copied or 0 if nothing was + copied. If possible, the data are appended to the tail block if it is a + DATA block. Only a part of the payload may be copied because this function + will try to limit the message defragmentation and the wrapping of blocks + as far as possible. + + - htx_add_data_atonce() must be used if all data must be added or nothing. + It tries to insert all the payload, this function returns the inserted + block on success. Otherwise it returns NULL. + +When an HTX block is added, it is always the last one (the tail). But, if a +block must be added at a specific place, it is not really handy. 2 functions may +help (others could be added) : + + - htx_add_last_data() adds a DATA block just after all other DATA blocks and + before any trailers and EOT marker. It relies on htx_add_data_atonce(), so + a defragmentation may be performed. + + - htx_move_blk_before() moves a specific block just after another one. Both + blocks must already be in the HTX message and the block to move must + always be placed after the "pivot". + +Once added, there are three functions to update the block's payload : + + - htx_replace_stline() updates a start-line. The HTX block must be passed as + argument. Only string parts of the start-line are updated by this + function. On success, it returns the new start-line. So it is pretty easy + to update its flags. NULL is returned if an error occurred. + + - htx_replace_header() fully replaces a header (its name and its value) by a + new one. The HTX block must be passed a argument, as well as its new name + and its new value. The new header can be smaller or larger than the old + one. This function returns the new HTX block on success, or NULL is an + error occurred. + + - htx_replace_blk_value() replaces a part of a block's payload or its + totality. It works for HEADERS, TRAILERS or DATA blocks. The HTX block + must be provided with the part to remove and the new one. The new part can + be smaller or larger than the old one. This function returns the new HTX + block on success, or NULL is an error occurred. + + - htx_change_blk_value_len() changes the size of the value. It is the caller + responsibility to change the value itself, make sure there is enough space + and update allocated value. This function updates the HTX message + accordingly. + + - htx_set_blk_value_len() changes the size of the value. It is the caller + responsibility to change the value itself, make sure there is enough space + and update allocated value. Unlike the function + htx_change_blk_value_len(), this one does not update the HTX message. So + it should be used with caution. + + - htx_cut_data_blk() removes <n> bytes from the beginning of a DATA + block. The block's start address and its length are adjusted, and the + htx's total data count is updated. This is used to mark that part of some + data were transferred from a DATA block without removing this DATA + block. No sanity check is performed, the caller is responsible for doing + this exclusively on DATA blocks, and never removing more than the block's + size. + + - htx_remove_blk() removes a block from an HTX message. It returns the + following block or NULL if it is the tail block. + +Finally, a block may be removed using the function htx_remove_blk(). This +function returns the block following the one removed or NULL if it is the tail +block. + + +4.4. The HTX start-line + +Unlike other HTX blocks, the start-line is a bit special because its payload is +a structure followed by its three parts : + + +--------+-------+-------+-------+ + | HTX_SL | PART1 | PART2 | PART3 | + +--------+-------+-------+-------+ + +Some macros and functions may help to manipulate these parts : + + - HTX_SL_P{N}_LEN() and HTX_SL_P{N}_PTR() are macros to get the length of a + part and a pointer on it. {N} should be 1, 2 or 3. + + - HTX_SL_REQ_MLEN(), HTX_SL_REQ_ULEN(), HTX_SL_REQ_VLEN(), + HTX_SL_REQ_MPTR(), HTX_SL_REQ_UPTR() and HTX_SL_REQ_VPTR() are macros to + get info about a request start-line. These macros only wrap HTX_SL_P* + ones. + + - HTX_SL_RES_VLEN(), HTX_SL_RES_CLEN(), HTX_SL_RES_RLEN(), + HTX_SL_RES_VPTR(), HTX_SL_RES_CPTR() and HTX_SL_RES_RPTR() are macros to + get info about a response start-line. These macros only wrap HTX_SL_P* + ones. + + - htx_sl_p1(), htx_sl_p2() and htx_sl_p2() are functions to get the ist + corresponding to the right part of a start-line. + + - htx_sl_req_meth(), htx_sl_req_uri() and htx_sl_req_vsn() get the ist + corresponding to the right part of a request start-line. + + - htx_sl_res_vsn(), htx_sl_res_code() and htx_sl_res_reason() get the ist + corresponding to the right part of a response start-line. + + +4.5. Iterate on the HTX message + +To iterate on an HTX message, the first thing to do is to get the HTX block to +start the loop. There are three special blocks in an HTX message that may be +good candidates to start a loop : + + - the head block. It is the oldest inserted block. Multiplexers always start + to consume an HTX message from this block. The function htx_get_head() + returns its position and htx_get_head_blk() returns the blocks itself. In + addition, the function htx_get_head_type() returns its block's type. + + - the tail block. It is the newest inserted block. The function + htx_get_tail() returns its position and htx_get_tail_blk() returns the + blocks itself. In addition, the function htx_get_tail_type() returns its + block's type. + + - the first block. It is the block where to (re)start the analyse. It is + used as start point by HTX analyzers. The function htx_get_first() returns + its position and htx_get_first_blk() returns the blocks itself. In + addition, the function htx_get_first_type() returns its block's type. + +For all these functions, if the HTX message is empty, -1 is returned for the +block's position, NULL instead of a block and HTX_BLK_UNUSED for its type. + +Then to iterate on blocks, foreword or backward : + + - htx_get_prev() and htx_get_next() return, respectively, the position of + the previous block or the next block, given a specific position. Or -1 if + an edge is reached. + + - htx_get_prev_blk() and htx_get_next_blk() return, respectively, the + previous block or the next one, given a specific block. Or NULL if an edge + is reached. + +4.6. Access block content and info + +Following functions may be used to retrieve information about a specific HTX +block : + + - htx_get_blk_pos() returns the position of a block. It must be in the HTX + message. + + - htx_get_blk_ptr() returns a pointer on the payload of a block. + + - htx_get_blk_type() returns the type of a block. + + - htx_get_blksz() returns the payload size of a block + + - htx_get_blk_name() returns the name of a block, only if it is a header or + a trailer. Otherwise, it returns an empty string. + + - htx_get_blk_value() returns the value of a block, depending on its + type. For header and trailer blocks, it is the value field. For markers + (EOH or EOT), an empty string is returned. For other blocks an ist + pointing on the block payload is returned. + + - htx_is_unique_blk() may be used to know if a block is the only one + remaining inside an HTX message, excluding unused blocks. This function is + pretty useful to determine the end of a HTX message, in conjunction with + HTX_FL_EOM flag. + +4.7. Advanced functions + +Some more advanced functions may be used to do complex processing on the HTX +message. These functions are used by HTX analyzers or by multiplexers. + + - htx_truncate() removes all blocks after the one containing a specific + offset relatively to the head block of the HTX message. If the offset is + inside a DATA block, it is truncated. For all other blocks, the removal + starts to the next block. + + - htx_drain() tries to remove a specific amount of bytes of payload. If the + tail block is a DATA block, it may be truncated if necessary. All other + block are removed at once or kept. This function returns a mixed value, + with the first block not removed, or NULL if everything was removed, and + the amount of data drained. + + - htx_xfer_blks() transfers HTX blocks from an HTX message to another, + stopping on the first block of a specified type or when a specific amount + of bytes, including meta-data, was moved. If the tail block is a DATA + block, it may be partially moved. All other block are transferred at once + or kept. This function returns a mixed value, with the last block moved, + or NULL if nothing was moved, and the amount of data transferred. When + HEADERS or TRAILERS blocks must be transferred, this function transfers + all of them. Otherwise, if it is not possible, it triggers an error. It is + the caller responsibility to transfer all headers or trailers at once. + + - htx_append_msg() append an HTX message to another one. All the message is + copied or nothing. So, if an error occurred, a rollback is performed. This + function returns 1 on success and 0 on error. + + - htx_reserve_max_data() Reserves the maximum possible size for an HTX data + block, by extending an existing one or by creating a new one. It returns a + compound result with the HTX block and the position where new data must be + inserted (0 for a new block). If an error occurs or if there is no space + left, NULL is returned instead of a pointer on an HTX block. + + - htx_find_offset() looks for the HTX block containing a specific offset, + starting at the HTX message's head. The function returns the found HTX + block and the position inside this block where the offset is. If the + offset is outside of the HTX message, NULL is returned. + + - htx_defrag() defragments an HTX message. It removes unused blocks and + unwraps the payloads part. A temporary buffer is used to do so. This + function never fails. A referenced block may be provided. If so, the + corresponding new block is returned. Otherwise, NULL is returned. diff --git a/doc/internals/api/initcalls.txt b/doc/internals/api/initcalls.txt new file mode 100644 index 0000000..30d8737 --- /dev/null +++ b/doc/internals/api/initcalls.txt @@ -0,0 +1,360 @@ +Initialization stages aka how to get your code initialized at the right moment + + +1. Background + +Originally all subsystems were initialized via a dedicated function call +from the huge main() function. Then some code started to become conditional +or a bit more modular and the #ifdef placed there became a mess, resulting +in init code being moved to function constructors in each subsystem's own +file. Then pools of various things were introduced, starting to make the +whole init sequence more complicated due to some forms of internal +dependencies. Later epoll was introduced, requiring a post-fork callback, +and finally threads arrived also requiring some post-thread init/deinit +and allocation, marking the old architecture's last breath. Finally the +whole thing resulted in lots of init code duplication and was simplified +in 1.9 with the introduction of initcalls and initialization stages. + + +2. New architecture + +The new architecture relies on two layers : + - the registration functions + - the INITCALL macros and initialization stages + +The first ones are mostly used to add a callback to a list. The second ones +are used to specify when to call a function. Both are totally independent, +however they are generally combined via another set consisting in the REGISTER +macros which make some registration functions be called at some specific points +during the init sequence. + + +3. Registration functions + +Registration functions never fail. Or more precisely, if they fail it will only +be on out-of-memory condition, and they will cause the process to immediately +exit. As such they do not return any status and the caller doesn't have to care +about their success. + +All available functions are described below in alphanumeric ordering. Please +make sure to respect this ordering when adding new ones. + +- void hap_register_build_opts(const char *str, int must_free) + + This appends the zero-terminated constant string <str> to the list of known + build options that will be reported on the output of "haproxy -vv". A line + feed character ('\n') will automatically be appended after the string when it + is displayed. The <must_free> argument must be zero, unless the string was + allocated by any malloc-compatible function such as malloc()/calloc()/ + realloc()/strdup() or memprintf(), in which case it's better to pass a + non-null value so that the string is freed upon exit. Note that despite the + function's prototype taking a "const char *", the pointer will actually be + cast and freed. The const char* is here to leave more freedom to use consts + when making such options lists. + +- void hap_register_per_thread_alloc(int (*fct)()) + + This adds a call to function <fct> to the list of functions to be called when + threads are started, at the beginning of the polling loop. This is also valid + for the main thread and will be called even if threads are disabled, so that + it is guaranteed that this function will be called in any circumstance. Each + thread will first call all these functions exactly once when it starts. Calls + are serialized by the init_mutex, so that locking is not necessary in these + functions. There is no relation between the thread numbers and the callback + ordering. The function is expected to return non-zero on success, or zero on + failure. A failure will make the process emit a succinct error message and + immediately exit. See also hap_register_per_thread_free() for functions + called after these ones. + +- void hap_register_per_thread_deinit(void (*fct)()); + + This adds a call to function <fct> to the list of functions to be called when + threads are gracefully stopped, at the end of the polling loop. This is also + valid for the main thread and will be called even if threads are disabled, so + that it is guaranteed that this function will be called in any circumstance + if the process experiences a soft stop. Each thread will call this function + exactly once when it stops. However contrary to _alloc() and _init(), the + calls are made without any protection, thus if any shared resource if touched + by the function, the function is responsible for protecting it. The reason + behind this is that such resources are very likely to be still in use in one + other thread and that most of the time the functions will in fact only touch + a refcount or deinitialize their private resources. See also + hap_register_per_thread_free() for functions called after these ones. + +- void hap_register_per_thread_free(void (*fct)()); + + This adds a call to function <fct> to the list of functions to be called when + threads are gracefully stopped, at the end of the polling loop, after all calls + to _deinit() callbacks are done for this thread. This is also valid for the + main thread and will be called even if threads are disabled, so that it is + guaranteed that this function will be called in any circumstance if the + process experiences a soft stop. Each thread will call this function exactly + once when it stops. However contrary to _alloc() and _init(), the calls are + made without any protection, thus if any shared resource if touched by the + function, the function is responsible for protecting it. The reason behind + this is that such resources are very likely to be still in use in one other + thread and that most of the time the functions will in fact only touch a + refcount or deinitialize their private resources. See also + hap_register_per_thread_deinit() for functions called before these ones. + +- void hap_register_per_thread_init(int (*fct)()) + + This adds a call to function <fct> to the list of functions to be called when + threads are started, at the beginning of the polling loop, right after the + list of _alloc() functions. This is also valid for the main thread and will + be called even if threads are disabled, so that it is guaranteed that this + function will be called in any circumstance. Each thread will call this + function exactly once when it starts, and calls are serialized by the + init_mutex which is held over all _alloc() and _init() calls, so that locking + is not necessary in these functions. In other words for all threads but the + current one, the sequence of _alloc() and _init() calls will be atomic. There + is no relation between the thread numbers and the callback ordering. The + function is expected to return non-zero on success, or zero on failure. A + failure will make the process emit a succinct error message and immediately + exit. See also hap_register_per_thread_alloc() for functions called before + these ones. + +- void hap_register_pre_check(int (*fct)()) + + This adds a call to function <fct> to the list of functions to be called at + the step just before the configuration validity checks. This is useful when you + need to create things like it would have been done during the configuration + parsing and where the initialization should continue in the configuration + check. + It could be used for example to generate a proxy with multiple servers using + the configuration parser itself. At this step the trash buffers are allocated. + Threads are not yet started so no protection is required. The function is + expected to return non-zero on success, or zero on failure. A failure will make + the process emit a succinct error message and immediately exit. + +- void hap_register_post_check(int (*fct)()) + + This adds a call to function <fct> to the list of functions to be called at + the end of the configuration validity checks, just at the point where the + program either forks or exits depending whether it's called with "-c" or not. + Such calls are suited for memory allocation or internal table pre-computation + that would preferably not be done on the fly to avoid inducing extra time to + a pure configuration check. Threads are not yet started so no protection is + required. The function is expected to return non-zero on success, or zero on + failure. A failure will make the process emit a succinct error message and + immediately exit. + +- void hap_register_post_deinit(void (*fct)()) + + This adds a call to function <fct> to the list of functions to be called when + freeing the global sections at the end of deinit(), after everything is + stopped. The process is single-threaded at this point, thus these functions + are suitable for releasing configuration elements provided that no other + _deinit() function uses them, i.e. only close/release what is strictly + private to the subsystem. Since such functions are mostly only called during + soft stops (reloads) or failed startups, they tend to experience much less + test coverage than others despite being more exposed, and as such a lot of + care must be taken to test them especially when facing partial subsystem + initializations followed by errors. + +- void hap_register_post_proxy_check(int (*fct)(struct proxy *)) + + This adds a call to function <fct> to the list of functions to be called for + each proxy, after the calls to _post_server_check(). This can allow, for + example, to pre-configure default values for an option in a frontend based on + the "bind" lines or something in a backend based on the "server" lines. It's + worth being aware that such a function must be careful not to waste too much + time in order not to significantly slow down configurations with tens of + thousands of backends. The function is expected to return non-zero on + success, or zero on failure. A failure will make the process emit a succinct + error message and immediately exit. + +- void hap_register_post_server_check(int (*fct)(struct server *)) + + This adds a call to function <fct> to the list of functions to be called for + each server, after the call to check_config_validity(). This can allow, for + example, to preset a health state on a server or to allocate a protocol- + specific memory area. It's worth being aware that such a function must be + careful not to waste too much time in order not to significantly slow down + configurations with tens of thousands of servers. The function is expected + to return non-zero on success, or zero on failure. A failure will make the + process emit a succinct error message and immediately exit. + +- void hap_register_proxy_deinit(void (*fct)(struct proxy *)) + + This adds a call to function <fct> to the list of functions to be called when + freeing the resources during deinit(). These functions will be called as part + of the proxy's resource cleanup. Note that some of the proxy's fields will + already have been freed and others not, so such a function must not use any + information from the proxy that is subject to being released. In particular, + all servers have already been deleted. Since such functions are mostly only + called during soft stops (reloads) or failed startups, they tend to + experience much less test coverage than others despite being more exposed, + and as such a lot of care must be taken to test them especially when facing + partial subsystem initializations followed by errors. It's worth mentioning + that too slow functions could have a significant impact on the configuration + check or exit time especially on large configurations. + +- void hap_register_server_deinit(void (*fct)(struct server *)) + + This adds a call to function <fct> to the list of functions to be called when + freeing the resources during deinit(). These functions will be called as part + of the server's resource cleanup. Note that some of the server's fields will + already have been freed and others not, so such a function must not use any + information from the server that is subject to being released. Since such + functions are mostly only called during soft stops (reloads) or failed + startups, they tend to experience much less test coverage than others despite + being more exposed, and as such a lot of care must be taken to test them + especially when facing partial subsystem initializations followed by errors. + It's worth mentioning that too slow functions could have a significant impact + on the configuration check or exit time especially on large configurations. + + +4. Initialization stages + +In order to offer some guarantees, the startup of the program is split into +several stages. Some callbacks can be placed into each of these stages using +an INITCALL macro, with 0 to 3 arguments, respectively called INITCALL0 to +INITCALL3. These macros must be placed anywhere at the top level of a C file, +preferably at the end so that the referenced symbols have already been met, +but it may also be fine to place them right after the callbacks themselves. + +Such callbacks are referenced into small structures containing a pointer to the +function and 3 arguments. NULL replaces unused arguments. The callbacks are +cast to (void (*)(void *, void *, void *)) and the arguments to (void *). + +The first argument to the INITCALL macro is the initialization stage. The +second one is the callback function, and others if any are the arguments. +The init stage must be among the values of the "init_stage" enum, currently, +and in this execution order: + + - STG_PREPARE : used to preset variables, pre-initialize lookup tables and + pre-initialize list heads + - STG_LOCK : used to pre-initialize locks + - STG_REGISTER : used to register static lists such as keywords + - STG_ALLOC : used to allocate the required structures + - STG_POOL : used to create pools + - STG_INIT : used to initialize subsystems + +Each stage is guaranteed that previous stages have successfully completed. This +means that an INITCALL placed at stage STG_INIT is guaranteed that all pools +were already created and will be usable. Conversely, an INITCALL placed at +stage STG_REGISTER must not rely on any field that requires preliminary +allocation nor initialization. A callback cannot rely on other callbacks of the +same stage, as the execution order within a stage is undefined and essentially +depends on the linking order. + +The STG_REGISTER level is made for run-time linking of the various modules that +compose the executable. Keywords, protocols and various other elements that are +local known to each compilation unit can will be appended into common lists at +boot time. This is why this call is placed just before STG_ALLOC. + +Example: register a very early call to init_log() with no argument, and another + call to cli_register_kw(&cli_kws) much later: + + INITCALL0(STG_PREPARE, init_log); + INITCALL1(STG_REGISTER, cli_register_kw, &cli_kws); + +Technically speaking, each call to such a macro adds a distinct local symbol +whose dynamic name involves the line number. These symbols are placed into a +separate section and the beginning and end section pointers are provided by the +linker. When too old a linker is used, a fallback is applied consisting in +placing them into a linked list which is built by a constructor function for +each initcall (this takes more room). + +Due to the symbols internally using the line number, it is very important not +to place more than one INITCALL per line in the source file. + +It is also strongly recommended that functions and referenced arguments are +static symbols local to the source file, unless they are global registration +functions like in the example above with cli_register_kw(), where only the +argument is a local keywords table. + +INITCALLs do not expect the callback function to return anything and as such +do not perform any error check. As such, they are very similar to constructors +offered by the compiler except that they are segmented in stages. It is thus +the responsibility of the called functions to perform their own error checking +and to exit in case of error. This may change in the future. + + +5. REGISTER family of macros + +The association of INITCALLs and registration functions allows to perform some +early dynamic registration of functions to be used anywhere, as well as values +to be added to existing lists without having to manipulate list elements. For +the sake of simplification, these combinations are available as a set of +REGISTER macros which register calls to certain functions at the appropriate +init stage. Such macros must be used at the top level in a file, just like +INITCALL macros. The following macros are currently supported. Please keep them +alphanumerically ordered: + +- REGISTER_BUILD_OPTS(str) + + Adds the constant string <str> to the list of build options. This is done by + registering a call to hap_register_build_opts(str, 0) at stage STG_REGISTER. + The string will not be freed. + +- REGISTER_CONFIG_POSTPARSER(name, parser) + + Adds a call to function <parser> at the end of the config parsing. The + function is called at the very end of check_config_validity() and may be used + to initialize a subsystem based on global settings for example. This is done + by registering a call to cfg_register_postparser(name, parser) at stage + STG_REGISTER. + +- REGISTER_CONFIG_SECTION(name, parse, post) + + Registers a new config section name <name> which will be parsed by function + <parse> (if not null), and with an optional call to function <post> at the + end of the section. Function <parse> must be of type (int (*parse)(const char + *file, int linenum, char **args, int inv)), and returns 0 on success or an + error code among the ERR_* set on failure. The <post> callback takes no + argument and returns a similar error code. This is achieved by registering a + call to cfg_register_section() with the three arguments at stage + STG_REGISTER. + +- REGISTER_PER_THREAD_ALLOC(fct) + + Registers a call to register_per_thread_alloc(fct) at stage STG_REGISTER. + +- REGISTER_PER_THREAD_DEINIT(fct) + + Registers a call to register_per_thread_deinit(fct) at stage STG_REGISTER. + +- REGISTER_PER_THREAD_FREE(fct) + + Registers a call to register_per_thread_free(fct) at stage STG_REGISTER. + +- REGISTER_PER_THREAD_INIT(fct) + + Registers a call to register_per_thread_init(fct) at stage STG_REGISTER. + +- REGISTER_POOL(ptr, name, size) + + Used internally to declare a new pool. This is made by calling function + create_pool_callback() with these arguments at stage STG_POOL. Do not use it + directly, use either DECLARE_POOL() or DECLARE_STATIC_POOL() instead. + +- REGISTER_PRE_CHECK(fct) + + Registers a call to register_pre_check(fct) at stage STG_REGISTER. + +- REGISTER_POST_CHECK(fct) + + Registers a call to register_post_check(fct) at stage STG_REGISTER. + +- REGISTER_POST_DEINIT(fct) + + Registers a call to register_post_deinit(fct) at stage STG_REGISTER. + +- REGISTER_POST_PROXY_CHECK(fct) + + Registers a call to register_post_proxy_check(fct) at stage STG_REGISTER. + +- REGISTER_POST_SERVER_CHECK(fct) + + Registers a call to register_post_server_check(fct) at stage STG_REGISTER. + +- REGISTER_PROXY_DEINIT(fct) + + Registers a call to register_proxy_deinit(fct) at stage STG_REGISTER. + +- REGISTER_SERVER_DEINIT(fct) + + Registers a call to register_server_deinit(fct) at stage STG_REGISTER. + diff --git a/doc/internals/api/ist.txt b/doc/internals/api/ist.txt new file mode 100644 index 0000000..0f118d6 --- /dev/null +++ b/doc/internals/api/ist.txt @@ -0,0 +1,167 @@ +2021-11-08 - Indirect Strings (IST) API + + +1. Background +------------- + +When parsing traffic, most of the standard C string functions are unusable +since they rely on a trailing zero. In addition, for the rare ones that support +a length, we have to constantly maintain both the pointer and the length. But +then, it's easy to come up with complex lengths and offsets calculations all +over the place, rendering the code hard to read and bugs hard to avoid or spot. + +IST provides a solution to this by defining a structure made of exactly two +word size elements, that most C ABIs know how to handle as a register when +used as a function argument or a function's return value. The functions are +inlined to leave a maximum set of opportunities to the compiler or optimization +and expression reduction, and as a result they are often inexpensive to use. It +is important however to keep in mind that all of these are designed for minimal +code size when dealing with short strings (i.e. parsing tokens in protocols), +and they are not optimal for processing large blocks. + + +2. API description +------------------ + +IST are defined like this: + + struct ist { + char *ptr; // pointer to the string's first byte + size_t len; // number of valid bytes starting from ptr + }; + +A string is not set if its ->ptr member is NULL. In this case .len is undefined +and is recommended to be zero. + +Declaring a function returning an IST: + + struct ist produce_ist(int ok) + { + return ok ? IST("OK") : IST("KO"); + } + +Declaring a function consuming an IST: + + void say_ist(struct ist i) + { + write(1, istptr(i), istlen(i)); + } + +Chaining the two: + + void say_ok(int ok) + { + say_ist(produce_ist(ok)); + } + +Notes: + - the arguments are passed as value, not reference, so there's no need for + any "const" in their declaration (except to catch coding mistakes). + Pointers to ist may benefit from being marked "const" however. + + - similarly for the return value, there's no point is marking it "const" as + this would protect the pointer and length, not the data. + + - use ist0() to append a trailing zero to a variable string for use with + printf()'s "%s" format, or for use with functions that work on NUL- + terminated strings, but beware of not doing this with constants. + + - the API provides a starting pointer and current length, but does not + provide an allocated size. It remains up to the caller to know how large + the allocated area is when adding data, though most functions make this + easy. + +The following macros and functions are defined. Those whose name starts with +underscores require special care and must not be used without being certain +they are properly used (typically subject to buffer overflows if misused). Note +that most functions were added over time depending on instant needs, and some +are very close to each other. Many useful functions are still missing and would +deserve being added. + +Below, arguments "i1","i2" are all of type "ist". Arguments "s" are +NUL-terminated strings of type "char*", and "cs" are of type "const char *". +Arguments "c" are of type "char", and "n" are of type size_t. + + IST(cs):ist make constant IST from a NUL-terminated const string + IST_NULL:ist return an unset IST = ist2(NULL,0) + __istappend(i1,c):ist append character <c> at the end of ist <i1> + ist(s):ist return an IST from a nul-terminated string + ist0(i1):char* write a \0 at the end of an IST, return the string + ist2(cs,l):ist return a variable IST from a const string and length + ist2bin(s,i1):ist copy IST into a buffer, return the result + ist2bin_lc(s,i1):ist like ist2bin() but turning turning to lower case + ist2bin_uc(s,i1):ist like ist2bin() but turning turning to upper case + ist2str(s,i1):ist copy IST into a buffer, add NUL and return the result + ist2str_lc(s,i1):ist like ist2str() but turning turning to lower case + ist2str_uc(s,i1):ist like ist2str() but turning turning to upper case + ist_find(i1,c):ist return first occurrence of char <c> in <i1> + ist_find_ctl(i1):char* return pointer to first CTL char in <i1> or NULL + ist_skip(i1,c):ist return first occurrence of char not <c> in <i1> + istadv(i1,n):ist advance the string by <n> characters + istalloc(n):ist return allocated string of zero initial length + istcat(d,s,n):ssize_t copy <s> after <d> for <n> chars max, return len or -1 + istchr(i1,c):char* return pointer to first occurrence of <c> in <i1> + istclear(i1*):size_t return previous size and set size to zero + istcpy(d,s,n):ssize_t copy <s> over <d> for <n> chars max, return len or -1 + istdiff(i1,i2):int return the ordinal difference, like strcmp() + istdup(i1):ist allocate new ist and copy original one into it + istend(i1):char* return pointer to first character after the IST + isteq(i1,i2):int return non-zero if strings are equal + isteqi(i1,i2):int like isteq() but case-insensitive + istfree(i1*) free of allocated <i1>/IST_NULL and set it to IST_NULL + istissame(i1,i2):int return true if pointers and lengths are equal + istist(i1,i2):ist return first occurrence of <i2> in <i1> + istlen(i1):size_t return the length of the IST (number of characters) + istmatch(i1,i2):int return non-zero if i1 starts like i2 (empty OK) + istmatchi(i1,i2):int like istmatch() but case insensitive + istneq(i1,i2,n):int like isteq() but limited to the first <n> chars + istnext(i1):ist return the IST advanced by one character + istnmatch(i1,i2,n):int like istmatch() but limited to the first <n> chars + istpad(s,i1):ist copy IST into a buffer, add a NUL, return the result + istptr(i1):char* return the starting pointer of the IST + istscat(d,s,n):ssize_t same as istcat() but always place a NUL at the end + istscpy(d,s,n):ssize_t same as istcpy() but always place a NUL at the end + istshift(i1*):char return the first character and advance the IST by one + istsplit(i1*,c):ist return part before <c>, make ist start from <c> + iststop(i1,c):ist truncate ist before first occurrence of <c> + isttest(i1):int return true if ist is not NULL, false otherwise + isttrim(i1,n):ist return ist trimmed to no more than <n> characters + istzero(i1,n):ist trim to <n> chars, trailing zero included. + + +3. Quick index by typical C construct or function +------------------------------------------------- + +Some common C constructs may be adjusted to use ist instead. The mapping is not +always one-to-one, but usually the computations on the length part tends to +disappear in the refactoring, allowing to directly chain function calls. The +entries below are hints to figure what function to look for in order to rewrite +some common use cases. + + char* IST equivalent + + strchr() istchr(), ist_find(), iststop() + strstr() istist() + strcpy() istcpy() + strscpy() istscpy() + strlcpy() istscpy() + strcat() istcat() + strscat() istscat() + strlcat() istscat() + strcmp() istdiff() + strdup() istdup() + !strcmp() isteq() + !strncmp() istneq(), istmatch(), istnmatch() + !strcasecmp() isteqi() + !strncasecmp() istneqi(), istmatchi() + strtok() istsplit() + return NULL return IST_NULL + s = malloc() s = istalloc() + free(s); s = NULL istfree(&s) + p != NULL isttest(p) + c = *(p++) c = istshift(p) + *(p++) = c __istappend(p, c) + p += n istadv(p, n) + p + strlen(p) istend(p) + p[max] = 0 isttrim(p, max) + p[max+1] = 0 istzero(p, max) diff --git a/doc/internals/api/layers.txt b/doc/internals/api/layers.txt new file mode 100644 index 0000000..b5c35f4 --- /dev/null +++ b/doc/internals/api/layers.txt @@ -0,0 +1,190 @@ +2022-05-27 - Stream layers in HAProxy 2.6 + + +1. Background + +There are streams at plenty of levels in haproxy, essentially due to the +introduction of multiplexed protocols which provide high-level streams on top +of low-level streams, themselves either based on stream-oriented protocols or +datagram-oriented protocols. + +The refactoring of the appctx and muxes that allowed to drop a lot of duplicate +code between 2.5 and 2.6-dev6 raised another concern with some entities like +"conn_stream" that were not specific to connections anymore, "endpoints" that +became entities on their own, and "targets" whose life had been extended to +last all along a connection. + +It was time to rename all such legacy entities introduced in 1.8 and which had +turned particularly confusing over time as their roles evolved. + + +2. Naming principles + +The global renaming of some entities between streams and connections was +articulated around several principles: + + - avoid the confusing use of "context" in shared places. For example, the + endpoint's connection is in "ctx" and nothing makes it obvious that the + endpoint's context is a connection, especially when an applet is there. + + - reserve relative nouns for pointers and not for types. "endpoint", just + like "owner" or "peer" is relative, but when accessed from a different + layer it starts to make no sense at all, or to make one believe it's + something else, particularly with void*. + + - avoid too generic terms that have multiple meanings, or words that are + synonyms in a same place (e.g. "peer" and "remote", or "endpoint" and + "target"). If two synonyms are needed to designate two distinct entities, + there's probably a problem elsewhere, or the problem is poorly defined. + + - make it clearer that all that is manipulated is related to streams. This + particularly important in sample fetch functions for example, which tend + to require low-level access and could be mislead in trying to follow the + wrong chain when trying to get information about a connection. + + - use easily spellable short names that abbreviate unambiguously when used + together in adjacent contexts + + +3. Current state as of 2.6 + +- when a name is required to designate the lower block that starts at the mux + stream or the appctx, it is spoken of as a "stream endpoint", and abbreviated + "se". It's okay because while "endpoint" itself is relative, "stream + endpoint" unequivocally designates one extremity of a stream. If a type is + needed for this in the future (e.g. via obj_type), then the type "stendp" + may be used. Before 2.6-dev6 there was no name for this, it was known as + conn_stream->ctx. + +- the 2.6-dev6 cs_endpoint which preserves the state of a mux stream or an + appctx and abstracts them in front of a conn_stream becomes a "stream + endpoint descriptor", of type "sedesc" and often abbreviated "sd", "sed" + or "ed". Its "target" pointer became "se" as per the rule above. Before + 2.6-dev6, these elements were mixed with others inside conn_stream. From + the appctx it's called "sedesc" (few occurrences hence long name OK). + +- the conn_stream which is always attached to either a stream or a health check + and that is used to reach a mux or an applet becomes a "stream connector" of + type "stconn", generally abbreviated "sc". Its "endp" pointer becomes + "sedesc" as per the rule above, and that one has a back pointer "sc". The + stream uses "scf" and "scb" as the respective front and back pointers to the + stconns. Prior to 2.6-dev6, these parts were split between conn_stream and + stream_interface. + +- the sedesc's "ctx" which is solely used to store the connection as of now, is + renamed "conn" to void any doubt in the context of applets or even muxes. In + the future the connection should be attached to the "se" instead and this + pointer should disappear (or be recycled for anything else). + +The new 2.6 model looks like this: + + +------------------------+ + | stream or health check | + +------------------------+ + ^ \ scf, scb + / \ + | | + \ / + app \ v + +----------+ + | stconn | + +----------+ + ^ \ sedesc + / \ + . . . . | . . . | . . . . . split point (retries etc) + \ / + sc \ v + +----------+ + flags <--| sedesc | : sedesc : + +----------+ ... +----------+ + conn / ^ \ se ^ \ + +------------+ / / \ | \ + | connection |<--' | | ... OR ... | | + +------------+ \ / \ | + mux| ^ |ctx sd \ v : sedesc \ v + | | | +----------------------+ \ # +----------+ svcctx + | | | | mux stream or appctx | | # | appctx |--. + | | | +----------------------+ | # +----------+ | + | | | ^ | / private # : : | + v | | | v > to the # +----------+ | + mux_ops | | +----------------+ \ mux # | svcctx |<-' + | +---->| mux connection | ) # +----------+ + +------ +----------------+ / # + +Stream descriptors may exist in the following modes: + - .conn = NULL, .se = NULL : backend, not connection attempt yet + - .conn = NULL, .se = <appctx> : frontend or backend, applet + - .conn = <conn>, .se = NULL : backend, connection in progress + - .conn = <conn>, .se = <muxs> : frontend or backend, connected + +Notes: + - for historical reasons (connect, forced protocol upgrades, etc), during a + connection setup or a rule-based protocol upgrade, the connection's "ctx" + may temporarily point to the stconn + + +4. Invariants and cardinalities + +Usually a stream is created from an existing stconn from a mux or some applets, +but may also be allocated first by other applets schedulers. After stream_new() +a stream always has exactly one stconn per side (scf, scb), each of which has +one ->sedesc. Each side is initialized with either one or no stream endpoint +attached to the descriptor. + +Both applets and a mux stream always have a stream endpoint descriptor. AS SUCH +IT IS NEVER NECESSARY TO TEST FOR THE EXISTENCE OF THE SEDESC FROM ANY SIDE, IT +ALWAYS EXISTS. This explains why as much as possible it's preferable to use the +sedesc to access flags and statuses from any side, rather than bouncing via the +stconn. + +An applet's app layer is always a stream, which means that there are always +channels accessible above, and there is always an opposite stream connector and +a stream endpoint descriptor. As such, it's always safe for an applet to access +the other side using sc_opposite(). + +When an outgoing connection is in the process of being established, the backend +side sedesc has its ->conn pointer pointing to the pending connection, and no +->se. Once the connection is established and a mux is chosen, it's attached to +the ->se. If an applet is used instead of a mux, the appctx is attached to the +sedesc's ->se and ->conn remains NULL. + +If either side wants to detach from the other, it must allocate a new virgin +sedesc to replace the existing one, and leave the existing one to the endpoint, +since it continues to describe the stream endpoint. The stconn keeps its state +(modulo the updates related to the disconnection). The previous sedesc points +to a NULL stconn. For example, disconnecting from a backend mux will leave the +entities like this: + + +------------------------+ + | stream or health check | + +------------------------+ + ^ \ scf, scb + / \ + | | + \ / + app \ v + +----------+ + | stconn | + +----------+ + ^ \ sedesc + / \ + NULL | | + ^ \ / + sc | / sc \ v + +----------+ / +----------+ + flags <--| sedesc1 | . . . . . | sedesc2 |--> flags + +----------+ / +----------+ + conn / ^ \ se / conn / \ se + +------------+ / / \ | | + | connection |<--' | | v v + +------------+ \ / NULL NULL + mux| ^ |ctx sd \ v + | | | +----------------------+ + | | | | mux stream or appctx | + | | | +----------------------+ + | | | ^ | + v | | | v + mux_ops | | +----------------+ + | +---->| mux connection | + +------ +----------------+ + diff --git a/doc/internals/api/list.txt b/doc/internals/api/list.txt new file mode 100644 index 0000000..d03cf03 --- /dev/null +++ b/doc/internals/api/list.txt @@ -0,0 +1,195 @@ +2021-11-09 - List API + + +1. Background +------------- + +HAProxy's lists are almost all doubly-linked and circular so that it is always +possible to insert at the beginning, append at the end, scan them in any order +and delete any element without having to scan to search the predecessor nor the +successor. + +A list's head is just a regular list element, and an element always points to +another list element. Such elements only have two pointers, the next and the +previous elements. The object being pointed to is retrieved by subtracting the +list element's offset in its structure from the list element's pointer. This +way there is no need for any separate allocation for the list element, for a +pointer to the object in the list, nor for a pointer to the list element from +the object, as the list is embedded into the object. + +All basic operations are provided, as well as some iterators. Some iterators +are safe for removal of the current element within the loop, others not. In any +case a list cannot be freely modified while iterating over it (e.g. the current +element's successor cannot not be freed if it's saved as the restart point). + +Extreme care is taken nowadays in HAProxy to make sure that no dangling +pointers are left in elements, so it is important to always initialize list +heads and list elements, as well as elements that are removed from a list if +they are not immediately freed, so that their deletion is idempotent. A rule of +thumb is that a list pointer's validity never has to be checked, it is always +valid to dereference it. A lot of complex bugs have been caused in the past by +incorrect list manipulation, such as an element being deleted twice, resulting +in damaging previously adjacent elements' neighbours. This usually has serious +consequences at locations that are totally different from the one of the bug, +and that are only detected much later, so it is required to be particularly +strict on using lists safely. + +The lists are not thread-safe, but mt_lists may be used instead. + + +2. API description +------------------ + +A list is defined like this, both for the list's head, and for any other +element: + + struct list { + struct list *n; /* next */ + struct list *p; /* prev */ + }; + +An empty list points to itself for both pointers. I.e. a list's head is both +its own successor and its own predecessor. This guarantees that insertions +and deletions can be done without any check and that deletion is idempotent. +For this reason and by convention, a detached element ought to be represented +like an empty head. + +Lists are manipulated using a set of macros which are used to initialize, add, +remove, or iterate over elements. Most of these macros are extremely simple and +are not even protected against multiple evaluation, so it is fundamentally +important that the expressions used in the arguments are idempotent and that +the result does not depend on the evaluation order of the arguments. + +Macro Description + +ILH + Initialized List Head : this is a non-NULL, non-empty list element used + to prevent the compiler from moving an empty list head declaration to + BSS, typically when it appears in an array of keywords Without this, + some older versions of gcc tend to trim all the array and cause + corruption. + +LIST_INIT(l) + Initialize the list as an empty list head + +LIST_HEAD_INIT(l) + Return a valid initialized empty list head pointing to this + element. Essentially used with assignments in declarations. + +LIST_INSERT(l, e) + Add an element at the beginning of a list and return it + +LIST_APPEND(l, e) + Add an element at the end of a list and return it + +LIST_SPLICE(n, o) + Add the contents of a list <o> at the beginning of another list <n>. + The old list head remains untouched. + +LIST_SPLICE_END_DETACHED(n, o) + Add the contents of a list whose first element is is <o> and last one + is <o->p> at the end of another list <n>. The old list DOES NOT have + any head here. + +LIST_DELETE(e) + Remove an element from a list and return it. Safe to call on + initialized elements, but will not change the element itself so it is + not idempotent. Consider using LIST_DEL_INIT() instead unless called + immediately after a free(). + +LIST_DEL_INIT(e) + Remove an element from a list, initialize it and return it so that a + subsequent LIST_DELETE() is safe. This is faster than performing a + LIST_DELETE() followed by a LIST_INIT() as pointers are not reloaded. + +LIST_ELEM(l, t, m) + Return a pointer of type <t> to a structure containing a list head + member called <m> at address <l>. Note that <l> can be the result of a + function or macro since it's used only once. + +LIST_ISEMPTY(l) + Check if the list head <l> is empty (=initialized) or not, and return + non-zero only if so. + +LIST_INLIST(e) + Check if the list element <e> was added to a list or not, thus return + true unless the element was initialized. + +LIST_INLIST_ATOMIC(e) + Atomically check if the list element's next pointer points to anything + different from itself, implying the element should be part of a + list. This usually is similar to LIST_INLIST() except that while that + one might be instrumented using debugging code to perform further + consistency checks, the macro below guarantees to always perform a + single atomic test and is safe to use with barriers. + +LIST_NEXT(l, t, m) + Return a pointer of type <t> to a structure following the element which + contains list head <l>, which is known as member <m> in struct <t>. + +LIST_PREV(l, t, m) + Return a pointer of type <t> to a structure preceding the element which + contains list head <l>, which is known as member <m> in struct <t>. + Note that this macro is first undefined as it happened to already exist + on some old OSes. + +list_for_each_entry(i, l, m) + Iterate local variable <i> through a list of items of type "typeof(*i)" + which are linked via a "struct list" member named <m>. A pointer to the + head of the list is passed in <l>. No temporary variable is needed. + Note that <i> must not be modified during the loop. + +list_for_each_entry_from(i, l, m) + Same as list_for_each_entry() but starting from current value of <i> + instead of the list's head. + +list_for_each_entry_from_rev(i, l, m) + Same as list_for_each_entry_rev() but starting from current value of <i> + instead of the list's head. + +list_for_each_entry_rev(i, l, m) + Iterate backwards local variable <i> through a list of items of type + "typeof(*i)" which are linked via a "struct list" member named <m>. A + pointer to the head of the list is passed in <l>. No temporary variable + is needed. Note that <i> must not be modified during the loop. + +list_for_each_entry_safe(i, b, l, m) + Iterate variable <i> through a list of items of type "typeof(*i)" which + are linked via a "struct list" member named <m>. A pointer to the head + of the list is passed in <l>. A temporary backup variable <b> of same + type as <i> is needed so that <i> may safely be deleted if needed. Note + that it is only permitted to delete <i> and no other element during + this operation! + +list_for_each_entry_safe_from(i, b, l, m) + Same as list_for_each_entry_safe() but starting from current value of + <i> instead of the list's head. + +list_for_each_entry_safe_from_rev(i, b, l, m) + Same as list_for_each_entry_safe_rev() but starting from current value + of <i> instead of the list's head. + +list_for_each_entry_safe_rev(i, b, l, m) + Iterate backwards local variable <i> through a list of items of type + "typeof(*i)" which are linked via a "struct list" member named <m>. A + pointer to the head of the list is passed in <l>. A temporary variable + <b> of same type as <i> is needed so that <i> may safely be deleted if + needed. Note that it is only permitted to delete <i> and no other + element during this operation! + +3. Notes +-------- + +- This API is quite old and some macros are missing. For example there's still + no list_first() so it's common to use LIST_ELEM(head->n, ...) instead. Some + older parts of the code also used to rely on list_for_each() followed by a + break to stop on the first element. + +- Some parts were recently renamed because LIST_ADD() used to do what + LIST_INSERT() currently does and was often mistaken with LIST_ADDQ() which is + what LIST_APPEND() now is. As such it is not totally impossible that some + places use a LIST_INSERT() where a LIST_APPEND() would be desired. + +- The structure must not be modified at all (even to add debug info). Some + parts of the code assume that its layout is exactly this one, particularly + the parts ensuring the casting between MT lists and lists. diff --git a/doc/internals/api/pools.txt b/doc/internals/api/pools.txt new file mode 100644 index 0000000..2c54409 --- /dev/null +++ b/doc/internals/api/pools.txt @@ -0,0 +1,577 @@ +2022-02-24 - Pools structure and API + +1. Background +------------- + +Memory allocation is a complex problem covered by a massive amount of +literature. Memory allocators found in field cover a broad spectrum of +capabilities, performance, fragmentation, efficiency etc. + +The main difficulty of memory allocation comes from finding the optimal chunks +for arbitrary sized requests, that will still preserve a low fragmentation +level. Doing this well is often expensive in CPU usage and/or memory usage. + +In programs like HAProxy that deal with a large number of fixed size objects, +there is no point having to endure all this risk of fragmentation, and the +associated costs (sometimes up to several milliseconds with certain minimalist +allocators) are simply not acceptable. A better approach consists in grouping +frequently used objects by size, knowing that due to the high repetitiveness of +operations, a freed object will immediately be needed for another operation. + +This grouping of objects by size is what is called a pool. Pools are created +for certain frequently allocated objects, are usually merged together when they +are of the same size (or almost the same size), and significantly reduce the +number of calls to the memory allocator. + +With the arrival of threads, pools started to become a bottleneck so they now +implement an optional thread-local lockless cache. Finally with the arrival of +really efficient memory allocator in modern operating systems, the shared part +has also become optional so that it doesn't consume memory if it does not bring +any value. + +In 2.6-dev2, a number of debugging options that used to be configured at build +time only changed to boot-time and can be modified using keywords passed after +"-dM" on the command line, which sets or clears bits in the pool_debugging +variable. The build-time options still affect the default settings however. +Default values may be consulted using "haproxy -dMhelp". + + +2. Principles +------------- + +The pools architecture is selected at build time. The main options are: + + - thread-local caches and process-wide shared pool enabled (1) + + This is the default situation on most operating systems. Each thread has + its own local cache, and when depleted it refills from the process-wide + pool that avoids calling the standard allocator too often. It is possible + to force this mode at build time by setting CONFIG_HAP_GLOBAL_POOLS or at + boot time with "-dMglobal". + + - thread-local caches only are enabled (2) + + This is the situation on operating systems where a fast and modern memory + allocator is detected and when it is estimated that the process-wide shared + pool will not bring any benefit. This detection is automatic at build time, + but may also be forced at build tmie by setting CONFIG_HAP_NO_GLOBAL_POOLS + or at boot time with "-dMno-global". + + - pass-through to the standard allocator (3) + + This is used when one absolutely wants to disable pools and rely on regular + malloc() and free() calls, essentially in order to trace memory allocations + by call points, either internally via DEBUG_MEM_STATS, or externally via + tools such as Valgrind. This mode of operation may be forced at build time + by setting DEBUG_NO_POOLS or at boot time with "-dMno-cache". + + - pass-through to an mmap-based allocator for debugging (4) + + This is used only during deep debugging when trying to detect various + conditions such as use-after-free. In this case each allocated object's + size is rounded up to a multiple of a page size (4096 bytes) and an + integral number of pages is allocated for each object using mmap(), + surrounded by two unaccessible holes that aim to detect some out-of-bounds + accesses. Released objects are instantly freed using munmap() so that any + immediate subsequent access to the memory area crashes the process if the + area had not been reallocated yet. This mode can be enabled at build time + by setting DEBUG_UAF. It tends to consume a lot of memory and not to scale + at all with concurrent calls, that tends to make the system stall. The + watchdog may even trigger on some slow allocations. + +There are no more provisions for running with a shared pool but no thread-local +cache: the shared pool's main goal is to compensate for the expensive calls to +the memory allocator. This gain may be huge on tiny systems using basic +allocators, but the thread-local cache will already achieve this. And on larger +threaded systems, the shared pool's benefit is visible when the underlying +allocator scales poorly, but in this case the shared pool would suffer from +the same limitations without its thread-local cache and wouldn't provide any +benefit. + +Summary of the various operation modes: + + (1) (2) (3) (4) + + User User User User + | | | | + pool_alloc() V V | | + +---------+ +---------+ | | + | Thread | | Thread | | | + | Local | | Local | | | + | Cache | | Cache | | | + +---------+ +---------+ | | + | | | | + pool_refill*() V | | | + +---------+ | | | + | Shared | | | | + | Pool | | | | + +---------+ | | | + | | | | + malloc() V V V | + +---------+ +---------+ +---------+ | + | Library | | Library | | Library | | + +---------+ +---------+ +---------+ | + | | | | + mmap() V V V V + +---------+ +---------+ +---------+ +---------+ + | OS | | OS | | OS | | OS | + +---------+ +---------+ +---------+ +---------+ + +One extra build define, DEBUG_FAIL_ALLOC, is used to enforce random allocation +failure in pool_alloc() by randomly returning NULL, to test that callers +properly handle allocation failures. It may also be enabled at boot time using +"-dMfail". In this case the desired average rate of allocation failures can be +fixed by global setting "tune.fail-alloc" expressed in percent. + +The thread-local caches contain the freshest objects whose total size amounts +to CONFIG_HAP_POOL_CACHE_SIZE bytes, which is typically was 1MB before 2.6 and +is 512kB after. The aim is to keep hot objects that still fit in the CPU core's +private L2 cache. Once these objects do not fit into the cache anymore, there's +no benefit keeping them local to the thread, so they'd rather be returned to +the shared pool or the main allocator so that any other thread may make use of +them. + + +3. Storage in thread-local caches +--------------------------------- + +This section describes how objects are linked in thread local caches. This is +not meant to be a concern for users of the pools API but it can be useful when +inspecting post-mortem dumps or when trying to figure certain size constraints. + +Objects are stored in the local cache using a doubly-linked list. This ensures +that they can be visited by freshness order like a stack, while at the same +time being able to access them from oldest to newest when it is needed to +evict coldest ones first: + + - releasing an object to the cache always puts it on the top. + + - allocating an object from the cache always takes the topmost one, hence the + freshest one. + + - scanning for older objects to evict starts from the bottom, where the + oldest ones are located + +To that end, each thread-local cache keeps a list head in the "list" member of +its "pool_cache_head" descriptor, that links all objects cast to type +"pool_cache_item" via their "by_pool" member. + +Note that the mechanism described above only works for a single pool. When +trying to limit the total cache size to a certain value, all pools included, +there is also a need to arrange all objects from all pools together in the +local caches. For this, each thread_ctx maintains a list head of recently +released objects, all pools included, in its member "pool_lru_head". All items +in a thread-local cache are linked there via their "by_lru" member. + +This means that releasing an object using pool_free() consists in inserting +it at the beginning of two lists: + - the local pool_cache_head's "list" list head + - the thread context's "pool_lru_head" list head + +Allocating an object consists in picking the first entry from the pool's "list" +and deleting its "by_pool" and "by_lru" links. + +Evicting an object consists in scanning the thread context's "pool_lru_head" +backwards and deleting the object's "by_pool" and "by_lru" links. + +Given that entries are both inserted and removed synchronously, we have the +guarantee that the oldest object in the thread's LRU list is always the oldest +object in its pool, and that the next element is the cache's list head. This is +what allows the LRU eviction mechanism to figure what pool an object belongs to +when releasing it. + +Note: + | Since a pool_cache_item has two list entries, on 64-bit systems it will be + | 32-bytes long. This is the smallest size that a pool may be, and any smaller + | size will automatically be rounded up to this size. + +When build option DEBUG_POOL_INTEGRITY is set, or the boot-time option +"-dMintegrity" is passed on the command line, the area of the object between +the two list elements and the end according to pool->size will be filled with +pseudo-random words during pool_put_to_cache(), and these words will be +compared between each other during pool_get_from_cache(), and the process will +crash in case any bit differs, as this would indicate that the memory area was +modified after the free. The pseudo-random pattern is in fact incremented by +(~0)/3 upon each free so that roughly half of the bits change each time and we +maximize the likelihood of detecting a single bit flip in either direction. In +order to avoid an immediate reuse and maximize the time the object spends in +the cache, when this option is set, objects are picked from the cache from the +oldest one instead of the freshest one. This way even late memory corruptions +have a chance to be detected. + +When build option DEBUG_MEMORY_POOLS is set, or the boot-time option "-dMtag" +is passed on the executable's command line, pool objects are allocated with +one extra pointer compared to the requested size, so that the bytes that follow +the memory area point to the pool descriptor itself as long as the object is +allocated via pool_alloc(). Upon releasing via pool_free(), the pointer is +compared and the code will crash in if it differs. This allows to detect both +memory overflows and object released to the wrong pool (code bug resulting from +a copy-paste error typically). + +Thus an object will look like this depending whether it's in the cache or is +currently in use: + + in cache in use + +------------+ +------------+ + <--+ by_pool.p | | N bytes | + | by_pool.n +--> | | + +------------+ |N=16 min on | + <--+ by_lru.p | | 32-bit, | + | by_lru.n +--> | 32 min on | + +------------+ | 64-bit | + : : : : + | N bytes | | | + +------------+ +------------+ \ optional, only if + : (unused) : : pool ptr : > DEBUG_MEMORY_POOLS + +------------+ +------------+ / is set at build time + or -dMtag at boot time + +Right now no provisions are made to return objects aligned on larger boundaries +than those currently covered by malloc() (i.e. two pointers). This need appears +from time to time and the layout above might evolve a little bit if needed. + + +4. Storage in the process-wide shared pool +------------------------------------------ + +In order for the shared pool not to be a contention point in a multi-threaded +environment, objects are allocated from or released to shared pools by clusters +of a few objects at once. The maximum number of objects that may be moved to or +from a shared pool at once is defined by CONFIG_HAP_POOL_CLUSTER_SIZE at build +time, and currently defaults to 8. + +In order to remain scalable, the shared pool has to make some tradeoffs to +limit the number of atomic operations and the duration of any locked operation. +As such, it's composed of a single-linked list of clusters, themselves made of +a single-linked list of objects. + +Clusters and objects are of the same type "pool_item" and are accessed from the +pool's "free_list" member. This member points to the latest pool_item inserted +into the pool by a release operation. And the pool_item's "next" member points +to the next pool_item, which was the one present in the pool's free_list just +before the pool_item was inserted, and the last pool_item in the list simply +has a NULL "next" field. + +The pool_item's "down" pointer points down to the next objects part of the same +cluster, that will be released or allocated at the same time as the first one. +Each of these items also has a NULL "next" field, and are chained by their +respective "down" pointers until the last one is detected by a NULL value. + +This results in the following layout: + + pool pool_item pool_item pool_item + +-----------+ +------+ +------+ +------+ + | free_list +--> | next +--> | next +--> | NULL | + +-----------+ +------+ +------+ +------+ + | down | | NULL | | down | + +--+---+ +------+ +--+---+ + | | + V V + +------+ +------+ + | NULL | | NULL | + +------+ +------+ + | down | | NULL | + +--+---+ +------+ + | + V + +------+ + | NULL | + +------+ + | NULL | + +------+ + +Allocating an entry is only a matter of performing two atomic allocations on +the free_list and reading the pool's "next" value: + + - atomically mark the free_list as being updated by writing a "magic" pointer + - read the first pool_item's "next" field + - atomically replace the free_list with this value + +This results in a fast operation that instantly retrieves a cluster at once. +Then outside of the critical section entries are walked over and inserted into +the local cache one at a time. In order to keep the code simple and efficient, +objects allocated from the shared pool are all placed into the local cache, and +only then the first one is allocated from the cache. This operation is +performed by the dedicated function pool_refill_local_from_shared() which is +called from pool_get_from_cache() when the cache is empty. It means there is an +overhead of two list insert/delete operations for the first object and that +could be avoided at the expense of more complex code in the fast path, but this +is negligible since it only concerns objects that need to be visited anyway. + +Freeing a group of objects consists in performing the operation the other way +around: + + - atomically mark the free_list as being updated by writing a "magic" pointer + - write the free_list value to the to-be-released item's "next" entry + - atomically replace the free_list with the pool_item's pointer + +The cluster will simply have to be prepared before being sent to the shared +pool. The operation of releasing a cluster at once is performed by function +pool_put_to_shared_cache() which is called from pool_evict_last_items() which +itself is responsible for building the clusters. + +Due to the way objects are stored, it is important to try to group objects as +much as possible when releasing them because this is what will condition their +retrieval as groups as well. This is the reason why pool_evict_last_items() +uses the LRU to find a first entry but tries to pick several items at once from +a single cache. Tests have shown that CONFIG_HAP_POOL_CLUSTER_SIZE set to 8 +achieves up to 6-6.5 objects on average per operation, which effectively +divides by as much the average time spent per object by each thread and pushes +the contention point further. + +Also, grouping items in clusters is a property of the process-wide shared pool +and not of the thread-local caches. This means that there is no grouped +operation when not using the shared pool (mode "2" in the diagram above). + + +5. API +------ + +The following functions are public and available for user code: + +struct pool_head *create_pool(char *name, uint size, uint flags) + Create a new pool named <name> for objects of size <size> bytes. Pool + names are truncated to their first 11 characters. Pools of very similar + size will usually be merged if both have set the flag MEM_F_SHARED in + <flags>. When DEBUG_DONT_SHARE_POOLS was set at build time, or + "-dMno-merge" is passed on the executable's command line, the pools + also need to have the exact same name to be merged. In addition, unless + MEM_F_EXACT is set in <flags>, the object size will usually be rounded + up to the size of pointers (16 or 32 bytes). The name that will appear + in the pool upon merging is the name of the first created pool. The + returned pointer is the new (or reused) pool head, or NULL upon error. + Pools created this way must be destroyed using pool_destroy(). + +void *pool_destroy(struct pool_head *pool) + Destroy pool <pool>, that is, all of its unused objects are freed and + the structure is freed as well if the pool didn't have any used objects + anymore. In this case NULL is returned. If some objects remain in use, + the pool is preserved and its pointer is returned. This ought to be + used essentially on exit or in rare situations where some internal + entities that hold pools have to be destroyed. + +void pool_destroy_all(void) + Destroy all pools, without checking which ones still have used entries. + This is only meant for use on exit. + +void *__pool_alloc(struct pool_head *pool, uint flags) + Allocate an entry from the pool <pool>. The allocator will first look + for an object in the thread-local cache if enabled, then in the shared + pool if enabled, then will fall back to the operating system's default + allocator. NULL is returned if the object couldn't be allocated (due to + configured limits or lack of memory). Object allocated this way have to + be released using pool_free(). Like with malloc(), by default the + contents of the returned object are undefined. If memory poisonning is + enabled, the object will be filled with the poisonning byte. If the + global "pool.fail-alloc" setting is non-zero and DEBUG_FAIL_ALLOC is + enabled, a random number generator will be called to randomly return a + NULL. The allocator's behavior may be adjusted using a few flags passed + in <flags>: + - POOL_F_NO_POISON : when set, disables memory poisonning (e.g. when + pointless and expensive, like for buffers) + - POOL_F_MUST_ZERO : when set, the memory area will be zeroed before + being returned, similar to what calloc() does + - POOL_F_NO_FAIL : when set, disables the random allocation failure, + e.g. for use during early init code or critical sections. + +void *pool_alloc(struct pool_head *pool) + This is an exact equivalent of __pool_alloc(pool, 0). It is the regular + way to allocate entries from a pool. + +void *pool_alloc_nocache(struct pool_head *pool) + Allocate an entry from the pool <pool>, bypassing the cache. If shared + pools are enabled, they will be consulted first. Otherwise the object + is allocated using the operating system's default allocator. This is + essentially used during early boot to pre-allocate a number of objects + for pools which require a minimum number of entries to exist. + +void *pool_zalloc(struct pool_head *pool) + This is an exact equivalent of __pool_alloc(pool, POOL_F_MUST_ZERO). + +void pool_free(struct pool_head *pool, void *ptr) + Free an entry allocate from one of the pool_alloc() functions above + from pool <pool>. The object will be placed into the thread-local cache + if enabled, or in the shared pool if enabled, or will be released using + the operating system's default allocator. When a local cache is + enabled, if the local cache size becomes larger than 75% of the maximum + size configured at build time, some objects will be evicted to the + shared pool. Such objects are taken first from the same pool, but if + the total size is really huge, other pools might be checked as well. + Some extra checks enabled at build time may enforce extra checks so + that the process will immediately crash if the object was not allocated + from this pool or experienced an overflow or some memory corruption. + +void pool_flush(struct pool_head *pool) + Free all unused objects from shared pool <pool>. Thread-local caches + are not affected. This is essentially used when running low on memory + or when stopping, in order to release a maximum amount of memory for + the new process. + +void pool_gc(struct pool_head *pool) + Free all unused objects from all pools, but respecting the minimum + number of spare objects required for each of them. Then, for operating + systems which support it, indicate the system that all unused memory + can be released. Thread-local caches are not affected. This operation + differs from pool_flush() in that it is run locklessly, under thread + isolation, and on all pools in a row. It is called by the SIGQUIT + signal handler and upon exit. Note that the obsolete argument <pool> is + not used and the convention is to pass NULL there. + +void dump_pools_to_trash(void) + Dump the current status of all pools into the trash buffer. This is + essentially used by the "show pools" CLI command or the SIGQUIT signal + handler to dump them on stderr. The total report size may not exceed + the size of the trash buffer. If it does, some entries will be missing. + +void dump_pools(void) + Dump the current status of all pools to stderr. This just calls + dump_pools_to_trash() and writes the trash to stderr. + +int pool_total_failures(void) + Report the total number of failed allocations. This is solely used to + report the "PoolFailed" metrics of the "show info" output. The total + is calculated on the fly by summing the number of failures in all pools + and is only meant to be used as an indicator rather than a precise + measure. + +ullong pool_total_allocated(void) + Report the total number of bytes allocated in all pools, for reporting + in the "PoolAlloc_MB" field of the "show info" output. The total is + calculated on the fly by summing the number of allocated bytes in all + pools and is only meant to be used as an indicator rather than a + precise measure. + +ullong pool_total_used(void) + Report the total number of bytes used in all pools, for reporting in + the "PoolUsed_MB" field of the "show info" output. The total is + calculated on the fly by summing the number of used bytes in all pools + and is only meant to be used as an indicator rather than a precise + measure. Note that objects present in caches are accounted as used. + +Some other functions exist and are only used by the pools code itself. While +not strictly forbidden to use outside of this code, it is generally recommended +to avoid touching them in order not to create undesired dependencies that will +complicate maintenance. + +A few macros exist to ease the declaration of pools: + +DECLARE_POOL(ptr, name, size) + Placed at the top level of a file, this declares a global memory pool + as variable <ptr>, name <name> and size <size> bytes per element. This + is made via a call to REGISTER_POOL() and by assigning the resulting + pointer to variable <ptr>. <ptr> will be created of type "struct + pool_head *". If the pool needs to be visible outside of the function + (which is likely), it will also need to be declared somewhere as + "extern struct pool_head *<ptr>;". It is recommended to place such + declarations very early in the source file so that the variable is + already known to all subsequent functions which may use it. + +DECLARE_STATIC_POOL(ptr, name, size) + Placed at the top level of a file, this declares a static memory pool + as variable <ptr>, name <name> and size <size> bytes per element. This + is made via a call to REGISTER_POOL() and by assigning the resulting + pointer to local variable <ptr>. <ptr> will be created of type "static + struct pool_head *". It is recommended to place such declarations very + early in the source file so that the variable is already known to all + subsequent functions which may use it. + + +6. Build options +---------------- + +A number of build-time defines allow to tune the pools behavior. All of them +have to be enabled using "-Dxxx" or "-Dxxx=yyy" in the makefile's DEBUG +variable. + +DEBUG_NO_POOLS + When this is set, pools are entirely disabled, and allocations are made + using malloc() instead. This is not recommended for production but may + be useful for tracing allocations. It corresponds to "-dMno-cache" at + boot time. + +DEBUG_MEMORY_POOLS + When this is set, an extra pointer is allocated at the end of each + object to reference the pool the object was allocated from and detect + buffer overflows. Then, pool_free() will provoke a crash in case it + detects an anomaly (pointer at the end not matching the pool). It + corresponds to "-dMtag" at boot time. + +DEBUG_FAIL_ALLOC + When enabled, a global setting "tune.fail-alloc" may be set to a non- + zero value representing a percentage of memory allocations that will be + made to fail in order to stress the calling code. It corresponds to + "-dMfail" at boot time. + +DEBUG_DONT_SHARE_POOLS + When enabled, pools of similar sizes are not merged unless the have the + exact same name. It corresponds to "-dMno-merge" at boot time. + +DEBUG_UAF + When enabled, pools are disabled and all allocations and releases pass + through mmap() and munmap(). The memory usage significantly inflates + and the performance degrades, but this allows to detect a lot of + use-after-free conditions by crashing the program at the first abnormal + access. This should not be used in production. + +DEBUG_POOL_INTEGRITY + When enabled, objects picked from the cache are checked for corruption + by comparing their contents against a pattern that was placed when they + were inserted into the cache. Objects are also allocated in the reverse + order, from the oldest one to the most recent, so as to maximize the + ability to detect such a corruption. The goal is to detect writes after + free (or possibly hardware memory corruptions). Contrary to DEBUG_UAF + this cannot detect reads after free, but may possibly detect later + corruptions and will not consume extra memory. The CPU usage will + increase a bit due to the cost of filling/checking the area and for the + preference for cold cache instead of hot cache, though not as much as + with DEBUG_UAF. This option is meant to be usable in production. It + corresponds to boot-time options "-dMcold-first,integrity". + +DEBUG_POOL_TRACING + When enabled, the callers of pool_alloc() and pool_free() will be + recorded into an extra memory area placed after the end of the object. + This may only be required by developers who want to get a few more + hints about code paths involved in some crashes, but will serve no + purpose outside of this. It remains compatible (and completes well) + DEBUG_POOL_INTEGRITY above. Such information become meaningless once + the objects leave the thread-local cache. It corresponds to boot-time + option "-dMcaller". + +DEBUG_MEM_STATS + When enabled, all malloc/calloc/realloc/strdup/free calls are accounted + for per call place (file+line number), and may be displayed or reset on + the CLI using "debug dev memstats". This is essentially used to detect + potential leaks or abnormal usages. When pools are enabled (default), + such calls are rare and the output will mostly contain calls induced by + libraries. When pools are disabled, about all calls to pool_alloc() and + pool_free() will also appear since they will be remapped to standard + functions. + +CONFIG_HAP_GLOBAL_POOLS + When enabled, process-wide shared pools will be forcefully enabled even + if not considered useful on the platform. The default is to let haproxy + decide based on the OS and C library. It corresponds to boot-time + option "-dMglobal". + +CONFIG_HAP_NO_GLOBAL_POOLS + When enabled, process-wide shared pools will be forcefully disabled + even if considered useful on the platform. The default is to let + haproxy decide based on the OS and C library. It corresponds to + boot-time option "-dMno-global". + +CONFIG_HAP_POOL_CACHE_SIZE + This allows one to define the size of the per-thread cache, in bytes. + The default value is 512 kB (524288). Smaller values will use less + memory at the expense of a possibly higher CPU usage when using many + threads. Higher values will give diminishing returns on performance + while using much more memory. Usually there is no benefit in using + more than a per-core L2 cache size. It would be better not to set this + value lower than a few times the size of a buffer (bufsize, defaults to + 16 kB). + +CONFIG_HAP_POOL_CLUSTER_SIZE + This allows one to define the maximum number of objects that will be + groupped together in an allocation from the shared pool. Values 4 to 8 + have experimentally shown good results with 16 threads. On systems with + more cores or loosely coupled caches exhibiting slow atomic operations, + it could possibly make sense to slightly increase this value. diff --git a/doc/internals/api/scheduler.txt b/doc/internals/api/scheduler.txt new file mode 100644 index 0000000..3469543 --- /dev/null +++ b/doc/internals/api/scheduler.txt @@ -0,0 +1,226 @@ +2021-11-17 - Scheduler API + + +1. Background +------------- + +The scheduler relies on two major parts: + - the wait queue or timers queue, which contains an ordered tree of the next + timers to expire + + - the run queue, which contains tasks that were already woken up and are + waiting for a CPU slot to execute. + +There are two types of schedulable objects in HAProxy: + - tasks: they contain one timer and can be in the run queue without leaving + their place in the timers queue. + + - tasklets: they do not have the timers part and are either sleeping or + running. + +Both the timers queue and run queue in fact exist both shared between all +threads and per-thread. A task or tasklet may only be queued in a single of +each at a time. The thread-local queues are not thread-safe while the shared +ones are. This means that it is only permitted to manipulate an object which +is in the local queue or in a shared queue, but then after locking it. As such +tasks and tasklets are usually pinned to threads and do not move, or only in +very specific ways not detailed here. + +In case of doubt, keep in mind that it's not permitted to manipulate another +thread's private task or tasklet, and that any task held by another thread +might vanish while it's being looked at. + +Internally a large part of the task and tasklet struct is shared between +the two types, which reduces code duplication and eases the preservation +of fairness in the run queue by interleaving all of them. As such, some +fields or flags may not always be relevant to tasklets and may be ignored. + + +Tasklets do not use a thread mask but use a thread ID instead, to which they +are bound. If the thread ID is negative, the tasklet is not bound but may only +be run on the calling thread. + + +2. API +------ + +There are few functions exposed by the scheduler. A few more ones are in fact +accessible but if not documented there they'd rather be avoided or used only +when absolutely certain they're suitable, as some have delicate corner cases. +In doubt, checking the sched.pdf diagram may help. + +int total_run_queues() + Return the approximate number of tasks in run queues. This is racy + and a bit inaccurate as it iterates over all queues, but it is + sufficient for stats reporting. + +int task_in_rq(t) + Return non-zero if the designated task is in the run queue (i.e. it was + already woken up). + +int task_in_wq(t) + Return non-zero if the designated task is in the timers queue (i.e. it + has a valid timeout and will eventually expire). + +int thread_has_tasks() + Return non-zero if the current thread has some work to be done in the + run queue. This is used to decide whether or not to sleep in poll(). + +void task_wakeup(t, f) + Will make sure task <t> will wake up, that is, will execute at least + once after the start of the function is called. The task flags <f> will + be ORed on the task's state, among TASK_WOKEN_* flags exclusively. In + multi-threaded environments it is safe to wake up another thread's task + and even if the thread is sleeping it will be woken up. Users have to + keep in mind that a task running on another thread might very well + finish and go back to sleep before the function returns. It is + permitted to wake the current task up, in which case it will be + scheduled to run another time after it returns to the scheduler. + +struct task *task_unlink_wq(t) + Remove the task from the timers queue if it was in it, and return it. + It may only be done for the local thread, or for a shared thread that + might be in the shared queue. It must not be done for another thread's + task. + +void task_queue(t) + Place or update task <t> into the timers queue, where it may already + be, scheduling it for an expiration at date t->expire. If t->expire is + infinite, nothing is done, so it's safe to call this function without + prior checking the expiration date. It is only valid to call this + function for local tasks or for shared tasks who have the calling + thread in their thread mask. + +void task_set_affinity(t, m) + Change task <t>'s thread_mask to new value <m>. This may only be + performed by the task itself while running. This is only used to let a + task voluntarily migrate to another thread. + +void tasklet_wakeup(tl) + Make sure that tasklet <tl> will wake up, that is, will execute at + least once. The tasklet will run on its assigned thread, or on any + thread if its TID is negative. + +void tasklet_wakeup_on(tl, thr) + Make sure that tasklet <tl> will wake up on thread <thr>, that is, will + execute at least once. The designated thread may only differ from the + calling one if the tasklet is already configured to run on another + thread, and it is not permitted to self-assign a tasklet if its tid is + negative, as it may already be scheduled to run somewhere else. Just in + case, only use tasklet_wakeup() which will pick the tasklet's assigned + thread ID. + +struct tasklet *tasklet_new() + Allocate a new tasklet and set it to run by default on the calling + thread. The caller may change its tid to another one before using it. + The new tasklet is returned. + +struct task *task_new_anywhere() + Allocate a new task to run on any thread, and return the task, or NULL + in case of allocation issue. Note that such tasks will be marked as + shared and will go through the locked queues, thus their activity will + be heavier than for other ones. See also task_new_here(). + +struct task *task_new_here() + Allocate a new task to run on the calling thread, and return the task, + or NULL in case of allocation issue. + +struct task *task_new_on(t) + Allocate a new task to run on thread <t>, and return the task, or NULL + in case of allocation issue. + +void task_destroy(t) + Destroy this task. The task will be unlinked from any timers queue, + and either immediately freed, or asynchronously killed if currently + running. This may only be done by one of the threads this task is + allowed to run on. Developers must not forget that the task's memory + area is not always immediately freed, and that certain misuses could + only have effect later down the chain (e.g. use-after-free). + +void tasklet_free() + Free this tasklet, which must not be running, so that may only be + called by the thread responsible for the tasklet, typically the + tasklet's process() function itself. + +void task_schedule(t, d) + Schedule task <t> to run no later than date <d>. If the task is already + running, or scheduled for an earlier instant, nothing is done. If the + task was not in queued or was scheduled to run later, its timer entry + will be updated. This function assumes that it will never be called + with a timer in the past nor with TICK_ETERNITY. Only one of the + threads assigned to the task may call this function. + +The task's ->process() function receives the following arguments: + + - struct task *t: a pointer to the task itself. It is always valid. + + - void *ctx : a copy of the task's ->context pointer at the moment + the ->process() function was called by the scheduler. A + function must use this and not task->context, because + task->context might possibly be changed by another thread. + For instance, the muxes' takeover() function do this. + + - uint state : a copy of the task's ->state field at the moment the + ->process() function was executed. A function must use + this and not task->state as the latter misses the wakeup + reasons and may constantly change during execution along + concurrent wakeups (threads or signals). + +The possible state flags to use during a call to task_wakeup() or seen by the +task being called are the following; they're automatically cleaned from the +state field before the call to ->process() + + - TASK_WOKEN_INIT each creation of a task causes a first wakeup with this + flag set. Applications should not set it themselves. + + - TASK_WOKEN_TIMER this indicates the task's expire date was reached in the + timers queue. Applications should not set it themselves. + + - TASK_WOKEN_IO indicates the wake-up happened due to I/O activity. Now + that all low-level I/O processing happens on tasklets, + this notion of I/O is now application-defined (for + example stream-interfaces use it to notify the stream). + + - TASK_WOKEN_SIGNAL indicates that a signal the task was subscribed to was + received. Applications should not set it themselves. + + - TASK_WOKEN_MSG any application-defined wake-up reason, usually for + inter-task communication (e.g filters vs streams). + + - TASK_WOKEN_RES a resource the task was waiting for was finally made + available, allowing the task to continue its work. This + is essentially used by buffers and queues. Applications + may carefully use it for their own purpose if they're + certain not to rely on existing ones. + + - TASK_WOKEN_OTHER any other application-defined wake-up reason. + + +In addition, a few persistent flags may be observed or manipulated by the +application, both for tasks and tasklets: + + - TASK_SELF_WAKING when set, indicates that this task was found waking + itself up, and its class will change to bulk processing. + If this behavior is under control temporarily expected, + and it is not expected to happen again, it may make + sense to reset this flag from the ->process() function + itself. + + - TASK_HEAVY when set, indicates that this task does so heavy + processing that it will become mandatory to give back + control to I/Os otherwise big latencies might occur. It + may be set by an application that expects something + heavy to happen (tens to hundreds of microseconds), and + reset once finished. An example of user is the TLS stack + which sets it when an imminent crypto operation is + expected. + + - TASK_F_USR1 This is the first application-defined persistent flag. + It is always zero unless the application changes it. An + example of use cases is the I/O handler for backend + connections, to mention whether the connection is safe + to use or might have recently been migrated. + +Finally, when built with -DDEBUG_TASK, an extra sub-structure "debug" is added +to both tasks and tasklets to note the code locations of the last two calls to +task_wakeup() and tasklet_wakeup(). diff --git a/doc/internals/body-parsing.txt b/doc/internals/body-parsing.txt new file mode 100644 index 0000000..be209af --- /dev/null +++ b/doc/internals/body-parsing.txt @@ -0,0 +1,165 @@ +2014/04/16 - Pointer assignments during processing of the HTTP body + +In HAProxy, a struct http_msg is a descriptor for an HTTP message, which stores +the state of an HTTP parser at any given instant, relative to a buffer which +contains part of the message being inspected. + +Currently, an http_msg holds a few pointers and offsets to some important +locations in a message depending on the state the parser is in. Some of these +pointers and offsets may move when data are inserted into or removed from the +buffer, others won't move. + +An important point is that the state of the parser only translates what the +parser is reading, and not at all what is being done on the message (eg: +forwarding). + +For an HTTP message <msg> and a buffer <buf>, we have the following elements +to work with : + + +Buffer : +-------- + +buf.size : the allocated size of the buffer. A message cannot be larger than + this size. In general, a message will even be smaller because the + size is almost always reduced by global.maxrewrite bytes. + +buf.data : memory area containing the part of the message being worked on. This + area is exactly <buf.size> bytes long. It should be seen as a sliding + window over the message, but in terms of implementation, it's closer + to a wrapping window. For ease of processing, new messages (requests + or responses) are aligned to the beginning of the buffer so that they + never wrap and common string processing functions can be used. + +buf.p : memory pointer (char *) to the beginning of the buffer as the parser + understands it. It commonly refers to the first character of an HTTP + request or response, but during forwarding, it can point to other + locations. This pointer always points to a location in <buf.data>. + +buf.i : number of bytes after <buf.p> that are available in the buffer. If + <buf.p + buf.i> exceeds <buf.data + buf.size>, then the pending data + wrap at the end of the buffer and continue at <buf.data>. + +buf.o : number of bytes already processed before <buf.p> that are pending + for departure. These bytes may leave at any instant once a connection + is established. These ones may wrap before <buf.data> to start before + <buf.data + buf.size>. + +It's common to call the part between buf.p and buf.p+buf.i the input buffer, and +the part between buf.p-buf.o and buf.p the output buffer. This design permits +efficient forwarding without copies. As a result, forwarding one byte from the +input buffer to the output buffer only consists in : + - incrementing buf.p + - incrementing buf.o + - decrementing buf.i + + +Message : +--------- +Unless stated otherwise, all values are relative to <buf.p>, and are always +comprised between 0 and <buf.i>. These values are relative offsets and they do +not need to take wrapping into account, they are used as if the buffer was an +infinite length sliding window. The buffer management functions handle the +wrapping automatically. + +msg.next : points to the next byte to inspect. This offset is automatically + adjusted when inserting/removing some headers. In data states, it is + automatically adjusted to the number of bytes already inspected. + +msg.sov : start of value. First character of the header's value in the header + states, start of the body in the data states. Strictly positive + values indicate that headers were not forwarded yet (<buf.p> is + before the start of the body), and null or negative values are seen + after headers are forwarded (<buf.p> is at or past the start of the + body). The value stops changing when data start to leave the buffer + (in order to avoid integer overflows). So the maximum possible range + is -<buf.size> to +<buf.size>. This offset is automatically adjusted + when inserting or removing some headers. It is useful to rewind the + request buffer to the beginning of the body at any phase. The + response buffer does not really use it since it is immediately + forwarded to the client. + +msg.sol : start of line. Points to the beginning of the current header line + while parsing headers. It is cleared to zero in the BODY state, + and contains exactly the number of bytes comprising the preceding + chunk size in the DATA state (which can be zero), so that the sum of + msg.sov + msg.sol always points to the beginning of data for all + states starting with DATA. For chunked encoded messages, this sum + always corresponds to the beginning of the current chunk of data as + it appears in the buffer, or to be more precise, it corresponds to + the first of the remaining bytes of chunked data to be inspected. In + TRAILERS state, it contains the length of the last parsed part of + the trailer headers. + +msg.eoh : end of headers. Points to the CRLF (or LF) preceding the body and + marking the end of headers. It is where new headers are appended. + This offset is automatically adjusted when inserting/removing some + headers. It always contains the size of the headers excluding the + trailing CRLF even after headers have been forwarded. + +msg.eol : end of line. Points to the CRLF or LF of the current header line + being inspected during the various header states. In data states, it + holds the trailing CRLF length (1 or 2) so that msg.eoh + msg.eol + always equals the exact header length. It is not affected during data + states nor by forwarding. + +The beginning of the message headers can always be found this way even after +headers or data have been forwarded, provided that everything is still present +in the buffer : + + headers = buf.p + msg->sov - msg->eoh - msg->eol + + +Message length : +---------------- +msg.chunk_len : amount of bytes of the current chunk or total message body + remaining to be inspected after msg.next. It is automatically + incremented when parsing a chunk size, and decremented as data + are forwarded. + +msg.body_len : total message body length, for logging. Equals Content-Length + when used, otherwise is the sum of all correctly parsed chunks. + + +Message state : +--------------- +msg.msg_state contains the current parser state, one of HTTP_MSG_*. The state +indicates what byte is expected at msg->next. + +HTTP_MSG_BODY : all headers have been parsed, parsing of body has not + started yet. + +HTTP_MSG_100_SENT : parsing of body has started. If a 100-Continue was needed + it has already been sent. + +HTTP_MSG_DATA : some bytes are remaining for either the whole body when + the message size is determined by Content-Length, or for + the current chunk in chunked-encoded mode. + +HTTP_MSG_CHUNK_CRLF : msg->next points to the CRLF after the current data chunk. + +HTTP_MSG_TRAILERS : msg->next points to the beginning of a possibly empty + trailer line after the final empty chunk. + +HTTP_MSG_DONE : all the Content-Length data has been inspected, or the + final CRLF after trailers has been met. + + +Message forwarding : +-------------------- +Forwarding part of a message consists in advancing buf.p up to the point where +it points to the byte following the last one to be forwarded. This can be done +inline if enough bytes are present in the buffer, or in multiple steps if more +buffers need to be forwarded (possibly including splicing). Thus by definition, +after a block has been scheduled for being forwarded, msg->next and msg->sov +must be reset. + +The communication channel between the producer and the consumer holds a counter +of extra bytes remaining to be forwarded directly without consulting analysers, +after buf.p. This counter is called to_forward. It commonly holds the advertised +chunk length or content-length that does not fit in the buffer. For example, if +2000 bytes are to be forwarded, and 10 bytes are present after buf.p as reported +by buf.i, then both buf.o and buf.p will advance by 10, buf.i will be reset, and +to_forward will be set to 1990 so that in total, 2000 bytes will be forwarded. +At the end of the forwarding, buf.p will point to the first byte to be inspected +after the 2000 forwarded bytes. diff --git a/doc/internals/connect-status.txt b/doc/internals/connect-status.txt new file mode 100644 index 0000000..70bbcc5 --- /dev/null +++ b/doc/internals/connect-status.txt @@ -0,0 +1,28 @@ +Normally, we should use getsockopt(fd, SOL_SOCKET, SO_ERROR) on a pending +connect() to detect whether the connection correctly established or not. + +Unfortunately, getsockopt() does not report the status of a pending connection, +which means that it returns 0 if the connection is still pending. This has to +be expected, because as the name implies it, it only returns errors. + +With the speculative I/O, a new problem was introduced : if we pretend the +socket was indicated as ready and we go to the socket's write() function, +a pending connection will then inevitably be identified as established. + +In fact, there are solutions to this issue : + + - send() returns -EAGAIN if it cannot write, so that as long as there are + pending data in the buffer, we'll be informed about the status of the + connection + + - connect() on an already pending connection will return -1 with errno set to + one of the following values : + - EALREADY : connection already in progress + - EISCONN : connection already established + - anything else will indicate an error. + +=> So instead of using getsockopt() on a pending connection with no data, we + will switch to connect(). This implies that the connection address must be + known within the socket's write() function. + + diff --git a/doc/internals/connection-header.txt b/doc/internals/connection-header.txt new file mode 100644 index 0000000..b74cea0 --- /dev/null +++ b/doc/internals/connection-header.txt @@ -0,0 +1,196 @@ +2010/01/16 - Connection header adjustments depending on the transaction mode. + + +HTTP transactions supports 5 possible modes : + + WANT_TUN : default, nothing changed + WANT_TUN + httpclose : headers set for close in both dirs + WANT_KAL : keep-alive desired in both dirs + WANT_SCL : want close with the server and KA with the client + WANT_CLO : want close on both sides. + +When only WANT_TUN is set, nothing is changed nor analysed, so for commodity +below, we'll refer to WANT_TUN+httpclose as WANT_TUN. + +The mode is adjusted in 3 steps : + - configuration sets initial mode + - request headers set required request mode + - response headers set the final mode + + +1) Adjusting the initial mode via the configuration + + option httpclose => TUN + option http-keep-alive => KAL + option http-server-close => SCL + option forceclose => CLO + +Note that option httpclose combined with any other option is equivalent to +forceclose. + + +2) Adjusting the request mode once the request is parsed + +If we cannot determine the body length from the headers, we set the mode to CLO +but later we'll switch to tunnel mode once forwarding the body. That way, all +parties are informed of the correct mode. + +Depending on the request version and request Connection header, we may have to +adjust the current transaction mode and to update the connection header. + +mode req_ver req_hdr new_mode hdr_change +TUN 1.0 - TUN - +TUN 1.0 ka TUN del_ka +TUN 1.0 close TUN del_close +TUN 1.0 both TUN del_ka, del_close + +TUN 1.1 - TUN add_close +TUN 1.1 ka TUN del_ka, add_close +TUN 1.1 close TUN - +TUN 1.1 both TUN del_ka + +KAL 1.0 - CLO - +KAL 1.0 ka KAL - +KAL 1.0 close CLO del_close +KAL 1.0 both CLO del_ka, del_close + +KAL 1.1 - KAL - +KAL 1.1 ka KAL del_ka +KAL 1.1 close CLO - +KAL 1.1 both CLO del_ka + +SCL 1.0 - CLO - +SCL 1.0 ka SCL del_ka +SCL 1.0 close CLO del_close +SCL 1.0 both CLO del_ka, del_close + +SCL 1.1 - SCL add_close +SCL 1.1 ka SCL del_ka, add_close +SCL 1.1 close CLO - +SCL 1.1 both CLO del_ka + +CLO 1.0 - CLO - +CLO 1.0 ka CLO del_ka +CLO 1.0 close CLO del_close +CLO 1.0 both CLO del_ka, del_close + +CLO 1.1 - CLO add_close +CLO 1.1 ka CLO del_ka, add_close +CLO 1.1 close CLO - +CLO 1.1 both CLO del_ka + +=> Summary: + - KAL and SCL are only possible with the same requests : + - 1.0 + ka + - 1.1 + ka or nothing + + - CLO is assumed for any non-TUN request which contains at least a close + header, as well as for any 1.0 request without a keep-alive header. + + - del_ka is set whenever we want a CLO or SCL or TUN and req contains a KA, + or when the req is 1.1 and contains a KA. + + - del_close is set whenever a 1.0 request contains a close. + + - add_close is set whenever a 1.1 request must be switched to TUN, SCL, CLO + and did not have a close hdr. + +Note that the request processing is performed in two passes, one with the +frontend's config and a second one with the backend's config. It is only +possible to "raise" the mode between them, so during the second pass, we have +no reason to re-add a header that we previously removed. As an exception, the +TUN mode is converted to CLO once combined because in fact it's an httpclose +option set on a TUN mode connection : + + BE (2) + | TUN KAL SCL CLO + ----+----+----+----+---- + TUN | TUN CLO CLO CLO + + + KAL | CLO KAL SCL CLO + FE + + (1) SCL | CLO SCL SCL CLO + + + CLO | CLO CLO CLO CLO + + +3) Adjusting the final mode once the response is parsed + +This part becomes trickier. It is possible that the server responds with a +version that the client does not necessarily understand. Obviously, 1.1 clients +are asusmed to understand 1.0 responses. The problematic case is a 1.0 client +receiving a 1.1 response without any Connection header. Some 1.0 clients might +know that in 1.1 this means "keep-alive" while others might ignore the version +and assume a "close". Since we know the version on both sides, we may have to +adjust some responses to remove any ambiguous case. That's the reason why the +following table considers both the request and the response version. If the +response length cannot be determined, we switch to CLO mode. + +mode res_ver res_hdr req_ver new_mode hdr_change +TUN 1.0 - any TUN - +TUN 1.0 ka any TUN del_ka +TUN 1.0 close any TUN del_close +TUN 1.0 both any TUN del_ka, del_close + +TUN 1.1 - any TUN add_close +TUN 1.1 ka any TUN del_ka, add_close +TUN 1.1 close any TUN - +TUN 1.1 both any TUN del_ka + +KAL 1.0 - any SCL add_ka +KAL 1.0 ka any KAL - +KAL 1.0 close any SCL del_close, add_ka +KAL 1.0 both any SCL del_close + +KAL 1.1 - 1.0 KAL add_ka +KAL 1.1 - 1.1 KAL - +KAL 1.1 ka 1.0 KAL - +KAL 1.1 ka 1.1 KAL del_ka +KAL 1.1 close 1.0 SCL del_close, add_ka +KAL 1.1 close 1.1 SCL del_close +KAL 1.1 both 1.0 SCL del_close +KAL 1.1 both 1.1 SCL del_ka, del_close + +SCL 1.0 - any SCL add_ka +SCL 1.0 ka any SCL - +SCL 1.0 close any SCL del_close, add_ka +SCL 1.0 both any SCL del_close + +SCL 1.1 - 1.0 SCL add_ka +SCL 1.1 - 1.1 SCL - +SCL 1.1 ka 1.0 SCL - +SCL 1.1 ka 1.1 SCL del_ka +SCL 1.1 close 1.0 SCL del_close, add_ka +SCL 1.1 close 1.1 SCL del_close +SCL 1.1 both 1.0 SCL del_close +SCL 1.1 both 1.1 SCL del_ka, del_close + +CLO 1.0 - any CLO - +CLO 1.0 ka any CLO del_ka +CLO 1.0 close any CLO del_close +CLO 1.0 both any CLO del_ka, del_close + +CLO 1.1 - any CLO add_close +CLO 1.1 ka any CLO del_ka, add_close +CLO 1.1 close any CLO - +CLO 1.1 both any CLO del_ka + +=> in summary : + - the header operations do not depend on the initial mode, they only depend + on versions and current connection header(s). + + - both CLO and TUN modes work similarly, they need to set a close mode on the + response. A 1.1 response will exclusively need the close header, while a 1.0 + response will have it removed. Any keep-alive header is always removed when + found. + + - a KAL request where the server wants to close turns into an SCL response so + that we release the server but still maintain the connection to the client. + + - the KAL and SCL modes work the same way as we need to set keep-alive on the + response. So a 1.0 response will only have the keep-alive header with any + close header removed. A 1.1 response will have the keep-alive header added + for 1.0 requests and the close header removed for all requests. + +Note that the SCL and CLO modes will automatically cause the server connection +to be closed at the end of the data transfer. diff --git a/doc/internals/connection-scale.txt b/doc/internals/connection-scale.txt new file mode 100644 index 0000000..7c3d902 --- /dev/null +++ b/doc/internals/connection-scale.txt @@ -0,0 +1,44 @@ +Problčme des connexions simultanées avec un backend + +Pour chaque serveur, 3 cas possibles : + + - pas de limite (par défaut) + - limite statique (maxconn) + - limite dynamique (maxconn/(ratio de px->conn), avec minconn) + +On a donc besoin d'une limite sur le proxy dans le cas de la limite +dynamique, afin de fixer un seuil et un ratio. Ce qui compte, c'est +le point aprčs lequel on passe d'un régime linéaire ŕ un régime +saturé. + +On a donc 3 phases : + + - régime minimal (0..srv->minconn) + - régime linéaire (srv->minconn..srv->maxconn) + - régime saturé (srv->maxconn..) + +Le minconn pourrait aussi ressortir du serveur ? +En pratique, on veut : + - un max par serveur + - un seuil global auquel les serveurs appliquent le max + - un seuil minimal en-dessous duquel le nb de conn est + maintenu. Cette limite a un sens par serveur (jamais moins de X conns) + mais aussi en global (pas la peine de faire du dynamique en dessous de + X conns ŕ répartir). La difficulté en global, c'est de savoir comment + on calcule le nombre min associé ŕ chaque serveur, vu que c'est un ratio + défini ŕ partir du max. + +Ca revient ŕ peu prčs ŕ la męme chose que de faire 2 états : + + - régime linéaire avec un offset (srv->minconn..srv->maxconn) + - régime saturé (srv->maxconn..) + +Sauf que dans ce cas, le min et le max sont bien par serveur, et le seuil est +global et correspond ŕ la limite de connexions au-delŕ de laquel on veut +tourner ŕ plein régime sur l'ensemble des serveurs. On peut donc parler de +passage en mode "full", "saturated", "optimal". On peut également parler de +la fin de la partie "scalable", "dynamique". + +=> fullconn 1000 par exemple ? + + diff --git a/doc/internals/entities-v2.txt b/doc/internals/entities-v2.txt new file mode 100644 index 0000000..86782c3 --- /dev/null +++ b/doc/internals/entities-v2.txt @@ -0,0 +1,193 @@ +An FD has a state : + - CLOSED + - READY + - ERROR (?) + - LISTEN (?) + +A connection has a state : + - CLOSED + - ACCEPTED + - CONNECTING + - ESTABLISHED + - ERROR + +A stream interface has a state : + - INI, REQ, QUE, TAR, ASS, CON, CER, EST, DIS, CLO + +Note that CON and CER might be replaced by EST if the connection state is used +instead. CON might even be more suited than EST to indicate that a connection +is known. + + +si_shutw() must do : + + data_shutw() + if (shutr) { + data_close() + ctrl_shutw() + ctrl_close() + } + +si_shutr() must do : + data_shutr() + if (shutw) { + data_close() + ctrl_shutr() + ctrl_close() + } + +Each of these steps may fail, in which case the step must be retained and the +operations postponed in an asynchronous task. + +The first asynchronous data_shut() might already fail so it is mandatory to +save the other side's status with the connection in order to let the async task +know whether the 3 next steps must be performed. + +The connection (or perhaps the FD) needs to know : + - the desired close operations : DSHR, DSHW, CSHR, CSHW + - the completed close operations : DSHR, DSHW, CSHR, CSHW + + +On the accept() side, we probably need to know : + - if a header is expected (eg: accept-proxy) + - if this header is still being waited for + => maybe both info might be combined into one bit + + - if a data-layer accept() is expected + - if a data-layer accept() has been started + - if a data-layer accept() has been performed + => possibly 2 bits, to indicate the need to free() + +On the connect() side, we need to know : + - the desire to send a header (eg: send-proxy) + - if this header has been sent + => maybe both info might be combined + + - if a data-layer connect() is expected + - if a data-layer connect() has been started + - if a data-layer connect() has been completed + => possibly 2 bits, to indicate the need to free() + +On the response side, we also need to know : + - the desire to send a header (eg: health check response for monitor-net) + - if this header was sent + => might be the same as sending a header over a new connection + +Note: monitor-net has precedence over proxy proto and data layers. Same for + health mode. + +For multi-step operations, use 2 bits : + 00 = operation not desired, not performed + 10 = operation desired, not started + 11 = operation desired, started but not completed + 01 = operation desired, started and completed + + => X != 00 ==> operation desired + X & 01 ==> operation at least started + X & 10 ==> operation not completed + +Note: no way to store status information for error reporting. + +Note2: it would be nice if "tcp-request connection" rules could work at the +connection level, just after headers ! This means support for tracking stick +tables, possibly not too much complicated. + + +Proposal for incoming connection sequence : + +- accept() +- if monitor-net matches or if mode health => try to send response +- if accept-proxy, wait for proxy request +- if tcp-request connection, process tcp rules and possibly keep the + pointer to stick-table +- if SSL is enabled, switch to SSL handshake +- then switch to DATA state and instantiate a session + +We just need a map of handshake handlers on the connection. They all manage the +FD status themselves and set the callbacks themselves. If their work succeeds, +they remove themselves from the list. If it fails, they remain subscribed and +enable the required polling until they are woken up again or the timeout strikes. + +Identified handshake handlers for incoming connections : + - HH_HEALTH (tries to send OK and dies) + - HH_MONITOR_IN (matches src IP and adds/removes HH_SEND_OK/HH_SEND_HTTP_OK) + - HH_SEND_OK (tries to send "OK" and dies) + - HH_SEND_HTTP_OK (tries to send "HTTP/1.0 200 OK" and dies) + - HH_ACCEPT_PROXY (waits for PROXY line and parses it) + - HH_TCP_RULES (processes TCP rules) + - HH_SSL_HS (starts SSL handshake) + - HH_ACCEPT_SESSION (instantiates a session) + +Identified handshake handlers for outgoing connections : + - HH_SEND_PROXY (tries to build and send the PROXY line) + - HH_SSL_HS (starts SSL handshake) + +For the pollers, we could check that handshake handlers are not 0 and decide to +call a generic connection handshake handler instead of usual callbacks. Problem +is that pollers don't know connections, they know fds. So entities which manage +handlers should update change the FD callbacks accordingly. + +With a bit of care, we could have : + - HH_SEND_LAST_CHUNK (sends the chunk pointed to by a pointer and dies) + => merges HEALTH, SEND_OK and SEND_HTTP_OK + +It sounds like the ctrl vs data state for the connection are per-direction +(eg: support an async ctrl shutw while still reading data). + +Also support shutr/shutw status at L4/L7. + +In practice, what we really need is : + +shutdown(conn) = + conn.data.shut() + conn.ctrl.shut() + conn.fd.shut() + +close(conn) = + conn.data.close() + conn.ctrl.close() + conn.fd.close() + +With SSL over Remote TCP (RTCP + RSSL) to reach the server, we would have : + + HTTP -> RTCP+RSSL connection <-> RTCP+RRAW connection -> TCP+SSL connection + +The connection has to be closed at 3 places after a successful response : + - DATA (RSSL over RTCP) + - CTRL (RTCP to close connection to server) + - SOCK (FD to close connection to second process) + +Externally, the connection is seen with very few flags : + - SHR + - SHW + - ERR + +We don't need a CLOSED flag as a connection must always be detached when it's closed. + +The internal status doesn't need to be exposed : + - FD allocated (Y/N) + - CTRL initialized (Y/N) + - CTRL connected (Y/N) + - CTRL handlers done (Y/N) + - CTRL failed (Y/N) + - CTRL shutr (Y/N) + - CTRL shutw (Y/N) + - DATA initialized (Y/N) + - DATA connected (Y/N) + - DATA handlers done (Y/N) + - DATA failed (Y/N) + - DATA shutr (Y/N) + - DATA shutw (Y/N) + +(note that having flags for operations needing to be completed might be easier) +-------------- + +Maybe we need to be able to call conn->fdset() and conn->fdclr() but it sounds +very unlikely since the only functions manipulating this are in the code of +the data/ctrl handlers. + +FDSET/FDCLR cannot be directly controlled by the stream interface since it also +depends on the DATA layer (WANT_READ/wANT_WRITE). + +But FDSET/FDCLR is probably controlled by who owns the connection (eg: DATA). + diff --git a/doc/internals/entities.txt b/doc/internals/entities.txt new file mode 100644 index 0000000..cdde82e --- /dev/null +++ b/doc/internals/entities.txt @@ -0,0 +1,96 @@ +2011/02/25 - Description of the different entities in haproxy - w@1wt.eu + + +1) Definitions +-------------- + +Listener +-------- + +A listener is the entity which is part of a frontend and which accepts +connections. There are as many listeners as there are ip:port couples. +There is at least one listener instantiated for each "bind" entry, and +port ranges will lead to as many listeners as there are ports in the +range. A listener just has a listening file descriptor ready to accept +incoming connections and to dispatch them to upper layers. + + +Initiator +--------- + +An initiator is instantiated for each incoming connection on a listener. It may +also be instantiated by a task pretending to be a client. An initiator calls +the next stage's accept() callback to present it with the parameters of the +incoming connection. + + +Session +------- + +A session is the only entity located between an initiator and a connector. +This is the last stage which offers an accept() callback, and all of its +processing will continue with the next stage's connect() callback. It holds +the buffers needed to forward the protocol data between each side. This entity +sees the native protocol, and is able to call analysers on these buffers. As it +is used in both directions, it always has two buffers. + +When transformations are required, some of them may be done on the initiator +side and other ones on the connector side. If additional buffers are needed for +such transforms, those buffers cannot replace the session's buffers, but they +may complete them. + +A session only needs to be instantiated when forwarding of data is required +between two sides. Accepting and filtering on layer 4 information only does not +require a session. + +For instance, let's consider the case of a proxy which receives and decodes +HTTPS traffic, processes it as HTTP and recodes it as HTTPS before forwarding +it. We'd have 3 layers of buffers, where the middle ones are used for +forwarding of the protocol data (HTTP here) : + + <-- ssl dec --> <-forwarding-> <-- ssl enc --> + + ,->[||||]--. ,->[||||]--. ,->[||||]--. + client (|) (|) (|) (|) server + ^--[||||]<-' ^--[||||]<-' ^--[||||]<-' + + HTTPS HTTP HTTPS + +The session handling code is only responsible for monitoring the forwarding +buffers here. It may declare the end of the session once those buffers are +closed and no analyser wants to re-open them. The session is also the entity +which applies the load balancing algorithm and decides the server to use. + +The other sides are responsible for propagating the state up to the session +which takes decisions. + + +Connector +--------- + +A connector is the entity which permits to instantiate a connection to a known +destination. It presents a connect() callback, and as such appears on the right +side of diagrams. + + +Connection +---------- + +A connection is the entity instantiated by a connector. It may be composed of +multiple stages linked together. Generally it is the part of the stream +interface holding a file descriptor, but it can also be a processing block or a +transformation block terminated by a connection. A connection presents a +server-side interface. + + +2) Sequencing +------------- + +Upon startup, listeners are instantiated by the configuration. When an incoming +connection reaches a listening file descriptor, its read() callback calls the +corresponding listener's accept() function which instantiates an initiator and +in turn recursively calls upper layers' accept() callbacks until +accept_session() is called. accept_session() instantiates a new session which +starts protocol analysis via process_session(). When all protocol analysis is +done, process_session() calls the connect() callback of the connector in order +to get a connection. diff --git a/doc/internals/fd-migration.txt b/doc/internals/fd-migration.txt new file mode 100644 index 0000000..aaddad3 --- /dev/null +++ b/doc/internals/fd-migration.txt @@ -0,0 +1,138 @@ +2021-07-30 - File descriptor migration between threads + +An FD migration may happen on any idle connection that experiences a takeover() +operation by another thread. In this case the acting thread becomes the owner +of the connection (and FD) while previous one(s) need to forget about it. + +File descriptor migration between threads is a fairly complex operation because +it is required to maintain a durable consistency between the pollers states and +the haproxy's desired state. Indeed, very often the FD is registered within one +thread's poller and that thread might be waiting in the system, so there is no +way to synchronously update it. This is where thread_mask, polled_mask and per +thread updates are used: + + - a thread knows if it's allowed to manipulate an FD by looking at its bit in + the FD's thread_mask ; + + - each thread knows if it was polling an FD by looking at its bit in the + polled_mask field ; a recent migration is usually indicated by a bit being + present in polled_mask and absent from thread_mask. + + - other threads know whether it's safe to take over an FD by looking at the + running mask: if it contains any other thread's bit, then other threads are + using it and it's not safe to take it over. + + - sleeping threads are notified about the need to update their polling via + local or global updates to the FD. Each thread has its own local update + list and its own bit in the update_mask to know whether there are pending + updates for it. This allows to reconverge polling with the desired state + at the last instant before polling. + +While the description above could be seen as "progressive" (it technically is) +in that there is always a transition and convergence period in a migrated FD's +life, functionally speaking it's perfectly atomic thanks to the running bit and +to the per-thread idle connections lock: no takeover is permitted without +holding the idle_conns lock, and takeover may only happen by atomically picking +a connection from the list that is also protected by this lock. In practice, an +FD is never taken over by itself, but always in the context of a connection, +and by atomically removing a connection from an idle list, it is possible to +guarantee that a connection will not be picked, hence that its FD will not be +taken over. + +same thread as list! + +The possible entry points to a race to use a file descriptor are the following +ones, with their respective sequences: + + 1) takeover: requested by conn_backend_get() on behalf of connect_server() + - take the idle_conns_lock, protecting against a parallel access from the + I/O tasklet or timeout task + - pick the first connection from the list + - attempt an fd_takeover() on this connection's fd. Usually it works, + unless a late wakeup of the owning thread shows up in the FD's running + mask. The operation is performed in fd_takeover() using a DWCAS which + tries to switch both running and thread_mask to the caller's tid_bit. A + concurrent bit in running is enough to make it fail. This guarantees + another thread does not wakeup from I/O in the middle of the takeover. + In case of conflict, this FD is skipped and the attempt is tried again + with the next connection. + - resets the task/tasklet contexts to NULL, as a signal that they are not + allowed to run anymore. The tasks retrieve their execution context from + the scheduler in the arguments, but will check the tasks' context from + the structure under the lock to detect this possible change, and abort. + - at this point the takeover succeeded, the idle_conns_lock is released and + the connection and its FD are now owned by the caller + + 2) poll report: happens on late rx, shutdown or error on idle conns + - fd_set_running() is called to atomically set the running_mask and check + that the caller's tid_bit is still present in the thread_mask. Upon + failure the caller arranges itself to stop reporting that FD (e.g. by + immediate removal or by an asynchronous update). Upon success, it's + guaranteed that any concurrent fd_takeover() will fail the DWCAS and that + another connection will need to be picked instead. + - FD's state is possibly updated + - the iocb is called if needed (almost always) + - if the iocb didn't kill the connection, release the bit from running_mask + making the connection possibly available to a subsequent fd_takeover(). + + 3) I/O tasklet, timeout task: timeout or subscribed wakeup + - start by taking the idle_conns_lock, ensuring no takeover() will pick the + same connection from this point. + - check the task/tasklet's context to verify that no recently completed + takeover() stole the connection. If it's NULL, the connection was lost, + the lock is released and the task/tasklet killed. Otherwise it is + guaranteed that no other thread may use that connection (current takeover + candidates are waiting on the lock, previous owners waking from poll() + lost their bit in the thread_mask and will not touch the FD). + - the connection is removed from the idle conns list. From this point on, + no other thread will even find it there nor even try fd_takeover() on it. + - the idle_conns_lock is now released, the connection is protected and its + FD is not reachable by other threads anymore. + - the task does what it has to do + - if the connection is still usable (i.e. not upon timeout), it's inserted + again into the idle conns list, meaning it may instantly be taken over + by a competing thread. + + 4) wake() callback: happens on last user after xfers (may free() the conn) + - the connection is still owned by the caller, it's still subscribed to + polling but the connection is idle thus inactive. Errors or shutdowns + may be reported late, via sock_conn_iocb() and conn_notify_mux(), thus + the running bit is set (i.e. a concurrent fd_takeover() will fail). + - if the connection is in the list, the idle_conns_lock is grabbed, the + connection is removed from the list, and the lock is released. + - mux->wake() is called + - if the connection previously was in the list, it's reinserted under the + idle_conns_lock. + + +With the DWCAS removal between running_mask & thread_mask: + +fd_takeover: + 1 if (!CAS(&running_mask, 0, tid_bit)) + 2 return fail; + 3 atomic_store(&thread_mask, tid_bit); + 4 atomic_and(&running_mask, ~tid_bit); + +poller: + 1 do { + 2 /* read consistent running_mask & thread_mask */ + 3 do { + 4 run = atomic_load(&running_mask); + 5 thr = atomic_load(&thread_mask); + 6 } while (run & ~thr); + 7 + 8 if (!(thr & tid_bit)) { + 9 /* takeover has started */ + 10 goto disable_fd; + 11 } + 12 } while (!CAS(&running_mask, run, run | tid_bit)); + +fd_delete: + 1 atomic_or(&running_mask, tid_bit); + 2 atomic_store(&thread_mask, 0); + 3 atomic_and(&running_mask, ~tid_bit); + +The loop in poller:3-6 is used to make sure the thread_mask we read matches +the last updated running_mask. If nobody can give up on fd_takeover(), it +might even be possible to spin on thread_mask only. Late pollers will not +set running anymore with this. diff --git a/doc/internals/hashing.txt b/doc/internals/hashing.txt new file mode 100644 index 0000000..260b6af --- /dev/null +++ b/doc/internals/hashing.txt @@ -0,0 +1,83 @@ +2013/11/20 - How hashing works internally in haproxy - maddalab@gmail.com + +This document describes how HAProxy implements hashing both map-based and +consistent hashing, both prior to versions 1.5 and the motivation and tests +that were done when providing additional options starting in version 2.4 + +A note on hashing in general, hash functions strive to have little +correlation between input and output. The heart of a hash function is its +mixing step. The behavior of the mixing step largely determines whether the +hash function is collision-resistant. Hash functions that are collision +resistant are more likely to have an even distribution of load. + +The purpose of the mixing function is to spread the effect of each message +bit throughout all the bits of the internal state. Ideally every bit in the +hash state is affected by every bit in the message. And we want to do that +as quickly as possible simply for the sake of program performance. A +function is said to satisfy the strict avalanche criterion if, whenever a +single input bit is complemented (toggled between 0 and 1), each of the +output bits should change with a probability of one half for an arbitrary +selection of the remaining input bits. + +To guard against a combination of hash function and input that results in +high rate of collisions, haproxy implements an avalanche algorithm on the +result of the hashing function. In all versions 1.4 and prior avalanche is +always applied when using the consistent hashing directive. It is intended +to provide quite a good distribution for little input variations. The result +is quite suited to fit over a 32-bit space with enough variations so that +a randomly picked number falls equally before any server position, which is +ideal for consistently hashed backends, a common use case for caches. + +In all versions 1.4 and prior HAProxy implements the SDBM hashing function. +However tests show that alternatives to SDBM have a better cache +distribution on different hashing criteria. Additional tests involving +alternatives for hash input and an option to trigger avalanche, we found +different algorithms perform better on different criteria. DJB2 performs +well when hashing ascii text and is a good choice when hashing on host +header. Other alternatives perform better on numbers and are a good choice +when using source ip. The results also vary by use of the avalanche flag. + +The results of the testing can be found under the tests folder. Here is +a summary of the discussion on the results on 1 input criteria and the +methodology used to generate the results. + +A note of the setup when validating the results independently, one +would want to avoid backend server counts that may skew the results. As +an example with DJB2 avoid 33 servers. Please see the implementations of +the hashing function, which can be found in the links under references. + +The following was the set up used + +(a) hash-type consistent/map-based +(b) avalanche on/off +(c) balanche host(hdr) +(d) 3 criteria for inputs + - ~ 10K requests, including duplicates + - ~ 46K requests, unique requests from 1 MM requests were obtained + - ~ 250K requests, including duplicates +(e) 17 servers in backend, all servers were assigned the same weight + +Result of the hashing were obtained across the server via monitoring log +files for haproxy. Population Standard deviation was used to evaluate the +efficacy of the hashing algorithm. Lower standard deviation, indicates +a better distribution of load across the backends. + +On 10K requests, when using consistent hashing with avalanche on host +headers, DJB2 significantly out performs SDBM. Std dev on SDBM was 48.95 +and DJB2 was 26.29. This relationship is inverted with avalanche disabled, +however DJB2 with avalanche enabled out performs SDBM with avalanche +disabled. + +On map-based hashing SDBM out performs DJB2 irrespective of the avalanche +option. SDBM without avalanche is marginally better than with avalanche. +DJB2 performs significantly worse with avalanche enabled. + +Summary: The results of the testing indicate that there isn't a hashing +algorithm that can be applied across all input criteria. It is necessary +to support alternatives to SDBM, which is generally the best option, with +algorithms that are better for different inputs. Avalanche is not always +applicable and may result in less smooth distribution. + +References: +Mixing Functions/Avalanche: https://papa.bretmulvey.com/post/124027987928/hash-functions +Hash Functions: http://www.cse.yorku.ca/~oz/hash.html diff --git a/doc/internals/header-parser-speed.txt b/doc/internals/header-parser-speed.txt new file mode 100644 index 0000000..285e2fa --- /dev/null +++ b/doc/internals/header-parser-speed.txt @@ -0,0 +1,92 @@ +TEST 3: + + printf "GET /\r\nbla: truc\r\n\r\n" + + +NO SPEEDUP : + +WHL: hdr_st=0x00, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x8071080, lr=0x8071080, r=0x8071094 +WHL: hdr_st=0x01, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x8071080, lr=0x8071080, r=0x8071094 +WHL: hdr_st=0x32, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x8071080, lr=0x8071086, r=0x8071094 +WHL: hdr_st=0x03, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x8071087, lr=0x8071087, r=0x8071094 +WHL: hdr_st=0x34, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x8071087, lr=0x8071091, r=0x8071094 +WHL: hdr_st=0x03, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x8071092, lr=0x8071092, r=0x8071094 +WHL: hdr_st=0x34, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x8071092, lr=0x8071093, r=0x8071094 +WHL: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x8071092, lr=0x8071093, r=0x8071094 +END: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x8071092, lr=0x8071094, r=0x8071094 +=> 9 trans + + +FULL SPEEDUP : + +WHL: hdr_st=0x00, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x806a770, lr=0x806a770, r=0x806a784 +WHL: hdr_st=0x32, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x806a770, lr=0x806a776, r=0x806a784 +WHL: hdr_st=0x03, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x806a777, lr=0x806a777, r=0x806a784 +WHL: hdr_st=0x34, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x806a777, lr=0x806a781, r=0x806a784 +WHL: hdr_st=0x26, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x806a782, lr=0x806a783, r=0x806a784 +END: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x806a782, lr=0x806a784, r=0x806a784 +=> 6 trans + + + +TEST 4: + + + printf "GET /\nbla: truc\n\n" + + +NO SPEEDUP : + +WHL: hdr_st=0x00, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x80750d0, lr=0x80750d0, r=0x80750e1 +WHL: hdr_st=0x01, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x80750d0, lr=0x80750d0, r=0x80750e1 +WHL: hdr_st=0x02, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x80750d0, lr=0x80750d5, r=0x80750e1 +WHL: hdr_st=0x03, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x80750d6, lr=0x80750d6, r=0x80750e1 +WHL: hdr_st=0x04, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x80750d6, lr=0x80750df, r=0x80750e1 +WHL: hdr_st=0x03, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x80750e0, lr=0x80750e0, r=0x80750e1 +WHL: hdr_st=0x04, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x80750e0, lr=0x80750e0, r=0x80750e1 +WHL: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x80750e0, lr=0x80750e0, r=0x80750e1 +END: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x80750e0, lr=0x80750e1, r=0x80750e1 +=> 9 trans + + +FULL SPEEDUP : + +WHL: hdr_st=0x00, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x8072010, lr=0x8072010, r=0x8072021 +WHL: hdr_st=0x03, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x8072016, lr=0x8072016, r=0x8072021 +END: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x8072020, lr=0x8072021, r=0x8072021 +=> 3 trans + + +TEST 5: + + + printf "GET /\r\nbla: truc\r\n truc2\r\n\r\n" + + +NO SPEEDUP : + +WHL: hdr_st=0x00, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x8071080, lr=0x8071080, r=0x807109d +WHL: hdr_st=0x01, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x8071080, lr=0x8071080, r=0x807109d +WHL: hdr_st=0x32, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x8071080, lr=0x8071086, r=0x807109d +WHL: hdr_st=0x03, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x8071087, lr=0x8071087, r=0x807109d +WHL: hdr_st=0x34, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x8071087, lr=0x8071091, r=0x807109d +WHL: hdr_st=0x05, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x8071087, lr=0x8071092, r=0x807109d +WHL: hdr_st=0x03, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x8071087, lr=0x8071094, r=0x807109d +WHL: hdr_st=0x34, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x8071087, lr=0x807109a, r=0x807109d +WHL: hdr_st=0x03, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x807109b, lr=0x807109b, r=0x807109d +WHL: hdr_st=0x34, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x807109b, lr=0x807109c, r=0x807109d +WHL: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x807109b, lr=0x807109c, r=0x807109d +END: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x807109b, lr=0x807109d, r=0x807109d +=> 12 trans + + +FULL SPEEDUP : + +WHL: hdr_st=0x00, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x806dfc0, lr=0x806dfc0, r=0x806dfdd +WHL: hdr_st=0x32, hdr_used=1 hdr_tail=0 hdr_last=1, h=0x806dfc0, lr=0x806dfc6, r=0x806dfdd +WHL: hdr_st=0x03, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x806dfc7, lr=0x806dfc7, r=0x806dfdd +WHL: hdr_st=0x34, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x806dfc7, lr=0x806dfd1, r=0x806dfdd +WHL: hdr_st=0x34, hdr_used=2 hdr_tail=1 hdr_last=2, h=0x806dfc7, lr=0x806dfda, r=0x806dfdd +WHL: hdr_st=0x26, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x806dfdb, lr=0x806dfdc, r=0x806dfdd +END: hdr_st=0x06, hdr_used=3 hdr_tail=2 hdr_last=3, h=0x806dfdb, lr=0x806dfdd, r=0x806dfdd +=> 7 trans diff --git a/doc/internals/header-tree.txt b/doc/internals/header-tree.txt new file mode 100644 index 0000000..9a97361 --- /dev/null +++ b/doc/internals/header-tree.txt @@ -0,0 +1,124 @@ +2007/03/30 - Header storage in trees + +This documentation describes how to store headers in radix trees, providing +fast access to any known position, while retaining the ability to grow/reduce +any arbitrary header without having to recompute all positions. + +Principle : + We have a radix tree represented in an integer array, which represents the + total number of bytes used by all headers whose position is below it. This + ensures that we can compute any header's position in O(log(N)) where N is + the number of headers. + +Example with N=16 : + + +-----------------------+ + | | + +-----------+ +-----------+ + | | | | + +-----+ +-----+ +-----+ +-----+ + | | | | | | | | + +--+ +--+ +--+ +--+ +--+ +--+ +--+ +--+ + | | | | | | | | | | | | | | | | + + 0 1 2 3 4 5 6 7 8 9 A B C D E F + + To reach header 6, we have to compute hdr[0]+hdr[4]+hdr[6] + + With this method, it becomes easy to grow any header and update the array. + To achieve this, we have to replace one after the other all bits on the + right with one 1 followed by zeroes, and update the position if it's higher + than current position, and stop when it's above number of stored headers. + + For instance, if we want to grow hdr[6], we proceed like this : + + 6 = 0110 (BIN) + + Let's consider the values to update : + + (bit 0) : (0110 & ~0001) | 0001 = 0111 = 7 > 6 => update + (bit 1) : (0110 & ~0011) | 0010 = 0110 = 6 <= 6 => leave it + (bit 2) : (0110 & ~0111) | 0100 = 0100 = 4 <= 6 => leave it + (bit 4) : (0110 & ~1111) | 1000 = 1000 = 8 > 6 => update + (bit 5) : larger than array size, stop. + + +It's easy to walk through the tree too. We only have one iteration per bit +changing from X to the ancestor, and one per bit from the ancestor to Y. +The ancestor is found while walking. To go from X to Y : + + pos = pos(X) + + while (Y != X) { + if (Y > X) { + // walk from Y to ancestor + pos += hdr[Y] + Y &= (Y - 1) + } else { + // walk from X to ancestor + pos -= hdr[X] + X &= (X - 1) + } + } + +However, it is not trivial anymore to linearly walk the tree. We have to move +from a known place to another known place, but a jump to next entry costs the +same as a jump to a random place. + +Other caveats : + - it is not possible to remove a header, it is only possible to empty it. + - it is not possible to insert a header, as that would imply a renumbering. + => this means that a "defrag" function is required. Headers should preferably + be added, then should be stuffed on top of destroyed ones, then only + inserted if absolutely required. + + +When we have this, we can then focus on a 32-bit header descriptor which would +look like this : + +{ + unsigned line_len :13; /* total line length, including CRLF */ + unsigned name_len :6; /* header name length, max 63 chars */ + unsigned sp1 :5; /* max spaces before value : 31 */ + unsigned sp2 :8; /* max spaces after value : 255 */ +} + +Example : + + Connection: close \r\n + <---------+-----+-----+-------------> line_len + <-------->| | | name_len + <-----> | sp1 + <-------------> sp2 +Rem: + - if there are more than 31 spaces before the value, the buffer will have to + be moved before being registered + + - if there are more than 255 spaces after the value, the buffer will have to + be moved before being registered + + - we can use the empty header name as an indicator for a deleted header + + - it would be wise to format a new request before sending lots of random + spaces to the servers. + + - normal clients do not send such crap, so those operations *may* reasonably + be more expensive than the rest provided that other ones are very fast. + +It would be handy to have the following macros : + + hdr_eon(hdr) => end of name + hdr_sov(hdr) => start of value + hdr_eof(hdr) => end of value + hdr_vlen(hdr) => length of value + hdr_hlen(hdr) => total header length + + +A 48-bit encoding would look like this : + + Connection: close \r\n + <---------+------+---+--------------> eoh = 16 bits + <-------->| | | eon = 8 bits + <--------------->| | sov = 8 bits + <---> vlen = 16 bits + diff --git a/doc/internals/http-cookies.txt b/doc/internals/http-cookies.txt new file mode 100644 index 0000000..6d65c54 --- /dev/null +++ b/doc/internals/http-cookies.txt @@ -0,0 +1,45 @@ +2010/08/31 - HTTP Cookies - Theory and reality + +HTTP cookies are not uniformly supported across browsers, which makes it very +hard to build a widely compatible implementation. At least four conflicting +documents exist to describe how cookies should be handled, and browsers +generally don't respect any but a sensibly selected mix of them : + + - Netscape's original spec (also mirrored at Curl's site among others) : + http://web.archive.org/web/20070805052634/http://wp.netscape.com/newsref/std/cookie_spec.html + http://curl.haxx.se/rfc/cookie_spec.html + + Issues: uses an unquoted "Expires" field that includes a comma. + + - RFC 2109 : + http://www.ietf.org/rfc/rfc2109.txt + + Issues: specifies use of "Max-Age" (not universally implemented) and does + not talk about "Expires" (generally supported). References quoted + strings, not generally supported (eg: MSIE). Stricter than browsers + about domains. Ambiguous about allowed spaces in values and attrs. + + - RFC 2965 : + http://www.ietf.org/rfc/rfc2965.txt + + Issues: same as RFC2109 + describes Set-Cookie2 which only Opera supports. + + - Current internet draft : + https://datatracker.ietf.org/wg/httpstate/charter/ + + Issues: as of -p10, does not explain how the Set-Cookie2 header must be + emitted/handled, while suggesting a stricter approach for Cookie. + Documents reality and as such reintroduces the widely used unquoted + "Expires" attribute with its error-prone syntax. States that a + server should not emit more than one cookie per Set-Cookie header, + which is incompatible with HTTP which says that multiple headers + are allowed only if they can be folded. + +See also the following URL for a browser * feature matrix : + http://code.google.com/p/browsersec/wiki/Part2#Same-origin_policy_for_cookies + +In short, MSIE and Safari neither support quoted strings nor max-age, which +make it mandatory to continue to send an unquoted Expires value (maybe the +day of week could be omitted though). Only Safari supports comma-separated +lists of Set-Cookie headers. Support for cross-domains is not uniform either. + diff --git a/doc/internals/http-docs.txt b/doc/internals/http-docs.txt new file mode 100644 index 0000000..4ed2480 --- /dev/null +++ b/doc/internals/http-docs.txt @@ -0,0 +1,5 @@ +Many interesting RFC and drafts linked to from this site : + + http://www.web-cache.com/Writings/protocols-standards.html + + diff --git a/doc/internals/http-parsing.txt b/doc/internals/http-parsing.txt new file mode 100644 index 0000000..8b3f239 --- /dev/null +++ b/doc/internals/http-parsing.txt @@ -0,0 +1,335 @@ +--- Relevant portions of RFC2616 --- + +OCTET = <any 8-bit sequence of data> +CHAR = <any US-ASCII character (octets 0 - 127)> +UPALPHA = <any US-ASCII uppercase letter "A".."Z"> +LOALPHA = <any US-ASCII lowercase letter "a".."z"> +ALPHA = UPALPHA | LOALPHA +DIGIT = <any US-ASCII digit "0".."9"> +CTL = <any US-ASCII control character (octets 0 - 31) and DEL (127)> +CR = <US-ASCII CR, carriage return (13)> +LF = <US-ASCII LF, linefeed (10)> +SP = <US-ASCII SP, space (32)> +HT = <US-ASCII HT, horizontal-tab (9)> +<"> = <US-ASCII double-quote mark (34)> +CRLF = CR LF +LWS = [CRLF] 1*( SP | HT ) +TEXT = <any OCTET except CTLs, but including LWS> +HEX = "A" | "B" | "C" | "D" | "E" | "F" + | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT +separators = "(" | ")" | "<" | ">" | "@" + | "," | ";" | ":" | "\" | <"> + | "/" | "[" | "]" | "?" | "=" + | "{" | "}" | SP | HT +token = 1*<any CHAR except CTLs or separators> + +quoted-pair = "\" CHAR +ctext = <any TEXT excluding "(" and ")"> +qdtext = <any TEXT except <">> +quoted-string = ( <"> *(qdtext | quoted-pair ) <"> ) +comment = "(" *( ctext | quoted-pair | comment ) ")" + + + + + +4 HTTP Message +4.1 Message Types + +HTTP messages consist of requests from client to server and responses from +server to client. Request (section 5) and Response (section 6) messages use the +generic message format of RFC 822 [9] for transferring entities (the payload of +the message). Both types of message consist of : + + - a start-line + - zero or more header fields (also known as "headers") + - an empty line (i.e., a line with nothing preceding the CRLF) indicating the + end of the header fields + - and possibly a message-body. + + +HTTP-message = Request | Response + +start-line = Request-Line | Status-Line +generic-message = start-line + *(message-header CRLF) + CRLF + [ message-body ] + +In the interest of robustness, servers SHOULD ignore any empty line(s) received +where a Request-Line is expected. In other words, if the server is reading the +protocol stream at the beginning of a message and receives a CRLF first, it +should ignore the CRLF. + + +4.2 Message headers + +- Each header field consists of a name followed by a colon (":") and the field + value. +- Field names are case-insensitive. +- The field value MAY be preceded by any amount of LWS, though a single SP is + preferred. +- Header fields can be extended over multiple lines by preceding each extra + line with at least one SP or HT. + + +message-header = field-name ":" [ field-value ] +field-name = token +field-value = *( field-content | LWS ) +field-content = <the OCTETs making up the field-value and consisting of + either *TEXT or combinations of token, separators, and + quoted-string> + + +The field-content does not include any leading or trailing LWS occurring before +the first non-whitespace character of the field-value or after the last +non-whitespace character of the field-value. Such leading or trailing LWS MAY +be removed without changing the semantics of the field value. Any LWS that +occurs between field-content MAY be replaced with a single SP before +interpreting the field value or forwarding the message downstream. + + +=> format des headers = 1*(CHAR & !ctl & !sep) ":" *(OCTET & (!ctl | LWS)) +=> les regex de matching de headers s'appliquent sur field-content, et peuvent + utiliser field-value comme espace de travail (mais de préférence aprčs le + premier SP). + +(19.3) The line terminator for message-header fields is the sequence CRLF. +However, we recommend that applications, when parsing such headers, recognize +a single LF as a line terminator and ignore the leading CR. + + + + + +message-body = entity-body + | <entity-body encoded as per Transfer-Encoding> + + + +5 Request + +Request = Request-Line + *(( general-header + | request-header + | entity-header ) CRLF) + CRLF + [ message-body ] + + + +5.1 Request line + +The elements are separated by SP characters. No CR or LF is allowed except in +the final CRLF sequence. + +Request-Line = Method SP Request-URI SP HTTP-Version CRLF + +(19.3) Clients SHOULD be tolerant in parsing the Status-Line and servers +tolerant when parsing the Request-Line. In particular, they SHOULD accept any +amount of SP or HT characters between fields, even though only a single SP is +required. + +4.5 General headers +Apply to MESSAGE. + +general-header = Cache-Control + | Connection + | Date + | Pragma + | Trailer + | Transfer-Encoding + | Upgrade + | Via + | Warning + +General-header field names can be extended reliably only in combination with a +change in the protocol version. However, new or experimental header fields may +be given the semantics of general header fields if all parties in the +communication recognize them to be general-header fields. Unrecognized header +fields are treated as entity-header fields. + + + + +5.3 Request Header Fields + +The request-header fields allow the client to pass additional information about +the request, and about the client itself, to the server. These fields act as +request modifiers, with semantics equivalent to the parameters on a programming +language method invocation. + +request-header = Accept + | Accept-Charset + | Accept-Encoding + | Accept-Language + | Authorization + | Expect + | From + | Host + | If-Match + | If-Modified-Since + | If-None-Match + | If-Range + | If-Unmodified-Since + | Max-Forwards + | Proxy-Authorization + | Range + | Referer + | TE + | User-Agent + +Request-header field names can be extended reliably only in combination with a +change in the protocol version. However, new or experimental header fields MAY +be given the semantics of request-header fields if all parties in the +communication recognize them to be request-header fields. Unrecognized header +fields are treated as entity-header fields. + + + +7.1 Entity header fields + +Entity-header fields define metainformation about the entity-body or, if no +body is present, about the resource identified by the request. Some of this +metainformation is OPTIONAL; some might be REQUIRED by portions of this +specification. + +entity-header = Allow + | Content-Encoding + | Content-Language + | Content-Length + | Content-Location + | Content-MD5 + | Content-Range + | Content-Type + | Expires + | Last-Modified + | extension-header +extension-header = message-header + +The extension-header mechanism allows additional entity-header fields to be +defined without changing the protocol, but these fields cannot be assumed to be +recognizable by the recipient. Unrecognized header fields SHOULD be ignored by +the recipient and MUST be forwarded by transparent proxies. + +---------------------------------- + +The format of Request-URI is defined by RFC3986 : + + URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] + + hier-part = "//" authority path-abempty + / path-absolute + / path-rootless + / path-empty + + URI-reference = URI / relative-ref + + absolute-URI = scheme ":" hier-part [ "?" query ] + + relative-ref = relative-part [ "?" query ] [ "#" fragment ] + + relative-part = "//" authority path-abempty + / path-absolute + / path-noscheme + / path-empty + + scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) + + authority = [ userinfo "@" ] host [ ":" port ] + userinfo = *( unreserved / pct-encoded / sub-delims / ":" ) + host = IP-literal / IPv4address / reg-name + port = *DIGIT + + IP-literal = "[" ( IPv6address / IPvFuture ) "]" + + IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" ) + + IPv6address = 6( h16 ":" ) ls32 + / "::" 5( h16 ":" ) ls32 + / [ h16 ] "::" 4( h16 ":" ) ls32 + / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32 + / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32 + / [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32 + / [ *4( h16 ":" ) h16 ] "::" ls32 + / [ *5( h16 ":" ) h16 ] "::" h16 + / [ *6( h16 ":" ) h16 ] "::" + + h16 = 1*4HEXDIG + ls32 = ( h16 ":" h16 ) / IPv4address + IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet + dec-octet = DIGIT ; 0-9 + / %x31-39 DIGIT ; 10-99 + / "1" 2DIGIT ; 100-199 + / "2" %x30-34 DIGIT ; 200-249 + / "25" %x30-35 ; 250-255 + + reg-name = *( unreserved / pct-encoded / sub-delims ) + + path = path-abempty ; begins with "/" or is empty + / path-absolute ; begins with "/" but not "//" + / path-noscheme ; begins with a non-colon segment + / path-rootless ; begins with a segment + / path-empty ; zero characters + + path-abempty = *( "/" segment ) + path-absolute = "/" [ segment-nz *( "/" segment ) ] + path-noscheme = segment-nz-nc *( "/" segment ) + path-rootless = segment-nz *( "/" segment ) + path-empty = 0<pchar> + + segment = *pchar + segment-nz = 1*pchar + segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" ) + ; non-zero-length segment without any colon ":" + + pchar = unreserved / pct-encoded / sub-delims / ":" / "@" + + query = *( pchar / "/" / "?" ) + + fragment = *( pchar / "/" / "?" ) + + pct-encoded = "%" HEXDIG HEXDIG + + unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" + reserved = gen-delims / sub-delims + gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" + sub-delims = "!" / "$" / "&" / "'" / "(" / ")" + / "*" / "+" / "," / ";" / "=" + +=> so the list of allowed characters in a URI is : + + uri-char = unreserved / gen-delims / sub-delims / "%" + = ALPHA / DIGIT / "-" / "." / "_" / "~" + / ":" / "/" / "?" / "#" / "[" / "]" / "@" + / "!" / "$" / "&" / "'" / "(" / ")" / + / "*" / "+" / "," / ";" / "=" / "%" + +Note that non-ascii characters are forbidden ! Spaces and CTL are forbidden. +Unfortunately, some products such as Apache allow such characters :-/ + +---- The correct way to do it ---- + +- one http_session + It is basically any transport session on which we talk HTTP. It may be TCP, + SSL over TCP, etc... It knows a way to talk to the client, either the socket + file descriptor or a direct access to the client-side buffer. It should hold + information about the last accessed server so that we can guarantee that the + same server can be used during a whole session if needed. A first version + without optimal support for HTTP pipelining will have the client buffers tied + to the http_session. It may be possible that it is not sufficient for full + pipelining, but this will need further study. The link from the buffers to + the backend should be managed by the http transaction (http_txn), provided + that they are serialized. Each http_session, has 0 to N http_txn. Each + http_txn belongs to one and only one http_session. + +- each http_txn has 1 request message (http_req), and 0 or 1 response message + (http_rtr). Each of them has 1 and only one http_txn. An http_txn holds + information such as the HTTP method, the URI, the HTTP version, the + transfer-encoding, the HTTP status, the authorization, the req and rtr + content-length, the timers, logs, etc... The backend and server which process + the request are also known from the http_txn. + +- both request and response messages hold header and parsing information, such + as the parsing state, start of headers, start of message, captures, etc... + diff --git a/doc/internals/list.fig b/doc/internals/list.fig new file mode 100644 index 0000000..aeb1f1d --- /dev/null +++ b/doc/internals/list.fig @@ -0,0 +1,599 @@ +#FIG 3.2 Produced by xfig version 2.4 +Landscape +Center +Metric +A4 +119.50 +Single +-2 +1200 2 +6 3960 3420 4320 4230 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 4230 3860 4005 3860 +2 2 0 2 0 2 53 0 20 0.000 0 0 -1 0 0 5 + 4005 3510 4230 3510 4230 4185 4005 4185 4005 3510 +4 1 0 50 0 14 10 0.0000 4 105 105 4120 4062 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 4118 3735 N\001 +-6 +6 4185 5580 4545 6390 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 4455 6020 4230 6020 +2 2 0 2 0 4 53 0 20 0.000 0 0 -1 0 0 5 + 4230 5670 4455 5670 4455 6345 4230 6345 4230 5670 +4 1 0 50 0 14 10 0.0000 4 105 105 4345 6222 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 4343 5895 N\001 +-6 +6 4905 5445 5445 6525 +6 4905 5445 5445 6525 +6 5085 5580 5445 6390 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 5355 6020 5130 6020 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 5130 5670 5355 5670 5355 6345 5130 6345 5130 5670 +4 1 0 50 0 14 10 0.0000 4 105 105 5245 6222 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 5243 5895 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 5355 5670 4905 5670 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 4905 6345 5355 6345 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 4905 5445 5355 5445 5355 6525 4905 6525 4905 5445 +4 1 0 50 0 14 12 0.0000 4 120 120 5040 6075 L\001 +-6 +-6 +6 5805 5445 6345 6525 +6 5805 5445 6345 6525 +6 5985 5580 6345 6390 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 6255 6020 6030 6020 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 6030 5670 6255 5670 6255 6345 6030 6345 6030 5670 +4 1 0 50 0 14 10 0.0000 4 105 105 6145 6222 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 6143 5895 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 6255 5670 5805 5670 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 5805 6345 6255 6345 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 5805 5445 6255 5445 6255 6525 5805 6525 5805 5445 +4 1 0 50 0 14 12 0.0000 4 120 120 5940 6075 L\001 +-6 +-6 +6 6705 5445 7245 6525 +6 6705 5445 7245 6525 +6 6885 5580 7245 6390 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 7155 6020 6930 6020 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 6930 5670 7155 5670 7155 6345 6930 6345 6930 5670 +4 1 0 50 0 14 10 0.0000 4 105 105 7045 6222 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 7043 5895 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 7155 5670 6705 5670 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 6705 6345 7155 6345 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 6705 5445 7155 5445 7155 6525 6705 6525 6705 5445 +4 1 0 50 0 14 12 0.0000 4 120 120 6840 6075 L\001 +-6 +-6 +6 450 5580 810 6390 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 720 6020 495 6020 +2 2 0 2 0 4 53 0 20 0.000 0 0 -1 0 0 5 + 495 5670 720 5670 720 6345 495 6345 495 5670 +4 1 0 50 0 14 10 0.0000 4 105 105 610 6222 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 608 5895 N\001 +-6 +6 1170 5445 1710 6525 +6 1170 5445 1710 6525 +6 1350 5580 1710 6390 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 1620 6020 1395 6020 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 1395 5670 1620 5670 1620 6345 1395 6345 1395 5670 +4 1 0 50 0 14 10 0.0000 4 105 105 1510 6222 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 1508 5895 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 1620 5670 1170 5670 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 1170 6345 1620 6345 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 1170 5445 1620 5445 1620 6525 1170 6525 1170 5445 +4 1 0 50 0 14 12 0.0000 4 120 120 1305 6075 L\001 +-6 +-6 +6 2070 5445 2610 6525 +6 2070 5445 2610 6525 +6 2250 5580 2610 6390 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 2520 6020 2295 6020 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 2295 5670 2520 5670 2520 6345 2295 6345 2295 5670 +4 1 0 50 0 14 10 0.0000 4 105 105 2410 6222 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 2408 5895 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 2520 5670 2070 5670 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 2070 6345 2520 6345 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 2070 5445 2520 5445 2520 6525 2070 6525 2070 5445 +4 1 0 50 0 14 12 0.0000 4 120 120 2205 6075 L\001 +-6 +-6 +6 2970 5445 3510 6525 +6 2970 5445 3510 6525 +6 3150 5580 3510 6390 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 3420 6020 3195 6020 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 3195 5670 3420 5670 3420 6345 3195 6345 3195 5670 +4 1 0 50 0 14 10 0.0000 4 105 105 3310 6222 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 3308 5895 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 3420 5670 2970 5670 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 2970 6345 3420 6345 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 2970 5445 3420 5445 3420 6525 2970 6525 2970 5445 +4 1 0 50 0 14 12 0.0000 4 120 120 3105 6075 L\001 +-6 +-6 +6 720 3420 1080 4230 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 990 3860 765 3860 +2 2 0 2 0 2 53 0 20 0.000 0 0 -1 0 0 5 + 765 3510 990 3510 990 4185 765 4185 765 3510 +4 1 0 50 0 14 10 0.0000 4 105 105 880 4062 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 878 3735 N\001 +-6 +6 2700 3420 3060 4230 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 2970 3860 2745 3860 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 2745 3510 2970 3510 2970 4185 2745 4185 2745 3510 +4 1 0 50 0 14 10 0.0000 4 105 105 2860 4062 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 2858 3735 N\001 +-6 +6 1620 3465 1935 4230 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 1890 3860 1665 3860 +2 2 0 2 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 1665 3510 1890 3510 1890 4185 1665 4185 1665 3510 +4 1 0 50 0 14 10 0.0000 4 105 105 1780 4062 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 1778 3735 N\001 +-6 +6 10485 3330 11025 4410 +6 10665 3465 11025 4275 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 10935 3905 10710 3905 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 10710 3555 10935 3555 10935 4230 10710 4230 10710 3555 +4 1 0 50 0 14 10 0.0000 4 105 105 10825 4107 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 10823 3780 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 10935 3555 10485 3555 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 10485 4230 10935 4230 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 10485 3330 10935 3330 10935 4410 10485 4410 10485 3330 +4 1 0 50 0 14 12 0.0000 4 120 120 10620 3960 L\001 +-6 +6 7110 3105 7650 4185 +6 7110 3105 7650 4185 +6 7290 3240 7650 4050 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 7560 3680 7335 3680 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 7335 3330 7560 3330 7560 4005 7335 4005 7335 3330 +4 1 0 50 0 14 10 0.0000 4 105 105 7450 3882 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 7448 3555 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 7560 3330 7110 3330 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 7110 4005 7560 4005 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 7110 3105 7560 3105 7560 4185 7110 4185 7110 3105 +4 1 0 50 0 14 12 0.0000 4 120 120 7245 3735 L\001 +-6 +-6 +6 8010 3105 8550 4185 +6 8010 3105 8550 4185 +6 8190 3240 8550 4050 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 8460 3680 8235 3680 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 8235 3330 8460 3330 8460 4005 8235 4005 8235 3330 +4 1 0 50 0 14 10 0.0000 4 105 105 8350 3882 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 8348 3555 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 8460 3330 8010 3330 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 8010 4005 8460 4005 +2 2 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 8010 3105 8460 3105 8460 4185 8010 4185 8010 3105 +4 1 0 50 0 14 12 0.0000 4 120 120 8145 3735 L\001 +-6 +-6 +6 9315 990 12195 2160 +6 9675 1080 10035 1890 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 9945 1520 9720 1520 +2 2 0 2 0 2 53 0 20 0.000 0 0 -1 0 0 5 + 9720 1170 9945 1170 9945 1845 9720 1845 9720 1170 +4 1 0 50 0 14 10 0.0000 4 105 105 9835 1722 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 9833 1395 N\001 +-6 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 10935 1520 10710 1520 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 11925 1520 11700 1520 +2 2 0 2 0 7 52 0 20 0.000 0 0 -1 0 0 5 + 10710 1170 10935 1170 10935 1845 10710 1845 10710 1170 +2 2 0 2 0 6 52 0 20 0.000 0 0 -1 0 0 5 + 11700 1170 11925 1170 11925 1845 11700 1845 11700 1170 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 9945 1350 10665 1350 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 10935 1350 11655 1350 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 8 + 1 1 1.00 60.00 120.00 + 11925 1350 12105 1350 12195 1350 12195 990 9315 990 9315 1350 + 9495 1350 9675 1350 + 0.000 1.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 8 + 1 1 1.00 60.00 120.00 + 9675 1710 9495 1710 9315 1710 9405 2160 12195 2160 12195 1710 + 12105 1710 11925 1710 + 0.000 1.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 11655 1710 10935 1710 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 10665 1710 9945 1710 + 0.000 0.000 +4 1 0 50 0 14 10 0.0000 4 105 105 10825 1722 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 10823 1395 N\001 +4 1 0 50 0 14 10 0.0000 4 105 105 11815 1722 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 11813 1395 N\001 +-6 +6 6345 1080 6705 1890 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 6615 1520 6390 1520 +2 2 0 2 0 2 53 0 20 0.000 0 0 -1 0 0 5 + 6390 1170 6615 1170 6615 1845 6390 1845 6390 1170 +4 1 0 50 0 14 10 0.0000 4 105 105 6505 1722 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 6503 1395 N\001 +-6 +6 7335 1080 7695 1890 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 7605 1520 7380 1520 +2 2 0 2 0 6 52 0 20 0.000 0 0 -1 0 0 5 + 7380 1170 7605 1170 7605 1845 7380 1845 7380 1170 +4 1 0 50 0 14 10 0.0000 4 105 105 7495 1722 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 7493 1395 N\001 +-6 +6 8325 1080 8685 1890 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 8595 1520 8370 1520 +2 2 0 2 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 8370 1170 8595 1170 8595 1845 8370 1845 8370 1170 +4 1 0 50 0 14 10 0.0000 4 105 105 8485 1722 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 8483 1395 N\001 +-6 +6 3870 1215 4185 1980 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 4140 1610 3915 1610 +2 2 0 2 0 2 53 0 20 0.000 0 0 -1 0 0 5 + 3915 1260 4140 1260 4140 1935 3915 1935 3915 1260 +4 1 0 50 0 14 10 0.0000 4 105 105 4030 1812 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 4028 1485 N\001 +-6 +6 4770 1215 5085 1980 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 5040 1610 4815 1610 +2 2 0 2 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 4815 1260 5040 1260 5040 1935 4815 1935 4815 1260 +4 1 0 50 0 14 10 0.0000 4 105 105 4930 1812 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 4928 1485 N\001 +-6 +6 2205 990 2925 2160 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 2655 1610 2430 1610 +2 2 0 2 0 2 53 0 20 0.000 0 0 -1 0 0 5 + 2430 1260 2655 1260 2655 1935 2430 1935 2430 1260 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 6 + 1 1 1.00 60.00 120.00 + 2655 1440 2880 1440 2880 1035 2205 1035 2205 1440 2430 1440 + 0.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 6 + 1 1 1.00 60.00 120.00 + 2655 1755 2880 1755 2880 2160 2205 2160 2205 1755 2430 1755 + 0.000 1.000 1.000 1.000 1.000 0.000 +4 1 0 50 0 14 10 0.0000 4 105 105 2545 1812 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 2543 1485 N\001 +-6 +6 525 1350 1455 1830 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 540 1590 1440 1590 +2 2 0 2 0 7 50 0 -1 0.000 0 0 -1 0 0 5 + 540 1365 1440 1365 1440 1815 540 1815 540 1365 +4 1 0 50 0 14 10 0.0000 4 105 735 990 1545 list *N\001 +4 1 0 50 0 14 10 0.0000 4 105 735 990 1770 list *P\001 +-6 +6 4815 3420 5175 4230 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 5085 3860 4860 3860 +2 2 0 2 0 7 53 0 20 0.000 0 0 -1 0 0 5 + 4860 3510 5085 3510 5085 4185 4860 4185 4860 3510 +4 1 0 50 0 14 10 0.0000 4 105 105 4975 4062 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 4973 3735 N\001 +-6 +6 5715 3285 6390 4410 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2 + 6165 3860 5940 3860 +2 2 0 2 0 6 53 0 20 0.000 0 0 -1 0 0 5 + 5940 3510 6165 3510 6165 4185 5940 4185 5940 3510 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 6 + 1 1 1.00 60.00 120.00 + 6165 3690 6390 3690 6390 3285 5715 3285 5715 3690 5940 3690 + 0.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 6 + 1 1 1.00 60.00 120.00 + 6165 4005 6390 4005 6390 4410 5715 4410 5715 4005 5940 4005 + 0.000 1.000 1.000 1.000 1.000 0.000 +4 1 0 50 0 14 10 0.0000 4 105 105 6055 4062 P\001 +4 1 0 50 0 14 10 0.0000 4 105 105 6053 3735 N\001 +-6 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 4050 4725 7605 4725 7605 6840 4050 6840 4050 4725 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 315 4725 3870 4725 3870 6840 315 6840 315 4725 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 3150 4500 315 4500 315 2475 3150 2475 3150 4500 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 6660 2475 8910 2475 8910 4500 6660 4500 6660 2475 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 10035 3375 10485 3330 +2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 10080 3735 10485 3555 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 9135 2475 12285 2475 12285 4500 9135 4500 9135 2475 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 9270 270 12285 270 12285 2250 9270 2250 9270 270 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 5760 270 9045 270 9045 2250 5760 2250 5760 270 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 3465 270 5535 270 5535 2250 3465 2250 3465 270 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 1845 270 3240 270 3240 2250 1845 2250 1845 270 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 315 270 1620 270 1620 2250 315 2250 315 270 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 3330 2475 6435 2475 6435 4500 3330 4500 3330 2475 +2 4 0 1 0 7 50 0 -1 0.000 0 0 7 0 0 5 + 12285 6840 12285 4725 7785 4725 7785 6840 12285 6840 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 4230 3690 4860 3690 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 4860 4050 4230 4050 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 3960 4050 3780 4050 3600 4050 3600 4410 5580 4410 5580 4050 + 5130 4050 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 6261 5805 6711 5670 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 4461 5805 4911 5670 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 5358 5805 5808 5670 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 6705 6210 6255 6210 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 5805 6210 5355 6210 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 4905 6210 4455 6210 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 4320 6345 4320 6525 4320 6750 7470 6750 7470 6480 7470 6210 + 7155 6210 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 8 + 1 1 1.00 60.00 120.00 + 7155 5850 7335 5850 7470 5850 7470 5355 7470 5085 4590 5085 + 4590 5355 4860 5625 + 0.000 1.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 2526 5805 2976 5670 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 726 5805 1176 5670 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 1623 5805 2073 5670 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 2970 6210 2520 6210 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 2070 6210 1620 6210 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 1170 6210 720 6210 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 585 6345 585 6525 585 6750 3735 6750 3735 6480 3735 6210 + 3420 6210 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 8 + 1 1 1.00 60.00 120.00 + 3420 5850 3600 5850 3735 5850 3735 5355 3735 5085 585 5085 + 585 5265 585 5670 + 0.000 1.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 990 3690 1620 3690 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 1620 4050 990 4050 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 1890 3690 2340 3690 2340 3240 360 3240 360 3690 540 3690 + 720 3690 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 720 4050 540 4050 360 4050 360 4410 2340 4410 2340 4050 + 1890 4050 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 7560 3465 8010 3330 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 7560 3915 8010 3375 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 8460 3465 8775 3465 8820 3060 8730 2745 6750 2745 6705 3330 + 7110 3330 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 8 + 1 1 1.00 60.00 120.00 + 8460 3870 8820 3870 8820 4230 8640 4365 6930 4365 6750 4230 + 6705 3510 7065 3375 + 0.000 1.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 6615 1350 7335 1350 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 7605 1350 8325 1350 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 8 + 1 1 1.00 60.00 120.00 + 8595 1350 8775 1350 8865 1350 8865 990 5985 990 5985 1350 + 6165 1350 6345 1350 + 0.000 1.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 8 + 1 1 1.00 60.00 120.00 + 6345 1710 6165 1710 5985 1710 6075 2160 8865 2160 8865 1710 + 8775 1710 8595 1710 + 0.000 1.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 8325 1710 7605 1710 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 7335 1710 6615 1710 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 4140 1440 4770 1440 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 4770 1800 4140 1800 + 0.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 5040 1440 5490 1440 5490 990 3510 990 3510 1440 3690 1440 + 3870 1440 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 3870 1800 3690 1800 3510 1800 3510 2160 5490 2160 5490 1800 + 5040 1800 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 0 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 5130 3690 5580 3690 5580 3240 3600 3240 3600 3690 3780 3690 + 3960 3690 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +4 1 0 50 0 14 10 0.0000 4 135 3780 5805 4950 Asymmetrical list starting at R(red)\001 +4 1 0 50 0 14 10 0.0000 4 135 4095 10215 4950 Symmetrical lists vs Asymmetrical lists\001 +4 1 0 50 0 12 10 0.0000 4 135 525 5130 5355 foo_0\001 +4 1 0 50 0 12 10 0.0000 4 135 525 6030 5355 foo_1\001 +4 1 0 50 0 12 10 0.0000 4 135 525 6930 5355 foo_2\001 +4 1 0 50 0 14 10 0.0000 4 135 3675 2070 4950 Symmetrical list starting at R(red)\001 +4 1 0 50 0 12 10 0.0000 4 135 525 3195 5355 foo_2\001 +4 1 0 50 0 12 10 0.0000 4 135 525 2295 5355 foo_1\001 +4 1 0 50 0 12 10 0.0000 4 135 525 1395 5355 foo_0\001 +4 1 0 50 0 12 10 0.0000 4 105 315 9855 3420 foo\001 +4 1 0 50 0 12 10 0.0000 4 105 105 9990 3825 E\001 +4 1 0 50 0 14 10 0.0000 4 135 1680 7785 2655 Linking elements\001 +4 1 0 50 0 12 10 0.0000 4 135 525 8235 3015 foo_1\001 +4 1 0 50 0 12 10 0.0000 4 135 525 7335 3015 foo_0\001 +4 1 0 50 0 14 10 0.0000 4 105 1470 2565 675 struct list *G\001 +4 1 0 50 0 14 10 0.0000 4 135 1470 2565 495 LIST_INIT(G):G\001 +4 1 0 50 0 14 10 0.0000 4 135 1890 4500 495 LIST_INSERT(G,W):W\001 +4 1 0 50 0 14 10 0.0000 4 135 3360 10665 2700 foo=LIST_ELEM(E, struct foo*, L)\001 +4 1 0 50 0 14 10 0.0000 4 135 1890 4500 675 LIST_APPEND(G,W):W\001 +4 1 0 50 0 14 10 0.0000 4 135 1890 7425 540 LIST_INSERT(G,Y):Y\001 +4 1 0 50 0 14 10 0.0000 4 135 1890 10755 540 LIST_APPEND(G,Y):Y\001 +4 1 0 50 0 14 10 0.0000 4 135 1680 1755 2790 LIST_DELETE(Y):Y\001 +4 1 0 50 0 14 10 0.0000 4 135 1890 4905 2745 LIST_DEL_INIT(Y):Y\001 +4 1 0 50 0 12 9 0.0000 4 120 2880 10665 2925 Returns a pointer to struct foo*\001 +4 1 0 50 0 12 9 0.0000 4 120 2790 10665 3105 containing header E as member L\001 +4 1 0 50 0 12 9 0.0000 4 120 2700 10755 810 adds Y at the queue (before G)\001 +4 1 0 50 0 12 9 0.0000 4 120 3060 7425 810 adds Y(yellow) just after G(green)\001 +4 1 0 50 0 12 9 0.0000 4 120 1170 4500 855 adds W(white)\001 +4 1 0 50 0 12 9 0.0000 4 105 1080 2565 900 Terminates G\001 +4 1 0 50 0 12 9 0.0000 4 90 540 990 855 N=next\001 +4 1 0 50 0 12 9 0.0000 4 105 540 990 1080 P=prev\001 +4 1 0 50 0 12 9 0.0000 4 120 2610 1755 3060 unlinks and returns Y(yellow)\001 +4 1 0 50 0 12 9 0.0000 4 120 2610 4905 3060 unlinks, inits, and returns Y\001 +4 0 0 50 0 12 8 0.0000 4 105 2175 7875 5265 - both are empty if R->P == R\001 +4 0 0 50 0 12 8 0.0000 4 90 2175 7875 5490 - last element has R->P == &L\001 +4 0 0 50 0 12 8 0.0000 4 105 3150 7875 5715 - FOREACH_ITEM(it, R, end, struct foo*, L)\001 +4 0 0 50 0 12 8 0.0000 4 105 3300 7875 5940 iterates <it> through foo{0,1,2} and stops\001 +4 0 0 50 0 12 8 0.0000 4 105 3900 7875 6165 - FOREACH_ITEM_SAFE(it, bck, R, end, struct foo*, L)\001 +4 0 0 50 0 12 8 0.0000 4 105 3750 7875 6390 does the same except that <bck> allows to delete\001 +4 0 0 50 0 12 8 0.0000 4 105 1950 7875 6570 any node, including <it>\001 +4 1 0 50 0 14 11 0.0000 4 135 1155 945 585 struct list\001 diff --git a/doc/internals/list.png b/doc/internals/list.png Binary files differnew file mode 100644 index 0000000..ec41a6b --- /dev/null +++ b/doc/internals/list.png diff --git a/doc/internals/listener-states.fig b/doc/internals/listener-states.fig new file mode 100644 index 0000000..863e7f5 --- /dev/null +++ b/doc/internals/listener-states.fig @@ -0,0 +1,150 @@ +#FIG 3.2 Produced by xfig version 2.3 +Portrait +Center +Metric +A4 +300.00 +Single +-2 +1200 2 +0 32 #ff60e0 +0 33 #ff8020 +0 34 #56c5ff +0 35 #55d941 +0 36 #f8e010 +1 1 0 3 0 7 51 -1 20 0.000 1 0.0000 900 450 495 225 900 450 1395 450 +1 1 0 3 0 7 51 -1 20 0.000 1 0.0000 2700 450 495 225 2700 450 3195 450 +1 1 0 3 0 7 51 -1 20 0.000 1 0.0000 4500 450 495 225 4500 450 4995 450 +1 1 0 3 0 7 51 -1 20 0.000 1 0.0000 900 3465 495 225 900 3465 1395 3465 +1 1 0 3 0 7 51 -1 20 0.000 1 0.0000 2700 2475 495 225 2700 2475 3195 2475 +1 1 0 3 0 7 51 -1 20 0.000 1 0.0000 3645 1575 495 225 3645 1575 4140 1575 +1 1 0 3 0 7 51 -1 20 0.000 1 0.0000 4500 2475 495 225 4500 2475 4995 2475 +1 1 0 3 0 7 51 -1 20 0.000 1 0.0000 2700 3471 495 225 2700 3471 3195 3471 +2 1 1 3 1 7 52 -1 -1 8.000 1 0 -1 0 0 2 + 270 1980 5355 1350 +2 2 0 2 32 32 52 -1 20 0.000 1 0 -1 0 0 5 + 2070 3060 3330 3060 3330 3870 2070 3870 2070 3060 +2 3 0 1 33 33 53 -1 20 0.000 1 0 -1 0 0 5 + 2070 990 5130 990 5130 2880 2070 2880 2070 990 +2 2 0 2 35 35 52 -1 20 0.000 1 0 -1 0 0 5 + 270 90 5130 90 5130 855 270 855 270 90 +2 2 0 2 36 36 52 -1 20 0.000 1 0 -1 0 0 5 + 270 3060 1530 3060 1530 3870 270 3870 270 3060 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 1395 450 2250 450 + 0.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 3195 450 4050 450 + 0.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 4095 1665 4455 2025 4500 2250 + 0.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 3195 3510 3600 3465 4140 2655 + 0.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 4410 2250 4365 2070 4050 1710 + 0.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 945 3240 936 2142 2961 1917 3240 1710 + 0.000 1.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 3195 1665 2835 1845 855 2115 855 3240 + 0.000 1.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 5 + 1 1 1.00 60.00 120.00 + 990 3690 1035 3960 2880 4050 4365 3915 4410 2700 + 0.000 1.000 1.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 2700 2700 2700 3240 + 0.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 4095 2610 3600 3375 3150 3420 + 0.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 5 + 1 1 1.00 60.00 120.00 + 4500 2700 4455 4005 2655 4140 945 4005 900 3690 + 0.000 1.000 1.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 2205 2520 1395 2745 1260 2970 1125 3240 + 0.000 1.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 3510 1800 3330 2025 3330 2835 2970 3285 + 0.000 1.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 1170 3285 1305 3015 1485 2790 2250 2610 + 0.000 1.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 2205 3420 1710 3420 1395 3465 + 0.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 1395 3510 1800 3510 2205 3465 + 0.000 1.000 0.000 +3 0 0 3 0 7 51 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 2925 2295 3060 1980 3330 1755 + 0.000 1.000 0.000 +3 0 0 3 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 4500 675 4455 990 3960 1395 + 0.000 1.000 0.000 +4 1 0 50 -1 18 10 0.0000 4 120 375 900 450 NEW\001 +4 1 0 50 -1 18 10 0.0000 4 120 315 2700 450 INIT\001 +4 1 0 50 -1 18 10 0.0000 4 120 810 4500 450 ASSIGNED\001 +4 1 1 50 -1 16 10 0.0000 4 120 90 900 630 0\001 +4 1 1 50 -1 16 10 0.0000 4 120 90 2700 630 1\001 +4 1 1 50 -1 16 10 0.0000 4 120 90 4500 630 2\001 +4 1 0 50 -1 16 7 0.0000 4 120 420 1755 405 create()\001 +4 2 0 50 -1 16 7 0.0000 4 120 660 1215 2160 enable() &&\001 +4 2 0 50 -1 16 7 0.0000 4 90 540 1080 2295 !maxconn\001 +4 2 1 51 -1 16 7 1.5708 4 105 600 5355 1485 transitions\001 +4 0 1 51 -1 16 7 1.5708 4 105 600 5355 1260 transitions\001 +4 2 1 51 -1 16 7 1.5708 4 105 795 5265 1485 multi-threaded\001 +4 0 1 51 -1 16 7 1.5708 4 120 870 5265 1260 single-threaded\001 +4 0 0 52 -1 17 7 0.0000 4 90 345 315 765 no FD\001 +4 0 0 52 -1 17 7 0.0000 4 135 315 315 3825 polled\001 +4 1 0 50 -1 18 10 0.0000 4 120 555 900 3465 READY\001 +4 0 0 50 -1 16 7 0.0000 4 120 255 1170 3825 full()\001 +4 2 0 50 -1 16 7 0.0000 4 90 540 2205 3375 !maxconn\001 +4 2 0 50 -1 16 7 0.0000 4 105 675 2295 3240 resume() &&\001 +4 0 0 50 -1 16 7 0.0000 4 105 405 1395 3645 pause()\001 +4 0 0 52 -1 17 7 0.0000 4 135 585 2115 3825 shut(sock)\001 +4 2 0 50 -1 16 7 0.0000 4 120 480 4320 2205 disable()\001 +4 2 0 50 -1 16 7 0.0000 4 105 405 4005 2655 pause()\001 +4 0 0 50 -1 16 7 0.0000 4 105 465 4545 2835 resume()\001 +4 2 0 50 -1 16 7 0.0000 4 120 480 2925 2160 disable()\001 +4 0 0 50 -1 16 7 0.0000 4 105 405 3465 1980 pause()\001 +4 0 0 50 -1 16 7 0.0000 4 120 660 4230 1710 enable() &&\001 +4 0 0 50 -1 16 7 0.0000 4 75 510 4320 1845 maxconn\001 +4 2 0 50 -1 16 7 0.0000 4 105 405 2655 2835 pause()\001 +4 0 0 50 -1 16 7 0.0000 4 105 675 3375 3555 resume() &&\001 +4 0 0 50 -1 16 7 0.0000 4 75 510 3375 3645 maxconn\001 +4 0 0 50 -1 16 7 0.0000 4 120 480 1080 2655 disable()\001 +4 2 0 50 -1 16 7 0.0000 4 105 465 2160 2475 resume()\001 +4 1 0 50 -1 16 7 0.0000 4 120 330 3555 405 .add()\001 +4 0 0 50 -1 16 7 0.0000 4 120 375 4545 810 .bind()\001 +4 0 0 52 -1 17 7 0.0000 4 135 1080 2115 1125 FD ready, not polled\001 +4 0 0 50 -1 16 7 0.0000 4 120 315 1305 3240 limit()\001 +4 1 0 50 -1 18 10 0.0000 4 120 630 2700 2475 LIMITED\001 +4 1 0 50 -1 18 10 0.0000 4 120 555 3645 1575 LISTEN\001 +4 1 0 50 -1 18 10 0.0000 4 120 375 4500 2475 FULL\001 +4 1 0 50 -1 18 10 0.0000 4 120 630 2700 3465 PAUSED\001 +4 1 1 50 -1 16 10 0.0000 4 120 90 2700 3645 3\001 +4 1 1 50 -1 16 10 0.0000 4 120 90 2700 2655 7\001 +4 1 1 50 -1 16 10 0.0000 4 120 90 4500 2655 6\001 +4 1 1 50 -1 16 10 0.0000 4 120 90 900 3645 5\001 +4 1 1 50 -1 16 10 0.0000 4 120 90 3645 1755 4\001 diff --git a/doc/internals/listener-states.png b/doc/internals/listener-states.png Binary files differnew file mode 100644 index 0000000..8757a12 --- /dev/null +++ b/doc/internals/listener-states.png diff --git a/doc/internals/lua_socket.fig b/doc/internals/lua_socket.fig new file mode 100644 index 0000000..7da3294 --- /dev/null +++ b/doc/internals/lua_socket.fig @@ -0,0 +1,113 @@ +#FIG 3.2 Produced by xfig version 1.8 +Landscape +Center +Metric +A4 +100.00 +Single +-2 +1200 2 +6 1125 2745 2565 3555 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 1125 2745 2565 2745 2565 3555 1125 3555 1125 2745 +4 0 0 50 -1 16 12 0.0000 4 180 1080 1215 3195 lua_State *T\001 +4 0 0 50 -1 18 12 0.0000 4 150 990 1215 2925 struct hlua\001 +4 0 0 50 -1 16 12 0.0000 4 195 1245 1215 3465 stop_list *stop\001 +-6 +6 7560 4365 10620 5265 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 7650 4635 10530 4635 10530 5175 7650 5175 7650 4635 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 7560 4365 10620 4365 10620 5265 7560 5265 7560 4365 +4 0 0 50 -1 18 12 0.0000 4 195 2565 7740 4815 struct stream_interface si[0]\001 +4 0 0 50 -1 16 12 0.0000 4 195 1725 7740 5085 enum obj_type *end\001 +4 0 0 50 -1 18 12 0.0000 4 150 1215 7650 4545 struct stream\001 +-6 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 225 4500 2745 4500 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 225 5040 2745 5040 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 225 4770 2745 4770 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 1935 5715 7740 6705 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 2520 3420 3600 4095 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 225 4230 2745 4230 2745 7020 225 7020 225 4230 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 225 6300 2745 6300 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 225 6660 2745 6660 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 1035 2205 2655 2205 2655 3645 1035 3645 1035 2205 +2 1 1 4 4 7 500 -1 -1 4.000 0 0 -1 0 0 2 + 4860 1935 4860 9225 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 7695 6435 5760 4410 +2 2 0 1 0 7 50 -1 20 0.000 0 0 -1 0 0 5 + 3600 3915 6075 3915 6075 4410 3600 4410 3600 3915 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 9450 5040 9225 5670 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 4 + 7740 6300 7695 6345 7695 6525 7740 6570 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 7560 5670 9765 5670 9765 7200 7560 7200 7560 5670 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 7650 5940 9675 5940 9675 7110 7650 7110 7650 5940 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 315 5310 2655 5310 2655 6165 315 6165 315 5310 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 7830 6840 2565 5580 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 7740 6705 9540 6705 9540 6930 7740 6930 7740 6705 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 405 5580 2565 5580 2565 5805 405 5805 405 5580 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 5 + 1 1 1.00 60.00 120.00 + 1215 3105 765 3330 720 3555 765 3915 810 4230 + 0.000 1.000 1.000 1.000 0.000 +3 0 1 1 13 7 50 -1 -1 1.000 0 1 0 3 + 5 1 1.00 60.00 120.00 + 675 7020 675 7785 900 8104 + 0.000 1.000 0.000 +3 0 1 1 13 7 50 -1 -1 1.000 0 1 0 2 + 5 1 1.00 60.00 120.00 + 7740 7200 7740 8100 + 0.000 0.000 +3 0 1 1 13 7 50 -1 -1 1.000 0 1 0 3 + 5 1 1.00 60.00 120.00 + 7605 7200 7605 8865 7740 9000 + 0.000 1.000 0.000 +4 0 0 50 -1 18 12 0.0000 4 150 885 315 4410 stack Lua\001 +4 0 0 50 -1 16 12 0.0000 4 195 1140 315 4680 stack entry 0\001 +4 0 0 50 -1 16 12 0.0000 4 195 1140 315 4950 stack entry 1\001 +4 0 0 50 -1 16 12 0.0000 4 195 1140 315 5220 stack entry 2\001 +4 0 0 50 -1 18 12 0.0000 4 195 1695 405 5490 struct hlua_socket\001 +4 0 0 50 -1 16 12 0.0000 4 195 1140 315 6570 stack entry 3\001 +4 0 0 50 -1 16 12 0.0000 4 195 1140 315 6930 stack entry 4\001 +4 1 12 50 -1 12 9 5.6723 4 135 540 3150 3735 (list)\001 +4 0 0 50 -1 18 12 0.0000 4 150 1305 1125 2430 struct session\001 +4 0 0 50 -1 16 12 0.0000 4 150 1440 1125 2655 struct task *task\001 +4 0 0 50 -1 12 12 0.0000 4 165 1560 990 8100 hlua_tcp_gc()\001 +4 0 0 50 -1 16 12 0.0000 4 195 2430 990 8295 Called just before the object\001 +4 0 0 50 -1 16 12 0.0000 4 195 840 990 8535 garbaging\001 +4 1 12 50 -1 12 9 5.5327 4 135 540 6390 4905 (list)\001 +4 0 0 50 -1 18 12 0.0000 4 195 2205 3690 4095 struct hlua_socket_com\001 +4 0 0 50 -1 16 12 0.0000 4 150 1440 3690 4320 struct task *task\001 +4 0 0 50 -1 18 12 0.0000 4 195 1200 7650 5850 struct appctx\001 +4 0 0 50 -1 18 12 0.0000 4 150 1110 7740 6120 struct <lua>\001 +4 0 0 50 -1 16 12 0.0000 4 195 1620 7740 6615 struct hlua_tcp *wr\001 +4 0 0 50 -1 16 12 0.0000 4 195 1590 7740 6390 struct hlua_tcp *rd\001 +4 0 0 50 -1 12 12 0.0000 4 165 2160 7875 9000 hlua_tcp_release()\001 +4 0 0 50 -1 16 12 0.0000 4 195 3150 7875 9195 Called when the applet is destroyed.\001 +4 0 0 50 -1 12 12 0.0000 4 165 2400 7875 8100 update_tcp_handler()\001 +4 0 0 50 -1 16 12 0.0000 4 195 2640 7875 8295 Called on each change on the \001 +4 0 0 50 -1 16 12 0.0000 4 195 1830 7875 8535 tcp connection state.\001 +4 0 0 50 -1 16 12 0.0000 4 150 1350 495 5760 struct xref *xref\001 +4 0 0 50 -1 16 12 0.0000 4 150 1350 7830 6885 struct xref *xref\001 diff --git a/doc/internals/lua_socket.pdf b/doc/internals/lua_socket.pdf Binary files differnew file mode 100644 index 0000000..e3b80ee --- /dev/null +++ b/doc/internals/lua_socket.pdf diff --git a/doc/internals/muxes.fig b/doc/internals/muxes.fig new file mode 100644 index 0000000..babdd55 --- /dev/null +++ b/doc/internals/muxes.fig @@ -0,0 +1,401 @@ +#FIG 3.2 Produced by xfig version 3.2.8b +Landscape +Center +Inches +Letter +100.00 +Single +-1 +1200 2 +0 32 #bbf2e2 +0 33 #a7ceb3 +0 34 #dae8fc +0 35 #458dba +0 36 #ffe6cc +0 37 #e9b000 +0 38 #1a1a1a +0 39 #8e8e8e +0 40 #ffc1e7 +6 4200 8700 4800 9825 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 4261 9751 4261 8751 4761 8751 4761 9751 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 4761 9751 4761 8751 4261 8751 4261 9751 4761 9751 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 4261 8850 4761 8850 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 4261 8925 4761 8925 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 4261 9000 4761 9000 +-6 +6 1425 3525 2025 4650 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 1486 4576 1486 3576 1986 3576 1986 4576 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 1986 4576 1986 3576 1486 3576 1486 4576 1986 4576 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 1486 3675 1986 3675 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 1486 3750 1986 3750 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 1486 3825 1986 3825 +-6 +6 3225 3525 3825 4650 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 3286 4576 3286 3576 3786 3576 3786 4576 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 3786 4576 3786 3576 3286 3576 3286 4576 3786 4576 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 3286 3675 3786 3675 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 3286 3750 3786 3750 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 3286 3825 3786 3825 +-6 +6 5025 3525 5625 4650 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 5086 4576 5086 3576 5586 3576 5586 4576 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 5586 4576 5586 3576 5086 3576 5086 4576 5586 4576 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 5086 3675 5586 3675 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 5086 3750 5586 3750 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 5086 3825 5586 3825 +-6 +6 6900 3525 7500 4650 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 6961 4576 6961 3576 7461 3576 7461 4576 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 7461 4576 7461 3576 6961 3576 6961 4576 7461 4576 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 6961 3675 7461 3675 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 6961 3750 7461 3750 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 6961 3825 7461 3825 +-6 +6 11925 10725 13875 11475 +2 4 0 3 0 35 50 -1 20 0.000 1 0 7 0 0 5 + 13800 11400 12000 11400 12000 10800 13800 10800 13800 11400 +4 1 0 49 -1 4 18 0.0000 4 285 1335 12900 11175 Transport\001 +-6 +6 6600 1200 10050 1800 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 6692 1261 9959 1261 9959 1761 6692 1761 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 6692 1761 9959 1761 9959 1261 6692 1261 6692 1761 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9750 1261 9750 1761 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9525 1261 9525 1761 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9300 1261 9300 1761 +4 1 0 46 -1 4 16 0.0000 4 210 1605 8025 1575 channel buf\001 +-6 +6 12375 8100 12900 8700 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 12600 8161 12600 8661 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 12425 8161 12825 8161 12825 8661 12425 8661 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 12425 8661 12825 8661 12825 8161 12425 8161 12425 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 12675 8161 12675 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 12750 8161 12750 8661 +-6 +6 11700 8100 12225 8700 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 11925 8161 11925 8661 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 11750 8161 12150 8161 12150 8661 11750 8661 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 11750 8661 12150 8661 12150 8161 11750 8161 11750 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 12000 8161 12000 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 12075 8161 12075 8661 +-6 +6 11025 8100 11550 8700 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 11250 8161 11250 8661 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 11075 8161 11475 8161 11475 8661 11075 8661 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 11075 8661 11475 8661 11475 8161 11075 8161 11075 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 11325 8161 11325 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 11400 8161 11400 8661 +-6 +6 10350 8100 10875 8700 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 10575 8161 10575 8661 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 10400 8161 10800 8161 10800 8661 10400 8661 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 10400 8661 10800 8661 10800 8161 10400 8161 10400 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 10650 8161 10650 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 10725 8161 10725 8661 +-6 +6 13050 8100 13575 8700 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 13275 8161 13275 8661 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 13100 8161 13500 8161 13500 8661 13100 8661 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 13100 8661 13500 8661 13500 8161 13100 8161 13100 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 13350 8161 13350 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 13425 8161 13425 8661 +-6 +6 13725 8100 14250 8700 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 13950 8161 13950 8661 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 13775 8161 14175 8161 14175 8661 13775 8661 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 13775 8661 14175 8661 14175 8161 13775 8161 13775 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 14025 8161 14025 8661 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 14100 8161 14100 8661 +-6 +6 11100 11700 13050 12150 +1 1 0 4 20 40 49 -1 20 0.000 1 0.0000 11400 11925 225 150 11400 11925 11625 12075 +4 0 0 49 -1 4 12 0.0000 4 165 960 11850 12000 I/O tasklet\001 +-6 +6 11100 12300 11700 12600 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 11357 12331 11357 12581 +2 1 0 4 35 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 11157 12331 11614 12331 11614 12581 11157 12581 +2 3 0 0 -1 34 49 -1 20 0.000 0 0 -1 0 0 5 + 11157 12581 11614 12581 11614 12331 11157 12331 11157 12581 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 11443 12331 11443 12581 +2 1 0 2 35 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 11529 12331 11529 12581 +-6 +1 3 0 3 0 0 49 -1 20 0.000 1 0.0000 10725 5700 75 75 10725 5700 10800 5700 +1 3 0 3 0 0 49 -1 20 0.000 1 0.0000 12750 5700 75 75 12750 5700 12825 5700 +1 3 0 3 0 0 49 -1 20 0.000 1 0.0000 13875 5700 75 75 13875 5700 13950 5700 +1 3 0 3 0 0 49 -1 20 0.000 1 0.0000 11700 5700 75 75 11700 5700 11775 5700 +1 3 0 3 0 0 49 -1 20 0.000 1 0.0000 2925 6750 75 75 2925 6750 3000 6750 +1 3 0 3 0 0 49 -1 20 0.000 1 0.0000 4950 6750 75 75 4950 6750 5025 6750 +1 3 0 3 0 0 49 -1 20 0.000 1 0.0000 6075 6750 75 75 6075 6750 6150 6750 +1 3 0 3 0 0 49 -1 20 0.000 1 0.0000 3900 6750 75 75 3900 6750 3975 6750 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 9525 4140 583 250 9525 4140 10108 3890 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 11341 4140 583 250 11341 4140 11924 3890 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 13154 4140 583 250 13154 4140 13737 3890 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 15033 4140 583 250 15033 4140 15616 3890 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 7182 5173 583 250 7182 5173 7765 4923 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 3507 5173 583 250 3507 5173 4090 4923 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 1719 5173 583 250 1719 5173 2302 4923 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 5325 5175 583 250 5325 5175 5908 4925 +1 1 0 4 10 11 45 -1 20 0.000 1 0.0000 4488 8082 612 250 4488 8082 5100 8082 +1 1 0 4 10 11 49 -1 20 0.000 1 0.0000 12333 7025 417 250 12333 7025 12750 7025 +1 1 0 4 20 40 49 -1 20 0.000 1 0.0000 12392 9240 808 210 12392 9240 13200 9240 +1 1 0 4 20 40 49 -1 20 0.000 1 0.0000 3167 9240 808 210 3167 9240 3975 9240 +1 1 0 4 37 36 49 -1 20 0.000 1 0.0000 1800 11925 225 150 1800 11925 2025 12075 +1 1 0 4 10 11 45 -1 20 0.000 1 0.0000 6600 11925 225 150 6600 11925 6825 12075 +1 1 0 4 20 40 49 -1 20 0.000 1 0.0000 8400 600 900 210 8400 600 9300 600 +2 1 1 1 0 7 49 -1 -1 4.000 1 0 -1 0 0 2 + 2550 3300 2550 6150 +2 1 1 1 0 7 49 -1 -1 4.000 1 0 -1 0 0 2 + 4500 3300 4500 6150 +2 1 1 1 0 7 49 -1 -1 4.000 1 0 -1 0 0 2 + 6300 3300 6300 6150 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 600 8025 600 12225 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 600 3150 600 1800 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 600 1500 600 150 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 1 4 + 1 1 1.00 90.00 180.00 + 1 1 1.00 90.00 180.00 + 3000 3300 3000 1425 3675 600 7500 600 +2 3 0 4 33 32 50 -1 20 0.000 0 0 -1 0 0 5 + 900 3300 900 9900 8100 9900 8100 3300 900 3300 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 0 4 + 1 1 1.00 90.00 180.00 + 3525 3525 3525 2625 4500 1500 6750 1500 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 0 3 + 1 1 1.00 90.00 180.00 + 11295 4425 11295 4725 11700 5625 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 0 3 + 1 1 1.00 90.00 180.00 + 9495 4425 9495 4725 10695 5700 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 0 3 + 1 1 1.00 90.00 180.00 + 13163 4425 13163 4725 12788 5625 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 0 3 + 1 1 1.00 90.00 180.00 + 15013 4427 15013 4725 13888 5702 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 9525 3525 9525 3825 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 13125 3525 13125 3825 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 15000 3525 15000 3825 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 5 + 1 1 1.00 90.00 180.00 + 12300 7275 12300 7725 9975 7725 9975 8400 10425 8400 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 0 3 + 1 1 1.00 90.00 180.00 + 11775 5850 12300 6450 12300 6825 +2 1 1 3 0 7 49 -1 -1 8.000 1 0 -1 0 0 3 + 11475 6150 13200 6150 13200 6825 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 0 1 3 + 1 1 1.00 90.00 180.00 + 3975 6900 4500 7650 4500 7875 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 0 1 3 + 1 1 1.00 90.00 180.00 + 3495 5475 3495 5775 3900 6675 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 0 1 3 + 1 1 1.00 90.00 180.00 + 1695 5475 1695 5775 2895 6750 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 0 1 3 + 1 1 1.00 90.00 180.00 + 7213 5477 7213 5775 6088 6752 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 1725 4875 1725 4575 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 3525 4875 3525 4575 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 5325 4875 5325 4575 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 7200 4875 7200 4575 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 0 1 2 + 1 1 1.00 90.00 180.00 + 4500 8325 4500 8721 +2 1 1 3 0 7 49 -1 -1 8.000 1 0 -1 0 0 3 + 3225 7875 3225 7350 4725 7350 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 1 2 + 1 1 1.00 90.00 180.00 + 1 1 1.00 90.00 180.00 + 3900 10800 3225 9450 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 4500 10800 4500 9750 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 1 1 3 + 1 1 1.00 90.00 180.00 + 1 1 1.00 90.00 180.00 + 12375 10800 12375 9750 12375 9450 +2 1 1 1 0 7 49 -1 -1 4.000 1 0 -1 0 0 2 + 12225 3300 12225 5025 +2 1 1 1 0 7 49 -1 -1 4.000 1 0 -1 0 0 2 + 10425 3300 10425 5025 +2 1 1 1 0 7 49 -1 -1 4.000 1 0 -1 0 0 2 + 14025 3300 14025 5025 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 4 + 1 1 1.00 90.00 180.00 + 9975 1500 10800 1500 11325 2100 11325 3825 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 1 4 + 1 1 1.00 90.00 180.00 + 1 1 1.00 90.00 180.00 + 9300 600 11175 600 11775 1275 11775 3300 +2 3 0 4 33 32 50 -1 20 0.000 0 0 -1 0 0 5 + 8700 3300 8700 9900 15900 9900 15900 3300 8700 3300 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 0 1 5 + 1 1 1.00 90.00 180.00 + 13200 10800 13200 10200 14625 9750 14625 8400 14175 8400 +2 1 0 3 0 7 49 -1 -1 0.000 1 0 -1 0 1 3 + 1 1 1.00 90.00 180.00 + 5325 5475 5325 5775 4950 6675 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 600 5400 600 3300 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 0 0 2 + 600 7800 600 5700 +2 4 0 3 0 35 50 -1 20 0.000 1 0 7 0 0 5 + 5400 11400 3600 11400 3600 10800 5400 10800 5400 11400 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 12150 8400 12450 8400 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 11475 8400 11775 8400 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 10800 8400 11100 8400 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 12825 8400 13125 8400 +2 1 0 3 0 7 49 -1 -1 8.000 1 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 13500 8400 13800 8400 +2 4 0 3 0 35 50 -1 20 0.000 1 0 7 0 0 5 + 2100 12600 1575 12600 1575 12300 2100 12300 2100 12600 +2 4 0 3 33 32 50 -1 20 0.000 1 0 7 0 0 5 + 6900 12600 6375 12600 6375 12300 6900 12300 6900 12600 +4 1 0 49 -1 4 14 1.5708 4 225 1335 450 825 application\001 +4 0 0 49 -1 4 12 1.5708 4 180 2595 2850 3225 mux->subscribe(SUB_RECV)\001 +4 1 0 46 -1 4 16 1.5708 4 210 645 3600 4200 rxbuf\001 +4 1 0 46 -1 4 16 1.5708 4 210 615 4575 9375 dbuf\001 +4 1 0 49 -1 4 16 0.0000 4 210 600 12300 7125 MUX\001 +4 1 0 44 -1 4 16 0.0000 4 210 945 4500 8175 DEMUX\001 +4 2 0 49 -1 4 12 0.0000 4 150 915 3600 8100 Stream ID\001 +4 0 0 49 -1 4 12 0.0000 4 150 915 12825 7125 Stream ID\001 +4 2 0 49 -1 4 12 0.0000 4 180 1635 3300 10125 tasklet_wakeup()\001 +4 2 0 49 -1 4 12 0.0000 4 180 1635 12150 10125 tasklet_wakeup()\001 +4 2 0 49 -1 4 12 0.0000 4 180 1470 11175 3150 mux->snd_buf()\001 +4 0 0 49 -1 4 12 0.0000 4 180 1425 3675 3225 mux->rcv_buf()\001 +4 0 0 49 -1 4 12 0.0000 4 180 1920 13425 10575 xprt->snd_buf(mbuf)\001 +4 0 0 49 -1 4 12 0.0000 4 180 1830 4725 10500 xprt->rcv_buf(dbuf)\001 +4 1 0 49 -1 4 12 0.0000 4 150 3105 8400 2100 HTX contents when mode==HTTP\001 +4 2 0 49 -1 4 12 0.0000 4 180 1635 7500 450 tasklet_wakeup()\001 +4 0 0 49 -1 4 12 0.0000 4 180 1635 9300 450 tasklet_wakeup()\001 +4 1 38 48 -1 4 12 0.0000 4 150 750 9534 4200 encode\001 +4 1 38 48 -1 4 12 0.0000 4 150 750 11325 4200 encode\001 +4 1 38 48 -1 4 12 0.0000 4 150 750 13134 4200 encode\001 +4 1 38 48 -1 4 12 0.0000 4 150 750 15009 4200 encode\001 +4 1 38 48 -1 4 12 0.0000 4 150 765 1725 5250 decode\001 +4 1 38 48 -1 4 12 0.0000 4 150 765 3525 5250 decode\001 +4 1 38 48 -1 4 12 0.0000 4 150 765 5325 5250 decode\001 +4 1 38 48 -1 4 12 0.0000 4 150 765 7200 5250 decode\001 +4 1 38 48 -1 4 12 0.0000 4 180 1035 12375 9300 mux_io_cb\001 +4 0 0 49 -1 4 12 1.5708 4 180 2580 12075 3225 mux->subscribe(SUB_SEND)\001 +4 1 0 49 -1 4 14 1.5708 4 180 1425 450 4500 mux streams\001 +4 1 0 49 -1 4 14 1.5708 4 135 1980 450 6750 mux=conn->mux\001 +4 1 0 49 -1 4 18 0.0000 4 285 1335 4500 11175 Transport\001 +4 1 0 46 -1 4 16 0.0000 4 210 690 14625 8175 mbuf\001 +4 1 38 48 -1 4 12 0.0000 4 180 1035 3159 9300 mux_io_cb\001 +4 0 0 49 -1 4 12 0.0000 4 195 2805 2250 12000 encoding/decoding function\001 +4 0 0 49 -1 4 12 0.0000 4 180 1365 2250 12525 transport layer\001 +4 0 0 49 -1 4 12 0.0000 4 180 2445 7050 12525 multiplexer (MUX/DEMUX)\001 +4 0 0 49 -1 4 12 0.0000 4 195 2655 7050 12000 general processing function\001 +4 0 0 49 -1 4 12 0.0000 4 180 2820 11775 12525 stream buffer (byte-level FIFO)\001 +4 2 0 49 -1 4 12 0.0000 4 180 2550 3675 10725 xprt->subscribe(SUB_RECV)\001 +4 2 0 49 -1 4 12 0.0000 4 180 2535 12225 10725 xprt->subscribe(SUB_SEND)\001 +4 1 0 49 -1 4 14 1.5708 4 180 780 450 2550 stconn\001 +4 1 0 49 -1 4 12 1.5708 4 195 2010 900 1125 (eg: checks, streams)\001 +4 1 0 49 -1 4 14 1.5708 4 180 3720 450 10125 connection = sc->sedesc->conn\001 +4 0 0 49 -1 4 12 0.0000 4 150 600 12225 225 Notes:\001 +4 0 0 49 -1 4 12 0.0000 4 180 2220 12975 675 snd_buf() will move the\001 +4 0 0 49 -1 4 12 0.0000 4 180 2310 12975 975 buffer (zero-copy) when\001 +4 0 0 49 -1 4 12 0.0000 4 180 2310 12975 1275 the destination is empty.\001 +4 0 0 49 -1 4 12 0.0000 4 180 2220 12825 1650 - the application is also\001 +4 0 0 49 -1 4 12 0.0000 4 180 2700 12975 2250 is sc->app with sc->app_ops\001 +4 0 0 49 -1 4 12 0.0000 4 180 2490 12825 2550 - transport layers (xprt) are\001 +4 0 0 49 -1 4 12 0.0000 4 180 2250 12975 2775 stackable. conn->xprt is\001 +4 0 0 49 -1 4 12 0.0000 4 180 1635 12975 3000 the topmost one.\001 +4 0 0 49 -1 4 12 0.0000 4 180 2400 12975 1950 called the app layer and\001 +4 0 0 49 -1 4 12 0.0000 4 180 1995 12825 375 - mux->rcv_buf() and\001 +4 1 38 48 -1 4 12 0.0000 4 180 1440 8409 657 sc_conn_io_cb\001 diff --git a/doc/internals/muxes.pdf b/doc/internals/muxes.pdf Binary files differnew file mode 100644 index 0000000..54f8cc7 --- /dev/null +++ b/doc/internals/muxes.pdf diff --git a/doc/internals/muxes.png b/doc/internals/muxes.png Binary files differnew file mode 100644 index 0000000..a58f42f --- /dev/null +++ b/doc/internals/muxes.png diff --git a/doc/internals/muxes.svg b/doc/internals/muxes.svg new file mode 100644 index 0000000..3feaa4d --- /dev/null +++ b/doc/internals/muxes.svg @@ -0,0 +1,911 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Creator: fig2dev Version 3.2.8b --> +<!-- CreationDate: 2022-05-27 11:37:43 --> +<!-- Magnification: 1 --> +<svg xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink" + width="942pt" height="755pt" + viewBox="254 60 15690 12573"> +<g fill="none"> +<!-- Line --> +<rect x="12000" y="10800" width="1800" height="600" rx="105" fill="#458dba" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<polygon points=" 900,3300 900,9900 8100,9900 8100,3300" fill="#bbf2e2" + stroke="#a7ceb3" stroke-width="45px"/> +<!-- Line --> +<polygon points=" 8700,3300 8700,9900 15900,9900 15900,3300" fill="#bbf2e2" + stroke="#a7ceb3" stroke-width="45px"/> +<!-- Line --> +<rect x="3600" y="10800" width="1800" height="600" rx="105" fill="#458dba" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<rect x="1575" y="12300" width="525" height="300" rx="105" fill="#458dba" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<rect x="6375" y="12300" width="525" height="300" rx="105" fill="#bbf2e2" + stroke="#a7ceb3" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<polygon points=" 4761,9751 4761,8751 4261,8751 4261,9751" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 1986,4576 1986,3576 1486,3576 1486,4576" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 3786,4576 3786,3576 3286,3576 3286,4576" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 5586,4576 5586,3576 5086,3576 5086,4576" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 7461,4576 7461,3576 6961,3576 6961,4576" fill="#dae8fc"/> +<!-- Text --> +<text xml:space="preserve" x="12900" y="11175" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="216" text-anchor="middle">Transport</text> +<!-- Line --> +<polygon points=" 6692,1761 9959,1761 9959,1261 6692,1261" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 12425,8661 12825,8661 12825,8161 12425,8161" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 11750,8661 12150,8661 12150,8161 11750,8161" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 11075,8661 11475,8661 11475,8161 11075,8161" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 10400,8661 10800,8661 10800,8161 10400,8161" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 13100,8661 13500,8661 13500,8161 13100,8161" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 13775,8661 14175,8661 14175,8161 13775,8161" fill="#dae8fc"/> +<!-- Ellipse --> +<ellipse cx="11400" cy="11925" rx="225" ry="150" fill="#ffc1e7" + stroke="#d10000" stroke-width="45px"/> +<!-- Text --> +<text xml:space="preserve" x="11850" y="12000" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">I/O tasklet</text> +<!-- Line --> +<polygon points=" 11157,12581 11614,12581 11614,12331 11157,12331" fill="#dae8fc"/> +<!-- Circle --> +<circle cx="10725" cy="5700" r="75" fill="#000000" + stroke="#000000" stroke-width="30px"/> +<!-- Circle --> +<circle cx="12750" cy="5700" r="75" fill="#000000" + stroke="#000000" stroke-width="30px"/> +<!-- Circle --> +<circle cx="13875" cy="5700" r="75" fill="#000000" + stroke="#000000" stroke-width="30px"/> +<!-- Circle --> +<circle cx="11700" cy="5700" r="75" fill="#000000" + stroke="#000000" stroke-width="30px"/> +<!-- Circle --> +<circle cx="2925" cy="6750" r="75" fill="#000000" + stroke="#000000" stroke-width="30px"/> +<!-- Circle --> +<circle cx="4950" cy="6750" r="75" fill="#000000" + stroke="#000000" stroke-width="30px"/> +<!-- Circle --> +<circle cx="6075" cy="6750" r="75" fill="#000000" + stroke="#000000" stroke-width="30px"/> +<!-- Circle --> +<circle cx="3900" cy="6750" r="75" fill="#000000" + stroke="#000000" stroke-width="30px"/> +<!-- Ellipse --> +<ellipse cx="9525" cy="4140" rx="583" ry="250" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="11341" cy="4140" rx="583" ry="250" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="13154" cy="4140" rx="583" ry="250" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="15033" cy="4140" rx="583" ry="250" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="7182" cy="5173" rx="583" ry="250" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="3507" cy="5173" rx="583" ry="250" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="1719" cy="5173" rx="583" ry="250" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="5325" cy="5175" rx="583" ry="250" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="12333" cy="7025" rx="417" ry="250" fill="#87cfff" + stroke="#0000d1" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="12392" cy="9240" rx="808" ry="210" fill="#ffc1e7" + stroke="#d10000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="3167" cy="9240" rx="808" ry="210" fill="#ffc1e7" + stroke="#d10000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="1800" cy="11925" rx="225" ry="150" fill="#ffe6cc" + stroke="#e9b000" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="8400" cy="600" rx="900" ry="210" fill="#ffc1e7" + stroke="#d10000" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 2550,3300 2550,6150" + stroke="#000000" stroke-width="8px" stroke-linejoin="round" stroke-dasharray="40 40"/> +<!-- Line --> +<polyline points=" 4500,3300 4500,6150" + stroke="#000000" stroke-width="8px" stroke-linejoin="round" stroke-dasharray="40 40"/> +<!-- Line --> +<polyline points=" 6300,3300 6300,6150" + stroke="#000000" stroke-width="8px" stroke-linejoin="round" stroke-dasharray="40 40"/> +<!-- Line --> +<defs> +<clipPath id="cp0"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 645,12029 555,12029 582,12243 618,12243z"/> +</clipPath> +</defs> +<polyline points=" 600,8025 600,12225" clip-path="url(#cp0)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 600,12225 --> +<polygon points=" 555,12029 600,12209 645,12029 555,12029" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp1"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 555,1996 645,1996 618,1782 582,1782z"/> +</clipPath> +</defs> +<polyline points=" 600,3150 600,1800" clip-path="url(#cp1)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 600,1800 --> +<polygon points=" 645,1996 600,1816 555,1996 645,1996" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp2"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 555,346 645,346 618,132 582,132z"/> +</clipPath> +</defs> +<polyline points=" 600,1500 600,150" clip-path="url(#cp2)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 600,150 --> +<polygon points=" 645,346 600,166 555,346 645,346" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp3"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 7304,555 7304,645 7518,618 7518,582z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 3000,3300 3000,1425 3675,600 7500,600" clip-path="url(#cp3)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7500,600 --> +<polygon points=" 7304,645 7484,600 7304,555 7304,645" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Backward arrow to point 3000,3300 --> +<polygon points=" 2955,3104 3000,3284 3045,3104 2955,3104" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp4"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 6554,1455 6554,1545 6768,1518 6768,1482z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 3525,3525 3525,2625 4500,1500 6750,1500" clip-path="url(#cp4)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 6750,1500 --> +<polygon points=" 6554,1545 6734,1500 6554,1455 6554,1545" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp5"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 11661,5428 11578,5465 11691,5649 11724,5634z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 11295,4425 11295,4725 11700,5625" clip-path="url(#cp5)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 11700,5625 --> +<polygon points=" 11578,5465 11693,5610 11661,5428 11578,5465" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp6"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 10571,5541 10514,5611 10698,5725 10720,5697z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 9495,4425 9495,4725 10695,5700" clip-path="url(#cp6)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 10695,5700 --> +<polygon points=" 10514,5611 10682,5690 10571,5541 10514,5611" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp7"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12905,5461 12822,5427 12764,5635 12798,5649z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 13163,4425 13163,4725 12788,5625" clip-path="url(#cp7)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 12788,5625 --> +<polygon points=" 12822,5427 12794,5610 12905,5461 12822,5427" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp8"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 14066,5607 14007,5539 13863,5700 13886,5727z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 15013,4427 15013,4725 13888,5702" clip-path="url(#cp8)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 13888,5702 --> +<polygon points=" 14007,5539 13900,5691 14066,5607 14007,5539" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp9"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 9555,3689 9495,3689 9507,3843 9543,3843z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 9525,3525 9525,3825" clip-path="url(#cp9)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 9525,3825 --> +<polygon points=" 9495,3689 9525,3809 9555,3689 9495,3689" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp10"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 13155,3689 13095,3689 13107,3843 13143,3843z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 13125,3525 13125,3825" clip-path="url(#cp10)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 13125,3825 --> +<polygon points=" 13095,3689 13125,3809 13155,3689 13095,3689" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp11"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 15030,3689 14970,3689 14982,3843 15018,3843z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 15000,3525 15000,3825" clip-path="url(#cp11)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 15000,3825 --> +<polygon points=" 14970,3689 15000,3809 15030,3689 14970,3689" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp12"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 10229,8355 10229,8445 10443,8418 10443,8382z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 12300,7275 12300,7725 9975,7725 9975,8400 10425,8400" clip-path="url(#cp12)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 10425,8400 --> +<polygon points=" 10229,8445 10409,8400 10229,8355 10229,8445" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp13"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12345,6629 12255,6629 12282,6843 12318,6843z + M 3045,3104 2955,3104 2982,3318 3018,3318z"/> +</clipPath> +</defs> +<polyline points=" 11775,5850 12300,6450 12300,6825" clip-path="url(#cp13)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 12300,6825 --> +<polygon points=" 12255,6629 12300,6809 12345,6629 12255,6629" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<polyline points=" 11475,6150 13200,6150 13200,6825" + stroke="#000000" stroke-width="30px" stroke-linejoin="round" stroke-dasharray="80 80"/> +<!-- Line --> +<defs> +<clipPath id="cp14"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12345,6629 12255,6629 12282,6843 12318,6843z + M 4051,7087 4124,7035 3979,6875 3950,6896z"/> +</clipPath> +</defs> +<polyline points=" 3975,6900 4500,7650 4500,7875" clip-path="url(#cp14)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Backward arrow to point 3975,6900 --> +<polygon points=" 4124,7035 3984,6913 4051,7087 4124,7035" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp15"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12345,6629 12255,6629 12282,6843 12318,6843z + M 3450,5671 3540,5671 3513,5457 3477,5457z"/> +</clipPath> +</defs> +<polyline points=" 3495,5475 3495,5775 3900,6675" clip-path="url(#cp15)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Backward arrow to point 3495,5475 --> +<polygon points=" 3540,5671 3495,5491 3450,5671 3540,5671" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp16"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12345,6629 12255,6629 12282,6843 12318,6843z + M 1650,5671 1740,5671 1713,5457 1677,5457z"/> +</clipPath> +</defs> +<polyline points=" 1695,5475 1695,5775 2895,6750" clip-path="url(#cp16)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Backward arrow to point 1695,5475 --> +<polygon points=" 1740,5671 1695,5491 1650,5671 1740,5671" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp17"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12345,6629 12255,6629 12282,6843 12318,6843z + M 7168,5673 7258,5673 7231,5459 7195,5459z"/> +</clipPath> +</defs> +<polyline points=" 7213,5477 7213,5775 6088,6752" clip-path="url(#cp17)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Backward arrow to point 7213,5477 --> +<polygon points=" 7258,5673 7213,5493 7168,5673 7258,5673" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp18"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 1695,4711 1755,4711 1743,4557 1707,4557z + M 7168,5673 7258,5673 7231,5459 7195,5459z"/> +</clipPath> +</defs> +<polyline points=" 1725,4875 1725,4575" clip-path="url(#cp18)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 1725,4575 --> +<polygon points=" 1755,4711 1725,4591 1695,4711 1755,4711" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp19"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 3495,4711 3555,4711 3543,4557 3507,4557z + M 7168,5673 7258,5673 7231,5459 7195,5459z"/> +</clipPath> +</defs> +<polyline points=" 3525,4875 3525,4575" clip-path="url(#cp19)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 3525,4575 --> +<polygon points=" 3555,4711 3525,4591 3495,4711 3555,4711" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp20"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 5295,4711 5355,4711 5343,4557 5307,4557z + M 7168,5673 7258,5673 7231,5459 7195,5459z"/> +</clipPath> +</defs> +<polyline points=" 5325,4875 5325,4575" clip-path="url(#cp20)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 5325,4575 --> +<polygon points=" 5355,4711 5325,4591 5295,4711 5355,4711" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp21"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 7170,4711 7230,4711 7218,4557 7182,4557z + M 7168,5673 7258,5673 7231,5459 7195,5459z"/> +</clipPath> +</defs> +<polyline points=" 7200,4875 7200,4575" clip-path="url(#cp21)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7200,4575 --> +<polygon points=" 7230,4711 7200,4591 7170,4711 7230,4711" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp22"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 7170,4711 7230,4711 7218,4557 7182,4557z + M 4455,8521 4545,8521 4518,8307 4482,8307z"/> +</clipPath> +</defs> +<polyline points=" 4500,8325 4500,8721" clip-path="url(#cp22)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Backward arrow to point 4500,8325 --> +<polygon points=" 4545,8521 4500,8341 4455,8521 4545,8521" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<polyline points=" 3225,7875 3225,7350 4725,7350" + stroke="#000000" stroke-width="30px" stroke-linejoin="round" stroke-dasharray="80 80"/> +<!-- Line --> +<defs> +<clipPath id="cp23"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 3272,9646 3353,9605 3233,9426 3201,9442z + M 3853,10604 3772,10645 3892,10824 3924,10808z"/> +</clipPath> +</defs> +<polyline points=" 3900,10800 3225,9450" clip-path="url(#cp23)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 3225,9450 --> +<polygon points=" 3353,9605 3232,9464 3272,9646 3353,9605" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Backward arrow to point 3900,10800 --> +<polygon points=" 3772,10645 3893,10786 3853,10604 3772,10645" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp24"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 4455,9946 4545,9946 4518,9732 4482,9732z + M 3853,10604 3772,10645 3892,10824 3924,10808z"/> +</clipPath> +</defs> +<polyline points=" 4500,10800 4500,9750" clip-path="url(#cp24)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 4500,9750 --> +<polygon points=" 4545,9946 4500,9766 4455,9946 4545,9946" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp25"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12330,9646 12420,9646 12393,9432 12357,9432z + M 12420,10604 12330,10604 12357,10818 12393,10818z"/> +</clipPath> +</defs> +<polyline points=" 12375,10800 12375,9750 12375,9450" clip-path="url(#cp25)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 12375,9450 --> +<polygon points=" 12420,9646 12375,9466 12330,9646 12420,9646" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Backward arrow to point 12375,10800 --> +<polygon points=" 12330,10604 12375,10784 12420,10604 12330,10604" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<polyline points=" 12225,3300 12225,5025" + stroke="#000000" stroke-width="8px" stroke-linejoin="round" stroke-dasharray="40 40"/> +<!-- Line --> +<polyline points=" 10425,3300 10425,5025" + stroke="#000000" stroke-width="8px" stroke-linejoin="round" stroke-dasharray="40 40"/> +<!-- Line --> +<polyline points=" 14025,3300 14025,5025" + stroke="#000000" stroke-width="8px" stroke-linejoin="round" stroke-dasharray="40 40"/> +<!-- Line --> +<defs> +<clipPath id="cp26"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 11370,3629 11280,3629 11307,3843 11343,3843z + M 12420,10604 12330,10604 12357,10818 12393,10818z"/> +</clipPath> +</defs> +<polyline points=" 9975,1500 10800,1500 11325,2100 11325,3825" clip-path="url(#cp26)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 11325,3825 --> +<polygon points=" 11280,3629 11325,3809 11370,3629 11280,3629" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp27"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 11820,3104 11730,3104 11757,3318 11793,3318z + M 9496,645 9496,555 9282,582 9282,618z"/> +</clipPath> +</defs> +<polyline points=" 9300,600 11175,600 11775,1275 11775,3300" clip-path="url(#cp27)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 11775,3300 --> +<polygon points=" 11730,3104 11775,3284 11820,3104 11730,3104" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Backward arrow to point 9300,600 --> +<polygon points=" 9496,555 9316,600 9496,645 9496,555" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp28"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 11820,3104 11730,3104 11757,3318 11793,3318z + M 13245,10604 13155,10604 13182,10818 13218,10818z"/> +</clipPath> +</defs> +<polyline points=" 13200,10800 13200,10200 14625,9750 14625,8400 14175,8400" clip-path="url(#cp28)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Backward arrow to point 13200,10800 --> +<polygon points=" 13155,10604 13200,10784 13245,10604 13155,10604" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp29"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 11820,3104 11730,3104 11757,3318 11793,3318z + M 5280,5671 5370,5671 5343,5457 5307,5457z"/> +</clipPath> +</defs> +<polyline points=" 5325,5475 5325,5775 4950,6675" clip-path="url(#cp29)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Backward arrow to point 5325,5475 --> +<polygon points=" 5370,5671 5325,5491 5280,5671 5370,5671" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp30"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 555,3496 645,3496 618,3282 582,3282z + M 5280,5671 5370,5671 5343,5457 5307,5457z"/> +</clipPath> +</defs> +<polyline points=" 600,5400 600,3300" clip-path="url(#cp30)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 600,3300 --> +<polygon points=" 645,3496 600,3316 555,3496 645,3496" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<polyline points=" 600,7800 600,5700" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<defs> +<clipPath id="cp31"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12314,8370 12314,8430 12468,8418 12468,8382z + M 5280,5671 5370,5671 5343,5457 5307,5457z"/> +</clipPath> +</defs> +<polyline points=" 12150,8400 12450,8400" clip-path="url(#cp31)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 12450,8400 --> +<polygon points=" 12314,8430 12434,8400 12314,8370 12314,8430" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp32"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 11639,8370 11639,8430 11793,8418 11793,8382z + M 5280,5671 5370,5671 5343,5457 5307,5457z"/> +</clipPath> +</defs> +<polyline points=" 11475,8400 11775,8400" clip-path="url(#cp32)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 11775,8400 --> +<polygon points=" 11639,8430 11759,8400 11639,8370 11639,8430" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp33"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 10964,8370 10964,8430 11118,8418 11118,8382z + M 5280,5671 5370,5671 5343,5457 5307,5457z"/> +</clipPath> +</defs> +<polyline points=" 10800,8400 11100,8400" clip-path="url(#cp33)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 11100,8400 --> +<polygon points=" 10964,8430 11084,8400 10964,8370 10964,8430" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp34"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 12989,8370 12989,8430 13143,8418 13143,8382z + M 5280,5671 5370,5671 5343,5457 5307,5457z"/> +</clipPath> +</defs> +<polyline points=" 12825,8400 13125,8400" clip-path="url(#cp34)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 13125,8400 --> +<polygon points=" 12989,8430 13109,8400 12989,8370 12989,8430" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp35"> + <path clip-rule="evenodd" d="M 254,60 H 15944 V 12633 H 254 z + M 13664,8370 13664,8430 13818,8418 13818,8382z + M 5280,5671 5370,5671 5343,5457 5307,5457z"/> +</clipPath> +</defs> +<polyline points=" 13500,8400 13800,8400" clip-path="url(#cp35)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 13800,8400 --> +<polygon points=" 13664,8430 13784,8400 13664,8370 13664,8430" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Text --> +<g transform="translate(450,825) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="168" text-anchor="middle">application</text> +</g><!-- Text --> +<g transform="translate(2850,3225) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">mux->subscribe(SUB_RECV)</text> +</g><!-- Text --> +<text xml:space="preserve" x="12300" y="7125" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">MUX</text> +<!-- Text --> +<text xml:space="preserve" x="3600" y="8100" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="end">Stream ID</text> +<!-- Text --> +<text xml:space="preserve" x="12825" y="7125" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">Stream ID</text> +<!-- Text --> +<text xml:space="preserve" x="3300" y="10125" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="end">tasklet_wakeup()</text> +<!-- Text --> +<text xml:space="preserve" x="12150" y="10125" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="end">tasklet_wakeup()</text> +<!-- Text --> +<text xml:space="preserve" x="11175" y="3150" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="end">mux->snd_buf()</text> +<!-- Text --> +<text xml:space="preserve" x="3675" y="3225" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">mux->rcv_buf()</text> +<!-- Text --> +<text xml:space="preserve" x="13425" y="10575" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">xprt->snd_buf(mbuf)</text> +<!-- Text --> +<text xml:space="preserve" x="4725" y="10500" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">xprt->rcv_buf(dbuf)</text> +<!-- Text --> +<text xml:space="preserve" x="8400" y="2100" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">HTX contents when mode==HTTP</text> +<!-- Text --> +<text xml:space="preserve" x="7500" y="450" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="end">tasklet_wakeup()</text> +<!-- Text --> +<text xml:space="preserve" x="9300" y="450" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">tasklet_wakeup()</text> +<!-- Text --> +<g transform="translate(12075,3225) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">mux->subscribe(SUB_SEND)</text> +</g><!-- Text --> +<g transform="translate(450,4500) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="168" text-anchor="middle">mux streams</text> +</g><!-- Text --> +<g transform="translate(450,6750) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="168" text-anchor="middle">mux=conn->mux</text> +</g><!-- Text --> +<text xml:space="preserve" x="4500" y="11175" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="216" text-anchor="middle">Transport</text> +<!-- Text --> +<text xml:space="preserve" x="2250" y="12000" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">encoding/decoding function</text> +<!-- Text --> +<text xml:space="preserve" x="2250" y="12525" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">transport layer</text> +<!-- Text --> +<text xml:space="preserve" x="7050" y="12525" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">multiplexer (MUX/DEMUX)</text> +<!-- Text --> +<text xml:space="preserve" x="7050" y="12000" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">general processing function</text> +<!-- Text --> +<text xml:space="preserve" x="11775" y="12525" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">stream buffer (byte-level FIFO)</text> +<!-- Text --> +<text xml:space="preserve" x="3675" y="10725" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="end">xprt->subscribe(SUB_RECV)</text> +<!-- Text --> +<text xml:space="preserve" x="12225" y="10725" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="end">xprt->subscribe(SUB_SEND)</text> +<!-- Text --> +<g transform="translate(450,2550) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="168" text-anchor="middle">stconn</text> +</g><!-- Text --> +<g transform="translate(900,1125) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">(eg: checks, streams)</text> +</g><!-- Text --> +<g transform="translate(450,10125) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="168" text-anchor="middle">connection = sc->sedesc->conn</text> +</g><!-- Text --> +<text xml:space="preserve" x="12225" y="225" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">Notes:</text> +<!-- Text --> +<text xml:space="preserve" x="12975" y="675" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">snd_buf() will move the</text> +<!-- Text --> +<text xml:space="preserve" x="12975" y="975" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">buffer (zero-copy) when</text> +<!-- Text --> +<text xml:space="preserve" x="12975" y="1275" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">the destination is empty.</text> +<!-- Text --> +<text xml:space="preserve" x="12825" y="1650" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">- the application is also</text> +<!-- Text --> +<text xml:space="preserve" x="12975" y="2250" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">is sc->app with sc->app_ops</text> +<!-- Text --> +<text xml:space="preserve" x="12825" y="2550" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">- transport layers (xprt) are</text> +<!-- Text --> +<text xml:space="preserve" x="12975" y="2775" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">stackable. conn->xprt is</text> +<!-- Text --> +<text xml:space="preserve" x="12975" y="3000" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">the topmost one.</text> +<!-- Text --> +<text xml:space="preserve" x="12975" y="1950" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">called the app layer and</text> +<!-- Text --> +<text xml:space="preserve" x="12825" y="375" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="start">- mux->rcv_buf() and</text> +<!-- Line --> +<polyline points=" 4261,9751 4261,8751 4761,8751 4761,9751" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 1486,4576 1486,3576 1986,3576 1986,4576" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 3286,4576 3286,3576 3786,3576 3786,4576" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 5086,4576 5086,3576 5586,3576 5586,4576" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 6961,4576 6961,3576 7461,3576 7461,4576" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 6692,1261 9959,1261 9959,1761 6692,1761" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 12425,8161 12825,8161 12825,8661 12425,8661" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 11750,8161 12150,8161 12150,8661 11750,8661" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 11075,8161 11475,8161 11475,8661 11075,8661" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 10400,8161 10800,8161 10800,8661 10400,8661" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 13100,8161 13500,8161 13500,8661 13100,8661" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 13775,8161 14175,8161 14175,8661 13775,8661" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 11157,12331 11614,12331 11614,12581 11157,12581" + stroke="#458dba" stroke-width="45px"/> +<!-- Text --> +<text xml:space="preserve" x="9534" y="4200" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">encode</text> +<!-- Text --> +<text xml:space="preserve" x="11325" y="4200" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">encode</text> +<!-- Text --> +<text xml:space="preserve" x="13134" y="4200" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">encode</text> +<!-- Text --> +<text xml:space="preserve" x="15009" y="4200" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">encode</text> +<!-- Text --> +<text xml:space="preserve" x="1725" y="5250" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">decode</text> +<!-- Text --> +<text xml:space="preserve" x="3525" y="5250" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">decode</text> +<!-- Text --> +<text xml:space="preserve" x="5325" y="5250" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">decode</text> +<!-- Text --> +<text xml:space="preserve" x="7200" y="5250" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">decode</text> +<!-- Text --> +<text xml:space="preserve" x="12375" y="9300" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">mux_io_cb</text> +<!-- Text --> +<text xml:space="preserve" x="3159" y="9300" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">mux_io_cb</text> +<!-- Text --> +<text xml:space="preserve" x="8409" y="657" fill="#1a1a1a" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="144" text-anchor="middle">sc_conn_io_cb</text> +<!-- Line --> +<polyline points=" 4261,8850 4761,8850" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 4261,8925 4761,8925" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 4261,9000 4761,9000" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 1486,3675 1986,3675" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 1486,3750 1986,3750" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 1486,3825 1986,3825" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 3286,3675 3786,3675" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 3286,3750 3786,3750" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 3286,3825 3786,3825" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 5086,3675 5586,3675" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 5086,3750 5586,3750" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 5086,3825 5586,3825" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 6961,3675 7461,3675" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 6961,3750 7461,3750" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 6961,3825 7461,3825" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9750,1261 9750,1761" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9525,1261 9525,1761" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9300,1261 9300,1761" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 12600,8161 12600,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 12675,8161 12675,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 12750,8161 12750,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 11925,8161 11925,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 12000,8161 12000,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 12075,8161 12075,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 11250,8161 11250,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 11325,8161 11325,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 11400,8161 11400,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 10575,8161 10575,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 10650,8161 10650,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 10725,8161 10725,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 13275,8161 13275,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 13350,8161 13350,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 13425,8161 13425,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 13950,8161 13950,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 14025,8161 14025,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 14100,8161 14100,8661" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 11357,12331 11357,12581" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 11443,12331 11443,12581" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 11529,12331 11529,12581" + stroke="#458dba" stroke-width="15px"/> +<!-- Text --> +<text xml:space="preserve" x="8025" y="1575" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">channel buf</text> +<!-- Text --> +<g transform="translate(3600,4200) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">rxbuf</text> +</g><!-- Text --> +<g transform="translate(4575,9375) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">dbuf</text> +</g><!-- Text --> +<text xml:space="preserve" x="14625" y="8175" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">mbuf</text> +<!-- Ellipse --> +<ellipse cx="4488" cy="8082" rx="612" ry="250" fill="#87cfff" + stroke="#0000d1" stroke-width="45px"/> +<!-- Ellipse --> +<ellipse cx="6600" cy="11925" rx="225" ry="150" fill="#87cfff" + stroke="#0000d1" stroke-width="45px"/> +<!-- Text --> +<text xml:space="preserve" x="4500" y="8175" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">DEMUX</text> +</g> +</svg> diff --git a/doc/internals/naming.txt b/doc/internals/naming.txt new file mode 100644 index 0000000..0c3d5b5 --- /dev/null +++ b/doc/internals/naming.txt @@ -0,0 +1,54 @@ +Naming rules for manipulated objects and structures. + +Previously, there were ambiguities between sessions, transactions and requests, +as well as in the way responses are noted ("resp", "rep", "rsp"). + +Here is a proposal for a better naming scheme. + +The "session" is above the transport level, which means at ISO layer 5. +We can talk about "http sessions" when we consider the entity which lives +between the accept() and the close(), or the connect() and the close(). + +=> This demonstrates that it is not possible to have the same http session from + the client to the server. + +A session can carry one or multiple "transactions", which are each composed of +one "request" and zero or one "response". Both "request" and "response" are +described in RFC2616 as "HTTP messages". RFC2616 also seldom references the +word "transaction" without explicitly defining it. + +An "HTTP message" is composed of a "start line" which can be either a +"request line" or a "status line", followed by a number of "message headers" +which can be either "request headers" or "response headers", and an "entity", +itself composed of "entity headers" and an "entity body".Most probably, +"message headers" and "entity headers" will always be processed together as +"headers", while the "entity body" will design the payload. + +We must try to always use the same abbreviations when naming objects. Here are +a few common ones : + + - txn : transaction + - req : request + - rtr : response to request + - msg : message + - hdr : header + - ent : entity + - bdy : body + - sts : status + - stt : state + - idx : index + - cli : client + - srv : server + - svc : service + - ses : session + - tsk : task + +Short names for unions or cascaded structs : + - sl : start line + - sl.rq : request line + - sl.st : status line + - cl : client + - px : proxy + - sv : server + - st : state / status + diff --git a/doc/internals/notes-layers.txt b/doc/internals/notes-layers.txt new file mode 100644 index 0000000..541c125 --- /dev/null +++ b/doc/internals/notes-layers.txt @@ -0,0 +1,330 @@ +2018-02-21 - Layering in haproxy 1.9 +------------------------------------ + +2 main zones : + - application : reads from conn_streams, writes to conn_streams, often uses + streams + + - connection : receives data from the network, presented into buffers + available via conn_streams, sends data to the network + + +The connection zone contains multiple layers which behave independently in each +direction. The Rx direction is activated upon callbacks from the lower layers. +The Tx direction is activated recursively from the upper layers. Between every +two layers there may be a buffer, in each direction. When a buffer is full +either in Tx or Rx direction, this direction is paused from the network layer +and the location where the congestion is encountered. Upon end of congestion +(cs_recv() from the upper layer, of sendto() at the lower layers), a +tasklet_wakeup() is performed on the blocked layer so that suspended operations +can be resumed. In this case, the Rx side restarts propagating data upwards +from the lowest blocked level, while the Tx side restarts propagating data +downwards from the highest blocked level. Proceeding like this ensures that +information known to the producer may always be used to tailor the buffer sizes +or decide of a strategy to best aggregate data. Additionally, each time a layer +is crossed without transformation, it becomes possible to send without copying. + +The Rx side notifies the application of data readiness using a wakeup or a +callback. The Tx side notifies the application of room availability once data +have been moved resulting in the uppermost buffer having some free space. + +When crossing a mux downwards, it is possible that the sender is not allowed to +access the buffer because it is not yet its turn. It is not a problem, the data +remains in the conn_stream's buffer (or the stream one) and will be restarted +once the mux is ready to consume these data. + + + cs_recv() -------. cs_send() + ^ +--------> |||||| -------------+ ^ + | | -------' | | stream + --|----------|-------------------------------|-------|------------------- + | | V | connection + data .---. | | room + ready! |---| |---| available! + |---| |---| + |---| |---| + | | '---' + ^ +------------+-------+ | + | | ^ | / + / V | V / + / recvfrom() | sendto() | + -------------|----------------|--------------|--------------------------- + | | poll! V kernel + + +The cs_recv() function should act on pointers to buffer pointers, so that the +callee may decide to pass its own buffer directly by simply swapping pointers. +Similarly for cs_send() it is desirable to let the callee steal the buffer by +swapping the pointers. This way it remains possible to implement zero-copy +forwarding. + +Some operation flags will be needed on cs_recv() : + - RECV_ZERO_COPY : refuse to merge new data into the current buffer if it + will result in a data copy (ie the buffer is not empty), unless no more + than XXX bytes have to be copied (eg: copying 2 cache lines may be cheaper + than waiting and playing with pointers) + + - RECV_AT_ONCE : only perform the operation if it will result in the source + buffer to become empty at the end of the operation so that no two buffers + remain allocated at the end. It will most of the time result in either a + small read or a zero-copy operation. + + - RECV_PEEK : retrieve a copy of pending data without removing these data + from the source buffer. Maybe an alternate solution could consist in + finding the pointer to the source buffer and accessing these data directly, + except that it might be less interesting for the long term, thread-wise. + + - RECV_MIN : receive minimum X bytes (or less with a shutdown), or fail. + This should help various protocol parsers which need to receive a complete + frame before proceeding. + + - RECV_ENOUGH : no more data expected after this read if it's of the + requested size, thus no need to re-enable receiving on the lower layers. + + - RECV_ONE_SHOT : perform a single read without re-enabling reading on the + lower layers, like we currently do when receiving an HTTP/1 request. Like + RECV_ENOUGH where any size is enough. Probably that the two could be merged + (eg: by having a MIN argument like RECV_MIN). + + +Some operation flags will be needed on cs_send() : + - SEND_ZERO_COPY : refuse to merge the presented data with existing data and + prefer to wait for current data to leave and try again, unless the consumer + considers the amount of data acceptable for a copy. + + - SEND_AT_ONCE : only perform the operation if it will result in the source + buffer to become empty at the end of the operation so that no two buffers + remain allocated at the end. It will most of the time result in either a + small write or a zero-copy operation. + + +Both operations should return a composite status : + - number of bytes transferred + - status flags (shutr, shutw, reset, empty, full, ...) + + +2018-07-23 - Update after merging rxbuf +--------------------------------------- + +It becomes visible that the mux will not always be welcome to decode incoming +data because it will sometimes imply extra memory copies and/or usage for no +benefit. + +Ideally, when when a stream is instantiated based on incoming data, these +incoming data should be passed and the upper layers called, but it should then +be up these upper layers to peek more data in certain circumstances. Typically +if the pending connection data are larger than what is expected to be passed +above, it means some data may cause head-of-line blocking (HOL) to other +streams, and needs to be pushed up through the layers to let other streams +continue to work. Similarly very large H2 data frames after header frames +should probably not be passed as they may require copies that could be avoided +if passed later. However if the decoded frame fits into the conn_stream's +buffer, there is an opportunity to use a single buffer for the conn_stream +and the channel. The H2 demux could set a blocking flag indicating it's waiting +for the upper stream to take over demuxing. This flag would be purged once the +upper stream would start reading, or when extra data come and change the +conditions. + +Forcing structured headers and raw data to coexist within a single buffer is +quite challenging for many code parts. For example it's perfectly possible to +see a fragmented buffer containing series of headers, then a small data chunk +that was received at the same time, then a few other headers added by request +processing, then another data block received afterwards, then possibly yet +another header added by option http-send-name-header, and yet another data +block. This causes some pain for compression which still needs to know where +compressed and uncompressed data start/stop. It also makes it very difficult +to account the exact bytes to pass through the various layers. + +One solution consists in thinking about buffers using 3 representations : + + - a structured message, which is used for the internal HTTP representation. + This message may only be atomically processed. It has no clear byte count, + it's a message. + + - a raw stream, consisting in sequences of bytes. That's typically what + happens in data sequences or in tunnel. + + - a pipe, which contains data to be forwarded, and that haproxy cannot have + access to. + +The processing efficiency decreases with the higher complexity above, but the +capabilities increase. The structured message can contain anything including +serialized data blocks to be processed or forwarded. The raw stream contains +data blocks to be processed or forwarded. The pipe only contains data blocks +to be forwarded. The the latter ones are only an optimization of the former +ones. + +Thus ideally a channel should have access to all such 3 storage areas at once, +depending on the use case : + (1) a structured message, + (2) a raw stream, + (3) a pipe + +Right now a channel only has (2) and (3) but after the native HTTP rework, it +will only have (1) and (3). Placing a raw stream exclusively in (1) comes with +some performance drawbacks which are not easily recovered, and with some quite +difficult management still involving the reserve to ensure that a data block +doesn't prevent headers from being appended. But during header processing, the +payload may be necessary so we cannot decide to drop this option. + +A long-term approach would consist in ensuring that a single channel may have +access to all 3 representations at once, and to enumerate priority rules to +define how they interact together. That's exactly what is currently being done +with the pipe and the raw buffer right now. Doing so would also save the need +for storing payload in the structured message and void the requirement for the +reserve. But it would cost more memory to process POST data and server +responses. Thus an intermediary step consists in keeping this model in mind but +not implementing everything yet. + +Short term proposal : a channel has access to a buffer and a pipe. A non-empty +buffer is either in structured message format OR raw stream format. Only the +channel knows. However a structured buffer MAY contain raw data in a properly +formatted way (using the envelope defined by the structured message format). + +By default, when a demux writes to a CS rxbuf, it will try to use the lowest +possible level for what is being done (i.e. splice if possible, otherwise raw +stream, otherwise structured message). If the buffer already contains a +structured message, then this format is exclusive. From this point the MUX has +two options : either encode the incoming data to match the structured message +format, or refrain from receiving into the CS's rxbuf and wait until the upper +layer request those data. + +This opens a simplified option which could be suited even for the long term : + - cs_recv() will take one or two flags to indicate if a buffer already + contains a structured message or not ; the upper layer knows it. + + - cs_recv() will take two flags to indicate what the upper layer is willing + to take : + - structured message only + - raw stream only + - any of them + + From this point the mux can decide to either pass anything or refrain from + doing so. + + - the demux stores the knowledge it has from the contents into some CS flags + to indicate whether or not some structured message are still available, and + whether or not some raw data are still available. Thus the caller knows + whether or not extra data are available. + + - when the demux works on its own, it refrains from passing structured data + to a non-empty buffer, unless these data are causing trouble to other + streams (HOL). + + - when a demux has to encapsulate raw data into a structured message, it will + always have to respect a configured reserve so that extra header processing + can be done on the structured message inside the buffer, regardless of the + supposed available room. In addition, the upper layer may indicate using an + extra recv() flag whether it wants the demux to defragment serialized data + (for example by moving trailing headers apart) or if it's not necessary. + This flag will be set by the stream interface if compression is required or + if the http-buffer-request option is set for example. Probably that using + to_forward==0 is a stronger indication that the reserve must be respected. + + - cs_recv() and cs_send() when fed with a message, should not return byte + counts but message counts (i.e. 0 or 1). This implies that a single call to + either of these functions cannot mix raw data and structured messages at + the same time. + +At this point it looks like the conn_stream will have some encapsulation work +to do for the payload if it needs to be encapsulated into a message. This +further magnifies the importance of *not* decoding DATA frames into the CS's +rxbuf until really needed. + +The CS will probably need to hold indication of what is available at the mux +level, not only in the CS. Eg: we know that payload is still available. + +Using these elements, it should be possible to ensure that full header frames +may be received without enforcing any reserve, that too large frames that do +not fit will be detected because they return 0 message and indicate that such +a message is still pending, and that data availability is correctly detected +(later we may expect that the stream-interface allocates a larger or second +buffer to place the payload). + +Regarding the ability for the channel to forward data, it looks like having a +new function "cs_xfer(src_cs, dst_cs, count)" could be very productive in +optimizing the forwarding to make use of splicing when available. It is not yet +totally clear whether it will split into "cs_xfer_in(src_cs, pipe, count)" +followed by "cs_xfer_out(dst_cs, pipe, count)" or anything different, and it +still needs to be studied. The general idea seems to be that the receiver might +have to call the sender directly once they agree on how to transfer data (pipe +or buffer). If the transfer is incomplete, the cs_xfer() return value and/or +flags will indicate the current situation (src empty, dst full, etc) so that +the caller may register for notifications on the appropriate event and wait to +be called again to continue. + +Short term implementation : + 1) add new CS flags to qualify what the buffer contains and what we expect + to read into it; + + 2) set these flags to pretend we have a structured message when receiving + headers (after all, H1 is an atomic header as well) and see what it + implies for the code; for H1 it's unclear whether it makes sense to try + to set it without the H1 mux. + + 3) use these flags to refrain from sending DATA frames after HEADERS frames + in H2. + + 4) flush the flags at the stream interface layer when performing a cs_send(). + + 5) use the flags to enforce receipt of data only when necessary + +We should be able to end up with sequential receipt in H2 modelling what is +needed for other protocols without interfering with the native H1 devs. + + +2018-08-17 - Considerations after killing cs_recv() +--------------------------------------------------- + +With the ongoing reorganisation of the I/O layers, it's visible that cs_recv() +will have to transfer data between the cs' rxbuf and the channel's buffer while +not being aware of the data format. Moreover, in case there's no data there, it +needs to recursively call the mux's rcv_buf() to trigger a decoding, while this +function is sometimes replaced with cs_recv(). All this shows that cs_recv() is +in fact needed while data are pushed upstream from the lower layers, and is not +suitable for the "pull" mode. Thus it was decided to remove this function and +put its code back into h2_rcv_buf(). The H1 mux's rcv_buf() already couldn't be +replaced with cs_recv() since it is the only one knowing about the buffer's +format. + +This opportunity simplified something : if the cs's rxbuf is only read by the +mux's rcv_buf() method, then it doesn't need to be located into the CS and is +well placed into the mux's representation of the stream. This has an important +impact for H2 as it offers more freedom to the mux to allocate/free/reallocate +this buffer, and it ensures the mux always has access to it. + +Furthermore, the conn_stream's txbuf experienced the same fate. Indeed, the H1 +mux has already uncovered the difficulty related to the channel shutting down +on output, with data stuck into the CS's txbuf. Since the CS is tightly coupled +to the stream and the stream can close immediately once its buffers are empty, +it required a way to support orphaned CS with pending data in their txbuf. This +is something that the H2 mux already has to deal with, by carefully leaving the +data in the channel's buffer. But due to the snd_buf() call being top-down, it +is always possible to push the stream's data via the mux's snd_buf() call +without requiring a CS txbuf anymore. Thus the txbuf (when needed) is only +implemented in the mux and attached to the mux's representation of the stream, +and doing so allows to immediately release the channel once the data are safe +in the mux's buffer. + +This is an important change which clarifies the roles and responsibilities of +each layer in the chain : when receiving data from a mux, it's the mux's +responsibility to make sure it can correctly decode the incoming data and to +buffer the possible excess of data it cannot pass to the requester. This means +that decoding an H2 frame, which is not retryable since it has an impact on the +HPACK decompression context, and which cannot be reordered for the same reason, +simply needs to be performed to the H2 stream's rxbuf which will then be passed +to the stream when this one calls h2_rcv_buf(), even if it reads one byte at a +time. Similarly when calling h2_snd_buf(), it's the mux's responsibility to +read as much as it needs to be able to restart later, possibly by buffering +some data into a local buffer. And it's only once all the output data has been +consumed by snd_buf() that the stream is free to disappear. + +This model presents the nice benefit of being infinitely stackable and solving +the last identified showstoppers to move towards a structured message internal +representation, as it will give full power to the rcv_buf() and snd_buf() to +process what they need. + +For now the conn_stream's flags indicating whether a shutdown has been seen in +any direction or if an end of stream was seen will remain in the conn_stream, +though it's likely that some of them will move to the mux's representation of +the stream after structured messages are implemented. diff --git a/doc/internals/pattern.dia b/doc/internals/pattern.dia Binary files differnew file mode 100644 index 0000000..3d13215 --- /dev/null +++ b/doc/internals/pattern.dia diff --git a/doc/internals/pattern.pdf b/doc/internals/pattern.pdf Binary files differnew file mode 100644 index 0000000..a8d8bc9 --- /dev/null +++ b/doc/internals/pattern.pdf diff --git a/doc/internals/polling-states.fig b/doc/internals/polling-states.fig new file mode 100644 index 0000000..3b2c782 --- /dev/null +++ b/doc/internals/polling-states.fig @@ -0,0 +1,59 @@ +#FIG 3.2 Produced by xfig version 2.3 +Portrait +Center +Metric +A4 +100.00 +Single +-2 +1200 2 +2 1 0 1 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 1125 1350 1125 1800 +2 1 0 1 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 1125 2250 1125 2700 +2 1 0 1 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 1125 3150 1125 3600 +2 1 0 1 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 1575 1800 1575 1350 +2 1 0 1 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 1575 3600 1575 3150 +2 4 0 1 0 7 51 -1 20 0.000 0 0 7 0 0 5 + 1800 1350 900 1350 900 900 1800 900 1800 1350 +2 4 0 1 0 7 51 -1 20 0.000 0 0 7 0 0 5 + 1800 2250 900 2250 900 1800 1800 1800 1800 2250 +2 4 0 1 0 7 51 -1 20 0.000 0 0 7 0 0 5 + 1800 4050 900 4050 900 3600 1800 3600 1800 4050 +2 4 0 1 0 7 51 -1 20 0.000 0 0 7 0 0 5 + 1800 3150 900 3150 900 2700 1800 2700 1800 3150 +2 1 0 1 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 1350 450 1350 900 +2 1 0 1 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 1 1 1.00 90.00 180.00 + 1575 2700 1575 2250 +4 2 0 50 -1 16 8 0.0000 4 105 270 1080 1485 want\001 +4 2 0 50 -1 16 8 0.0000 4 120 255 1035 3285 stop\001 +4 0 0 50 -1 16 8 0.0000 4 105 270 1665 3510 want\001 +4 1 0 50 -1 16 10 0.0000 4 120 735 1350 1080 STOPPED\001 +4 1 0 50 -1 16 10 0.0000 4 120 795 1350 3780 DISABLED\001 +4 1 0 50 -1 16 10 0.0000 4 120 555 1350 2880 ACTIVE\001 +4 1 0 50 -1 16 10 0.0000 4 120 540 1350 1980 READY\001 +4 0 0 50 -1 16 8 0.0000 4 90 210 1665 2565 may\001 +4 2 0 50 -1 16 8 0.0000 4 105 240 1035 2430 cant\001 +4 1 0 50 -1 16 8 0.0000 4 120 240 1350 1260 R,!A\001 +4 1 0 50 -1 16 8 0.0000 4 120 210 1350 2160 R,A\001 +4 1 0 50 -1 16 8 0.0000 4 120 240 1350 3060 !R,A\001 +4 1 0 50 -1 16 8 0.0000 4 120 270 1350 3960 !R,!A\001 +4 0 0 50 -1 16 8 0.0000 4 120 255 1665 1710 stop\001 +4 0 0 50 -1 16 10 0.0000 4 150 855 2520 1125 R=ready flag\001 +4 0 0 50 -1 16 10 0.0000 4 150 885 2520 1290 A=active flag\001 +4 0 0 50 -1 16 10 0.0000 4 150 1365 2520 2475 fd_want sets A flag\001 +4 0 0 50 -1 16 10 0.0000 4 150 1440 2520 2640 fd_stop clears A flag\001 +4 0 0 50 -1 16 10 0.0000 4 150 1905 2520 3300 update() updates the poller.\001 +4 0 0 50 -1 16 10 0.0000 4 150 2190 2520 2970 fd_cant clears R flag (EAGAIN)\001 +4 0 0 50 -1 16 10 0.0000 4 150 2115 2520 3135 fd_rdy sets R flag (poll return)\001 diff --git a/doc/internals/repartition-be-fe-fi.txt b/doc/internals/repartition-be-fe-fi.txt new file mode 100644 index 0000000..261d073 --- /dev/null +++ b/doc/internals/repartition-be-fe-fi.txt @@ -0,0 +1,20 @@ +- session : ajouter ->fiprm et ->beprm comme raccourcis +- px->maxconn: ne s'applique qu'au FE. Pour le BE, on utilise fullconn, + initialisé par défaut ŕ la męme chose que maxconn. + + + \ from: proxy session server actuellement +field \ +rules px->fiprm sess->fiprm - +srv,cookies px->beprm sess->beprm srv->px +options(log) px-> sess->fe - +options(fe) px-> sess->fe - +options(be) px->beprm sess->beprm srv->px +captures px-> sess->fe - ->fiprm + + +logs px-> sess->fe srv->px +errorloc px-> sess->beprm|fe - +maxconn px-> sess->fe - ->be +fullconn px-> sess->beprm srv->px - + diff --git a/doc/internals/sched.fig b/doc/internals/sched.fig new file mode 100644 index 0000000..4134420 --- /dev/null +++ b/doc/internals/sched.fig @@ -0,0 +1,748 @@ +#FIG 3.2 Produced by xfig version 2.4 +Landscape +Center +Metric +A4 +150.00 +Single +-2 +1200 2 +0 32 #c5ebe1 +0 33 #86c8a2 +0 34 #ffebac +0 35 #cbb366 +0 36 #c7b696 +0 37 #effbff +0 38 #dfcba6 +0 39 #414141 +0 40 #aeaaae +0 41 #595559 +0 42 #414141 +0 43 #868286 +0 44 #bec3be +0 45 #868286 +0 46 #bec3be +0 47 #dfe3df +0 48 #8e8e8e +0 49 #8e8e8e +0 50 #414141 +0 51 #868286 +0 52 #bec3be +0 53 #dfe3df +0 54 #414141 +0 55 #868286 +0 56 #bec3be +0 57 #dfe3df +0 58 #868286 +0 59 #bec3be +0 60 #dfe3df +0 61 #c7b696 +0 62 #effbff +0 63 #dfcba6 +0 64 #c7b696 +0 65 #effbff +0 66 #dfcba6 +0 67 #aeaaae +0 68 #595559 +0 69 #8e8e8e +0 70 #414141 +0 71 #868286 +0 72 #bec3be +0 73 #dfe3df +0 74 #414141 +0 75 #868286 +0 76 #bec3be +0 77 #dfe3df +0 78 #868286 +0 79 #bec3be +0 80 #dfe3df +0 81 #414141 +0 82 #868286 +0 83 #bec3be +0 84 #414141 +0 85 #bec3be +0 86 #dfe3df +0 87 #414141 +0 88 #868286 +0 89 #bec3be +0 90 #8e8e8e +0 91 #414141 +0 92 #868286 +0 93 #bec3be +0 94 #dfe3df +0 95 #414141 +0 96 #868286 +0 97 #bec3be +0 98 #dfe3df +0 99 #bebebe +0 100 #515151 +0 101 #e7e3e7 +0 102 #000049 +0 103 #797979 +0 104 #303430 +0 105 #414541 +0 106 #414141 +0 107 #868286 +0 108 #bec3be +0 109 #dfe3df +0 110 #cfcfcf +0 111 #cfcfcf +0 112 #cfcfcf +0 113 #cfcfcf +0 114 #cfcfcf +0 115 #cfcfcf +0 116 #cfcfcf +0 117 #cfcfcf +0 118 #cfcfcf +0 119 #cfcfcf +0 120 #cfcfcf +0 121 #cfcfcf +0 122 #cfcfcf +0 123 #cfcfcf +0 124 #cfcfcf +0 125 #cfcfcf +0 126 #cfcfcf +0 127 #cfcfcf +0 128 #cfcfcf +0 129 #cfcfcf +0 130 #cfcfcf +0 131 #cfcfcf +0 132 #cfcfcf +0 133 #cfcfcf +0 134 #cfcfcf +0 135 #cfcfcf +0 136 #cfcfcf +0 137 #cfcfcf +0 138 #cfcfcf +0 139 #cfcfcf +0 140 #cfcfcf +0 141 #cfcfcf +0 142 #cfcfcf +0 143 #cfcfcf +0 144 #cfcfcf +0 145 #cfcfcf +0 146 #cfcfcf +0 147 #cfcfcf +0 148 #cfcfcf +0 149 #cfcfcf +0 150 #c7c3c7 +0 151 #868286 +0 152 #bec3be +0 153 #dfe3df +0 154 #8e8e8e +0 155 #8e8e8e +0 156 #494549 +0 157 #868686 +0 158 #c7c7c7 +0 159 #e7e7e7 +0 160 #f7f7f7 +0 161 #9e9e9e +0 162 #717571 +0 163 #aeaaae +0 164 #494549 +0 165 #aeaaae +0 166 #595559 +0 167 #bec3be +0 168 #dfe3df +0 169 #494549 +0 170 #616561 +0 171 #494549 +0 172 #868286 +0 173 #bec3be +0 174 #dfe3df +0 175 #bec3be +0 176 #dfe3df +0 177 #c7b696 +0 178 #effbff +0 179 #dfcba6 +0 180 #414141 +0 181 #868286 +0 182 #bec3be +0 183 #dfe3df +0 184 #8e8e8e +0 185 #aeaaae +0 186 #595559 +0 187 #414141 +0 188 #868286 +0 189 #bec3be +0 190 #868286 +0 191 #bec3be +0 192 #dfe3df +0 193 #8e8e8e +0 194 #8e8e8e +0 195 #414141 +0 196 #868286 +0 197 #bec3be +0 198 #dfe3df +0 199 #414141 +0 200 #868286 +0 201 #bec3be +0 202 #dfe3df +0 203 #868286 +0 204 #bec3be +0 205 #dfe3df +0 206 #c7b696 +0 207 #effbff +0 208 #dfcba6 +0 209 #c7b696 +0 210 #effbff +0 211 #dfcba6 +0 212 #aeaaae +0 213 #595559 +0 214 #8e8e8e +0 215 #414141 +0 216 #868286 +0 217 #bec3be +0 218 #dfe3df +0 219 #414141 +0 220 #868286 +0 221 #bec3be +0 222 #dfe3df +0 223 #868286 +0 224 #bec3be +0 225 #dfe3df +0 226 #414141 +0 227 #868286 +0 228 #bec3be +0 229 #414141 +0 230 #bec3be +0 231 #dfe3df +0 232 #414141 +0 233 #868286 +0 234 #bec3be +0 235 #8e8e8e +0 236 #414141 +0 237 #868286 +0 238 #bec3be +0 239 #dfe3df +0 240 #414141 +0 241 #868286 +0 242 #bec3be +0 243 #dfe3df +0 244 #414141 +0 245 #868286 +0 246 #bec3be +0 247 #dfe3df +0 248 #868286 +0 249 #bec3be +0 250 #dfe3df +0 251 #8e8e8e +0 252 #8e8e8e +0 253 #494549 +0 254 #aeaaae +0 255 #494549 +0 256 #aeaaae +0 257 #595559 +0 258 #bec3be +0 259 #dfe3df +0 260 #494549 +0 261 #616561 +0 262 #494549 +0 263 #868286 +0 264 #bec3be +0 265 #dfe3df +0 266 #bec3be +0 267 #dfe3df +0 268 #dfe3ef +0 269 #96969e +0 270 #d7dbd7 +0 271 #9ea2b6 +0 272 #9e0000 +0 273 #efefef +0 274 #86aeff +0 275 #7171ff +0 276 #bbf2e2 +0 277 #a7ceb3 +0 278 #dae8fc +0 279 #458dba +0 280 #ffe6cc +0 281 #e9b000 +0 282 #1a1a1a +0 283 #ffc1e7 +0 284 #009ed7 +0 285 #006d9e +0 286 #00719e +0 287 #9e9a9e +0 288 #000000 +0 289 #595959 +0 290 #006596 +0 291 #00a6d7 +0 292 #b6b6b6 +0 293 #8edbef +0 294 #00699e +0 295 #595d59 +0 296 #69d3e7 +0 297 #a6e3ef +0 298 #9ec7d7 +0 299 #aeb2ae +0 300 #00b6df +0 301 #00aed7 +0 302 #797d79 +0 303 #00a2d7 +0 304 #303030 +0 305 #006996 +0 306 #086d9e +0 307 #86b6cf +0 308 #f7fbf7 +0 309 #9ec3d7 +0 310 #ffff96 +0 311 #ff600a +5 1 0 2 0 7 50 -1 -1 0.000 0 0 1 0 11301.000 3060.000 11205 3825 10530 3060 11205 2295 + 0 0 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 1 0 1 11289.000 3060.000 11385 3825 12060 3060 11385 2295 + 0 0 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 1 0 1 11293.750 3060.000 10890 3105 11700 3060 10890 3015 + 2 1 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 0 1 0 7611.000 3060.000 7515 3825 6840 3060 7515 2295 + 0 0 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 1 0 1 7599.000 3060.000 7695 3825 8370 3060 7695 2295 + 0 0 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 1 0 1 7603.750 3060.000 7200 3105 8010 3060 7200 3015 + 2 1 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 0 1 0 4956.000 3060.000 4860 3825 4185 3060 4860 2295 + 0 0 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 1 0 1 4944.000 3060.000 5040 3825 5715 3060 5040 2295 + 0 0 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 1 0 1 4948.750 3060.000 4545 3105 5355 3060 4545 3015 + 2 1 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 0 1 0 1266.000 3060.000 1170 3825 495 3060 1170 2295 + 0 0 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 1 0 1 1254.000 3060.000 1350 3825 2025 3060 1350 2295 + 0 0 1.00 60.00 120.00 +5 1 0 2 0 7 50 -1 -1 0.000 0 1 0 1 1258.750 3060.000 855 3105 1665 3060 855 3015 + 2 1 1.00 60.00 120.00 +6 10606 2371 11985 3749 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11768 3060 11970 3060 11967 3119 11959 3177 11946 3234 11929 3291 + 11907 3345 11879 3397 11704 3296 11723 3259 11738 3222 11751 3182 + 11760 3142 11765 3101 11768 3060 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11704 3296 11879 3397 11848 3447 11812 3494 11772 3537 11729 3577 + 11682 3613 11633 3644 11531 3469 11566 3447 11599 3422 11628 3393 + 11657 3364 11682 3331 11704 3296 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11531 3469 11633 3644 11580 3672 11526 3694 11469 3711 11412 3724 + 11354 3732 11295 3734 11295 3532 11336 3530 11377 3525 11417 3516 + 11457 3503 11494 3488 11531 3469 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11295 3532 11295 3734 11236 3732 11178 3724 11121 3711 11064 3694 + 11010 3672 10958 3644 11059 3469 11096 3488 11133 3503 11173 3516 + 11213 3525 11254 3530 11295 3532 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11059 3469 10958 3644 10908 3613 10861 3577 10818 3537 10778 3494 + 10742 3447 10711 3398 10886 3296 10908 3331 10933 3364 10962 3393 + 10991 3422 11024 3447 11059 3469 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 10886 3296 10711 3398 10683 3345 10661 3291 10644 3234 10631 3177 + 10623 3119 10621 3060 10823 3060 10825 3101 10830 3142 10839 3182 + 10852 3222 10867 3259 10886 3296 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 10823 3060 10621 3060 10623 3001 10631 2943 10644 2886 10661 2829 + 10683 2775 10711 2723 10886 2824 10867 2861 10852 2898 10839 2938 + 10830 2978 10825 3019 10823 3060 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 10886 2824 10711 2723 10742 2673 10778 2626 10818 2583 10861 2543 + 10908 2507 10958 2476 11059 2651 11024 2673 10991 2698 10962 2727 + 10933 2756 10908 2789 10886 2824 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11059 2651 10958 2476 11010 2448 11064 2426 11121 2409 11178 2396 + 11236 2388 11295 2386 11295 2588 11254 2590 11213 2595 11173 2604 + 11133 2617 11096 2632 11059 2651 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11295 2588 11295 2386 11354 2388 11412 2396 11469 2409 11526 2426 + 11580 2448 11632 2476 11531 2651 11494 2632 11457 2617 11417 2604 + 11377 2595 11336 2590 11295 2588 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11531 2651 11632 2476 11682 2507 11729 2543 11772 2583 11812 2626 + 11848 2673 11879 2723 11704 2824 11682 2789 11657 2756 11628 2727 + 11599 2698 11566 2673 11531 2651 +2 3 0 2 0 31 50 -1 20 0.000 0 0 -1 0 0 15 + 11704 2824 11879 2723 11907 2775 11929 2829 11946 2886 11959 2943 + 11967 3001 11969 3060 11767 3060 11765 3019 11760 2978 11751 2938 + 11738 2898 11723 2861 11704 2824 +-6 +6 4261 2371 5640 3749 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 5423 3060 5625 3060 5622 3119 5614 3177 5601 3234 5584 3291 + 5562 3345 5534 3397 5359 3296 5378 3259 5393 3222 5406 3182 + 5415 3142 5420 3101 5423 3060 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 5359 3296 5534 3397 5503 3447 5467 3494 5427 3537 5384 3577 + 5337 3613 5288 3644 5186 3469 5221 3447 5254 3422 5283 3393 + 5312 3364 5337 3331 5359 3296 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 5186 3469 5288 3644 5235 3672 5181 3694 5124 3711 5067 3724 + 5009 3732 4950 3734 4950 3532 4991 3530 5032 3525 5072 3516 + 5112 3503 5149 3488 5186 3469 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 4950 3532 4950 3734 4891 3732 4833 3724 4776 3711 4719 3694 + 4665 3672 4613 3644 4714 3469 4751 3488 4788 3503 4828 3516 + 4868 3525 4909 3530 4950 3532 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 4714 3469 4613 3644 4563 3613 4516 3577 4473 3537 4433 3494 + 4397 3447 4366 3398 4541 3296 4563 3331 4588 3364 4617 3393 + 4646 3422 4679 3447 4714 3469 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 4541 3296 4366 3398 4338 3345 4316 3291 4299 3234 4286 3177 + 4278 3119 4276 3060 4478 3060 4480 3101 4485 3142 4494 3182 + 4507 3222 4522 3259 4541 3296 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 4478 3060 4276 3060 4278 3001 4286 2943 4299 2886 4316 2829 + 4338 2775 4366 2723 4541 2824 4522 2861 4507 2898 4494 2938 + 4485 2978 4480 3019 4478 3060 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 4541 2824 4366 2723 4397 2673 4433 2626 4473 2583 4516 2543 + 4563 2507 4613 2476 4714 2651 4679 2673 4646 2698 4617 2727 + 4588 2756 4563 2789 4541 2824 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 4714 2651 4613 2476 4665 2448 4719 2426 4776 2409 4833 2396 + 4891 2388 4950 2386 4950 2588 4909 2590 4868 2595 4828 2604 + 4788 2617 4751 2632 4714 2651 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 4950 2588 4950 2386 5009 2388 5067 2396 5124 2409 5181 2426 + 5235 2448 5287 2476 5186 2651 5149 2632 5112 2617 5072 2604 + 5032 2595 4991 2590 4950 2588 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 5186 2651 5287 2476 5337 2507 5384 2543 5427 2583 5467 2626 + 5503 2673 5534 2723 5359 2824 5337 2789 5312 2756 5283 2727 + 5254 2698 5221 2673 5186 2651 +2 3 0 2 0 13 50 -1 20 0.000 0 0 -1 0 0 15 + 5359 2824 5534 2723 5562 2775 5584 2829 5601 2886 5614 2943 + 5622 3001 5624 3060 5422 3060 5420 3019 5415 2978 5406 2938 + 5393 2898 5378 2861 5359 2824 +-6 +6 2250 4815 3960 5265 +1 1 0 3 8 11 52 -1 20 0.000 1 0.0000 3105 5049 810 171 3105 5049 3915 5049 +4 1 0 50 -1 6 10 0.0000 4 150 1125 3105 5130 Most Urgent\001 +-6 +6 6916 2371 8295 3749 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 8078 3060 8280 3060 8277 3119 8269 3177 8256 3234 8239 3291 + 8217 3345 8189 3397 8014 3296 8033 3259 8048 3222 8061 3182 + 8070 3142 8075 3101 8078 3060 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 8014 3296 8189 3397 8158 3447 8122 3494 8082 3537 8039 3577 + 7992 3613 7943 3644 7841 3469 7876 3447 7909 3422 7938 3393 + 7967 3364 7992 3331 8014 3296 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7841 3469 7943 3644 7890 3672 7836 3694 7779 3711 7722 3724 + 7664 3732 7605 3734 7605 3532 7646 3530 7687 3525 7727 3516 + 7767 3503 7804 3488 7841 3469 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7605 3532 7605 3734 7546 3732 7488 3724 7431 3711 7374 3694 + 7320 3672 7268 3644 7369 3469 7406 3488 7443 3503 7483 3516 + 7523 3525 7564 3530 7605 3532 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7369 3469 7268 3644 7218 3613 7171 3577 7128 3537 7088 3494 + 7052 3447 7021 3398 7196 3296 7218 3331 7243 3364 7272 3393 + 7301 3422 7334 3447 7369 3469 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7196 3296 7021 3398 6993 3345 6971 3291 6954 3234 6941 3177 + 6933 3119 6931 3060 7133 3060 7135 3101 7140 3142 7149 3182 + 7162 3222 7177 3259 7196 3296 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7133 3060 6931 3060 6933 3001 6941 2943 6954 2886 6971 2829 + 6993 2775 7021 2723 7196 2824 7177 2861 7162 2898 7149 2938 + 7140 2978 7135 3019 7133 3060 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7196 2824 7021 2723 7052 2673 7088 2626 7128 2583 7171 2543 + 7218 2507 7268 2476 7369 2651 7334 2673 7301 2698 7272 2727 + 7243 2756 7218 2789 7196 2824 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7369 2651 7268 2476 7320 2448 7374 2426 7431 2409 7488 2396 + 7546 2388 7605 2386 7605 2588 7564 2590 7523 2595 7483 2604 + 7443 2617 7406 2632 7369 2651 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7605 2588 7605 2386 7664 2388 7722 2396 7779 2409 7836 2426 + 7890 2448 7942 2476 7841 2651 7804 2632 7767 2617 7727 2604 + 7687 2595 7646 2590 7605 2588 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 7841 2651 7942 2476 7992 2507 8039 2543 8082 2583 8122 2626 + 8158 2673 8189 2723 8014 2824 7992 2789 7967 2756 7938 2727 + 7909 2698 7876 2673 7841 2651 +2 3 0 2 0 31 50 -1 43 0.000 0 0 -1 0 0 15 + 8014 2824 8189 2723 8217 2775 8239 2829 8256 2886 8269 2943 + 8277 3001 8279 3060 8077 3060 8075 3019 8070 2978 8061 2938 + 8048 2898 8033 2861 8014 2824 +-6 +6 571 2371 1950 3749 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1733 3060 1935 3060 1932 3119 1924 3177 1911 3234 1894 3291 + 1872 3345 1844 3397 1669 3296 1688 3259 1703 3222 1716 3182 + 1725 3142 1730 3101 1733 3060 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1669 3296 1844 3397 1813 3447 1777 3494 1737 3537 1694 3577 + 1647 3613 1598 3644 1496 3469 1531 3447 1564 3422 1593 3393 + 1622 3364 1647 3331 1669 3296 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1496 3469 1598 3644 1545 3672 1491 3694 1434 3711 1377 3724 + 1319 3732 1260 3734 1260 3532 1301 3530 1342 3525 1382 3516 + 1422 3503 1459 3488 1496 3469 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1260 3532 1260 3734 1201 3732 1143 3724 1086 3711 1029 3694 + 975 3672 923 3644 1024 3469 1061 3488 1098 3503 1138 3516 + 1178 3525 1219 3530 1260 3532 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1024 3469 923 3644 873 3613 826 3577 783 3537 743 3494 + 707 3447 676 3398 851 3296 873 3331 898 3364 927 3393 + 956 3422 989 3447 1024 3469 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 851 3296 676 3398 648 3345 626 3291 609 3234 596 3177 + 588 3119 586 3060 788 3060 790 3101 795 3142 804 3182 + 817 3222 832 3259 851 3296 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 788 3060 586 3060 588 3001 596 2943 609 2886 626 2829 + 648 2775 676 2723 851 2824 832 2861 817 2898 804 2938 + 795 2978 790 3019 788 3060 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 851 2824 676 2723 707 2673 743 2626 783 2583 826 2543 + 873 2507 923 2476 1024 2651 989 2673 956 2698 927 2727 + 898 2756 873 2789 851 2824 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1024 2651 923 2476 975 2448 1029 2426 1086 2409 1143 2396 + 1201 2388 1260 2386 1260 2588 1219 2590 1178 2595 1138 2604 + 1098 2617 1061 2632 1024 2651 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1260 2588 1260 2386 1319 2388 1377 2396 1434 2409 1491 2426 + 1545 2448 1597 2476 1496 2651 1459 2632 1422 2617 1382 2604 + 1342 2595 1301 2590 1260 2588 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1496 2651 1597 2476 1647 2507 1694 2543 1737 2583 1777 2626 + 1813 2673 1844 2723 1669 2824 1647 2789 1622 2756 1593 2727 + 1564 2698 1531 2673 1496 2651 +2 3 0 2 0 13 50 -1 43 0.000 0 0 -1 0 0 15 + 1669 2824 1844 2723 1872 2775 1894 2829 1911 2886 1924 2943 + 1932 3001 1934 3060 1732 3060 1730 3019 1725 2978 1716 2938 + 1703 2898 1688 2861 1669 2824 +-6 +6 1800 1845 2520 2385 +4 1 0 50 -1 6 10 0.0000 4 120 570 2160 1980 Global\001 +4 1 0 50 -1 6 10 0.0000 4 120 495 2160 2160 tasks\001 +4 1 0 50 -1 6 10 0.0000 4 150 705 2160 2340 (locked)\001 +-6 +6 3960 1935 4500 2250 +4 1 0 50 -1 6 10 0.0000 4 120 495 4230 2250 tasks\001 +4 1 0 50 -1 6 10 0.0000 4 120 465 4230 2070 Local\001 +-6 +6 8190 1845 8910 2385 +4 1 0 50 -1 6 10 0.0000 4 150 705 8550 2340 (locked)\001 +4 1 0 50 -1 6 10 0.0000 4 120 585 8550 2160 timers\001 +4 1 0 50 -1 6 10 0.0000 4 120 570 8550 1980 Global\001 +-6 +6 10215 1935 10845 2250 +4 1 0 50 -1 6 10 0.0000 4 120 585 10530 2250 timers\001 +4 1 0 50 -1 6 10 0.0000 4 120 465 10530 2070 Local\001 +-6 +6 2430 945 3735 1530 +1 1 0 3 20 29 52 -1 20 0.000 1 0.0000 3083 1180 607 170 3083 1180 3690 1350 +4 1 0 50 -1 6 10 0.0000 4 120 615 3105 1260 Local ?\001 +4 0 0 50 -1 6 9 0.0000 4 105 315 3375 1530 Yes\001 +4 2 0 50 -1 6 9 0.0000 4 105 225 2790 1530 No\001 +-6 +6 8775 945 10080 1530 +1 1 0 3 20 29 52 -1 20 0.000 1 0.0000 9428 1180 607 170 9428 1180 10035 1350 +4 1 0 50 -1 6 10 0.0000 4 120 615 9450 1260 Local ?\001 +4 0 0 50 -1 6 9 0.0000 4 105 315 9720 1530 Yes\001 +4 2 0 50 -1 6 9 0.0000 4 105 225 9135 1530 No\001 +-6 +6 7200 6345 9810 6885 +2 1 0 4 279 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 7234 6398 9776 6398 9776 6838 7234 6838 +2 3 0 0 -1 278 49 -1 20 0.000 0 0 -1 0 0 5 + 7234 6838 9776 6838 9776 6398 7234 6398 7234 6838 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9613 6398 9613 6838 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9438 6398 9438 6838 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9264 6398 9264 6838 +4 1 0 46 -1 4 16 0.0000 4 210 1620 8460 6705 TL_URGENT\001 +-6 +6 4140 7830 4545 9045 +1 1 0 3 20 29 52 -1 20 0.000 1 1.5708 4330 8437 607 170 4330 8437 4500 7830 +4 1 0 50 -1 6 10 1.5708 4 120 585 4410 8415 Class?\001 +-6 +1 1 0 3 8 11 52 -1 20 0.000 1 0.0000 9450 5049 540 171 9450 5049 9990 5049 +1 1 0 3 20 29 52 -1 20 0.000 1 1.5708 2440 7672 607 170 2440 7672 2610 7065 +1 1 0 3 8 11 52 -1 20 0.000 1 1.5708 10755 7695 810 171 10755 7695 10755 6885 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 7605 3870 7605 4185 9270 4545 9270 4905 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 11301 3870 11301 4185 9636 4545 9636 4905 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 9630 1395 9626 1591 11291 1800 11295 2295 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 9270 1395 9270 1575 7605 1800 7605 2295 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 2 1 1.00 90.00 180.00 + 9450 360 9450 1035 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 1260 3870 1260 4185 2925 4545 2925 4905 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 4956 3870 4956 4185 3291 4545 3291 4905 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 2 1 1.00 90.00 180.00 + 3105 360 3105 1035 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 3285 1395 3285 1575 4950 1845 4950 2385 +2 1 0 3 22 7 54 -1 -1 0.000 1 0 -1 0 0 2 + 9180 5535 9000 5805 +2 1 0 5 13 7 54 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 120.00 240.00 + 3105 5220 3105 5850 3105 7200 7200 7200 +2 1 0 5 22 7 54 -1 -1 0.000 1 0 -1 1 0 5 + 2 1 1.00 120.00 240.00 + 9450 5220 9450 5670 6300 5670 6300 1215 3690 1170 +2 1 0 3 13 7 54 -1 -1 0.000 1 0 -1 0 0 2 + 3195 5535 3015 5805 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 2925 1395 2925 1575 1260 1845 1260 2385 +2 2 0 3 35 34 100 -1 20 0.000 1 0 -1 0 0 5 + 6570 720 12330 720 12330 5400 6570 5400 6570 720 +2 2 0 3 33 32 100 -1 20 0.000 1 0 -1 0 0 5 + 270 720 6030 720 6030 5400 270 5400 270 720 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 2 1 1.00 90.00 180.00 + 315 7650 2250 7650 +2 1 0 5 4 7 54 -1 -1 0.000 1 0 -1 1 0 2 + 2 1 1.00 120.00 240.00 + 10890 7695 12285 7695 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 3 + 2 1 1.00 90.00 180.00 + 4455 8775 4725 8910 7200 8910 +2 1 0 4 279 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 7234 7118 9776 7118 9776 7558 7234 7558 +2 3 0 0 -1 278 49 -1 20 0.000 0 0 -1 0 0 5 + 7234 7558 9776 7558 9776 7118 7234 7118 7234 7558 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9613 7118 9613 7558 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9438 7118 9438 7558 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9264 7118 9264 7558 +2 3 0 0 -1 278 49 -1 20 0.000 0 0 -1 0 0 5 + 7234 8278 9776 8278 9776 7838 7234 7838 7234 8278 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9613 7838 9613 8278 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9438 7838 9438 8278 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9264 7838 9264 8278 +2 1 0 4 279 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 7234 8558 9776 8558 9776 8998 7234 8998 +2 3 0 0 -1 278 49 -1 20 0.000 0 0 -1 0 0 5 + 7234 8998 9776 8998 9776 8558 7234 8558 7234 8998 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9613 8558 9613 8998 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9438 8558 9438 8998 +2 1 0 2 279 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 9264 8558 9264 8998 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 2 + 2 1 1.00 90.00 180.00 + 6075 6480 7200 6480 +2 1 0 3 0 7 50 -1 -1 0.000 1 0 -1 1 0 3 + 2 1 1.00 90.00 180.00 + 2610 7830 3195 8415 4140 8415 +2 1 0 4 45 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 4166 6398 6094 6398 6094 6838 4166 6838 +2 1 0 2 45 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 5923 6398 5923 6838 +2 1 0 2 45 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 5748 6398 5748 6838 +2 1 0 2 45 -1 47 -1 -1 0.000 0 0 -1 0 0 2 + 5574 6398 5574 6838 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 2610 7515 2925 6660 3645 6660 4140 6660 +2 3 0 0 277 276 49 -1 43 0.000 0 0 -1 0 0 5 + 4166 6838 6094 6838 6094 6398 4166 6398 4166 6838 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 3 + 2 1 1.00 90.00 180.00 + 9765 8775 10350 8775 10665 8280 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 3 + 2 1 1.00 90.00 180.00 + 9765 8055 10305 8055 10620 7875 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 3 + 2 1 1.00 90.00 180.00 + 9806 6605 10350 6615 10665 7155 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 3 + 2 1 1.00 90.00 180.00 + 9720 7335 10350 7335 10620 7560 +2 1 1 5 4 7 57 -1 -1 12.000 1 0 -1 1 0 2 + 2 1 1.00 120.00 240.00 + 9900 6165 9900 9450 +2 1 0 2 0 7 54 -1 -1 0.000 1 0 -1 0 0 2 + 10080 7245 9990 7425 +2 1 0 2 0 7 54 -1 -1 0.000 1 0 -1 0 0 2 + 10080 7965 9990 8145 +2 1 0 2 0 7 54 -1 -1 0.000 1 0 -1 0 0 2 + 10080 8685 9990 8865 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 4 + 2 1 1.00 90.00 180.00 + 4500 8550 6255 8550 6705 8190 7200 8190 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 5 + 2 1 1.00 90.00 180.00 + 4500 8280 4725 8100 6435 8100 6750 7470 7200 7470 +2 1 0 3 0 7 54 -1 -1 0.000 1 0 -1 1 0 5 + 2 1 1.00 90.00 180.00 + 4455 8055 4635 7740 6390 7740 6750 6750 7200 6750 +2 1 0 4 279 -1 48 -1 -1 0.000 0 0 -1 0 0 4 + 7234 7838 9776 7838 9776 8278 7234 8278 +2 1 0 2 0 7 54 -1 -1 0.000 1 0 -1 0 0 2 + 10080 6525 9990 6705 +2 2 0 3 43 47 100 -1 20 0.000 1 0 -1 0 0 5 + 1935 5985 11070 5985 11070 9585 1935 9585 1935 5985 +4 1 0 50 -1 4 9 1.5708 4 135 315 12240 3060 past\001 +4 1 0 50 -1 4 9 1.5708 4 120 465 10440 3060 future\001 +4 1 0 50 -1 4 9 1.5708 4 135 315 8550 3060 past\001 +4 1 0 50 -1 4 9 1.5708 4 120 465 6750 3060 future\001 +4 1 0 50 -1 6 10 0.0000 4 120 600 9450 5130 Oldest\001 +4 1 0 50 -1 4 9 1.5708 4 105 540 405 3060 newest\001 +4 1 0 50 -1 4 9 1.5708 4 120 450 2205 3060 oldest\001 +4 1 0 50 -1 4 9 1.5708 4 105 540 4095 3060 newest\001 +4 1 0 50 -1 4 9 1.5708 4 120 450 5895 3060 oldest\001 +4 0 0 50 -1 14 10 0.0000 4 135 1470 9135 5850 runqueue-depth\001 +4 0 0 50 -1 14 10 0.0000 4 135 1470 3195 5715 runqueue-depth\001 +4 1 0 50 -1 6 12 0.0000 4 165 1320 9450 3600 Time-based\001 +4 1 0 50 -1 6 12 0.0000 4 195 1395 9450 3780 Wait queues\001 +4 0 0 50 -1 6 12 0.0000 4 195 1050 9000 4005 - 1 global\001 +4 0 0 50 -1 6 12 0.0000 4 195 1605 9000 4185 - 1 per thread\001 +4 1 0 50 -1 6 12 0.0000 4 195 1650 3105 3600 Priority-based\001 +4 1 0 50 -1 6 12 0.0000 4 180 1365 3105 3780 Run queues\001 +4 0 0 50 -1 6 12 0.0000 4 195 1050 2655 4005 - 1 global\001 +4 0 0 50 -1 6 12 0.0000 4 195 1605 2655 4185 - 1 per thread\001 +4 0 0 50 -1 14 10 0.0000 4 135 1365 3240 585 task_wakeup()\001 +4 0 0 50 -1 14 10 0.0000 4 135 1575 9585 630 task_schedule()\001 +4 0 0 50 -1 14 10 0.0000 4 135 1260 9585 450 task_queue()\001 +4 0 0 50 -1 14 10 0.0000 4 135 1680 315 7560 tasklet_wakeup()\001 +4 2 0 50 -1 14 10 0.0000 4 135 1260 12285 7515 t->process()\001 +4 2 4 50 -1 6 12 0.0000 4 150 525 12285 7335 Run!\001 +4 1 0 46 -1 4 16 0.0000 4 210 1695 8460 7425 TL_NORMAL\001 +4 1 0 46 -1 4 16 0.0000 4 210 1200 8460 8145 TL_BULK\001 +4 1 0 46 -1 4 16 0.0000 4 210 1425 8460 8865 TL_HEAVY\001 +4 1 0 46 -1 4 16 0.0000 4 195 1095 4950 6705 SHARED\001 +4 0 0 50 -1 6 9 0.0000 4 105 345 10035 7515 37%\001 +4 0 0 50 -1 6 9 0.0000 4 105 210 10080 8955 =1\001 +4 1 0 50 -1 4 10 0.0000 4 150 2280 5085 6255 (accessed using atomic ops)\001 +4 0 0 50 -1 6 9 0.0000 4 105 345 10035 6795 50%\001 +4 0 0 50 -1 6 9 0.0000 4 105 345 10035 8235 13%\001 +4 2 0 50 -1 6 9 1.5708 4 105 315 2745 8100 Yes\001 +4 1 0 50 -1 6 10 1.5708 4 120 615 2520 7650 Local ?\001 +4 0 0 50 -1 6 9 1.5708 4 105 225 2700 7110 No\001 +4 0 0 50 -1 14 10 0.0000 4 135 1680 4725 8460 TASK_SELF_WAKING\001 +4 0 0 50 -1 14 10 0.0000 4 135 1050 4725 8820 TASK_HEAVY\001 +4 0 0 50 -1 4 10 0.0000 4 165 675 4725 8010 (default)\001 +4 0 0 50 -1 4 10 0.0000 4 150 1290 4725 7650 In I/O or signals\001 +4 1 0 50 -1 6 10 1.5708 4 150 1125 10815 7695 Most Urgent\001 +4 0 4 50 -1 6 10 0.0000 4 120 480 9990 6480 order\001 +4 0 4 50 -1 6 10 0.0000 4 120 420 9990 6300 Scan\001 +4 1 0 50 -1 6 12 0.0000 4 195 9075 6030 9450 5 class-based tasklet queues per thread (one accessible from remote threads)\001 diff --git a/doc/internals/sched.pdf b/doc/internals/sched.pdf Binary files differnew file mode 100644 index 0000000..d1ce3de --- /dev/null +++ b/doc/internals/sched.pdf diff --git a/doc/internals/sched.png b/doc/internals/sched.png Binary files differnew file mode 100644 index 0000000..65c97a1 --- /dev/null +++ b/doc/internals/sched.png diff --git a/doc/internals/sched.svg b/doc/internals/sched.svg new file mode 100644 index 0000000..0fa329a --- /dev/null +++ b/doc/internals/sched.svg @@ -0,0 +1,1204 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Creator: fig2dev Version 3.2.7b --> +<!-- CreationDate: 2021-02-26 17:49:00 --> +<!-- Magnification: 1.57 --> +<svg xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink" + width="1146pt" height="878pt" + viewBox="237 327 12126 9291"> +<g fill="none"> +<!-- Line --> +<rect x="6570" y="720" width="5760" height="4680" fill="#ffebac" + stroke="#cbb366" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<rect x="270" y="720" width="5760" height="4680" fill="#c5ebe1" + stroke="#86c8a2" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<rect x="1935" y="5985" width="9135" height="3600" fill="#dfe3df" + stroke="#868286" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<defs> +<clipPath id="cp0"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 9960,9130 9900,9190 9840,9130 9867,9483 9933,9483z"/> +</clipPath> +</defs> +<polyline points=" 9900,6165 9900,9450" clip-path="url(#cp0)" + stroke="#ff0000" stroke-width="60px" stroke-linejoin="round" stroke-dasharray="120 120"/> +<!-- Forward arrow to point 9900,9450 --> +<polygon points=" 9840,9130 9900,9430 9960,9130 9900,9190 9840,9130" + stroke="#ff0000" stroke-width="8px" stroke-miterlimit="8" fill="#ff0000"/> +<!-- Line --> +<polyline points=" 9180,5535 9000,5805" + stroke="#b000b0" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<defs> +<clipPath id="cp1"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 6880,7140 6940,7200 6880,7260 7233,7233 7233,7167z"/> +</clipPath> +</defs> +<polyline points=" 3105,5220 3105,5850 3105,7200 7200,7200" clip-path="url(#cp1)" + stroke="#00b000" stroke-width="60px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7200,7200 --> +<polygon points=" 6880,7260 7180,7200 6880,7140 6940,7200 6880,7260" + stroke="#00b000" stroke-width="8px" stroke-miterlimit="8" fill="#00b000"/> +<!-- Line --> +<defs> +<clipPath id="cp2"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 4009,1236 3950,1174 4011,1116 3658,1136 3656,1202z"/> +</clipPath> +</defs> +<polyline points=" 9450,5220 9450,5670 6300,5670 6300,1215 3690,1170" clip-path="url(#cp2)" + stroke="#b000b0" stroke-width="60px" stroke-linejoin="round"/> +<!-- Forward arrow to point 3690,1170 --> +<polygon points=" 4011,1116 3710,1170 4009,1236 3950,1174 4011,1116" + stroke="#b000b0" stroke-width="8px" stroke-miterlimit="8" fill="#b000b0"/> +<!-- Line --> +<polyline points=" 3195,5535 3015,5805" + stroke="#00b000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Line --> +<defs> +<clipPath id="cp3"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 11965,7635 12025,7695 11965,7755 12318,7728 12318,7662z"/> +</clipPath> +</defs> +<polyline points=" 10890,7695 12285,7695" clip-path="url(#cp3)" + stroke="#ff0000" stroke-width="60px" stroke-linejoin="round"/> +<!-- Forward arrow to point 12285,7695 --> +<polygon points=" 11965,7755 12265,7695 11965,7635 12025,7695 11965,7755" + stroke="#ff0000" stroke-width="8px" stroke-miterlimit="8" fill="#ff0000"/> +<!-- Line --> +<defs> +<clipPath id="cp4"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 6955,8865 7000,8910 6955,8955 7218,8928 7218,8892z"/> +</clipPath> +</defs> +<polyline points=" 4455,8775 4725,8910 7200,8910" clip-path="url(#cp4)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7200,8910 --> +<polygon points=" 6955,8955 7180,8910 6955,8865 7000,8910 6955,8955" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp5"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 3895,6615 3940,6660 3895,6705 4158,6678 4158,6642z"/> +</clipPath> +</defs> +<polyline points=" 2610,7515 2925,6660 3645,6660 4140,6660" clip-path="url(#cp5)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 4140,6660 --> +<polygon points=" 3895,6705 4120,6660 3895,6615 3940,6660 3895,6705" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp6"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 10496,8462 10558,8449 10572,8511 10690,8274 10659,8255z"/> +</clipPath> +</defs> +<polyline points=" 9765,8775 10350,8775 10665,8280" clip-path="url(#cp6)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 10665,8280 --> +<polygon points=" 10572,8511 10654,8297 10496,8462 10558,8449 10572,8511" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp7"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 10385,7957 10446,7974 10430,8036 10645,7882 10627,7850z"/> +</clipPath> +</defs> +<polyline points=" 9765,8055 10305,8055 10620,7875" clip-path="url(#cp7)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 10620,7875 --> +<polygon points=" 10430,8036 10603,7885 10385,7957 10446,7974 10430,8036" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp8"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 10580,6921 10564,6982 10503,6966 10659,7180 10690,7161z"/> +</clipPath> +</defs> +<polyline points=" 9806,6605 10350,6615 10665,7155" clip-path="url(#cp8)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 10665,7155 --> +<polygon points=" 10503,6966 10655,7138 10580,6921 10564,6982 10503,6966" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp9"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 10461,7369 10466,7432 10403,7438 10622,7585 10645,7558z"/> +</clipPath> +</defs> +<polyline points=" 9720,7335 10350,7335 10620,7560" clip-path="url(#cp9)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 10620,7560 --> +<polygon points=" 10403,7438 10605,7547 10461,7369 10466,7432 10403,7438" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<polyline points=" 10080,7245 9990,7425" + stroke="#000000" stroke-width="15px" stroke-linejoin="round"/> +<!-- Line --> +<polyline points=" 10080,7965 9990,8145" + stroke="#000000" stroke-width="15px" stroke-linejoin="round"/> +<!-- Line --> +<polyline points=" 10080,8685 9990,8865" + stroke="#000000" stroke-width="15px" stroke-linejoin="round"/> +<!-- Line --> +<defs> +<clipPath id="cp10"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 6955,8145 7000,8190 6955,8235 7218,8208 7218,8172z"/> +</clipPath> +</defs> +<polyline points=" 4500,8550 6255,8550 6705,8190 7200,8190" clip-path="url(#cp10)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7200,8190 --> +<polygon points=" 6955,8235 7180,8190 6955,8145 7000,8190 6955,8235" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp11"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 6955,7425 7000,7470 6955,7515 7218,7488 7218,7452z"/> +</clipPath> +</defs> +<polyline points=" 4500,8280 4725,8100 6435,8100 6750,7470 7200,7470" clip-path="url(#cp11)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7200,7470 --> +<polygon points=" 6955,7515 7180,7470 6955,7425 7000,7470 6955,7515" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp12"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 6955,6705 7000,6750 6955,6795 7218,6768 7218,6732z"/> +</clipPath> +</defs> +<polyline points=" 4455,8055 4635,7740 6390,7740 6750,6750 7200,6750" clip-path="url(#cp12)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7200,6750 --> +<polygon points=" 6955,6795 7180,6750 6955,6705 7000,6750 6955,6795" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<polyline points=" 10080,6525 9990,6705" + stroke="#000000" stroke-width="15px" stroke-linejoin="round"/> +<!-- Ellipse --> +<ellipse cx="3105" cy="5049" rx="810" ry="171" fill="#87cfff" + stroke="#00008f" stroke-width="30px"/> +<!-- Ellipse --> +<ellipse cx="3083" cy="1180" rx="607" ry="170" fill="#ffbfbf" + stroke="#d10000" stroke-width="30px"/> +<!-- Ellipse --> +<ellipse cx="9428" cy="1180" rx="607" ry="170" fill="#ffbfbf" + stroke="#d10000" stroke-width="30px"/> +<!-- Ellipse --> +<ellipse transform="translate(4330,8437) rotate(-90)" rx="607" ry="170" fill="#ffbfbf" + stroke="#d10000" stroke-width="30px"/> +<!-- Ellipse --> +<ellipse cx="9450" cy="5049" rx="540" ry="171" fill="#87cfff" + stroke="#00008f" stroke-width="30px"/> +<!-- Ellipse --> +<ellipse transform="translate(2440,7672) rotate(-90)" rx="607" ry="170" fill="#ffbfbf" + stroke="#d10000" stroke-width="30px"/> +<!-- Ellipse --> +<ellipse transform="translate(10755,7695) rotate(-90)" rx="810" ry="171" fill="#87cfff" + stroke="#00008f" stroke-width="30px"/> +<!-- Line --> +<polygon points=" 11768,3060 11970,3060 11967,3119 11959,3177 11946,3234 11929,3291 11907,3345 + 11879,3397 11704,3296 11723,3259 11738,3222 11751,3182 11760,3142 11765,3101 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 11704,3296 11879,3397 11848,3447 11812,3494 11772,3537 11729,3577 11682,3613 + 11633,3644 11531,3469 11566,3447 11599,3422 11628,3393 11657,3364 11682,3331 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 11531,3469 11633,3644 11580,3672 11526,3694 11469,3711 11412,3724 11354,3732 + 11295,3734 11295,3532 11336,3530 11377,3525 11417,3516 11457,3503 11494,3488 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 11295,3532 11295,3734 11236,3732 11178,3724 11121,3711 11064,3694 11010,3672 + 10958,3644 11059,3469 11096,3488 11133,3503 11173,3516 11213,3525 11254,3530 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 11059,3469 10958,3644 10908,3613 10861,3577 10818,3537 10778,3494 10742,3447 + 10711,3398 10886,3296 10908,3331 10933,3364 10962,3393 10991,3422 11024,3447 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 10886,3296 10711,3398 10683,3345 10661,3291 10644,3234 10631,3177 10623,3119 + 10621,3060 10823,3060 10825,3101 10830,3142 10839,3182 10852,3222 10867,3259 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 10823,3060 10621,3060 10623,3001 10631,2943 10644,2886 10661,2829 10683,2775 + 10711,2723 10886,2824 10867,2861 10852,2898 10839,2938 10830,2978 10825,3019 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 10886,2824 10711,2723 10742,2673 10778,2626 10818,2583 10861,2543 10908,2507 + 10958,2476 11059,2651 11024,2673 10991,2698 10962,2727 10933,2756 10908,2789 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 11059,2651 10958,2476 11010,2448 11064,2426 11121,2409 11178,2396 11236,2388 + 11295,2386 11295,2588 11254,2590 11213,2595 11173,2604 11133,2617 11096,2632 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 11295,2588 11295,2386 11354,2388 11412,2396 11469,2409 11526,2426 11580,2448 + 11632,2476 11531,2651 11494,2632 11457,2617 11417,2604 11377,2595 11336,2590 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 11531,2651 11632,2476 11682,2507 11729,2543 11772,2583 11812,2626 11848,2673 + 11879,2723 11704,2824 11682,2789 11657,2756 11628,2727 11599,2698 11566,2673 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 11704,2824 11879,2723 11907,2775 11929,2829 11946,2886 11959,2943 11967,3001 + 11969,3060 11767,3060 11765,3019 11760,2978 11751,2938 11738,2898 11723,2861 +" fill="#ffd600" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 5423,3060 5625,3060 5622,3119 5614,3177 5601,3234 5584,3291 5562,3345 5534,3397 + 5359,3296 5378,3259 5393,3222 5406,3182 5415,3142 5420,3101" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 5359,3296 5534,3397 5503,3447 5467,3494 5427,3537 5384,3577 5337,3613 5288,3644 + 5186,3469 5221,3447 5254,3422 5283,3393 5312,3364 5337,3331" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 5186,3469 5288,3644 5235,3672 5181,3694 5124,3711 5067,3724 5009,3732 4950,3734 + 4950,3532 4991,3530 5032,3525 5072,3516 5112,3503 5149,3488" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 4950,3532 4950,3734 4891,3732 4833,3724 4776,3711 4719,3694 4665,3672 4613,3644 + 4714,3469 4751,3488 4788,3503 4828,3516 4868,3525 4909,3530" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 4714,3469 4613,3644 4563,3613 4516,3577 4473,3537 4433,3494 4397,3447 4366,3398 + 4541,3296 4563,3331 4588,3364 4617,3393 4646,3422 4679,3447" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 4541,3296 4366,3398 4338,3345 4316,3291 4299,3234 4286,3177 4278,3119 4276,3060 + 4478,3060 4480,3101 4485,3142 4494,3182 4507,3222 4522,3259" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 4478,3060 4276,3060 4278,3001 4286,2943 4299,2886 4316,2829 4338,2775 4366,2723 + 4541,2824 4522,2861 4507,2898 4494,2938 4485,2978 4480,3019" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 4541,2824 4366,2723 4397,2673 4433,2626 4473,2583 4516,2543 4563,2507 4613,2476 + 4714,2651 4679,2673 4646,2698 4617,2727 4588,2756 4563,2789" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 4714,2651 4613,2476 4665,2448 4719,2426 4776,2409 4833,2396 4891,2388 4950,2386 + 4950,2588 4909,2590 4868,2595 4828,2604 4788,2617 4751,2632" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 4950,2588 4950,2386 5009,2388 5067,2396 5124,2409 5181,2426 5235,2448 5287,2476 + 5186,2651 5149,2632 5112,2617 5072,2604 5032,2595 4991,2590" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 5186,2651 5287,2476 5337,2507 5384,2543 5427,2583 5467,2626 5503,2673 5534,2723 + 5359,2824 5337,2789 5312,2756 5283,2727 5254,2698 5221,2673" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<polygon points=" 5359,2824 5534,2723 5562,2775 5584,2829 5601,2886 5614,2943 5622,3001 5624,3060 + 5422,3060 5420,3019 5415,2978 5406,2938 5393,2898 5378,2861" fill="#00b000" + stroke="#000000" stroke-width="15px"/> +<!-- Text --> +<text xml:space="preserve" x="3105" y="5130" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Most Urgent</text> +<!-- Line --> +<defs> +<polygon points=" 8078,3060 8280,3060 8277,3119 8269,3177 8256,3234 8239,3291 8217,3345 8189,3397 + 8014,3296 8033,3259 8048,3222 8061,3182 8070,3142 8075,3101" id="p0"/> +<pattern id="tile0" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p0" fill="#ffd600"/> +<use xlink:href="#p0" fill="url(#tile0)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 8014,3296 8189,3397 8158,3447 8122,3494 8082,3537 8039,3577 7992,3613 7943,3644 + 7841,3469 7876,3447 7909,3422 7938,3393 7967,3364 7992,3331" id="p1"/> +<pattern id="tile1" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p1" fill="#ffd600"/> +<use xlink:href="#p1" fill="url(#tile1)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7841,3469 7943,3644 7890,3672 7836,3694 7779,3711 7722,3724 7664,3732 7605,3734 + 7605,3532 7646,3530 7687,3525 7727,3516 7767,3503 7804,3488" id="p2"/> +<pattern id="tile2" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p2" fill="#ffd600"/> +<use xlink:href="#p2" fill="url(#tile2)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7605,3532 7605,3734 7546,3732 7488,3724 7431,3711 7374,3694 7320,3672 7268,3644 + 7369,3469 7406,3488 7443,3503 7483,3516 7523,3525 7564,3530" id="p3"/> +<pattern id="tile3" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p3" fill="#ffd600"/> +<use xlink:href="#p3" fill="url(#tile3)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7369,3469 7268,3644 7218,3613 7171,3577 7128,3537 7088,3494 7052,3447 7021,3398 + 7196,3296 7218,3331 7243,3364 7272,3393 7301,3422 7334,3447" id="p4"/> +<pattern id="tile4" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p4" fill="#ffd600"/> +<use xlink:href="#p4" fill="url(#tile4)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7196,3296 7021,3398 6993,3345 6971,3291 6954,3234 6941,3177 6933,3119 6931,3060 + 7133,3060 7135,3101 7140,3142 7149,3182 7162,3222 7177,3259" id="p5"/> +<pattern id="tile5" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p5" fill="#ffd600"/> +<use xlink:href="#p5" fill="url(#tile5)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7133,3060 6931,3060 6933,3001 6941,2943 6954,2886 6971,2829 6993,2775 7021,2723 + 7196,2824 7177,2861 7162,2898 7149,2938 7140,2978 7135,3019" id="p6"/> +<pattern id="tile6" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p6" fill="#ffd600"/> +<use xlink:href="#p6" fill="url(#tile6)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7196,2824 7021,2723 7052,2673 7088,2626 7128,2583 7171,2543 7218,2507 7268,2476 + 7369,2651 7334,2673 7301,2698 7272,2727 7243,2756 7218,2789" id="p7"/> +<pattern id="tile7" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p7" fill="#ffd600"/> +<use xlink:href="#p7" fill="url(#tile7)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7369,2651 7268,2476 7320,2448 7374,2426 7431,2409 7488,2396 7546,2388 7605,2386 + 7605,2588 7564,2590 7523,2595 7483,2604 7443,2617 7406,2632" id="p8"/> +<pattern id="tile8" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p8" fill="#ffd600"/> +<use xlink:href="#p8" fill="url(#tile8)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7605,2588 7605,2386 7664,2388 7722,2396 7779,2409 7836,2426 7890,2448 7942,2476 + 7841,2651 7804,2632 7767,2617 7727,2604 7687,2595 7646,2590" id="p9"/> +<pattern id="tile9" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p9" fill="#ffd600"/> +<use xlink:href="#p9" fill="url(#tile9)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 7841,2651 7942,2476 7992,2507 8039,2543 8082,2583 8122,2626 8158,2673 8189,2723 + 8014,2824 7992,2789 7967,2756 7938,2727 7909,2698 7876,2673" id="p10"/> +<pattern id="tile10" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p10" fill="#ffd600"/> +<use xlink:href="#p10" fill="url(#tile10)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 8014,2824 8189,2723 8217,2775 8239,2829 8256,2886 8269,2943 8277,3001 8279,3060 + 8077,3060 8075,3019 8070,2978 8061,2938 8048,2898 8033,2861" id="p11"/> +<pattern id="tile11" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p11" fill="#ffd600"/> +<use xlink:href="#p11" fill="url(#tile11)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1733,3060 1935,3060 1932,3119 1924,3177 1911,3234 1894,3291 1872,3345 1844,3397 + 1669,3296 1688,3259 1703,3222 1716,3182 1725,3142 1730,3101" id="p12"/> +<pattern id="tile12" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p12" fill="#00b000"/> +<use xlink:href="#p12" fill="url(#tile12)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1669,3296 1844,3397 1813,3447 1777,3494 1737,3537 1694,3577 1647,3613 1598,3644 + 1496,3469 1531,3447 1564,3422 1593,3393 1622,3364 1647,3331" id="p13"/> +<pattern id="tile13" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p13" fill="#00b000"/> +<use xlink:href="#p13" fill="url(#tile13)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1496,3469 1598,3644 1545,3672 1491,3694 1434,3711 1377,3724 1319,3732 1260,3734 + 1260,3532 1301,3530 1342,3525 1382,3516 1422,3503 1459,3488" id="p14"/> +<pattern id="tile14" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p14" fill="#00b000"/> +<use xlink:href="#p14" fill="url(#tile14)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1260,3532 1260,3734 1201,3732 1143,3724 1086,3711 1029,3694 975,3672 923,3644 + 1024,3469 1061,3488 1098,3503 1138,3516 1178,3525 1219,3530" id="p15"/> +<pattern id="tile15" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p15" fill="#00b000"/> +<use xlink:href="#p15" fill="url(#tile15)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1024,3469 923,3644 873,3613 826,3577 783,3537 743,3494 707,3447 676,3398 851,3296 + 873,3331 898,3364 927,3393 956,3422 989,3447" id="p16"/> +<pattern id="tile16" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p16" fill="#00b000"/> +<use xlink:href="#p16" fill="url(#tile16)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 851,3296 676,3398 648,3345 626,3291 609,3234 596,3177 588,3119 586,3060 788,3060 + 790,3101 795,3142 804,3182 817,3222 832,3259" id="p17"/> +<pattern id="tile17" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p17" fill="#00b000"/> +<use xlink:href="#p17" fill="url(#tile17)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 788,3060 586,3060 588,3001 596,2943 609,2886 626,2829 648,2775 676,2723 851,2824 + 832,2861 817,2898 804,2938 795,2978 790,3019" id="p18"/> +<pattern id="tile18" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p18" fill="#00b000"/> +<use xlink:href="#p18" fill="url(#tile18)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 851,2824 676,2723 707,2673 743,2626 783,2583 826,2543 873,2507 923,2476 1024,2651 + 989,2673 956,2698 927,2727 898,2756 873,2789" id="p19"/> +<pattern id="tile19" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p19" fill="#00b000"/> +<use xlink:href="#p19" fill="url(#tile19)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1024,2651 923,2476 975,2448 1029,2426 1086,2409 1143,2396 1201,2388 1260,2386 + 1260,2588 1219,2590 1178,2595 1138,2604 1098,2617 1061,2632" id="p20"/> +<pattern id="tile20" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p20" fill="#00b000"/> +<use xlink:href="#p20" fill="url(#tile20)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1260,2588 1260,2386 1319,2388 1377,2396 1434,2409 1491,2426 1545,2448 1597,2476 + 1496,2651 1459,2632 1422,2617 1382,2604 1342,2595 1301,2590" id="p21"/> +<pattern id="tile21" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p21" fill="#00b000"/> +<use xlink:href="#p21" fill="url(#tile21)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1496,2651 1597,2476 1647,2507 1694,2543 1737,2583 1777,2626 1813,2673 1844,2723 + 1669,2824 1647,2789 1622,2756 1593,2727 1564,2698 1531,2673" id="p22"/> +<pattern id="tile22" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p22" fill="#00b000"/> +<use xlink:href="#p22" fill="url(#tile22)" + stroke="#000000" stroke-width="15px"/> +<!-- Line --> +<defs> +<polygon points=" 1669,2824 1844,2723 1872,2775 1894,2829 1911,2886 1924,2943 1932,3001 1934,3060 + 1732,3060 1730,3019 1725,2978 1716,2938 1703,2898 1688,2861" id="p23"/> +<pattern id="tile23" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#000000" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p23" fill="#00b000"/> +<use xlink:href="#p23" fill="url(#tile23)" + stroke="#000000" stroke-width="15px"/> +<!-- Text --> +<text xml:space="preserve" x="2160" y="1980" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Global</text> +<!-- Text --> +<text xml:space="preserve" x="2160" y="2160" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">tasks</text> +<!-- Text --> +<text xml:space="preserve" x="2160" y="2340" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">(locked)</text> +<!-- Text --> +<text xml:space="preserve" x="4230" y="2250" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">tasks</text> +<!-- Text --> +<text xml:space="preserve" x="4230" y="2070" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Local</text> +<!-- Text --> +<text xml:space="preserve" x="8550" y="2340" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">(locked)</text> +<!-- Text --> +<text xml:space="preserve" x="8550" y="2160" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">timers</text> +<!-- Text --> +<text xml:space="preserve" x="8550" y="1980" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Global</text> +<!-- Text --> +<text xml:space="preserve" x="10530" y="2250" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">timers</text> +<!-- Text --> +<text xml:space="preserve" x="10530" y="2070" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Local</text> +<!-- Text --> +<text xml:space="preserve" x="3105" y="1260" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Local ?</text> +<!-- Text --> +<text xml:space="preserve" x="3375" y="1530" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="start">Yes</text> +<!-- Text --> +<text xml:space="preserve" x="2790" y="1530" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="end">No</text> +<!-- Text --> +<text xml:space="preserve" x="9450" y="1260" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Local ?</text> +<!-- Text --> +<text xml:space="preserve" x="9720" y="1530" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="start">Yes</text> +<!-- Text --> +<text xml:space="preserve" x="9135" y="1530" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="end">No</text> +<!-- Text --> +<g transform="translate(4410,8415) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Class?</text> +</g><!-- Arc --> +<defs> +<clipPath id="cp13"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 11065,2294 11189,2298 11078,2353 11217,2303 11213,2283z"/> +</clipPath> +</defs> +<path d="M 11205,3825 A 771 771 0 0 1 11205 2295" clip-path="url(#cp13)" + stroke="#000000" stroke-width="15px"/> +<!-- Forward arrow to point 11205,2295 --> +<polyline points=" 11065,2294 11189,2298 11078,2353" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8"/> +<!-- Arc --> +<defs> +<clipPath id="cp14"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 11065,2294 11189,2298 11078,2353 11217,2303 11213,2283z + M 11525,3826 11401,3822 11512,3767 11373,3817 11377,3837z"/> +</clipPath> +</defs> +<path d="M 11385,3825 A 771 771 0 0 0 11385 2295" clip-path="url(#cp14)" + stroke="#000000" stroke-width="15px"/> +<!-- Backward arrow to point 11385,3825 --> +<polyline points=" 11525,3826 11401,3822 11512,3767" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8"/> +<!-- Arc --> +<defs> +<clipPath id="cp15"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 11065,2294 11189,2298 11078,2353 11217,2303 11213,2283z + M 10908,3277 10928,3240 10965,3260 10897,3092 10877,3098z"/> +</clipPath> +</defs> +<path d="M 10890,3105 A 406 406 0 1 0 10890 3015" clip-path="url(#cp15)" + stroke="#000000" stroke-width="15px"/> +<!-- Backward arrow to point 10890,3105 --> +<polygon points=" 10965,3260 10895,3124 10908,3277 10928,3240 10965,3260" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Arc --> +<defs> +<clipPath id="cp16"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 7375,2294 7499,2298 7388,2353 7527,2303 7523,2283z + M 10908,3277 10928,3240 10965,3260 10897,3092 10877,3098z"/> +</clipPath> +</defs> +<path d="M 7515,3825 A 771 771 0 0 1 7515 2295" clip-path="url(#cp16)" + stroke="#000000" stroke-width="15px"/> +<!-- Forward arrow to point 7515,2295 --> +<polyline points=" 7375,2294 7499,2298 7388,2353" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8"/> +<!-- Arc --> +<defs> +<clipPath id="cp17"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 7375,2294 7499,2298 7388,2353 7527,2303 7523,2283z + M 7835,3826 7711,3822 7822,3767 7683,3817 7687,3837z"/> +</clipPath> +</defs> +<path d="M 7695,3825 A 771 771 0 0 0 7695 2295" clip-path="url(#cp17)" + stroke="#000000" stroke-width="15px"/> +<!-- Backward arrow to point 7695,3825 --> +<polyline points=" 7835,3826 7711,3822 7822,3767" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8"/> +<!-- Arc --> +<defs> +<clipPath id="cp18"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 7375,2294 7499,2298 7388,2353 7527,2303 7523,2283z + M 7218,3277 7238,3240 7275,3260 7207,3092 7187,3098z"/> +</clipPath> +</defs> +<path d="M 7200,3105 A 406 406 0 1 0 7200 3015" clip-path="url(#cp18)" + stroke="#000000" stroke-width="15px"/> +<!-- Backward arrow to point 7200,3105 --> +<polygon points=" 7275,3260 7205,3124 7218,3277 7238,3240 7275,3260" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Arc --> +<defs> +<clipPath id="cp19"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 4720,2294 4844,2298 4733,2353 4872,2303 4868,2283z + M 7218,3277 7238,3240 7275,3260 7207,3092 7187,3098z"/> +</clipPath> +</defs> +<path d="M 4860,3825 A 771 771 0 0 1 4860 2295" clip-path="url(#cp19)" + stroke="#000000" stroke-width="15px"/> +<!-- Forward arrow to point 4860,2295 --> +<polyline points=" 4720,2294 4844,2298 4733,2353" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8"/> +<!-- Arc --> +<defs> +<clipPath id="cp20"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 4720,2294 4844,2298 4733,2353 4872,2303 4868,2283z + M 5180,3826 5056,3822 5167,3767 5028,3817 5032,3837z"/> +</clipPath> +</defs> +<path d="M 5040,3825 A 771 771 0 0 0 5040 2295" clip-path="url(#cp20)" + stroke="#000000" stroke-width="15px"/> +<!-- Backward arrow to point 5040,3825 --> +<polyline points=" 5180,3826 5056,3822 5167,3767" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8"/> +<!-- Arc --> +<defs> +<clipPath id="cp21"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 4720,2294 4844,2298 4733,2353 4872,2303 4868,2283z + M 4563,3277 4583,3240 4620,3260 4552,3092 4532,3098z"/> +</clipPath> +</defs> +<path d="M 4545,3105 A 406 406 0 1 0 4545 3015" clip-path="url(#cp21)" + stroke="#000000" stroke-width="15px"/> +<!-- Backward arrow to point 4545,3105 --> +<polygon points=" 4620,3260 4550,3124 4563,3277 4583,3240 4620,3260" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Arc --> +<defs> +<clipPath id="cp22"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 1030,2294 1154,2298 1043,2353 1182,2303 1178,2283z + M 4563,3277 4583,3240 4620,3260 4552,3092 4532,3098z"/> +</clipPath> +</defs> +<path d="M 1170,3825 A 771 771 0 0 1 1170 2295" clip-path="url(#cp22)" + stroke="#000000" stroke-width="15px"/> +<!-- Forward arrow to point 1170,2295 --> +<polyline points=" 1030,2294 1154,2298 1043,2353" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8"/> +<!-- Arc --> +<defs> +<clipPath id="cp23"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 1030,2294 1154,2298 1043,2353 1182,2303 1178,2283z + M 1490,3826 1366,3822 1477,3767 1338,3817 1342,3837z"/> +</clipPath> +</defs> +<path d="M 1350,3825 A 771 771 0 0 0 1350 2295" clip-path="url(#cp23)" + stroke="#000000" stroke-width="15px"/> +<!-- Backward arrow to point 1350,3825 --> +<polyline points=" 1490,3826 1366,3822 1477,3767" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8"/> +<!-- Arc --> +<defs> +<clipPath id="cp24"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 1030,2294 1154,2298 1043,2353 1182,2303 1178,2283z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<path d="M 855,3105 A 406 406 0 1 0 855 3015" clip-path="url(#cp24)" + stroke="#000000" stroke-width="15px"/> +<!-- Backward arrow to point 855,3105 --> +<polygon points=" 930,3260 860,3124 873,3277 893,3240 930,3260" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp25"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 9315,4660 9270,4705 9225,4660 9252,4923 9288,4923z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 7605,3870 7605,4185 9270,4545 9270,4905" clip-path="url(#cp25)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 9270,4905 --> +<polygon points=" 9225,4660 9270,4885 9315,4660 9270,4705 9225,4660" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp26"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 9681,4660 9636,4705 9591,4660 9618,4923 9654,4923z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 11301,3870 11301,4185 9636,4545 9636,4905" clip-path="url(#cp26)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 9636,4905 --> +<polygon points=" 9591,4660 9636,4885 9681,4660 9636,4705 9591,4660" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp27"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 11338,2050 11293,2095 11248,2051 11277,2313 11313,2313z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 9630,1395 9626,1591 11291,1800 11295,2295" clip-path="url(#cp27)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 11295,2295 --> +<polygon points=" 11248,2051 11295,2275 11338,2050 11293,2095 11248,2051" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp28"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 7650,2050 7605,2095 7560,2050 7587,2313 7623,2313z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 9270,1395 9270,1575 7605,1800 7605,2295" clip-path="url(#cp28)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7605,2295 --> +<polygon points=" 7560,2050 7605,2275 7650,2050 7605,2095 7560,2050" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp29"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 9495,790 9450,835 9405,790 9432,1053 9468,1053z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 9450,360 9450,1035" clip-path="url(#cp29)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 9450,1035 --> +<polygon points=" 9405,790 9450,1015 9495,790 9450,835 9405,790" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp30"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 2970,4660 2925,4705 2880,4660 2907,4923 2943,4923z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 1260,3870 1260,4185 2925,4545 2925,4905" clip-path="url(#cp30)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 2925,4905 --> +<polygon points=" 2880,4660 2925,4885 2970,4660 2925,4705 2880,4660" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp31"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 3336,4660 3291,4705 3246,4660 3273,4923 3309,4923z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 4956,3870 4956,4185 3291,4545 3291,4905" clip-path="url(#cp31)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 3291,4905 --> +<polygon points=" 3246,4660 3291,4885 3336,4660 3291,4705 3246,4660" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp32"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 3150,790 3105,835 3060,790 3087,1053 3123,1053z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 3105,360 3105,1035" clip-path="url(#cp32)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 3105,1035 --> +<polygon points=" 3060,790 3105,1015 3150,790 3105,835 3060,790" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp33"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 4995,2140 4950,2185 4905,2140 4932,2403 4968,2403z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 3285,1395 3285,1575 4950,1845 4950,2385" clip-path="url(#cp33)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 4950,2385 --> +<polygon points=" 4905,2140 4950,2365 4995,2140 4950,2185 4905,2140" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp34"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 1305,2140 1260,2185 1215,2140 1242,2403 1278,2403z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 2925,1395 2925,1575 1260,1845 1260,2385" clip-path="url(#cp34)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 1260,2385 --> +<polygon points=" 1215,2140 1260,2365 1305,2140 1260,2185 1215,2140" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp35"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 2005,7605 2050,7650 2005,7695 2268,7668 2268,7632z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 315,7650 2250,7650" clip-path="url(#cp35)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 2250,7650 --> +<polygon points=" 2005,7695 2230,7650 2005,7605 2050,7650 2005,7695" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp36"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 6955,6435 7000,6480 6955,6525 7218,6498 7218,6462z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 6075,6480 7200,6480" clip-path="url(#cp36)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 7200,6480 --> +<polygon points=" 6955,6525 7180,6480 6955,6435 7000,6480 6955,6525" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Line --> +<defs> +<clipPath id="cp37"> + <path clip-rule="evenodd" d="M 237,327 H 12363 V 9618 H 237 z + M 3895,8370 3940,8415 3895,8460 4158,8433 4158,8397z + M 873,3277 893,3240 930,3260 862,3092 842,3098z"/> +</clipPath> +</defs> +<polyline points=" 2610,7830 3195,8415 4140,8415" clip-path="url(#cp37)" + stroke="#000000" stroke-width="30px" stroke-linejoin="round"/> +<!-- Forward arrow to point 4140,8415 --> +<polygon points=" 3895,8460 4120,8415 3895,8370 3940,8415 3895,8460" + stroke="#000000" stroke-width="8px" stroke-miterlimit="8" fill="#000000"/> +<!-- Text --> +<g transform="translate(12240,3060) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="108" text-anchor="middle">past</text> +</g><!-- Text --> +<g transform="translate(10440,3060) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="108" text-anchor="middle">future</text> +</g><!-- Text --> +<g transform="translate(8550,3060) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="108" text-anchor="middle">past</text> +</g><!-- Text --> +<g transform="translate(6750,3060) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="108" text-anchor="middle">future</text> +</g><!-- Text --> +<text xml:space="preserve" x="9450" y="5130" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Oldest</text> +<!-- Text --> +<g transform="translate(405,3060) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="108" text-anchor="middle">newest</text> +</g><!-- Text --> +<g transform="translate(2205,3060) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="108" text-anchor="middle">oldest</text> +</g><!-- Text --> +<g transform="translate(4095,3060) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="108" text-anchor="middle">newest</text> +</g><!-- Text --> +<g transform="translate(5895,3060) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="108" text-anchor="middle">oldest</text> +</g><!-- Text --> +<text xml:space="preserve" x="9135" y="5850" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">runqueue-depth</text> +<!-- Text --> +<text xml:space="preserve" x="3195" y="5715" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">runqueue-depth</text> +<!-- Text --> +<text xml:space="preserve" x="9450" y="3600" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="middle">Time-based</text> +<!-- Text --> +<text xml:space="preserve" x="9450" y="3780" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="middle">Wait queues</text> +<!-- Text --> +<text xml:space="preserve" x="9000" y="4005" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="start">- 1 global</text> +<!-- Text --> +<text xml:space="preserve" x="9000" y="4185" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="start">- 1 per thread</text> +<!-- Text --> +<text xml:space="preserve" x="3105" y="3600" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="middle">Priority-based</text> +<!-- Text --> +<text xml:space="preserve" x="3105" y="3780" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="middle">Run queues</text> +<!-- Text --> +<text xml:space="preserve" x="2655" y="4005" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="start">- 1 global</text> +<!-- Text --> +<text xml:space="preserve" x="2655" y="4185" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="start">- 1 per thread</text> +<!-- Text --> +<text xml:space="preserve" x="3240" y="585" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">task_wakeup()</text> +<!-- Text --> +<text xml:space="preserve" x="9585" y="630" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">task_schedule()</text> +<!-- Text --> +<text xml:space="preserve" x="9585" y="450" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">task_queue()</text> +<!-- Text --> +<text xml:space="preserve" x="315" y="7560" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">tasklet_wakeup()</text> +<!-- Text --> +<text xml:space="preserve" x="12285" y="7515" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="end">t->process()</text> +<!-- Text --> +<text xml:space="preserve" x="12285" y="7335" fill="#ff0000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="end">Run!</text> +<!-- Text --> +<text xml:space="preserve" x="10035" y="7515" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="start">37%</text> +<!-- Text --> +<text xml:space="preserve" x="10080" y="8955" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="start">=1</text> +<!-- Text --> +<text xml:space="preserve" x="5085" y="6255" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="120" text-anchor="middle">(accessed using atomic ops)</text> +<!-- Text --> +<text xml:space="preserve" x="10035" y="6795" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="start">50%</text> +<!-- Text --> +<text xml:space="preserve" x="10035" y="8235" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="start">13%</text> +<!-- Text --> +<g transform="translate(2745,8100) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="end">Yes</text> +</g><!-- Text --> +<g transform="translate(2520,7650) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Local ?</text> +</g><!-- Text --> +<g transform="translate(2700,7110) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="108" text-anchor="start">No</text> +</g><!-- Text --> +<text xml:space="preserve" x="4725" y="8460" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">TASK_SELF_WAKING</text> +<!-- Text --> +<text xml:space="preserve" x="4725" y="8820" fill="#000000" font-family="Courier" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">TASK_HEAVY</text> +<!-- Text --> +<text xml:space="preserve" x="4725" y="8010" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="120" text-anchor="start">(default)</text> +<!-- Text --> +<text xml:space="preserve" x="4725" y="7650" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="120" text-anchor="start">In I/O or signals</text> +<!-- Text --> +<g transform="translate(10815,7695) rotate(-90)" > +<text xml:space="preserve" x="0" y="0" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="middle">Most Urgent</text> +</g><!-- Text --> +<text xml:space="preserve" x="9990" y="6480" fill="#ff0000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">order</text> +<!-- Text --> +<text xml:space="preserve" x="9990" y="6300" fill="#ff0000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="120" text-anchor="start">Scan</text> +<!-- Text --> +<text xml:space="preserve" x="6030" y="9450" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="bold" font-size="144" text-anchor="middle">5 class-based tasklet queues per thread (one accessible from remote threads)</text> +<!-- Line --> +<polygon points=" 7234,6838 9776,6838 9776,6398 7234,6398" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 7234,7558 9776,7558 9776,7118 7234,7118" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 7234,8278 9776,8278 9776,7838 7234,7838" fill="#dae8fc"/> +<!-- Line --> +<polygon points=" 7234,8998 9776,8998 9776,8558 7234,8558" fill="#dae8fc"/> +<!-- Line --> +<defs> +<polygon points=" 4166,6838 6094,6838 6094,6398 4166,6398" id="p24"/> +<pattern id="tile24" patternUnits="userSpaceOnUse" + x="0" y="0" width="134" height="67"> +<g stroke-width="7.5" stroke="#a7ceb3" fill="none"> +<path d="M -7,30 73,70 M 61,-3 141,37 M -7,37 73,-3 M 61,70 141,30"/> +</g> +</pattern> +</defs> +<use xlink:href="#p24" fill="#bbf2e2"/> +<use xlink:href="#p24" fill="url(#tile24)"/> +<!-- Line --> +<polyline points=" 7234,6398 9776,6398 9776,6838 7234,6838" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 7234,7118 9776,7118 9776,7558 7234,7558" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 7234,8558 9776,8558 9776,8998 7234,8998" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 4166,6398 6094,6398 6094,6838 4166,6838" + stroke="#868286" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 7234,7838 9776,7838 9776,8278 7234,8278" + stroke="#458dba" stroke-width="45px"/> +<!-- Line --> +<polyline points=" 9613,6398 9613,6838" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9438,6398 9438,6838" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9264,6398 9264,6838" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9613,7118 9613,7558" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9438,7118 9438,7558" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9264,7118 9264,7558" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9613,7838 9613,8278" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9438,7838 9438,8278" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9264,7838 9264,8278" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9613,8558 9613,8998" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9438,8558 9438,8998" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 9264,8558 9264,8998" + stroke="#458dba" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 5923,6398 5923,6838" + stroke="#868286" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 5748,6398 5748,6838" + stroke="#868286" stroke-width="15px"/> +<!-- Line --> +<polyline points=" 5574,6398 5574,6838" + stroke="#868286" stroke-width="15px"/> +<!-- Text --> +<text xml:space="preserve" x="8460" y="6705" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">TL_URGENT</text> +<!-- Text --> +<text xml:space="preserve" x="8460" y="7425" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">TL_NORMAL</text> +<!-- Text --> +<text xml:space="preserve" x="8460" y="8145" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">TL_BULK</text> +<!-- Text --> +<text xml:space="preserve" x="8460" y="8865" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">TL_HEAVY</text> +<!-- Text --> +<text xml:space="preserve" x="4950" y="6705" fill="#000000" font-family="AvantGarde" font-style="normal" font-weight="normal" font-size="192" text-anchor="middle">SHARED</text> +</g> +</svg> diff --git a/doc/internals/ssl_cert.dia b/doc/internals/ssl_cert.dia Binary files differnew file mode 100644 index 0000000..52496a1 --- /dev/null +++ b/doc/internals/ssl_cert.dia diff --git a/doc/internals/stats-v2.txt b/doc/internals/stats-v2.txt new file mode 100644 index 0000000..7d2ae76 --- /dev/null +++ b/doc/internals/stats-v2.txt @@ -0,0 +1,8 @@ + + Qcur Qmax Scur Smax Slim Scum Fin Fout Bin Bout Ereq Econ Ersp Sts Wght Act Bck EChk Down +Frontend - - X maxX Y totX I O I O Q - - - - - - - - +Server X maxX X maxX Y totX I O I O - C R S W A B E D +Server X maxX X maxX Y totX I O I O - C R S W A B E D +Server X maxX X maxX Y totX I O I O - C R S W A B E D +Backend X maxX X maxX Y totX I O I O - C R S totW totA totB totE totD + diff --git a/doc/internals/stream-sock-states.fig b/doc/internals/stream-sock-states.fig new file mode 100644 index 0000000..79131e5 --- /dev/null +++ b/doc/internals/stream-sock-states.fig @@ -0,0 +1,535 @@ +#FIG 3.2 Produced by xfig version 2.0 +Portrait +Center +Metric +A4 +100.00 +Single +-2 +1200 2 +0 32 #8e8e8e +6 2295 1260 2430 1395 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2363 1328 68 68 2430 1328 2295 1328 +4 1 0 50 -1 18 5 0.0000 4 60 60 2363 1361 1\001 +-6 +6 1845 2295 1980 2430 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 1913 2363 68 68 1980 2363 1845 2363 +4 1 0 50 -1 18 5 0.0000 4 60 60 1913 2396 2\001 +-6 +6 2475 2340 2610 2475 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2543 2408 68 68 2610 2408 2475 2408 +4 1 0 50 -1 18 5 0.0000 4 60 60 2543 2441 9\001 +-6 +6 2835 2610 2970 2745 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2903 2678 68 68 2970 2678 2835 2678 +4 1 0 50 -1 18 5 0.0000 4 60 60 2903 2711 7\001 +-6 +6 3195 2025 3330 2160 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 3263 2093 68 68 3330 2093 3195 2093 +4 1 0 50 -1 18 5 0.0000 4 60 60 3263 2126 8\001 +-6 +6 2745 2160 2880 2295 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2813 2228 68 68 2880 2228 2745 2228 +4 1 0 50 -1 18 5 0.0000 4 60 60 2813 2261 6\001 +-6 +6 990 2700 1125 2835 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 1058 2768 68 68 1125 2768 990 2768 +4 1 0 50 -1 18 5 0.0000 4 60 120 1058 2801 13\001 +-6 +6 1305 2970 1440 3105 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 1373 3038 68 68 1440 3038 1305 3038 +4 1 0 50 -1 18 5 0.0000 4 60 120 1373 3071 12\001 +-6 +6 3105 1710 3240 1845 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 3173 1778 68 68 3240 1778 3105 1778 +4 1 0 50 -1 18 5 0.0000 4 60 120 3173 1811 15\001 +-6 +6 4275 1260 4410 1395 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 1328 68 68 4410 1328 4275 1328 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 1361 1\001 +-6 +6 4275 1440 4410 1575 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 1508 68 68 4410 1508 4275 1508 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 1541 2\001 +-6 +6 4275 1620 4410 1755 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 1688 68 68 4410 1688 4275 1688 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 1721 3\001 +-6 +6 4275 1800 4410 1935 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 1868 68 68 4410 1868 4275 1868 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 1901 4\001 +-6 +6 3240 2835 3375 2970 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 3308 2903 68 68 3375 2903 3240 2903 +4 1 0 50 -1 18 5 0.0000 4 60 120 3308 2936 16\001 +-6 +6 2835 3015 2970 3150 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2903 3083 68 68 2970 3083 2835 3083 +4 1 0 50 -1 18 5 0.0000 4 60 120 2903 3116 17\001 +-6 +6 2295 3195 2430 3330 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2363 3263 68 68 2430 3263 2295 3263 +4 1 0 50 -1 18 5 0.0000 4 60 60 2363 3296 3\001 +-6 +6 1440 4815 1620 4995 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 1508 4883 68 68 1575 4883 1440 4883 +4 1 0 50 -1 18 5 0.0000 4 60 120 1508 4916 19\001 +-6 +6 1800 3960 1980 4140 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 1868 4028 68 68 1935 4028 1800 4028 +4 1 0 50 -1 18 5 0.0000 4 60 120 1868 4061 18\001 +-6 +6 4275 1980 4410 2115 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 2048 68 68 4410 2048 4275 2048 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 2081 5\001 +-6 +6 4275 2340 4410 2475 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 2408 68 68 4410 2408 4275 2408 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 2441 6\001 +-6 +6 4275 2520 4410 2655 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 2588 68 68 4410 2588 4275 2588 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 2621 7\001 +-6 +6 4275 2700 4410 2835 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 2768 68 68 4410 2768 4275 2768 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 2801 8\001 +-6 +6 4275 2880 4410 3015 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 2948 68 68 4410 2948 4275 2948 +4 1 0 50 -1 18 5 0.0000 4 60 60 4343 2981 9\001 +-6 +6 4275 3060 4410 3195 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 3128 68 68 4410 3128 4275 3128 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 3161 10\001 +-6 +6 4275 3240 4410 3375 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 3308 68 68 4410 3308 4275 3308 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 3341 11\001 +-6 +6 4275 3420 4410 3555 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 3488 68 68 4410 3488 4275 3488 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 3521 12\001 +-6 +6 4275 3600 4410 3735 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 3668 68 68 4410 3668 4275 3668 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 3701 13\001 +-6 +6 4275 3960 4410 4095 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 4028 68 68 4410 4028 4275 4028 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 4061 15\001 +-6 +6 4275 4140 4410 4275 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 4208 68 68 4410 4208 4275 4208 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 4241 16\001 +-6 +6 4275 4320 4410 4455 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 4388 68 68 4410 4388 4275 4388 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 4421 17\001 +-6 +6 4275 3780 4455 3960 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 3848 68 68 4410 3848 4275 3848 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 3881 14\001 +-6 +6 4275 4590 4455 4770 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 4658 68 68 4410 4658 4275 4658 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 4691 18\001 +-6 +6 4275 4770 4455 4950 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 4838 68 68 4410 4838 4275 4838 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 4871 19\001 +-6 +6 4275 4950 4455 5130 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 5018 68 68 4410 5018 4275 5018 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 5051 20\001 +-6 +6 1170 3690 1350 3870 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 1238 3758 68 68 1305 3758 1170 3758 +4 1 0 50 -1 18 5 0.0000 4 60 120 1238 3791 11\001 +-6 +6 1530 3555 1710 3735 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 1598 3623 68 68 1665 3623 1530 3623 +4 1 0 50 -1 18 5 0.0000 4 60 120 1598 3656 10\001 +-6 +6 720 4095 900 4275 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 788 4163 68 68 855 4163 720 4163 +4 1 0 50 -1 18 5 0.0000 4 60 120 788 4196 14\001 +-6 +6 855 3645 1035 3825 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 923 3713 68 68 990 3713 855 3713 +4 1 0 50 -1 18 5 0.0000 4 60 120 923 3746 21\001 +-6 +6 4275 5130 4455 5310 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 5198 68 68 4410 5198 4275 5198 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 5231 21\001 +-6 +6 2295 4140 2430 4275 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2363 4208 68 68 2430 4208 2295 4208 +4 1 0 50 -1 18 5 0.0000 4 60 60 2363 4241 4\001 +-6 +6 2475 3870 2655 4050 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2543 3938 68 68 2610 3938 2475 3938 +4 1 0 50 -1 18 5 0.0000 4 60 120 2543 3971 22\001 +-6 +6 4275 5310 4455 5490 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 4343 5378 68 68 4410 5378 4275 5378 +4 1 0 50 -1 18 5 0.0000 4 60 120 4343 5411 22\001 +-6 +6 2295 5625 2430 5760 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2363 5693 68 68 2430 5693 2295 5693 +4 1 0 50 -1 18 5 0.0000 4 60 60 2363 5726 5\001 +-6 +6 2295 6480 2475 6660 +1 4 0 1 0 7 50 -1 -1 0.000 1 0.0000 2363 6548 68 68 2430 6548 2295 6548 +4 1 0 50 -1 18 5 0.0000 4 60 120 2363 6581 20\001 +-6 +1 2 0 1 0 6 50 -1 20 0.000 1 0.0000 1350 4612 225 112 1125 4612 1575 4612 +1 2 0 1 0 6 50 -1 20 0.000 1 0.0000 2250 1912 225 112 2025 1912 2475 1912 +1 2 0 1 0 7 50 -1 20 0.000 1 0.0000 1125 3487 225 112 900 3487 1350 3487 +1 2 0 1 0 7 50 -1 20 0.000 1 0.0000 2250 3712 225 112 2025 3712 2475 3712 +1 2 0 1 0 6 50 -1 20 0.000 1 0.0000 2250 2812 225 112 2025 2812 2475 2812 +1 2 0 1 0 7 50 -1 20 0.000 1 0.0000 3375 2362 225 112 3150 2362 3600 2362 +1 2 0 1 0 7 50 -1 20 0.000 1 0.0000 2250 1012 225 112 2025 1012 2475 1012 +1 2 0 1 0 6 50 -1 20 0.000 1 0.0000 2250 6232 225 112 2025 6232 2475 6232 +1 2 0 1 0 7 50 -1 20 0.000 1 0.0000 2250 5422 225 112 2025 5422 2475 5422 +1 2 0 1 0 7 50 -1 20 0.000 1 0.0000 2250 6997 225 112 2025 6997 2475 6997 +1 2 0 1 0 6 50 -1 20 0.000 1 0.0000 2250 4587 225 112 2025 4587 2475 4587 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 2250 1125 2250 1800 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8910 5805 4500 5805 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 6885 5900 6930 5990 6975 5810 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 6885 6570 6930 6660 6975 6480 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 5310 5589 5310 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 5670 5589 5670 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6030 5589 6030 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6390 5589 6390 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6750 5589 6750 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 7110 5589 7110 6921 +2 1 0 2 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 4950 5589 4950 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8910 6705 4500 6705 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 2250 5535 2250 6120 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 120.00 + 2250 6345 2250 6885 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 4500 5580 8910 5580 8910 6930 4500 6930 4500 5580 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 4500 5580 8910 5580 8910 6930 4500 6930 4500 5580 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8910 6030 4500 6030 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8910 6255 4500 6255 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8910 6480 4500 6480 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 5310 5589 5310 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 5670 5589 5670 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6030 5589 6030 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6390 5589 6390 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6750 5589 6750 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 7110 5589 7110 6921 +2 1 0 2 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 4950 5589 4950 6921 +2 2 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 5 + 4500 5580 8910 5580 8910 6930 4500 6930 4500 5580 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 5805 4500 5805 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 6030 4500 6030 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 6255 4500 6255 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 6480 4500 6480 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 5310 5589 5310 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 5670 5589 5670 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6030 5589 6030 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6390 5589 6390 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6750 5589 6750 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 7110 5589 7110 6921 +2 1 0 2 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 4950 5589 4950 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 6705 4500 6705 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 5805 4500 5805 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 6030 4500 6030 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 6255 4500 6255 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 6480 4500 6480 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 5310 5589 5310 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 5670 5589 5670 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6030 5589 6030 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6390 5589 6390 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 6750 5589 6750 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 7110 5589 7110 6921 +2 1 0 2 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 4950 5589 4950 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8865 6705 4500 6705 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 7605 5890 7650 5980 7695 5800 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 7605 6570 7650 6660 7695 6480 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 7470 5589 7470 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 7965 5890 8010 5980 8055 5800 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 7965 6570 8010 6660 8055 6480 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 7830 5589 7830 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 8325 6570 8370 6660 8415 6480 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8190 5589 8190 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8550 5589 8550 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8190 5589 8190 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8550 5589 8550 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8190 5589 8190 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8550 5589 8550 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8190 5589 8190 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 8550 5589 8550 6921 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 1 + 4500 5805 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 1 + 4500 6030 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 1 + 4500 6255 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 1 + 4500 6480 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 1 + 4500 6705 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 2250 2700 2475 2475 2475 2250 2250 2025 + 0.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 3375 2250 2925 2025 2475 1935 + 0.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 3375 2475 3375 2700 2475 2835 + 0.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 3420 2475 3420 4320 3150 5850 2475 6165 + 0.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 1125 3375 1125 2925 2025 2790 + 0.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 1125 3375 1125 2250 2025 1935 + 0.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 6 + 1 1 1.00 60.00 120.00 + 2475 1890 3825 1800 3825 2520 3825 4500 3150 6075 2475 6210 + 0.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 2250 2025 2025 2250 2025 2475 2250 2700 + 0.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 2250 3825 2250 4500 + 0.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 2475 1980 2880 2115 3150 2340 + 0.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 2250 2925 2250 3600 + 0.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 2205 3825 2070 4140 1622 4221 1440 4500 + 0.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 1350 4725 1350 4950 1485 5760 2025 6165 + 0.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 7 + 1 1 1.00 60.00 120.00 + 1125 4590 720 4455 675 4050 675 3600 675 2250 1350 1800 + 2025 1935 + 0.000 1.000 1.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 3 + 1 1 1.00 60.00 120.00 + 1260 4500 1125 4320 1125 3600 + 0.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 1350 4500 1440 3645 1575 3330 2070 2880 + 0.000 1.000 1.000 0.000 +3 0 0 1 32 7 51 -1 -1 0.000 0 1 0 5 + 1 1 1.00 60.00 120.00 + 1035 3600 990 4365 990 5040 1395 5895 2025 6210 + 0.000 1.000 1.000 1.000 0.000 +3 0 0 1 32 7 51 -1 -1 0.000 0 1 0 5 + 1 1 1.00 60.00 120.00 + 2340 3825 2385 4005 2925 4275 2655 4815 2295 5310 + 0.000 1.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 4 + 1 1 1.00 60.00 120.00 + 2475 2835 3150 3375 3150 5625 2475 6120 + 0.000 1.000 1.000 0.000 +3 0 0 1 0 7 50 -1 -1 0.000 0 1 0 2 + 1 1 1.00 60.00 120.00 + 2250 4725 2250 5310 + 0.000 0.000 +4 0 0 50 -1 14 6 0.0000 4 105 2880 4500 1710 ASS-CON: ssui(): connect_server() == SN_ERR_NONE\001 +4 0 0 50 -1 14 6 0.0000 4 90 540 4500 1350 INI-REQ: \001 +4 0 0 50 -1 14 6 0.0000 4 120 3720 4500 1530 REQ-ASS: prepare_conn_request(): srv_redispatch_connect() == 0\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 2475 2700 4\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 1620 4500 6\001 +4 0 0 50 -1 14 6 0.0000 4 120 3360 4500 1890 CON-EST: sess_update_st_con_tcp(): !timeout && !conn_err\001 +4 0 0 50 -1 14 6 0.0000 4 105 2460 4500 3510 TAR-ASS: ssui(): SI_FL_EXP && SN_ASSIGNED\001 +4 0 0 50 -1 14 6 0.0000 4 105 3420 4500 2970 ASS-REQ: connect_server: conn_retries == 0 && PR_O_REDISP\001 +4 0 0 50 -1 14 6 0.0000 4 120 2460 4500 2610 QUE-REQ: ssui(): !pend_pos && SN_ASSIGNED\001 +4 0 0 50 -1 14 6 0.0000 4 120 2520 4500 2790 QUE-REQ: ssui(): !pend_pos && !SN_ASSIGNED\001 +4 0 0 50 -1 14 6 0.0000 4 120 3300 4500 4230 QUE-CLO: ssui(): pend_pos && (SI_FL_EXP || req_aborted)\001 +4 0 0 50 -1 14 6 0.0000 4 105 2520 4500 3690 TAR-REQ: ssui(): SI_FL_EXP && !SN_ASSIGNED\001 +4 0 0 50 -1 14 6 0.0000 4 120 3960 4500 4545 ASS-CLO: PR_O_REDISP && SN_REDIRECTABLE && perform_http_redirect()\001 +4 0 0 50 -1 14 6 0.0000 4 120 4440 4500 2430 REQ-QUE: prepare_conn_request(): srv_redispatch_connect() != 0 (SI_ST_QUE)\001 +4 0 0 50 -1 14 6 0.0000 4 120 4200 4500 4050 REQ-CLO: prepare_conn_request(): srv_redispatch_connect() != 0 (error)\001 +4 0 0 50 -1 14 6 0.0000 4 105 4320 4500 4410 ASS-CLO: ssui(): connect_server() == SN_ERR_INTERNAL || conn_retries < 0\001 +4 0 0 50 -1 14 6 0.0000 4 120 3120 4500 4680 CON-CER: sess_update_st_con_tcp(): timeout/SI_FL_ERR\001 +4 0 0 50 -1 14 6 0.0000 4 120 3600 4500 4860 CER-CLO: sess_update_st_cer(): (ERR/EXP) && conn_retries < 0\001 +4 0 0 50 -1 14 6 0.0000 4 120 4200 4500 3870 CER-REQ: sess_update_st_cer(): timeout && !conn_retries && PR_O_REDISP\001 +4 0 0 50 -1 14 6 0.0000 4 120 3600 4500 3330 CER-TAR: sess_update_st_cer(): conn_err && conn_retries >= 0\001 +4 0 0 50 -1 14 6 0.0000 4 120 4620 4500 3150 CER-ASS: sess_update_st_cer(): timeout && (conn_retries >= 0 || !PR_O_REDISP)\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 1305 3375 3\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 2430 3600 5\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 3555 2250 2\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 2430 1800 1\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 2430 900 0\001 +4 0 0 50 -1 14 6 0.0000 4 105 3000 4500 2070 EST-DIS: stream_sock_read/write/shutr/shutw: close\001 +4 0 0 50 -1 14 6 0.0000 4 120 1980 4500 2250 EST-DIS: process_session(): error\001 +4 0 0 50 -1 14 6 0.0000 4 120 2100 4500 5040 DIS-CLO: process_session(): cleanup\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 1350 4680 CER\001 +4 1 0 50 -1 14 10 0.0000 4 135 315 2250 1980 REQ\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 1125 3555 TAR\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 2250 2880 ASS\001 +4 1 0 50 -1 14 10 0.0000 4 135 315 3375 2430 QUE\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 2250 3780 CON\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 2250 1080 INI\001 +4 0 0 50 -1 14 6 0.0000 4 120 2820 4500 5220 TAR-CLO: sess_update_stream_int(): client abort\001 +4 0 0 50 -1 14 6 0.0000 4 120 2820 4500 5400 CON-DIS: sess_update_st_con_tcp(): client abort\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 5130 5985 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 5490 5985 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 5850 5985 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 6210 5985 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 6570 5985 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 7290 5985 -\001 +4 1 0 50 -1 16 7 0.0000 4 105 120 4725 5985 fd\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 5130 5760 INI\001 +4 1 0 50 -1 16 7 0.0000 4 105 270 4725 5760 state\001 +4 1 0 50 -1 14 8 0.0000 4 120 225 5490 5760 REQ\001 +4 1 0 50 -1 14 8 0.0000 4 120 225 5850 5760 QUE\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 6210 5760 TAR\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 6570 5760 ASS\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 6930 5760 CON\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 7290 5760 CER\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5850 6210 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5130 6210 0\001 +4 1 0 50 -1 16 7 0.0000 4 90 270 4725 6210 ERR\001 +4 1 0 50 -1 16 7 0.0000 4 90 270 4725 6435 EXP\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5490 6210 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6210 6210 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6570 6210 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6570 6435 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5490 6435 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5130 6435 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5850 6435 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6210 6435 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 7290 6435 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6930 6435 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 7290 6210 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6930 6210 X\001 +4 1 0 50 -1 16 7 0.0000 4 75 240 4725 6660 sess\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 5130 6660 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 5490 6660 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 5850 6660 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 6210 6660 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 6570 6660 -\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 7290 6660 -\001 +4 0 0 50 -1 16 6 0.0000 4 120 5970 675 7335 Note: states painted yellow above are transient ; process_session() will never leave a stream interface in any of those upon return.\001 +4 1 0 50 -1 16 7 0.0000 4 90 330 4725 6840 SHUT\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 7290 6840 -\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6930 6840 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6570 6840 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 6210 6840 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5850 6840 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5490 6840 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 5130 6840 0\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 2250 6300 DIS\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 2250 5490 EST\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 2250 7065 CLO\001 +4 1 0 50 -1 14 10 0.0000 4 105 315 2250 4635 RDY\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 2430 4455 7\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 2430 5310 8\001 +4 0 4 50 -1 14 10 0.0000 4 105 105 2385 6120 9\001 +4 0 4 50 -1 14 10 0.0000 4 105 210 2385 6840 10\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 7650 6210 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 7650 6435 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 7650 6840 0\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 8010 5760 EST\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8010 6210 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8010 6435 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8010 6840 0\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 8370 5985 -\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8370 6210 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8370 6435 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 8730 5760 CLO\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 8370 5760 DIS\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 8730 5985 -\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8730 6210 X\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8730 6435 X\001 +4 1 0 50 -1 14 8 0.0000 4 15 75 8730 6660 -\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8370 6840 1\001 +4 1 0 50 -1 14 8 0.0000 4 90 75 8730 6840 1\001 +4 1 0 50 -1 14 8 0.0000 4 90 225 7650 5760 RDY\001 diff --git a/doc/intro.txt b/doc/intro.txt new file mode 100644 index 0000000..adc59d5 --- /dev/null +++ b/doc/intro.txt @@ -0,0 +1,1700 @@ + ----------------------- + HAProxy Starter Guide + ----------------------- + version 2.6 + + +This document is an introduction to HAProxy for all those who don't know it, as +well as for those who want to re-discover it when they know older versions. Its +primary focus is to provide users with all the elements to decide if HAProxy is +the product they're looking for or not. Advanced users may find here some parts +of solutions to some ideas they had just because they were not aware of a given +new feature. Some sizing information is also provided, the product's lifecycle +is explained, and comparisons with partially overlapping products are provided. + +This document doesn't provide any configuration help or hints, but it explains +where to find the relevant documents. The summary below is meant to help you +search sections by name and navigate through the document. + +Note to documentation contributors : + This document is formatted with 80 columns per line, with even number of + spaces for indentation and without tabs. Please follow these rules strictly + so that it remains easily printable everywhere. If you add sections, please + update the summary below for easier searching. + + +Summary +------- + +1. Available documentation + +2. Quick introduction to load balancing and load balancers + +3. Introduction to HAProxy +3.1. What HAProxy is and is not +3.2. How HAProxy works +3.3. Basic features +3.3.1. Proxying +3.3.2. SSL +3.3.3. Monitoring +3.3.4. High availability +3.3.5. Load balancing +3.3.6. Stickiness +3.3.7. Logging +3.3.8. Statistics +3.4. Standard features +3.4.1. Sampling and converting information +3.4.2. Maps +3.4.3. ACLs and conditions +3.4.4. Content switching +3.4.5. Stick-tables +3.4.6. Formatted strings +3.4.7. HTTP rewriting and redirection +3.4.8. Server protection +3.5. Advanced features +3.5.1. Management +3.5.2. System-specific capabilities +3.5.3. Scripting +3.6. Sizing +3.7. How to get HAProxy + +4. Companion products and alternatives +4.1. Apache HTTP server +4.2. NGINX +4.3. Varnish +4.4. Alternatives + +5. Contacts + + +1. Available documentation +-------------------------- + +The complete HAProxy documentation is contained in the following documents. +Please ensure to consult the relevant documentation to save time and to get the +most accurate response to your needs. Also please refrain from sending questions +to the mailing list whose responses are present in these documents. + + - intro.txt (this document) : it presents the basics of load balancing, + HAProxy as a product, what it does, what it doesn't do, some known traps to + avoid, some OS-specific limitations, how to get it, how it evolves, how to + ensure you're running with all known fixes, how to update it, complements + and alternatives. + + - management.txt : it explains how to start haproxy, how to manage it at + runtime, how to manage it on multiple nodes, and how to proceed with + seamless upgrades. + + - configuration.txt : the reference manual details all configuration keywords + and their options. It is used when a configuration change is needed. + + - coding-style.txt : this is for developers who want to propose some code to + the project. It explains the style to adopt for the code. It is not very + strict and not all the code base completely respects it, but contributions + which diverge too much from it will be rejected. + + - proxy-protocol.txt : this is the de-facto specification of the PROXY + protocol which is implemented by HAProxy and a number of third party + products. + + - README : how to build HAProxy from sources + + +2. Quick introduction to load balancing and load balancers +---------------------------------------------------------- + +Load balancing consists in aggregating multiple components in order to achieve +a total processing capacity above each component's individual capacity, without +any intervention from the end user and in a scalable way. This results in more +operations being performed simultaneously by the time it takes a component to +perform only one. A single operation however will still be performed on a single +component at a time and will not get faster than without load balancing. It +always requires at least as many operations as available components and an +efficient load balancing mechanism to make use of all components and to fully +benefit from the load balancing. A good example of this is the number of lanes +on a highway which allows as many cars to pass during the same time frame +without increasing their individual speed. + +Examples of load balancing : + + - Process scheduling in multi-processor systems + - Link load balancing (e.g. EtherChannel, Bonding) + - IP address load balancing (e.g. ECMP, DNS round-robin) + - Server load balancing (via load balancers) + +The mechanism or component which performs the load balancing operation is +called a load balancer. In web environments these components are called a +"network load balancer", and more commonly a "load balancer" given that this +activity is by far the best known case of load balancing. + +A load balancer may act : + + - at the link level : this is called link load balancing, and it consists in + choosing what network link to send a packet to; + + - at the network level : this is called network load balancing, and it + consists in choosing what route a series of packets will follow; + + - at the server level : this is called server load balancing and it consists + in deciding what server will process a connection or request. + +Two distinct technologies exist and address different needs, though with some +overlapping. In each case it is important to keep in mind that load balancing +consists in diverting the traffic from its natural flow and that doing so always +requires a minimum of care to maintain the required level of consistency between +all routing decisions. + +The first one acts at the packet level and processes packets more or less +individually. There is a 1-to-1 relation between input and output packets, so +it is possible to follow the traffic on both sides of the load balancer using a +regular network sniffer. This technology can be very cheap and extremely fast. +It is usually implemented in hardware (ASICs) allowing to reach line rate, such +as switches doing ECMP. Usually stateless, it can also be stateful (consider +the session a packet belongs to and called layer4-LB or L4), may support DSR +(direct server return, without passing through the LB again) if the packets +were not modified, but provides almost no content awareness. This technology is +very well suited to network-level load balancing, though it is sometimes used +for very basic server load balancing at high speed. + +The second one acts on session contents. It requires that the input streams is +reassembled and processed as a whole. The contents may be modified, and the +output stream is segmented into new packets. For this reason it is generally +performed by proxies and they're often called layer 7 load balancers or L7. +This implies that there are two distinct connections on each side, and that +there is no relation between input and output packets sizes nor counts. Clients +and servers are not required to use the same protocol (for example IPv4 vs +IPv6, clear vs SSL). The operations are always stateful, and the return traffic +must pass through the load balancer. The extra processing comes with a cost so +it's not always possible to achieve line rate, especially with small packets. +On the other hand, it offers wide possibilities and is generally achieved by +pure software, even if embedded into hardware appliances. This technology is +very well suited for server load balancing. + +Packet-based load balancers are generally deployed in cut-through mode, so they +are installed on the normal path of the traffic and divert it according to the +configuration. The return traffic doesn't necessarily pass through the load +balancer. Some modifications may be applied to the network destination address +in order to direct the traffic to the proper destination. In this case, it is +mandatory that the return traffic passes through the load balancer. If the +routes doesn't make this possible, the load balancer may also replace the +packets' source address with its own in order to force the return traffic to +pass through it. + +Proxy-based load balancers are deployed as a server with their own IP addresses +and ports, without architecture changes. Sometimes this requires to perform some +adaptations to the applications so that clients are properly directed to the +load balancer's IP address and not directly to the server's. Some load balancers +may have to adjust some servers' responses to make this possible (e.g. the HTTP +Location header field used in HTTP redirects). Some proxy-based load balancers +may intercept traffic for an address they don't own, and spoof the client's +address when connecting to the server. This allows them to be deployed as if +they were a regular router or firewall, in a cut-through mode very similar to +the packet based load balancers. This is particularly appreciated for products +which combine both packet mode and proxy mode. In this case DSR is obviously +still not possible and the return traffic still has to be routed back to the +load balancer. + +A very scalable layered approach would consist in having a front router which +receives traffic from multiple load balanced links, and uses ECMP to distribute +this traffic to a first layer of multiple stateful packet-based load balancers +(L4). These L4 load balancers in turn pass the traffic to an even larger number +of proxy-based load balancers (L7), which have to parse the contents to decide +what server will ultimately receive the traffic. + +The number of components and possible paths for the traffic increases the risk +of failure; in very large environments, it is even normal to permanently have +a few faulty components being fixed or replaced. Load balancing done without +awareness of the whole stack's health significantly degrades availability. For +this reason, any sane load balancer will verify that the components it intends +to deliver the traffic to are still alive and reachable, and it will stop +delivering traffic to faulty ones. This can be achieved using various methods. + +The most common one consists in periodically sending probes to ensure the +component is still operational. These probes are called "health checks". They +must be representative of the type of failure to address. For example a ping- +based check will not detect that a web server has crashed and doesn't listen to +a port anymore, while a connection to the port will verify this, and a more +advanced request may even validate that the server still works and that the +database it relies on is still accessible. Health checks often involve a few +retries to cover for occasional measuring errors. The period between checks +must be small enough to ensure the faulty component is not used for too long +after an error occurs. + +Other methods consist in sampling the production traffic sent to a destination +to observe if it is processed correctly or not, and to evict the components +which return inappropriate responses. However this requires to sacrifice a part +of the production traffic and this is not always acceptable. A combination of +these two mechanisms provides the best of both worlds, with both of them being +used to detect a fault, and only health checks to detect the end of the fault. +A last method involves centralized reporting : a central monitoring agent +periodically updates all load balancers about all components' state. This gives +a global view of the infrastructure to all components, though sometimes with +less accuracy or responsiveness. It's best suited for environments with many +load balancers and many servers. + +Layer 7 load balancers also face another challenge known as stickiness or +persistence. The principle is that they generally have to direct multiple +subsequent requests or connections from a same origin (such as an end user) to +the same target. The best known example is the shopping cart on an online +store. If each click leads to a new connection, the user must always be sent +to the server which holds his shopping cart. Content-awareness makes it easier +to spot some elements in the request to identify the server to deliver it to, +but that's not always enough. For example if the source address is used as a +key to pick a server, it can be decided that a hash-based algorithm will be +used and that a given IP address will always be sent to the same server based +on a divide of the address by the number of available servers. But if one +server fails, the result changes and all users are suddenly sent to a different +server and lose their shopping cart. The solution against this issue consists +in memorizing the chosen target so that each time the same visitor is seen, +he's directed to the same server regardless of the number of available servers. +The information may be stored in the load balancer's memory, in which case it +may have to be replicated to other load balancers if it's not alone, or it may +be stored in the client's memory using various methods provided that the client +is able to present this information back with every request (cookie insertion, +redirection to a sub-domain, etc). This mechanism provides the extra benefit of +not having to rely on unstable or unevenly distributed information (such as the +source IP address). This is in fact the strongest reason to adopt a layer 7 +load balancer instead of a layer 4 one. + +In order to extract information such as a cookie, a host header field, a URL +or whatever, a load balancer may need to decrypt SSL/TLS traffic and even +possibly to re-encrypt it when passing it to the server. This expensive task +explains why in some high-traffic infrastructures, sometimes there may be a +lot of load balancers. + +Since a layer 7 load balancer may perform a number of complex operations on the +traffic (decrypt, parse, modify, match cookies, decide what server to send to, +etc), it can definitely cause some trouble and will very commonly be accused of +being responsible for a lot of trouble that it only revealed. Often it will be +discovered that servers are unstable and periodically go up and down, or for +web servers, that they deliver pages with some hard-coded links forcing the +clients to connect directly to one specific server without passing via the load +balancer, or that they take ages to respond under high load causing timeouts. +That's why logging is an extremely important aspect of layer 7 load balancing. +Once a trouble is reported, it is important to figure if the load balancer took +a wrong decision and if so why so that it doesn't happen anymore. + + +3. Introduction to HAProxy +-------------------------- + +HAProxy is written as "HAProxy" to designate the product, and as "haproxy" to +designate the executable program, software package or a process. However, both +are commonly used for both purposes, and are pronounced H-A-Proxy. Very early, +"haproxy" used to stand for "high availability proxy" and the name was written +in two separate words, though by now it means nothing else than "HAProxy". + + +3.1. What HAProxy is and isn't +------------------------------ + +HAProxy is : + + - a TCP proxy : it can accept a TCP connection from a listening socket, + connect to a server and attach these sockets together allowing traffic to + flow in both directions; IPv4, IPv6 and even UNIX sockets are supported on + either side, so this can provide an easy way to translate addresses between + different families. + + - an HTTP reverse-proxy (called a "gateway" in HTTP terminology) : it presents + itself as a server, receives HTTP requests over connections accepted on a + listening TCP socket, and passes the requests from these connections to + servers using different connections. It may use any combination of HTTP/1.x + or HTTP/2 on any side and will even automatically detect the protocol + spoken on each side when ALPN is used over TLS. + + - an SSL terminator / initiator / offloader : SSL/TLS may be used on the + connection coming from the client, on the connection going to the server, + or even on both connections. A lot of settings can be applied per name + (SNI), and may be updated at runtime without restarting. Such setups are + extremely scalable and deployments involving tens to hundreds of thousands + of certificates were reported. + + - a TCP normalizer : since connections are locally terminated by the operating + system, there is no relation between both sides, so abnormal traffic such as + invalid packets, flag combinations, window advertisements, sequence numbers, + incomplete connections (SYN floods), or so will not be passed to the other + side. This protects fragile TCP stacks from protocol attacks, and also + allows to optimize the connection parameters with the client without having + to modify the servers' TCP stack settings. + + - an HTTP normalizer : when configured to process HTTP traffic, only valid + complete requests are passed. This protects against a lot of protocol-based + attacks. Additionally, protocol deviations for which there is a tolerance + in the specification are fixed so that they don't cause problem on the + servers (e.g. multiple-line headers). + + - an HTTP fixing tool : it can modify / fix / add / remove / rewrite the URL + or any request or response header. This helps fixing interoperability issues + in complex environments. + + - a content-based switch : it can consider any element from the request to + decide what server to pass the request or connection to. Thus it is possible + to handle multiple protocols over a same port (e.g. HTTP, HTTPS, SSH). + + - a server load balancer : it can load balance TCP connections and HTTP + requests. In TCP mode, load balancing decisions are taken for the whole + connection. In HTTP mode, decisions are taken per request. + + - a traffic regulator : it can apply some rate limiting at various points, + protect the servers against overloading, adjust traffic priorities based on + the contents, and even pass such information to lower layers and outer + network components by marking packets. + + - a protection against DDoS and service abuse : it can maintain a wide number + of statistics per IP address, URL, cookie, etc and detect when an abuse is + happening, then take action (slow down the offenders, block them, send them + to outdated contents, etc). + + - an observation point for network troubleshooting : due to the precision of + the information reported in logs, it is often used to narrow down some + network-related issues. + + - an HTTP compression offloader : it can compress responses which were not + compressed by the server, thus reducing the page load time for clients with + poor connectivity or using high-latency, mobile networks. + + - a caching proxy : it may cache responses in RAM so that subsequent requests + for the same object avoid the cost of another network transfer from the + server as long as the object remains present and valid. It will however not + store objects to any persistent storage. Please note that this caching + feature is designed to be maintenance free and focuses solely on saving + haproxy's precious resources and not on save the server's resources. Caches + designed to optimize servers require much more tuning and flexibility. If + you instead need such an advanced cache, please use Varnish Cache, which + integrates perfectly with haproxy, especially when SSL/TLS is needed on any + side. + + - a FastCGI gateway : FastCGI can be seen as a different representation of + HTTP, and as such, HAProxy can directly load-balance a farm comprising any + combination of FastCGI application servers without requiring to insert + another level of gateway between them. This results in resource savings and + a reduction of maintenance costs. + +HAProxy is not : + + - an explicit HTTP proxy, i.e. the proxy that browsers use to reach the + internet. There are excellent open-source software dedicated for this task, + such as Squid. However HAProxy can be installed in front of such a proxy to + provide load balancing and high availability. + + - a data scrubber : it will not modify the body of requests nor responses. + + - a static web server : during startup, it isolates itself inside a chroot + jail and drops its privileges, so that it will not perform any single file- + system access once started. As such it cannot be turned into a static web + server (dynamic servers are supported through FastCGI however). There are + excellent open-source software for this such as Apache or Nginx, and + HAProxy can be easily installed in front of them to provide load balancing, + high availability and acceleration. + + - a packet-based load balancer : it will not see IP packets nor UDP datagrams, + will not perform NAT or even less DSR. These are tasks for lower layers. + Some kernel-based components such as IPVS (Linux Virtual Server) already do + this pretty well and complement perfectly with HAProxy. + + +3.2. How HAProxy works +---------------------- + +HAProxy is an event-driven, non-blocking engine combining a very fast I/O layer +with a priority-based, multi-threaded scheduler. As it is designed with a data +forwarding goal in mind, its architecture is optimized to move data as fast as +possible with the least possible operations. It focuses on optimizing the CPU +cache's efficiency by sticking connections to the same CPU as long as possible. +As such it implements a layered model offering bypass mechanisms at each level +ensuring data doesn't reach higher levels unless needed. Most of the processing +is performed in the kernel, and HAProxy does its best to help the kernel do the +work as fast as possible by giving some hints or by avoiding certain operation +when it guesses they could be grouped later. As a result, typical figures show +15% of the processing time spent in HAProxy versus 85% in the kernel in TCP or +HTTP close mode, and about 30% for HAProxy versus 70% for the kernel in HTTP +keep-alive mode. + +A single process can run many proxy instances; configurations as large as +300000 distinct proxies in a single process were reported to run fine. A single +core, single CPU setup is far more than enough for more than 99% users, and as +such, users of containers and virtual machines are encouraged to use the +absolute smallest images they can get to save on operational costs and simplify +troubleshooting. However the machine HAProxy runs on must never ever swap, and +its CPU must not be artificially throttled (sub-CPU allocation in hypervisors) +nor be shared with compute-intensive processes which would induce a very high +context-switch latency. + +Threading allows to exploit all available processing capacity by using one +thread per CPU core. This is mostly useful for SSL or when data forwarding +rates above 40 Gbps are needed. In such cases it is critically important to +avoid communications between multiple physical CPUs, which can cause strong +bottlenecks in the network stack and in HAProxy itself. While counter-intuitive +to some, the first thing to do when facing some performance issues is often to +reduce the number of CPUs HAProxy runs on. + +HAProxy only requires the haproxy executable and a configuration file to run. +For logging it is highly recommended to have a properly configured syslog daemon +and log rotations in place. Logs may also be sent to stdout/stderr, which can be +useful inside containers. The configuration files are parsed before starting, +then HAProxy tries to bind all listening sockets, and refuses to start if +anything fails. Past this point it cannot fail anymore. This means that there +are no runtime failures and that if it accepts to start, it will work until it +is stopped. + +Once HAProxy is started, it does exactly 3 things : + + - process incoming connections; + + - periodically check the servers' status (known as health checks); + + - exchange information with other haproxy nodes. + +Processing incoming connections is by far the most complex task as it depends +on a lot of configuration possibilities, but it can be summarized as the 9 steps +below : + + - accept incoming connections from listening sockets that belong to a + configuration entity known as a "frontend", which references one or multiple + listening addresses; + + - apply the frontend-specific processing rules to these connections that may + result in blocking them, modifying some headers, or intercepting them to + execute some internal applets such as the statistics page or the CLI; + + - pass these incoming connections to another configuration entity representing + a server farm known as a "backend", which contains the list of servers and + the load balancing strategy for this server farm; + + - apply the backend-specific processing rules to these connections; + + - decide which server to forward the connection to according to the load + balancing strategy; + + - apply the backend-specific processing rules to the response data; + + - apply the frontend-specific processing rules to the response data; + + - emit a log to report what happened in fine details; + + - in HTTP, loop back to the second step to wait for a new request, otherwise + close the connection. + +Frontends and backends are sometimes considered as half-proxies, since they only +look at one side of an end-to-end connection; the frontend only cares about the +clients while the backend only cares about the servers. HAProxy also supports +full proxies which are exactly the union of a frontend and a backend. When HTTP +processing is desired, the configuration will generally be split into frontends +and backends as they open a lot of possibilities since any frontend may pass a +connection to any backend. With TCP-only proxies, using frontends and backends +rarely provides a benefit and the configuration can be more readable with full +proxies. + + +3.3. Basic features +------------------- + +This section will enumerate a number of features that HAProxy implements, some +of which are generally expected from any modern load balancer, and some of +which are a direct benefit of HAProxy's architecture. More advanced features +will be detailed in the next section. + + +3.3.1. Basic features : Proxying +-------------------------------- + +Proxying is the action of transferring data between a client and a server over +two independent connections. The following basic features are supported by +HAProxy regarding proxying and connection management : + + - Provide the server with a clean connection to protect them against any + client-side defect or attack; + + - Listen to multiple IP addresses and/or ports, even port ranges; + + - Transparent accept : intercept traffic targeting any arbitrary IP address + that doesn't even belong to the local system; + + - Server port doesn't need to be related to listening port, and may even be + translated by a fixed offset (useful with ranges); + + - Transparent connect : spoof the client's (or any) IP address if needed + when connecting to the server; + + - Provide a reliable return IP address to the servers in multi-site LBs; + + - Offload the server thanks to buffers and possibly short-lived connections + to reduce their concurrent connection count and their memory footprint; + + - Optimize TCP stacks (e.g. SACK), congestion control, and reduce RTT impacts; + + - Support different protocol families on both sides (e.g. IPv4/IPv6/Unix); + + - Timeout enforcement : HAProxy supports multiple levels of timeouts depending + on the stage the connection is, so that a dead client or server, or an + attacker cannot be granted resources for too long; + + - Protocol validation: HTTP, SSL, or payload are inspected and invalid + protocol elements are rejected, unless instructed to accept them anyway; + + - Policy enforcement : ensure that only what is allowed may be forwarded; + + - Both incoming and outgoing connections may be limited to certain network + namespaces (Linux only), making it easy to build a cross-container, + multi-tenant load balancer; + + - PROXY protocol presents the client's IP address to the server even for + non-HTTP traffic. This is an HAProxy extension that was adopted by a number + of third-party products by now, at least these ones at the time of writing : + - client : haproxy, stud, stunnel, exaproxy, ELB, squid + - server : haproxy, stud, postfix, exim, nginx, squid, node.js, varnish + + +3.3.2. Basic features : SSL +--------------------------- + +HAProxy's SSL stack is recognized as one of the most featureful according to +Google's engineers (http://istlsfastyet.com/). The most commonly used features +making it quite complete are : + + - SNI-based multi-hosting with no limit on sites count and focus on + performance. At least one deployment is known for running 50000 domains + with their respective certificates; + + - support for wildcard certificates reduces the need for many certificates ; + + - certificate-based client authentication with configurable policies on + failure to present a valid certificate. This allows to present a different + server farm to regenerate the client certificate for example; + + - authentication of the backend server ensures the backend server is the real + one and not a man in the middle; + + - authentication with the backend server lets the backend server know it's + really the expected haproxy node that is connecting to it; + + - TLS NPN and ALPN extensions make it possible to reliably offload SPDY/HTTP2 + connections and pass them in clear text to backend servers; + + - OCSP stapling further reduces first page load time by delivering inline an + OCSP response when the client requests a Certificate Status Request; + + - Dynamic record sizing provides both high performance and low latency, and + significantly reduces page load time by letting the browser start to fetch + new objects while packets are still in flight; + + - permanent access to all relevant SSL/TLS layer information for logging, + access control, reporting etc. These elements can be embedded into HTTP + header or even as a PROXY protocol extension so that the offloaded server + gets all the information it would have had if it performed the SSL + termination itself. + + - Detect, log and block certain known attacks even on vulnerable SSL libs, + such as the Heartbleed attack affecting certain versions of OpenSSL. + + - support for stateless session resumption (RFC 5077 TLS Ticket extension). + TLS tickets can be updated from CLI which provides them means to implement + Perfect Forward Secrecy by frequently rotating the tickets. + + +3.3.3. Basic features : Monitoring +---------------------------------- + +HAProxy focuses a lot on availability. As such it cares about servers state, +and about reporting its own state to other network components : + + - Servers' state is continuously monitored using per-server parameters. This + ensures the path to the server is operational for regular traffic; + + - Health checks support two hysteresis for up and down transitions in order + to protect against state flapping; + + - Checks can be sent to a different address/port/protocol : this makes it + easy to check a single service that is considered representative of multiple + ones, for example the HTTPS port for an HTTP+HTTPS server. + + - Servers can track other servers and go down simultaneously : this ensures + that servers hosting multiple services can fail atomically and that no one + will be sent to a partially failed server; + + - Agents may be deployed on the server to monitor load and health : a server + may be interested in reporting its load, operational status, administrative + status independently from what health checks can see. By running a simple + agent on the server, it's possible to consider the server's view of its own + health in addition to the health checks validating the whole path; + + - Various check methods are available : TCP connect, HTTP request, SMTP hello, + SSL hello, LDAP, SQL, Redis, send/expect scripts, all with/without SSL; + + - State change is notified in the logs and stats page with the failure reason + (e.g. the HTTP response received at the moment the failure was detected). An + e-mail can also be sent to a configurable address upon such a change ; + + - Server state is also reported on the stats interface and can be used to take + routing decisions so that traffic may be sent to different farms depending + on their sizes and/or health (e.g. loss of an inter-DC link); + + - HAProxy can use health check requests to pass information to the servers, + such as their names, weight, the number of other servers in the farm etc. + so that servers can adjust their response and decisions based on this + knowledge (e.g. postpone backups to keep more CPU available); + + - Servers can use health checks to report more detailed state than just on/off + (e.g. I would like to stop, please stop sending new visitors); + + - HAProxy itself can report its state to external components such as routers + or other load balancers, allowing to build very complete multi-path and + multi-layer infrastructures. + + +3.3.4. Basic features : High availability +----------------------------------------- + +Just like any serious load balancer, HAProxy cares a lot about availability to +ensure the best global service continuity : + + - Only valid servers are used ; the other ones are automatically evicted from + load balancing farms ; under certain conditions it is still possible to + force to use them though; + + - Support for a graceful shutdown so that it is possible to take servers out + of a farm without affecting any connection; + + - Backup servers are automatically used when active servers are down and + replace them so that sessions are not lost when possible. This also allows + to build multiple paths to reach the same server (e.g. multiple interfaces); + + - Ability to return a global failed status for a farm when too many servers + are down. This, combined with the monitoring capabilities makes it possible + for an upstream component to choose a different LB node for a given service; + + - Stateless design makes it easy to build clusters : by design, HAProxy does + its best to ensure the highest service continuity without having to store + information that could be lost in the event of a failure. This ensures that + a takeover is the most seamless possible; + + - Integrates well with standard VRRP daemon keepalived : HAProxy easily tells + keepalived about its state and copes very well with floating virtual IP + addresses. Note: only use IP redundancy protocols (VRRP/CARP) over cluster- + based solutions (Heartbeat, ...) as they're the ones offering the fastest, + most seamless, and most reliable switchover. + + +3.3.5. Basic features : Load balancing +-------------------------------------- + +HAProxy offers a fairly complete set of load balancing features, most of which +are unfortunately not available in a number of other load balancing products : + + - no less than 10 load balancing algorithms are supported, some of which apply + to input data to offer an infinite list of possibilities. The most common + ones are round-robin (for short connections, pick each server in turn), + leastconn (for long connections, pick the least recently used of the servers + with the lowest connection count), source (for SSL farms or terminal server + farms, the server directly depends on the client's source address), URI (for + HTTP caches, the server directly depends on the HTTP URI), hdr (the server + directly depends on the contents of a specific HTTP header field), first + (for short-lived virtual machines, all connections are packed on the + smallest possible subset of servers so that unused ones can be powered + down); + + - all algorithms above support per-server weights so that it is possible to + accommodate from different server generations in a farm, or direct a small + fraction of the traffic to specific servers (debug mode, running the next + version of the software, etc); + + - dynamic weights are supported for round-robin, leastconn and consistent + hashing ; this allows server weights to be modified on the fly from the CLI + or even by an agent running on the server; + + - slow-start is supported whenever a dynamic weight is supported; this allows + a server to progressively take the traffic. This is an important feature + for fragile application servers which require to compile classes at runtime + as well as cold caches which need to fill up before being run at full + throttle; + + - hashing can apply to various elements such as client's source address, URL + components, query string element, header field values, POST parameter, RDP + cookie; + + - consistent hashing protects server farms against massive redistribution when + adding or removing servers in a farm. That's very important in large cache + farms and it allows slow-start to be used to refill cold caches; + + - a number of internal metrics such as the number of connections per server, + per backend, the amount of available connection slots in a backend etc makes + it possible to build very advanced load balancing strategies. + + +3.3.6. Basic features : Stickiness +---------------------------------- + +Application load balancing would be useless without stickiness. HAProxy provides +a fairly comprehensive set of possibilities to maintain a visitor on the same +server even across various events such as server addition/removal, down/up +cycles, and some methods are designed to be resistant to the distance between +multiple load balancing nodes in that they don't require any replication : + + - stickiness information can be individually matched and learned from + different places if desired. For example a JSESSIONID cookie may be matched + both in a cookie and in the URL. Up to 8 parallel sources can be learned at + the same time and each of them may point to a different stick-table; + + - stickiness information can come from anything that can be seen within a + request or response, including source address, TCP payload offset and + length, HTTP query string elements, header field values, cookies, and so + on. + + - stick-tables are replicated between all nodes in a multi-master fashion; + + - commonly used elements such as SSL-ID or RDP cookies (for TSE farms) are + directly accessible to ease manipulation; + + - all sticking rules may be dynamically conditioned by ACLs; + + - it is possible to decide not to stick to certain servers, such as backup + servers, so that when the nominal server comes back, it automatically takes + the load back. This is often used in multi-path environments; + + - in HTTP it is often preferred not to learn anything and instead manipulate + a cookie dedicated to stickiness. For this, it's possible to detect, + rewrite, insert or prefix such a cookie to let the client remember what + server was assigned; + + - the server may decide to change or clean the stickiness cookie on logout, + so that leaving visitors are automatically unbound from the server; + + - using ACL-based rules it is also possible to selectively ignore or enforce + stickiness regardless of the server's state; combined with advanced health + checks, that helps admins verify that the server they're installing is up + and running before presenting it to the whole world; + + - an innovative mechanism to set a maximum idle time and duration on cookies + ensures that stickiness can be smoothly stopped on devices which are never + closed (smartphones, TVs, home appliances) without having to store them on + persistent storage; + + - multiple server entries may share the same stickiness keys so that + stickiness is not lost in multi-path environments when one path goes down; + + - soft-stop ensures that only users with stickiness information will continue + to reach the server they've been assigned to but no new users will go there. + + +3.3.7. Basic features : Logging +------------------------------- + +Logging is an extremely important feature for a load balancer, first because a +load balancer is often wrongly accused of causing the problems it reveals, and +second because it is placed at a critical point in an infrastructure where all +normal and abnormal activity needs to be analyzed and correlated with other +components. + +HAProxy provides very detailed logs, with millisecond accuracy and the exact +connection accept time that can be searched in firewalls logs (e.g. for NAT +correlation). By default, TCP and HTTP logs are quite detailed and contain +everything needed for troubleshooting, such as source IP address and port, +frontend, backend, server, timers (request receipt duration, queue duration, +connection setup time, response headers time, data transfer time), global +process state, connection counts, queue status, retries count, detailed +stickiness actions and disconnect reasons, header captures with a safe output +encoding. It is then possible to extend or replace this format to include any +sampled data, variables, captures, resulting in very detailed information. For +example it is possible to log the number of cumulative requests or number of +different URLs visited by a client. + +The log level may be adjusted per request using standard ACLs, so it is possible +to automatically silent some logs considered as pollution and instead raise +warnings when some abnormal behavior happen for a small part of the traffic +(e.g. too many URLs or HTTP errors for a source address). Administrative logs +are also emitted with their own levels to inform about the loss or recovery of a +server for example. + +Each frontend and backend may use multiple independent log outputs, which eases +multi-tenancy. Logs are preferably sent over UDP, maybe JSON-encoded, and are +truncated after a configurable line length in order to guarantee delivery. But +it is also possible to send them to stdout/stderr or any file descriptor, as +well as to a ring buffer that a client can subscribe to in order to retrieve +them. + + +3.3.8. Basic features : Statistics +---------------------------------- + +HAProxy provides a web-based statistics reporting interface with authentication, +security levels and scopes. It is thus possible to provide each hosted customer +with his own page showing only his own instances. This page can be located in a +hidden URL part of the regular web site so that no new port needs to be opened. +This page may also report the availability of other HAProxy nodes so that it is +easy to spot if everything works as expected at a glance. The view is synthetic +with a lot of details accessible (such as error causes, last access and last +change duration, etc), which are also accessible as a CSV table that other tools +may import to draw graphs. The page may self-refresh to be used as a monitoring +page on a large display. In administration mode, the page also allows to change +server state to ease maintenance operations. + +A Prometheus exporter is also provided so that the statistics can be consumed +in a different format depending on the deployment. + + +3.4. Standard features +---------------------- + +In this section, some features that are very commonly used in HAProxy but are +not necessarily present on other load balancers are enumerated. + + +3.4.1. Standard features : Sampling and converting information +-------------------------------------------------------------- + +HAProxy supports information sampling using a wide set of "sample fetch +functions". The principle is to extract pieces of information known as samples, +for immediate use. This is used for stickiness, to build conditions, to produce +information in logs or to enrich HTTP headers. + +Samples can be fetched from various sources : + + - constants : integers, strings, IP addresses, binary blocks; + + - the process : date, environment variables, server/frontend/backend/process + state, byte/connection counts/rates, queue length, random generator, ... + + - variables : per-session, per-request, per-response variables; + + - the client connection : source and destination addresses and ports, and all + related statistics counters; + + - the SSL client session : protocol, version, algorithm, cipher, key size, + session ID, all client and server certificate fields, certificate serial, + SNI, ALPN, NPN, client support for certain extensions; + + - request and response buffers contents : arbitrary payload at offset/length, + data length, RDP cookie, decoding of SSL hello type, decoding of TLS SNI; + + - HTTP (request and response) : method, URI, path, query string arguments, + status code, headers values, positional header value, cookies, captures, + authentication, body elements; + +A sample may then pass through a number of operators known as "converters" to +experience some transformation. A converter consumes a sample and produces a +new one, possibly of a completely different type. For example, a converter may +be used to return only the integer length of the input string, or could turn a +string to upper case. Any arbitrary number of converters may be applied in +series to a sample before final use. Among all available sample converters, the +following ones are the most commonly used : + + - arithmetic and logic operators : they make it possible to perform advanced + computation on input data, such as computing ratios, percentages or simply + converting from one unit to another one; + + - IP address masks are useful when some addresses need to be grouped by larger + networks; + + - data representation : URL-decode, base64, hex, JSON strings, hashing; + + - string conversion : extract substrings at fixed positions, fixed length, + extract specific fields around certain delimiters, extract certain words, + change case, apply regex-based substitution; + + - date conversion : convert to HTTP date format, convert local to UTC and + conversely, add or remove offset; + + - lookup an entry in a stick table to find statistics or assigned server; + + - map-based key-to-value conversion from a file (mostly used for geolocation). + + +3.4.2. Standard features : Maps +------------------------------- + +Maps are a powerful type of converter consisting in loading a two-columns file +into memory at boot time, then looking up each input sample from the first +column and either returning the corresponding pattern on the second column if +the entry was found, or returning a default value. The output information also +being a sample, it can in turn experience other transformations including other +map lookups. Maps are most commonly used to translate the client's IP address +to an AS number or country code since they support a longest match for network +addresses but they can be used for various other purposes. + +Part of their strength comes from being updatable on the fly either from the CLI +or from certain actions using other samples, making them capable of storing and +retrieving information between subsequent accesses. Another strength comes from +the binary tree based indexation which makes them extremely fast even when they +contain hundreds of thousands of entries, making geolocation very cheap and easy +to set up. + + +3.4.3. Standard features : ACLs and conditions +---------------------------------------------- + +Most operations in HAProxy can be made conditional. Conditions are built by +combining multiple ACLs using logic operators (AND, OR, NOT). Each ACL is a +series of tests based on the following elements : + + - a sample fetch method to retrieve the element to test ; + + - an optional series of converters to transform the element ; + + - a list of patterns to match against ; + + - a matching method to indicate how to compare the patterns with the sample + +For example, the sample may be taken from the HTTP "Host" header, it could then +be converted to lower case, then matched against a number of regex patterns +using the regex matching method. + +Technically, ACLs are built on the same core as the maps, they share the exact +same internal structure, pattern matching methods and performance. The only real +difference is that instead of returning a sample, they only return "found" or +or "not found". In terms of usage, ACL patterns may be declared inline in the +configuration file and do not require their own file. ACLs may be named for ease +of use or to make configurations understandable. A named ACL may be declared +multiple times and it will evaluate all definitions in turn until one matches. + +About 13 different pattern matching methods are provided, among which IP address +mask, integer ranges, substrings, regex. They work like functions, and just like +with any programming language, only what is needed is evaluated, so when a +condition involving an OR is already true, next ones are not evaluated, and +similarly when a condition involving an AND is already false, the rest of the +condition is not evaluated. + +There is no practical limit to the number of declared ACLs, and a handful of +commonly used ones are provided. However experience has shown that setups using +a lot of named ACLs are quite hard to troubleshoot and that sometimes using +anonymous ACLs inline is easier as it requires less references out of the scope +being analyzed. + + +3.4.4. Standard features : Content switching +-------------------------------------------- + +HAProxy implements a mechanism known as content-based switching. The principle +is that a connection or request arrives on a frontend, then the information +carried with this request or connection are processed, and at this point it is +possible to write ACLs-based conditions making use of these information to +decide what backend will process the request. Thus the traffic is directed to +one backend or another based on the request's contents. The most common example +consists in using the Host header and/or elements from the path (sub-directories +or file-name extensions) to decide whether an HTTP request targets a static +object or the application, and to route static objects traffic to a backend made +of fast and light servers, and all the remaining traffic to a more complex +application server, thus constituting a fine-grained virtual hosting solution. +This is quite convenient to make multiple technologies coexist as a more global +solution. + +Another use case of content-switching consists in using different load balancing +algorithms depending on various criteria. A cache may use a URI hash while an +application would use round-robin. + +Last but not least, it allows multiple customers to use a small share of a +common resource by enforcing per-backend (thus per-customer connection limits). + +Content switching rules scale very well, though their performance may depend on +the number and complexity of the ACLs in use. But it is also possible to write +dynamic content switching rules where a sample value directly turns into a +backend name and without making use of ACLs at all. Such configurations have +been reported to work fine at least with 300000 backends in production. + + +3.4.5. Standard features : Stick-tables +--------------------------------------- + +Stick-tables are commonly used to store stickiness information, that is, to keep +a reference to the server a certain visitor was directed to. The key is then the +identifier associated with the visitor (its source address, the SSL ID of the +connection, an HTTP or RDP cookie, the customer number extracted from the URL or +from the payload, ...) and the stored value is then the server's identifier. + +Stick tables may use 3 different types of samples for their keys : integers, +strings and addresses. Only one stick-table may be referenced in a proxy, and it +is designated everywhere with the proxy name. Up to 8 keys may be tracked in +parallel. The server identifier is committed during request or response +processing once both the key and the server are known. + +Stick-table contents may be replicated in active-active mode with other HAProxy +nodes known as "peers" as well as with the new process during a reload operation +so that all load balancing nodes share the same information and take the same +routing decision if client's requests are spread over multiple nodes. + +Since stick-tables are indexed on what allows to recognize a client, they are +often also used to store extra information such as per-client statistics. The +extra statistics take some extra space and need to be explicitly declared. The +type of statistics that may be stored includes the input and output bandwidth, +the number of concurrent connections, the connection rate and count over a +period, the amount and frequency of errors, some specific tags and counters, +etc. In order to support keeping such information without being forced to +stick to a given server, a special "tracking" feature is implemented and allows +to track up to 3 simultaneous keys from different tables at the same time +regardless of stickiness rules. Each stored statistics may be searched, dumped +and cleared from the CLI and adds to the live troubleshooting capabilities. + +While this mechanism can be used to surclass a returning visitor or to adjust +the delivered quality of service depending on good or bad behavior, it is +mostly used to fight against service abuse and more generally DDoS as it allows +to build complex models to detect certain bad behaviors at a high processing +speed. + + +3.4.6. Standard features : Formatted strings +-------------------------------------------- + +There are many places where HAProxy needs to manipulate character strings, such +as logs, redirects, header additions, and so on. In order to provide the +greatest flexibility, the notion of Formatted strings was introduced, initially +for logging purposes, which explains why it's still called "log-format". These +strings contain escape characters allowing to introduce various dynamic data +including variables and sample fetch expressions into strings, and even to +adjust the encoding while the result is being turned into a string (for example, +adding quotes). This provides a powerful way to build header contents, to build +response data or even response templates, or to customize log lines. +Additionally, in order to remain simple to build most common strings, about 50 +special tags are provided as shortcuts for information commonly used in logs. + + +3.4.7. Standard features : HTTP rewriting and redirection +--------------------------------------------------------- + +Installing a load balancer in front of an application that was never designed +for this can be a challenging task without the proper tools. One of the most +commonly requested operation in this case is to adjust requests and response +headers to make the load balancer appear as the origin server and to fix hard +coded information. This comes with changing the path in requests (which is +strongly advised against), modifying Host header field, modifying the Location +response header field for redirects, modifying the path and domain attribute +for cookies, and so on. It also happens that a number of servers are somewhat +verbose and tend to leak too much information in the response, making them more +vulnerable to targeted attacks. While it's theoretically not the role of a load +balancer to clean this up, in practice it's located at the best place in the +infrastructure to guarantee that everything is cleaned up. + +Similarly, sometimes the load balancer will have to intercept some requests and +respond with a redirect to a new target URL. While some people tend to confuse +redirects and rewriting, these are two completely different concepts, since the +rewriting makes the client and the server see different things (and disagree on +the location of the page being visited) while redirects ask the client to visit +the new URL so that it sees the same location as the server. + +In order to do this, HAProxy supports various possibilities for rewriting and +redirects, among which : + + - regex-based URL and header rewriting in requests and responses. Regex are + the most commonly used tool to modify header values since they're easy to + manipulate and well understood; + + - headers may also be appended, deleted or replaced based on formatted strings + so that it is possible to pass information there (e.g. client side TLS + algorithm and cipher); + + - HTTP redirects can use any 3xx code to a relative, absolute, or completely + dynamic (formatted string) URI; + + - HTTP redirects also support some extra options such as setting or clearing + a specific cookie, dropping the query string, appending a slash if missing, + and so on; + + - a powerful "return" directive allows to customize every part of a response + like status, headers, body using dynamic contents or even template files. + + - all operations support ACL-based conditions; + + +3.4.8. Standard features : Server protection +-------------------------------------------- + +HAProxy does a lot to maximize service availability, and for this it takes +large efforts to protect servers against overloading and attacks. The first +and most important point is that only complete and valid requests are forwarded +to the servers. The initial reason is that HAProxy needs to find the protocol +elements it needs to stay synchronized with the byte stream, and the second +reason is that until the request is complete, there is no way to know if some +elements will change its semantics. The direct benefit from this is that servers +are not exposed to invalid or incomplete requests. This is a very effective +protection against slowloris attacks, which have almost no impact on HAProxy. + +Another important point is that HAProxy contains buffers to store requests and +responses, and that by only sending a request to a server when it's complete and +by reading the whole response very quickly from the local network, the server +side connection is used for a very short time and this preserves server +resources as much as possible. + +A direct extension to this is that HAProxy can artificially limit the number of +concurrent connections or outstanding requests to a server, which guarantees +that the server will never be overloaded even if it continuously runs at 100% of +its capacity during traffic spikes. All excess requests will simply be queued to +be processed when one slot is released. In the end, this huge resource savings +most often ensures so much better server response times that it ends up actually +being faster than by overloading the server. Queued requests may be redispatched +to other servers, or even aborted in queue when the client aborts, which also +protects the servers against the "reload effect", where each click on "reload" +by a visitor on a slow-loading page usually induces a new request and maintains +the server in an overloaded state. + +The slow-start mechanism also protects restarting servers against high traffic +levels while they're still finalizing their startup or compiling some classes. + +Regarding the protocol-level protection, it is possible to relax the HTTP parser +to accept non standard-compliant but harmless requests or responses and even to +fix them. This allows bogus applications to be accessible while a fix is being +developed. In parallel, offending messages are completely captured with a +detailed report that help developers spot the issue in the application. The most +dangerous protocol violations are properly detected and dealt with and fixed. +For example malformed requests or responses with two Content-length headers are +either fixed if the values are exactly the same, or rejected if they differ, +since it becomes a security problem. Protocol inspection is not limited to HTTP, +it is also available for other protocols like TLS or RDP. + +When a protocol violation or attack is detected, there are various options to +respond to the user, such as returning the common "HTTP 400 bad request", +closing the connection with a TCP reset, or faking an error after a long delay +("tarpit") to confuse the attacker. All of these contribute to protecting the +servers by discouraging the offending client from pursuing an attack that +becomes very expensive to maintain. + +HAProxy also proposes some more advanced options to protect against accidental +data leaks and session crossing. Not only it can log suspicious server responses +but it will also log and optionally block a response which might affect a given +visitors' confidentiality. One such example is a cacheable cookie appearing in a +cacheable response and which may result in an intermediary cache to deliver it +to another visitor, causing an accidental session sharing. + + +3.5. Advanced features +---------------------- + +3.5.1. Advanced features : Management +------------------------------------- + +HAProxy is designed to remain extremely stable and safe to manage in a regular +production environment. It is provided as a single executable file which doesn't +require any installation process. Multiple versions can easily coexist, meaning +that it's possible (and recommended) to upgrade instances progressively by +order of importance instead of migrating all of them at once. Configuration +files are easily versioned. Configuration checking is done off-line so it +doesn't require to restart a service that will possibly fail. During +configuration checks, a number of advanced mistakes may be detected (e.g. a rule +hiding another one, or stickiness that will not work) and detailed warnings and +configuration hints are proposed to fix them. Backwards configuration file +compatibility goes very far away in time, with version 1.5 still fully +supporting configurations for versions 1.1 written 13 years before, and 1.6 +only dropping support for almost unused, obsolete keywords that can be done +differently. The configuration and software upgrade mechanism is smooth and non +disruptive in that it allows old and new processes to coexist on the system, +each handling its own connections. System status, build options, and library +compatibility are reported on startup. + +Some advanced features allow an application administrator to smoothly stop a +server, detect when there's no activity on it anymore, then take it off-line, +stop it, upgrade it and ensure it doesn't take any traffic while being upgraded, +then test it again through the normal path without opening it to the public, and +all of this without touching HAProxy at all. This ensures that even complicated +production operations may be done during opening hours with all technical +resources available. + +The process tries to save resources as much as possible, uses memory pools to +save on allocation time and limit memory fragmentation, releases payload buffers +as soon as their contents are sent, and supports enforcing strong memory limits +above which connections have to wait for a buffer to become available instead of +allocating more memory. This system helps guarantee memory usage in certain +strict environments. + +A command line interface (CLI) is available as a UNIX or TCP socket, to perform +a number of operations and to retrieve troubleshooting information. Everything +done on this socket doesn't require a configuration change, so it is mostly used +for temporary changes. Using this interface it is possible to change a server's +address, weight and status, to consult statistics and clear counters, dump and +clear stickiness tables, possibly selectively by key criteria, dump and kill +client-side and server-side connections, dump captured errors with a detailed +analysis of the exact cause and location of the error, dump, add and remove +entries from ACLs and maps, update TLS shared secrets, apply connection limits +and rate limits on the fly to arbitrary frontends (useful in shared hosting +environments), and disable a specific frontend to release a listening port +(useful when daytime operations are forbidden and a fix is needed nonetheless). +Updating certificates and their configuration on the fly is permitted, as well +as enabling and consulting traces of every processing step of the traffic. + +For environments where SNMP is mandatory, at least two agents exist, one is +provided with the HAProxy sources and relies on the Net-SNMP Perl module. +Another one is provided with the commercial packages and doesn't require Perl. +Both are roughly equivalent in terms of coverage. + +It is often recommended to install 4 utilities on the machine where HAProxy is +deployed : + + - socat (in order to connect to the CLI, though certain forks of netcat can + also do it to some extents); + + - halog from the latest HAProxy version : this is the log analysis tool, it + parses native TCP and HTTP logs extremely fast (1 to 2 GB per second) and + extracts useful information and statistics such as requests per URL, per + source address, URLs sorted by response time or error rate, termination + codes etc. It was designed to be deployed on the production servers to + help troubleshoot live issues so it has to be there ready to be used; + + - tcpdump : this is highly recommended to take the network traces needed to + troubleshoot an issue that was made visible in the logs. There is a moment + where application and haproxy's analysis will diverge and the network traces + are the only way to say who's right and who's wrong. It's also fairly common + to detect bugs in network stacks and hypervisors thanks to tcpdump; + + - strace : it is tcpdump's companion. It will report what HAProxy really sees + and will help sort out the issues the operating system is responsible for + from the ones HAProxy is responsible for. Strace is often requested when a + bug in HAProxy is suspected; + + +3.5.2. Advanced features : System-specific capabilities +------------------------------------------------------- + +Depending on the operating system HAProxy is deployed on, certain extra features +may be available or needed. While it is supported on a number of platforms, +HAProxy is primarily developed on Linux, which explains why some features are +only available on this platform. + +The transparent bind and connect features, the support for binding connections +to a specific network interface, as well as the ability to bind multiple +processes to the same IP address and ports are only available on Linux and BSD +systems, though only Linux performs a kernel-side load balancing of the incoming +requests between the available processes. + +On Linux, there are also a number of extra features and optimizations including +support for network namespaces (also known as "containers") allowing HAProxy to +be a gateway between all containers, the ability to set the MSS, Netfilter marks +and IP TOS field on the client side connection, support for TCP FastOpen on the +listening side, TCP user timeouts to let the kernel quickly kill connections +when it detects the client has disappeared before the configured timeouts, TCP +splicing to let the kernel forward data between the two sides of a connections +thus avoiding multiple memory copies, the ability to enable the "defer-accept" +bind option to only get notified of an incoming connection once data become +available in the kernel buffers, and the ability to send the request with the +ACK confirming a connect (sometimes called "piggy-back") which is enabled with +the "tcp-smart-connect" option. On Linux, HAProxy also takes great care of +manipulating the TCP delayed ACKs to save as many packets as possible on the +network. + +Some systems have an unreliable clock which jumps back and forth in the past +and in the future. This used to happen with some NUMA systems where multiple +processors didn't see the exact same time of day, and recently it became more +common in virtualized environments where the virtual clock has no relation with +the real clock, resulting in huge time jumps (sometimes up to 30 seconds have +been observed). This causes a lot of trouble with respect to timeout enforcement +in general. Due to this flaw of these systems, HAProxy maintains its own +monotonic clock which is based on the system's clock but where drift is measured +and compensated for. This ensures that even with a very bad system clock, timers +remain reasonably accurate and timeouts continue to work. Note that this problem +affects all the software running on such systems and is not specific to HAProxy. +The common effects are spurious timeouts or application freezes. Thus if this +behavior is detected on a system, it must be fixed, regardless of the fact that +HAProxy protects itself against it. + +On Linux, a new starting process may communicate with the previous one to reuse +its listening file descriptors so that the listening sockets are never +interrupted during the process's replacement. + + +3.5.3. Advanced features : Scripting +------------------------------------ + +HAProxy can be built with support for the Lua embedded language, which opens a +wide area of new possibilities related to complex manipulation of requests or +responses, routing decisions, statistics processing and so on. Using Lua it is +even possible to establish parallel connections to other servers to exchange +information. This way it becomes possible (though complex) to develop an +authentication system for example. Please refer to the documentation in the file +"doc/lua-api/index.rst" for more information on how to use Lua. + + +3.5.4. Advanced features: Tracing +--------------------------------- + +At any moment an administrator may connect over the CLI and enable tracing in +various internal subsystems. Various levels of details are provided by default +so that in practice anything between one line per request to 500 lines per +request can be retrieved. Filters as well as an automatic capture on/off/pause +mechanism are available so that it really is possible to wait for a certain +event and watch it in detail. This is extremely convenient to diagnose protocol +violations from faulty servers and clients, or denial of service attacks. + + +3.6. Sizing +----------- + +Typical CPU usage figures show 15% of the processing time spent in HAProxy +versus 85% in the kernel in TCP or HTTP close mode, and about 30% for HAProxy +versus 70% for the kernel in HTTP keep-alive mode. This means that the operating +system and its tuning have a strong impact on the global performance. + +Usages vary a lot between users, some focus on bandwidth, other ones on request +rate, others on connection concurrency, others on SSL performance. This section +aims at providing a few elements to help with this task. + +It is important to keep in mind that every operation comes with a cost, so each +individual operation adds its overhead on top of the other ones, which may be +negligible in certain circumstances, and which may dominate in other cases. + +When processing the requests from a connection, we can say that : + + - forwarding data costs less than parsing request or response headers; + + - parsing request or response headers cost less than establishing then closing + a connection to a server; + + - establishing an closing a connection costs less than a TLS resume operation; + + - a TLS resume operation costs less than a full TLS handshake with a key + computation; + + - an idle connection costs less CPU than a connection whose buffers hold data; + + - a TLS context costs even more memory than a connection with data; + +So in practice, it is cheaper to process payload bytes than header bytes, thus +it is easier to achieve high network bandwidth with large objects (few requests +per volume unit) than with small objects (many requests per volume unit). This +explains why maximum bandwidth is always measured with large objects, while +request rate or connection rates are measured with small objects. + +Some operations scale well on multiple processes spread over multiple CPUs, +and others don't scale as well. Network bandwidth doesn't scale very far because +the CPU is rarely the bottleneck for large objects, it's mostly the network +bandwidth and data buses to reach the network interfaces. The connection rate +doesn't scale well over multiple processors due to a few locks in the system +when dealing with the local ports table. The request rate over persistent +connections scales very well as it doesn't involve much memory nor network +bandwidth and doesn't require to access locked structures. TLS key computation +scales very well as it's totally CPU-bound. TLS resume scales moderately well, +but reaches its limits around 4 processes where the overhead of accessing the +shared table offsets the small gains expected from more power. + +The performance numbers one can expect from a very well tuned system are in the +following range. It is important to take them as orders of magnitude and to +expect significant variations in any direction based on the processor, IRQ +setting, memory type, network interface type, operating system tuning and so on. + +The following numbers were found on a Core i7 running at 3.7 GHz equipped with +a dual-port 10 Gbps NICs running Linux kernel 3.10, HAProxy 1.6 and OpenSSL +1.0.2. HAProxy was running as a single process on a single dedicated CPU core, +and two extra cores were dedicated to network interrupts : + + - 20 Gbps of maximum network bandwidth in clear text for objects 256 kB or + higher, 10 Gbps for 41kB or higher; + + - 4.6 Gbps of TLS traffic using AES256-GCM cipher with large objects; + + - 83000 TCP connections per second from client to server; + + - 82000 HTTP connections per second from client to server; + + - 97000 HTTP requests per second in server-close mode (keep-alive with the + client, close with the server); + + - 243000 HTTP requests per second in end-to-end keep-alive mode; + + - 300000 filtered TCP connections per second (anti-DDoS) + + - 160000 HTTPS requests per second in keep-alive mode over persistent TLS + connections; + + - 13100 HTTPS requests per second using TLS resumed connections; + + - 1300 HTTPS connections per second using TLS connections renegotiated with + RSA2048; + + - 20000 concurrent saturated connections per GB of RAM, including the memory + required for system buffers; it is possible to do better with careful tuning + but this result it easy to achieve. + + - about 8000 concurrent TLS connections (client-side only) per GB of RAM, + including the memory required for system buffers; + + - about 5000 concurrent end-to-end TLS connections (both sides) per GB of + RAM including the memory required for system buffers; + +A more recent benchmark featuring the multi-thread enabled HAProxy 2.4 on a +64-core ARM Graviton2 processor in AWS reached 2 million HTTPS requests per +second at sub-millisecond response time, and 100 Gbps of traffic: + + https://www.haproxy.com/blog/haproxy-forwards-over-2-million-http-requests-per-second-on-a-single-aws-arm-instance/ + +Thus a good rule of thumb to keep in mind is that the request rate is divided +by 10 between TLS keep-alive and TLS resume, and between TLS resume and TLS +renegotiation, while it's only divided by 3 between HTTP keep-alive and HTTP +close. Another good rule of thumb is to remember that a high frequency core +with AES instructions can do around 20 Gbps of AES-GCM per core. + +Another good rule of thumb is to consider that on the same server, HAProxy will +be able to saturate : + + - about 5-10 static file servers or caching proxies; + + - about 100 anti-virus proxies; + + - and about 100-1000 application servers depending on the technology in use. + + +3.7. How to get HAProxy +----------------------- + +HAProxy is an open source project covered by the GPLv2 license, meaning that +everyone is allowed to redistribute it provided that access to the sources is +also provided upon request, especially if any modifications were made. + +HAProxy evolves as a main development branch called "master" or "mainline", from +which new branches are derived once the code is considered stable. A lot of web +sites run some development branches in production on a voluntarily basis, either +to participate to the project or because they need a bleeding edge feature, and +their feedback is highly valuable to fix bugs and judge the overall quality and +stability of the version being developed. + +The new branches that are created when the code is stable enough constitute a +stable version and are generally maintained for several years, so that there is +no emergency to migrate to a newer branch even when you're not on the latest. +Once a stable branch is issued, it may only receive bug fixes, and very rarely +minor feature updates when that makes users' life easier. All fixes that go into +a stable branch necessarily come from the master branch. This guarantees that no +fix will be lost after an upgrade. For this reason, if you fix a bug, please +make the patch against the master branch, not the stable branch. You may even +discover it was already fixed. This process also ensures that regressions in a +stable branch are extremely rare, so there is never any excuse for not upgrading +to the latest version in your current branch. + +Branches are numbered with two digits delimited with a dot, such as "1.6". +Since 1.9, branches with an odd second digit are mostly focused on sensitive +technical updates and more aimed at advanced users because they are likely to +trigger more bugs than the other ones. They are maintained for about a year +only and must not be deployed where they cannot be rolled back in emergency. A +complete version includes one or two sub-version numbers indicating the level of +fix. For example, version 1.5.14 is the 14th fix release in branch 1.5 after +version 1.5.0 was issued. It contains 126 fixes for individual bugs, 24 updates +on the documentation, and 75 other backported patches, most of which were needed +to fix the aforementioned 126 bugs. An existing feature may never be modified +nor removed in a stable branch, in order to guarantee that upgrades within the +same branch will always be harmless. + +HAProxy is available from multiple sources, at different release rhythms : + + - The official community web site : http://www.haproxy.org/ : this site + provides the sources of the latest development release, all stable releases, + as well as nightly snapshots for each branch. The release cycle is not fast, + several months between stable releases, or between development snapshots. + Very old versions are still supported there. Everything is provided as + sources only, so whatever comes from there needs to be rebuilt and/or + repackaged; + + - GitHub : https://github.com/haproxy/haproxy/ : this is the mirror for the + development branch only, which provides integration with the issue tracker, + continuous integration and code coverage tools. This is exclusively for + contributors; + + - A number of operating systems such as Linux distributions and BSD ports. + These systems generally provide long-term maintained versions which do not + always contain all the fixes from the official ones, but which at least + contain the critical fixes. It often is a good option for most users who do + not seek advanced configurations and just want to keep updates easy; + + - Commercial versions from http://www.haproxy.com/ : these are supported + professional packages built for various operating systems or provided as + appliances, based on the latest stable versions and including a number of + features backported from the next release for which there is a strong + demand. It is the best option for users seeking the latest features with + the reliability of a stable branch, the fastest response time to fix bugs, + or simply support contracts on top of an open source product; + + +In order to ensure that the version you're using is the latest one in your +branch, you need to proceed this way : + + - verify which HAProxy executable you're running : some systems ship it by + default and administrators install their versions somewhere else on the + system, so it is important to verify in the startup scripts which one is + used; + + - determine which source your HAProxy version comes from. For this, it's + generally sufficient to type "haproxy -v". A development version will + appear like this, with the "dev" word after the branch number : + + HAProxy version 2.4-dev18-a5357c-137 2021/05/09 - https://haproxy.org/ + + A stable version will appear like this, as well as unmodified stable + versions provided by operating system vendors : + + HAProxy version 1.5.14 2015/07/02 + + And a nightly snapshot of a stable version will appear like this with an + hexadecimal sequence after the version, and with the date of the snapshot + instead of the date of the release : + + HAProxy version 1.5.14-e4766ba 2015/07/29 + + Any other format may indicate a system-specific package with its own + patch set. For example HAProxy Enterprise versions will appear with the + following format (<branch>-<latest commit>-<revision>) : + + HAProxy version 1.5.0-994126-357 2015/07/02 + + Please note that historically versions prior to 2.4 used to report the + process name with a hyphen between "HA" and "Proxy", including those above + which were adjusted to show the correct format only, so better ignore this + word or use a relaxed match in scripts. Additionally, modern versions add + a URL linking to the project's home. + + Finally, versions 2.1 and above will include a "Status" line indicating + whether the version is safe for production or not, and if so, till when, as + well as a link to the list of known bugs affecting this version. + + - for system-specific packages, you have to check with your vendor's package + repository or update system to ensure that your system is still supported, + and that fixes are still provided for your branch. For community versions + coming from haproxy.org, just visit the site, verify the status of your + branch and compare the latest version with yours to see if you're on the + latest one. If not you can upgrade. If your branch is not maintained + anymore, you're definitely very late and will have to consider an upgrade + to a more recent branch (carefully read the README when doing so). + +HAProxy will have to be updated according to the source it came from. Usually it +follows the system vendor's way of upgrading a package. If it was taken from +sources, please read the README file in the sources directory after extracting +the sources and follow the instructions for your operating system. + + +4. Companion products and alternatives +-------------------------------------- + +HAProxy integrates fairly well with certain products listed below, which is why +they are mentioned here even if not directly related to HAProxy. + + +4.1. Apache HTTP server +----------------------- + +Apache is the de-facto standard HTTP server. It's a very complete and modular +project supporting both file serving and dynamic contents. It can serve as a +frontend for some application servers. It can even proxy requests and cache +responses. In all of these use cases, a front load balancer is commonly needed. +Apache can work in various modes, some being heavier than others. Certain +modules still require the heavier pre-forked model and will prevent Apache from +scaling well with a high number of connections. In this case HAProxy can provide +a tremendous help by enforcing the per-server connection limits to a safe value +and will significantly speed up the server and preserve its resources that will +be better used by the application. + +Apache can extract the client's address from the X-Forwarded-For header by using +the "mod_rpaf" extension. HAProxy will automatically feed this header when +"option forwardfor" is specified in its configuration. HAProxy may also offer a +nice protection to Apache when exposed to the internet, where it will better +resist a wide number of types of DoS attacks. + + +4.2. NGINX +---------- + +NGINX is the second de-facto standard HTTP server. Just like Apache, it covers a +wide range of features. NGINX is built on a similar model as HAProxy so it has +no problem dealing with tens of thousands of concurrent connections. When used +as a gateway to some applications (e.g. using the included PHP FPM) it can often +be beneficial to set up some frontend connection limiting to reduce the load +on the PHP application. HAProxy will clearly be useful there both as a regular +load balancer and as the traffic regulator to speed up PHP by decongesting +it. Also since both products use very little CPU thanks to their event-driven +architecture, it's often easy to install both of them on the same system. NGINX +implements HAProxy's PROXY protocol, thus it is easy for HAProxy to pass the +client's connection information to NGINX so that the application gets all the +relevant information. Some benchmarks have also shown that for large static +file serving, implementing consistent hash on HAProxy in front of NGINX can be +beneficial by optimizing the OS' cache hit ratio, which is basically multiplied +by the number of server nodes. + + +4.3. Varnish +------------ + +Varnish is a smart caching reverse-proxy, probably best described as a web +application accelerator. Varnish doesn't implement SSL/TLS and wants to dedicate +all of its CPU cycles to what it does best. Varnish also implements HAProxy's +PROXY protocol so that HAProxy can very easily be deployed in front of Varnish +as an SSL offloader as well as a load balancer and pass it all relevant client +information. Also, Varnish naturally supports decompression from the cache when +a server has provided a compressed object, but doesn't compress however. HAProxy +can then be used to compress outgoing data when backend servers do not implement +compression, though it's rarely a good idea to compress on the load balancer +unless the traffic is low. + +When building large caching farms across multiple nodes, HAProxy can make use of +consistent URL hashing to intelligently distribute the load to the caching nodes +and avoid cache duplication, resulting in a total cache size which is the sum of +all caching nodes. In addition, caching of very small dumb objects for a short +duration on HAProxy can sometimes save network round trips and reduce the CPU +load on both the HAProxy and the Varnish nodes. This is only possible is no +processing is done on these objects on Varnish (this is often referred to as +the notion of "favicon cache", by which a sizeable percentage of useless +downstream requests can sometimes be avoided). However do not enable HAProxy +caching for a long time (more than a few seconds) in front of any other cache, +that would significantly complicate troubleshooting without providing really +significant savings. + + +4.4. Alternatives +----------------- + +Linux Virtual Server (LVS or IPVS) is the layer 4 load balancer included within +the Linux kernel. It works at the packet level and handles TCP and UDP. In most +cases it's more a complement than an alternative since it doesn't have layer 7 +knowledge at all. + +Pound is another well-known load balancer. It's much simpler and has much less +features than HAProxy but for many very basic setups both can be used. Its +author has always focused on code auditability first and wants to maintain the +set of features low. Its thread-based architecture scales less well with high +connection counts, but it's a good product. + +Pen is a quite light load balancer. It supports SSL, maintains persistence using +a fixed-size table of its clients' IP addresses. It supports a packet-oriented +mode allowing it to support direct server return and UDP to some extents. It is +meant for small loads (the persistence table only has 2048 entries). + +NGINX can do some load balancing to some extents, though it's clearly not its +primary function. Production traffic is used to detect server failures, the +load balancing algorithms are more limited, and the stickiness is very limited. +But it can make sense in some simple deployment scenarios where it is already +present. The good thing is that since it integrates very well with HAProxy, +there's nothing wrong with adding HAProxy later when its limits have been +reached. + +Varnish also does some load balancing of its backend servers and does support +real health checks. It doesn't implement stickiness however, so just like with +NGINX, as long as stickiness is not needed that can be enough to start with. +And similarly, since HAProxy and Varnish integrate so well together, it's easy +to add it later into the mix to complement the feature set. + + +5. Contacts +----------- + +If you want to contact the developers or any community member about anything, +the best way to do it usually is via the mailing list by sending your message +to haproxy@formilux.org. Please note that this list is public and its archives +are public as well so you should avoid disclosing sensitive information. A +thousand of users of various experience levels are present there and even the +most complex questions usually find an optimal response relatively quickly. +Suggestions are welcome too. For users having difficulties with e-mail, a +Discourse platform is available at http://discourse.haproxy.org/ . However +please keep in mind that there are less people reading questions there and that +most are handled by a really tiny team. In any case, please be patient and +respectful with those who devote their spare time helping others. + +I you believe you've found a bug but are not sure, it's best reported on the +mailing list. If you're quite convinced you've found a bug, that your version +is up-to-date in its branch, and you already have a GitHub account, feel free +to go directly to https://github.com/haproxy/haproxy/ and file an issue with +all possibly available details. Again, this is public so be careful not to post +information you might later regret. Since the issue tracker presents itself as +a very long thread, please avoid pasting very long dumps (a few hundreds lines +or more) and attach them instead. + +If you've found what you're absolutely certain can be considered a critical +security issue that would put many users in serious trouble if discussed in a +public place, then you can send it with the reproducer to security@haproxy.org. +A small team of trusted developers will receive it and will be able to propose +a fix. We usually don't use embargoes and once a fix is available it gets +merged. In some rare circumstances it can happen that a release is coordinated +with software vendors. Please note that this process usually messes up with +eveyone's work, and that rushed up releases can sometimes introduce new bugs, +so it's best avoided unless strictly necessary; as such, there is often little +consideration for reports that needlessly cause such extra burden, and the best +way to see your work credited usually is to provide a working fix, which will +appear in changelogs. diff --git a/doc/lgpl.txt b/doc/lgpl.txt new file mode 100644 index 0000000..5ab7695 --- /dev/null +++ b/doc/lgpl.txt @@ -0,0 +1,504 @@ + GNU LESSER GENERAL PUBLIC LICENSE + Version 2.1, February 1999 + + Copyright (C) 1991, 1999 Free Software Foundation, Inc. + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + Everyone is permitted to copy and distribute verbatim copies + of this license document, but changing it is not allowed. + +[This is the first released version of the Lesser GPL. It also counts + as the successor of the GNU Library Public License, version 2, hence + the version number 2.1.] + + Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software--to make sure the software is free for all its users. + + This license, the Lesser General Public License, applies to some +specially designated software packages--typically libraries--of the +Free Software Foundation and other authors who decide to use it. You +can use it too, but we suggest you first think carefully about whether +this license or the ordinary General Public License is the better +strategy to use in any particular case, based on the explanations below. + + When we speak of free software, we are referring to freedom of use, +not price. Our General Public Licenses are designed to make sure that +you have the freedom to distribute copies of free software (and charge +for this service if you wish); that you receive source code or can get +it if you want it; that you can change the software and use pieces of +it in new free programs; and that you are informed that you can do +these things. + + To protect your rights, we need to make restrictions that forbid +distributors to deny you these rights or to ask you to surrender these +rights. These restrictions translate to certain responsibilities for +you if you distribute copies of the library or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link other code with the library, you must provide +complete object files to the recipients, so that they can relink them +with the library after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + We protect your rights with a two-step method: (1) we copyright the +library, and (2) we offer you this license, which gives you legal +permission to copy, distribute and/or modify the library. + + To protect each distributor, we want to make it very clear that +there is no warranty for the free library. Also, if the library is +modified by someone else and passed on, the recipients should know +that what they have is not the original version, so that the original +author's reputation will not be affected by problems that might be +introduced by others. + + Finally, software patents pose a constant threat to the existence of +any free program. We wish to make sure that a company cannot +effectively restrict the users of a free program by obtaining a +restrictive license from a patent holder. Therefore, we insist that +any patent license obtained for a version of the library must be +consistent with the full freedom of use specified in this license. + + Most GNU software, including some libraries, is covered by the +ordinary GNU General Public License. This license, the GNU Lesser +General Public License, applies to certain designated libraries, and +is quite different from the ordinary General Public License. We use +this license for certain libraries in order to permit linking those +libraries into non-free programs. + + When a program is linked with a library, whether statically or using +a shared library, the combination of the two is legally speaking a +combined work, a derivative of the original library. The ordinary +General Public License therefore permits such linking only if the +entire combination fits its criteria of freedom. The Lesser General +Public License permits more lax criteria for linking other code with +the library. + + We call this license the "Lesser" General Public License because it +does Less to protect the user's freedom than the ordinary General +Public License. It also provides other free software developers Less +of an advantage over competing non-free programs. These disadvantages +are the reason we use the ordinary General Public License for many +libraries. However, the Lesser license provides advantages in certain +special circumstances. + + For example, on rare occasions, there may be a special need to +encourage the widest possible use of a certain library, so that it becomes +a de-facto standard. To achieve this, non-free programs must be +allowed to use the library. A more frequent case is that a free +library does the same job as widely used non-free libraries. In this +case, there is little to gain by limiting the free library to free +software only, so we use the Lesser General Public License. + + In other cases, permission to use a particular library in non-free +programs enables a greater number of people to use a large body of +free software. For example, permission to use the GNU C Library in +non-free programs enables many more people to use the whole GNU +operating system, as well as its variant, the GNU/Linux operating +system. + + Although the Lesser General Public License is Less protective of the +users' freedom, it does ensure that the user of a program that is +linked with the Library has the freedom and the wherewithal to run +that program using a modified version of the Library. + + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +"work based on the library" and a "work that uses the library". The +former contains code derived from the library, whereas the latter must +be combined with the library in order to run. + + GNU LESSER GENERAL PUBLIC LICENSE + TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + + 0. This License Agreement applies to any software library or other +program which contains a notice placed by the copyright holder or +other authorized party saying it may be distributed under the terms of +this Lesser General Public License (also called "this License"). +Each licensee is addressed as "you". + + A "library" means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The "Library", below, refers to any such software library or work +which has been distributed under these terms. A "work based on the +Library" means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term "modification".) + + "Source code" for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. + + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + + 1. You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. + + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + + 2. You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + + a) The modified work must itself be a software library. + + b) You must cause the files modified to carry prominent notices + stating that you changed the files and the date of any change. + + c) You must cause the whole of the work to be licensed at no + charge to all third parties under the terms of this License. + + d) If a facility in the modified Library refers to a function or a + table of data to be supplied by an application program that uses + the facility, other than as an argument passed when the facility + is invoked, then you must make a good faith effort to ensure that, + in the event an application does not supply such function or + table, the facility still operates, and performs whatever part of + its purpose remains meaningful. + + (For example, a function in a library to compute square roots has + a purpose that is entirely well-defined independent of the + application. Therefore, Subsection 2d requires that any + application-supplied function or table used by this function must + be optional: if the application does not supply it, the square + root function must still compute square roots.) + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + + 3. You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. + + 4. You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + + 5. A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a "work that uses the Library". Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. + + However, linking a "work that uses the Library" with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a "work that uses the +library". The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a "work that uses the Library" uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. + + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + + 6. As an exception to the Sections above, you may also combine or +link a "work that uses the Library" with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + + a) Accompany the work with the complete corresponding + machine-readable source code for the Library including whatever + changes were used in the work (which must be distributed under + Sections 1 and 2 above); and, if the work is an executable linked + with the Library, with the complete machine-readable "work that + uses the Library", as object code and/or source code, so that the + user can modify the Library and then relink to produce a modified + executable containing the modified Library. (It is understood + that the user who changes the contents of definitions files in the + Library will not necessarily be able to recompile the application + to use the modified definitions.) + + b) Use a suitable shared library mechanism for linking with the + Library. A suitable mechanism is one that (1) uses at run time a + copy of the library already present on the user's computer system, + rather than copying library functions into the executable, and (2) + will operate properly with a modified version of the library, if + the user installs one, as long as the modified version is + interface-compatible with the version that the work was made with. + + c) Accompany the work with a written offer, valid for at + least three years, to give the same user the materials + specified in Subsection 6a, above, for a charge no more + than the cost of performing this distribution. + + d) If distribution of the work is made by offering access to copy + from a designated place, offer equivalent access to copy the above + specified materials from the same place. + + e) Verify that the user has already received a copy of these + materials or that you have already sent this user a copy. + + For an executable, the required form of the "work that uses the +Library" must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the materials to be distributed need not include anything that is +normally distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. + + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. + + 7. You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + + a) Accompany the combined library with a copy of the same work + based on the Library, uncombined with any other library + facilities. This must be distributed under the terms of the + Sections above. + + b) Give prominent notice with the combined library of the fact + that part of it is a work based on the Library, and explaining + where to find the accompanying uncombined form of the same work. + + 8. You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + + 9. You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. + + 10. Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties with +this License. + + 11. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + + 12. If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + + 13. The Free Software Foundation may publish revised and/or new +versions of the Lesser General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +"any later version", you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. + + 14. If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + + NO WARRANTY + + 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + + 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. + + END OF TERMS AND CONDITIONS + + How to Apply These Terms to Your New Libraries + + If you develop a new library, and you want it to be of the greatest +possible use to the public, we recommend making it free software that +everyone can redistribute and change. You can do so by permitting +redistribution under these terms (or, alternatively, under the terms of the +ordinary General Public License). + + To apply these terms, attach the following notices to the library. It is +safest to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least the +"copyright" line and a pointer to where the full notice is found. + + <one line to give the library's name and a brief idea of what it does.> + Copyright (C) <year> <name of author> + + This library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + This library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with this library; if not, write to the Free Software + Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +Also add information on how to contact you by electronic and paper mail. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the library, if +necessary. Here is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the + library `Frob' (a library for tweaking knobs) written by James Random Hacker. + + <signature of Ty Coon>, 1 April 1990 + Ty Coon, President of Vice + +That's all there is to it! + + diff --git a/doc/linux-syn-cookies.txt b/doc/linux-syn-cookies.txt new file mode 100644 index 0000000..ca13066 --- /dev/null +++ b/doc/linux-syn-cookies.txt @@ -0,0 +1,106 @@ +SYN cookie analysis on 3.10 + +include/net/request_sock.h: + +static inline int reqsk_queue_is_full(const struct request_sock_queue *queue) +{ + return queue->listen_opt->qlen >> queue->listen_opt->max_qlen_log; +} + +include/net/inet_connection_sock.h: + +static inline int inet_csk_reqsk_queue_is_full(const struct sock *sk) +{ + return reqsk_queue_is_full(&inet_csk(sk)->icsk_accept_queue); +} + +max_qlen_log is computed to equal log2(min(min(listen_backlog,somaxconn), sysctl_max_syn_backlog), +and this is done this way following this path : + + socket.c:listen(fd, backlog) : + + backlog = min(backlog, somaxconn) + => af_inet.c:inet_listen(sock, backlog) + + => inet_connection_sock.c:inet_csk_listen_start(sk, backlog) + + sk_max_ack_backlog = backlog + => request_sock.c:reqsk_queue_alloc(sk, backlog (=nr_table_entries)) + + nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog); + nr_table_entries = max_t(u32, nr_table_entries, 8); + nr_table_entries = roundup_pow_of_two(nr_table_entries + 1); + for (lopt->max_qlen_log = 3; + (1 << lopt->max_qlen_log) < nr_table_entries; + lopt->max_qlen_log++); + + +tcp_ipv4.c:tcp_v4_conn_request() + - inet_csk_reqsk_queue_is_full() returns true when the listening socket's + qlen is larger than 1 << max_qlen_log, so basically qlen >= min(backlog,max_backlog) + + - tcp_syn_flood_action() returns true when sysctl_tcp_syncookies is set. It + also emits a warning once per listening socket when activating the feature. + + if (inet_csk_reqsk_queue_is_full(sk) && !isn) { + want_cookie = tcp_syn_flood_action(sk, skb, "TCP"); + if (!want_cookie) + goto drop; + } + + => when the socket's current backlog is >= min(backlog,max_backlog), + either tcp_syn_cookies is set so we set want_cookie to 1, or we drop. + + + /* Accept backlog is full. If we have already queued enough + * of warm entries in syn queue, drop request. It is better than + * clogging syn queue with openreqs with exponentially increasing + * timeout. + */ + +sock.h:sk_acceptq_is_full() = sk_ack_backlog > sk_max_ack_backlog + = sk_ack_backlog > min(somaxconn, listen_backlog) + + if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1) { + NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS); + goto drop; + } + +====> the following algorithm is applied in the reverse order but with these + priorities : + + 1) IF socket's accept queue >= min(somaxconn, listen_backlog) THEN drop + + 2) IF socket's SYN backlog < min(somaxconn, listen_backlog, tcp_max_syn_backlog) THEN accept + + 3) IF tcp_syn_cookies THEN send_syn_cookie + + 4) otherwise drop + +====> the problem is the accept queue being filled, but it's supposed to be + filled only with validated client requests (step 1). + + + + req = inet_reqsk_alloc(&tcp_request_sock_ops); + if (!req) + goto drop; + + ... + if (!sysctl_tcp_syncookies && + (sysctl_max_syn_backlog - inet_csk_reqsk_queue_len(sk) < + (sysctl_max_syn_backlog >> 2)) && + !tcp_peer_is_proven(req, dst, false)) { + /* Without syncookies last quarter of + * backlog is filled with destinations, + * proven to be alive. + * It means that we continue to communicate + * to destinations, already remembered + * to the moment of synflood. + */ + LIMIT_NETDEBUG(KERN_DEBUG pr_fmt("drop open request from %pI4/%u\n"), + &saddr, ntohs(tcp_hdr(skb)->source)); + goto drop_and_release; + } + + diff --git a/doc/lua-api/Makefile b/doc/lua-api/Makefile new file mode 100644 index 0000000..b21857d --- /dev/null +++ b/doc/lua-api/Makefile @@ -0,0 +1,153 @@ +# Makefile for Sphinx documentation +# + +# You can set these variables from the command line. +SPHINXOPTS = +SPHINXBUILD = sphinx-build +PAPER = +BUILDDIR = _build + +# Internal variables. +PAPEROPT_a4 = -D latex_paper_size=a4 +PAPEROPT_letter = -D latex_paper_size=letter +ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . +# the i18n builder cannot share the environment and doctrees with the others +I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . + +.PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext + +help: + @echo "Please use \`make <target>' where <target> is one of" + @echo " html to make standalone HTML files" + @echo " dirhtml to make HTML files named index.html in directories" + @echo " singlehtml to make a single large HTML file" + @echo " pickle to make pickle files" + @echo " json to make JSON files" + @echo " htmlhelp to make HTML files and a HTML help project" + @echo " qthelp to make HTML files and a qthelp project" + @echo " devhelp to make HTML files and a Devhelp project" + @echo " epub to make an epub" + @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" + @echo " latexpdf to make LaTeX files and run them through pdflatex" + @echo " text to make text files" + @echo " man to make manual pages" + @echo " texinfo to make Texinfo files" + @echo " info to make Texinfo files and run them through makeinfo" + @echo " gettext to make PO message catalogs" + @echo " changes to make an overview of all changed/added/deprecated items" + @echo " linkcheck to check all external links for integrity" + @echo " doctest to run all doctests embedded in the documentation (if enabled)" + +clean: + -rm -rf $(BUILDDIR)/* + +html: + $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html + @echo + @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." + +dirhtml: + $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml + @echo + @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml." + +singlehtml: + $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml + @echo + @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml." + +pickle: + $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle + @echo + @echo "Build finished; now you can process the pickle files." + +json: + $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json + @echo + @echo "Build finished; now you can process the JSON files." + +htmlhelp: + $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp + @echo + @echo "Build finished; now you can run HTML Help Workshop with the" \ + ".hhp project file in $(BUILDDIR)/htmlhelp." + +qthelp: + $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp + @echo + @echo "Build finished; now you can run "qcollectiongenerator" with the" \ + ".qhcp project file in $(BUILDDIR)/qthelp, like this:" + @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/haproxy-lua.qhcp" + @echo "To view the help file:" + @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/haproxy-lua.qhc" + +devhelp: + $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp + @echo + @echo "Build finished." + @echo "To view the help file:" + @echo "# mkdir -p $$HOME/.local/share/devhelp/haproxy-lua" + @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/haproxy-lua" + @echo "# devhelp" + +epub: + $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub + @echo + @echo "Build finished. The epub file is in $(BUILDDIR)/epub." + +latex: + $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex + @echo + @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex." + @echo "Run \`make' in that directory to run these through (pdf)latex" \ + "(use \`make latexpdf' here to do that automatically)." + +latexpdf: + $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex + @echo "Running LaTeX files through pdflatex..." + $(MAKE) -C $(BUILDDIR)/latex all-pdf + @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." + +text: + $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text + @echo + @echo "Build finished. The text files are in $(BUILDDIR)/text." + +man: + $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man + @echo + @echo "Build finished. The manual pages are in $(BUILDDIR)/man." + +texinfo: + $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo + @echo + @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo." + @echo "Run \`make' in that directory to run these through makeinfo" \ + "(use \`make info' here to do that automatically)." + +info: + $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo + @echo "Running Texinfo files through makeinfo..." + make -C $(BUILDDIR)/texinfo info + @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo." + +gettext: + $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale + @echo + @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale." + +changes: + $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes + @echo + @echo "The overview file is in $(BUILDDIR)/changes." + +linkcheck: + $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck + @echo + @echo "Link check complete; look for any errors in the above output " \ + "or in $(BUILDDIR)/linkcheck/output.txt." + +doctest: + $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest + @echo "Testing of doctests in the sources finished, look at the " \ + "results in $(BUILDDIR)/doctest/output.txt." diff --git a/doc/lua-api/_static/channel.fig b/doc/lua-api/_static/channel.fig new file mode 100644 index 0000000..8a6c0a1 --- /dev/null +++ b/doc/lua-api/_static/channel.fig @@ -0,0 +1,55 @@ +#FIG 3.2 Produced by xfig version 3.2.5b +Landscape +Center +Metric +A4 +100.00 +Single +-2 +1200 2 +1 1 0 1 0 7 50 -1 -1 0.000 1 0.0000 4500 1620 1260 585 4500 1620 5760 2205 +2 3 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8 + 1170 1350 1170 1890 2790 1890 2790 2070 3240 1620 2790 1170 + 2790 1350 1170 1350 +2 3 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 8 + 5760 1350 5760 1890 7380 1890 7380 2070 7830 1620 7380 1170 + 7380 1350 5760 1350 +2 1 1 1 0 7 50 -1 -1 1.000 0 0 -1 1 0 2 + 5 1 1.00 60.00 120.00 + 6210 540 6210 1440 +2 1 1 1 0 7 50 -1 -1 1.000 0 0 -1 1 0 2 + 5 1 1.00 60.00 120.00 + 6210 2340 6210 1800 +2 1 1 1 0 7 50 -1 -1 1.000 0 0 -1 1 0 2 + 5 1 1.00 60.00 120.00 + 1350 2520 1350 1800 +2 1 1 1 0 7 50 -1 -1 1.000 0 0 -1 1 0 2 + 5 1 1.00 60.00 120.00 + 1350 360 1350 1440 +3 0 1 1 0 7 50 -1 -1 1.000 0 0 1 5 + 5 1 1.00 60.00 120.00 + 2970 1665 3105 1125 3330 900 3600 765 3915 720 + 0.000 1.000 1.000 1.000 0.000 +3 0 1 1 0 7 50 -1 -1 1.000 0 0 1 5 + 5 1 1.00 60.00 120.00 + 6030 1665 5895 1125 5670 900 5400 765 5040 720 + 0.000 1.000 1.000 1.000 0.000 +4 2 0 50 -1 16 12 0.0000 4 195 750 1080 1665 producer\001 +4 1 0 50 -1 16 12 0.0000 4 195 1785 4500 1575 HAProxy processing\001 +4 1 0 50 -1 16 12 0.0000 4 195 1260 4500 1815 (including Lua)\001 +4 0 0 50 -1 16 12 0.0000 4 105 855 7920 1665 consumer\001 +4 0 0 50 -1 12 12 0.0000 4 150 600 1440 2205 set()\001 +4 0 0 50 -1 12 12 0.0000 4 165 960 1440 2400 append()\001 +4 0 0 50 -1 16 12 0.0000 4 150 1260 1260 2700 write functions\001 +4 0 0 50 -1 16 12 0.0000 4 150 1230 1260 315 read functions\001 +4 0 0 50 -1 12 12 0.0000 4 165 600 1440 540 dup()\001 +4 0 0 50 -1 12 12 0.0000 4 165 600 1440 735 get()\001 +4 0 0 50 -1 12 12 0.0000 4 165 1200 1440 930 get_line()\001 +4 0 0 50 -1 12 12 0.0000 4 165 1440 1440 1125 get_in_len()\001 +4 1 0 50 -1 12 12 0.0000 4 150 1080 4500 765 forward()\001 +4 0 0 50 -1 16 12 0.0000 4 150 1260 6120 495 write functions\001 +4 0 0 50 -1 12 12 0.0000 4 150 720 6300 1110 send()\001 +4 0 0 50 -1 12 12 0.0000 4 165 1560 6255 2205 get_out_len()\001 +4 0 0 50 -1 16 12 0.0000 4 150 1230 6120 2520 read functions\001 +4 1 0 50 -1 16 12 0.0000 4 150 1650 4500 315 both side functions\001 +4 1 0 50 -1 12 12 0.0000 4 150 1080 4500 540 is_full()\001 diff --git a/doc/lua-api/_static/channel.png b/doc/lua-api/_static/channel.png Binary files differnew file mode 100644 index 0000000..e12a26e --- /dev/null +++ b/doc/lua-api/_static/channel.png diff --git a/doc/lua-api/conf.py b/doc/lua-api/conf.py new file mode 100644 index 0000000..fd7e0ee --- /dev/null +++ b/doc/lua-api/conf.py @@ -0,0 +1,242 @@ +# -*- coding: utf-8 -*- +# +# haproxy-lua documentation build configuration file, created by +# sphinx-quickstart on Tue Mar 10 11:15:09 2015. +# +# This file is execfile()d with the current directory set to its containing dir. +# +# Note that not all possible configuration values are present in this +# autogenerated file. +# +# All configuration values have a default; values that are commented out +# serve to show the default. + +import sys, os + +# If extensions (or modules to document with autodoc) are in another directory, +# add these directories to sys.path here. If the directory is relative to the +# documentation root, use os.path.abspath to make it absolute, like shown here. +#sys.path.insert(0, os.path.abspath('.')) + +# -- General configuration ----------------------------------------------------- + +# If your documentation needs a minimal Sphinx version, state it here. +#needs_sphinx = '1.0' + +# Add any Sphinx extension module names here, as strings. They can be extensions +# coming with Sphinx (named 'sphinx.ext.*') or your custom ones. +extensions = [] + +# Add any paths that contain templates here, relative to this directory. +templates_path = ['_templates'] + +# The suffix of source filenames. +source_suffix = '.rst' + +# The encoding of source files. +#source_encoding = 'utf-8-sig' + +# The master toctree document. +master_doc = 'index' + +# General information about the project. +project = u'haproxy-lua' +copyright = u'2015, Thierry FOURNIER' + +# The version info for the project you're documenting, acts as replacement for +# |version| and |release|, also used in various other places throughout the +# built documents. +# +# The short X.Y version. +version = '1.0' +# The full version, including alpha/beta/rc tags. +release = '1.0' + +# The language for content autogenerated by Sphinx. Refer to documentation +# for a list of supported languages. +#language = None + +# There are two options for replacing |today|: either, you set today to some +# non-false value, then it is used: +#today = '' +# Else, today_fmt is used as the format for a strftime call. +#today_fmt = '%B %d, %Y' + +# List of patterns, relative to source directory, that match files and +# directories to ignore when looking for source files. +exclude_patterns = ['_build'] + +# The reST default role (used for this markup: `text`) to use for all documents. +#default_role = None + +# If true, '()' will be appended to :func: etc. cross-reference text. +#add_function_parentheses = True + +# If true, the current module name will be prepended to all description +# unit titles (such as .. function::). +#add_module_names = True + +# If true, sectionauthor and moduleauthor directives will be shown in the +# output. They are ignored by default. +#show_authors = False + +# The name of the Pygments (syntax highlighting) style to use. +pygments_style = 'sphinx' + +# A list of ignored prefixes for module index sorting. +#modindex_common_prefix = [] + + +# -- Options for HTML output --------------------------------------------------- + +# The theme to use for HTML and HTML Help pages. See the documentation for +# a list of builtin themes. +html_theme = 'default' + +# Theme options are theme-specific and customize the look and feel of a theme +# further. For a list of options available for each theme, see the +# documentation. +#html_theme_options = {} + +# Add any paths that contain custom themes here, relative to this directory. +#html_theme_path = [] + +# The name for this set of Sphinx documents. If None, it defaults to +# "<project> v<release> documentation". +#html_title = None + +# A shorter title for the navigation bar. Default is the same as html_title. +#html_short_title = None + +# The name of an image file (relative to this directory) to place at the top +# of the sidebar. +#html_logo = None + +# The name of an image file (within the static path) to use as favicon of the +# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 +# pixels large. +#html_favicon = None + +# Add any paths that contain custom static files (such as style sheets) here, +# relative to this directory. They are copied after the builtin static files, +# so a file named "default.css" will overwrite the builtin "default.css". +html_static_path = ['_static'] + +# If not '', a 'Last updated on:' timestamp is inserted at every page bottom, +# using the given strftime format. +#html_last_updated_fmt = '%b %d, %Y' + +# If true, SmartyPants will be used to convert quotes and dashes to +# typographically correct entities. +#html_use_smartypants = True + +# Custom sidebar templates, maps document names to template names. +#html_sidebars = {} + +# Additional templates that should be rendered to pages, maps page names to +# template names. +#html_additional_pages = {} + +# If false, no module index is generated. +#html_domain_indices = True + +# If false, no index is generated. +#html_use_index = True + +# If true, the index is split into individual pages for each letter. +#html_split_index = False + +# If true, links to the reST sources are added to the pages. +#html_show_sourcelink = True + +# If true, "Created using Sphinx" is shown in the HTML footer. Default is True. +#html_show_sphinx = True + +# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. +#html_show_copyright = True + +# If true, an OpenSearch description file will be output, and all pages will +# contain a <link> tag referring to it. The value of this option must be the +# base URL from which the finished HTML is served. +#html_use_opensearch = '' + +# This is the file name suffix for HTML files (e.g. ".xhtml"). +#html_file_suffix = None + +# Output file base name for HTML help builder. +htmlhelp_basename = 'haproxy-luadoc' + + +# -- Options for LaTeX output -------------------------------------------------- + +latex_elements = { +# The paper size ('letterpaper' or 'a4paper'). +#'papersize': 'letterpaper', + +# The font size ('10pt', '11pt' or '12pt'). +#'pointsize': '10pt', + +# Additional stuff for the LaTeX preamble. +#'preamble': '', +} + +# Grouping the document tree into LaTeX files. List of tuples +# (source start file, target name, title, author, documentclass [howto/manual]). +latex_documents = [ + ('index', 'haproxy-lua.tex', u'haproxy-lua Documentation', + u'Thierry FOURNIER', 'manual'), +] + +# The name of an image file (relative to this directory) to place at the top of +# the title page. +#latex_logo = None + +# For "manual" documents, if this is true, then toplevel headings are parts, +# not chapters. +#latex_use_parts = False + +# If true, show page references after internal links. +#latex_show_pagerefs = False + +# If true, show URL addresses after external links. +#latex_show_urls = False + +# Documents to append as an appendix to all manuals. +#latex_appendices = [] + +# If false, no module index is generated. +#latex_domain_indices = True + + +# -- Options for manual page output -------------------------------------------- + +# One entry per manual page. List of tuples +# (source start file, name, description, authors, manual section). +man_pages = [ + ('index', 'haproxy-lua', u'haproxy-lua Documentation', + [u'Thierry FOURNIER'], 1) +] + +# If true, show URL addresses after external links. +#man_show_urls = False + + +# -- Options for Texinfo output ------------------------------------------------ + +# Grouping the document tree into Texinfo files. List of tuples +# (source start file, target name, title, author, +# dir menu entry, description, category) +texinfo_documents = [ + ('index', 'haproxy-lua', u'haproxy-lua Documentation', + u'Thierry FOURNIER', 'haproxy-lua', 'One line description of project.', + 'Miscellaneous'), +] + +# Documents to append as an appendix to all manuals. +#texinfo_appendices = [] + +# If false, no module index is generated. +#texinfo_domain_indices = True + +# How to display URL addresses: 'footnote', 'no', or 'inline'. +#texinfo_show_urls = 'footnote' diff --git a/doc/lua-api/index.rst b/doc/lua-api/index.rst new file mode 100644 index 0000000..15bdd7d --- /dev/null +++ b/doc/lua-api/index.rst @@ -0,0 +1,3797 @@ +.. toctree:: + :maxdepth: 2 + + +How Lua runs in HAProxy +======================= + +HAProxy Lua running contexts +---------------------------- + +The Lua code executed in HAProxy can be processed in 2 main modes. The first one +is the **initialisation mode**, and the second is the **runtime mode**. + +* In the **initialisation mode**, we can perform DNS solves, but we cannot + perform socket I/O. In this initialisation mode, HAProxy still blocked during + the execution of the Lua program. + +* In the **runtime mode**, we cannot perform DNS solves, but we can use sockets. + The execution of the Lua code is multiplexed with the requests processing, so + the Lua code seems to be run in blocking, but it is not the case. + +The Lua code is loaded in one or more files. These files contains main code and +functions. Lua have 7 execution context. + +1. The Lua file **body context**. It is executed during the load of the Lua file + in the HAProxy `[global]` section with the directive `lua-load`. It is + executed in initialisation mode. This section is use for configuring Lua + bindings in HAProxy. + +2. The Lua **init context**. It is a Lua function executed just after the + HAProxy configuration parsing. The execution is in initialisation mode. In + this context the HAProxy environment are already initialized. It is useful to + check configuration, or initializing socket connections or tasks. These + functions are declared in the body context with the Lua function + `core.register_init()`. The prototype of the function is a simple function + without return value and without parameters, like this: `function fcn()`. + +3. The Lua **task context**. It is a Lua function executed after the start + of the HAProxy scheduler, and just after the declaration of the task with the + Lua function `core.register_task()`. This context can be concurrent with the + traffic processing. It is executed in runtime mode. The prototype of the + function is a simple function without return value and without parameters, + like this: `function fcn()`. + +4. The **action context**. It is a Lua function conditionally executed. These + actions are registered by the Lua directives "`core.register_action()`". The + prototype of the Lua called function is a function with doesn't returns + anything and that take an object of class TXN as entry. `function fcn(txn)`. + +5. The **sample-fetch context**. This function takes a TXN object as entry + argument and returns a string. These types of function cannot execute any + blocking function. They are useful to aggregate some of original HAProxy + sample-fetches and return the result. The prototype of the function is + `function string fcn(txn)`. These functions can be registered with the Lua + function `core.register_fetches()`. Each declared sample-fetch is prefixed by + the string "lua.". + + .. note:: + It is possible that this function cannot found the required data in the + original HAProxy sample-fetches, in this case, it cannot return the + result. This case is not yet supported + +6. The **converter context**. It is a Lua function that takes a string as input + and returns another string as output. These types of function are stateless, + it cannot access to any context. They don't execute any blocking function. + The call prototype is `function string fcn(string)`. This function can be + registered with the Lua function `core.register_converters()`. Each declared + converter is prefixed by the string "lua.". + +7. The **filter context**: It is a Lua object based on a class defining filter + callback functions. Lua filters are registered using + `core.register_filter()`. Each declared filter is prefixed by the string + "lua.". + + .. warning:: + The Lua filter support is highly experimental. The API is still unstable + and may change without notice. No backward compatibility should be + expected for now. Use it with an extreme caution and report any issue or + comment about it. The feature was unveiled to improve it and to adapt it + to real usages. + + +HAProxy Lua Hello world +----------------------- + +HAProxy configuration file (`hello_world.conf`): + +:: + + global + lua-load hello_world.lua + + listen proxy + bind 127.0.0.1:10001 + tcp-request inspect-delay 1s + tcp-request content use-service lua.hello_world + +HAProxy Lua file (`hello_world.lua`): + +.. code-block:: lua + + core.register_service("hello_world", "tcp", function(applet) + applet:send("hello world\n") + end) + +How to start HAProxy for testing this configuration: + +:: + + ./haproxy -f hello_world.conf + +On other terminal, you can test with telnet: + +:: + + #:~ telnet 127.0.0.1 10001 + hello world + +Core class +========== + +.. js:class:: core + + The "core" class contains all the HAProxy core functions. These function are + useful for the controlling the execution flow, registering hooks, manipulating + global maps or ACL, ... + + "core" class is basically provided with HAProxy. No `require` line is + required to uses these function. + + The "core" class is static, it is not possible to create a new object of this + type. + +.. js:attribute:: core.emerg + + :returns: integer + + This attribute is an integer, it contains the value of the loglevel "emergency" (0). + +.. js:attribute:: core.alert + + :returns: integer + + This attribute is an integer, it contains the value of the loglevel "alert" (1). + +.. js:attribute:: core.crit + + :returns: integer + + This attribute is an integer, it contains the value of the loglevel "critical" (2). + +.. js:attribute:: core.err + + :returns: integer + + This attribute is an integer, it contains the value of the loglevel "error" (3). + +.. js:attribute:: core.warning + + :returns: integer + + This attribute is an integer, it contains the value of the loglevel "warning" (4). + +.. js:attribute:: core.notice + + :returns: integer + + This attribute is an integer, it contains the value of the loglevel "notice" (5). + +.. js:attribute:: core.info + + :returns: integer + + This attribute is an integer, it contains the value of the loglevel "info" (6). + +.. js:attribute:: core.debug + + :returns: integer + + This attribute is an integer, it contains the value of the loglevel "debug" (7). + +.. js:attribute:: core.proxies + + **context**: task, action, sample-fetch, converter + + This attribute is a table of declared proxies (frontend and backends). Each + proxy give an access to his list of listeners and servers. The table is + indexed by proxy name, and each entry is of type :ref:`proxy_class`. + + .. Warning:: + if you are declared frontend and backend with the same name, only one of + these are listed. + + :see: :js:attr:`core.backends` + :see: :js:attr:`core.frontends` + +.. js:attribute:: core.backends + + **context**: task, action, sample-fetch, converter + + This attribute is a table of declared proxies with backend capability. Each + proxy give an access to his list of listeners and servers. The table is + indexed by the backend name, and each entry is of type :ref:`proxy_class`. + + :see: :js:attr:`core.proxies` + :see: :js:attr:`core.frontends` + +.. js:attribute:: core.frontends + + **context**: task, action, sample-fetch, converter + + This attribute is a table of declared proxies with frontend capability. Each + proxy give an access to his list of listeners and servers. The table is + indexed by the frontend name, and each entry is of type :ref:`proxy_class`. + + :see: :js:attr:`core.proxies` + :see: :js:attr:`core.backends` + +.. js:attribute:: core.thread + + **context**: task, action, sample-fetch, converter, applet + + This variable contains the executing thread number starting at 1. 0 is a + special case for the common lua context. So, if thread is 0, Lua scope is + shared by all threads, otherwise the scope is dedicated to a single thread. + A program which needs to execute some parts exactly once regardless of the + number of threads can check that core.thread is 0 or 1. + +.. js:function:: core.log(loglevel, msg) + + **context**: body, init, task, action, sample-fetch, converter + + This function sends a log. The log is sent, according with the HAProxy + configuration file, on the default syslog server if it is configured and on + the stderr if it is allowed. + + :param integer loglevel: Is the log level associated with the message. It is a + number between 0 and 7. + :param string msg: The log content. + :see: :js:attr:`core.emerg`, :js:attr:`core.alert`, :js:attr:`core.crit`, + :js:attr:`core.err`, :js:attr:`core.warning`, :js:attr:`core.notice`, + :js:attr:`core.info`, :js:attr:`core.debug` (log level definitions) + :see: :js:func:`core.Debug` + :see: :js:func:`core.Info` + :see: :js:func:`core.Warning` + :see: :js:func:`core.Alert` + +.. js:function:: core.Debug(msg) + + **context**: body, init, task, action, sample-fetch, converter + + :param string msg: The log content. + :see: :js:func:`core.log` + + Does the same job than: + +.. code-block:: lua + + function Debug(msg) + core.log(core.debug, msg) + end +.. + +.. js:function:: core.Info(msg) + + **context**: body, init, task, action, sample-fetch, converter + + :param string msg: The log content. + :see: :js:func:`core.log` + +.. code-block:: lua + + function Info(msg) + core.log(core.info, msg) + end +.. + +.. js:function:: core.Warning(msg) + + **context**: body, init, task, action, sample-fetch, converter + + :param string msg: The log content. + :see: :js:func:`core.log` + +.. code-block:: lua + + function Warning(msg) + core.log(core.warning, msg) + end +.. + +.. js:function:: core.Alert(msg) + + **context**: body, init, task, action, sample-fetch, converter + + :param string msg: The log content. + :see: :js:func:`core.log` + +.. code-block:: lua + + function Alert(msg) + core.log(core.alert, msg) + end +.. + +.. js:function:: core.add_acl(filename, key) + + **context**: init, task, action, sample-fetch, converter + + Add the ACL *key* in the ACLs list referenced by the file *filename*. + + :param string filename: the filename that reference the ACL entries. + :param string key: the key which will be added. + +.. js:function:: core.del_acl(filename, key) + + **context**: init, task, action, sample-fetch, converter + + Delete the ACL entry referenced by the key *key* in the list of ACLs + referenced by *filename*. + + :param string filename: the filename that reference the ACL entries. + :param string key: the key which will be deleted. + +.. js:function:: core.del_map(filename, key) + + **context**: init, task, action, sample-fetch, converter + + Delete the map entry indexed with the specified key in the list of maps + referenced by his filename. + + :param string filename: the filename that reference the map entries. + :param string key: the key which will be deleted. + +.. js:function:: core.get_info() + + **context**: body, init, task, action, sample-fetch, converter + + Returns HAProxy core information. We can found information like the uptime, + the pid, memory pool usage, tasks number, ... + + These information are also returned by the management socket via the command + "show info". See the management socket documentation for more information + about the content of these variables. + + :returns: an array of values. + +.. js:function:: core.now() + + **context**: body, init, task, action + + This function returns the current time. The time returned is fixed by the + HAProxy core and assures than the hour will be monotonic and that the system + call 'gettimeofday' will not be called too. The time is refreshed between each + Lua execution or resume, so two consecutive call to the function "now" will + probably returns the same result. + + :returns: a table which contains two entries "sec" and "usec". "sec" + contains the current at the epoch format, and "usec" contains the + current microseconds. + +.. js:function:: core.http_date(date) + + **context**: body, init, task, action + + This function take a string representing http date, and returns an integer + containing the corresponding date with a epoch format. A valid http date + me respect the format IMF, RFC850 or ASCTIME. + + :param string date: a date http-date formatted + :returns: integer containing epoch date + :see: :js:func:`core.imf_date`. + :see: :js:func:`core.rfc850_date`. + :see: :js:func:`core.asctime_date`. + :see: https://tools.ietf.org/html/rfc7231#section-7.1.1.1 + +.. js:function:: core.imf_date(date) + + **context**: body, init, task, action + + This function take a string representing IMF date, and returns an integer + containing the corresponding date with a epoch format. + + :param string date: a date IMF formatted + :returns: integer containing epoch date + :see: https://tools.ietf.org/html/rfc7231#section-7.1.1.1 + + The IMF format is like this: + +.. code-block:: text + + Sun, 06 Nov 1994 08:49:37 GMT +.. + +.. js:function:: core.rfc850_date(date) + + **context**: body, init, task, action + + This function take a string representing RFC850 date, and returns an integer + containing the corresponding date with a epoch format. + + :param string date: a date RFC859 formatted + :returns: integer containing epoch date + :see: https://tools.ietf.org/html/rfc7231#section-7.1.1.1 + + The RFC850 format is like this: + +.. code-block:: text + + Sunday, 06-Nov-94 08:49:37 GMT +.. + +.. js:function:: core.asctime_date(date) + + **context**: body, init, task, action + + This function take a string representing ASCTIME date, and returns an integer + containing the corresponding date with a epoch format. + + :param string date: a date ASCTIME formatted + :returns: integer containing epoch date + :see: https://tools.ietf.org/html/rfc7231#section-7.1.1.1 + + The ASCTIME format is like this: + +.. code-block:: text + + Sun Nov 6 08:49:37 1994 +.. + +.. js:function:: core.rfc850_date(date) + + **context**: body, init, task, action + + This function take a string representing http date, and returns an integer + containing the corresponding date with a epoch format. + + :param string date: a date http-date formatted + +.. js:function:: core.asctime_date(date) + + **context**: body, init, task, action + + This function take a string representing http date, and returns an integer + containing the corresponding date with a epoch format. + + :param string date: a date http-date formatted + +.. js:function:: core.msleep(milliseconds) + + **context**: body, init, task, action + + The `core.msleep()` stops the Lua execution between specified milliseconds. + + :param integer milliseconds: the required milliseconds. + +.. js:attribute:: core.proxies + + **context**: body, init, task, action, sample-fetch, converter + + Proxies is a table containing the list of all proxies declared in the + configuration file. The table is indexed by the proxy name, and each entry + of the proxies table is an object of type :ref:`proxy_class`. + + .. warning:: + if you have declared a frontend and backend with the same name, only one of + these are listed. + +.. js:function:: core.register_action(name, actions, func [, nb_args]) + + **context**: body + + Register a Lua function executed as action. All the registered action can be + used in HAProxy with the prefix "lua.". An action gets a TXN object class as + input. + + :param string name: is the name of the converter. + :param table actions: is a table of string describing the HAProxy actions who + want to register to. The expected actions are 'tcp-req', + 'tcp-res', 'http-req' or 'http-res'. + :param integer nb_args: is the expected number of argument for the action. + By default the value is 0. + :param function func: is the Lua function called to work as converter. + + The prototype of the Lua function used as argument is: + +.. code-block:: lua + + function(txn [, arg1 [, arg2]]) +.. + + * **txn** (:ref:`txn_class`): this is a TXN object used for manipulating the + current request or TCP stream. + + * **argX**: this is argument provided through the HAProxy configuration file. + + Here, an example of action registration. The action just send an 'Hello world' + in the logs. + +.. code-block:: lua + + core.register_action("hello-world", { "tcp-req", "http-req" }, function(txn) + txn:Info("Hello world") + end) +.. + + This example code is used in HAProxy configuration like this: + +:: + + frontend tcp_frt + mode tcp + tcp-request content lua.hello-world + + frontend http_frt + mode http + http-request lua.hello-world +.. + + A second example using arguments + +.. code-block:: lua + + function hello_world(txn, arg) + txn:Info("Hello world for " .. arg) + end + core.register_action("hello-world", { "tcp-req", "http-req" }, hello_world, 2) +.. + + This example code is used in HAProxy configuration like this: + +:: + + frontend tcp_frt + mode tcp + tcp-request content lua.hello-world everybody +.. +.. js:function:: core.register_converters(name, func) + + **context**: body + + Register a Lua function executed as converter. All the registered converters + can be used in HAProxy with the prefix "lua.". An converter get a string as + input and return a string as output. The registered function can take up to 9 + values as parameter. All the value are strings. + + :param string name: is the name of the converter. + :param function func: is the Lua function called to work as converter. + + The prototype of the Lua function used as argument is: + +.. code-block:: lua + + function(str, [p1 [, p2 [, ... [, p5]]]]) +.. + + * **str** (*string*): this is the input value automatically converted in + string. + * **p1** .. **p5** (*string*): this is a list of string arguments declared in + the HAProxy configuration file. The number of arguments doesn't exceed 5. + The order and the nature of these is conventionally choose by the + developer. + +.. js:function:: core.register_fetches(name, func) + + **context**: body + + Register a Lua function executed as sample fetch. All the registered sample + fetch can be used in HAProxy with the prefix "lua.". A Lua sample fetch + return a string as output. The registered function can take up to 9 values as + parameter. All the value are strings. + + :param string name: is the name of the converter. + :param function func: is the Lua function called to work as sample fetch. + + The prototype of the Lua function used as argument is: + +.. code-block:: lua + + string function(txn, [p1 [, p2 [, ... [, p5]]]]) +.. + + * **txn** (:ref:`txn_class`): this is the txn object associated with the current + request. + * **p1** .. **p5** (*string*): this is a list of string arguments declared in + the HAProxy configuration file. The number of arguments doesn't exceed 5. + The order and the nature of these is conventionally choose by the + developer. + * **Returns**: A string containing some data, or nil if the value cannot be + returned now. + + lua example code: + +.. code-block:: lua + + core.register_fetches("hello", function(txn) + return "hello" + end) +.. + + HAProxy example configuration: + +:: + + frontend example + http-request redirect location /%[lua.hello] + +.. js:function:: core.register_filter(name, Flt, func) + + **context**: body + + Register a Lua function used to declare a filter. All the registered filters + can by used in HAProxy with the prefix "lua.". + + :param string name: is the name of the filter. + :param table Flt: is a Lua class containing the filter definition (id, flags, + callbacks). + :param function func: is the Lua function called to create the Lua filter. + + The prototype of the Lua function used as argument is: + +.. code-block:: lua + + function(flt, args) +.. + + * **flt** : Is a filter object based on the class provided in + :js:func:`core.register_filter()` function. + + * **args**: Is a table of strings containing all arguments provided through + the HAProxy configuration file, on the filter line. + + It must return the filter to use or nil to ignore it. Here, an example of + filter registration. + +.. code-block:: lua + + core.register_filter("my-filter", MyFilter, function(flt, args) + flt.args = args -- Save arguments + return flt + end) +.. + + This example code is used in HAProxy configuration like this: + +:: + + frontend http + mode http + filter lua.my-filter arg1 arg2 arg3 +.. + + :see: :js:class:`Filter` + +.. js:function:: core.register_service(name, mode, func) + + **context**: body + + Register a Lua function executed as a service. All the registered service can + be used in HAProxy with the prefix "lua.". A service gets an object class as + input according with the required mode. + + :param string name: is the name of the converter. + :param string mode: is string describing the required mode. Only 'tcp' or + 'http' are allowed. + :param function func: is the Lua function called to work as converter. + + The prototype of the Lua function used as argument is: + +.. code-block:: lua + + function(applet) +.. + + * **applet** *applet* will be a :ref:`applettcp_class` or a + :ref:`applethttp_class`. It depends the type of registered applet. An applet + registered with the 'http' value for the *mode* parameter will gets a + :ref:`applethttp_class`. If the *mode* value is 'tcp', the applet will gets + a :ref:`applettcp_class`. + + .. warning:: + Applets of type 'http' cannot be called from 'tcp-*' rulesets. Only the + 'http-*' rulesets are authorized, this means that is not possible to call + an HTTP applet from a proxy in tcp mode. Applets of type 'tcp' can be + called from anywhere. + + Here, an example of service registration. The service just send an 'Hello world' + as an http response. + +.. code-block:: lua + + core.register_service("hello-world", "http", function(applet) + local response = "Hello World !" + applet:set_status(200) + applet:add_header("content-length", string.len(response)) + applet:add_header("content-type", "text/plain") + applet:start_response() + applet:send(response) + end) +.. + + This example code is used in HAProxy configuration like this: + +:: + + frontend example + http-request use-service lua.hello-world + +.. js:function:: core.register_init(func) + + **context**: body + + Register a function executed after the configuration parsing. This is useful + to check any parameters. + + :param function func: is the Lua function called to work as initializer. + + The prototype of the Lua function used as argument is: + +.. code-block:: lua + + function() +.. + + It takes no input, and no output is expected. + +.. js:function:: core.register_task(func) + + **context**: body, init, task, action, sample-fetch, converter + + Register and start independent task. The task is started when the HAProxy + main scheduler starts. For example this type of tasks can be executed to + perform complex health checks. + + :param function func: is the Lua function called to work as initializer. + + The prototype of the Lua function used as argument is: + +.. code-block:: lua + + function() +.. + + It takes no input, and no output is expected. + +.. js:function:: core.register_cli([path], usage, func) + + **context**: body + + Register and start independent task. The task is started when the HAProxy + main scheduler starts. For example this type of tasks can be executed to + perform complex health checks. + + :param array path: is the sequence of word for which the cli execute the Lua + binding. + :param string usage: is the usage message displayed in the help. + :param function func: is the Lua function called to handle the CLI commands. + + The prototype of the Lua function used as argument is: + +.. code-block:: lua + + function(AppletTCP, [arg1, [arg2, [...]]]) +.. + + I/O are managed with the :ref:`applettcp_class` object. Args are given as + parameter. The args embed the registered path. If the path is declared like + this: + +.. code-block:: lua + + core.register_cli({"show", "ssl", "stats"}, "Display SSL stats..", function(applet, arg1, arg2, arg3, arg4, arg5) + end) +.. + + And we execute this in the prompt: + +.. code-block:: text + + > prompt + > show ssl stats all +.. + + Then, arg1, arg2 and arg3 will contains respectively "show", "ssl" and "stats". + arg4 will contain "all". arg5 contains nil. + +.. js:function:: core.set_nice(nice) + + **context**: task, action, sample-fetch, converter + + Change the nice of the current task or current session. + + :param integer nice: the nice value, it must be between -1024 and 1024. + +.. js:function:: core.set_map(filename, key, value) + + **context**: init, task, action, sample-fetch, converter + + Set the value *value* associated to the key *key* in the map referenced by + *filename*. + + :param string filename: the Map reference + :param string key: the key to set or replace + :param string value: the associated value + +.. js:function:: core.sleep(int seconds) + + **context**: body, init, task, action + + The `core.sleep()` functions stop the Lua execution between specified seconds. + + :param integer seconds: the required seconds. + +.. js:function:: core.tcp() + + **context**: init, task, action + + This function returns a new object of a *socket* class. + + :returns: A :ref:`socket_class` object. + +.. js:function:: core.httpclient() + + **context**: init, task, action + + This function returns a new object of a *httpclient* class. + + :returns: A :ref:`httpclient_class` object. + +.. js:function:: core.concat() + + **context**: body, init, task, action, sample-fetch, converter + + This function returns a new concat object. + + :returns: A :ref:`concat_class` object. + +.. js:function:: core.done(data) + + **context**: body, init, task, action, sample-fetch, converter + + :param any data: Return some data for the caller. It is useful with + sample-fetches and sample-converters. + + Immediately stops the current Lua execution and returns to the caller which + may be a sample fetch, a converter or an action and returns the specified + value (ignored for actions and init). It is used when the LUA process finishes + its work and wants to give back the control to HAProxy without executing the + remaining code. It can be seen as a multi-level "return". + +.. js:function:: core.yield() + + **context**: task, action, sample-fetch, converter + + Give back the hand at the HAProxy scheduler. It is used when the LUA + processing consumes a lot of processing time. + +.. js:function:: core.parse_addr(address) + + **context**: body, init, task, action, sample-fetch, converter + + :param network: is a string describing an ipv4 or ipv6 address and optionally + its network length, like this: "127.0.0.1/8" or "aaaa::1234/32". + :returns: a userdata containing network or nil if an error occurs. + + Parse ipv4 or ipv6 addresses and its facultative associated network. + +.. js:function:: core.match_addr(addr1, addr2) + + **context**: body, init, task, action, sample-fetch, converter + + :param addr1: is an address created with "core.parse_addr". + :param addr2: is an address created with "core.parse_addr". + :returns: boolean, true if the network of the addresses match, else returns + false. + + Match two networks. For example "127.0.0.1/32" matches "127.0.0.0/8". The order + of network is not important. + +.. js:function:: core.tokenize(str, separators [, noblank]) + + **context**: body, init, task, action, sample-fetch, converter + + This function is useful for tokenizing an entry, or splitting some messages. + :param string str: The string which will be split. + :param string separators: A string containing a list of separators. + :param boolean noblank: Ignore empty entries. + :returns: an array of string. + + For example: + +.. code-block:: lua + + local array = core.tokenize("This function is useful, for tokenizing an entry.", "., ", true) + print_r(array) +.. + + Returns this array: + +.. code-block:: text + + (table) table: 0x21c01e0 [ + 1: (string) "This" + 2: (string) "function" + 3: (string) "is" + 4: (string) "useful" + 5: (string) "for" + 6: (string) "tokenizing" + 7: (string) "an" + 8: (string) "entry" + ] +.. + +.. _proxy_class: + +Proxy class +============ + +.. js:class:: Proxy + + This class provides a way for manipulating proxy and retrieving information + like statistics. + +.. js:attribute:: Proxy.name + + Contain the name of the proxy. + +.. js:attribute:: Proxy.uuid + + Contain the unique identifier of the proxy. + +.. js:attribute:: Proxy.servers + + Contain a table with the attached servers. The table is indexed by server + name, and each server entry is an object of type :ref:`server_class`. + +.. js:attribute:: Proxy.stktable + + Contains a stick table object attached to the proxy. + +.. js:attribute:: Proxy.listeners + + Contain a table with the attached listeners. The table is indexed by listener + name, and each each listeners entry is an object of type + :ref:`listener_class`. + +.. js:function:: Proxy.pause(px) + + Pause the proxy. See the management socket documentation for more information. + + :param class_proxy px: A :ref:`proxy_class` which indicates the manipulated + proxy. + +.. js:function:: Proxy.resume(px) + + Resume the proxy. See the management socket documentation for more + information. + + :param class_proxy px: A :ref:`proxy_class` which indicates the manipulated + proxy. + +.. js:function:: Proxy.stop(px) + + Stop the proxy. See the management socket documentation for more information. + + :param class_proxy px: A :ref:`proxy_class` which indicates the manipulated + proxy. + +.. js:function:: Proxy.shut_bcksess(px) + + Kill the session attached to a backup server. See the management socket + documentation for more information. + + :param class_proxy px: A :ref:`proxy_class` which indicates the manipulated + proxy. + +.. js:function:: Proxy.get_cap(px) + + Returns a string describing the capabilities of the proxy. + + :param class_proxy px: A :ref:`proxy_class` which indicates the manipulated + proxy. + :returns: a string "frontend", "backend", "proxy" or "ruleset". + +.. js:function:: Proxy.get_mode(px) + + Returns a string describing the mode of the current proxy. + + :param class_proxy px: A :ref:`proxy_class` which indicates the manipulated + proxy. + :returns: a string "tcp", "http", "health" or "unknown" + +.. js:function:: Proxy.get_stats(px) + + Returns a table containing the proxy statistics. The statistics returned are + not the same if the proxy is frontend or a backend. + + :param class_proxy px: A :ref:`proxy_class` which indicates the manipulated + proxy. + :returns: a key/value table containing stats + +.. _server_class: + +Server class +============ + +.. js:class:: Server + + This class provides a way for manipulating servers and retrieving information. + +.. js:attribute:: Server.name + + Contain the name of the server. + +.. js:attribute:: Server.puid + + Contain the proxy unique identifier of the server. + +.. js:function:: Server.is_draining(sv) + + Return true if the server is currently draining sticky connections. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + :returns: a boolean + +.. js:function:: Server.set_maxconn(sv, weight) + + Dynamically change the maximum connections of the server. See the management + socket documentation for more information about the format of the string. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + :param string maxconn: A string describing the server maximum connections. + +.. js:function:: Server.get_maxconn(sv, weight) + + This function returns an integer representing the server maximum connections. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + :returns: an integer. + +.. js:function:: Server.set_weight(sv, weight) + + Dynamically change the weight of the server. See the management socket + documentation for more information about the format of the string. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + :param string weight: A string describing the server weight. + +.. js:function:: Server.get_weight(sv) + + This function returns an integer representing the server weight. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + :returns: an integer. + +.. js:function:: Server.set_addr(sv, addr[, port]) + + Dynamically change the address of the server. See the management socket + documentation for more information about the format of the string. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + :param string addr: A string describing the server address. + +.. js:function:: Server.get_addr(sv) + + Returns a string describing the address of the server. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + :returns: A string + +.. js:function:: Server.get_stats(sv) + + Returns server statistics. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + :returns: a key/value table containing stats + +.. js:function:: Server.shut_sess(sv) + + Shutdown all the sessions attached to the server. See the management socket + documentation for more information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.set_drain(sv) + + Drain sticky sessions. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.set_maint(sv) + + Set maintenance mode. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.set_ready(sv) + + Set normal mode. See the management socket documentation for more information + about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.check_enable(sv) + + Enable health checks. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.check_disable(sv) + + Disable health checks. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.check_force_up(sv) + + Force health-check up. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.check_force_nolb(sv) + + Force health-check nolb mode. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.check_force_down(sv) + + Force health-check down. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.agent_enable(sv) + + Enable agent check. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.agent_disable(sv) + + Disable agent check. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.agent_force_up(sv) + + Force agent check up. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. js:function:: Server.agent_force_down(sv) + + Force agent check down. See the management socket documentation for more + information about this function. + + :param class_server sv: A :ref:`server_class` which indicates the manipulated + server. + +.. _listener_class: + +Listener class +============== + +.. js:function:: Listener.get_stats(ls) + + Returns server statistics. + + :param class_listener ls: A :ref:`listener_class` which indicates the + manipulated listener. + :returns: a key/value table containing stats + +.. _concat_class: + +Concat class +============ + +.. js:class:: Concat + + This class provides a fast way for string concatenation. The way using native + Lua concatenation like the code below is slow for some reasons. + +.. code-block:: lua + + str = "string1" + str = str .. ", string2" + str = str .. ", string3" +.. + + For each concatenation, Lua: + * allocate memory for the result, + * catenate the two string copying the strings in the new memory block, + * free the old memory block containing the string which is no longer used. + This process does many memory move, allocation and free. In addition, the + memory is not really freed, it is just mark mark as unused and wait for the + garbage collector. + + The Concat class provide an alternative way to concatenate strings. It uses + the internal Lua mechanism (it does not allocate memory), but it doesn't copy + the data more than once. + + On my computer, the following loops spends 0.2s for the Concat method and + 18.5s for the pure Lua implementation. So, the Concat class is about 1000x + faster than the embedded solution. + +.. code-block:: lua + + for j = 1, 100 do + c = core.concat() + for i = 1, 20000 do + c:add("#####") + end + end +.. + +.. code-block:: lua + + for j = 1, 100 do + c = "" + for i = 1, 20000 do + c = c .. "#####" + end + end +.. + +.. js:function:: Concat.add(concat, string) + + This function adds a string to the current concatenated string. + + :param class_concat concat: A :ref:`concat_class` which contains the currently + built string. + :param string string: A new string to concatenate to the current built + string. + +.. js:function:: Concat.dump(concat) + + This function returns the concatenated string. + + :param class_concat concat: A :ref:`concat_class` which contains the currently + built string. + :returns: the concatenated string + +.. _fetches_class: + +Fetches class +============= + +.. js:class:: Fetches + + This class contains a lot of internal HAProxy sample fetches. See the + HAProxy "configuration.txt" documentation for more information about her + usage. They are the chapters 7.3.2 to 7.3.6. + + .. warning:: + some sample fetches are not available in some context. These limitations + are specified in this documentation when they're useful. + + :see: :js:attr:`TXN.f` + :see: :js:attr:`TXN.sf` + + Fetches are useful for: + + * get system time, + * get environment variable, + * get random numbers, + * known backend status like the number of users in queue or the number of + connections established, + * client information like ip source or destination, + * deal with stick tables, + * Established SSL information, + * HTTP information like headers or method. + +.. code-block:: lua + + function action(txn) + -- Get source IP + local clientip = txn.f:src() + end +.. + +.. _converters_class: + +Converters class +================ + +.. js:class:: Converters + + This class contains a lot of internal HAProxy sample converters. See the + HAProxy documentation "configuration.txt" for more information about her + usage. Its the chapter 7.3.1. + + :see: :js:attr:`TXN.c` + :see: :js:attr:`TXN.sc` + + Converters provides statefull transformation. They are useful for: + + * converting input to base64, + * applying hash on input string (djb2, crc32, sdbm, wt6), + * format date, + * json escape, + * extracting preferred language comparing two lists, + * turn to lower or upper chars, + * deal with stick tables. + +.. _channel_class: + +Channel class +============= + +.. js:class:: Channel + + **context**: action, sample-fetch, convert, filter + + HAProxy uses two buffers for the processing of the requests. The first one is + used with the request data (from the client to the server) and the second is + used for the response data (from the server to the client). + + Each buffer contains two types of data. The first type is the incoming data + waiting for a processing. The second part is the outgoing data already + processed. Usually, the incoming data is processed, after it is tagged as + outgoing data, and finally it is sent. The following functions provides tools + for manipulating these data in a buffer. + + The following diagram shows where the channel class function are applied. + + .. image:: _static/channel.png + + .. warning:: + It is not possible to read from the response in request action, and it is + not possible to read from the request channel in response action. + + .. warning:: + It is forbidden to alter the Channels buffer from HTTP contexts. So only + :js:func:`Channel.input`, :js:func:`Channel.output`, + :js:func:`Channel.may_recv`, :js:func:`Channel.is_full` and + :js:func:`Channel.is_resp` can be called from an HTTP conetext. + + All the functions provided by this class are available in the + **sample-fetches**, **actions** and **filters** contexts. For **filters**, + incoming data (offset and length) are relative to the filter. Some functions + may yield, but only for **actions**. Yield is not possible for + **sample-fetches**, **converters** and **filters**. + +.. js:function:: Channel.append(channel, string) + + This function copies the string **string** at the end of incoming data of the + channel buffer. The function returns the copied length on success or -1 if + data cannot be copied. + + Same that :js:func:`Channel.insert(channel, string, channel:input())`. + + :param class_channel channel: The manipulated Channel. + :param string string: The data to copied into incoming data. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: Channel.data(channel [, offset [, length]]) + + This function returns **length** bytes of incoming data from the channel + buffer, starting at the offset **offset**. The data are not removed from the + buffer. + + By default, if no length is provided, all incoming data found, starting at the + given offset, are returned. If **length** is set to -1, the function tries to + retrieve a maximum of data and, if called by an action, it yields if + necessary. It also waits for more data if the requested length exceeds the + available amount of incoming data. Do not providing an offset is the same that + setting it to 0. A positive offset is relative to the beginning of incoming + data of the channel buffer while negative offset is relative to their end. + + If there is no incoming data and the channel can't receive more data, a 'nil' + value is returned. + + :param class_channel channel: The manipulated Channel. + :param integer offset: *optional* The offset in incoming data to start to get + data. 0 by default. May be negative to be relative to + the end of incoming data. + :param integer length: *optional* The expected length of data to retrieve. All + incoming data by default. May be set to -1 to get a + maximum of data. + :returns: a string containing the data found or nil. + +.. js:function:: Channel.forward(channel, length) + + This function forwards **length** bytes of data from the channel buffer. If + the requested length exceeds the available amount of incoming data, and if + called by an action, the function yields, waiting for more data to forward. It + returns the amount of data forwarded. + + :param class_channel channel: The manipulated Channel. + :param integer int: The amount of data to forward. + +.. js:function:: Channel.input(channel) + + This function returns the length of incoming data in the channel buffer. When + called by a filter, this value is relative to the filter. + + :param class_channel channel: The manipulated Channel. + :returns: an integer containing the amount of available bytes. + +.. js:function:: Channel.insert(channel, string [, offset]) + + This function copies the string **string** at the offset **offset** in + incoming data of the channel buffer. The function returns the copied length on + success or -1 if data cannot be copied. + + By default, if no offset is provided, the string is copied in front of + incoming data. A positive offset is relative to the beginning of incoming data + of the channel buffer while negative offset is relative to their end. + + :param class_channel channel: The manipulated Channel. + :param string string: The data to copied into incoming data. + :param integer offset: *optional* The offset in incomding data where to copied + data. 0 by default. May be negative to be relative to + the end of incoming data. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: Channel.is_full(channel) + + This function returns true if the channel buffer is full. + + :param class_channel channel: The manipulated Channel. + :returns: a boolean + +.. js:function:: Channel.is_resp(channel) + + This function returns true if the channel is the response one. + + :param class_channel channel: The manipulated Channel. + :returns: a boolean + +.. js:function:: Channel.line(channel [, offset [, length]]) + + This function parses **length** bytes of incoming data of the channel buffer, + starting at offset **offset**, and returns the first line found, including the + '\\n'. The data are not removed from the buffer. If no line is found, all data + are returned. + + By default, if no length is provided, all incoming data, starting at the given + offset, are evaluated. If **length** is set to -1, the function tries to + retrieve a maximum of data and, if called by an action, yields if + necessary. It also waits for more data if the requested length exceeds the + available amount of incoming data. Do not providing an offset is the same that + setting it to 0. A positive offset is relative to the beginning of incoming + data of the channel buffer while negative offset is relative to their end. + + If there is no incoming data and the channel can't receive more data, a 'nil' + value is returned. + + :param class_channel channel: The manipulated Channel. + :param integer offset: *optional* The offset in incomding data to start to + parse data. 0 by default. May be negative to be + relative to the end of incoming data. + :param integer length: *optional* The length of data to parse. All incoming + data by default. May be set to -1 to get a maximum of + data. + :returns: a string containing the line found or nil. + +.. js:function:: Channel.may_recv(channel) + + This function returns true if the channel may still receive data. + + :param class_channel channel: The manipulated Channel. + :returns: a boolean + +.. js:function:: Channel.output(channel) + + This function returns the length of outgoing data of the channel buffer. When + called by a filter, this value is relative to the filter. + + :param class_channel channel: The manipulated Channel. + :returns: an integer containing the amount of available bytes. + +.. js:function:: Channel.prepend(channel, string) + + This function copies the string **string** in front of incoming data of the + channel buffer. The function returns the copied length on success or -1 if + data cannot be copied. + + Same that :js:func:`Channel.insert(channel, string, 0)`. + + :param class_channel channel: The manipulated Channel. + :param string string: The data to copied into incoming data. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: Channel.remove(channel [, offset [, length]]) + + This function removes **length** bytes of incoming data of the channel buffer, + starting at offset **offset**. This function returns number of bytes removed + on success. + + By default, if no length is provided, all incoming data, starting at the given + offset, are removed. Do not providing an offset is the same that setting it + to 0. A positive offset is relative to the beginning of incoming data of the + channel buffer while negative offset is relative to their end. + + :param class_channel channel: The manipulated Channel. + :param integer offset: *optional* The offset in incomding data where to start + to remove data. 0 by default. May be negative to + be relative to the end of incoming data. + :param integer length: *optional* The length of data to remove. All incoming + data by default. + :returns: an integer containing the amount of bytes removed. + +.. js:function:: Channel.send(channel, string) + + This function requires immediate send of the string **string**. It means the + string is copied at the beginning of incoming data of the channel buffer and + immediately forwarded. Unless if the connection is close, and if called by an + action, this function yields to copied and forward all the string. + + :param class_channel channel: The manipulated Channel. + :param string string: The data to send. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: Channel.set(channel, string [, offset [, length]]) + + This function replace **length** bytes of incoming data of the channel buffer, + starting at offset **offset**, by the string **string**. The function returns + the copied length on success or -1 if data cannot be copied. + + By default, if no length is provided, all incoming data, starting at the given + offset, are replaced. Do not providing an offset is the same that setting it + to 0. A positive offset is relative to the beginning of incoming data of the + channel buffer while negative offset is relative to their end. + + :param class_channel channel: The manipulated Channel. + :param string string: The data to copied into incoming data. + :param integer offset: *optional* The offset in incomding data where to start + the data replacement. 0 by default. May be negative to + be relative to the end of incoming data. + :param integer length: *optional* The length of data to replace. All incoming + data by default. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: Channel.dup(channel) + + **DEPRECATED** + + This function returns all incoming data found in the channel buffer. The data + are not removed from the buffer and can be reprocessed later. + + If there is no incoming data and the channel can't receive more data, a 'nil' + value is returned. + + :param class_channel channel: The manipulated Channel. + :returns: a string containing all data found or nil. + + .. warning:: + This function is deprecated. :js:func:`Channel.data()` must be used + instead. + +.. js:function:: Channel.get(channel) + + **DEPRECATED** + + This function returns all incoming data found in the channel buffer and remove + them from the buffer. + + If there is no incoming data and the channel can't receive more data, a 'nil' + value is returned. + + :param class_channel channel: The manipulated Channel. + :returns: a string containing all the data found or nil. + + .. warning:: + This function is deprecated. :js:func:`Channel.data()` must be used to + retrieve data followed by a call to :js:func:`Channel:remove()` to remove + data. + + .. code-block:: lua + + local data = chn:data() + chn:remove(0, data:len()) + + .. + +.. js:function:: Channel.getline(channel) + + **DEPRECATED** + + This function returns the first line found in incoming data of the channel + buffer, including the '\\n'. The returned data are removed from the buffer. If + no line is found, and if called by an action, this function yields to wait for + more data, except if the channel can't receive more data. In this case all + data are returned. + + If there is no incoming data and the channel can't receive more data, a 'nil' + value is returned. + + :param class_channel channel: The manipulated Channel. + :returns: a string containing the line found or nil. + + .. warning:: + This function is deprecated. :js:func:`Channel.line()` must be used to + retrieve a line followed by a call to :js:func:`Channel:remove()` to remove + data. + + .. code-block:: lua + + local line = chn:line(0, -1) + chn:remove(0, line:len()) + + .. + +.. js:function:: Channel.get_in_len(channel) + + **DEPRECATED** + + This function returns the length of the input part of the buffer. When called + by a filter, this value is relative to the filter. + + :param class_channel channel: The manipulated Channel. + :returns: an integer containing the amount of available bytes. + + .. warning:: + This function is deprecated. :js:func:`Channel.input()` must be used + instead. + +.. js:function:: Channel.get_out_len(channel) + + **DEPRECATED** + + This function returns the length of the output part of the buffer. When called + by a filter, this value is relative to the filter. + + :param class_channel channel: The manipulated Channel. + :returns: an integer containing the amount of available bytes. + + .. warning:: + This function is deprecated. :js:func:`Channel.output()` must be used + instead. + +.. _http_class: + +HTTP class +========== + +.. js:class:: HTTP + + This class contain all the HTTP manipulation functions. + +.. js:function:: HTTP.req_get_headers(http) + + Returns a table containing all the request headers. + + :param class_http http: The related http object. + :returns: table of headers. + :see: :js:func:`HTTP.res_get_headers` + + This is the form of the returned table: + +.. code-block:: lua + + HTTP:req_get_headers()['<header-name>'][<header-index>] = "<header-value>" + + local hdr = HTTP:req_get_headers() + hdr["host"][0] = "www.test.com" + hdr["accept"][0] = "audio/basic q=1" + hdr["accept"][1] = "audio/*, q=0.2" + hdr["accept"][2] = "*/*, q=0.1" +.. + +.. js:function:: HTTP.res_get_headers(http) + + Returns a table containing all the response headers. + + :param class_http http: The related http object. + :returns: table of headers. + :see: :js:func:`HTTP.req_get_headers` + + This is the form of the returned table: + +.. code-block:: lua + + HTTP:res_get_headers()['<header-name>'][<header-index>] = "<header-value>" + + local hdr = HTTP:req_get_headers() + hdr["host"][0] = "www.test.com" + hdr["accept"][0] = "audio/basic q=1" + hdr["accept"][1] = "audio/*, q=0.2" + hdr["accept"][2] = "*.*, q=0.1" +.. + +.. js:function:: HTTP.req_add_header(http, name, value) + + Appends an HTTP header field in the request whose name is + specified in "name" and whose value is defined in "value". + + :param class_http http: The related http object. + :param string name: The header name. + :param string value: The header value. + :see: :js:func:`HTTP.res_add_header` + +.. js:function:: HTTP.res_add_header(http, name, value) + + Appends an HTTP header field in the response whose name is + specified in "name" and whose value is defined in "value". + + :param class_http http: The related http object. + :param string name: The header name. + :param string value: The header value. + :see: :js:func:`HTTP.req_add_header` + +.. js:function:: HTTP.req_del_header(http, name) + + Removes all HTTP header fields in the request whose name is + specified in "name". + + :param class_http http: The related http object. + :param string name: The header name. + :see: :js:func:`HTTP.res_del_header` + +.. js:function:: HTTP.res_del_header(http, name) + + Removes all HTTP header fields in the response whose name is + specified in "name". + + :param class_http http: The related http object. + :param string name: The header name. + :see: :js:func:`HTTP.req_del_header` + +.. js:function:: HTTP.req_set_header(http, name, value) + + This variable replace all occurrence of all header "name", by only + one containing the "value". + + :param class_http http: The related http object. + :param string name: The header name. + :param string value: The header value. + :see: :js:func:`HTTP.res_set_header` + + This function does the same work as the following code: + +.. code-block:: lua + + function fcn(txn) + TXN.http:req_del_header("header") + TXN.http:req_add_header("header", "value") + end +.. + +.. js:function:: HTTP.res_set_header(http, name, value) + + This variable replace all occurrence of all header "name", by only + one containing the "value". + + :param class_http http: The related http object. + :param string name: The header name. + :param string value: The header value. + :see: :js:func:`HTTP.req_rep_header()` + +.. js:function:: HTTP.req_rep_header(http, name, regex, replace) + + Matches the regular expression in all occurrences of header field "name" + according to "regex", and replaces them with the "replace" argument. The + replacement value can contain back references like \1, \2, ... This + function works with the request. + + :param class_http http: The related http object. + :param string name: The header name. + :param string regex: The match regular expression. + :param string replace: The replacement value. + :see: :js:func:`HTTP.res_rep_header()` + +.. js:function:: HTTP.res_rep_header(http, name, regex, string) + + Matches the regular expression in all occurrences of header field "name" + according to "regex", and replaces them with the "replace" argument. The + replacement value can contain back references like \1, \2, ... This + function works with the request. + + :param class_http http: The related http object. + :param string name: The header name. + :param string regex: The match regular expression. + :param string replace: The replacement value. + :see: :js:func:`HTTP.req_rep_header()` + +.. js:function:: HTTP.req_set_method(http, method) + + Rewrites the request method with the parameter "method". + + :param class_http http: The related http object. + :param string method: The new method. + +.. js:function:: HTTP.req_set_path(http, path) + + Rewrites the request path with the "path" parameter. + + :param class_http http: The related http object. + :param string path: The new path. + +.. js:function:: HTTP.req_set_query(http, query) + + Rewrites the request's query string which appears after the first question + mark ("?") with the parameter "query". + + :param class_http http: The related http object. + :param string query: The new query. + +.. js:function:: HTTP.req_set_uri(http, uri) + + Rewrites the request URI with the parameter "uri". + + :param class_http http: The related http object. + :param string uri: The new uri. + +.. js:function:: HTTP.res_set_status(http, status [, reason]) + + Rewrites the response status code with the parameter "code". + + If no custom reason is provided, it will be generated from the status. + + :param class_http http: The related http object. + :param integer status: The new response status code. + :param string reason: The new response reason (optional). + +.. _httpclient_class: + +HTTPClient class +================ + +.. js:class:: HTTPClient + + The httpclient class allows issue of outbound HTTP requests through a simple + API without the knowledge of HAProxy internals. + +.. js:function:: HTTPClient.get(httpclient, request) +.. js:function:: HTTPClient.head(httpclient, request) +.. js:function:: HTTPClient.put(httpclient, request) +.. js:function:: HTTPClient.post(httpclient, request) +.. js:function:: HTTPClient.delete(httpclient, request) + + Send an HTTP request and wait for a response. GET, HEAD PUT, POST and DELETE methods can be used. + The HTTPClient will send asynchronously the data and is able to send and receive more than an HAProxy bufsize. + + The HTTPClient interface is not able to decompress responses, it is not + recommended to send an Accept-Encoding in the request so the response is + received uncompressed. + + :param class httpclient: Is the manipulated HTTPClient. + :param table request: Is a table containing the parameters of the request that will be send. + :param string request.url: Is a mandatory parameter for the request that contains the URL. + :param string request.body: Is an optional parameter for the request that contains the body to send. + :param table request.headers: Is an optional parameter for the request that contains the headers to send. + :param string request.dst: Is an optional parameter for the destination in haproxy address format. + :param integer request.timeout: Optional timeout parameter, set a "timeout server" on the connections. + :returns: Lua table containing the response + + +.. code-block:: lua + + local httpclient = core.httpclient() + local response = httpclient:post{url="http://127.0.0.1", body=body, dst="unix@/var/run/http.sock"} + +.. + +.. code-block:: lua + + response = { + status = 400, + reason = "Bad request", + headers = { + ["content-type"] = { "text/html" }, + ["cache-control"] = { "no-cache", "no-store" }, + }, + body = "<html><body><h1>invalid request<h1></body></html>", + } +.. + + +.. _txn_class: + +TXN class +========= + +.. js:class:: TXN + + The txn class contain all the functions relative to the http or tcp + transaction (Note than a tcp stream is the same than a tcp transaction, but + an HTTP transaction is not the same than a tcp stream). + + The usage of this class permits to retrieve data from the requests, alter it + and forward it. + + All the functions provided by this class are available in the context + **sample-fetches**, **actions** and **filters**. + +.. js:attribute:: TXN.c + + :returns: An :ref:`converters_class`. + + This attribute contains a Converters class object. + +.. js:attribute:: TXN.sc + + :returns: An :ref:`converters_class`. + + This attribute contains a Converters class object. The functions of + this object returns always a string. + +.. js:attribute:: TXN.f + + :returns: An :ref:`fetches_class`. + + This attribute contains a Fetches class object. + +.. js:attribute:: TXN.sf + + :returns: An :ref:`fetches_class`. + + This attribute contains a Fetches class object. The functions of + this object returns always a string. + +.. js:attribute:: TXN.req + + :returns: An :ref:`channel_class`. + + This attribute contains a channel class object for the request buffer. + +.. js:attribute:: TXN.res + + :returns: An :ref:`channel_class`. + + This attribute contains a channel class object for the response buffer. + +.. js:attribute:: TXN.http + + :returns: An :ref:`http_class`. + + This attribute contains an HTTP class object. It is available only if the + proxy has the "mode http" enabled. + +.. js:attribute:: TXN.http_req + + :returns: An :ref:`httpmessage_class`. + + This attribute contains the request HTTPMessage class object. It is available + only if the proxy has the "mode http" enabled and only in the **filters** + context. + +.. js:attribute:: TXN.http_res + + :returns: An :ref:`httpmessage_class`. + + This attribute contains the response HTTPMessage class object. It is available + only if the proxy has the "mode http" enabled and only in the **filters** + context. + +.. js:function:: TXN.log(TXN, loglevel, msg) + + This function sends a log. The log is sent, according with the HAProxy + configuration file, on the default syslog server if it is configured and on + the stderr if it is allowed. + + :param class_txn txn: The class txn object containing the data. + :param integer loglevel: Is the log level associated with the message. It is a + number between 0 and 7. + :param string msg: The log content. + :see: :js:attr:`core.emerg`, :js:attr:`core.alert`, :js:attr:`core.crit`, + :js:attr:`core.err`, :js:attr:`core.warning`, :js:attr:`core.notice`, + :js:attr:`core.info`, :js:attr:`core.debug` (log level definitions) + :see: :js:func:`TXN.deflog` + :see: :js:func:`TXN.Debug` + :see: :js:func:`TXN.Info` + :see: :js:func:`TXN.Warning` + :see: :js:func:`TXN.Alert` + +.. js:function:: TXN.deflog(TXN, msg) + + Sends a log line with the default loglevel for the proxy associated with the + transaction. + + :param class_txn txn: The class txn object containing the data. + :param string msg: The log content. + :see: :js:func:`TXN.log + +.. js:function:: TXN.Debug(txn, msg) + + :param class_txn txn: The class txn object containing the data. + :param string msg: The log content. + :see: :js:func:`TXN.log` + + Does the same job than: + +.. code-block:: lua + + function Debug(txn, msg) + TXN.log(txn, core.debug, msg) + end +.. + +.. js:function:: TXN.Info(txn, msg) + + :param class_txn txn: The class txn object containing the data. + :param string msg: The log content. + :see: :js:func:`TXN.log` + +.. code-block:: lua + + function Debug(txn, msg) + TXN.log(txn, core.info, msg) + end +.. + +.. js:function:: TXN.Warning(txn, msg) + + :param class_txn txn: The class txn object containing the data. + :param string msg: The log content. + :see: :js:func:`TXN.log` + +.. code-block:: lua + + function Debug(txn, msg) + TXN.log(txn, core.warning, msg) + end +.. + +.. js:function:: TXN.Alert(txn, msg) + + :param class_txn txn: The class txn object containing the data. + :param string msg: The log content. + :see: :js:func:`TXN.log` + +.. code-block:: lua + + function Debug(txn, msg) + TXN.log(txn, core.alert, msg) + end +.. + +.. js:function:: TXN.get_priv(txn) + + Return Lua data stored in the current transaction (with the `TXN.set_priv()`) + function. If no data are stored, it returns a nil value. + + :param class_txn txn: The class txn object containing the data. + :returns: the opaque data previously stored, or nil if nothing is + available. + +.. js:function:: TXN.set_priv(txn, data) + + Store any data in the current HAProxy transaction. This action replace the + old stored data. + + :param class_txn txn: The class txn object containing the data. + :param opaque data: The data which is stored in the transaction. + +.. js:function:: TXN.set_var(TXN, var, value[, ifexist]) + + Converts a Lua type in a HAProxy type and store it in a variable <var>. + + :param class_txn txn: The class txn object containing the data. + :param string var: The variable name according with the HAProxy variable syntax. + :param type value: The value associated to the variable. The type can be string or + integer. + :param boolean ifexist: If this parameter is set to a truthy value the variable + will only be set if it was defined elsewhere (i.e. used + within the configuration). For global variables (using the + "proc" scope), they will only be updated and never created. + It is highly recommended to always set this to true. + +.. js:function:: TXN.unset_var(TXN, var) + + Unset the variable <var>. + + :param class_txn txn: The class txn object containing the data. + :param string var: The variable name according with the HAProxy variable syntax. + +.. js:function:: TXN.get_var(TXN, var) + + Returns data stored in the variable <var> converter in Lua type. + + :param class_txn txn: The class txn object containing the data. + :param string var: The variable name according with the HAProxy variable syntax. + +.. js:function:: TXN.reply([reply]) + + Return a new reply object + + :param table reply: A table containing info to initialize the reply fields. + :returns: A :ref:`reply_class` object. + + The table used to initialized the reply object may contain following entries : + + * status : The reply status code. the code 200 is used by default. + * reason : The reply reason. The reason corresponding to the status code is + used by default. + * headers : An list of headers, indexed by header name. Empty by default. For + a given name, multiple values are possible, stored in an ordered list. + * body : The reply body, empty by default. + +.. code-block:: lua + + local reply = txn:reply{ + status = 400, + reason = "Bad request", + headers = { + ["content-type"] = { "text/html" }, + ["cache-control"] = {"no-cache", "no-store" } + }, + body = "<html><body><h1>invalid request<h1></body></html>" + } +.. + :see: :js:class:`Reply` + +.. js:function:: TXN.done(txn[, reply]) + + This function terminates processing of the transaction and the associated + session and optionally reply to the client for HTTP sessions. + + :param class_txn txn: The class txn object containing the data. + :param class_reply reply: The class reply object to return to the client. + + This functions can be used when a critical error is detected or to terminate + processing after some data have been returned to the client (eg: a redirect). + To do so, a reply may be provided. This object is optional and may contain a + status code, a reason, a header list and a body. All these fields are + optional. When not provided, the default values are used. By default, with an + empty reply object, an empty HTTP 200 response is returned to the client. If + no reply object is provided, the transaction is terminated without any + reply. If a reply object is provided, it must not exceed the buffer size once + converted into the internal HTTP representing. Because for now there is no + easy way to be sure it fits, it is probably better to keep it reasonably + small. + + The reply object may be fully created in lua or the class Reply may be used to + create it. + +.. code-block:: lua + + local reply = txn:reply() + reply:set_status(400, "Bad request") + reply:add_header("content-type", "text/html") + reply:add_header("cache-control", "no-cache") + reply:add_header("cache-control", "no-store") + reply:set_body("<html><body><h1>invalid request<h1></body></html>") + txn:done(reply) +.. + +.. code-block:: lua + + txn:done{ + status = 400, + reason = "Bad request", + headers = { + ["content-type"] = { "text/html" }, + ["cache-control"] = { "no-cache", "no-store" }, + }, + body = "<html><body><h1>invalid request<h1></body></html>" + } +.. + + .. warning:: + It not make sense to call this function from sample-fetches. In this case + the behaviour of this one is the same than core.done(): it quit the Lua + execution. The transaction is really aborted only from an action registered + function. + + :see: :js:func:`TXN.reply`, :js:class:`Reply` + +.. js:function:: TXN.set_loglevel(txn, loglevel) + + Is used to change the log level of the current request. The "loglevel" must + be an integer between 0 and 7. + + :param class_txn txn: The class txn object containing the data. + :param integer loglevel: The required log level. This variable can be one of + :see: :js:attr:`core.emerg`, :js:attr:`core.alert`, :js:attr:`core.crit`, + :js:attr:`core.err`, :js:attr:`core.warning`, :js:attr:`core.notice`, + :js:attr:`core.info`, :js:attr:`core.debug` (log level definitions) + +.. js:function:: TXN.set_tos(txn, tos) + + Is used to set the TOS or DSCP field value of packets sent to the client to + the value passed in "tos" on platforms which support this. + + :param class_txn txn: The class txn object containing the data. + :param integer tos: The new TOS os DSCP. + +.. js:function:: TXN.set_mark(txn, mark) + + Is used to set the Netfilter MARK on all packets sent to the client to the + value passed in "mark" on platforms which support it. + + :param class_txn txn: The class txn object containing the data. + :param integer mark: The mark value. + +.. js:function:: TXN.set_priority_class(txn, prio) + + This function adjusts the priority class of the transaction. The value should + be within the range -2047..2047. Values outside this range will be + truncated. + + See the HAProxy configuration.txt file keyword "http-request" action + "set-priority-class" for details. + +.. js:function:: TXN.set_priority_offset(txn, prio) + + This function adjusts the priority offset of the transaction. The value + should be within the range -524287..524287. Values outside this range will be + truncated. + + See the HAProxy configuration.txt file keyword "http-request" action + "set-priority-offset" for details. + +.. _reply_class: + +Reply class +============ + +.. js:class:: Reply + + **context**: action + + This class represents an HTTP response message. It provides some methods to + enrich it. Once converted into the internal HTTP representation, the response + message must not exceed the buffer size. Because for now there is no + easy way to be sure it fits, it is probably better to keep it reasonably + small. + + See tune.bufsize in the configuration manual for dettails. + +.. code-block:: lua + + local reply = txn:reply({status = 400}) -- default HTTP 400 reason-phase used + reply:add_header("content-type", "text/html") + reply:add_header("cache-control", "no-cache") + reply:add_header("cache-control", "no-store") + reply:set_body("<html><body><h1>invalid request<h1></body></html>") +.. + + :see: :js:func:`TXN.reply` + +.. js:attribute:: Reply.status + + The reply status code. By default, the status code is set to 200. + + :returns: integer + +.. js:attribute:: Reply.reason + + The reason string describing the status code. + + :returns: string + +.. js:attribute:: Reply.headers + + A table indexing all reply headers by name. To each name is associated an + ordered list of values. + + :returns: Lua table + +.. code-block:: lua + + { + ["content-type"] = { "text/html" }, + ["cache-control"] = {"no-cache", "no-store" }, + x_header_name = { "value1", "value2", ... } + ... + } +.. + +.. js:attribute:: Reply.body + + The reply payload. + + :returns: string + +.. js:function:: Reply.set_status(REPLY, status[, reason]) + + Set the reply status code and optionally the reason-phrase. If the reason is + not provided, the default reason corresponding to the status code is used. + + :param class_reply reply: The related Reply object. + :param integer status: The reply status code. + :param string reason: The reply status reason (optional). + +.. js:function:: Reply.add_header(REPLY, name, value) + + Add a header to the reply object. If the header does not already exist, a new + entry is created with its name as index and a one-element list containing its + value as value. Otherwise, the header value is appended to the ordered list of + values associated to the header name. + + :param class_reply reply: The related Reply object. + :param string name: The header field name. + :param string value: The header field value. + +.. js:function:: Reply.del_header(REPLY, name) + + Remove all occurrences of a header name from the reply object. + + :param class_reply reply: The related Reply object. + :param string name: The header field name. + +.. js:function:: Reply.set_body(REPLY, body) + + Set the reply payload. + + :param class_reply reply: The related Reply object. + :param string body: The reply payload. + +.. _socket_class: + +Socket class +============ + +.. js:class:: Socket + + This class must be compatible with the Lua Socket class. Only the 'client' + functions are available. See the Lua Socket documentation: + + `http://w3.impa.br/~diego/software/luasocket/tcp.html + <http://w3.impa.br/~diego/software/luasocket/tcp.html>`_ + +.. js:function:: Socket.close(socket) + + Closes a TCP object. The internal socket used by the object is closed and the + local address to which the object was bound is made available to other + applications. No further operations (except for further calls to the close + method) are allowed on a closed Socket. + + :param class_socket socket: Is the manipulated Socket. + + Note: It is important to close all used sockets once they are not needed, + since, in many systems, each socket uses a file descriptor, which are limited + system resources. Garbage-collected objects are automatically closed before + destruction, though. + +.. js:function:: Socket.connect(socket, address[, port]) + + Attempts to connect a socket object to a remote host. + + + In case of error, the method returns nil followed by a string describing the + error. In case of success, the method returns 1. + + :param class_socket socket: Is the manipulated Socket. + :param string address: can be an IP address or a host name. See below for more + information. + :param integer port: must be an integer number in the range [1..64K]. + :returns: 1 or nil. + + An address field extension permits to use the connect() function to connect to + other stream than TCP. The syntax containing a simpleipv4 or ipv6 address is + the basically expected format. This format requires the port. + + Other format accepted are a socket path like "/socket/path", it permits to + connect to a socket. Abstract namespaces are supported with the prefix + "abns@", and finally a file descriptor can be passed with the prefix "fd@". + The prefix "ipv4@", "ipv6@" and "unix@" are also supported. The port can be + passed int the string. The syntax "127.0.0.1:1234" is valid. In this case, the + parameter *port* must not be set. + +.. js:function:: Socket.connect_ssl(socket, address, port) + + Same behavior than the function socket:connect, but uses SSL. + + :param class_socket socket: Is the manipulated Socket. + :returns: 1 or nil. + +.. js:function:: Socket.getpeername(socket) + + Returns information about the remote side of a connected client object. + + Returns a string with the IP address of the peer, followed by the port number + that peer is using for the connection. In case of error, the method returns + nil. + + :param class_socket socket: Is the manipulated Socket. + :returns: a string containing the server information. + +.. js:function:: Socket.getsockname(socket) + + Returns the local address information associated to the object. + + The method returns a string with local IP address and a number with the port. + In case of error, the method returns nil. + + :param class_socket socket: Is the manipulated Socket. + :returns: a string containing the client information. + +.. js:function:: Socket.receive(socket, [pattern [, prefix]]) + + Reads data from a client object, according to the specified read pattern. + Patterns follow the Lua file I/O format, and the difference in performance + between all patterns is negligible. + + :param class_socket socket: Is the manipulated Socket. + :param string|integer pattern: Describe what is required (see below). + :param string prefix: A string which will be prefix the returned data. + :returns: a string containing the required data or nil. + + Pattern can be any of the following: + + * **`*a`**: reads from the socket until the connection is closed. No + end-of-line translation is performed; + + * **`*l`**: reads a line of text from the Socket. The line is terminated by a + LF character (ASCII 10), optionally preceded by a CR character + (ASCII 13). The CR and LF characters are not included in the + returned line. In fact, all CR characters are ignored by the + pattern. This is the default pattern. + + * **number**: causes the method to read a specified number of bytes from the + Socket. Prefix is an optional string to be concatenated to the + beginning of any received data before return. + + * **empty**: If the pattern is left empty, the default option is `*l`. + + If successful, the method returns the received pattern. In case of error, the + method returns nil followed by an error message which can be the string + 'closed' in case the connection was closed before the transmission was + completed or the string 'timeout' in case there was a timeout during the + operation. Also, after the error message, the function returns the partial + result of the transmission. + + Important note: This function was changed severely. It used to support + multiple patterns (but I have never seen this feature used) and now it + doesn't anymore. Partial results used to be returned in the same way as + successful results. This last feature violated the idea that all functions + should return nil on error. Thus it was changed too. + +.. js:function:: Socket.send(socket, data [, start [, end ]]) + + Sends data through client object. + + :param class_socket socket: Is the manipulated Socket. + :param string data: The data that will be sent. + :param integer start: The start position in the buffer of the data which will + be sent. + :param integer end: The end position in the buffer of the data which will + be sent. + :returns: see below. + + Data is the string to be sent. The optional arguments i and j work exactly + like the standard string.sub Lua function to allow the selection of a + substring to be sent. + + If successful, the method returns the index of the last byte within [start, + end] that has been sent. Notice that, if start is 1 or absent, this is + effectively the total number of bytes sent. In case of error, the method + returns nil, followed by an error message, followed by the index of the last + byte within [start, end] that has been sent. You might want to try again from + the byte following that. The error message can be 'closed' in case the + connection was closed before the transmission was completed or the string + 'timeout' in case there was a timeout during the operation. + + Note: Output is not buffered. For small strings, it is always better to + concatenate them in Lua (with the '..' operator) and send the result in one + call instead of calling the method several times. + +.. js:function:: Socket.setoption(socket, option [, value]) + + Just implemented for compatibility, this cal does nothing. + +.. js:function:: Socket.settimeout(socket, value [, mode]) + + Changes the timeout values for the object. All I/O operations are blocking. + That is, any call to the methods send, receive, and accept will block + indefinitely, until the operation completes. The settimeout method defines a + limit on the amount of time the I/O methods can block. When a timeout time + has elapsed, the affected methods give up and fail with an error code. + + The amount of time to wait is specified as the value parameter, in seconds. + + The timeout modes are not implemented, the only settable timeout is the + inactivity time waiting for complete the internal buffer send or waiting for + receive data. + + :param class_socket socket: Is the manipulated Socket. + :param float value: The timeout value. Use floating point to specify + milliseconds. + +.. _regex_class: + +Regex class +=========== + +.. js:class:: Regex + + This class allows the usage of HAProxy regexes because classic lua doesn't + provides regexes. This class inherits the HAProxy compilation options, so the + regexes can be libc regex, pcre regex or pcre JIT regex. + + The expression matching number is limited to 20 per regex. The only available + option is case sensitive. + + Because regexes compilation is a heavy process, it is better to define all + your regexes in the **body context** and use it during the runtime. + +.. code-block:: lua + + -- Create the regex + st, regex = Regex.new("needle (..) (...)", true); + + -- Check compilation errors + if st == false then + print "error: " .. regex + end + + -- Match the regexes + print(regex:exec("Looking for a needle in the haystack")) -- true + print(regex:exec("Lokking for a cat in the haystack")) -- false + + -- Extract words + st, list = regex:match("Looking for a needle in the haystack") + print(st) -- true + print(list[1]) -- needle in the + print(list[2]) -- in + print(list[3]) -- the + +.. js:function:: Regex.new(regex, case_sensitive) + + Create and compile a regex. + + :param string regex: The regular expression according with the libc or pcre + standard + :param boolean case_sensitive: Match is case sensitive or not. + :returns: boolean status and :ref:`regex_class` or string containing fail reason. + +.. js:function:: Regex.exec(regex, str) + + Execute the regex. + + :param class_regex regex: A :ref:`regex_class` object. + :param string str: The input string will be compared with the compiled regex. + :returns: a boolean status according with the match result. + +.. js:function:: Regex.match(regex, str) + + Execute the regex and return matched expressions. + + :param class_map map: A :ref:`regex_class` object. + :param string str: The input string will be compared with the compiled regex. + :returns: a boolean status according with the match result, and + a table containing all the string matched in order of declaration. + +.. _map_class: + +Map class +========= + +.. js:class:: Map + + This class permits to do some lookup in HAProxy maps. The declared maps can + be modified during the runtime through the HAProxy management socket. + +.. code-block:: lua + + default = "usa" + + -- Create and load map + geo = Map.new("geo.map", Map._ip); + + -- Create new fetch that returns the user country + core.register_fetches("country", function(txn) + local src; + local loc; + + src = txn.f:fhdr("x-forwarded-for"); + if (src == nil) then + src = txn.f:src() + if (src == nil) then + return default; + end + end + + -- Perform lookup + loc = geo:lookup(src); + + if (loc == nil) then + return default; + end + + return loc; + end); + +.. js:attribute:: Map._int + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + + Note that :js:attr:`Map.int` is also available for compatibility. + +.. js:attribute:: Map._ip + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + + Note that :js:attr:`Map.ip` is also available for compatibility. + +.. js:attribute:: Map._str + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + + Note that :js:attr:`Map.str` is also available for compatibility. + +.. js:attribute:: Map._beg + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + + Note that :js:attr:`Map.beg` is also available for compatibility. + +.. js:attribute:: Map._sub + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + + Note that :js:attr:`Map.sub` is also available for compatibility. + +.. js:attribute:: Map._dir + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + + Note that :js:attr:`Map.dir` is also available for compatibility. + +.. js:attribute:: Map._dom + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + + Note that :js:attr:`Map.dom` is also available for compatibility. + +.. js:attribute:: Map._end + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + +.. js:attribute:: Map._reg + + See the HAProxy configuration.txt file, chapter "Using ACLs and fetching + samples" and subchapter "ACL basics" to understand this pattern matching + method. + + Note that :js:attr:`Map.reg` is also available for compatibility. + + +.. js:function:: Map.new(file, method) + + Creates and load a map. + + :param string file: Is the file containing the map. + :param integer method: Is the map pattern matching method. See the attributes + of the Map class. + :returns: a class Map object. + :see: The Map attributes: :js:attr:`Map._int`, :js:attr:`Map._ip`, + :js:attr:`Map._str`, :js:attr:`Map._beg`, :js:attr:`Map._sub`, + :js:attr:`Map._dir`, :js:attr:`Map._dom`, :js:attr:`Map._end` and + :js:attr:`Map._reg`. + +.. js:function:: Map.lookup(map, str) + + Perform a lookup in a map. + + :param class_map map: Is the class Map object. + :param string str: Is the string used as key. + :returns: a string containing the result or nil if no match. + +.. js:function:: Map.slookup(map, str) + + Perform a lookup in a map. + + :param class_map map: Is the class Map object. + :param string str: Is the string used as key. + :returns: a string containing the result or empty string if no match. + +.. _applethttp_class: + +AppletHTTP class +================ + +.. js:class:: AppletHTTP + + This class is used with applets that requires the 'http' mode. The http applet + can be registered with the *core.register_service()* function. They are used + for processing an http request like a server in back of HAProxy. + + This is an hello world sample code: + +.. code-block:: lua + + core.register_service("hello-world", "http", function(applet) + local response = "Hello World !" + applet:set_status(200) + applet:add_header("content-length", string.len(response)) + applet:add_header("content-type", "text/plain") + applet:start_response() + applet:send(response) + end) + +.. js:attribute:: AppletHTTP.c + + :returns: A :ref:`converters_class` + + This attribute contains a Converters class object. + +.. js:attribute:: AppletHTTP.sc + + :returns: A :ref:`converters_class` + + This attribute contains a Converters class object. The + functions of this object returns always a string. + +.. js:attribute:: AppletHTTP.f + + :returns: A :ref:`fetches_class` + + This attribute contains a Fetches class object. Note that the + applet execution place cannot access to a valid HAProxy core HTTP + transaction, so some sample fetches related to the HTTP dependent + values (hdr, path, ...) are not available. + +.. js:attribute:: AppletHTTP.sf + + :returns: A :ref:`fetches_class` + + This attribute contains a Fetches class object. The functions of + this object returns always a string. Note that the applet + execution place cannot access to a valid HAProxy core HTTP + transaction, so some sample fetches related to the HTTP dependent + values (hdr, path, ...) are not available. + +.. js:attribute:: AppletHTTP.method + + :returns: string + + The attribute method returns a string containing the HTTP + method. + +.. js:attribute:: AppletHTTP.version + + :returns: string + + The attribute version, returns a string containing the HTTP + request version. + +.. js:attribute:: AppletHTTP.path + + :returns: string + + The attribute path returns a string containing the HTTP + request path. + +.. js:attribute:: AppletHTTP.qs + + :returns: string + + The attribute qs returns a string containing the HTTP + request query string. + +.. js:attribute:: AppletHTTP.length + + :returns: integer + + The attribute length returns an integer containing the HTTP + body length. + +.. js:attribute:: AppletHTTP.headers + + :returns: table + + The attribute headers returns a table containing the HTTP + headers. The header names are always in lower case. As the header name can be + encountered more than once in each request, the value is indexed with 0 as + first index value. The table have this form: + +.. code-block:: lua + + AppletHTTP.headers['<header-name>'][<header-index>] = "<header-value>" + + AppletHTTP.headers["host"][0] = "www.test.com" + AppletHTTP.headers["accept"][0] = "audio/basic q=1" + AppletHTTP.headers["accept"][1] = "audio/*, q=0.2" + AppletHTTP.headers["accept"][2] = "*/*, q=0.1" +.. + +.. js:function:: AppletHTTP.set_status(applet, code [, reason]) + + This function sets the HTTP status code for the response. The allowed code are + from 100 to 599. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :param integer code: the status code returned to the client. + :param string reason: the status reason returned to the client (optional). + +.. js:function:: AppletHTTP.add_header(applet, name, value) + + This function add an header in the response. Duplicated headers are not + collapsed. The special header *content-length* is used to determinate the + response length. If it not exists, a *transfer-encoding: chunked* is set, and + all the write from the function *AppletHTTP:send()* become a chunk. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :param string name: the header name + :param string value: the header value + +.. js:function:: AppletHTTP.start_response(applet) + + This function indicates to the HTTP engine that it can process and send the + response headers. After this called we cannot add headers to the response; We + cannot use the *AppletHTTP:send()* function if the + *AppletHTTP:start_response()* is not called. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + +.. js:function:: AppletHTTP.getline(applet) + + This function returns a string containing one line from the http body. If the + data returned doesn't contains a final '\\n' its assumed than its the last + available data before the end of stream. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :returns: a string. The string can be empty if we reach the end of the stream. + +.. js:function:: AppletHTTP.receive(applet, [size]) + + Reads data from the HTTP body, according to the specified read *size*. If the + *size* is missing, the function tries to read all the content of the stream + until the end. If the *size* is bigger than the http body, it returns the + amount of data available. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :param integer size: the required read size. + :returns: always return a string,the string can be empty is the connection is + closed. + +.. js:function:: AppletHTTP.send(applet, msg) + + Send the message *msg* on the http request body. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :param string msg: the message to send. + +.. js:function:: AppletHTTP.get_priv(applet) + + Return Lua data stored in the current transaction. If no data are stored, + it returns a nil value. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :returns: the opaque data previously stored, or nil if nothing is + available. + :see: :js:func:`AppletHTTP.set_priv` + +.. js:function:: AppletHTTP.set_priv(applet, data) + + Store any data in the current HAProxy transaction. This action replace the + old stored data. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :param opaque data: The data which is stored in the transaction. + :see: :js:func:`AppletHTTP.get_priv` + +.. js:function:: AppletHTTP.set_var(applet, var, value[, ifexist]) + + Converts a Lua type in a HAProxy type and store it in a variable <var>. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :param string var: The variable name according with the HAProxy variable syntax. + :param type value: The value associated to the variable. The type ca be string or + integer. + :param boolean ifexist: If this parameter is set to a truthy value the variable + will only be set if it was defined elsewhere (i.e. used + within the configuration). For global variables (using the + "proc" scope), they will only be updated and never created. + It is highly recommended to + always set this to true. + :see: :js:func:`AppletHTTP.unset_var` + :see: :js:func:`AppletHTTP.get_var` + +.. js:function:: AppletHTTP.unset_var(applet, var) + + Unset the variable <var>. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :param string var: The variable name according with the HAProxy variable syntax. + :see: :js:func:`AppletHTTP.set_var` + :see: :js:func:`AppletHTTP.get_var` + +.. js:function:: AppletHTTP.get_var(applet, var) + + Returns data stored in the variable <var> converter in Lua type. + + :param class_AppletHTTP applet: An :ref:`applethttp_class` + :param string var: The variable name according with the HAProxy variable syntax. + :see: :js:func:`AppletHTTP.set_var` + :see: :js:func:`AppletHTTP.unset_var` + +.. _applettcp_class: + +AppletTCP class +=============== + +.. js:class:: AppletTCP + + This class is used with applets that requires the 'tcp' mode. The tcp applet + can be registered with the *core.register_service()* function. They are used + for processing a tcp stream like a server in back of HAProxy. + +.. js:attribute:: AppletTCP.c + + :returns: A :ref:`converters_class` + + This attribute contains a Converters class object. + +.. js:attribute:: AppletTCP.sc + + :returns: A :ref:`converters_class` + + This attribute contains a Converters class object. The + functions of this object returns always a string. + +.. js:attribute:: AppletTCP.f + + :returns: A :ref:`fetches_class` + + This attribute contains a Fetches class object. + +.. js:attribute:: AppletTCP.sf + + :returns: A :ref:`fetches_class` + + This attribute contains a Fetches class object. + +.. js:function:: AppletTCP.getline(applet) + + This function returns a string containing one line from the stream. If the + data returned doesn't contains a final '\\n' its assumed than its the last + available data before the end of stream. + + :param class_AppletTCP applet: An :ref:`applettcp_class` + :returns: a string. The string can be empty if we reach the end of the stream. + +.. js:function:: AppletTCP.receive(applet, [size]) + + Reads data from the TCP stream, according to the specified read *size*. If the + *size* is missing, the function tries to read all the content of the stream + until the end. + + :param class_AppletTCP applet: An :ref:`applettcp_class` + :param integer size: the required read size. + :returns: always return a string,the string can be empty is the connection is + closed. + +.. js:function:: AppletTCP.send(appletmsg) + + Send the message on the stream. + + :param class_AppletTCP applet: An :ref:`applettcp_class` + :param string msg: the message to send. + +.. js:function:: AppletTCP.get_priv(applet) + + Return Lua data stored in the current transaction. If no data are stored, + it returns a nil value. + + :param class_AppletTCP applet: An :ref:`applettcp_class` + :returns: the opaque data previously stored, or nil if nothing is + available. + :see: :js:func:`AppletTCP.set_priv` + +.. js:function:: AppletTCP.set_priv(applet, data) + + Store any data in the current HAProxy transaction. This action replace the + old stored data. + + :param class_AppletTCP applet: An :ref:`applettcp_class` + :param opaque data: The data which is stored in the transaction. + :see: :js:func:`AppletTCP.get_priv` + +.. js:function:: AppletTCP.set_var(applet, var, value[, ifexist]) + + Converts a Lua type in a HAProxy type and stores it in a variable <var>. + + :param class_AppletTCP applet: An :ref:`applettcp_class` + :param string var: The variable name according with the HAProxy variable syntax. + :param type value: The value associated to the variable. The type can be string or + integer. + :param boolean ifexist: If this parameter is set to a truthy value the variable + will only be set if it was defined elsewhere (i.e. used + within the configuration). For global variables (using the + "proc" scope), they will only be updated and never created. + It is highly recommended to + always set this to true. + :see: :js:func:`AppletTCP.unset_var` + :see: :js:func:`AppletTCP.get_var` + +.. js:function:: AppletTCP.unset_var(applet, var) + + Unsets the variable <var>. + + :param class_AppletTCP applet: An :ref:`applettcp_class` + :param string var: The variable name according with the HAProxy variable syntax. + :see: :js:func:`AppletTCP.unset_var` + :see: :js:func:`AppletTCP.set_var` + +.. js:function:: AppletTCP.get_var(applet, var) + + Returns data stored in the variable <var> converter in Lua type. + + :param class_AppletTCP applet: An :ref:`applettcp_class` + :param string var: The variable name according with the HAProxy variable syntax. + :see: :js:func:`AppletTCP.unset_var` + :see: :js:func:`AppletTCP.set_var` + +StickTable class +================ + +.. js:class:: StickTable + + **context**: task, action, sample-fetch + + This class can be used to access the HAProxy stick tables from Lua. + +.. js:function:: StickTable.info() + + Returns stick table attributes as a Lua table. See HAProxy documentation for + "stick-table" for canonical info, or check out example below. + + :returns: Lua table + + Assume our table has IPv4 key and gpc0 and conn_rate "columns": + +.. code-block:: lua + + { + expire=<int>, # Value in ms + size=<int>, # Maximum table size + used=<int>, # Actual number of entries in table + data={ # Data columns, with types as key, and periods as values + (-1 if type is not rate counter) + conn_rate=<int>, + gpc0=-1 + }, + length=<int>, # max string length for string table keys, key length + # otherwise + nopurge=<boolean>, # purge oldest entries when table is full + type="ip" # can be "ip", "ipv6", "integer", "string", "binary" + } + +.. js:function:: StickTable.lookup(key) + + Returns stick table entry for given <key> + + :param string key: Stick table key (IP addresses and strings are supported) + :returns: Lua table + +.. js:function:: StickTable.dump([filter]) + + Returns all entries in stick table. An optional filter can be used + to extract entries with specific data values. Filter is a table with valid + comparison operators as keys followed by data type name and value pairs. + Check out the HAProxy docs for "show table" for more details. For the + reference, the supported operators are: + "eq", "ne", "le", "lt", "ge", "gt" + + For large tables, execution of this function can take a long time (for + HAProxy standards). That's also true when filter is used, so take care and + measure the impact. + + :param table filter: Stick table filter + :returns: Stick table entries (table) + + See below for example filter, which contains 4 entries (or comparisons). + (Maximum number of filter entries is 4, defined in the source code) + +.. code-block:: lua + + local filter = { + {"gpc0", "gt", 30}, {"gpc1", "gt", 20}}, {"conn_rate", "le", 10} + } + +.. _action_class: + +Action class +============= + +.. js:class:: Act + + **context**: action + + This class contains all return codes an action may return. It is the lua + equivalent to HAProxy "ACT_RET_*" code. + +.. code-block:: lua + + core.register_action("deny", { "http-req" }, function (txn) + return act.DENY + end) +.. +.. js:attribute:: act.CONTINUE + + This attribute is an integer (0). It instructs HAProxy to continue the current + ruleset processing on the message. It is the default return code for a lua + action. + + :returns: integer + +.. js:attribute:: act.STOP + + This attribute is an integer (1). It instructs HAProxy to stop the current + ruleset processing on the message. + +.. js:attribute:: act.YIELD + + This attribute is an integer (2). It instructs HAProxy to temporarily pause + the message processing. It will be resumed later on the same rule. The + corresponding lua script is re-executed for the start. + +.. js:attribute:: act.ERROR + + This attribute is an integer (3). It triggers an internal errors The message + processing is stopped and the transaction is terminated. For HTTP streams, an + HTTP 500 error is returned to the client. + + :returns: integer + +.. js:attribute:: act.DONE + + This attribute is an integer (4). It instructs HAProxy to stop the message + processing. + + :returns: integer + +.. js:attribute:: act.DENY + + This attribute is an integer (5). It denies the current message. The message + processing is stopped and the transaction is terminated. For HTTP streams, an + HTTP 403 error is returned to the client if the deny is returned during the + request analysis. During the response analysis, an HTTP 502 error is returned + and the server response is discarded. + + :returns: integer + +.. js:attribute:: act.ABORT + + This attribute is an integer (6). It aborts the current message. The message + processing is stopped and the transaction is terminated. For HTTP streams, + HAProxy assumes a response was already sent to the client. From the Lua + actions point of view, when this code is used, the transaction is terminated + with no reply. + + :returns: integer + +.. js:attribute:: act.INVALID + + This attribute is an integer (7). It triggers an internal errors. The message + processing is stopped and the transaction is terminated. For HTTP streams, an + HTTP 400 error is returned to the client if the error is returned during the + request analysis. During the response analysis, an HTTP 502 error is returned + and the server response is discarded. + + :returns: integer + +.. js:function:: act:wake_time(milliseconds) + + **context**: action + + Set the script pause timeout to the specified time, defined in + milliseconds. + + :param integer milliseconds: the required milliseconds. + + This function may be used when a lua action returns `act.YIELD`, to force its + wake-up at most after the specified number of milliseconds. + +.. _filter_class: + +Filter class +============= + +.. js:class:: filter + + **context**: filter + + This class contains return codes some filter callback functions may return. It + also contains configuration flags and some helper functions. To understand how + the filter API works, see `doc/internal/filters.txt` documentation. + +.. js:attribute:: filter.CONTINUE + + This attribute is an integer (1). It may be returned by some filter callback + functions to instruct this filtering step is finished for this filter. + +.. js:attribute:: filter.WAIT + + This attribute is an integer (0). It may be returned by some filter callback + functions to instruct the filtering must be paused, waiting for more data or + for an external event depending on this filter. + +.. js:attribute:: filter.ERROR + + This attribute is an integer (-1). It may be returned by some filter callback + functions to trigger an error. + +.. js:attribute:: filter.FLT_CFG_FL_HTX + + This attribute is a flag corresponding to the filter flag FLT_CFG_FL_HTX. When + it is set for a filter, it means the filter is able to filter HTTP streams. + +.. js:function:: filter.register_data_filter(chn) + + **context**: filter + + Enable the data filtering on the channel **chn** for the current filter. It + may be called at any time from any callback functions proceeding the data + analysis. + + :param class_Channel chn: A :ref:`channel_class`. + +.. js:function:: filter.unregister_data_filter(chn) + + **context**: filter + + Disable the data filtering on the channel **chn** for the current filter. It + may be called at any time from any callback functions. + + :param class_Channel chn: A :ref:`channel_class`. + +.. js:function:: filter.wake_time(milliseconds) + + **context**: filter + + Set the script pause timeout to the specified time, defined in + milliseconds. + + :param integer milliseconds: the required milliseconds. + + This function may be used from any lua filter callback function to force its + wake-up at most after the specified number of milliseconds. Especially, when + `filter.CONTINUE` is returned. + + +A filters is declared using :js:func:`core.register_filter()` function. The +provided class will be used to instantiate filters. It may define following +attributes: + +* id: The filter identifier. It is a string that identifies the filter and is + optional. + +* flags: The filter flags. Only :js:attr:`filter.FLT_CFG_FL_HTX` may be set for now. + +Such filter class must also define all required callback functions in the +following list. Note that :js:func:`Filter.new()` must be defined otherwise the +filter is ignored. Others are optional. + +* .. js:function:: FILTER.new() + + Called to instantiate a new filter. This function must be defined. + + :returns: a Lua object that will be used as filter instance for the current + stream. + +* .. js:function:: FILTER.start_analyze(flt, txn, chn) + + Called when the analysis starts on the channel **chn**. + +* .. js:function:: FILTER.end_analyze(flt, txn, chn) + + Called when the analysis ends on the channel **chn**. + +* .. js:function:: FILTER.http_headers(flt, txn, http_msg) + + Called just before the HTTP payload analysis and after any processing on the + HTTP message **http_msg**. This callback functions is only called for HTTP + streams. + +* .. js:function:: FILTER.http_payload(flt, txn, http_msg) + + Called during the HTTP payload analysis on the HTTP message **http_msg**. This + callback functions is only called for HTTP streams. + +* .. js:function:: FILTER.http_end(flt, txn, http_msg) + + Called after the HTTP payload analysis on the HTTP message **http_msg**. This + callback functions is only called for HTTP streams. + +* .. js:function:: FILTER.tcp_payload(flt, txn, chn) + + Called during the TCP payload analysis on the channel **chn**. + +Here is an full example: + +.. code-block:: lua + + Trace = {} + Trace.id = "Lua trace filter" + Trace.flags = filter.FLT_CFG_FL_HTX; + Trace.__index = Trace + + function Trace:new() + local trace = {} + setmetatable(trace, Trace) + trace.req_len = 0 + trace.res_len = 0 + return trace + end + + function Trace:start_analyze(txn, chn) + if chn:is_resp() then + print("Start response analysis") + else + print("Start request analysis") + end + filter.register_data_filter(self, chn) + end + + function Trace:end_analyze(txn, chn) + if chn:is_resp() then + print("End response analysis: "..self.res_len.." bytes filtered") + else + print("End request analysis: "..self.req_len.." bytes filtered") + end + end + + function Trace:http_headers(txn, http_msg) + stline = http_msg:get_stline() + if http_msg.channel:is_resp() then + print("response:") + print(stline.version.." "..stline.code.." "..stline.reason) + else + print("request:") + print(stline.method.." "..stline.uri.." "..stline.version) + end + + for n, hdrs in pairs(http_msg:get_headers()) do + for i,v in pairs(hdrs) do + print(n..": "..v) + end + end + return filter.CONTINUE + end + + function Trace:http_payload(txn, http_msg) + body = http_msg:body(-20000) + if http_msg.channel:is_resp() then + self.res_len = self.res_len + body:len() + else + self.req_len = self.req_len + body:len() + end + end + + core.register_filter("trace", Trace, function(trace, args) + return trace + end) + +.. + +.. _httpmessage_class: + +HTTPMessage class +=================== + +.. js:class:: HTTPMessage + + **context**: filter + + This class contains all functions to manipulate an HTTP message. For now, this + class is only available from a filter context. + +.. js:function:: HTTPMessage.add_header(http_msg, name, value) + + Appends an HTTP header field in the HTTP message **http_msg** whose name is + specified in **name** and whose value is defined in **value**. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string name: The header name. + :param string value: The header value. + +.. js:function:: HTTPMessage.append(http_msg, string) + + This function copies the string **string** at the end of incoming data of the + HTTP message **http_msg**. The function returns the copied length on success + or -1 if data cannot be copied. + + Same that :js:func:`HTTPMessage.insert(http_msg, string, http_msg:input())`. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string string: The data to copied into incoming data. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: HTTPMessage.body(http_msgl[, offset[, length]]) + + This function returns **length** bytes of incoming data from the HTTP message + **http_msg**, starting at the offset **offset**. The data are not removed from + the buffer. + + By default, if no length is provided, all incoming data found, starting at the + given offset, are returned. If **length** is set to -1, the function tries to + retrieve a maximum of data. Because it is called in the filter context, it + never yield. Do not providing an offset is the same that setting it to 0. A + positive offset is relative to the beginning of incoming data of the + http_message buffer while negative offset is relative to their end. + + If there is no incoming data and the HTTP message can't receive more data, a 'nil' + value is returned. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param integer offset: *optional* The offset in incoming data to start to get + data. 0 by default. May be negative to be relative to + the end of incoming data. + :param integer length: *optional* The expected length of data to retrieve. All + incoming data by default. May be set to -1 to get a + maximum of data. + :returns: a string containing the data found or nil. + +.. js:function:: HTTPMessage.eom(http_msg) + + This function returns true if the end of message is reached for the HTTP + message **http_msg**. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :returns: an integer containing the amount of available bytes. + +.. js:function:: HTTPMessage.del_header(http_msg, name) + + Removes all HTTP header fields in the HTTP message **http_msg** whose name is + specified in **name**. + + :param class_httpmessage http_msg: The manipulated http message. + :param string name: The header name. + +.. js:function:: HTTPMessage.get_headers(http_msg) + + Returns a table containing all the headers of the HTTP message **http_msg**. + + :param class_httpmessage http_msg: The manipulated http message. + :returns: table of headers. + + This is the form of the returned table: + +.. code-block:: lua + + http_msg:get_headers()['<header-name>'][<header-index>] = "<header-value>" + + local hdr = http_msg:get_headers() + hdr["host"][0] = "www.test.com" + hdr["accept"][0] = "audio/basic q=1" + hdr["accept"][1] = "audio/*, q=0.2" + hdr["accept"][2] = "*.*, q=0.1" +.. + +.. js:function:: HTTPMessage.get_stline(http_msg) + + Returns a table containing the start-line of the HTTP message **http_msg**. + + :param class_httpmessage http_msg: The manipulated http message. + :returns: the start-line. + + This is the form of the returned table: + +.. code-block:: lua + + -- for the request : + {"method" = string, "uri" = string, "version" = string} + + -- for the response: + {"version" = string, "code" = string, "reason" = string} +.. + +.. js:function:: HTTPMessage.forward(http_msg, length) + + This function forwards **length** bytes of data from the HTTP message + **http_msg**. Because it is called in the filter context, it never yield. Only + available incoming data may be forwarded, event if the requested length + exceeds the available amount of incoming data. It returns the amount of data + forwarded. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param integer int: The amount of data to forward. + +.. js:function:: HTTPMessage.input(http_msg) + + This function returns the length of incoming data in the HTTP message + **http_msg** from the calling filter point of view. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :returns: an integer containing the amount of available bytes. + +.. js:function:: HTTPMessage.insert(http_msg, string[, offset]) + + This function copies the string **string** at the offset **offset** in + incoming data of the HTTP message **http_msg**. The function returns the + copied length on success or -1 if data cannot be copied. + + By default, if no offset is provided, the string is copied in front of + incoming data. A positive offset is relative to the beginning of incoming data + of the HTTP message while negative offset is relative to their end. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string string: The data to copied into incoming data. + :param integer offset: *optional* The offset in incomding data where to copied + data. 0 by default. May be negative to be relative to + the end of incoming data. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: HTTPMessage.is_full(http_msg) + + This function returns true if the HTTP message **http_msg** is full. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :returns: a boolean + +.. js:function:: HTTPMessage.is_resp(http_msg) + + This function returns true if the HTTP message **http_msg** is the response + one. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :returns: a boolean + +.. js:function:: HTTPMessage.may_recv(http_msg) + + This function returns true if the HTTP message **http_msg** may still receive + data. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :returns: a boolean + +.. js:function:: HTTPMessage.output(http_msg) + + This function returns the length of outgoing data of the HTTP message + **http_msg**. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :returns: an integer containing the amount of available bytes. + +.. js:function:: HTTPMessage.prepend(http_msg, string) + + This function copies the string **string** in front of incoming data of the + HTTP message **http_msg**. The function returns the copied length on success + or -1 if data cannot be copied. + + Same that :js:func:`HTTPMessage.insert(http_msg, string, 0)`. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string string: The data to copied into incoming data. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: HTTPMessage.remove(http_msg[, offset[, length]]) + + This function removes **length** bytes of incoming data of the HTTP message + **http_msg**, starting at offset **offset**. This function returns number of + bytes removed on success. + + By default, if no length is provided, all incoming data, starting at the given + offset, are removed. Do not providing an offset is the same that setting it + to 0. A positive offset is relative to the beginning of incoming data of the + HTTP message while negative offset is relative to their end. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param integer offset: *optional* The offset in incomding data where to start + to remove data. 0 by default. May be negative to + be relative to the end of incoming data. + :param integer length: *optional* The length of data to remove. All incoming + data by default. + :returns: an integer containing the amount of bytes removed. + +.. js:function:: HTTPMessage.rep_header(http_msg, name, regex, replace) + + Matches the regular expression in all occurrences of header field **name** + according to regex **regex**, and replaces them with the string **replace**. + The replacement value can contain back references like \1, \2, ... This + function acts on whole header lines, regardless of the number of values they + may contain. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string name: The header name. + :param string regex: The match regular expression. + :param string replace: The replacement value. + +.. js:function:: HTTPMessage.rep_value(http_msg, name, regex, replace) + + Matches the regular expression on every comma-delimited value of header field + **name** according to regex **regex**, and replaces them with the string + **replace**. The replacement value can contain back references like \1, \2, + ... + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string name: The header name. + :param string regex: The match regular expression. + :param string replace: The replacement value. + +.. js:function:: HTTPMessage.send(http_msg, string) + + This function required immediate send of the string **string**. It means the + string is copied at the beginning of incoming data of the HTTP message + **http_msg** and immediately forwarded. Because it is called in the filter + context, it never yield. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string string: The data to send. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: HTTPMessage.set(http_msg, string[, offset[, length]]) + + This function replace **length** bytes of incoming data of the HTTP message + **http_msg**, starting at offset **offset**, by the string **string**. The + function returns the copied length on success or -1 if data cannot be copied. + + By default, if no length is provided, all incoming data, starting at the given + offset, are replaced. Do not providing an offset is the same that setting it + to 0. A positive offset is relative to the beginning of incoming data of the + HTTP message while negative offset is relative to their end. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string string: The data to copied into incoming data. + :param integer offset: *optional* The offset in incomding data where to start + the data replacement. 0 by default. May be negative to + be relative to the end of incoming data. + :param integer length: *optional* The length of data to replace. All incoming + data by default. + :returns: an integer containing the amount of bytes copied or -1. + +.. js:function:: HTTPMessage.set_eom(http_msg) + + This function set the end of message for the HTTP message **http_msg**. + + :param class_httpmessage http_msg: The manipulated HTTP message. + +.. js:function:: HTTPMessage.set_header(http_msg, name, value) + + This variable replace all occurrence of all header matching the name **name**, + by only one containing the value **value**. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string name: The header name. + :param string value: The header value. + + This function does the same work as the following code: + +.. code-block:: lua + + http_msg:del_header("header") + http_msg:add_header("header", "value") +.. + +.. js:function:: HTTPMessage.set_method(http_msg, method) + + Rewrites the request method with the string **method**. The HTTP message + **http_msg** must be the request. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string method: The new method. + +.. js:function:: HTTPMessage.set_path(http_msg, path) + + Rewrites the request path with the string **path**. The HTTP message + **http_msg** must be the request. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string method: The new method. + +.. js:function:: HTTPMessage.set_query(http_msg, query) + + Rewrites the request's query string which appears after the first question + mark ("?") with the string **query**. The HTTP message **http_msg** must be + the request. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string query: The new query. + +.. js:function:: HTTPMessage.set_status(http_msg, status[, reason]) + + Rewrites the response status code with the integer **code** and optional the + reason **reason**. If no custom reason is provided, it will be generated from + the status. The HTTP message **http_msg** must be the response. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param integer status: The new response status code. + :param string reason: The new response reason (optional). + +.. js:function:: HTTPMessage.set_uri(http_msg, uri) + + Rewrites the request URI with the string **uri**. The HTTP message + **http_msg** must be the request. + + :param class_httpmessage http_msg: The manipulated HTTP message. + :param string uri: The new uri. + +.. js:function:: HTTPMessage.unset_eom(http_msg) + + This function remove the end of message for the HTTP message **http_msg**. + + :param class_httpmessage http_msg: The manipulated HTTP message. + +.. _CertCache_class: + +CertCache class +================ + +.. js:class:: CertCache + + This class allows to update an SSL certificate file in the memory of the + current HAProxy process. It will do the same as "set ssl cert" + "commit ssl + cert" over the HAProxy CLI. + +.. js:function:: CertCache.set(certificate) + + This function updates a certificate in memory. + + :param table certificate: A table containing the fields to update. + :param string certificate.filename: The mandatory filename of the certificate + to update, it must already exist in memory. + :param string certificate.crt: A certificate in the PEM format. It can also + contain a private key. + :param string certificate.key: A private key in the PEM format. + :param string certificate.ocsp: An OCSP response in base64. (cf management.txt) + :param string certificate.issuer: The certificate of the OCSP issuer. + :param string certificate.sctl: An SCTL file. + +.. code-block:: lua + + CertCache.set{filename="certs/localhost9994.pem.rsa", crt=crt} + + +External Lua libraries +====================== + +A lot of useful lua libraries can be found here: + +* `https://lua-toolbox.com/ <https://lua-toolbox.com/>`_ + +Redis client library: + +* `https://github.com/nrk/redis-lua <https://github.com/nrk/redis-lua>`_ + +This is an example about the usage of the Redis library with HAProxy. Note that +each call of any function of this library can throw an error if the socket +connection fails. + +.. code-block:: lua + + -- load the redis library + local redis = require("redis"); + + function do_something(txn) + + -- create and connect new tcp socket + local tcp = core.tcp(); + tcp:settimeout(1); + tcp:connect("127.0.0.1", 6379); + + -- use the redis library with this new socket + local client = redis.connect({socket=tcp}); + client:ping(); + + end + +OpenSSL: + +* `http://mkottman.github.io/luacrypto/index.html + <http://mkottman.github.io/luacrypto/index.html>`_ + +* `https://github.com/brunoos/luasec/wiki + <https://github.com/brunoos/luasec/wiki>`_ diff --git a/doc/lua.txt b/doc/lua.txt new file mode 100644 index 0000000..3ef0d31 --- /dev/null +++ b/doc/lua.txt @@ -0,0 +1,966 @@ + Lua: Architecture and first steps + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + version 2.6 + + author: Thierry FOURNIER + contact: tfournier at arpalert dot org + + + +HAProxy is a powerful load balancer. It embeds many options and many +configuration styles in order to give a solution to many load balancing +problems. However, HAProxy is not universal and some special or specific +problems do not have solution with the native software. + +This text is not a full explanation of the Lua syntax. + +This text is not a replacement of the HAProxy Lua API documentation. The API +documentation can be found at the project root, in the documentation directory. +The goal of this text is to discover how Lua is implemented in HAProxy and using +it efficiently. + +However, this can be read by Lua beginners. Some examples are detailed. + +Why a scripting language in HAProxy +=================================== + +HAProxy 1.5 makes at possible to do many things using samples, but some people +want to more combining results of samples fetches, programming conditions and +loops which is not possible. Sometimes people implement these functionalities +in patches which have no meaning outside their network. These people must +maintain these patches, or worse we must integrate them in the HAProxy +mainstream. + +Their need is to have an embedded programming language in order to no longer +modify the HAProxy source code, but to write their own control code. Lua is +encountered very often in the software industry, and in some open source +projects. It is easy to understand, efficient, light without external +dependencies, and leaves the resource control to the implementation. Its design +is close to the HAProxy philosophy which uses components for what they do +perfectly. + +The HAProxy control block allows one to take a decision based on the comparison +between samples and patterns. The samples are extracted using fetch functions +easily extensible, and are used by actions which are also extensible. It seems +natural to allow Lua to give samples, modify them, and to be an action target. +So, Lua uses the same entities as the configuration language. This is the most +natural and reliable way for the Lua integration. So, the Lua engine allows one +to add new sample fetch functions, new converter functions and new actions. +These new entities can access the existing samples fetches and converters +allowing to extend them without rewriting them. + +The writing of the first Lua functions shows that implementing complex concepts +like protocol analysers is easy and can be extended to full services. It appears +that these services are not easy to implement with the HAProxy configuration +model which is based on four steps: fetch, convert, compare and action. HAProxy +is extended with a notion of services which are a formalisation of the existing +services like stats, cli and peers. The service is an autonomous entity with a +behaviour pattern close to that of an external client or server. The Lua engine +inherits from this new service and offers new possibilities for writing +services. + +This scripting language is useful for testing new features as proof of concept. +Later, if there is general interest, the proof of concept could be integrated +with C language in the HAProxy core. + +The HAProxy Lua integration also provides a simple way for distributing Lua +packages. The final user needs only to install the Lua file, load it in HAProxy +and follow the attached documentation. + +Design and technical things +=========================== + +Lua is integrated into the HAProxy event driven core. We want to preserve the +fast processing of HAProxy. To ensure this, we implement some technical concepts +between HAProxy and the Lua library. + +The following paragraph also describes the interactions between Lua and HAProxy +from a technical point of view. + +Prerequisite +----------- + +Reading the following documentation links is required to understand the +current paragraph: + + HAProxy doc: http://docs.haproxy.org/ + Lua API: http://www.lua.org/manual/5.3/ + HAProxy API: http://www.arpalert.org/src/haproxy-lua-api/2.6/index.html + Lua guide: http://www.lua.org/pil/ + +more about Lua choice +--------------------- + +Lua language is very simple to extend. It is easy to add new functions written +in C in the core language. It is not required to embed very intrusive libraries, +and we do not change compilation processes. + +The amount of memory consumed can be controlled, and the issues due to lack of +memory are perfectly caught. The maximum amount of memory allowed for the Lua +processes is configurable. If some memory is missing, the current Lua action +fails, and the HAProxy processing flow continues. + +Lua provides a way for implementing event driven design. When the Lua code +wants to do a blocking action, the action is started, it executes non blocking +operations, and returns control to the HAProxy scheduler when it needs to wait +for some external event. + +The Lua process can be interrupted after a number of instructions executed. The +Lua execution will resume later. This is a useful way for controlling the +execution time. This system also keeps HAProxy responsive. When the Lua +execution is interrupted, HAProxy accepts some connections or transfers pending +data. The Lua execution does not block the main HAProxy processing, except in +some cases which we will see later. + +Lua function integration +------------------------ + +The Lua actions, sample fetches, converters and services are integrated in +HAProxy with "register_*" functions. The register system is a choice for +providing HAProxy Lua packages easily. The register system adds new sample +fetches, converters, actions or services usable in the HAProxy configuration +file. + +The register system is defined in the "core" functions collection. This +collection is provided by HAProxy and is always available. Below, the list of +these functions: + + - core.register_action() + - core.register_converters() + - core.register_fetches() + - core.register_init() + - core.register_service() + - core.register_task() + +These functions are the execution entry points. + +HTTP action must be used for manipulating HTTP request headers. This action +can not manipulates HTTP content. It is dangerous to use the channel +manipulation object with an HTTP request in an HTTP action. The channel +manipulation can transform a valid request in an invalid request. In this case, +the action will never resume and the processing will be frozen. HAProxy +discards the request after the reception timeout. + +Non blocking design +------------------- + +HAProxy is an event driven software, so blocking system calls are absolutely +forbidden. However, the Lua allows to do blocking actions. When an action +blocks, HAProxy is waiting and do nothing, so the basic functionalities like +accepting connections or forwarding data are blocked while the end of the system +call. In this case HAProxy will be less responsive. + +This is very insidious because when the developer tries to execute its Lua code +with only one stream, HAProxy seems to run fine. When the code is used with +production stream, HAProxy encounters some slow processing, and it cannot +hold the load. + +However, during the initialisation state, you can obviously using blocking +functions. There are typically used for loading files. + +The list of prohibited standard Lua functions during the runtime contains all +that do filesystem access: + + - os.remove() + - os.rename() + - os.tmpname() + - package.*() + - io.*() + - file.*() + +Some other functions are prohibited: + + - os.execute(), waits for the end of the required execution blocking HAProxy. + + - os.exit(), is not really dangerous for the process, but it's not the good way + for exiting the HAProxy process. + + - print(), writes data on stdout. In some cases these writes are blocking, the + best practice is reserving this call for debugging. We must prefer + to use core.log() or TXN.log() for sending messages. + +Some HAProxy functions have a blocking behaviour pattern in the Lua code, but +there are compatible with the non blocking design. These functions are: + + - All the socket class + - core.sleep() + +Responsive design +----------------- + +HAProxy must process connections accept, forwarding data and processing timeouts +as soon as possible. The first thing is to believe that a Lua script with a long +execution time should impact the expected responsive behaviour. + +It is not the case, the Lua script execution are regularly interrupted, and +HAProxy can process other things. These interruptions are exprimed in number of +Lua instructions. The number of interruptions between two interrupts is +configured with the following "tune" option: + + tune.lua.forced-yield <nb> + +The default value is 10 000. For determining it, I ran benchmark on my laptop. +I executed a Lua loop between 10 seconds with different values for the +"tune.lua.forced-yield" option, and I noted the results: + + configured | Number of + instructions | loops executed + between two | in millions + forced yields | + ---------------+--------------- + 10 | 160 + 500 | 670 + 1000 | 680 + 5000 | 700 + 7000 | 700 + 8000 | 700 + 9000 | 710 <- ceil + 10000 | 710 + 100000 | 710 + 1000000 | 710 + +The result showed that from 9000 instructions between two interrupt, we reached +a ceil, so the default parameter is 10 000. + +When HAProxy interrupts the Lua processing, we have two states possible: + + - Lua is resumable, and it returns control to the HAProxy scheduler, + - Lua is not resumable, and we just check the execution timeout. + +The second case occurs if it is required by the HAProxy core. This state is +forced if the Lua is processed in a non resumable HAProxy part, like sample +fetches or converters. + +It occurs also if the Lua is non resumable. For example, if some code is +executed through the Lua pcall() function, the execution is not resumable. This +is explained later. + +So, the Lua code must be fast and simple when is executed as sample fetches and +converters, it could be slow and complex when is executed as actions and +services. + +Execution time +-------------- + +The Lua execution time is measured and limited. Each group of functions has its +own timeout configured. The time measured is the real Lua execution time, and +not the difference between the end time and the start time. The groups are: + + - main code and init are not submitted to the timeout, + - fetches, converters and action have a default timeout of 4s, + - task, by default does not have timeout, + - service have a default timeout of 4s. + +The corresponding tune options are: + + - tune.lua.session-timeout (fetches, converters and action) + - tune.lua.task-timeout (task) + - tune.lua.service-timeout (services) + +The task does not have a timeout because it runs in background along the +HAProxy process life. + +For example, if an Lua script is executed during 1.1s and the script executes a +sleep of 1 second, the effective measured running time is 0.1s. + +This timeout is useful for preventing infinite loops. During the runtime, it +should be never triggered. + +The stack and the coprocess +--------------------------- + +The Lua execution is organized around a stack. Each Lua action, even out of the +effective execution, affects the stack. HAProxy integration uses one main stack, +which is common for all the process, and a secondary one used as coprocess. +After the initialization, the main stack is no longer used by HAProxy, except +for global storage. The second type of stack is used by all the Lua functions +called from different Lua actions declared in HAProxy. The main stack permits +to store coroutines pointers, and some global variables. + +Do you want to see an example of how seems Lua C development around a stack ? +Some examples follows. This first one, is a simple addition: + + lua_pushnumber(L, 1) + lua_pushnumber(L, 2) + lua_arith(L, LUA_OPADD) + +It's easy, we push 1 on the stack, after, we push 2, and finally, we perform an +addition. The two top entries of the stack are added, popped, and the result is +pushed. It is a classic way with a stack. + +Now an example for constructing array and objects. It's a little bit more +complicated. The difficult consist to keep in mind the state of the stack while +we write the code. The goal is to create the entity described below. Note that +the notation "*1" is a metatable reference. The metatable will be explained +later. + + name*1 = { + [0] = <userdata>, + } + + *1 = { + "__index" = { + "method1" = <function>, + "method2" = <function> + } + "__gc" = <function> + } + +Let's go: + + lua_newtable() // The "name" table + lua_newtable() // The metatable *1 + lua_pushstring("__index") + lua_newtable() // The "__index" table + lua_pushstring("method1") + lua_pushfunction(function) + lua_settable(-3) // -3 is an index in the stack. insert method1 + lua_pushstring("method2") + lua_pushfunction(function) + lua_settable(-3) // insert method2 + lua_settable(-3) // insert "__index" + lua_pushstring("__gc") + lua_pushfunction(function) + lua_settable() // insert "__gc" + lua_setmetatable(-1) // attach metatable to "name" + lua_pushnumber(0) + lua_pushuserdata(userdata) + lua_settable(-3) + lua_setglobal("name") + +So, coding for Lua in C, is not complex, but it needs some mental gymnastic. + +The object concept and the HAProxy format +----------------------------------------- + +The object seems to be not a native concept. An Lua object is a table. We can +note that the table notation accept three forms: + + 1. mytable["entry"](mytable, "param") + 2. mytable.entry(mytable, "param") + 3. mytable:entry("param") + +These three notation have the same behaviour pattern: a function is executed +with the table itself as first parameter and string "param" as second parameter +The notation with [] is commonly used for storing data in a hash table, and the +dotted notation is used for objects. The notation with ":" indicates that the +first parameter is the element at the left of the symbol ":". + +So, an object is a table and each entry of the table is a variable. A variable +can be a function. These are the first concepts of the object notation in the +Lua, but it is not the end. + +With the objects, we usually expect classes and inheritance. This is the role of +the metable. A metable is a table with predefined entries. These entries modify +the default behaviour of the table. The simplest example is the "__index" entry. +If this entry exists, it is called when a value is requested in the table. The +behaviour is the following: + + 1 - looks in the table if the entry exists, and if it the case, return it + + 2 - looks if a metatable exists, and if the "__index" entry exists + + 3 - if "__index" is a function, execute it with the key as parameter, and + returns the result of the function. + + 4 - if "__index" is a table, looks if the requested entry exists, and if + exists, return it. + + 5 - if not exists, return to step 2 + +The behaviour of the point 5 represents the inheritance. + +In HAProxy all the provided objects are tables, the entry "[0]" contains private +data, there are often userdata or lightuserdata. The metatable is registered in +the global part of the main Lua stack, and it is called with the case sensitive +class name. A great part of these class must not be used directly because it +requires an initialisation using the HAProxy internal structs. + +The HAProxy objects use unified conventions. An Lua object is always a table. +In most cases, an HAProxy Lua object needs some private data. These are always +set in the index [0] of the array. The metatable entry "__tostring" returns the +object name. + +The Lua developer can add entries to the HAProxy object. They just work carefully +and prevent to modify the index [0]. + +Common HAProxy objects are: + + - TXN : manipulates the transaction between the client and the server + - Channel : manipulates proxified data between the client and the server + - HTTP : manipulates HTTP between the client and the server + - Map : manipulates HAProxy maps. + - Fetches : access to all HAProxy sample fetches + - Converters : access to all HAProxy sample converters + - AppletTCP : process client request like a TCP server + - AppletHTTP : process client request like an HTTP server + - Socket : establish tcp connection to a server (ipv4/ipv6/socket/ssl/...) + +The garbage collector and the memory allocation +----------------------------------------------- + +Lua doesn't really have a global memory limit, but HAProxy implements it. This +permits to control the amount of memory dedicated to the Lua processes. It is +specially useful with embedded environments. + +When the memory limit is reached, HAProxy refuses to give more memory to the Lua +scripts. The current Lua execution is terminated with an error and HAProxy +continues its processing. + +The max amount of memory is configured with the option: + + tune.lua.maxmem + +As many other script languages, Lua uses a garbage collector for reusing its +memory. The Lua developer can work without memory preoccupation. Usually, the +garbage collector is controlled by the Lua core, but sometimes it will be useful +to run when the user/developer requires. So the garbage collector can be called +from C part or Lua part. + +Sometimes, objects using lightuserdata or userdata requires to free some memory +block or close filedescriptor not controlled by the Lua. A dedicated garbage +collection function is provided through the metatable. It is referenced with the +special entry "__gc". + +Generally, in HAProxy, the garbage collector does this job without any +intervention. However some objects use a great amount of memory, and we want to +release as quickly as possible. The problem is that only the GC knows if the +object is in use or not. The reason is simple variable containing objects can be +shared between coroutines and the main thread, so an object can be used +everywhere in HAProxy. + +The only one example is the HAProxy sockets. These are explained later, just for +understanding the GC issues, a quick overview of the socket follows. The HAProxy +socket uses an internal session and stream, the session uses resources like +memory and file descriptor and in some cases keeps a socket open while it is no +longer used by Lua. + +If the HAProxy socket is used, we forcing a garbage collector cycle after the +end of each function using HAProxy socket. The reason is simple: if the socket +is no longer used, we want to close the connection quickly. + +A special flag is used in HAProxy indicating that a HAProxy socket is created. +If this flag is set, a full GC cycle is started after each Lua action. This is +not free, we loose about 10% of performances, but it is the only way for closing +sockets quickly. + +The yield concept / longjmp issues +---------------------------------- + +The "yield" is an action which does some Lua processing in pause and give back +the hand to the HAProxy core. This action is do when the Lua needs to wait about +data or other things. The most basically example is the sleep() function. In an +event driven software the code must not process blocking systems call, so the +sleep blocks the software between a lot of time. In HAProxy, an Lua sleep does a +yield, and ask to the scheduler to be woken up in a required sleep time. +Meanwhile, the HAProxy scheduler does other things, like accepting new +connection or forwarding data. + +A yield is also executed regularly, after a lot of Lua instructions processed. +This yield permits to control the effective execution time, and also give back +the hand to the HAProxy core. When HAProxy finishes to process the pending jobs, +the Lua execution continues. + +This special "yield" uses the Lua "debug" functions. Lua provides a debug method +called "lua_sethook()" which permits to interrupt the execution after some +configured condition and call a function. This condition used in HAProxy is +a number of instructions processed and when a function returns. The function +called controls the effective execution time, and if it is possible to send a +"yield". + +The yield system is based on a couple setjmp/longjmp. In brief, the setjmp() +stores a stack state, and the longjmp restores the stack in its state which had +before the last Lua execution. + +Lua can immediately stop its execution if an error occurs. This system uses also +the longjmp system. In HAProxy, we try to use this system only for unrecoverable +errors. Maybe some trivial errors target an exception, but we try to remove it. + +It seems that Lua uses the longjmp system for having a behaviour like the java +try / catch. We can use the function pcall() to execute some code. The function +pcall() run a setjmp(). So, if any error occurs while the Lua code execution, +the flow immediately returns from the pcall() with an error. + +The big issue of this behaviour is that we cannot do a yield. So if some Lua code +executes a library using pcall for catching errors, HAProxy must be wait for the +end of execution without processing any accept or any stream. The cause is the +yield must be jump to the root of execution. The intermediate setjmp() avoids +this behaviour. + + + HAProxy start Lua execution + + Lua puts a setjmp() + + Lua executes code + + Some code is executed in a pcall() + + pcall() puts a setjmp() + + Lua executes code + + A yield is require for a sleep function + it cannot be jumps to the Lua root execution. + + +Another issue with the processing of strong errors is the manipulation of the +Lua stack outside of an Lua processing. If one of the functions called occurs a +strong error, the default behaviour is an abort(). It is not acceptable when +HAProxy is in runtime mode. The Lua documentation propose to use another +setjmp/longjmp to avoid the abort(). The goal is to put a setjmp between +manipulating the Lua stack and using an alternative "panic" function which jumps +to the setjmp() in error case. + +All of these behaviours are very dangerous for the stability, and the internal +HAProxy code must be modified with many precautions. + +For preserving a good behaviour of HAProxy, the yield is mandatory. +Unfortunately, some HAProxy parts are not adapted for resuming an execution +after a yield. These parts are the sample fetches and the sample converters. So, +the Lua code written in these parts of HAProxy must be quickly executed, and can +not do actions which require yield like TCP connection or simple sleep. + +HAProxy socket object +--------------------- + +The HAProxy design is optimized for the data transfers between a client and a +server, and processing the many errors which can occurs during these exchanges. +HAProxy is not designed for having a third connection established to a third +party server. + +The solution consist to put the main stream in pause waiting for the end of the +exchanges with the third connection. This is completed by a signal between +internal tasks. The following graph shows the HAProxy Lua socket: + + + +--------------------+ + | Lua processing | + ------------------\ | creates socket | ------------------\ + incoming request > | and puts the | Outgoing request > + ------------------/ | current processing | ------------------/ + Â Â | in pause waiting | + | for TCP applet | + +-----------------+--+ + ^ | + | | + | signal | read / write + | | data + | | + +-------------+---------+ v + | HAProxy internal +----------------+ + | applet send signals | | + | when data is received | | -------------------\ + | or some room is | Attached I/O | Client TCP stream > + | available | Buffers | -------------------/ + +--------------------+--+ | + | | + +-------------------+ + + +A more detailed graph is available in the "doc/internals" directory. + +The HAProxy Lua socket uses a full HAProxy session / stream for establishing the +connection. This mechanism provides all the facilities and HAProxy features, +like the SSL stack, many socket type, and support for namespaces. +Technically it supports the proxy protocol, but there are no way to enable it. + +How compiling HAProxy with Lua +============================== + +HAProxy 1.6 requires Lua 5.3. Lua 5.3 offers some features which make easy the +integration. Lua 5.3 is young, and some distros do not distribute it. Luckily, +Lua is a great product because it does not require exotic dependencies, and its +build process is really easy. + +The compilation process for linux is easy: + + - download the source tarball + wget http://www.lua.org/ftp/lua-5.3.1.tar.gz + + - untar it + tar xf lua-5.3.1.tar.gz + + - enter the directory + cd lua-5.3.1 + + - build the library for linux + make linux + + - install it: + sudo make INSTALL_TOP=/opt/lua-5.3.1 install + +HAProxy builds with your favourite options, plus the following options for +embedding the Lua script language: + + - download the source tarball + wget http://www.haproxy.org/download/1.6/src/haproxy-1.6.2.tar.gz + + - untar it + tar xf haproxy-1.6.2.tar.gz + + - enter the directory + cd haproxy-1.6.2 + + - build HAProxy: + make TARGET=linux-glibc \ + USE_LUA=1 \ + LUA_LIB=/opt/lua-5.3.1/lib \ + LUA_INC=/opt/lua-5.3.1/include + + - install it: + sudo make PREFIX=/opt/haproxy-1.6.2 install + +First steps with Lua +==================== + +Now, it's time to use Lua in HAProxy. + +Start point +----------- + +The HAProxy global directive "lua-load <file>" allows to load an Lua file. This +is the entry point. This load become during the configuration parsing, and the +Lua file is immediately executed. + +All the register_*() functions must be called at this time because they are used +just after the processing of the global section, in the frontend/backend/listen +sections. + +The most simple "Hello world !" is the following line a loaded Lua file: + + core.Alert("Hello World !"); + +It displays a log during the HAProxy startup: + + [alert] 285/083533 (14465) : Hello World ! + +Default path and libraries +-------------------------- + +Lua can embed some libraries. These libraries can be included from different +paths. It seems that Lua doesn't like subdirectories. In the following example, +I try to load a compiled library, so the first line is Lua code, the second line +is an 'strace' extract proving that the library was opened. The next lines are +the associated error. + + require("luac/concat") + + open("./luac/concat.so", O_RDONLY|O_CLOEXEC) = 4 + + [ALERT] (22806) : parsing [commonstats.conf:15] : lua runtime + error: error loading module 'luac/concat' from file './luac/concat.so': + ./luac/concat.so: undefined symbol: luaopen_luac/concat + +Lua tries to load the C symbol 'luaopen_luac/concat'. When Lua tries to open a +library, it tries to execute the function associated to the symbol +"luaopen_<libname>". + +The variable "<libname>" is defined using the content of the variable +"package.cpath" and/or "package.path". The default definition of the +"package.cpath" (on my computer is ) variable is: + + /usr/local/lib/lua/5.3/?.so;/usr/local/lib/lua/5.3/loadall.so;./?.so + +The "<libname>" is the content which replaces the symbol "<?>". In the previous +example, its "luac/concat", and obviously the Lua core try to load the function +associated with the symbol "luaopen_luac/concat". + +My conclusion is that Lua doesn't support subdirectories. So, for loading +libraries in subdirectory, it must fill the variable with the name of this +subdirectory. The extension .so must disappear, otherwise Lua try to execute the +function associated with the symbol "luaopen_concat.so". The following syntax is +correct: + + package.cpath = package.cpath .. ";./luac/?.so" + require("concat") + +First useful example +-------------------- + + core.register_fetches("my-hash", function(txn, salt) + return txn.sc:sdbm(salt .. txn.sf:req_fhdr("host") .. txn.sf:path() .. txn.sf:src(), 1) + end) + +You will see that these 3 lines can generate a lot of explanations :) + +Core.register_fetches() is executed during the processing of the global section +by the HAProxy configuration parser. A new sample fetch is declared with name +"my-hash", this name is always prefixed by "lua.". So this new declared +sample fetch will be used calling "lua.my-hash" in the HAProxy configuration +file. + +The second parameter is an inline declared anonymous function. Note the closed +parenthesis after the keyword "end" which ends the function. The first parameter +of this anonymous function is "txn". It is an object of class TXN. It provides +access functions. The second parameter is an arbitrary value provided by the +HAProxy configuration file. This parameter is optional, the developer must +check if it is present. + +The anonymous function registration is executed when the HAProxy backend or +frontend configuration references the sample fetch "lua.my-hash". + +This example can be written with another style, like below: + + function my_hash(txn, salt) + return txn.sc:sdbm(salt .. txn.sf:req_fhdr("host") .. txn.sf:path() .. txn.sf:src(), 1) + end + + core.register_fetches("my-hash", my_hash) + +This second form is clearer, but the first one is compact. + +The operator ".." is a string concatenation. If one of the two operands is not a +string, an error occurs and the execution is immediately stopped. This is +important to keep in mind for the following things. + +Now I write the example on more than one line. Its an easiest way for commenting +the code: + + 1. function my_hash(txn, salt) + 2. local str = "" + 3. str = str .. salt + 4. str = str .. txn.sf:req_fhdr("host") + 5. str = str .. txn.sf:path() + 6. str = str .. txn.sf:src() + 7. local result = txn.sc:sdbm(str, 1) + 8. return result + 9. end + 10. + 11. core.register_fetches("my-hash", my_hash) + +local +~~~~~ + +The first keyword is "local". This is a really important keyword. You must +understand that the function "my_hash" will be called for each HAProxy request +using the declared sample fetch. So, this function can be executed many times in +parallel. + +By default, Lua uses global variables. So in this example, if the variable "str" +is declared without the keyword "local", it will be shared by all the parallel +executions of the function and obviously, the content of the requests will be +shared. + +This warning is very important. I tried to write useful Lua code like a rewrite +of the statistics page, and it is very hard thing to declare each variable as +"local". + +I guess that this behaviour will be the cause of many troubles on the mailing +list. + +str = str .. +~~~~~~~~~~~~ + +Now a parenthesis about the form "str = str ..". This form allows to do string +concatenations. Remember that Lua uses a garbage collector, so what happens when +we do "str = str .. 'another string'" ? + + str = str .. "another string" + ^ ^ ^ ^ + 1 2 3 4 + +Lua executes first the concatenation operator (3), it allocates memory for the +resulting string and fill this memory with the concatenation of the operands 2 +and 4. Next, it frees the variable 1, now the old content of 1 can be garbage +collected. And finally, the new content of 1 is the concatenation. + +what the matter ? when we do this operation many times, we consume a lot of +memory, and the string data is duplicated and move many times. So, this practice +is expensive in execution time and memory consumption. + +There are easy ways to prevent this behaviour. I guess that a C binding for +concatenation with chunks will be available ASAP (it is already written). I do +some benchmarks. I compare the execution time of 1 000 times, 1 000 +concatenation of 10 bytes written in pure Lua and with a C library. The result is +10 times faster in C (1s in Lua, and 0.1s in C). + +txn +~~~ + +txn is an HAProxy object of class TXN. The documentation is available in the +HAProxy Lua API reference. This class allow the access to the native HAProxy +sample fetches and converters. The object txn contains 2 members dedicated to +the sample fetches and 2 members dedicated to the converters. + +The sample fetches members are "f" (as sample-Fetch) and "sf" (as String +sample-Fetch). These two members contain exactly the same functions. All the +HAProxy native sample fetches are available, obviously, the Lua registered sample +fetches are not available. Unfortunately, HAProxy sample fetches names are not +compatible with the Lua function names, and they are renamed. The rename +convention is simple, we replace all the '.', '+' and '-' by '_'. The '.' is the +object member separator, and the "-" and "+" is math operator. + +Now, that I'm writing this article, I know the Lua better than I wrote the +sample-fetches wrapper. The original HAProxy sample-fetches name should be used +using alternative manner to call an object member, so the sample-fetch +"req.fhdr" (actually renamed req_fhdr") should be used like this: + + txn.f["req.fhdr"](txn.f, ...) + +However, I think that this form is not elegant. + +The "s" collection return a data with a type near to the original returned type. +A string returns an Lua string, an integer returns an Lua integer and an IP +address returns an Lua string. Sometime the data is not or not yet available, in +this case it returns the Lua nil value. + +The "sf" collection guarantees that a string will be always returned. If the data +is not available, an empty string is returned. The main usage of these collection +is to concatenate the returned sample-fetches without testing each function. + +The parameters of the sample-fetches are according with the HAProxy +documentation. + +The converters run exactly with the same manner as the sample fetches. The +only one difference is that the first parameter is the converter entry element. +The "c" collection returns a precise result, and the "sc" collection returns +always a string. + +The sample-fetches used in the example function are "txn.sf:req_fhdr()", +"txn.sf:path()" and "txn.sf:src()". The converter is "txn.sc:sdbm()". The same +function with the "s" collection of sample-fetches and the "c" collection of +converter should be written like this: + + 1. function my_hash(txn, salt) + 2. local str = "" + 3. str = str .. salt + 4. str = str .. tostring(txn.f:req_fhdr("host")) + 5. str = str .. tostring(txn.f:path()) + 6. str = str .. tostring(txn.f:src()) + 7. local result = tostring(txn.c:sdbm(str, 1)) + 8. return result + 9. end + 10. + 11. core.register_fetches("my-hash", my_hash) + +tostring +~~~~~~~~ + +The function tostring ensures that its parameter is returned as a string. If the +parameter is a table or a thread or anything that will not have any sense as a +string, a form like the typename followed by a pointer is returned. For example: + + t = {} + print(tostring(t)) + +returns: + + table: 0x15facc0 + +For objects, if the special function __tostring() is registered in the attached +metatable, it will be called with the table itself as first argument. The +HAProxy object returns its own type. + +About the converters entry point +-------------------------------- + +In HAProxy, a converter is a stateless function that takes a data as entry and +returns a transformation of this data as output. In Lua it is exactly the same +behaviour. + +So, the registered Lua function doesn't have any special parameters, just a +variable as input which contains the value to convert, and it must return data. + +The data required as input by the Lua converter is a string. So HAProxy will +always provide a string as input. If the native sample fetch is not a string it +will be converted in best effort. + +The returned value will have anything type, it will be converted as sample of +the near HAProxy type. The conversion rules from Lua variables to HAProxy +samples are: + + Lua | HAProxy sample types + -----------+--------------------- + "number" | "sint" + "boolean" | "bool" + "string" | "str" + "userdata" | "bool" (false) + "nil" | "bool" (false) + "table" | "bool" (false) + "function" | "bool" (false) + "thread" | "bool" (false) + +The function used for registering a converter is: + + core.register_converters() + +The task entry point +-------------------- + +The function "core.register_task(fcn)" executes once the function "fcn" when the +scheduler starts. This way is used for executing background task. For example, +you can use this functionality for periodically checking the health of another +service, and giving the result to each proxy needing it. + +The task is started once, if you want periodic actions, you can use the +"core.sleep()" or "core.msleep()" for waiting the next runtime. + +Storing Lua variable between function in the same session +--------------------------------------------------------- + +All the functions registered as action or sample fetch can share an Lua context. +This context is a memory zone in the stack. sample fetch and action use the +same stack, so both can access to the context. + +The context is accessible via the function get_priv and set_priv provided by an +object of class TXN. The value given to set_priv replaces the current stored +value. This value can be a table, it is useful if a lot of data can be shared. + +If the value stored is a table, you can add or remove entries from the table +without storing again the new table. Maybe an example will be clearer: + + local t = {} + txn:set_priv(t) + + t["entry1"] = "foo" + t["entry2"] = "bar" + + -- this will display "foo" + print(txn:get_priv()["entry1"]) + +HTTP actions +============ + + ... coming soon ... + +Lua is fast, but my service require more execution speed +======================================================== + +We can write C modules for Lua. These modules must run with HAProxy while they +are compliant with the HAProxy Lua version. A simple example is the "concat" +module. + +It is very easy to write and compile a C Lua library, however, I don't see +documentation about this process. So the current chapter is a quick howto. + +The entry point +--------------- + +The entry point is called "luaopen_<name>", where <name> is the name of the ".so" +file. An hello world is like this: + + #include <stdio.h> + #include <lua.h> + #include <lauxlib.h> + + int luaopen_mymod(lua_State *L) + { + printf("Hello world\n"); + return 0; + } + +The build +--------- + +The compilation of the source file requires the Lua "include" directory. The +compilation and the link of the object file requires the -fPIC option. That's +all. + + cc -I/opt/lua/include -fPIC -shared -o mymod.so mymod.c + +Usage +----- + +You can load this module with the following Lua syntax: + + require("mymod") + +When you start HAProxy, this module just print "Hello world" when it is loaded. +Please, remember that HAProxy doesn't allow blocking method, so if you write a +function doing filesystem access or synchronous network access, all the HAProxy +process will fail. diff --git a/doc/management.txt b/doc/management.txt new file mode 100644 index 0000000..dfbcca2 --- /dev/null +++ b/doc/management.txt @@ -0,0 +1,4203 @@ + ------------------------ + HAProxy Management Guide + ------------------------ + version 2.6 + + +This document describes how to start, stop, manage, and troubleshoot HAProxy, +as well as some known limitations and traps to avoid. It does not describe how +to configure it (for this please read configuration.txt). + +Note to documentation contributors : + This document is formatted with 80 columns per line, with even number of + spaces for indentation and without tabs. Please follow these rules strictly + so that it remains easily printable everywhere. If you add sections, please + update the summary below for easier searching. + + +Summary +------- + +1. Prerequisites +2. Quick reminder about HAProxy's architecture +3. Starting HAProxy +4. Stopping and restarting HAProxy +5. File-descriptor limitations +6. Memory management +7. CPU usage +8. Logging +9. Statistics and monitoring +9.1. CSV format +9.2. Typed output format +9.3. Unix Socket commands +9.4. Master CLI +9.4.1. Master CLI commands +10. Tricks for easier configuration management +11. Well-known traps to avoid +12. Debugging and performance issues +13. Security considerations + + +1. Prerequisites +---------------- + +In this document it is assumed that the reader has sufficient administration +skills on a UNIX-like operating system, uses the shell on a daily basis and is +familiar with troubleshooting utilities such as strace and tcpdump. + + +2. Quick reminder about HAProxy's architecture +---------------------------------------------- + +HAProxy is a multi-threaded, event-driven, non-blocking daemon. This means is +uses event multiplexing to schedule all of its activities instead of relying on +the system to schedule between multiple activities. Most of the time it runs as +a single process, so the output of "ps aux" on a system will report only one +"haproxy" process, unless a soft reload is in progress and an older process is +finishing its job in parallel to the new one. It is thus always easy to trace +its activity using the strace utility. In order to scale with the number of +available processors, by default haproxy will start one worker thread per +processor it is allowed to run on. Unless explicitly configured differently, +the incoming traffic is spread over all these threads, all running the same +event loop. A great care is taken to limit inter-thread dependencies to the +strict minimum, so as to try to achieve near-linear scalability. This has some +impacts such as the fact that a given connection is served by a single thread. +Thus in order to use all available processing capacity, it is needed to have at +least as many connections as there are threads, which is almost always granted. + +HAProxy is designed to isolate itself into a chroot jail during startup, where +it cannot perform any file-system access at all. This is also true for the +libraries it depends on (eg: libc, libssl, etc). The immediate effect is that +a running process will not be able to reload a configuration file to apply +changes, instead a new process will be started using the updated configuration +file. Some other less obvious effects are that some timezone files or resolver +files the libc might attempt to access at run time will not be found, though +this should generally not happen as they're not needed after startup. A nice +consequence of this principle is that the HAProxy process is totally stateless, +and no cleanup is needed after it's killed, so any killing method that works +will do the right thing. + +HAProxy doesn't write log files, but it relies on the standard syslog protocol +to send logs to a remote server (which is often located on the same system). + +HAProxy uses its internal clock to enforce timeouts, that is derived from the +system's time but where unexpected drift is corrected. This is done by limiting +the time spent waiting in poll() for an event, and measuring the time it really +took. In practice it never waits more than one second. This explains why, when +running strace over a completely idle process, periodic calls to poll() (or any +of its variants) surrounded by two gettimeofday() calls are noticed. They are +normal, completely harmless and so cheap that the load they imply is totally +undetectable at the system scale, so there's nothing abnormal there. Example : + + 16:35:40.002320 gettimeofday({1442759740, 2605}, NULL) = 0 + 16:35:40.002942 epoll_wait(0, {}, 200, 1000) = 0 + 16:35:41.007542 gettimeofday({1442759741, 7641}, NULL) = 0 + 16:35:41.007998 gettimeofday({1442759741, 8114}, NULL) = 0 + 16:35:41.008391 epoll_wait(0, {}, 200, 1000) = 0 + 16:35:42.011313 gettimeofday({1442759742, 11411}, NULL) = 0 + +HAProxy is a TCP proxy, not a router. It deals with established connections that +have been validated by the kernel, and not with packets of any form nor with +sockets in other states (eg: no SYN_RECV nor TIME_WAIT), though their existence +may prevent it from binding a port. It relies on the system to accept incoming +connections and to initiate outgoing connections. An immediate effect of this is +that there is no relation between packets observed on the two sides of a +forwarded connection, which can be of different size, numbers and even family. +Since a connection may only be accepted from a socket in LISTEN state, all the +sockets it is listening to are necessarily visible using the "netstat" utility +to show listening sockets. Example : + + # netstat -ltnp + Active Internet connections (only servers) + Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name + tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1629/sshd + tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 2847/haproxy + tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 2847/haproxy + + +3. Starting HAProxy +------------------- + +HAProxy is started by invoking the "haproxy" program with a number of arguments +passed on the command line. The actual syntax is : + + $ haproxy [<options>]* + +where [<options>]* is any number of options. An option always starts with '-' +followed by one of more letters, and possibly followed by one or multiple extra +arguments. Without any option, HAProxy displays the help page with a reminder +about supported options. Available options may vary slightly based on the +operating system. A fair number of these options overlap with an equivalent one +if the "global" section. In this case, the command line always has precedence +over the configuration file, so that the command line can be used to quickly +enforce some settings without touching the configuration files. The current +list of options is : + + -- <cfgfile>* : all the arguments following "--" are paths to configuration + file/directory to be loaded and processed in the declaration order. It is + mostly useful when relying on the shell to load many files that are + numerically ordered. See also "-f". The difference between "--" and "-f" is + that one "-f" must be placed before each file name, while a single "--" is + needed before all file names. Both options can be used together, the + command line ordering still applies. When more than one file is specified, + each file must start on a section boundary, so the first keyword of each + file must be one of "global", "defaults", "peers", "listen", "frontend", + "backend", and so on. A file cannot contain just a server list for example. + + -f <cfgfile|cfgdir> : adds <cfgfile> to the list of configuration files to be + loaded. If <cfgdir> is a directory, all the files (and only files) it + contains are added in lexical order (using LC_COLLATE=C) to the list of + configuration files to be loaded ; only files with ".cfg" extension are + added, only non hidden files (not prefixed with ".") are added. + Configuration files are loaded and processed in their declaration order. + This option may be specified multiple times to load multiple files. See + also "--". The difference between "--" and "-f" is that one "-f" must be + placed before each file name, while a single "--" is needed before all file + names. Both options can be used together, the command line ordering still + applies. When more than one file is specified, each file must start on a + section boundary, so the first keyword of each file must be one of + "global", "defaults", "peers", "listen", "frontend", "backend", and so on. + A file cannot contain just a server list for example. + + -C <dir> : changes to directory <dir> before loading configuration + files. This is useful when using relative paths. Warning when using + wildcards after "--" which are in fact replaced by the shell before + starting haproxy. + + -D : start as a daemon. The process detaches from the current terminal after + forking, and errors are not reported anymore in the terminal. It is + equivalent to the "daemon" keyword in the "global" section of the + configuration. It is recommended to always force it in any init script so + that a faulty configuration doesn't prevent the system from booting. + + -L <name> : change the local peer name to <name>, which defaults to the local + hostname. This is used only with peers replication. You can use the + variable $HAPROXY_LOCALPEER in the configuration file to reference the + peer name. + + -N <limit> : sets the default per-proxy maxconn to <limit> instead of the + builtin default value (usually 2000). Only useful for debugging. + + -V : enable verbose mode (disables quiet mode). Reverts the effect of "-q" or + "quiet". + + -W : master-worker mode. It is equivalent to the "master-worker" keyword in + the "global" section of the configuration. This mode will launch a "master" + which will monitor the "workers". Using this mode, you can reload HAProxy + directly by sending a SIGUSR2 signal to the master. The master-worker mode + is compatible either with the foreground or daemon mode. It is + recommended to use this mode with multiprocess and systemd. + + -Ws : master-worker mode with support of `notify` type of systemd service. + This option is only available when HAProxy was built with `USE_SYSTEMD` + build option enabled. + + -c : only performs a check of the configuration files and exits before trying + to bind. The exit status is zero if everything is OK, or non-zero if an + error is encountered. Presence of warnings will be reported if any. + + -cc : evaluates a condition as used within a conditional block of the + configuration. The exit status is zero if the condition is true, 1 if the + condition is false or 2 if an error is encountered. + + -d : enable debug mode. This disables daemon mode, forces the process to stay + in foreground and to show incoming and outgoing events. It must never be + used in an init script. + + -dD : enable diagnostic mode. This mode will output extra warnings about + suspicious configuration statements. This will never prevent startup even in + "zero-warning" mode nor change the exit status code. + + -dG : disable use of getaddrinfo() to resolve host names into addresses. It + can be used when suspecting that getaddrinfo() doesn't work as expected. + This option was made available because many bogus implementations of + getaddrinfo() exist on various systems and cause anomalies that are + difficult to troubleshoot. + + -dK<class[,class]*> : dumps the list of registered keywords in each class. + The list of classes is available with "-dKhelp". All classes may be dumped + using "-dKall", otherwise a selection of those shown in the help can be + specified as a comma-delimited list. The output format will vary depending + on what class of keywords is being dumped (e.g. "cfg" will show the known + configuration keywords in a format resembling the config file format while + "smp" will show sample fetch functions prefixed with a compatibility matrix + with each rule set). These may rarely be used as-is by humans but can be of + great help for external tools that try to detect the appearance of new + keywords at certain places to automatically update some documentation, + syntax highlighting files, configuration parsers, API etc. The output + format may evolve a bit over time so it is really recommended to use this + output mostly to detect differences with previous archives. Note that not + all keywords are listed because many keywords have existed long before the + different keyword registration subsystems were created, and they do not + appear there. However since new keywords are only added via the modern + mechanisms, it's reasonably safe to assume that this output may be used to + detect language additions with a good accuracy. The keywords are only + dumped after the configuration is fully parsed, so that even dynamically + created keywords can be dumped. A good way to dump and exit is to run a + silent config check on an existing configuration: + + ./haproxy -dKall -q -c -f foo.cfg + + If no configuration file is available, using "-f /dev/null" will work as + well to dump all default keywords, but then the return status will not be + zero since there will be no listener, and will have to be ignored. + + -dL : dumps the list of dynamic shared libraries that are loaded at the end + of the config processing. This will generally also include deep dependencies + such as anything loaded from Lua code for example, as well as the executable + itself. The list is printed in a format that ought to be easy enough to + sanitize to directly produce a tarball of all dependencies. Since it doesn't + stop the program's startup, it is recommended to only use it in combination + with "-c" and "-q" where only the list of loaded objects will be displayed + (or nothing in case of error). In addition, keep in mind that when providing + such a package to help with a core file analysis, most libraries are in fact + symbolic links that need to be dereferenced when creating the archive: + + ./haproxy -W -q -c -dL -f foo.cfg | tar -T - -hzcf archive.tgz + + -dM[<byte>[,]][help|options,...] : forces memory poisoning, and/or changes + memory other debugging options. Memory poisonning means that each and every + memory region allocated with malloc() or pool_alloc() will be filled with + <byte> before being passed to the caller. When <byte> is not specified, it + defaults to 0x50 ('P'). While this slightly slows down operations, it is + useful to reliably trigger issues resulting from missing initializations in + the code that cause random crashes. Note that -dM0 has the effect of + turning any malloc() into a calloc(). In any case if a bug appears or + disappears when using this option it means there is a bug in haproxy, so + please report it. A number of other options are available either alone or + after a comma following the byte. The special option "help" will list the + currently supported options and their current value. Each debugging option + may be forced on or off. The most optimal options are usually chosen at + build time based on the operating system and do not need to be adjusted, + unless suggested by a developer. Supported debugging options include + (set/clear): + - fail / no-fail: + This enables randomly failing memory allocations, in conjunction with + the global "tune.fail-alloc" setting. This is used to detect missing + error checks in the code. + + - no-merge / merge: + By default, pools of very similar sizes are merged, resulting in more + efficiency, but this complicates the analysis of certain memory dumps. + This option allows to disable this mechanism, and may slightly increase + the memory usage. + + - cold-first / hot-first: + In order to optimize the CPU cache hit ratio, by default the most + recently released objects ("hot") are recycled for new allocations. + But doing so also complicates analysis of memory dumps and may hide + use-after-free bugs. This option allows to instead pick the coldest + objects first, which may result in a slight increase of CPU usage. + + - integrity / no-integrity: + When this option is enabled, memory integrity checks are enabled on + the allocated area to verify that it hasn't been modified since it was + last released. This works best with "no-merge", "cold-first" and "tag". + Enabling this option will slightly increase the CPU usage. + + - no-global / global: + Depending on the operating system, a process-wide global memory cache + may be enabled if it is estimated that the standard allocator is too + slow or inefficient with threads. This option allows to forcefully + disable it or enable it. Disabling it may result in a CPU usage + increase with inefficient allocators. Enabling it may result in a + higher memory usage with efficient allocators. + + - no-cache / cache: + Each thread uses a very fast local object cache for allocations, which + is always enabled by default. This option allows to disable it. Since + the global cache also passes via the local caches, this will + effectively result in disabling all caches and allocating directly from + the default allocator. This may result in a significant increase of CPU + usage, but may also result in small memory savings on tiny systems. + + - caller / no-caller: + Enabling this option reserves some extra space in each allocated object + to store the address of the last caller that allocated or released it. + This helps developers go back in time when analysing memory dumps and + to guess how something unexpected happened. + + - tag / no-tag: + Enabling this option reserves some extra space in each allocated object + to store a tag that allows to detect bugs such as double-free, freeing + an invalid object, and buffer overflows. It offers much stronger + reliability guarantees at the expense of 4 or 8 extra bytes per + allocation. It usually is the first step to detect memory corruption. + + - poison / no-poison: + Enabling this option will fill allocated objects with a fixed pattern + that will make sure that some accidental values such as 0 will not be + present if a newly added field was mistakenly forgotten in an + initialization routine. Such bugs tend to rarely reproduce, especially + when pools are not merged. This is normally enabled by directly passing + the byte's value to -dM but using this option allows to disable/enable + use of a previously set value. + + -dS : disable use of the splice() system call. It is equivalent to the + "global" section's "nosplice" keyword. This may be used when splice() is + suspected to behave improperly or to cause performance issues, or when + using strace to see the forwarded data (which do not appear when using + splice()). + + -dV : disable SSL verify on the server side. It is equivalent to having + "ssl-server-verify none" in the "global" section. This is useful when + trying to reproduce production issues out of the production + environment. Never use this in an init script as it degrades SSL security + to the servers. + + -dW : if set, haproxy will refuse to start if any warning was emitted while + processing the configuration. This helps detect subtle mistakes and keep the + configuration clean and portable across versions. It is recommended to set + this option in service scripts when configurations are managed by humans, + but it is recommended not to use it with generated configurations, which + tend to emit more warnings. It may be combined with "-c" to cause warnings + in checked configurations to fail. This is equivalent to global option + "zero-warning". + + -db : disable background mode and multi-process mode. The process remains in + foreground. It is mainly used during development or during small tests, as + Ctrl-C is enough to stop the process. Never use it in an init script. + + -de : disable the use of the "epoll" poller. It is equivalent to the "global" + section's keyword "noepoll". It is mostly useful when suspecting a bug + related to this poller. On systems supporting epoll, the fallback will + generally be the "poll" poller. + + -dk : disable the use of the "kqueue" poller. It is equivalent to the + "global" section's keyword "nokqueue". It is mostly useful when suspecting + a bug related to this poller. On systems supporting kqueue, the fallback + will generally be the "poll" poller. + + -dp : disable the use of the "poll" poller. It is equivalent to the "global" + section's keyword "nopoll". It is mostly useful when suspecting a bug + related to this poller. On systems supporting poll, the fallback will + generally be the "select" poller, which cannot be disabled and is limited + to 1024 file descriptors. + + -dr : ignore server address resolution failures. It is very common when + validating a configuration out of production not to have access to the same + resolvers and to fail on server address resolution, making it difficult to + test a configuration. This option simply appends the "none" method to the + list of address resolution methods for all servers, ensuring that even if + the libc fails to resolve an address, the startup sequence is not + interrupted. + + -m <limit> : limit the total allocatable memory to <limit> megabytes across + all processes. This may cause some connection refusals or some slowdowns + depending on the amount of memory needed for normal operations. This is + mostly used to force the processes to work in a constrained resource usage + scenario. It is important to note that the memory is not shared between + processes, so in a multi-process scenario, this value is first divided by + global.nbproc before forking. + + -n <limit> : limits the per-process connection limit to <limit>. This is + equivalent to the global section's keyword "maxconn". It has precedence + over this keyword. This may be used to quickly force lower limits to avoid + a service outage on systems where resource limits are too low. + + -p <file> : write all processes' pids into <file> during startup. This is + equivalent to the "global" section's keyword "pidfile". The file is opened + before entering the chroot jail, and after doing the chdir() implied by + "-C". Each pid appears on its own line. + + -q : set "quiet" mode. This disables some messages during the configuration + parsing and during startup. It can be used in combination with "-c" to + just check if a configuration file is valid or not. + + -S <bind>[,bind_options...]: in master-worker mode, bind a master CLI, which + allows the access to every processes, running or leaving ones. + For security reasons, it is recommended to bind the master CLI to a local + UNIX socket. The bind options are the same as the keyword "bind" in + the configuration file with words separated by commas instead of spaces. + + Note that this socket can't be used to retrieve the listening sockets from + an old process during a seamless reload. + + -sf <pid>* : send the "finish" signal (SIGUSR1) to older processes after boot + completion to ask them to finish what they are doing and to leave. <pid> + is a list of pids to signal (one per argument). The list ends on any + option starting with a "-". It is not a problem if the list of pids is + empty, so that it can be built on the fly based on the result of a command + like "pidof" or "pgrep". QUIC connections will be aborted. + + -st <pid>* : send the "terminate" signal (SIGTERM) to older processes after + boot completion to terminate them immediately without finishing what they + were doing. <pid> is a list of pids to signal (one per argument). The list + is ends on any option starting with a "-". It is not a problem if the list + of pids is empty, so that it can be built on the fly based on the result of + a command like "pidof" or "pgrep". + + -v : report the version and build date. + + -vv : display the version, build options, libraries versions and usable + pollers. This output is systematically requested when filing a bug report. + + -x <unix_socket> : connect to the specified socket and try to retrieve any + listening sockets from the old process, and use them instead of trying to + bind new ones. This is useful to avoid missing any new connection when + reloading the configuration on Linux. The capability must be enable on the + stats socket using "expose-fd listeners" in your configuration. + In master-worker mode, the master will use this option upon a reload with + the "sockpair@" syntax, which allows the master to connect directly to a + worker without using stats socket declared in the configuration. + +A safe way to start HAProxy from an init file consists in forcing the daemon +mode, storing existing pids to a pid file and using this pid file to notify +older processes to finish before leaving : + + haproxy -f /etc/haproxy.cfg \ + -D -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid) + +When the configuration is split into a few specific files (eg: tcp vs http), +it is recommended to use the "-f" option : + + haproxy -f /etc/haproxy/global.cfg -f /etc/haproxy/stats.cfg \ + -f /etc/haproxy/default-tcp.cfg -f /etc/haproxy/tcp.cfg \ + -f /etc/haproxy/default-http.cfg -f /etc/haproxy/http.cfg \ + -D -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid) + +When an unknown number of files is expected, such as customer-specific files, +it is recommended to assign them a name starting with a fixed-size sequence +number and to use "--" to load them, possibly after loading some defaults : + + haproxy -f /etc/haproxy/global.cfg -f /etc/haproxy/stats.cfg \ + -f /etc/haproxy/default-tcp.cfg -f /etc/haproxy/tcp.cfg \ + -f /etc/haproxy/default-http.cfg -f /etc/haproxy/http.cfg \ + -D -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid) \ + -f /etc/haproxy/default-customers.cfg -- /etc/haproxy/customers/* + +Sometimes a failure to start may happen for whatever reason. Then it is +important to verify if the version of HAProxy you are invoking is the expected +version and if it supports the features you are expecting (eg: SSL, PCRE, +compression, Lua, etc). This can be verified using "haproxy -vv". Some +important information such as certain build options, the target system and +the versions of the libraries being used are reported there. It is also what +you will systematically be asked for when posting a bug report : + + $ haproxy -vv + HAProxy version 1.6-dev7-a088d3-4 2015/10/08 + Copyright 2000-2015 Willy Tarreau <willy@haproxy.org> + + Build options : + TARGET = linux2628 + CPU = generic + CC = gcc + CFLAGS = -pg -O0 -g -fno-strict-aliasing -Wdeclaration-after-statement \ + -DBUFSIZE=8030 -DMAXREWRITE=1030 -DSO_MARK=36 -DTCP_REPAIR=19 + OPTIONS = USE_ZLIB=1 USE_DLMALLOC=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1 + + Default settings : + maxconn = 2000, bufsize = 8030, maxrewrite = 1030, maxpollevents = 200 + + Encrypted password support via crypt(3): yes + Built with zlib version : 1.2.6 + Compression algorithms supported : identity("identity"), deflate("deflate"), \ + raw-deflate("deflate"), gzip("gzip") + Built with OpenSSL version : OpenSSL 1.0.1o 12 Jun 2015 + Running on OpenSSL version : OpenSSL 1.0.1o 12 Jun 2015 + OpenSSL library supports TLS extensions : yes + OpenSSL library supports SNI : yes + OpenSSL library supports prefer-server-ciphers : yes + Built with PCRE version : 8.12 2011-01-15 + PCRE library supports JIT : no (USE_PCRE_JIT not set) + Built with Lua version : Lua 5.3.1 + Built with transparent proxy support using: IP_TRANSPARENT IP_FREEBIND + + Available polling systems : + epoll : pref=300, test result OK + poll : pref=200, test result OK + select : pref=150, test result OK + Total: 3 (3 usable), will use epoll. + +The relevant information that many non-developer users can verify here are : + - the version : 1.6-dev7-a088d3-4 above means the code is currently at commit + ID "a088d3" which is the 4th one after after official version "1.6-dev7". + Version 1.6-dev7 would show as "1.6-dev7-8c1ad7". What matters here is in + fact "1.6-dev7". This is the 7th development version of what will become + version 1.6 in the future. A development version not suitable for use in + production (unless you know exactly what you are doing). A stable version + will show as a 3-numbers version, such as "1.5.14-16f863", indicating the + 14th level of fix on top of version 1.5. This is a production-ready version. + + - the release date : 2015/10/08. It is represented in the universal + year/month/day format. Here this means August 8th, 2015. Given that stable + releases are issued every few months (1-2 months at the beginning, sometimes + 6 months once the product becomes very stable), if you're seeing an old date + here, it means you're probably affected by a number of bugs or security + issues that have since been fixed and that it might be worth checking on the + official site. + + - build options : they are relevant to people who build their packages + themselves, they can explain why things are not behaving as expected. For + example the development version above was built for Linux 2.6.28 or later, + targeting a generic CPU (no CPU-specific optimizations), and lacks any + code optimization (-O0) so it will perform poorly in terms of performance. + + - libraries versions : zlib version is reported as found in the library + itself. In general zlib is considered a very stable product and upgrades + are almost never needed. OpenSSL reports two versions, the version used at + build time and the one being used, as found on the system. These ones may + differ by the last letter but never by the numbers. The build date is also + reported because most OpenSSL bugs are security issues and need to be taken + seriously, so this library absolutely needs to be kept up to date. Seeing a + 4-months old version here is highly suspicious and indeed an update was + missed. PCRE provides very fast regular expressions and is highly + recommended. Certain of its extensions such as JIT are not present in all + versions and still young so some people prefer not to build with them, + which is why the build status is reported as well. Regarding the Lua + scripting language, HAProxy expects version 5.3 which is very young since + it was released a little time before HAProxy 1.6. It is important to check + on the Lua web site if some fixes are proposed for this branch. + + - Available polling systems will affect the process's scalability when + dealing with more than about one thousand of concurrent connections. These + ones are only available when the correct system was indicated in the TARGET + variable during the build. The "epoll" mechanism is highly recommended on + Linux, and the kqueue mechanism is highly recommended on BSD. Lacking them + will result in poll() or even select() being used, causing a high CPU usage + when dealing with a lot of connections. + + +4. Stopping and restarting HAProxy +---------------------------------- + +HAProxy supports a graceful and a hard stop. The hard stop is simple, when the +SIGTERM signal is sent to the haproxy process, it immediately quits and all +established connections are closed. The graceful stop is triggered when the +SIGUSR1 signal is sent to the haproxy process. It consists in only unbinding +from listening ports, but continue to process existing connections until they +close. Once the last connection is closed, the process leaves. + +The hard stop method is used for the "stop" or "restart" actions of the service +management script. The graceful stop is used for the "reload" action which +tries to seamlessly reload a new configuration in a new process. + +Both of these signals may be sent by the new haproxy process itself during a +reload or restart, so that they are sent at the latest possible moment and only +if absolutely required. This is what is performed by the "-st" (hard) and "-sf" +(graceful) options respectively. + +In master-worker mode, it is not needed to start a new haproxy process in +order to reload the configuration. The master process reacts to the SIGUSR2 +signal by reexecuting itself with the -sf parameter followed by the PIDs of +the workers. The master will then parse the configuration file and fork new +workers. + +To understand better how these signals are used, it is important to understand +the whole restart mechanism. + +First, an existing haproxy process is running. The administrator uses a system +specific command such as "/etc/init.d/haproxy reload" to indicate they want to +take the new configuration file into effect. What happens then is the following. +First, the service script (/etc/init.d/haproxy or equivalent) will verify that +the configuration file parses correctly using "haproxy -c". After that it will +try to start haproxy with this configuration file, using "-st" or "-sf". + +Then HAProxy tries to bind to all listening ports. If some fatal errors happen +(eg: address not present on the system, permission denied), the process quits +with an error. If a socket binding fails because a port is already in use, then +the process will first send a SIGTTOU signal to all the pids specified in the +"-st" or "-sf" pid list. This is what is called the "pause" signal. It instructs +all existing haproxy processes to temporarily stop listening to their ports so +that the new process can try to bind again. During this time, the old process +continues to process existing connections. If the binding still fails (because +for example a port is shared with another daemon), then the new process sends a +SIGTTIN signal to the old processes to instruct them to resume operations just +as if nothing happened. The old processes will then restart listening to the +ports and continue to accept connections. Note that this mechanism is system +dependent and some operating systems may not support it in multi-process mode. + +If the new process manages to bind correctly to all ports, then it sends either +the SIGTERM (hard stop in case of "-st") or the SIGUSR1 (graceful stop in case +of "-sf") to all processes to notify them that it is now in charge of operations +and that the old processes will have to leave, either immediately or once they +have finished their job. + +It is important to note that during this timeframe, there are two small windows +of a few milliseconds each where it is possible that a few connection failures +will be noticed during high loads. Typically observed failure rates are around +1 failure during a reload operation every 10000 new connections per second, +which means that a heavily loaded site running at 30000 new connections per +second may see about 3 failed connection upon every reload. The two situations +where this happens are : + + - if the new process fails to bind due to the presence of the old process, + it will first have to go through the SIGTTOU+SIGTTIN sequence, which + typically lasts about one millisecond for a few tens of frontends, and + during which some ports will not be bound to the old process and not yet + bound to the new one. HAProxy works around this on systems that support the + SO_REUSEPORT socket options, as it allows the new process to bind without + first asking the old one to unbind. Most BSD systems have been supporting + this almost forever. Linux has been supporting this in version 2.0 and + dropped it around 2.2, but some patches were floating around by then. It + was reintroduced in kernel 3.9, so if you are observing a connection + failure rate above the one mentioned above, please ensure that your kernel + is 3.9 or newer, or that relevant patches were backported to your kernel + (less likely). + + - when the old processes close the listening ports, the kernel may not always + redistribute any pending connection that was remaining in the socket's + backlog. Under high loads, a SYN packet may happen just before the socket + is closed, and will lead to an RST packet being sent to the client. In some + critical environments where even one drop is not acceptable, these ones are + sometimes dealt with using firewall rules to block SYN packets during the + reload, forcing the client to retransmit. This is totally system-dependent, + as some systems might be able to visit other listening queues and avoid + this RST. A second case concerns the ACK from the client on a local socket + that was in SYN_RECV state just before the close. This ACK will lead to an + RST packet while the haproxy process is still not aware of it. This one is + harder to get rid of, though the firewall filtering rules mentioned above + will work well if applied one second or so before restarting the process. + +For the vast majority of users, such drops will never ever happen since they +don't have enough load to trigger the race conditions. And for most high traffic +users, the failure rate is still fairly within the noise margin provided that at +least SO_REUSEPORT is properly supported on their systems. + +QUIC limitations: soft-stop is not supported. In case of reload, QUIC connections +will not be preserved. + +5. File-descriptor limitations +------------------------------ + +In order to ensure that all incoming connections will successfully be served, +HAProxy computes at load time the total number of file descriptors that will be +needed during the process's life. A regular Unix process is generally granted +1024 file descriptors by default, and a privileged process can raise this limit +itself. This is one reason for starting HAProxy as root and letting it adjust +the limit. The default limit of 1024 file descriptors roughly allow about 500 +concurrent connections to be processed. The computation is based on the global +maxconn parameter which limits the total number of connections per process, the +number of listeners, the number of servers which have a health check enabled, +the agent checks, the peers, the loggers and possibly a few other technical +requirements. A simple rough estimate of this number consists in simply +doubling the maxconn value and adding a few tens to get the approximate number +of file descriptors needed. + +Originally HAProxy did not know how to compute this value, and it was necessary +to pass the value using the "ulimit-n" setting in the global section. This +explains why even today a lot of configurations are seen with this setting +present. Unfortunately it was often miscalculated resulting in connection +failures when approaching maxconn instead of throttling incoming connection +while waiting for the needed resources. For this reason it is important to +remove any vestigial "ulimit-n" setting that can remain from very old versions. + +Raising the number of file descriptors to accept even moderate loads is +mandatory but comes with some OS-specific adjustments. First, the select() +polling system is limited to 1024 file descriptors. In fact on Linux it used +to be capable of handling more but since certain OS ship with excessively +restrictive SELinux policies forbidding the use of select() with more than +1024 file descriptors, HAProxy now refuses to start in this case in order to +avoid any issue at run time. On all supported operating systems, poll() is +available and will not suffer from this limitation. It is automatically picked +so there is nothing to do to get a working configuration. But poll's becomes +very slow when the number of file descriptors increases. While HAProxy does its +best to limit this performance impact (eg: via the use of the internal file +descriptor cache and batched processing), a good rule of thumb is that using +poll() with more than a thousand concurrent connections will use a lot of CPU. + +For Linux systems base on kernels 2.6 and above, the epoll() system call will +be used. It's a much more scalable mechanism relying on callbacks in the kernel +that guarantee a constant wake up time regardless of the number of registered +monitored file descriptors. It is automatically used where detected, provided +that HAProxy had been built for one of the Linux flavors. Its presence and +support can be verified using "haproxy -vv". + +For BSD systems which support it, kqueue() is available as an alternative. It +is much faster than poll() and even slightly faster than epoll() thanks to its +batched handling of changes. At least FreeBSD and OpenBSD support it. Just like +with Linux's epoll(), its support and availability are reported in the output +of "haproxy -vv". + +Having a good poller is one thing, but it is mandatory that the process can +reach the limits. When HAProxy starts, it immediately sets the new process's +file descriptor limits and verifies if it succeeds. In case of failure, it +reports it before forking so that the administrator can see the problem. As +long as the process is started by as root, there should be no reason for this +setting to fail. However, it can fail if the process is started by an +unprivileged user. If there is a compelling reason for *not* starting haproxy +as root (eg: started by end users, or by a per-application account), then the +file descriptor limit can be raised by the system administrator for this +specific user. The effectiveness of the setting can be verified by issuing +"ulimit -n" from the user's command line. It should reflect the new limit. + +Warning: when an unprivileged user's limits are changed in this user's account, +it is fairly common that these values are only considered when the user logs in +and not at all in some scripts run at system boot time nor in crontabs. This is +totally dependent on the operating system, keep in mind to check "ulimit -n" +before starting haproxy when running this way. The general advice is never to +start haproxy as an unprivileged user for production purposes. Another good +reason is that it prevents haproxy from enabling some security protections. + +Once it is certain that the system will allow the haproxy process to use the +requested number of file descriptors, two new system-specific limits may be +encountered. The first one is the system-wide file descriptor limit, which is +the total number of file descriptors opened on the system, covering all +processes. When this limit is reached, accept() or socket() will typically +return ENFILE. The second one is the per-process hard limit on the number of +file descriptors, it prevents setrlimit() from being set higher. Both are very +dependent on the operating system. On Linux, the system limit is set at boot +based on the amount of memory. It can be changed with the "fs.file-max" sysctl. +And the per-process hard limit is set to 1048576 by default, but it can be +changed using the "fs.nr_open" sysctl. + +File descriptor limitations may be observed on a running process when they are +set too low. The strace utility will report that accept() and socket() return +"-1 EMFILE" when the process's limits have been reached. In this case, simply +raising the "ulimit-n" value (or removing it) will solve the problem. If these +system calls return "-1 ENFILE" then it means that the kernel's limits have +been reached and that something must be done on a system-wide parameter. These +trouble must absolutely be addressed, as they result in high CPU usage (when +accept() fails) and failed connections that are generally visible to the user. +One solution also consists in lowering the global maxconn value to enforce +serialization, and possibly to disable HTTP keep-alive to force connections +to be released and reused faster. + + +6. Memory management +-------------------- + +HAProxy uses a simple and fast pool-based memory management. Since it relies on +a small number of different object types, it's much more efficient to pick new +objects from a pool which already contains objects of the appropriate size than +to call malloc() for each different size. The pools are organized as a stack or +LIFO, so that newly allocated objects are taken from recently released objects +still hot in the CPU caches. Pools of similar sizes are merged together, in +order to limit memory fragmentation. + +By default, since the focus is set on performance, each released object is put +back into the pool it came from, and allocated objects are never freed since +they are expected to be reused very soon. + +On the CLI, it is possible to check how memory is being used in pools thanks to +the "show pools" command : + + > show pools + Dumping pools usage. Use SIGQUIT to flush them. + - Pool cache_st (16 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users, @0x9ccc40=03 [SHARED] + - Pool pipe (32 bytes) : 5 allocated (160 bytes), 5 used, 0 failures, 2 users, @0x9ccac0=00 [SHARED] + - Pool comp_state (48 bytes) : 3 allocated (144 bytes), 3 used, 0 failures, 5 users, @0x9cccc0=04 [SHARED] + - Pool filter (64 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 3 users, @0x9ccbc0=02 [SHARED] + - Pool vars (80 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 2 users, @0x9ccb40=01 [SHARED] + - Pool uniqueid (128 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 2 users, @0x9cd240=15 [SHARED] + - Pool task (144 bytes) : 55 allocated (7920 bytes), 55 used, 0 failures, 1 users, @0x9cd040=11 [SHARED] + - Pool session (160 bytes) : 1 allocated (160 bytes), 1 used, 0 failures, 1 users, @0x9cd140=13 [SHARED] + - Pool h2s (208 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 2 users, @0x9ccec0=08 [SHARED] + - Pool h2c (288 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users, @0x9cce40=07 [SHARED] + - Pool spoe_ctx (304 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 2 users, @0x9ccf40=09 [SHARED] + - Pool connection (400 bytes) : 2 allocated (800 bytes), 2 used, 0 failures, 1 users, @0x9cd1c0=14 [SHARED] + - Pool hdr_idx (416 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users, @0x9cd340=17 [SHARED] + - Pool dns_resolut (480 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users, @0x9ccdc0=06 [SHARED] + - Pool dns_answer_ (576 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users, @0x9ccd40=05 [SHARED] + - Pool stream (960 bytes) : 1 allocated (960 bytes), 1 used, 0 failures, 1 users, @0x9cd0c0=12 [SHARED] + - Pool requri (1024 bytes) : 0 allocated (0 bytes), 0 used, 0 failures, 1 users, @0x9cd2c0=16 [SHARED] + - Pool buffer (8030 bytes) : 3 allocated (24090 bytes), 2 used, 0 failures, 1 users, @0x9cd3c0=18 [SHARED] + - Pool trash (8062 bytes) : 1 allocated (8062 bytes), 1 used, 0 failures, 1 users, @0x9cd440=19 + Total: 19 pools, 42296 bytes allocated, 34266 used. + +The pool name is only indicative, it's the name of the first object type using +this pool. The size in parenthesis is the object size for objects in this pool. +Object sizes are always rounded up to the closest multiple of 16 bytes. The +number of objects currently allocated and the equivalent number of bytes is +reported so that it is easy to know which pool is responsible for the highest +memory usage. The number of objects currently in use is reported as well in the +"used" field. The difference between "allocated" and "used" corresponds to the +objects that have been freed and are available for immediate use. The address +at the end of the line is the pool's address, and the following number is the +pool index when it exists, or is reported as -1 if no index was assigned. + +It is possible to limit the amount of memory allocated per process using the +"-m" command line option, followed by a number of megabytes. It covers all of +the process's addressable space, so that includes memory used by some libraries +as well as the stack, but it is a reliable limit when building a resource +constrained system. It works the same way as "ulimit -v" on systems which have +it, or "ulimit -d" for the other ones. + +If a memory allocation fails due to the memory limit being reached or because +the system doesn't have any enough memory, then haproxy will first start to +free all available objects from all pools before attempting to allocate memory +again. This mechanism of releasing unused memory can be triggered by sending +the signal SIGQUIT to the haproxy process. When doing so, the pools state prior +to the flush will also be reported to stderr when the process runs in +foreground. + +During a reload operation, the process switched to the graceful stop state also +automatically performs some flushes after releasing any connection so that all +possible memory is released to save it for the new process. + + +7. CPU usage +------------ + +HAProxy normally spends most of its time in the system and a smaller part in +userland. A finely tuned 3.5 GHz CPU can sustain a rate about 80000 end-to-end +connection setups and closes per second at 100% CPU on a single core. When one +core is saturated, typical figures are : + - 95% system, 5% user for long TCP connections or large HTTP objects + - 85% system and 15% user for short TCP connections or small HTTP objects in + close mode + - 70% system and 30% user for small HTTP objects in keep-alive mode + +The amount of rules processing and regular expressions will increase the user +land part. The presence of firewall rules, connection tracking, complex routing +tables in the system will instead increase the system part. + +On most systems, the CPU time observed during network transfers can be cut in 4 +parts : + - the interrupt part, which concerns all the processing performed upon I/O + receipt, before the target process is even known. Typically Rx packets are + accounted for in interrupt. On some systems such as Linux where interrupt + processing may be deferred to a dedicated thread, it can appear as softirq, + and the thread is called ksoftirqd/0 (for CPU 0). The CPU taking care of + this load is generally defined by the hardware settings, though in the case + of softirq it is often possible to remap the processing to another CPU. + This interrupt part will often be perceived as parasitic since it's not + associated with any process, but it actually is some processing being done + to prepare the work for the process. + + - the system part, which concerns all the processing done using kernel code + called from userland. System calls are accounted as system for example. All + synchronously delivered Tx packets will be accounted for as system time. If + some packets have to be deferred due to queues filling up, they may then be + processed in interrupt context later (eg: upon receipt of an ACK opening a + TCP window). + + - the user part, which exclusively runs application code in userland. HAProxy + runs exclusively in this part, though it makes heavy use of system calls. + Rules processing, regular expressions, compression, encryption all add to + the user portion of CPU consumption. + + - the idle part, which is what the CPU does when there is nothing to do. For + example HAProxy waits for an incoming connection, or waits for some data to + leave, meaning the system is waiting for an ACK from the client to push + these data. + +In practice regarding HAProxy's activity, it is in general reasonably accurate +(but totally inexact) to consider that interrupt/softirq are caused by Rx +processing in kernel drivers, that user-land is caused by layer 7 processing +in HAProxy, and that system time is caused by network processing on the Tx +path. + +Since HAProxy runs around an event loop, it waits for new events using poll() +(or any alternative) and processes all these events as fast as possible before +going back to poll() waiting for new events. It measures the time spent waiting +in poll() compared to the time spent doing processing events. The ratio of +polling time vs total time is called the "idle" time, it's the amount of time +spent waiting for something to happen. This ratio is reported in the stats page +on the "idle" line, or "Idle_pct" on the CLI. When it's close to 100%, it means +the load is extremely low. When it's close to 0%, it means that there is +constantly some activity. While it cannot be very accurate on an overloaded +system due to other processes possibly preempting the CPU from the haproxy +process, it still provides a good estimate about how HAProxy considers it is +working : if the load is low and the idle ratio is low as well, it may indicate +that HAProxy has a lot of work to do, possibly due to very expensive rules that +have to be processed. Conversely, if HAProxy indicates the idle is close to +100% while things are slow, it means that it cannot do anything to speed things +up because it is already waiting for incoming data to process. In the example +below, haproxy is completely idle : + + $ echo "show info" | socat - /var/run/haproxy.sock | grep ^Idle + Idle_pct: 100 + +When the idle ratio starts to become very low, it is important to tune the +system and place processes and interrupts correctly to save the most possible +CPU resources for all tasks. If a firewall is present, it may be worth trying +to disable it or to tune it to ensure it is not responsible for a large part +of the performance limitation. It's worth noting that unloading a stateful +firewall generally reduces both the amount of interrupt/softirq and of system +usage since such firewalls act both on the Rx and the Tx paths. On Linux, +unloading the nf_conntrack and ip_conntrack modules will show whether there is +anything to gain. If so, then the module runs with default settings and you'll +have to figure how to tune it for better performance. In general this consists +in considerably increasing the hash table size. On FreeBSD, "pfctl -d" will +disable the "pf" firewall and its stateful engine at the same time. + +If it is observed that a lot of time is spent in interrupt/softirq, it is +important to ensure that they don't run on the same CPU. Most systems tend to +pin the tasks on the CPU where they receive the network traffic because for +certain workloads it improves things. But with heavily network-bound workloads +it is the opposite as the haproxy process will have to fight against its kernel +counterpart. Pinning haproxy to one CPU core and the interrupts to another one, +all sharing the same L3 cache tends to sensibly increase network performance +because in practice the amount of work for haproxy and the network stack are +quite close, so they can almost fill an entire CPU each. On Linux this is done +using taskset (for haproxy) or using cpu-map (from the haproxy config), and the +interrupts are assigned under /proc/irq. Many network interfaces support +multiple queues and multiple interrupts. In general it helps to spread them +across a small number of CPU cores provided they all share the same L3 cache. +Please always stop irq_balance which always does the worst possible thing on +such workloads. + +For CPU-bound workloads consisting in a lot of SSL traffic or a lot of +compression, it may be worth using multiple processes dedicated to certain +tasks, though there is no universal rule here and experimentation will have to +be performed. + +In order to increase the CPU capacity, it is possible to make HAProxy run as +several processes, using the "nbproc" directive in the global section. There +are some limitations though : + - health checks are run per process, so the target servers will get as many + checks as there are running processes ; + - maxconn values and queues are per-process so the correct value must be set + to avoid overloading the servers ; + - outgoing connections should avoid using port ranges to avoid conflicts + - stick-tables are per process and are not shared between processes ; + - each peers section may only run on a single process at a time ; + - the CLI operations will only act on a single process at a time. + +With this in mind, it appears that the easiest setup often consists in having +one first layer running on multiple processes and in charge for the heavy +processing, passing the traffic to a second layer running in a single process. +This mechanism is suited to SSL and compression which are the two CPU-heavy +features. Instances can easily be chained over UNIX sockets (which are cheaper +than TCP sockets and which do not waste ports), and the proxy protocol which is +useful to pass client information to the next stage. When doing so, it is +generally a good idea to bind all the single-process tasks to process number 1 +and extra tasks to next processes, as this will make it easier to generate +similar configurations for different machines. + +On Linux versions 3.9 and above, running HAProxy in multi-process mode is much +more efficient when each process uses a distinct listening socket on the same +IP:port ; this will make the kernel evenly distribute the load across all +processes instead of waking them all up. Please check the "process" option of +the "bind" keyword lines in the configuration manual for more information. + + +8. Logging +---------- + +For logging, HAProxy always relies on a syslog server since it does not perform +any file-system access. The standard way of using it is to send logs over UDP +to the log server (by default on port 514). Very commonly this is configured to +127.0.0.1 where the local syslog daemon is running, but it's also used over the +network to log to a central server. The central server provides additional +benefits especially in active-active scenarios where it is desirable to keep +the logs merged in arrival order. HAProxy may also make use of a UNIX socket to +send its logs to the local syslog daemon, but it is not recommended at all, +because if the syslog server is restarted while haproxy runs, the socket will +be replaced and new logs will be lost. Since HAProxy will be isolated inside a +chroot jail, it will not have the ability to reconnect to the new socket. It +has also been observed in field that the log buffers in use on UNIX sockets are +very small and lead to lost messages even at very light loads. But this can be +fine for testing however. + +It is recommended to add the following directive to the "global" section to +make HAProxy log to the local daemon using facility "local0" : + + log 127.0.0.1:514 local0 + +and then to add the following one to each "defaults" section or to each frontend +and backend section : + + log global + +This way, all logs will be centralized through the global definition of where +the log server is. + +Some syslog daemons do not listen to UDP traffic by default, so depending on +the daemon being used, the syntax to enable this will vary : + + - on sysklogd, you need to pass argument "-r" on the daemon's command line + so that it listens to a UDP socket for "remote" logs ; note that there is + no way to limit it to address 127.0.0.1 so it will also receive logs from + remote systems ; + + - on rsyslogd, the following lines must be added to the configuration file : + + $ModLoad imudp + $UDPServerAddress * + $UDPServerRun 514 + + - on syslog-ng, a new source can be created the following way, it then needs + to be added as a valid source in one of the "log" directives : + + source s_udp { + udp(ip(127.0.0.1) port(514)); + }; + +Please consult your syslog daemon's manual for more information. If no logs are +seen in the system's log files, please consider the following tests : + + - restart haproxy. Each frontend and backend logs one line indicating it's + starting. If these logs are received, it means logs are working. + + - run "strace -tt -s100 -etrace=sendmsg -p <haproxy's pid>" and perform some + activity that you expect to be logged. You should see the log messages + being sent using sendmsg() there. If they don't appear, restart using + strace on top of haproxy. If you still see no logs, it definitely means + that something is wrong in your configuration. + + - run tcpdump to watch for port 514, for example on the loopback interface if + the traffic is being sent locally : "tcpdump -As0 -ni lo port 514". If the + packets are seen there, it's the proof they're sent then the syslogd daemon + needs to be troubleshooted. + +While traffic logs are sent from the frontends (where the incoming connections +are accepted), backends also need to be able to send logs in order to report a +server state change consecutive to a health check. Please consult HAProxy's +configuration manual for more information regarding all possible log settings. + +It is convenient to chose a facility that is not used by other daemons. HAProxy +examples often suggest "local0" for traffic logs and "local1" for admin logs +because they're never seen in field. A single facility would be enough as well. +Having separate logs is convenient for log analysis, but it's also important to +remember that logs may sometimes convey confidential information, and as such +they must not be mixed with other logs that may accidentally be handed out to +unauthorized people. + +For in-field troubleshooting without impacting the server's capacity too much, +it is recommended to make use of the "halog" utility provided with HAProxy. +This is sort of a grep-like utility designed to process HAProxy log files at +a very fast data rate. Typical figures range between 1 and 2 GB of logs per +second. It is capable of extracting only certain logs (eg: search for some +classes of HTTP status codes, connection termination status, search by response +time ranges, look for errors only), count lines, limit the output to a number +of lines, and perform some more advanced statistics such as sorting servers +by response time or error counts, sorting URLs by time or count, sorting client +addresses by access count, and so on. It is pretty convenient to quickly spot +anomalies such as a bot looping on the site, and block them. + + +9. Statistics and monitoring +---------------------------- + +It is possible to query HAProxy about its status. The most commonly used +mechanism is the HTTP statistics page. This page also exposes an alternative +CSV output format for monitoring tools. The same format is provided on the +Unix socket. + +Statistics are regroup in categories labelled as domains, corresponding to the +multiple components of HAProxy. There are two domains available: proxy and dns. +If not specified, the proxy domain is selected. Note that only the proxy +statistics are printed on the HTTP page. + +9.1. CSV format +--------------- + +The statistics may be consulted either from the unix socket or from the HTTP +page. Both means provide a CSV format whose fields follow. The first line +begins with a sharp ('#') and has one word per comma-delimited field which +represents the title of the column. All other lines starting at the second one +use a classical CSV format using a comma as the delimiter, and the double quote +('"') as an optional text delimiter, but only if the enclosed text is ambiguous +(if it contains a quote or a comma). The double-quote character ('"') in the +text is doubled ('""'), which is the format that most tools recognize. Please +do not insert any column before these ones in order not to break tools which +use hard-coded column positions. + +For proxy statistics, after each field name, the types which may have a value +for that field are specified in brackets. The types are L (Listeners), F +(Frontends), B (Backends), and S (Servers). There is a fixed set of static +fields that are always available in the same order. A column containing the +character '-' delimits the end of the static fields, after which presence or +order of the fields are not guaranteed. + +Here is the list of static fields using the proxy statistics domain: + 0. pxname [LFBS]: proxy name + 1. svname [LFBS]: service name (FRONTEND for frontend, BACKEND for backend, + any name for server/listener) + 2. qcur [..BS]: current queued requests. For the backend this reports the + number queued without a server assigned. + 3. qmax [..BS]: max value of qcur + 4. scur [LFBS]: current sessions + 5. smax [LFBS]: max sessions + 6. slim [LFBS]: configured session limit + 7. stot [LFBS]: cumulative number of sessions + 8. bin [LFBS]: bytes in + 9. bout [LFBS]: bytes out + 10. dreq [LFB.]: requests denied because of security concerns. + - For tcp this is because of a matched tcp-request content rule. + - For http this is because of a matched http-request or tarpit rule. + 11. dresp [LFBS]: responses denied because of security concerns. + - For http this is because of a matched http-request rule, or + "option checkcache". + 12. ereq [LF..]: request errors. Some of the possible causes are: + - early termination from the client, before the request has been sent. + - read error from the client + - client timeout + - client closed connection + - various bad requests from the client. + - request was tarpitted. + 13. econ [..BS]: number of requests that encountered an error trying to + connect to a backend server. The backend stat is the sum of the stat + for all servers of that backend, plus any connection errors not + associated with a particular server (such as the backend having no + active servers). + 14. eresp [..BS]: response errors. srv_abrt will be counted here also. + Some other errors are: + - write error on the client socket (won't be counted for the server stat) + - failure applying filters to the response. + 15. wretr [..BS]: number of times a connection to a server was retried. + 16. wredis [..BS]: number of times a request was redispatched to another + server. The server value counts the number of times that server was + switched away from. + 17. status [LFBS]: status (UP/DOWN/NOLB/MAINT/MAINT(via)/MAINT(resolution)...) + 18. weight [..BS]: total effective weight (backend), effective weight (server) + 19. act [..BS]: number of active servers (backend), server is active (server) + 20. bck [..BS]: number of backup servers (backend), server is backup (server) + 21. chkfail [...S]: number of failed checks. (Only counts checks failed when + the server is up.) + 22. chkdown [..BS]: number of UP->DOWN transitions. The backend counter counts + transitions to the whole backend being down, rather than the sum of the + counters for each server. + 23. lastchg [..BS]: number of seconds since the last UP<->DOWN transition + 24. downtime [..BS]: total downtime (in seconds). The value for the backend + is the downtime for the whole backend, not the sum of the server downtime. + 25. qlimit [...S]: configured maxqueue for the server, or nothing in the + value is 0 (default, meaning no limit) + 26. pid [LFBS]: process id (0 for first instance, 1 for second, ...) + 27. iid [LFBS]: unique proxy id + 28. sid [L..S]: server id (unique inside a proxy) + 29. throttle [...S]: current throttle percentage for the server, when + slowstart is active, or no value if not in slowstart. + 30. lbtot [..BS]: total number of times a server was selected, either for new + sessions, or when re-dispatching. The server counter is the number + of times that server was selected. + 31. tracked [...S]: id of proxy/server if tracking is enabled. + 32. type [LFBS]: (0=frontend, 1=backend, 2=server, 3=socket/listener) + 33. rate [.FBS]: number of sessions per second over last elapsed second + 34. rate_lim [.F..]: configured limit on new sessions per second + 35. rate_max [.FBS]: max number of new sessions per second + 36. check_status [...S]: status of last health check, one of: + UNK -> unknown + INI -> initializing + SOCKERR -> socket error + L4OK -> check passed on layer 4, no upper layers testing enabled + L4TOUT -> layer 1-4 timeout + L4CON -> layer 1-4 connection problem, for example + "Connection refused" (tcp rst) or "No route to host" (icmp) + L6OK -> check passed on layer 6 + L6TOUT -> layer 6 (SSL) timeout + L6RSP -> layer 6 invalid response - protocol error + L7OK -> check passed on layer 7 + L7OKC -> check conditionally passed on layer 7, for example 404 with + disable-on-404 + L7TOUT -> layer 7 (HTTP/SMTP) timeout + L7RSP -> layer 7 invalid response - protocol error + L7STS -> layer 7 response error, for example HTTP 5xx + Notice: If a check is currently running, the last known status will be + reported, prefixed with "* ". e. g. "* L7OK". + 37. check_code [...S]: layer5-7 code, if available + 38. check_duration [...S]: time in ms took to finish last health check + 39. hrsp_1xx [.FBS]: http responses with 1xx code + 40. hrsp_2xx [.FBS]: http responses with 2xx code + 41. hrsp_3xx [.FBS]: http responses with 3xx code + 42. hrsp_4xx [.FBS]: http responses with 4xx code + 43. hrsp_5xx [.FBS]: http responses with 5xx code + 44. hrsp_other [.FBS]: http responses with other codes (protocol error) + 45. hanafail [...S]: failed health checks details + 46. req_rate [.F..]: HTTP requests per second over last elapsed second + 47. req_rate_max [.F..]: max number of HTTP requests per second observed + 48. req_tot [.FB.]: total number of HTTP requests received + 49. cli_abrt [..BS]: number of data transfers aborted by the client + 50. srv_abrt [..BS]: number of data transfers aborted by the server + (inc. in eresp) + 51. comp_in [.FB.]: number of HTTP response bytes fed to the compressor + 52. comp_out [.FB.]: number of HTTP response bytes emitted by the compressor + 53. comp_byp [.FB.]: number of bytes that bypassed the HTTP compressor + (CPU/BW limit) + 54. comp_rsp [.FB.]: number of HTTP responses that were compressed + 55. lastsess [..BS]: number of seconds since last session assigned to + server/backend + 56. last_chk [...S]: last health check contents or textual error + 57. last_agt [...S]: last agent check contents or textual error + 58. qtime [..BS]: the average queue time in ms over the 1024 last requests + 59. ctime [..BS]: the average connect time in ms over the 1024 last requests + 60. rtime [..BS]: the average response time in ms over the 1024 last requests + (0 for TCP) + 61. ttime [..BS]: the average total session time in ms over the 1024 last + requests + 62. agent_status [...S]: status of last agent check, one of: + UNK -> unknown + INI -> initializing + SOCKERR -> socket error + L4OK -> check passed on layer 4, no upper layers testing enabled + L4TOUT -> layer 1-4 timeout + L4CON -> layer 1-4 connection problem, for example + "Connection refused" (tcp rst) or "No route to host" (icmp) + L7OK -> agent reported "up" + L7STS -> agent reported "fail", "stop", or "down" + 63. agent_code [...S]: numeric code reported by agent if any (unused for now) + 64. agent_duration [...S]: time in ms taken to finish last check + 65. check_desc [...S]: short human-readable description of check_status + 66. agent_desc [...S]: short human-readable description of agent_status + 67. check_rise [...S]: server's "rise" parameter used by checks + 68. check_fall [...S]: server's "fall" parameter used by checks + 69. check_health [...S]: server's health check value between 0 and rise+fall-1 + 70. agent_rise [...S]: agent's "rise" parameter, normally 1 + 71. agent_fall [...S]: agent's "fall" parameter, normally 1 + 72. agent_health [...S]: agent's health parameter, between 0 and rise+fall-1 + 73. addr [L..S]: address:port or "unix". IPv6 has brackets around the address. + 74: cookie [..BS]: server's cookie value or backend's cookie name + 75: mode [LFBS]: proxy mode (tcp, http, health, unknown) + 76: algo [..B.]: load balancing algorithm + 77: conn_rate [.F..]: number of connections over the last elapsed second + 78: conn_rate_max [.F..]: highest known conn_rate + 79: conn_tot [.F..]: cumulative number of connections + 80: intercepted [.FB.]: cum. number of intercepted requests (monitor, stats) + 81: dcon [LF..]: requests denied by "tcp-request connection" rules + 82: dses [LF..]: requests denied by "tcp-request session" rules + 83: wrew [LFBS]: cumulative number of failed header rewriting warnings + 84: connect [..BS]: cumulative number of connection establishment attempts + 85: reuse [..BS]: cumulative number of connection reuses + 86: cache_lookups [.FB.]: cumulative number of cache lookups + 87: cache_hits [.FB.]: cumulative number of cache hits + 88: srv_icur [...S]: current number of idle connections available for reuse + 89: src_ilim [...S]: limit on the number of available idle connections + 90. qtime_max [..BS]: the maximum observed queue time in ms + 91. ctime_max [..BS]: the maximum observed connect time in ms + 92. rtime_max [..BS]: the maximum observed response time in ms (0 for TCP) + 93. ttime_max [..BS]: the maximum observed total session time in ms + 94. eint [LFBS]: cumulative number of internal errors + 95. idle_conn_cur [...S]: current number of unsafe idle connections + 96. safe_conn_cur [...S]: current number of safe idle connections + 97. used_conn_cur [...S]: current number of connections in use + 98. need_conn_est [...S]: estimated needed number of connections + 99. uweight [..BS]: total user weight (backend), server user weight (server) + +For all other statistics domains, the presence or the order of the fields are +not guaranteed. In this case, the header line should always be used to parse +the CSV data. + +9.2. Typed output format +------------------------ + +Both "show info" and "show stat" support a mode where each output value comes +with its type and sufficient information to know how the value is supposed to +be aggregated between processes and how it evolves. + +In all cases, the output consists in having a single value per line with all +the information split into fields delimited by colons (':'). + +The first column designates the object or metric being dumped. Its format is +specific to the command producing this output and will not be described in this +section. Usually it will consist in a series of identifiers and field names. + +The second column contains 3 characters respectively indicating the origin, the +nature and the scope of the value being reported. The first character (the +origin) indicates where the value was extracted from. Possible characters are : + + M The value is a metric. It is valid at one instant any may change depending + on its nature . + + S The value is a status. It represents a discrete value which by definition + cannot be aggregated. It may be the status of a server ("UP" or "DOWN"), + the PID of the process, etc. + + K The value is a sorting key. It represents an identifier which may be used + to group some values together because it is unique among its class. All + internal identifiers are keys. Some names can be listed as keys if they + are unique (eg: a frontend name is unique). In general keys come from the + configuration, even though some of them may automatically be assigned. For + most purposes keys may be considered as equivalent to configuration. + + C The value comes from the configuration. Certain configuration values make + sense on the output, for example a concurrent connection limit or a cookie + name. By definition these values are the same in all processes started + from the same configuration file. + + P The value comes from the product itself. There are very few such values, + most common use is to report the product name, version and release date. + These elements are also the same between all processes. + +The second character (the nature) indicates the nature of the information +carried by the field in order to let an aggregator decide on what operation to +use to aggregate multiple values. Possible characters are : + + A The value represents an age since a last event. This is a bit different + from the duration in that an age is automatically computed based on the + current date. A typical example is how long ago did the last session + happen on a server. Ages are generally aggregated by taking the minimum + value and do not need to be stored. + + a The value represents an already averaged value. The average response times + and server weights are of this nature. Averages can typically be averaged + between processes. + + C The value represents a cumulative counter. Such measures perpetually + increase until they wrap around. Some monitoring protocols need to tell + the difference between a counter and a gauge to report a different type. + In general counters may simply be summed since they represent events or + volumes. Examples of metrics of this nature are connection counts or byte + counts. + + D The value represents a duration for a status. There are a few usages of + this, most of them include the time taken by the last health check and + the time a server has spent down. Durations are generally not summed, + most of the time the maximum will be retained to compute an SLA. + + G The value represents a gauge. It's a measure at one instant. The memory + usage or the current number of active connections are of this nature. + Metrics of this type are typically summed during aggregation. + + L The value represents a limit (generally a configured one). By nature, + limits are harder to aggregate since they are specific to the point where + they were retrieved. In certain situations they may be summed or be kept + separate. + + M The value represents a maximum. In general it will apply to a gauge and + keep the highest known value. An example of such a metric could be the + maximum amount of concurrent connections that was encountered in the + product's life time. To correctly aggregate maxima, you are supposed to + output a range going from the maximum of all maxima and the sum of all + of them. There is indeed no way to know if they were encountered + simultaneously or not. + + m The value represents a minimum. In general it will apply to a gauge and + keep the lowest known value. An example of such a metric could be the + minimum amount of free memory pools that was encountered in the product's + life time. To correctly aggregate minima, you are supposed to output a + range going from the minimum of all minima and the sum of all of them. + There is indeed no way to know if they were encountered simultaneously + or not. + + N The value represents a name, so it is a string. It is used to report + proxy names, server names and cookie names. Names have configuration or + keys as their origin and are supposed to be the same among all processes. + + O The value represents a free text output. Outputs from various commands, + returns from health checks, node descriptions are of such nature. + + R The value represents an event rate. It's a measure at one instant. It is + quite similar to a gauge except that the recipient knows that this measure + moves slowly and may decide not to keep all values. An example of such a + metric is the measured amount of connections per second. Metrics of this + type are typically summed during aggregation. + + T The value represents a date or time. A field emitting the current date + would be of this type. The method to aggregate such information is left + as an implementation choice. For now no field uses this type. + +The third character (the scope) indicates what extent the value reflects. Some +elements may be per process while others may be per configuration or per system. +The distinction is important to know whether or not a single value should be +kept during aggregation or if values have to be aggregated. The following +characters are currently supported : + + C The value is valid for a whole cluster of nodes, which is the set of nodes + communicating over the peers protocol. An example could be the amount of + entries present in a stick table that is replicated with other peers. At + the moment no metric use this scope. + + P The value is valid only for the process reporting it. Most metrics use + this scope. + + S The value is valid for the whole service, which is the set of processes + started together from the same configuration file. All metrics originating + from the configuration use this scope. Some other metrics may use it as + well for some shared resources (eg: shared SSL cache statistics). + + s The value is valid for the whole system, such as the system's hostname, + current date or resource usage. At the moment this scope is not used by + any metric. + +Consumers of these information will generally have enough of these 3 characters +to determine how to accurately report aggregated information across multiple +processes. + +After this column, the third column indicates the type of the field, among "s32" +(signed 32-bit integer), "s64" (signed 64-bit integer), "u32" (unsigned 32-bit +integer), "u64" (unsigned 64-bit integer), "str" (string). It is important to +know the type before parsing the value in order to properly read it. For example +a string containing only digits is still a string an not an integer (eg: an +error code extracted by a check). + +Then the fourth column is the value itself, encoded according to its type. +Strings are dumped as-is immediately after the colon without any leading space. +If a string contains a colon, it will appear normally. This means that the +output should not be exclusively split around colons or some check outputs +or server addresses might be truncated. + + +9.3. Unix Socket commands +------------------------- + +The stats socket is not enabled by default. In order to enable it, it is +necessary to add one line in the global section of the haproxy configuration. +A second line is recommended to set a larger timeout, always appreciated when +issuing commands by hand : + + global + stats socket /var/run/haproxy.sock mode 600 level admin + stats timeout 2m + +It is also possible to add multiple instances of the stats socket by repeating +the line, and make them listen to a TCP port instead of a UNIX socket. This is +never done by default because this is dangerous, but can be handy in some +situations : + + global + stats socket /var/run/haproxy.sock mode 600 level admin + stats socket ipv4@192.168.0.1:9999 level admin + stats timeout 2m + +To access the socket, an external utility such as "socat" is required. Socat is +a swiss-army knife to connect anything to anything. We use it to connect +terminals to the socket, or a couple of stdin/stdout pipes to it for scripts. +The two main syntaxes we'll use are the following : + + # socat /var/run/haproxy.sock stdio + # socat /var/run/haproxy.sock readline + +The first one is used with scripts. It is possible to send the output of a +script to haproxy, and pass haproxy's output to another script. That's useful +for retrieving counters or attack traces for example. + +The second one is only useful for issuing commands by hand. It has the benefit +that the terminal is handled by the readline library which supports line +editing and history, which is very convenient when issuing repeated commands +(eg: watch a counter). + +The socket supports two operation modes : + - interactive + - non-interactive + +The non-interactive mode is the default when socat connects to the socket. In +this mode, a single line may be sent. It is processed as a whole, responses are +sent back, and the connection closes after the end of the response. This is the +mode that scripts and monitoring tools use. It is possible to send multiple +commands in this mode, they need to be delimited by a semi-colon (';'). For +example : + + # echo "show info;show stat;show table" | socat /var/run/haproxy stdio + +If a command needs to use a semi-colon or a backslash (eg: in a value), it +must be preceded by a backslash ('\'). + +The interactive mode displays a prompt ('>') and waits for commands to be +entered on the line, then processes them, and displays the prompt again to wait +for a new command. This mode is entered via the "prompt" command which must be +sent on the first line in non-interactive mode. The mode is a flip switch, if +"prompt" is sent in interactive mode, it is disabled and the connection closes +after processing the last command of the same line. + +For this reason, when debugging by hand, it's quite common to start with the +"prompt" command : + + # socat /var/run/haproxy readline + prompt + > show info + ... + > + +Since multiple commands may be issued at once, haproxy uses the empty line as a +delimiter to mark an end of output for each command, and takes care of ensuring +that no command can emit an empty line on output. A script can thus easily +parse the output even when multiple commands were pipelined on a single line. + +Some commands may take an optional payload. To add one to a command, the first +line needs to end with the "<<\n" pattern. The next lines will be treated as +the payload and can contain as many lines as needed. To validate a command with +a payload, it needs to end with an empty line. + +Limitations do exist: the length of the whole buffer passed to the CLI must +not be greater than tune.bfsize and the pattern "<<" must not be glued to the +last word of the line. + +When entering a paylod while in interactive mode, the prompt will change from +"> " to "+ ". + +It is important to understand that when multiple haproxy processes are started +on the same sockets, any process may pick up the request and will output its +own stats. + +The list of commands currently supported on the stats socket is provided below. +If an unknown command is sent, haproxy displays the usage message which reminds +all supported commands. Some commands support a more complex syntax, generally +it will explain what part of the command is invalid when this happens. + +Some commands require a higher level of privilege to work. If you do not have +enough privilege, you will get an error "Permission denied". Please check +the "level" option of the "bind" keyword lines in the configuration manual +for more information. + +abort ssl ca-file <cafile> + Abort and destroy a temporary CA file update transaction. + + See also "set ssl ca-file" and "commit ssl ca-file". + +abort ssl cert <filename> + Abort and destroy a temporary SSL certificate update transaction. + + See also "set ssl cert" and "commit ssl cert". + +abort ssl crl-file <crlfile> + Abort and destroy a temporary CRL file update transaction. + + See also "set ssl crl-file" and "commit ssl crl-file". + +add acl [@<ver>] <acl> <pattern> + Add an entry into the acl <acl>. <acl> is the #<id> or the <file> returned by + "show acl". This command does not verify if the entry already exists. Entries + are added to the current version of the ACL, unless a specific version is + specified with "@<ver>". This version number must have preliminary been + allocated by "prepare acl", and it will be comprised between the versions + reported in "curr_ver" and "next_ver" on the output of "show acl". Entries + added with a specific version number will not match until a "commit acl" + operation is performed on them. They may however be consulted using the + "show acl @<ver>" command, and cleared using a "clear acl @<ver>" command. + This command cannot be used if the reference <acl> is a file also used with + a map. In this case, the "add map" command must be used instead. + +add map [@<ver>] <map> <key> <value> +add map [@<ver>] <map> <payload> + Add an entry into the map <map> to associate the value <value> to the key + <key>. This command does not verify if the entry already exists. It is + mainly used to fill a map after a "clear" or "prepare" operation. Entries + are added to the current version of the ACL, unless a specific version is + specified with "@<ver>". This version number must have preliminary been + allocated by "prepare acl", and it will be comprised between the versions + reported in "curr_ver" and "next_ver" on the output of "show acl". Entries + added with a specific version number will not match until a "commit map" + operation is performed on them. They may however be consulted using the + "show map @<ver>" command, and cleared using a "clear acl @<ver>" command. + If the designated map is also used as an ACL, the ACL will only match the + <key> part and will ignore the <value> part. Using the payload syntax it is + possible to add multiple key/value pairs by entering them on separate lines. + On each new line, the first word is the key and the rest of the line is + considered to be the value which can even contains spaces. + + Example: + + # socat /tmp/sock1 - + prompt + + > add map #-1 << + + key1 value1 + + key2 value2 with spaces + + key3 value3 also with spaces + + key4 value4 + + > + +add server <backend>/<server> [args]* + Instantiate a new server attached to the backend <backend>. + + The <server> name must not be already used in the backend. A special + restriction is put on the backend which must used a dynamic load-balancing + algorithm. A subset of keywords from the server config file statement can be + used to configure the server behavior. Also note that no settings will be + reused from an hypothetical 'default-server' statement in the same backend. + + Currently a dynamic server is statically initialized with the "none" + init-addr method. This means that no resolution will be undertaken if a FQDN + is specified as an address, even if the server creation will be validated. + + To support the reload operations, it is expected that the server created via + the CLI is also manually inserted in the relevant haproxy configuration file. + A dynamic server not present in the configuration won't be restored after a + reload operation. + + A dynamic server may use the "track" keyword to follow the check status of + another server from the configuration. However, it is not possible to track + another dynamic server. This is to ensure that the tracking chain is kept + consistent even in the case of dynamic servers deletion. + + Use the "check" keyword to enable health-check support. Note that the + health-check is disabled by default and must be enabled independently from + the server using the "enable health" command. For agent checks, use the + "agent-check" keyword and the "enable agent" command. Note that in this case + the server may be activated via the agent depending on the status reported, + without an explicit "enable server" command. This also means that extra care + is required when removing a dynamic server with agent check. The agent should + be first deactivated via "disable agent" to be able to put the server in the + required maintenance mode before removal. + + It may be possible to reach the fd limit when using a large number of dynamic + servers. Please refer to the "u-limit" global keyword documentation in this + case. + + Here is the list of the currently supported keywords : + + - agent-addr + - agent-check + - agent-inter + - agent-port + - agent-send + - allow-0rtt + - alpn + - addr + - backup + - ca-file + - check + - check-alpn + - check-proto + - check-send-proxy + - check-sni + - check-ssl + - check-via-socks4 + - ciphers + - ciphersuites + - crl-file + - crt + - disabled + - downinter + - enabled + - error-limit + - fall + - fastinter + - force-sslv3/tlsv10/tlsv11/tlsv12/tlsv13 + - id + - inter + - maxconn + - maxqueue + - minconn + - no-ssl-reuse + - no-sslv3/tlsv10/tlsv11/tlsv12/tlsv13 + - no-tls-tickets + - npn + - observe + - on-error + - on-marked-down + - on-marked-up + - pool-low-conn + - pool-max-conn + - pool-purge-delay + - port + - proto + - proxy-v2-options + - rise + - send-proxy + - send-proxy-v2 + - send-proxy-v2-ssl + - send-proxy-v2-ssl-cn + - slowstart + - sni + - source + - ssl + - ssl-max-ver + - ssl-min-ver + - tfo + - tls-tickets + - track + - usesrc + - verify + - verifyhost + - weight + - ws + + Their syntax is similar to the server line from the configuration file, + please refer to their individual documentation for details. + +add ssl crt-list <crtlist> <certificate> +add ssl crt-list <crtlist> <payload> + Add an certificate in a crt-list. It can also be used for directories since + directories are now loaded the same way as the crt-lists. This command allow + you to use a certificate name in parameter, to use SSL options or filters a + crt-list line must sent as a payload instead. Only one crt-list line is + supported in the payload. This command will load the certificate for every + bind lines using the crt-list. To push a new certificate to HAProxy the + commands "new ssl cert" and "set ssl cert" must be used. + + Example: + $ echo "new ssl cert foobar.pem" | socat /tmp/sock1 - + $ echo -e "set ssl cert foobar.pem <<\n$(cat foobar.pem)\n" | socat + /tmp/sock1 - + $ echo "commit ssl cert foobar.pem" | socat /tmp/sock1 - + $ echo "add ssl crt-list certlist1 foobar.pem" | socat /tmp/sock1 - + + $ echo -e 'add ssl crt-list certlist1 <<\nfoobar.pem [allow-0rtt] foo.bar.com + !test1.com\n' | socat /tmp/sock1 - + +clear counters + Clear the max values of the statistics counters in each proxy (frontend & + backend) and in each server. The accumulated counters are not affected. The + internal activity counters reported by "show activity" are also reset. This + can be used to get clean counters after an incident, without having to + restart nor to clear traffic counters. This command is restricted and can + only be issued on sockets configured for levels "operator" or "admin". + +clear counters all + Clear all statistics counters in each proxy (frontend & backend) and in each + server. This has the same effect as restarting. This command is restricted + and can only be issued on sockets configured for level "admin". + +clear acl [@<ver>] <acl> + Remove all entries from the acl <acl>. <acl> is the #<id> or the <file> + returned by "show acl". Note that if the reference <acl> is a file and is + shared with a map, this map will be also cleared. By default only the current + version of the ACL is cleared (the one being matched against). However it is + possible to specify another version using '@' followed by this version. + +clear map [@<ver>] <map> + Remove all entries from the map <map>. <map> is the #<id> or the <file> + returned by "show map". Note that if the reference <map> is a file and is + shared with a acl, this acl will be also cleared. By default only the current + version of the map is cleared (the one being matched against). However it is + possible to specify another version using '@' followed by this version. + +clear table <table> [ data.<type> <operator> <value> ] | [ key <key> ] + Remove entries from the stick-table <table>. + + This is typically used to unblock some users complaining they have been + abusively denied access to a service, but this can also be used to clear some + stickiness entries matching a server that is going to be replaced (see "show + table" below for details). Note that sometimes, removal of an entry will be + refused because it is currently tracked by a session. Retrying a few seconds + later after the session ends is usual enough. + + In the case where no options arguments are given all entries will be removed. + + When the "data." form is used entries matching a filter applied using the + stored data (see "stick-table" in section 4.2) are removed. A stored data + type must be specified in <type>, and this data type must be stored in the + table otherwise an error is reported. The data is compared according to + <operator> with the 64-bit integer <value>. Operators are the same as with + the ACLs : + + - eq : match entries whose data is equal to this value + - ne : match entries whose data is not equal to this value + - le : match entries whose data is less than or equal to this value + - ge : match entries whose data is greater than or equal to this value + - lt : match entries whose data is less than this value + - gt : match entries whose data is greater than this value + + When the key form is used the entry <key> is removed. The key must be of the + same type as the table, which currently is limited to IPv4, IPv6, integer and + string. + + Example : + $ echo "show table http_proxy" | socat stdio /tmp/sock1 + >>> # table: http_proxy, type: ip, size:204800, used:2 + >>> 0x80e6a4c: key=127.0.0.1 use=0 exp=3594729 gpc0=0 conn_rate(30000)=1 \ + bytes_out_rate(60000)=187 + >>> 0x80e6a80: key=127.0.0.2 use=0 exp=3594740 gpc0=1 conn_rate(30000)=10 \ + bytes_out_rate(60000)=191 + + $ echo "clear table http_proxy key 127.0.0.1" | socat stdio /tmp/sock1 + + $ echo "show table http_proxy" | socat stdio /tmp/sock1 + >>> # table: http_proxy, type: ip, size:204800, used:1 + >>> 0x80e6a80: key=127.0.0.2 use=0 exp=3594740 gpc0=1 conn_rate(30000)=10 \ + bytes_out_rate(60000)=191 + $ echo "clear table http_proxy data.gpc0 eq 1" | socat stdio /tmp/sock1 + $ echo "show table http_proxy" | socat stdio /tmp/sock1 + >>> # table: http_proxy, type: ip, size:204800, used:1 + +commit acl @<ver> <acl> + Commit all changes made to version <ver> of ACL <acl>, and deletes all past + versions. <acl> is the #<id> or the <file> returned by "show acl". The + version number must be between "curr_ver"+1 and "next_ver" as reported in + "show acl". The contents to be committed to the ACL can be consulted with + "show acl @<ver> <acl>" if desired. The specified version number has normally + been created with the "prepare acl" command. The replacement is atomic. It + consists in atomically updating the current version to the specified version, + which will instantly cause all entries in other versions to become invisible, + and all entries in the new version to become visible. It is also possible to + use this command to perform an atomic removal of all visible entries of an + ACL by calling "prepare acl" first then committing without adding any + entries. This command cannot be used if the reference <acl> is a file also + used as a map. In this case, the "commit map" command must be used instead. + +commit map @<ver> <map> + Commit all changes made to version <ver> of map <map>, and deletes all past + versions. <map> is the #<id> or the <file> returned by "show map". The + version number must be between "curr_ver"+1 and "next_ver" as reported in + "show map". The contents to be committed to the map can be consulted with + "show map @<ver> <map>" if desired. The specified version number has normally + been created with the "prepare map" command. The replacement is atomic. It + consists in atomically updating the current version to the specified version, + which will instantly cause all entries in other versions to become invisible, + and all entries in the new version to become visible. It is also possible to + use this command to perform an atomic removal of all visible entries of an + map by calling "prepare map" first then committing without adding any + entries. + +commit ssl ca-file <cafile> + Commit a temporary SSL CA file update transaction. + + In the case of an existing CA file (in a "Used" state in "show ssl ca-file"), + the new CA file tree entry is inserted in the CA file tree and every instance + that used the CA file entry is rebuilt, along with the SSL contexts it needs. + All the contexts previously used by the rebuilt instances are removed. + Upon success, the previous CA file entry is removed from the tree. + Upon failure, nothing is removed or deleted, and all the original SSL + contexts are kept and used. + Once the temporary transaction is committed, it is destroyed. + + In the case of a new CA file (after a "new ssl ca-file" and in a "Unused" + state in "show ssl ca-file"), the CA file will be inserted in the CA file + tree but it won't be used anywhere in HAProxy. To use it and generate SSL + contexts that use it, you will need to add it to a crt-list with "add ssl + crt-list". + + See also "new ssl ca-file", "set ssl ca-file", "abort ssl ca-file" and + "add ssl crt-list". + +commit ssl cert <filename> + Commit a temporary SSL certificate update transaction. + + In the case of an existing certificate (in a "Used" state in "show ssl + cert"), generate every SSL contextes and SNIs it need, insert them, and + remove the previous ones. Replace in memory the previous SSL certificates + everywhere the <filename> was used in the configuration. Upon failure it + doesn't remove or insert anything. Once the temporary transaction is + committed, it is destroyed. + + In the case of a new certificate (after a "new ssl cert" and in a "Unused" + state in "show ssl cert"), the certificate will be committed in a certificate + storage, but it won't be used anywhere in haproxy. To use it and generate + its SNIs you will need to add it to a crt-list or a directory with "add ssl + crt-list". + + See also "new ssl cert", "set ssl cert", "abort ssl cert" and + "add ssl crt-list". + +commit ssl crl-file <crlfile> + Commit a temporary SSL CRL file update transaction. + + In the case of an existing CRL file (in a "Used" state in "show ssl + crl-file"), the new CRL file entry is inserted in the CA file tree (which + holds both the CA files and the CRL files) and every instance that used the + CRL file entry is rebuilt, along with the SSL contexts it needs. + All the contexts previously used by the rebuilt instances are removed. + Upon success, the previous CRL file entry is removed from the tree. + Upon failure, nothing is removed or deleted, and all the original SSL + contexts are kept and used. + Once the temporary transaction is committed, it is destroyed. + + In the case of a new CRL file (after a "new ssl crl-file" and in a "Unused" + state in "show ssl crl-file"), the CRL file will be inserted in the CRL file + tree but it won't be used anywhere in HAProxy. To use it and generate SSL + contexts that use it, you will need to add it to a crt-list with "add ssl + crt-list". + + See also "new ssl crl-file", "set ssl crl-file", "abort ssl crl-file" and + "add ssl crt-list". + +debug dev <command> [args]* + Call a developer-specific command. Only supported on a CLI connection running + in expert mode (see "expert-mode on"). Such commands are extremely dangerous + and not forgiving, any misuse may result in a crash of the process. They are + intended for experts only, and must really not be used unless told to do so. + Some of them are only available when haproxy is built with DEBUG_DEV defined + because they may have security implications. All of these commands require + admin privileges, and are purposely not documented to avoid encouraging their + use by people who are not at ease with the source code. + +del acl <acl> [<key>|#<ref>] + Delete all the acl entries from the acl <acl> corresponding to the key <key>. + <acl> is the #<id> or the <file> returned by "show acl". If the <ref> is used, + this command delete only the listed reference. The reference can be found with + listing the content of the acl. Note that if the reference <acl> is a file and + is shared with a map, the entry will be also deleted in the map. + +del map <map> [<key>|#<ref>] + Delete all the map entries from the map <map> corresponding to the key <key>. + <map> is the #<id> or the <file> returned by "show map". If the <ref> is used, + this command delete only the listed reference. The reference can be found with + listing the content of the map. Note that if the reference <map> is a file and + is shared with a acl, the entry will be also deleted in the map. + +del ssl ca-file <cafile> + Delete a CA file tree entry from HAProxy. The CA file must be unused and + removed from any crt-list. "show ssl ca-file" displays the status of the CA + files. The deletion doesn't work with a certificate referenced directly with + the "ca-file" or "ca-verify-file" directives in the configuration. + +del ssl cert <certfile> + Delete a certificate store from HAProxy. The certificate must be unused and + removed from any crt-list or directory. "show ssl cert" displays the status + of the certificate. The deletion doesn't work with a certificate referenced + directly with the "crt" directive in the configuration. + +del ssl crl-file <crlfile> + Delete a CRL file tree entry from HAProxy. The CRL file must be unused and + removed from any crt-list. "show ssl crl-file" displays the status of the CRL + files. The deletion doesn't work with a certificate referenced directly with + the "crl-file" directive in the configuration. + +del ssl crt-list <filename> <certfile[:line]> + Delete an entry in a crt-list. This will delete every SNIs used for this + entry in the frontends. If a certificate is used several time in a crt-list, + you will need to provide which line you want to delete. To display the line + numbers, use "show ssl crt-list -n <crtlist>". + +del server <backend>/<server> + Remove a server attached to the backend <backend>. All servers are eligible, + except servers which are referenced by other configuration elements. The + server must be put in maintenance mode prior to its deletion. The operation + is cancelled if the serveur still has active or idle connection or its + connection queue is not empty. + +disable agent <backend>/<server> + Mark the auxiliary agent check as temporarily stopped. + + In the case where an agent check is being run as a auxiliary check, due + to the agent-check parameter of a server directive, new checks are only + initialized when the agent is in the enabled. Thus, disable agent will + prevent any new agent checks from begin initiated until the agent + re-enabled using enable agent. + + When an agent is disabled the processing of an auxiliary agent check that + was initiated while the agent was set as enabled is as follows: All + results that would alter the weight, specifically "drain" or a weight + returned by the agent, are ignored. The processing of agent check is + otherwise unchanged. + + The motivation for this feature is to allow the weight changing effects + of the agent checks to be paused to allow the weight of a server to be + configured using set weight without being overridden by the agent. + + This command is restricted and can only be issued on sockets configured for + level "admin". + +disable dynamic-cookie backend <backend> + Disable the generation of dynamic cookies for the backend <backend> + +disable frontend <frontend> + Mark the frontend as temporarily stopped. This corresponds to the mode which + is used during a soft restart : the frontend releases the port but can be + enabled again if needed. This should be used with care as some non-Linux OSes + are unable to enable it back. This is intended to be used in environments + where stopping a proxy is not even imaginable but a misconfigured proxy must + be fixed. That way it's possible to release the port and bind it into another + process to restore operations. The frontend will appear with status "STOP" + on the stats page. + + The frontend may be specified either by its name or by its numeric ID, + prefixed with a sharp ('#'). + + This command is restricted and can only be issued on sockets configured for + level "admin". + +disable health <backend>/<server> + Mark the primary health check as temporarily stopped. This will disable + sending of health checks, and the last health check result will be ignored. + The server will be in unchecked state and considered UP unless an auxiliary + agent check forces it down. + + This command is restricted and can only be issued on sockets configured for + level "admin". + +disable server <backend>/<server> + Mark the server DOWN for maintenance. In this mode, no more checks will be + performed on the server until it leaves maintenance. + If the server is tracked by other servers, those servers will be set to DOWN + during the maintenance. + + In the statistics page, a server DOWN for maintenance will appear with a + "MAINT" status, its tracking servers with the "MAINT(via)" one. + + Both the backend and the server may be specified either by their name or by + their numeric ID, prefixed with a sharp ('#'). + + This command is restricted and can only be issued on sockets configured for + level "admin". + +enable agent <backend>/<server> + Resume auxiliary agent check that was temporarily stopped. + + See "disable agent" for details of the effect of temporarily starting + and stopping an auxiliary agent. + + This command is restricted and can only be issued on sockets configured for + level "admin". + +enable dynamic-cookie backend <backend> + Enable the generation of dynamic cookies for the backend <backend>. + A secret key must also be provided. + +enable frontend <frontend> + Resume a frontend which was temporarily stopped. It is possible that some of + the listening ports won't be able to bind anymore (eg: if another process + took them since the 'disable frontend' operation). If this happens, an error + is displayed. Some operating systems might not be able to resume a frontend + which was disabled. + + The frontend may be specified either by its name or by its numeric ID, + prefixed with a sharp ('#'). + + This command is restricted and can only be issued on sockets configured for + level "admin". + +enable health <backend>/<server> + Resume a primary health check that was temporarily stopped. This will enable + sending of health checks again. Please see "disable health" for details. + + This command is restricted and can only be issued on sockets configured for + level "admin". + +enable server <backend>/<server> + If the server was previously marked as DOWN for maintenance, this marks the + server UP and checks are re-enabled. + + Both the backend and the server may be specified either by their name or by + their numeric ID, prefixed with a sharp ('#'). + + This command is restricted and can only be issued on sockets configured for + level "admin". + +experimental-mode [on|off] + Without options, this indicates whether the experimental mode is enabled or + disabled on the current connection. When passed "on", it turns the + experimental mode on for the current CLI connection only. With "off" it turns + it off. + + The experimental mode is used to access to extra features still in + development. These features are currently not stable and should be used with + care. They may be subject to breaking changes across versions. + + When used from the master CLI, this command shouldn't be prefixed, as it will + set the mode for any worker when connecting to its CLI. + + Example: + echo "@1; experimental-mode on; <experimental_cmd>..." | socat /var/run/haproxy.master - + echo "experimental-mode on; @1 <experimental_cmd>..." | socat /var/run/haproxy.master - + +expert-mode [on|off] + This command is similar to experimental-mode but is used to toggle the + expert mode. + + The expert mode enables displaying of expert commands that can be extremely + dangerous for the process and which may occasionally help developers collect + important information about complex bugs. Any misuse of these features will + likely lead to a process crash. Do not use this option without being invited + to do so. Note that this command is purposely not listed in the help message. + This command is only accessible in admin level. Changing to another level + automatically resets the expert mode. + + When used from the master CLI, this command shouldn't be prefixed, as it will + set the mode for any worker when connecting to its CLI. + + Example: + echo "@1; expert-mode on; debug dev exit 1" | socat /var/run/haproxy.master - + echo "expert-mode on; @1 debug dev exit 1" | socat /var/run/haproxy.master - + +get map <map> <value> +get acl <acl> <value> + Lookup the value <value> in the map <map> or in the ACL <acl>. <map> or <acl> + are the #<id> or the <file> returned by "show map" or "show acl". This command + returns all the matching patterns associated with this map. This is useful for + debugging maps and ACLs. The output format is composed by one line par + matching type. Each line is composed by space-delimited series of words. + + The first two words are: + + <match method>: The match method applied. It can be "found", "bool", + "int", "ip", "bin", "len", "str", "beg", "sub", "dir", + "dom", "end" or "reg". + + <match result>: The result. Can be "match" or "no-match". + + The following words are returned only if the pattern matches an entry. + + <index type>: "tree" or "list". The internal lookup algorithm. + + <case>: "case-insensitive" or "case-sensitive". The + interpretation of the case. + + <entry matched>: match="<entry>". Return the matched pattern. It is + useful with regular expressions. + + The two last word are used to show the returned value and its type. With the + "acl" case, the pattern doesn't exist. + + return=nothing: No return because there are no "map". + return="<value>": The value returned in the string format. + return=cannot-display: The value cannot be converted as string. + + type="<type>": The type of the returned sample. + +get var <name> + Show the existence, type and contents of the process-wide variable 'name'. + Only process-wide variables are readable, so the name must begin with + 'proc.' otherwise no variable will be found. This command requires levels + "operator" or "admin". + +get weight <backend>/<server> + Report the current weight and the initial weight of server <server> in + backend <backend> or an error if either doesn't exist. The initial weight is + the one that appears in the configuration file. Both are normally equal + unless the current weight has been changed. Both the backend and the server + may be specified either by their name or by their numeric ID, prefixed with a + sharp ('#'). + +help [<command>] + Print the list of known keywords and their basic usage, or commands matching + the requested one. The same help screen is also displayed for unknown + commands. + +httpclient <method> <URI> + Launch an HTTP client request and print the response on the CLI. Only + supported on a CLI connection running in expert mode (see "expert-mode on"). + It's only meant for debugging. The httpclient is able to resolve a server + name in the URL using the "default" resolvers section, which is populated + with the DNS servers of your /etc/resolv.conf by default. However it won't be + able to resolve an host from /etc/hosts if you don't use a local dns daemon + which can resolve those. + +new ssl ca-file <cafile> + Create a new empty CA file tree entry to be filled with a set of CA + certificates and added to a crt-list. This command should be used in + combination with "set ssl ca-file" and "add ssl crt-list". + +new ssl cert <filename> + Create a new empty SSL certificate store to be filled with a certificate and + added to a directory or a crt-list. This command should be used in + combination with "set ssl cert" and "add ssl crt-list". + +new ssl crl-file <crlfile> + Create a new empty CRL file tree entry to be filled with a set of CRLs + and added to a crt-list. This command should be used in combination with "set + ssl crl-file" and "add ssl crt-list". + +prepare acl <acl> + Allocate a new version number in ACL <acl> for atomic replacement. <acl> is + the #<id> or the <file> returned by "show acl". The new version number is + shown in response after "New version created:". This number will then be + usable to prepare additions of new entries into the ACL which will then + atomically replace the current ones once committed. It is reported as + "next_ver" in "show acl". There is no impact of allocating new versions, as + unused versions will automatically be removed once a more recent version is + committed. Version numbers are unsigned 32-bit values which wrap at the end, + so care must be taken when comparing them in an external program. This + command cannot be used if the reference <acl> is a file also used as a map. + In this case, the "prepare map" command must be used instead. + +prepare map <map> + Allocate a new version number in map <map> for atomic replacement. <map> is + the #<id> or the <file> returned by "show map". The new version number is + shown in response after "New version created:". This number will then be + usable to prepare additions of new entries into the map which will then + atomically replace the current ones once committed. It is reported as + "next_ver" in "show map". There is no impact of allocating new versions, as + unused versions will automatically be removed once a more recent version is + committed. Version numbers are unsigned 32-bit values which wrap at the end, + so care must be taken when comparing them in an external program. + +prompt + Toggle the prompt at the beginning of the line and enter or leave interactive + mode. In interactive mode, the connection is not closed after a command + completes. Instead, the prompt will appear again, indicating the user that + the interpreter is waiting for a new command. The prompt consists in a right + angle bracket followed by a space "> ". This mode is particularly convenient + when one wants to periodically check information such as stats or errors. + It is also a good idea to enter interactive mode before issuing a "help" + command. + +quit + Close the connection when in interactive mode. + +set dynamic-cookie-key backend <backend> <value> + Modify the secret key used to generate the dynamic persistent cookies. + This will break the existing sessions. + +set map <map> [<key>|#<ref>] <value> + Modify the value corresponding to each key <key> in a map <map>. <map> is the + #<id> or <file> returned by "show map". If the <ref> is used in place of + <key>, only the entry pointed by <ref> is changed. The new value is <value>. + +set maxconn frontend <frontend> <value> + Dynamically change the specified frontend's maxconn setting. Any positive + value is allowed including zero, but setting values larger than the global + maxconn does not make much sense. If the limit is increased and connections + were pending, they will immediately be accepted. If it is lowered to a value + below the current number of connections, new connections acceptation will be + delayed until the threshold is reached. The frontend might be specified by + either its name or its numeric ID prefixed with a sharp ('#'). + +set maxconn server <backend/server> <value> + Dynamically change the specified server's maxconn setting. Any positive + value is allowed including zero, but setting values larger than the global + maxconn does not make much sense. + +set maxconn global <maxconn> + Dynamically change the global maxconn setting within the range defined by the + initial global maxconn setting. If it is increased and connections were + pending, they will immediately be accepted. If it is lowered to a value below + the current number of connections, new connections acceptation will be + delayed until the threshold is reached. A value of zero restores the initial + setting. + +set profiling { tasks | memory } { auto | on | off } + Enables or disables CPU or memory profiling for the indicated subsystem. This + is equivalent to setting or clearing the "profiling" settings in the "global" + section of the configuration file. Please also see "show profiling". Note + that manually setting the tasks profiling to "on" automatically resets the + scheduler statistics, thus allows to check activity over a given interval. + The memory profiling is limited to certain operating systems (known to work + on the linux-glibc target), and requires USE_MEMORY_PROFILING to be set at + compile time. + +set rate-limit connections global <value> + Change the process-wide connection rate limit, which is set by the global + 'maxconnrate' setting. A value of zero disables the limitation. This limit + applies to all frontends and the change has an immediate effect. The value + is passed in number of connections per second. + +set rate-limit http-compression global <value> + Change the maximum input compression rate, which is set by the global + 'maxcomprate' setting. A value of zero disables the limitation. The value is + passed in number of kilobytes per second. The value is available in the "show + info" on the line "CompressBpsRateLim" in bytes. + +set rate-limit sessions global <value> + Change the process-wide session rate limit, which is set by the global + 'maxsessrate' setting. A value of zero disables the limitation. This limit + applies to all frontends and the change has an immediate effect. The value + is passed in number of sessions per second. + +set rate-limit ssl-sessions global <value> + Change the process-wide SSL session rate limit, which is set by the global + 'maxsslrate' setting. A value of zero disables the limitation. This limit + applies to all frontends and the change has an immediate effect. The value + is passed in number of sessions per second sent to the SSL stack. It applies + before the handshake in order to protect the stack against handshake abuses. + +set server <backend>/<server> addr <ip4 or ip6 address> [port <port>] + Replace the current IP address of a server by the one provided. + Optionally, the port can be changed using the 'port' parameter. + Note that changing the port also support switching from/to port mapping + (notation with +X or -Y), only if a port is configured for the health check. + +set server <backend>/<server> agent [ up | down ] + Force a server's agent to a new state. This can be useful to immediately + switch a server's state regardless of some slow agent checks for example. + Note that the change is propagated to tracking servers if any. + +set server <backend>/<server> agent-addr <addr> [port <port>] + Change addr for servers agent checks. Allows to migrate agent-checks to + another address at runtime. You can specify both IP and hostname, it will be + resolved. + Optionally, change the port agent. + +set server <backend>/<server> agent-port <port> + Change the port used for agent checks. + +set server <backend>/<server> agent-send <value> + Change agent string sent to agent check target. Allows to update string while + changing server address to keep those two matching. + +set server <backend>/<server> health [ up | stopping | down ] + Force a server's health to a new state. This can be useful to immediately + switch a server's state regardless of some slow health checks for example. + Note that the change is propagated to tracking servers if any. + +set server <backend>/<server> check-addr <ip4 | ip6> [port <port>] + Change the IP address used for server health checks. + Optionally, change the port used for server health checks. + +set server <backend>/<server> check-port <port> + Change the port used for health checking to <port> + +set server <backend>/<server> state [ ready | drain | maint ] + Force a server's administrative state to a new state. This can be useful to + disable load balancing and/or any traffic to a server. Setting the state to + "ready" puts the server in normal mode, and the command is the equivalent of + the "enable server" command. Setting the state to "maint" disables any traffic + to the server as well as any health checks. This is the equivalent of the + "disable server" command. Setting the mode to "drain" only removes the server + from load balancing but still allows it to be checked and to accept new + persistent connections. Changes are propagated to tracking servers if any. + +set server <backend>/<server> weight <weight>[%] + Change a server's weight to the value passed in argument. This is the exact + equivalent of the "set weight" command below. + +set server <backend>/<server> fqdn <FQDN> + Change a server's FQDN to the value passed in argument. This requires the + internal run-time DNS resolver to be configured and enabled for this server. + +set server <backend>/<server> ssl [ on | off ] (deprecated) + This option configures SSL ciphering on outgoing connections to the server. + When switch off, all traffic becomes plain text; health check path is not + changed. + + This command is deprecated, create a new server dynamically with or without + SSL instead, using the "add server" command. + +set severity-output [ none | number | string ] + Change the severity output format of the stats socket connected to for the + duration of the current session. + +set ssl ca-file <cafile> <payload> + This command is part of a transaction system, the "commit ssl ca-file" and + "abort ssl ca-file" commands could be required. + If there is no on-going transaction, it will create a CA file tree entry into + which the certificates contained in the payload will be stored. The CA file + entry will not be stored in the CA file tree and will only be kept in a + temporary transaction. If a transaction with the same filename already exists, + the previous CA file entry will be deleted and replaced by the new one. + Once the modifications are done, you have to commit the transaction through + a "commit ssl ca-file" call. + + Example: + echo -e "set ssl ca-file cafile.pem <<\n$(cat rootCA.crt)\n" | \ + socat /var/run/haproxy.stat - + echo "commit ssl ca-file cafile.pem" | socat /var/run/haproxy.stat - + +set ssl cert <filename> <payload> + This command is part of a transaction system, the "commit ssl cert" and + "abort ssl cert" commands could be required. + This whole transaction system works on any certificate displayed by the + "show ssl cert" command, so on any frontend or backend certificate. + If there is no on-going transaction, it will duplicate the certificate + <filename> in memory to a temporary transaction, then update this + transaction with the PEM file in the payload. If a transaction exists with + the same filename, it will update this transaction. It's also possible to + update the files linked to a certificate (.issuer, .sctl, .oscp etc.) + Once the modification are done, you have to "commit ssl cert" the + transaction. + + Injection of files over the CLI must be done with caution since an empty line + is used to notify the end of the payload. It is recommended to inject a PEM + file which has been sanitized. A simple method would be to remove every empty + line and only leave what are in the PEM sections. It could be achieved with a + sed command. + + Example: + + # With some simple sanitizing + echo -e "set ssl cert localhost.pem <<\n$(sed -n '/^$/d;/-BEGIN/,/-END/p' 127.0.0.1.pem)\n" | \ + socat /var/run/haproxy.stat - + + # Complete example with commit + echo -e "set ssl cert localhost.pem <<\n$(cat 127.0.0.1.pem)\n" | \ + socat /var/run/haproxy.stat - + echo -e \ + "set ssl cert localhost.pem.issuer <<\n $(cat 127.0.0.1.pem.issuer)\n" | \ + socat /var/run/haproxy.stat - + echo -e \ + "set ssl cert localhost.pem.ocsp <<\n$(base64 -w 1000 127.0.0.1.pem.ocsp)\n" | \ + socat /var/run/haproxy.stat - + echo "commit ssl cert localhost.pem" | socat /var/run/haproxy.stat - + +set ssl crl-file <crlfile> <payload> + This command is part of a transaction system, the "commit ssl crl-file" and + "abort ssl crl-file" commands could be required. + If there is no on-going transaction, it will create a CRL file tree entry into + which the Revocation Lists contained in the payload will be stored. The CRL + file entry will not be stored in the CRL file tree and will only be kept in a + temporary transaction. If a transaction with the same filename already exists, + the previous CRL file entry will be deleted and replaced by the new one. + Once the modifications are done, you have to commit the transaction through + a "commit ssl crl-file" call. + + Example: + echo -e "set ssl crl-file crlfile.pem <<\n$(cat rootCRL.pem)\n" | \ + socat /var/run/haproxy.stat - + echo "commit ssl crl-file crlfile.pem" | socat /var/run/haproxy.stat - + +set ssl ocsp-response <response | payload> + This command is used to update an OCSP Response for a certificate (see "crt" + on "bind" lines). Same controls are performed as during the initial loading of + the response. The <response> must be passed as a base64 encoded string of the + DER encoded response from the OCSP server. This command is not supported with + BoringSSL. + + Example: + openssl ocsp -issuer issuer.pem -cert server.pem \ + -host ocsp.issuer.com:80 -respout resp.der + echo "set ssl ocsp-response $(base64 -w 10000 resp.der)" | \ + socat stdio /var/run/haproxy.stat + + using the payload syntax: + echo -e "set ssl ocsp-response <<\n$(base64 resp.der)\n" | \ + socat stdio /var/run/haproxy.stat + +set ssl tls-key <id> <tlskey> + Set the next TLS key for the <id> listener to <tlskey>. This key becomes the + ultimate key, while the penultimate one is used for encryption (others just + decrypt). The oldest TLS key present is overwritten. <id> is either a numeric + #<id> or <file> returned by "show tls-keys". <tlskey> is a base64 encoded 48 + or 80 bits TLS ticket key (ex. openssl rand 80 | openssl base64 -A). + +set table <table> key <key> [data.<data_type> <value>]* + Create or update a stick-table entry in the table. If the key is not present, + an entry is inserted. See stick-table in section 4.2 to find all possible + values for <data_type>. The most likely use consists in dynamically entering + entries for source IP addresses, with a flag in gpc0 to dynamically block an + IP address or affect its quality of service. It is possible to pass multiple + data_types in a single call. + +set timeout cli <delay> + Change the CLI interface timeout for current connection. This can be useful + during long debugging sessions where the user needs to constantly inspect + some indicators without being disconnected. The delay is passed in seconds. + +set var <name> <expression> +set var <name> expr <expression> +set var <name> fmt <format> + Allows to set or overwrite the process-wide variable 'name' with the result + of expression <expression> or format string <format>. Only process-wide + variables may be used, so the name must begin with 'proc.' otherwise no + variable will be set. The <expression> and <format> may only involve + "internal" sample fetch keywords and converters even though the most likely + useful ones will be str('something'), int(), simple strings or references to + other variables. Note that the command line parser doesn't know about quotes, + so any space in the expression must be preceded by a backslash. This command + requires levels "operator" or "admin". This command is only supported on a + CLI connection running in experimental mode (see "experimental-mode on"). + +set weight <backend>/<server> <weight>[%] + Change a server's weight to the value passed in argument. If the value ends + with the '%' sign, then the new weight will be relative to the initially + configured weight. Absolute weights are permitted between 0 and 256. + Relative weights must be positive with the resulting absolute weight is + capped at 256. Servers which are part of a farm running a static + load-balancing algorithm have stricter limitations because the weight + cannot change once set. Thus for these servers, the only accepted values + are 0 and 100% (or 0 and the initial weight). Changes take effect + immediately, though certain LB algorithms require a certain amount of + requests to consider changes. A typical usage of this command is to + disable a server during an update by setting its weight to zero, then to + enable it again after the update by setting it back to 100%. This command + is restricted and can only be issued on sockets configured for level + "admin". Both the backend and the server may be specified either by their + name or by their numeric ID, prefixed with a sharp ('#'). + +show acl [[@<ver>] <acl>] + Dump info about acl converters. Without argument, the list of all available + acls is returned. If a <acl> is specified, its contents are dumped. <acl> is + the #<id> or <file>. By default the current version of the ACL is shown (the + version currently being matched against and reported as 'curr_ver' in the ACL + list). It is possible to instead dump other versions by prepending '@<ver>' + before the ACL's identifier. The version works as a filter and non-existing + versions will simply report no result. The dump format is the same as for the + maps even for the sample values. The data returned are not a list of + available ACL, but are the list of all patterns composing any ACL. Many of + these patterns can be shared with maps. The 'entry_cnt' value represents the + count of all the ACL entries, not just the active ones, which means that it + also includes entries currently being added. + +show backend + Dump the list of backends available in the running process + +show cli level + Display the CLI level of the current CLI session. The result could be + 'admin', 'operator' or 'user'. See also the 'operator' and 'user' commands. + + Example : + + $ socat /tmp/sock1 readline + prompt + > operator + > show cli level + operator + > user + > show cli level + user + > operator + Permission denied + +operator + Decrease the CLI level of the current CLI session to operator. It can't be + increased. It also drops expert and experimental mode. See also "show cli + level". + +user + Decrease the CLI level of the current CLI session to user. It can't be + increased. It also drops expert and experimental mode. See also "show cli + level". + +show activity + Reports some counters about internal events that will help developers and + more generally people who know haproxy well enough to narrow down the causes + of reports of abnormal behaviours. A typical example would be a properly + running process never sleeping and eating 100% of the CPU. The output fields + will be made of one line per metric, and per-thread counters on the same + line. These counters are 32-bit and will wrap during the process's life, which + is not a problem since calls to this command will typically be performed + twice. The fields are purposely not documented so that their exact meaning is + verified in the code where the counters are fed. These values are also reset + by the "clear counters" command. + +show cli sockets + List CLI sockets. The output format is composed of 3 fields separated by + spaces. The first field is the socket address, it can be a unix socket, a + ipv4 address:port couple or a ipv6 one. Socket of other types won't be dump. + The second field describe the level of the socket: 'admin', 'user' or + 'operator'. The last field list the processes on which the socket is bound, + separated by commas, it can be numbers or 'all'. + + Example : + + $ echo 'show cli sockets' | socat stdio /tmp/sock1 + # socket lvl processes + /tmp/sock1 admin all + 127.0.0.1:9999 user 2,3,4 + 127.0.0.2:9969 user 2 + [::1]:9999 operator 2 + +show cache + List the configured caches and the objects stored in each cache tree. + + $ echo 'show cache' | socat stdio /tmp/sock1 + 0x7f6ac6c5b03a: foobar (shctx:0x7f6ac6c5b000, available blocks:3918) + 1 2 3 4 + + 1. pointer to the cache structure + 2. cache name + 3. pointer to the mmap area (shctx) + 4. number of blocks available for reuse in the shctx + + 0x7f6ac6c5b4cc hash:286881868 vary:0x0011223344556677 size:39114 (39 blocks), refcount:9, expire:237 + 1 2 3 4 5 6 7 + + 1. pointer to the cache entry + 2. first 32 bits of the hash + 3. secondary hash of the entry in case of vary + 4. size of the object in bytes + 5. number of blocks used for the object + 6. number of transactions using the entry + 7. expiration time, can be negative if already expired + +show env [<name>] + Dump one or all environment variables known by the process. Without any + argument, all variables are dumped. With an argument, only the specified + variable is dumped if it exists. Otherwise "Variable not found" is emitted. + Variables are dumped in the same format as they are stored or returned by the + "env" utility, that is, "<name>=<value>". This can be handy when debugging + certain configuration files making heavy use of environment variables to + ensure that they contain the expected values. This command is restricted and + can only be issued on sockets configured for levels "operator" or "admin". + +show errors [<iid>|<proxy>] [request|response] + Dump last known request and response errors collected by frontends and + backends. If <iid> is specified, the limit the dump to errors concerning + either frontend or backend whose ID is <iid>. Proxy ID "-1" will cause + all instances to be dumped. If a proxy name is specified instead, its ID + will be used as the filter. If "request" or "response" is added after the + proxy name or ID, only request or response errors will be dumped. This + command is restricted and can only be issued on sockets configured for + levels "operator" or "admin". + + The errors which may be collected are the last request and response errors + caused by protocol violations, often due to invalid characters in header + names. The report precisely indicates what exact character violated the + protocol. Other important information such as the exact date the error was + detected, frontend and backend names, the server name (when known), the + internal session ID and the source address which has initiated the session + are reported too. + + All characters are returned, and non-printable characters are encoded. The + most common ones (\t = 9, \n = 10, \r = 13 and \e = 27) are encoded as one + letter following a backslash. The backslash itself is encoded as '\\' to + avoid confusion. Other non-printable characters are encoded '\xNN' where + NN is the two-digits hexadecimal representation of the character's ASCII + code. + + Lines are prefixed with the position of their first character, starting at 0 + for the beginning of the buffer. At most one input line is printed per line, + and large lines will be broken into multiple consecutive output lines so that + the output never goes beyond 79 characters wide. It is easy to detect if a + line was broken, because it will not end with '\n' and the next line's offset + will be followed by a '+' sign, indicating it is a continuation of previous + line. + + Example : + $ echo "show errors -1 response" | socat stdio /tmp/sock1 + >>> [04/Mar/2009:15:46:56.081] backend http-in (#2) : invalid response + src 127.0.0.1, session #54, frontend fe-eth0 (#1), server s2 (#1) + response length 213 bytes, error at position 23: + + 00000 HTTP/1.0 200 OK\r\n + 00017 header/bizarre:blah\r\n + 00038 Location: blah\r\n + 00054 Long-line: this is a very long line which should b + 00104+ e broken into multiple lines on the output buffer, + 00154+ otherwise it would be too large to print in a ter + 00204+ minal\r\n + 00211 \r\n + + In the example above, we see that the backend "http-in" which has internal + ID 2 has blocked an invalid response from its server s2 which has internal + ID 1. The request was on session 54 initiated by source 127.0.0.1 and + received by frontend fe-eth0 whose ID is 1. The total response length was + 213 bytes when the error was detected, and the error was at byte 23. This + is the slash ('/') in header name "header/bizarre", which is not a valid + HTTP character for a header name. + +show events [<sink>] [-w] [-n] + With no option, this lists all known event sinks and their types. With an + option, it will dump all available events in the designated sink if it is of + type buffer. If option "-w" is passed after the sink name, then once the end + of the buffer is reached, the command will wait for new events and display + them. It is possible to stop the operation by entering any input (which will + be discarded) or by closing the session. Finally, option "-n" is used to + directly seek to the end of the buffer, which is often convenient when + combined with "-w" to only report new events. For convenience, "-wn" or "-nw" + may be used to enable both options at once. + +show fd [<fd>] + Dump the list of either all open file descriptors or just the one number <fd> + if specified. This is only aimed at developers who need to observe internal + states in order to debug complex issues such as abnormal CPU usages. One fd + is reported per lines, and for each of them, its state in the poller using + upper case letters for enabled flags and lower case for disabled flags, using + "P" for "polled", "R" for "ready", "A" for "active", the events status using + "H" for "hangup", "E" for "error", "O" for "output", "P" for "priority" and + "I" for "input", a few other flags like "N" for "new" (just added into the fd + cache), "U" for "updated" (received an update in the fd cache), "L" for + "linger_risk", "C" for "cloned", then the cached entry position, the pointer + to the internal owner, the pointer to the I/O callback and its name when + known. When the owner is a connection, the connection flags, and the target + are reported (frontend, proxy or server). When the owner is a listener, the + listener's state and its frontend are reported. There is no point in using + this command without a good knowledge of the internals. It's worth noting + that the output format may evolve over time so this output must not be parsed + by tools designed to be durable. Some internal structure states may look + suspicious to the function listing them, in this case the output line will be + suffixed with an exclamation mark ('!'). This may help find a starting point + when trying to diagnose an incident. + +show info [typed|json] [desc] [float] + Dump info about haproxy status on current process. If "typed" is passed as an + optional argument, field numbers, names and types are emitted as well so that + external monitoring products can easily retrieve, possibly aggregate, then + report information found in fields they don't know. Each field is dumped on + its own line. If "json" is passed as an optional argument then + information provided by "typed" output is provided in JSON format as a + list of JSON objects. By default, the format contains only two columns + delimited by a colon (':'). The left one is the field name and the right + one is the value. It is very important to note that in typed output + format, the dump for a single object is contiguous so that there is no + need for a consumer to store everything at once. If "float" is passed as an + optional argument, some fields usually emitted as integers may switch to + floats for higher accuracy. It is purposely unspecified which ones are + concerned as this might evolve over time. Using this option implies that the + consumer is able to process floats. The output format used is sprintf("%f"). + + When using the typed output format, each line is made of 4 columns delimited + by colons (':'). The first column is a dot-delimited series of 3 elements. The + first element is the numeric position of the field in the list (starting at + zero). This position shall not change over time, but holes are to be expected, + depending on build options or if some fields are deleted in the future. The + second element is the field name as it appears in the default "show info" + output. The third element is the relative process number starting at 1. + + The rest of the line starting after the first colon follows the "typed output + format" described in the section above. In short, the second column (after the + first ':') indicates the origin, nature and scope of the variable. The third + column indicates the type of the field, among "s32", "s64", "u32", "u64" and + "str". Then the fourth column is the value itself, which the consumer knows + how to parse thanks to column 3 and how to process thanks to column 2. + + Thus the overall line format in typed mode is : + + <field_pos>.<field_name>.<process_num>:<tags>:<type>:<value> + + When "desc" is appended to the command, one extra colon followed by a quoted + string is appended with a description for the metric. At the time of writing, + this is only supported for the "typed" and default output formats. + + Example : + + > show info + Name: HAProxy + Version: 1.7-dev1-de52ea-146 + Release_date: 2016/03/11 + Nbproc: 1 + Process_num: 1 + Pid: 28105 + Uptime: 0d 0h00m04s + Uptime_sec: 4 + Memmax_MB: 0 + PoolAlloc_MB: 0 + PoolUsed_MB: 0 + PoolFailed: 0 + (...) + + > show info typed + 0.Name.1:POS:str:HAProxy + 1.Version.1:POS:str:1.7-dev1-de52ea-146 + 2.Release_date.1:POS:str:2016/03/11 + 3.Nbproc.1:CGS:u32:1 + 4.Process_num.1:KGP:u32:1 + 5.Pid.1:SGP:u32:28105 + 6.Uptime.1:MDP:str:0d 0h00m08s + 7.Uptime_sec.1:MDP:u32:8 + 8.Memmax_MB.1:CLP:u32:0 + 9.PoolAlloc_MB.1:MGP:u32:0 + 10.PoolUsed_MB.1:MGP:u32:0 + 11.PoolFailed.1:MCP:u32:0 + (...) + + In the typed format, the presence of the process ID at the end of the + first column makes it very easy to visually aggregate outputs from + multiple processes. + Example : + + $ ( echo show info typed | socat /var/run/haproxy.sock1 ; \ + echo show info typed | socat /var/run/haproxy.sock2 ) | \ + sort -t . -k 1,1n -k 2,2 -k 3,3n + 0.Name.1:POS:str:HAProxy + 0.Name.2:POS:str:HAProxy + 1.Version.1:POS:str:1.7-dev1-868ab3-148 + 1.Version.2:POS:str:1.7-dev1-868ab3-148 + 2.Release_date.1:POS:str:2016/03/11 + 2.Release_date.2:POS:str:2016/03/11 + 3.Nbproc.1:CGS:u32:2 + 3.Nbproc.2:CGS:u32:2 + 4.Process_num.1:KGP:u32:1 + 4.Process_num.2:KGP:u32:2 + 5.Pid.1:SGP:u32:30120 + 5.Pid.2:SGP:u32:30121 + 6.Uptime.1:MDP:str:0d 0h01m28s + 6.Uptime.2:MDP:str:0d 0h01m28s + (...) + + The format of JSON output is described in a schema which may be output + using "show schema json". + + The JSON output contains no extra whitespace in order to reduce the + volume of output. For human consumption passing the output through a + pretty printer may be helpful. Example : + + $ echo "show info json" | socat /var/run/haproxy.sock stdio | \ + python -m json.tool + + The JSON output contains no extra whitespace in order to reduce the + volume of output. For human consumption passing the output through a + pretty printer may be helpful. Example : + + $ echo "show info json" | socat /var/run/haproxy.sock stdio | \ + python -m json.tool + +show libs + Dump the list of loaded shared dynamic libraries and object files, on systems + that support it. When available, for each shared object the range of virtual + addresses will be indicated, the size and the path to the object. This can be + used for example to try to estimate what library provides a function that + appears in a dump. Note that on many systems, addresses will change upon each + restart (address space randomization), so that this list would need to be + retrieved upon startup if it is expected to be used to analyse a core file. + This command may only be issued on sockets configured for levels "operator" + or "admin". Note that the output format may vary between operating systems, + architectures and even haproxy versions, and ought not to be relied on in + scripts. + +show map [[@<ver>] <map>] + Dump info about map converters. Without argument, the list of all available + maps is returned. If a <map> is specified, its contents are dumped. <map> is + the #<id> or <file>. By default the current version of the map is shown (the + version currently being matched against and reported as 'curr_ver' in the map + list). It is possible to instead dump other versions by prepending '@<ver>' + before the map's identifier. The version works as a filter and non-existing + versions will simply report no result. The 'entry_cnt' value represents the + count of all the map entries, not just the active ones, which means that it + also includes entries currently being added. + + In the output, the first column is a unique entry identifier, which is usable + as a reference for operations "del map" and "set map". The second column is + the pattern and the third column is the sample if available. The data returned + are not directly a list of available maps, but are the list of all patterns + composing any map. Many of these patterns can be shared with ACL. + +show peers [dict|-] [<peers section>] + Dump info about the peers configured in "peers" sections. Without argument, + the list of the peers belonging to all the "peers" sections are listed. If + <peers section> is specified, only the information about the peers belonging + to this "peers" section are dumped. When "dict" is specified before the peers + section name, the entire Tx/Rx dictionary caches will also be dumped (very + large). Passing "-" may be required to dump a peers section called "dict". + + Here are two examples of outputs where hostA, hostB and hostC peers belong to + "sharedlb" peers sections. Only hostA and hostB are connected. Only hostA has + sent data to hostB. + + $ echo "show peers" | socat - /tmp/hostA + 0x55deb0224320: [15/Apr/2019:11:28:01] id=sharedlb state=0 flags=0x3 \ + resync_timeout=<PAST> task_calls=45122 + 0x55deb022b540: id=hostC(remote) addr=127.0.0.12:10002 status=CONN \ + reconnect=4s confirm=0 + flags=0x0 + 0x55deb022a440: id=hostA(local) addr=127.0.0.10:10000 status=NONE \ + reconnect=<NEVER> confirm=0 + flags=0x0 + 0x55deb0227d70: id=hostB(remote) addr=127.0.0.11:10001 status=ESTA + reconnect=2s confirm=0 + flags=0x20000200 appctx:0x55deb028fba0 st0=7 st1=0 task_calls=14456 \ + state=EST + xprt=RAW src=127.0.0.1:37257 addr=127.0.0.10:10000 + remote_table:0x55deb0224a10 id=stkt local_id=1 remote_id=1 + last_local_table:0x55deb0224a10 id=stkt local_id=1 remote_id=1 + shared tables: + 0x55deb0224a10 local_id=1 remote_id=1 flags=0x0 remote_data=0x65 + last_acked=0 last_pushed=3 last_get=0 teaching_origin=0 update=3 + table:0x55deb022d6a0 id=stkt update=3 localupdate=3 \ + commitupdate=3 syncing=0 + + $ echo "show peers" | socat - /tmp/hostB + 0x55871b5ab320: [15/Apr/2019:11:28:03] id=sharedlb state=0 flags=0x3 \ + resync_timeout=<PAST> task_calls=3 + 0x55871b5b2540: id=hostC(remote) addr=127.0.0.12:10002 status=CONN \ + reconnect=3s confirm=0 + flags=0x0 + 0x55871b5b1440: id=hostB(local) addr=127.0.0.11:10001 status=NONE \ + reconnect=<NEVER> confirm=0 + flags=0x0 + 0x55871b5aed70: id=hostA(remote) addr=127.0.0.10:10000 status=ESTA \ + reconnect=2s confirm=0 + flags=0x20000200 appctx:0x7fa46800ee00 st0=7 st1=0 task_calls=62356 \ + state=EST + remote_table:0x55871b5ab960 id=stkt local_id=1 remote_id=1 + last_local_table:0x55871b5ab960 id=stkt local_id=1 remote_id=1 + shared tables: + 0x55871b5ab960 local_id=1 remote_id=1 flags=0x0 remote_data=0x65 + last_acked=3 last_pushed=0 last_get=3 teaching_origin=0 update=0 + table:0x55871b5b46a0 id=stkt update=1 localupdate=0 \ + commitupdate=0 syncing=0 + +show pools + Dump the status of internal memory pools. This is useful to track memory + usage when suspecting a memory leak for example. It does exactly the same + as the SIGQUIT when running in foreground except that it does not flush + the pools. + +show profiling [{all | status | tasks | memory}] [byaddr] [<max_lines>] + Dumps the current profiling settings, one per line, as well as the command + needed to change them. When tasks profiling is enabled, some per-function + statistics collected by the scheduler will also be emitted, with a summary + covering the number of calls, total/avg CPU time and total/avg latency. When + memory profiling is enabled, some information such as the number of + allocations/releases and their sizes will be reported. It is possible to + limit the dump to only the profiling status, the tasks, or the memory + profiling by specifying the respective keywords; by default all profiling + information are dumped. It is also possible to limit the number of lines + of output of each category by specifying a numeric limit. If is possible to + request that the output is sorted by address instead of usage, e.g. to ease + comparisons between subsequent calls. Please note that profiling is + essentially aimed at developers since it gives hints about where CPU cycles + or memory are wasted in the code. There is nothing useful to monitor there. + +show resolvers [<resolvers section id>] + Dump statistics for the given resolvers section, or all resolvers sections + if no section is supplied. + + For each name server, the following counters are reported: + sent: number of DNS requests sent to this server + valid: number of DNS valid responses received from this server + update: number of DNS responses used to update the server's IP address + cname: number of CNAME responses + cname_error: CNAME errors encountered with this server + any_err: number of empty response (IE: server does not support ANY type) + nx: non existent domain response received from this server + timeout: how many time this server did not answer in time + refused: number of requests refused by this server + other: any other DNS errors + invalid: invalid DNS response (from a protocol point of view) + too_big: too big response + outdated: number of response arrived too late (after another name server) + +show servers conn [<backend>] + Dump the current and idle connections state of the servers belonging to the + designated backend (or all backends if none specified). A backend name or + identifier may be used. + + The output consists in a header line showing the fields titles, then one + server per line with for each, the backend name and ID, server name and ID, + the address, port and a series or values. The number of fields varies + depending on thread count. + + Given the threaded nature of idle connections, it's important to understand + that some values may change once read, and that as such, consistency within a + line isn't granted. This output is mostly provided as a debugging tool and is + not relevant to be routinely monitored nor graphed. + +show servers state [<backend>] + Dump the state of the servers found in the running configuration. A backend + name or identifier may be provided to limit the output to this backend only. + + The dump has the following format: + - first line contains the format version (1 in this specification); + - second line contains the column headers, prefixed by a sharp ('#'); + - third line and next ones contain data; + - each line starting by a sharp ('#') is considered as a comment. + + Since multiple versions of the output may co-exist, below is the list of + fields and their order per file format version : + 1: + be_id: Backend unique id. + be_name: Backend label. + srv_id: Server unique id (in the backend). + srv_name: Server label. + srv_addr: Server IP address. + srv_op_state: Server operational state (UP/DOWN/...). + 0 = SRV_ST_STOPPED + The server is down. + 1 = SRV_ST_STARTING + The server is warming up (up but + throttled). + 2 = SRV_ST_RUNNING + The server is fully up. + 3 = SRV_ST_STOPPING + The server is up but soft-stopping + (eg: 404). + srv_admin_state: Server administrative state (MAINT/DRAIN/...). + The state is actually a mask of values : + 0x01 = SRV_ADMF_FMAINT + The server was explicitly forced into + maintenance. + 0x02 = SRV_ADMF_IMAINT + The server has inherited the maintenance + status from a tracked server. + 0x04 = SRV_ADMF_CMAINT + The server is in maintenance because of + the configuration. + 0x08 = SRV_ADMF_FDRAIN + The server was explicitly forced into + drain state. + 0x10 = SRV_ADMF_IDRAIN + The server has inherited the drain status + from a tracked server. + 0x20 = SRV_ADMF_RMAINT + The server is in maintenance because of an + IP address resolution failure. + 0x40 = SRV_ADMF_HMAINT + The server FQDN was set from stats socket. + + srv_uweight: User visible server's weight. + srv_iweight: Server's initial weight. + srv_time_since_last_change: Time since last operational change. + srv_check_status: Last health check status. + srv_check_result: Last check result (FAILED/PASSED/...). + 0 = CHK_RES_UNKNOWN + Initialized to this by default. + 1 = CHK_RES_NEUTRAL + Valid check but no status information. + 2 = CHK_RES_FAILED + Check failed. + 3 = CHK_RES_PASSED + Check succeeded and server is fully up + again. + 4 = CHK_RES_CONDPASS + Check reports the server doesn't want new + sessions. + srv_check_health: Checks rise / fall current counter. + srv_check_state: State of the check (ENABLED/PAUSED/...). + The state is actually a mask of values : + 0x01 = CHK_ST_INPROGRESS + A check is currently running. + 0x02 = CHK_ST_CONFIGURED + This check is configured and may be + enabled. + 0x04 = CHK_ST_ENABLED + This check is currently administratively + enabled. + 0x08 = CHK_ST_PAUSED + Checks are paused because of maintenance + (health only). + srv_agent_state: State of the agent check (ENABLED/PAUSED/...). + This state uses the same mask values as + "srv_check_state", adding this specific one : + 0x10 = CHK_ST_AGENT + Check is an agent check (otherwise it's a + health check). + bk_f_forced_id: Flag to know if the backend ID is forced by + configuration. + srv_f_forced_id: Flag to know if the server's ID is forced by + configuration. + srv_fqdn: Server FQDN. + srv_port: Server port. + srvrecord: DNS SRV record associated to this SRV. + srv_use_ssl: use ssl for server connections. + srv_check_port: Server health check port. + srv_check_addr: Server health check address. + srv_agent_addr: Server health agent address. + srv_agent_port: Server health agent port. + +show sess + Dump all known sessions. Avoid doing this on slow connections as this can + be huge. This command is restricted and can only be issued on sockets + configured for levels "operator" or "admin". Note that on machines with + quickly recycled connections, it is possible that this output reports less + entries than really exist because it will dump all existing sessions up to + the last one that was created before the command was entered; those which + die in the mean time will not appear. + +show sess <id> + Display a lot of internal information about the specified session identifier. + This identifier is the first field at the beginning of the lines in the dumps + of "show sess" (it corresponds to the session pointer). Those information are + useless to most users but may be used by haproxy developers to troubleshoot a + complex bug. The output format is intentionally not documented so that it can + freely evolve depending on demands. You may find a description of all fields + returned in src/dumpstats.c + + The special id "all" dumps the states of all sessions, which must be avoided + as much as possible as it is highly CPU intensive and can take a lot of time. + +show stat [domain <dns|proxy>] [{<iid>|<proxy>} <type> <sid>] [typed|json] \ + [desc] [up|no-maint] + Dump statistics. The domain is used to select which statistics to print; dns + and proxy are available for now. By default, the CSV format is used; you can + activate the extended typed output format described in the section above if + "typed" is passed after the other arguments; or in JSON if "json" is passed + after the other arguments. By passing <id>, <type> and <sid>, it is possible + to dump only selected items : + - <iid> is a proxy ID, -1 to dump everything. Alternatively, a proxy name + <proxy> may be specified. In this case, this proxy's ID will be used as + the ID selector. + - <type> selects the type of dumpable objects : 1 for frontends, 2 for + backends, 4 for servers, -1 for everything. These values can be ORed, + for example: + 1 + 2 = 3 -> frontend + backend. + 1 + 2 + 4 = 7 -> frontend + backend + server. + - <sid> is a server ID, -1 to dump everything from the selected proxy. + + Example : + $ echo "show info;show stat" | socat stdio unix-connect:/tmp/sock1 + >>> Name: HAProxy + Version: 1.4-dev2-49 + Release_date: 2009/09/23 + Nbproc: 1 + Process_num: 1 + (...) + + # pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq, (...) + stats,FRONTEND,,,0,0,1000,0,0,0,0,0,0,,,,,OPEN,,,,,,,,,1,1,0, (...) + stats,BACKEND,0,0,0,0,1000,0,0,0,0,0,,0,0,0,0,UP,0,0,0,,0,250,(...) + (...) + www1,BACKEND,0,0,0,0,1000,0,0,0,0,0,,0,0,0,0,UP,1,1,0,,0,250, (...) + + $ + + In this example, two commands have been issued at once. That way it's easy to + find which process the stats apply to in multi-process mode. This is not + needed in the typed output format as the process number is reported on each + line. Notice the empty line after the information output which marks the end + of the first block. A similar empty line appears at the end of the second + block (stats) so that the reader knows the output has not been truncated. + + When "typed" is specified, the output format is more suitable to monitoring + tools because it provides numeric positions and indicates the type of each + output field. Each value stands on its own line with process number, element + number, nature, origin and scope. This same format is available via the HTTP + stats by passing ";typed" after the URI. It is very important to note that in + typed output format, the dump for a single object is contiguous so that there + is no need for a consumer to store everything at once. + + The "up" modifier will result in listing only servers which reportedly up or + not checked. Those down, unresolved, or in maintenance will not be listed. + This is analogous to the ";up" option on the HTTP stats. Similarly, the + "no-maint" modifier will act like the ";no-maint" HTTP modifier and will + result in disabled servers not to be listed. The difference is that those + which are enabled but down will not be evicted. + + When using the typed output format, each line is made of 4 columns delimited + by colons (':'). The first column is a dot-delimited series of 5 elements. The + first element is a letter indicating the type of the object being described. + At the moment the following object types are known : 'F' for a frontend, 'B' + for a backend, 'L' for a listener, and 'S' for a server. The second element + The second element is a positive integer representing the unique identifier of + the proxy the object belongs to. It is equivalent to the "iid" column of the + CSV output and matches the value in front of the optional "id" directive found + in the frontend or backend section. The third element is a positive integer + containing the unique object identifier inside the proxy, and corresponds to + the "sid" column of the CSV output. ID 0 is reported when dumping a frontend + or a backend. For a listener or a server, this corresponds to their respective + ID inside the proxy. The fourth element is the numeric position of the field + in the list (starting at zero). This position shall not change over time, but + holes are to be expected, depending on build options or if some fields are + deleted in the future. The fifth element is the field name as it appears in + the CSV output. The sixth element is a positive integer and is the relative + process number starting at 1. + + The rest of the line starting after the first colon follows the "typed output + format" described in the section above. In short, the second column (after the + first ':') indicates the origin, nature and scope of the variable. The third + column indicates the field type, among "s32", "s64", "u32", "u64", "flt' and + "str". Then the fourth column is the value itself, which the consumer knows + how to parse thanks to column 3 and how to process thanks to column 2. + + When "desc" is appended to the command, one extra colon followed by a quoted + string is appended with a description for the metric. At the time of writing, + this is only supported for the "typed" output format. + + Thus the overall line format in typed mode is : + + <obj>.<px_id>.<id>.<fpos>.<fname>.<process_num>:<tags>:<type>:<value> + + Here's an example of typed output format : + + $ echo "show stat typed" | socat stdio unix-connect:/tmp/sock1 + F.2.0.0.pxname.1:MGP:str:private-frontend + F.2.0.1.svname.1:MGP:str:FRONTEND + F.2.0.8.bin.1:MGP:u64:0 + F.2.0.9.bout.1:MGP:u64:0 + F.2.0.40.hrsp_2xx.1:MGP:u64:0 + L.2.1.0.pxname.1:MGP:str:private-frontend + L.2.1.1.svname.1:MGP:str:sock-1 + L.2.1.17.status.1:MGP:str:OPEN + L.2.1.73.addr.1:MGP:str:0.0.0.0:8001 + S.3.13.60.rtime.1:MCP:u32:0 + S.3.13.61.ttime.1:MCP:u32:0 + S.3.13.62.agent_status.1:MGP:str:L4TOUT + S.3.13.64.agent_duration.1:MGP:u64:2001 + S.3.13.65.check_desc.1:MCP:str:Layer4 timeout + S.3.13.66.agent_desc.1:MCP:str:Layer4 timeout + S.3.13.67.check_rise.1:MCP:u32:2 + S.3.13.68.check_fall.1:MCP:u32:3 + S.3.13.69.check_health.1:SGP:u32:0 + S.3.13.70.agent_rise.1:MaP:u32:1 + S.3.13.71.agent_fall.1:SGP:u32:1 + S.3.13.72.agent_health.1:SGP:u32:1 + S.3.13.73.addr.1:MCP:str:1.255.255.255:8888 + S.3.13.75.mode.1:MAP:str:http + B.3.0.0.pxname.1:MGP:str:private-backend + B.3.0.1.svname.1:MGP:str:BACKEND + B.3.0.2.qcur.1:MGP:u32:0 + B.3.0.3.qmax.1:MGP:u32:0 + B.3.0.4.scur.1:MGP:u32:0 + B.3.0.5.smax.1:MGP:u32:0 + B.3.0.6.slim.1:MGP:u32:1000 + B.3.0.55.lastsess.1:MMP:s32:-1 + (...) + + In the typed format, the presence of the process ID at the end of the + first column makes it very easy to visually aggregate outputs from + multiple processes, as show in the example below where each line appears + for each process : + + $ ( echo show stat typed | socat /var/run/haproxy.sock1 - ; \ + echo show stat typed | socat /var/run/haproxy.sock2 - ) | \ + sort -t . -k 1,1 -k 2,2n -k 3,3n -k 4,4n -k 5,5 -k 6,6n + B.3.0.0.pxname.1:MGP:str:private-backend + B.3.0.0.pxname.2:MGP:str:private-backend + B.3.0.1.svname.1:MGP:str:BACKEND + B.3.0.1.svname.2:MGP:str:BACKEND + B.3.0.2.qcur.1:MGP:u32:0 + B.3.0.2.qcur.2:MGP:u32:0 + B.3.0.3.qmax.1:MGP:u32:0 + B.3.0.3.qmax.2:MGP:u32:0 + B.3.0.4.scur.1:MGP:u32:0 + B.3.0.4.scur.2:MGP:u32:0 + B.3.0.5.smax.1:MGP:u32:0 + B.3.0.5.smax.2:MGP:u32:0 + B.3.0.6.slim.1:MGP:u32:1000 + B.3.0.6.slim.2:MGP:u32:1000 + (...) + + The format of JSON output is described in a schema which may be output + using "show schema json". + + The JSON output contains no extra whitespace in order to reduce the + volume of output. For human consumption passing the output through a + pretty printer may be helpful. Example : + + $ echo "show stat json" | socat /var/run/haproxy.sock stdio | \ + python -m json.tool + + The JSON output contains no extra whitespace in order to reduce the + volume of output. For human consumption passing the output through a + pretty printer may be helpful. Example : + + $ echo "show stat json" | socat /var/run/haproxy.sock stdio | \ + python -m json.tool + +show ssl ca-file [<cafile>[:<index>]] + Display the list of CA files loaded into the process and their respective + certificate counts. The certificates are not used by any frontend or backend + until their status is "Used". + A "@system-ca" entry can appear in the list, it is loaded by the httpclient + by default. It contains the list of trusted CA of your system returned by + OpenSSL. + If a filename is prefixed by an asterisk, it is a transaction which + is not committed yet. If a <cafile> is specified without <index>, it will show + the status of the CA file ("Used"/"Unused") followed by details about all the + certificates contained in the CA file. The details displayed for every + certificate are the same as the ones displayed by a "show ssl cert" command. + If a <cafile> is specified followed by an <index>, it will only display the + details of the certificate having the specified index. Indexes start from 1. + If the index is invalid (too big for instance), nothing will be displayed. + This command can be useful to check if a CA file was properly updated. + You can also display the details of an ongoing transaction by prefixing the + filename by an asterisk. + + Example : + + $ echo "show ssl ca-file" | socat /var/run/haproxy.master - + # transaction + *cafile.crt - 2 certificate(s) + # filename + cafile.crt - 1 certificate(s) + + $ echo "show ssl ca-file cafile.crt" | socat /var/run/haproxy.master - + Filename: /home/tricot/work/haproxy/reg-tests/ssl/set_cafile_ca2.crt + Status: Used + + Certificate #1: + Serial: 11A4D2200DC84376E7D233CAFF39DF44BF8D1211 + notBefore: Apr 1 07:40:53 2021 GMT + notAfter: Aug 17 07:40:53 2048 GMT + Subject Alternative Name: + Algorithm: RSA4096 + SHA1 FingerPrint: A111EF0FEFCDE11D47FE3F33ADCA8435EBEA4864 + Subject: /C=FR/ST=Some-State/O=HAProxy Technologies/CN=HAProxy Technologies CA + Issuer: /C=FR/ST=Some-State/O=HAProxy Technologies/CN=HAProxy Technologies CA + + $ echo "show ssl ca-file *cafile.crt:2" | socat /var/run/haproxy.master - + Filename: */home/tricot/work/haproxy/reg-tests/ssl/set_cafile_ca2.crt + Status: Unused + + Certificate #2: + Serial: 587A1CE5ED855040A0C82BF255FF300ADB7C8136 + [...] + +show ssl cert [<filename>] + Display the list of certificates loaded into the process. They are not used + by any frontend or backend until their status is "Used". + If a filename is prefixed by an asterisk, it is a transaction which is not + committed yet. If a filename is specified, it will show details about the + certificate. This command can be useful to check if a certificate was well + updated. You can also display details on a transaction by prefixing the + filename by an asterisk. + This command can also be used to display the details of a certificate's OCSP + response by suffixing the filename with a ".ocsp" extension. It works for + committed certificates as well as for ongoing transactions. On a committed + certificate, this command is equivalent to calling "show ssl ocsp-response" + with the certificate's corresponding OCSP response ID. + + Example : + + $ echo "@1 show ssl cert" | socat /var/run/haproxy.master - + # transaction + *test.local.pem + # filename + test.local.pem + + $ echo "@1 show ssl cert test.local.pem" | socat /var/run/haproxy.master - + Filename: test.local.pem + Status: Used + Serial: 03ECC19BA54B25E85ABA46EE561B9A10D26F + notBefore: Sep 13 21:20:24 2019 GMT + notAfter: Dec 12 21:20:24 2019 GMT + Issuer: /C=US/O=Let's Encrypt/CN=Let's Encrypt Authority X3 + Subject: /CN=test.local + Subject Alternative Name: DNS:test.local, DNS:imap.test.local + Algorithm: RSA2048 + SHA1 FingerPrint: 417A11CAE25F607B24F638B4A8AEE51D1E211477 + + $ echo "@1 show ssl cert *test.local.pem" | socat /var/run/haproxy.master - + Filename: *test.local.pem + Status: Unused + [...] + +show ssl crl-file [<crlfile>[:<index>]] + Display the list of CRL files loaded into the process. They are not used + by any frontend or backend until their status is "Used". + If a filename is prefixed by an asterisk, it is a transaction which is not + committed yet. If a <crlfile> is specified without <index>, it will show the + status of the CRL file ("Used"/"Unused") followed by details about all the + Revocation Lists contained in the CRL file. The details displayed for every + list are based on the output of "openssl crl -text -noout -in <file>". + If a <crlfile> is specified followed by an <index>, it will only display the + details of the list having the specified index. Indexes start from 1. + If the index is invalid (too big for instance), nothing will be displayed. + This command can be useful to check if a CRL file was properly updated. + You can also display the details of an ongoing transaction by prefixing the + filename by an asterisk. + + Example : + + $ echo "show ssl crl-file" | socat /var/run/haproxy.master - + # transaction + *crlfile.pem + # filename + crlfile.pem + + $ echo "show ssl crl-file crlfile.pem" | socat /var/run/haproxy.master - + Filename: /home/tricot/work/haproxy/reg-tests/ssl/crlfile.pem + Status: Used + + Certificate Revocation List #1: + Version 1 + Signature Algorithm: sha256WithRSAEncryption + Issuer: /C=FR/O=HAProxy Technologies/CN=Intermediate CA2 + Last Update: Apr 23 14:45:39 2021 GMT + Next Update: Sep 8 14:45:39 2048 GMT + Revoked Certificates: + Serial Number: 1008 + Revocation Date: Apr 23 14:45:36 2021 GMT + + Certificate Revocation List #2: + Version 1 + Signature Algorithm: sha256WithRSAEncryption + Issuer: /C=FR/O=HAProxy Technologies/CN=Root CA + Last Update: Apr 23 14:30:44 2021 GMT + Next Update: Sep 8 14:30:44 2048 GMT + No Revoked Certificates. + +show ssl crt-list [-n] [<filename>] + Display the list of crt-list and directories used in the HAProxy + configuration. If a filename is specified, dump the content of a crt-list or + a directory. Once dumped the output can be used as a crt-list file. + The '-n' option can be used to display the line number, which is useful when + combined with the 'del ssl crt-list' option when a entry is duplicated. The + output with the '-n' option is not compatible with the crt-list format and + not loadable by haproxy. + + Example: + echo "show ssl crt-list -n localhost.crt-list" | socat /tmp/sock1 - + # localhost.crt-list + common.pem:1 !not.test1.com *.test1.com !localhost + common.pem:2 + ecdsa.pem:3 [verify none allow-0rtt ssl-min-ver TLSv1.0 ssl-max-ver TLSv1.3] localhost !www.test1.com + ecdsa.pem:4 [verify none allow-0rtt ssl-min-ver TLSv1.0 ssl-max-ver TLSv1.3] + +show ssl ocsp-response [<id>] + Display the IDs of the OCSP tree entries corresponding to all the OCSP + responses used in HAProxy, as well as the issuer's name and key hash and the + serial number of the certificate for which the OCSP response was built. + If a valid <id> is provided, display the contents of the corresponding OCSP + response. The information displayed is the same as in an "openssl ocsp -respin + <ocsp-response> -text" call. + + Example : + + $ echo "show ssl ocsp-response" | socat /var/run/haproxy.master - + # Certificate IDs + Certificate ID key : 303b300906052b0e03021a050004148a83e0060faff709ca7e9b95522a2e81635fda0a0414f652b0e435d5ea923851508f0adbe92d85de007a0202100a + Certificate ID: + Issuer Name Hash: 8A83E0060FAFF709CA7E9B95522A2E81635FDA0A + Issuer Key Hash: F652B0E435D5EA923851508F0ADBE92D85DE007A + Serial Number: 100A + + $ echo "show ssl ocsp-response 303b300906052b0e03021a050004148a83e0060faff709ca7e9b95522a2e81635fda0a0414f652b0e435d5ea923851508f0adbe92d85de007a0202100a" | socat /var/run/haproxy.master - + OCSP Response Data: + OCSP Response Status: successful (0x0) + Response Type: Basic OCSP Response + Version: 1 (0x0) + Responder Id: C = FR, O = HAProxy Technologies, CN = ocsp.haproxy.com + Produced At: May 27 15:43:38 2021 GMT + Responses: + Certificate ID: + Hash Algorithm: sha1 + Issuer Name Hash: 8A83E0060FAFF709CA7E9B95522A2E81635FDA0A + Issuer Key Hash: F652B0E435D5EA923851508F0ADBE92D85DE007A + Serial Number: 100A + Cert Status: good + This Update: May 27 15:43:38 2021 GMT + Next Update: Oct 12 15:43:38 2048 GMT + [...] + +show ssl providers + Display the names of the providers loaded by OpenSSL during init. Provider + loading can indeed be configured via the OpenSSL configuration file and this + option allows to check that the right providers were loaded. This command is + only available with OpenSSL v3. + + Example : + $ echo "show ssl providers" | socat /var/run/haproxy.master - + Loaded providers : + - fips + - base + +show startup-logs + Dump all messages emitted during the startup of the current haproxy process, + each startup-logs buffer is unique to its haproxy worker. + +show table + Dump general information on all known stick-tables. Their name is returned + (the name of the proxy which holds them), their type (currently zero, always + IP), their size in maximum possible number of entries, and the number of + entries currently in use. + + Example : + $ echo "show table" | socat stdio /tmp/sock1 + >>> # table: front_pub, type: ip, size:204800, used:171454 + >>> # table: back_rdp, type: ip, size:204800, used:0 + +show table <name> [ data.<type> <operator> <value> [data.<type> ...]] | [ key <key> ] + Dump contents of stick-table <name>. In this mode, a first line of generic + information about the table is reported as with "show table", then all + entries are dumped. Since this can be quite heavy, it is possible to specify + a filter in order to specify what entries to display. + + When the "data." form is used the filter applies to the stored data (see + "stick-table" in section 4.2). A stored data type must be specified + in <type>, and this data type must be stored in the table otherwise an + error is reported. The data is compared according to <operator> with the + 64-bit integer <value>. Operators are the same as with the ACLs : + + - eq : match entries whose data is equal to this value + - ne : match entries whose data is not equal to this value + - le : match entries whose data is less than or equal to this value + - ge : match entries whose data is greater than or equal to this value + - lt : match entries whose data is less than this value + - gt : match entries whose data is greater than this value + + In this form, you can use multiple data filter entries, up to a maximum + defined during build time (4 by default). + + When the key form is used the entry <key> is shown. The key must be of the + same type as the table, which currently is limited to IPv4, IPv6, integer, + and string. + + Example : + $ echo "show table http_proxy" | socat stdio /tmp/sock1 + >>> # table: http_proxy, type: ip, size:204800, used:2 + >>> 0x80e6a4c: key=127.0.0.1 use=0 exp=3594729 gpc0=0 conn_rate(30000)=1 \ + bytes_out_rate(60000)=187 + >>> 0x80e6a80: key=127.0.0.2 use=0 exp=3594740 gpc0=1 conn_rate(30000)=10 \ + bytes_out_rate(60000)=191 + + $ echo "show table http_proxy data.gpc0 gt 0" | socat stdio /tmp/sock1 + >>> # table: http_proxy, type: ip, size:204800, used:2 + >>> 0x80e6a80: key=127.0.0.2 use=0 exp=3594740 gpc0=1 conn_rate(30000)=10 \ + bytes_out_rate(60000)=191 + + $ echo "show table http_proxy data.conn_rate gt 5" | \ + socat stdio /tmp/sock1 + >>> # table: http_proxy, type: ip, size:204800, used:2 + >>> 0x80e6a80: key=127.0.0.2 use=0 exp=3594740 gpc0=1 conn_rate(30000)=10 \ + bytes_out_rate(60000)=191 + + $ echo "show table http_proxy key 127.0.0.2" | \ + socat stdio /tmp/sock1 + >>> # table: http_proxy, type: ip, size:204800, used:2 + >>> 0x80e6a80: key=127.0.0.2 use=0 exp=3594740 gpc0=1 conn_rate(30000)=10 \ + bytes_out_rate(60000)=191 + + When the data criterion applies to a dynamic value dependent on time such as + a bytes rate, the value is dynamically computed during the evaluation of the + entry in order to decide whether it has to be dumped or not. This means that + such a filter could match for some time then not match anymore because as + time goes, the average event rate drops. + + It is possible to use this to extract lists of IP addresses abusing the + service, in order to monitor them or even blacklist them in a firewall. + Example : + $ echo "show table http_proxy data.gpc0 gt 0" \ + | socat stdio /tmp/sock1 \ + | fgrep 'key=' | cut -d' ' -f2 | cut -d= -f2 > abusers-ip.txt + ( or | awk '/key/{ print a[split($2,a,"=")]; }' ) + +show tasks + Dumps the number of tasks currently in the run queue, with the number of + occurrences for each function, and their average latency when it's known + (for pure tasks with task profiling enabled). The dump is a snapshot of the + instant it's done, and there may be variations depending on what tasks are + left in the queue at the moment it happens, especially in mono-thread mode + as there's less chance that I/Os can refill the queue (unless the queue is + full). This command takes exclusive access to the process and can cause + minor but measurable latencies when issued on a highly loaded process, so + it must not be abused by monitoring bots. + +show threads + Dumps some internal states and structures for each thread, that may be useful + to help developers understand a problem. The output tries to be readable by + showing one block per thread. When haproxy is built with USE_THREAD_DUMP=1, + an advanced dump mechanism involving thread signals is used so that each + thread can dump its own state in turn. Without this option, the thread + processing the command shows all its details but the other ones are less + detailed. A star ('*') is displayed in front of the thread handling the + command. A right angle bracket ('>') may also be displayed in front of + threads which didn't make any progress since last invocation of this command, + indicating a bug in the code which must absolutely be reported. When this + happens between two threads it usually indicates a deadlock. If a thread is + alone, it's a different bug like a corrupted list. In all cases the process + needs is not fully functional anymore and needs to be restarted. + + The output format is purposely not documented so that it can easily evolve as + new needs are identified, without having to maintain any form of backwards + compatibility, and just like with "show activity", the values are meaningless + without the code at hand. + +show tls-keys [id|*] + Dump all loaded TLS ticket keys references. The TLS ticket key reference ID + and the file from which the keys have been loaded is shown. Both of those + can be used to update the TLS keys using "set ssl tls-key". If an ID is + specified as parameter, it will dump the tickets, using * it will dump every + keys from every references. + +show schema json + Dump the schema used for the output of "show info json" and "show stat json". + + The contains no extra whitespace in order to reduce the volume of output. + For human consumption passing the output through a pretty printer may be + helpful. Example : + + $ echo "show schema json" | socat /var/run/haproxy.sock stdio | \ + python -m json.tool + + The schema follows "JSON Schema" (json-schema.org) and accordingly + verifiers may be used to verify the output of "show info json" and "show + stat json" against the schema. + +show trace [<source>] + Show the current trace status. For each source a line is displayed with a + single-character status indicating if the trace is stopped, waiting, or + running. The output sink used by the trace is indicated (or "none" if none + was set), as well as the number of dropped events in this sink, followed by a + brief description of the source. If a source name is specified, a detailed + list of all events supported by the source, and their status for each action + (report, start, pause, stop), indicated by a "+" if they are enabled, or a + "-" otherwise. All these events are independent and an event might trigger + a start without being reported and conversely. + +show version + Show the version of the current HAProxy process. This is available from + master and workers CLI. + Example: + + $ echo "show version" | socat /var/run/haproxy.sock stdio + 2.4.9 + + $ echo "show version" | socat /var/run/haproxy-master.sock stdio + 2.5.0 + +shutdown frontend <frontend> + Completely delete the specified frontend. All the ports it was bound to will + be released. It will not be possible to enable the frontend anymore after + this operation. This is intended to be used in environments where stopping a + proxy is not even imaginable but a misconfigured proxy must be fixed. That + way it's possible to release the port and bind it into another process to + restore operations. The frontend will not appear at all on the stats page + once it is terminated. + + The frontend may be specified either by its name or by its numeric ID, + prefixed with a sharp ('#'). + + This command is restricted and can only be issued on sockets configured for + level "admin". + +shutdown session <id> + Immediately terminate the session matching the specified session identifier. + This identifier is the first field at the beginning of the lines in the dumps + of "show sess" (it corresponds to the session pointer). This can be used to + terminate a long-running session without waiting for a timeout or when an + endless transfer is ongoing. Such terminated sessions are reported with a 'K' + flag in the logs. + +shutdown sessions server <backend>/<server> + Immediately terminate all the sessions attached to the specified server. This + can be used to terminate long-running sessions after a server is put into + maintenance mode, for instance. Such terminated sessions are reported with a + 'K' flag in the logs. + +trace + The "trace" command alone lists the trace sources, their current status, and + their brief descriptions. It is only meant as a menu to enter next levels, + see other "trace" commands below. + +trace 0 + Immediately stops all traces. This is made to be used as a quick solution + to terminate a debugging session or as an emergency action to be used in case + complex traces were enabled on multiple sources and impact the service. + +trace <source> event [ [+|-|!]<name> ] + Without argument, this will list all the events supported by the designated + source. They are prefixed with a "-" if they are not enabled, or a "+" if + they are enabled. It is important to note that a single trace may be labelled + with multiple events, and as long as any of the enabled events matches one of + the events labelled on the trace, the event will be passed to the trace + subsystem. For example, receiving an HTTP/2 frame of type HEADERS may trigger + a frame event and a stream event since the frame creates a new stream. If + either the frame event or the stream event are enabled for this source, the + frame will be passed to the trace framework. + + With an argument, it is possible to toggle the state of each event and + individually enable or disable them. Two special keywords are supported, + "none", which matches no event, and is used to disable all events at once, + and "any" which matches all events, and is used to enable all events at + once. Other events are specific to the event source. It is possible to + enable one event by specifying its name, optionally prefixed with '+' for + better readability. It is possible to disable one event by specifying its + name prefixed by a '-' or a '!'. + + One way to completely disable a trace source is to pass "event none", and + this source will instantly be totally ignored. + +trace <source> level [<level>] + Without argument, this will list all trace levels for this source, and the + current one will be indicated by a star ('*') prepended in front of it. With + an argument, this will change the trace level to the specified level. Detail + levels are a form of filters that are applied before reporting the events. + These filters are used to selectively include or exclude events depending on + their level of importance. For example a developer might need to know + precisely where in the code an HTTP header was considered invalid while the + end user may not even care about this header's validity at all. There are + currently 5 distinct levels for a trace : + + user this will report information that are suitable for use by a + regular haproxy user who wants to observe his traffic. + Typically some HTTP requests and responses will be reported + without much detail. Most sources will set this as the + default level to ease operations. + + proto in addition to what is reported at the "user" level, it also + displays protocol-level updates. This can for example be the + frame types or HTTP headers after decoding. + + state in addition to what is reported at the "proto" level, it + will also display state transitions (or failed transitions) + which happen in parsers, so this will show attempts to + perform an operation while the "proto" level only shows + the final operation. + + data in addition to what is reported at the "state" level, it + will also include data transfers between the various layers. + + developer it reports everything available, which can include advanced + information such as "breaking out of this loop" that are + only relevant to a developer trying to understand a bug that + only happens once in a while in field. Function names are + only reported at this level. + + It is highly recommended to always use the "user" level only and switch to + other levels only if instructed to do so by a developer. Also it is a good + idea to first configure the events before switching to higher levels, as it + may save from dumping many lines if no filter is applied. + +trace <source> lock [criterion] + Without argument, this will list all the criteria supported by this source + for lock-on processing, and display the current choice by a star ('*') in + front of it. Lock-on means that the source will focus on the first matching + event and only stick to the criterion which triggered this event, and ignore + all other ones until the trace stops. This allows for example to take a trace + on a single connection or on a single stream. The following criteria are + supported by some traces, though not necessarily all, since some of them + might not be available to the source : + + backend lock on the backend that started the trace + connection lock on the connection that started the trace + frontend lock on the frontend that started the trace + listener lock on the listener that started the trace + nothing do not lock on anything + server lock on the server that started the trace + session lock on the session that started the trace + thread lock on the thread that started the trace + + In addition to this, each source may provide up to 4 specific criteria such + as internal states or connection IDs. For example in HTTP/2 it is possible + to lock on the H2 stream and ignore other streams once a strace starts. + + When a criterion is passed in argument, this one is used instead of the + other ones and any existing tracking is immediately terminated so that it can + restart with the new criterion. The special keyword "nothing" is supported by + all sources to permanently disable tracking. + +trace <source> { pause | start | stop } [ [+|-|!]event] + Without argument, this will list the events enabled to automatically pause, + start, or stop a trace for this source. These events are specific to each + trace source. With an argument, this will either enable the event for the + specified action (if optionally prefixed by a '+') or disable it (if + prefixed by a '-' or '!'). The special keyword "now" is not an event and + requests to take the action immediately. The keywords "none" and "any" are + supported just like in "trace event". + + The 3 supported actions are respectively "pause", "start" and "stop". The + "pause" action enumerates events which will cause a running trace to stop and + wait for a new start event to restart it. The "start" action enumerates the + events which switch the trace into the waiting mode until one of the start + events appears. And the "stop" action enumerates the events which definitely + stop the trace until it is manually enabled again. In practice it makes sense + to manually start a trace using "start now" without caring about events, and + to stop it using "stop now". In order to capture more subtle event sequences, + setting "start" to a normal event (like receiving an HTTP request) and "stop" + to a very rare event like emitting a certain error, will ensure that the last + captured events will match the desired criteria. And the pause event is + useful to detect the end of a sequence, disable the lock-on and wait for + another opportunity to take a capture. In this case it can make sense to + enable lock-on to spot only one specific criterion (e.g. a stream), and have + "start" set to anything that starts this criterion (e.g. all events which + create a stream), "stop" set to the expected anomaly, and "pause" to anything + that ends that criterion (e.g. any end of stream event). In this case the + trace log will contain complete sequences of perfectly clean series affecting + a single object, until the last sequence containing everything from the + beginning to the anomaly. + +trace <source> sink [<sink>] + Without argument, this will list all event sinks available for this source, + and the currently configured one will have a star ('*') prepended in front + of it. Sink "none" is always available and means that all events are simply + dropped, though their processing is not ignored (e.g. lock-on does occur). + Other sinks are available depending on configuration and build options, but + typically "stdout" and "stderr" will be usable in debug mode, and in-memory + ring buffers should be available as well. When a name is specified, the sink + instantly changes for the specified source. Events are not changed during a + sink change. In the worst case some may be lost if an invalid sink is used + (or "none"), but operations do continue to a different destination. + +trace <source> verbosity [<level>] + Without argument, this will list all verbosity levels for this source, and the + current one will be indicated by a star ('*') prepended in front of it. With + an argument, this will change the verbosity level to the specified one. + + Verbosity levels indicate how far the trace decoder should go to provide + detailed information. It depends on the trace source, since some sources will + not even provide a specific decoder. Level "quiet" is always available and + disables any decoding. It can be useful when trying to figure what's + happening before trying to understand the details, since it will have a very + low impact on performance and trace size. When no verbosity levels are + declared by a source, level "default" is available and will cause a decoder + to be called when specified in the traces. It is an opportunistic decoding. + When the source declares some verbosity levels, these ones are listed with + a description of what they correspond to. In this case the trace decoder + provided by the source will be as accurate as possible based on the + information available at the trace point. The first level above "quiet" is + set by default. + + +9.4. Master CLI +--------------- + +The master CLI is a socket bound to the master process in master-worker mode. +This CLI gives access to the unix socket commands in every running or leaving +processes and allows a basic supervision of those processes. + +The master CLI is configurable only from the haproxy program arguments with +the -S option. This option also takes bind options separated by commas. + +Example: + + # haproxy -W -S 127.0.0.1:1234 -f test1.cfg + # haproxy -Ws -S /tmp/master-socket,uid,1000,gid,1000,mode,600 -f test1.cfg + # haproxy -W -S /tmp/master-socket,level,user -f test1.cfg + + +9.4.1. Master CLI commands +-------------------------- + +@<[!]pid> + The master CLI uses a special prefix notation to access the multiple + processes. This notation is easily identifiable as it begins by a @. + + A @ prefix can be followed by a relative process number or by an exclamation + point and a PID. (e.g. @1 or @!1271). A @ alone could be use to specify the + master. Leaving processes are only accessible with the PID as relative process + number are only usable with the current processes. + + Examples: + + $ socat /var/run/haproxy-master.sock readline + prompt + master> @1 show info; @2 show info + [...] + Process_num: 1 + Pid: 1271 + [...] + Process_num: 2 + Pid: 1272 + [...] + master> + + $ echo '@!1271 show info; @!1272 show info' | socat /var/run/haproxy-master.sock - + [...] + + A prefix could be use as a command, which will send every next commands to + the specified process. + + Examples: + + $ socat /var/run/haproxy-master.sock readline + prompt + master> @1 + 1271> show info + [...] + 1271> show stat + [...] + 1271> @ + master> + + $ echo '@1; show info; show stat; @2; show info; show stat' | socat /var/run/haproxy-master.sock - + [...] + +expert-mode [on|off] + This command activates the "expert-mode" for every worker accessed from the + master CLI. Combined with "mcli-debug-mode" it also activates the command on + the master. Display the flag "e" in the master CLI prompt. + + See also "expert-mode" in Section 9.3 and "mcli-debug-mode" in 9.4.1. + +experimental-mode [on|off] + This command activates the "experimental-mode" for every worker accessed from + the master CLI. Combined with "mcli-debug-mode" it also activates the command on + the master. Display the flag "x" in the master CLI prompt. + + See also "experimental-mode" in Section 9.3 and "mcli-debug-mode" in 9.4.1. + +mcli-debug-mode [on|off] + This keyword allows a special mode in the master CLI which enables every + keywords that were meant for a worker CLI on the master CLI, allowing to debug + the master process. Once activated, you list the new available keywords with + "help". Combined with "experimental-mode" or "expert-mode" it enables even + more keywords. Display the flag "d" in the master CLI prompt. + +prompt + When the prompt is enabled (via the "prompt" command), the context the CLI is + working on is displayed in the prompt. The master is identified by the "master" + string, and other processes are identified with their PID. In case the last + reload failed, the master prompt will be changed to "master[ReloadFailed]>" so + that it becomes visible that the process is still running on the previous + configuration and that the new configuration is not operational. + + The prompt of the master CLI is able to display several flags which are the + enable modes. "d" for mcli-debug-mode, "e" for expert-mode, "x" for + experimental-mode. + + Example: + $ socat /var/run/haproxy-master.sock - + prompt + master> expert-mode on + master(e)> experimental-mode on + master(xe)> mcli-debug-mode on + master(xed)> @1 + 95191(xed)> + +reload + You can also reload the HAProxy master process with the "reload" command which + does the same as a `kill -USR2` on the master process, provided that the user + has at least "operator" or "admin" privileges. + + Example: + + $ echo "reload" | socat /var/run/haproxy-master.sock stdin + + Note that a reload will close the connection to the master CLI. + +show proc + The master CLI introduces a 'show proc' command to surpervise the + processe. + + Example: + + $ echo 'show proc' | socat /var/run/haproxy-master.sock - + #<PID> <type> <reloads> <uptime> <version> + 1162 master 5 [failed: 0] 0d00h02m07s 2.5-dev13 + # workers + 1271 worker 1 0d00h00m00s 2.5-dev13 + # old workers + 1233 worker 3 0d00h00m43s 2.0-dev3-6019f6-289 + # programs + 1244 foo 0 0d00h00m00s - + 1255 bar 0 0d00h00m00s - + + In this example, the master has been reloaded 5 times but one of the old + worker is still running and survived 3 reloads. You could access the CLI of + this worker to understand what's going on. + +10. Tricks for easier configuration management +---------------------------------------------- + +It is very common that two HAProxy nodes constituting a cluster share exactly +the same configuration modulo a few addresses. Instead of having to maintain a +duplicate configuration for each node, which will inevitably diverge, it is +possible to include environment variables in the configuration. Thus multiple +configuration may share the exact same file with only a few different system +wide environment variables. This started in version 1.5 where only addresses +were allowed to include environment variables, and 1.6 goes further by +supporting environment variables everywhere. The syntax is the same as in the +UNIX shell, a variable starts with a dollar sign ('$'), followed by an opening +curly brace ('{'), then the variable name followed by the closing brace ('}'). +Except for addresses, environment variables are only interpreted in arguments +surrounded with double quotes (this was necessary not to break existing setups +using regular expressions involving the dollar symbol). + +Environment variables also make it convenient to write configurations which are +expected to work on various sites where only the address changes. It can also +permit to remove passwords from some configs. Example below where the the file +"site1.env" file is sourced by the init script upon startup : + + $ cat site1.env + LISTEN=192.168.1.1 + CACHE_PFX=192.168.11 + SERVER_PFX=192.168.22 + LOGGER=192.168.33.1 + STATSLP=admin:pa$$w0rd + ABUSERS=/etc/haproxy/abuse.lst + TIMEOUT=10s + + $ cat haproxy.cfg + global + log "${LOGGER}:514" local0 + + defaults + mode http + timeout client "${TIMEOUT}" + timeout server "${TIMEOUT}" + timeout connect 5s + + frontend public + bind "${LISTEN}:80" + http-request reject if { src -f "${ABUSERS}" } + stats uri /stats + stats auth "${STATSLP}" + use_backend cache if { path_end .jpg .css .ico } + default_backend server + + backend cache + server cache1 "${CACHE_PFX}.1:18080" check + server cache2 "${CACHE_PFX}.2:18080" check + + backend server + server cache1 "${SERVER_PFX}.1:8080" check + server cache2 "${SERVER_PFX}.2:8080" check + + +11. Well-known traps to avoid +----------------------------- + +Once in a while, someone reports that after a system reboot, the haproxy +service wasn't started, and that once they start it by hand it works. Most +often, these people are running a clustered IP address mechanism such as +keepalived, to assign the service IP address to the master node only, and while +it used to work when they used to bind haproxy to address 0.0.0.0, it stopped +working after they bound it to the virtual IP address. What happens here is +that when the service starts, the virtual IP address is not yet owned by the +local node, so when HAProxy wants to bind to it, the system rejects this +because it is not a local IP address. The fix doesn't consist in delaying the +haproxy service startup (since it wouldn't stand a restart), but instead to +properly configure the system to allow binding to non-local addresses. This is +easily done on Linux by setting the net.ipv4.ip_nonlocal_bind sysctl to 1. This +is also needed in order to transparently intercept the IP traffic that passes +through HAProxy for a specific target address. + +Multi-process configurations involving source port ranges may apparently seem +to work but they will cause some random failures under high loads because more +than one process may try to use the same source port to connect to the same +server, which is not possible. The system will report an error and a retry will +happen, picking another port. A high value in the "retries" parameter may hide +the effect to a certain extent but this also comes with increased CPU usage and +processing time. Logs will also report a certain number of retries. For this +reason, port ranges should be avoided in multi-process configurations. + +Since HAProxy uses SO_REUSEPORT and supports having multiple independent +processes bound to the same IP:port, during troubleshooting it can happen that +an old process was not stopped before a new one was started. This provides +absurd test results which tend to indicate that any change to the configuration +is ignored. The reason is that in fact even the new process is restarted with a +new configuration, the old one also gets some incoming connections and +processes them, returning unexpected results. When in doubt, just stop the new +process and try again. If it still works, it very likely means that an old +process remains alive and has to be stopped. Linux's "netstat -lntp" is of good +help here. + +When adding entries to an ACL from the command line (eg: when blacklisting a +source address), it is important to keep in mind that these entries are not +synchronized to the file and that if someone reloads the configuration, these +updates will be lost. While this is often the desired effect (for blacklisting) +it may not necessarily match expectations when the change was made as a fix for +a problem. See the "add acl" action of the CLI interface. + + +12. Debugging and performance issues +------------------------------------ + +When HAProxy is started with the "-d" option, it will stay in the foreground +and will print one line per event, such as an incoming connection, the end of a +connection, and for each request or response header line seen. This debug +output is emitted before the contents are processed, so they don't consider the +local modifications. The main use is to show the request and response without +having to run a network sniffer. The output is less readable when multiple +connections are handled in parallel, though the "debug2ansi" and "debug2html" +scripts found in the examples/ directory definitely help here by coloring the +output. + +If a request or response is rejected because HAProxy finds it is malformed, the +best thing to do is to connect to the CLI and issue "show errors", which will +report the last captured faulty request and response for each frontend and +backend, with all the necessary information to indicate precisely the first +character of the input stream that was rejected. This is sometimes needed to +prove to customers or to developers that a bug is present in their code. In +this case it is often possible to relax the checks (but still keep the +captures) using "option accept-invalid-http-request" or its equivalent for +responses coming from the server "option accept-invalid-http-response". Please +see the configuration manual for more details. + +Example : + + > show errors + Total events captured on [13/Oct/2015:13:43:47.169] : 1 + + [13/Oct/2015:13:43:40.918] frontend HAProxyLocalStats (#2): invalid request + backend <NONE> (#-1), server <NONE> (#-1), event #0 + src 127.0.0.1:51981, session #0, session flags 0x00000080 + HTTP msg state 26, msg flags 0x00000000, tx flags 0x00000000 + HTTP chunk len 0 bytes, HTTP body len 0 bytes + buffer flags 0x00808002, out 0 bytes, total 31 bytes + pending 31 bytes, wrapping at 8040, error at position 13: + + 00000 GET /invalid request HTTP/1.1\r\n + + +The output of "show info" on the CLI provides a number of useful information +regarding the maximum connection rate ever reached, maximum SSL key rate ever +reached, and in general all information which can help to explain temporary +issues regarding CPU or memory usage. Example : + + > show info + Name: HAProxy + Version: 1.6-dev7-e32d18-17 + Release_date: 2015/10/12 + Nbproc: 1 + Process_num: 1 + Pid: 7949 + Uptime: 0d 0h02m39s + Uptime_sec: 159 + Memmax_MB: 0 + Ulimit-n: 120032 + Maxsock: 120032 + Maxconn: 60000 + Hard_maxconn: 60000 + CurrConns: 0 + CumConns: 3 + CumReq: 3 + MaxSslConns: 0 + CurrSslConns: 0 + CumSslConns: 0 + Maxpipes: 0 + PipesUsed: 0 + PipesFree: 0 + ConnRate: 0 + ConnRateLimit: 0 + MaxConnRate: 1 + SessRate: 0 + SessRateLimit: 0 + MaxSessRate: 1 + SslRate: 0 + SslRateLimit: 0 + MaxSslRate: 0 + SslFrontendKeyRate: 0 + SslFrontendMaxKeyRate: 0 + SslFrontendSessionReuse_pct: 0 + SslBackendKeyRate: 0 + SslBackendMaxKeyRate: 0 + SslCacheLookups: 0 + SslCacheMisses: 0 + CompressBpsIn: 0 + CompressBpsOut: 0 + CompressBpsRateLim: 0 + ZlibMemUsage: 0 + MaxZlibMemUsage: 0 + Tasks: 5 + Run_queue: 1 + Idle_pct: 100 + node: wtap + description: + +When an issue seems to randomly appear on a new version of HAProxy (eg: every +second request is aborted, occasional crash, etc), it is worth trying to enable +memory poisoning so that each call to malloc() is immediately followed by the +filling of the memory area with a configurable byte. By default this byte is +0x50 (ASCII for 'P'), but any other byte can be used, including zero (which +will have the same effect as a calloc() and which may make issues disappear). +Memory poisoning is enabled on the command line using the "-dM" option. It +slightly hurts performance and is not recommended for use in production. If +an issue happens all the time with it or never happens when poisoning uses +byte zero, it clearly means you've found a bug and you definitely need to +report it. Otherwise if there's no clear change, the problem it is not related. + +When debugging some latency issues, it is important to use both strace and +tcpdump on the local machine, and another tcpdump on the remote system. The +reason for this is that there are delays everywhere in the processing chain and +it is important to know which one is causing latency to know where to act. In +practice, the local tcpdump will indicate when the input data come in. Strace +will indicate when haproxy receives these data (using recv/recvfrom). Warning, +openssl uses read()/write() syscalls instead of recv()/send(). Strace will also +show when haproxy sends the data, and tcpdump will show when the system sends +these data to the interface. Then the external tcpdump will show when the data +sent are really received (since the local one only shows when the packets are +queued). The benefit of sniffing on the local system is that strace and tcpdump +will use the same reference clock. Strace should be used with "-tts200" to get +complete timestamps and report large enough chunks of data to read them. +Tcpdump should be used with "-nvvttSs0" to report full packets, real sequence +numbers and complete timestamps. + +In practice, received data are almost always immediately received by haproxy +(unless the machine has a saturated CPU or these data are invalid and not +delivered). If these data are received but not sent, it generally is because +the output buffer is saturated (ie: recipient doesn't consume the data fast +enough). This can be confirmed by seeing that the polling doesn't notify of +the ability to write on the output file descriptor for some time (it's often +easier to spot in the strace output when the data finally leave and then roll +back to see when the write event was notified). It generally matches an ACK +received from the recipient, and detected by tcpdump. Once the data are sent, +they may spend some time in the system doing nothing. Here again, the TCP +congestion window may be limited and not allow these data to leave, waiting for +an ACK to open the window. If the traffic is idle and the data take 40 ms or +200 ms to leave, it's a different issue (which is not an issue), it's the fact +that the Nagle algorithm prevents empty packets from leaving immediately, in +hope that they will be merged with subsequent data. HAProxy automatically +disables Nagle in pure TCP mode and in tunnels. However it definitely remains +enabled when forwarding an HTTP body (and this contributes to the performance +improvement there by reducing the number of packets). Some HTTP non-compliant +applications may be sensitive to the latency when delivering incomplete HTTP +response messages. In this case you will have to enable "option http-no-delay" +to disable Nagle in order to work around their design, keeping in mind that any +other proxy in the chain may similarly be impacted. If tcpdump reports that data +leave immediately but the other end doesn't see them quickly, it can mean there +is a congested WAN link, a congested LAN with flow control enabled and +preventing the data from leaving, or more commonly that HAProxy is in fact +running in a virtual machine and that for whatever reason the hypervisor has +decided that the data didn't need to be sent immediately. In virtualized +environments, latency issues are almost always caused by the virtualization +layer, so in order to save time, it's worth first comparing tcpdump in the VM +and on the external components. Any difference has to be credited to the +hypervisor and its accompanying drivers. + +When some TCP SACK segments are seen in tcpdump traces (using -vv), it always +means that the side sending them has got the proof of a lost packet. While not +seeing them doesn't mean there are no losses, seeing them definitely means the +network is lossy. Losses are normal on a network, but at a rate where SACKs are +not noticeable at the naked eye. If they appear a lot in the traces, it is +worth investigating exactly what happens and where the packets are lost. HTTP +doesn't cope well with TCP losses, which introduce huge latencies. + +The "netstat -i" command will report statistics per interface. An interface +where the Rx-Ovr counter grows indicates that the system doesn't have enough +resources to receive all incoming packets and that they're lost before being +processed by the network driver. Rx-Drp indicates that some received packets +were lost in the network stack because the application doesn't process them +fast enough. This can happen during some attacks as well. Tx-Drp means that +the output queues were full and packets had to be dropped. When using TCP it +should be very rare, but will possibly indicate a saturated outgoing link. + + +13. Security considerations +--------------------------- + +HAProxy is designed to run with very limited privileges. The standard way to +use it is to isolate it into a chroot jail and to drop its privileges to a +non-root user without any permissions inside this jail so that if any future +vulnerability were to be discovered, its compromise would not affect the rest +of the system. + +In order to perform a chroot, it first needs to be started as a root user. It is +pointless to build hand-made chroots to start the process there, these ones are +painful to build, are never properly maintained and always contain way more +bugs than the main file-system. And in case of compromise, the intruder can use +the purposely built file-system. Unfortunately many administrators confuse +"start as root" and "run as root", resulting in the uid change to be done prior +to starting haproxy, and reducing the effective security restrictions. + +HAProxy will need to be started as root in order to : + - adjust the file descriptor limits + - bind to privileged port numbers + - bind to a specific network interface + - transparently listen to a foreign address + - isolate itself inside the chroot jail + - drop to another non-privileged UID + +HAProxy may require to be run as root in order to : + - bind to an interface for outgoing connections + - bind to privileged source ports for outgoing connections + - transparently bind to a foreign address for outgoing connections + +Most users will never need the "run as root" case. But the "start as root" +covers most usages. + +A safe configuration will have : + + - a chroot statement pointing to an empty location without any access + permissions. This can be prepared this way on the UNIX command line : + + # mkdir /var/empty && chmod 0 /var/empty || echo "Failed" + + and referenced like this in the HAProxy configuration's global section : + + chroot /var/empty + + - both a uid/user and gid/group statements in the global section : + + user haproxy + group haproxy + + - a stats socket whose mode, uid and gid are set to match the user and/or + group allowed to access the CLI so that nobody may access it : + + stats socket /var/run/haproxy.stat uid hatop gid hatop mode 600 + diff --git a/doc/netscaler-client-ip-insertion-protocol.txt b/doc/netscaler-client-ip-insertion-protocol.txt new file mode 100644 index 0000000..dc64327 --- /dev/null +++ b/doc/netscaler-client-ip-insertion-protocol.txt @@ -0,0 +1,55 @@ +When NetScaler application switch is used as L3+ switch, information +regarding the original IP and TCP headers are lost as a new TCP +connection is created between the NetScaler and the backend server. + +NetScaler provides a feature to insert in the TCP data the original data +that can then be consumed by the backend server. + +Specifications and documentations from NetScaler: + https://support.citrix.com/article/CTX205670 + https://www.citrix.com/blogs/2016/04/25/how-to-enable-client-ip-in-tcpip-option-of-netscaler/ + +When CIP is enabled on the NetScaler, then a TCP packet is inserted just after +the TCP handshake. Two versions of the CIP extension exist. + +Legacy (NetScaler < 10.5) + + - CIP magic number : 4 bytes + Both sender and receiver have to agree on a magic number so that + they both handle the incoming data as a NetScaler Client IP insertion + packet. + + - Header length : 4 bytes + Defines the length on the remaining data. + + - IP header : >= 20 bytes if IPv4, 40 bytes if IPv6 + Contains the header of the last IP packet sent by the client during TCP + handshake. + + - TCP header : >= 20 bytes + Contains the header of the last TCP packet sent by the client during TCP + handshake. + +Standard (NetScaler >= 10.5) + + - CIP magic number : 4 bytes + Both sender and receiver have to agree on a magic number so that + they both handle the incoming data as a NetScaler Client IP insertion + packet. + + - CIP length : 4 bytes + Defines the total length on the CIP header. + + - CIP type: 2 bytes + Always set to 1. + + - Header length : 2 bytes + Defines the length on the remaining data. + + - IP header : >= 20 bytes if IPv4, 40 bytes if IPv6 + Contains the header of the last IP packet sent by the client during TCP + handshake. + + - TCP header : >= 20 bytes + Contains the header of the last TCP packet sent by the client during TCP + handshake. diff --git a/doc/network-namespaces.txt b/doc/network-namespaces.txt new file mode 100644 index 0000000..9448f43 --- /dev/null +++ b/doc/network-namespaces.txt @@ -0,0 +1,106 @@ +Linux network namespace support for HAProxy +=========================================== + +HAProxy supports proxying between Linux network namespaces. This +feature can be used, for example, in a multi-tenant networking +environment to proxy between different networks. HAProxy can also act +as a front-end proxy for non namespace-aware services. + +The proxy protocol has been extended to support transferring the +namespace information, so the originating namespace information can be +kept. This is useful when chaining multiple proxies and services. + +To enable Linux namespace support, compile HAProxy with the `USE_NS=1` +make option. + + +## Setting up namespaces on Linux + +To create network namespaces, use the 'ip netns' command. See the +manual page ip-netns(8) for details. + +Make sure that the file descriptors representing the network namespace +are located under `/var/run/netns`. + +For example, you can create a network namespace and assign one of the +networking interfaces to the new namespace: + +``` +$ ip netns add netns1 +$ ip link set eth7 netns netns1 +``` + + +## Listing namespaces in the configuration file + +HAProxy uses namespaces explicitly listed in its configuration file. +If you are not using namespace information received through the proxy +protocol, this usually means that you must specify namespaces for +listeners and servers in the configuration file with the 'namespace' +keyword. + +However, if you're using the namespace information received through +the proxy protocol to determine the namespace of servers (see +'namespace * below'), you have to explicitly list all allowed +namespaces in the namespace_list section of your configuration file: + +``` +namespace_list + namespace netns1 + namespace netns2 +``` + + +## Namespace information flow + +The haproxy process always runs in the namespace it was started on. +This is the default namespace. + +The bind addresses of listeners can have their namespace specified in +the configuration file. Unless specified, sockets associated with +listener bind addresses are created in the default namespace. For +example, this creates a listener in the netns2 namespace: + +``` +frontend f_example + bind 192.168.1.1:80 namespace netns2 + default_backend http +``` + +Each client connection is associated with its source namespace. By +default, this is the namespace of the bind socket it arrived on, but +can be overridden by information received through the proxy protocol. +Proxy protocol v2 supports transferring namespace information, so if +it is enabled for the listener, it can override the associated +namespace of the connection. + +Servers can have their namespaces specified in the configuration file +with the 'namespace' keyword: + +``` +backend b_example + server s1 192.168.1.100:80 namespace netns2 +``` + +If no namespace is set for a server, it is assumed that it is in the +default namespace. When specified, outbound sockets to the server are +created in the network namespace configured. To create the outbound +(server) connection in the namespace associated with the client, use +the '*' namespace. This is especially useful when using the +destination address and namespace received from the proxy protocol. + +``` +frontend f_example + bind 192.168.1.1:9990 accept-proxy + default_backend b_example + +backend b_example + mode tcp + source 0.0.0.0 usesrc clientip + server snodes * namespace * +``` + +If HAProxy is configured to send proxy protocol v2 headers to the +server, the outgoing header will always contain the namespace +associated with the client connection, not the namespace configured +for the server. diff --git a/doc/peers-v2.0.txt b/doc/peers-v2.0.txt new file mode 100644 index 0000000..5f3aa11 --- /dev/null +++ b/doc/peers-v2.0.txt @@ -0,0 +1,288 @@ + HAProxy's peers v2.0 protocol 08/18/2016 + +Author: Emeric Brun ebrun@haproxy.com + + +I) Encoded Integer and Bitfield. + + + 0 <= X < 240 : 1 byte (7.875 bits) [ XXXX XXXX ] + 240 <= X < 2288 : 2 bytes (11 bits) [ 1111 XXXX ] [ 0XXX XXXX ] + 2288 <= X < 264432 : 3 bytes (18 bits) [ 1111 XXXX ] [ 1XXX XXXX ] [ 0XXX XXXX ] + 264432 <= X < 33818864 : 4 bytes (25 bits) [ 1111 XXXX ] [ 1XXX XXXX ]*2 [ 0XXX XXXX ] + 33818864 <= X < 4328786160 : 5 bytes (32 bits) [ 1111 XXXX ] [ 1XXX XXXX ]*3 [ 0XXX XXXX ] + + + + +I) Handshake + +Each peer try to connect to each others, and each peer is listening +for a connect from other peers. + + +Client Server + Hello Message + ------------------------> + + Status Message + <------------------------ + +1) Hello Message + +Hello message is composed of 3 lines: + +<protocol> <version> +<remotepeerid> +<localpeerid> <processpid> <relativepid> + +protocol: current value is "HAProxyS" +version: current value is "2.0" +remotepeerid: is the name of the target peer as defined in the configuration peers section. +localpeerid: is the name of the local peer as defined on cmdline or using hostname. +processid: is the system process id of the local process. +relativepid: is the haproxy's relative pid (0 if nbproc == 1) + +2) Status Message + +Status message is a code followed by a LF. + +200: Handshake succeeded +300: Try again later +501: Protocol error +502: Bad version +503: Local peer name mismatch +504: Remote peer name mismatch + + +IV) Messages + +Messages: + +0 - - - - - - - 8 - - - - - - - 16 + Message Class| Message Type + +if Message Type >= 128 + +0 - - - - - - - 8 - - - - - - - 16 ..... + Message Class| Message Type | encoded data length | data + +Message Classes: +0: control +1: error +10: related to stick table updates +255: reserved + + +1) Control Messages Class + +Available message Types for class control: +0: resync request +1: resync finished +2: resync partial +3: resync confirm + + +a) Resync Request Message + +This message is used to request a full resync from a peer + +b) Resync Finished Message + +This message is used to signal remote peer that locally known updates have been pushed, and local peer was considered up to date. + +c) Resync Partial Message + +This message is used to signal remote peer that locally known updates have been pushed, and but the local peer is not considered up to date. + +d) Resync Confirm Message + +This message is an ack for Resync Partial or Finished Messages. + +It's allow the remote peer to go back to "on the fly" update process. + + +2) Messages Class + +Available message Types for this class are: +0: protocol error +1: size limit reached + +a) Protocol Message + +To signal that a protocol error occurred. Connection will be shutdown just after sending this message. + +b) Size Limit Error Message + +To signal that a message is outsized and can not be correctly handled. Connection will be broken. + + + +3) Stick Table Updates Messages Class + +Available message Types for this class are: +0: Entry update +1: Incremental entry update +2: table definition +3: table switch +4: updates ack message. + + +a) Update Message + +0 - - - - - - - 8 - - - - - - - 16 ..... + Message class | Message Type | encoded data length | data + + +data is composed like this + +0 - - - - - - - 32 ............................. +Local Update ID | Key value | data values .... + +Update ID in a 32bits identifier of the local update. + +Key value format depends of the table key type: + +- for keytype string + +0 ................................. +encoded string length | string value + +- for keytype integer +0 - - - - - - - - - - 32 +encoded integer value | + +- for other key type + +The value length is annonced in table definition message +0 .................... +value + + +b) Incremental Update Message + +Same format than update message except the Update ID is not present, the receiver should +consider that the update ID is an increment of 1 of the previous considered update message (partial or not) + + +c) Table Definition Message + +This message is used by the receiver to identify the stick table concerned by next update messages and +to know which data is pushed in these updates. + +0 - - - - - - - 8 - - - - - - - 16 ..... + Message class | Message Type | encoded data length | data + + +data is composed like this + +0 ................................................................... +Encoded Sender Table Id | Encoded Local Table Name Length | Table Name | Encoded Table Type | Encoded Table Keylen | Encoded Table Data Types Bitfield + + +Encoded Sender Table ID present a the id numerical ID affected to that table by the sender +It will be used by "Updates Aknowlegement Messages" and "Table Switch Messages". + +Encoded Local Table Name Length present the length to read the table name. + +"Table Name" is the shared identifier of the table (name of the current table in the configuration) +It permits the receiver to identify the concerned table. The receiver should keep in memory the matching +between the "Sender Table ID" to identify it directly in case of "Table Switch Message". + +Table Type present the numeric type of key used to store stick table entries: +integer + 2: signed integer + 4: IPv4 address + 5: IPv6 address + 6: string + 7: binary + +Table Keylen present the key length or max length in case of strings or binary (padded with 0). + +Data Types Bitfield present the types of data linearly pushed in next updates message (they will be linearly pushed in the update message) +Known types are +bit + 0: server id + 1: gpt0 + 2: gpc0 + 3: gpc0 rate + 4: connections counter + 5: connection rate + 6: number of current connections + 7: sessions counter + 8: session rate + 9: http requests counter + 10: http requests rate + 11: errors counter + 12: errors rate + 13: bytes in counter + 14: bytes in rate + 15: bytes out rate + 16: bytes out rate + 17: gpc1 + 18: gpc1 rate + +d) Table Switch Message + +After a Table Message Define, this message can be used by the receiver to identify the stick table concerned by next update messages. + +0 - - - - - - - 8 - - - - - - - 16 ..... + Message class | Message Type | encoded data length | data + + +data is composed like this + + +0 ..................... +encoded Sender Table Id + +c) Update Ack Message + +0 - - - - - - - 8 - - - - - - - 16 ..... + Message class | Message Type | encoded data length | data + +data is composed like this + +0 ....................... - - - - - - - - 32 +Encoded Remote Table Id | Update Id + + +Remote Table Id is the numeric identifier of the table on the remote side. +Update Id is the id of the last update locally committed. + +If a re-connection occurred, the sender should know they will have to restart the push of updates from this point. + +III) Initial full resync process. + + +a) Resync from local old process + +An old soft-stopped process will close all established sessions with remote peers and will try to connect to a new +local process to push all known ending with a Resync Finished Message or a Resync Partial Message (if it it does not consider itself as full updated). + +A new process will wait for a an incoming connection from a local process during 5 seconds. It will learn the updates from this +process until it receives a Resync Finished Message or a Resync Partial Message. If it receive a Resync Finished Message it will consider itself +as fully updated and stops to ask for resync. If it receive a Resync Partial Message it will wait once again for 5 seconds for an other incoming connection from a local process. +Same thing if the session was broken before receiving any "Resync Partial Message" or "Resync Finished Message". + +If one of these 5 seconds timeout expire, the process will try to request resync from a remote connected peer (see b). The process will wait until 5seconds +if no available remote peers are found. + +If the timeout expire, the process will consider itself ass fully updated + +b) Resync from remote peers + +The process will randomly choose a remote connected peer and ask for a full resync using a Resync Request Message. The process will wait until 5seconds +if no available remote peers are found. + +The chosen remote peer will push its all known data ending with a Resync Finished Message or a Resync Partial Message (if it it does not consider itself as full updated). + +If it receives a Resync Finished Message it will consider itself as fully updated and stops to ask for resync. + +If it receives a Resync Partial Message, the current peer will be flagged to anymore be requested and any other connected peer will be randomly chosen for a resync request (5s). + +If the session is broken before receiving any of these messages any other connected peer will be randomly chosen for a resync request (5s). + +If the timeout expire, the process will consider itself as fully updated + + diff --git a/doc/peers.txt b/doc/peers.txt new file mode 100644 index 0000000..a5da40f --- /dev/null +++ b/doc/peers.txt @@ -0,0 +1,491 @@ + +--------------------+ + | Peers protocol 2.1 | + +--------------------+ + + + Peers protocol has been implemented over TCP. Its aim is to transmit + stick-table entries information between several haproxy processes. + + This protocol is symmetrical. This means that at any time, each peer + may connect to other peers they have been configured for, to send + their last stick-table updates. There is no role of client or server in this + protocol. As peers may connect to each others at the same time, the protocol + ensures that only one peer session may stay opened between a couple of peers + before they start sending their stick-table information, possibly in both + directions (or not). + + + Handshake + +++++++++ + + Just after having connected to another one, a peer must identify itself + and identify the remote peer, sending a "hello" message. The remote peer + replies with a "status" message. + + A "hello" message is made of three lines terminated by a line feed character + as follows: + + <protocol identifier> <version>\n + <remote peer identifier>\n + <local peer identifier> <process ID> <relative process ID>\n + + protocol identifier : HAProxyS + version : 2.1 + remote peer identifier: the peer name this "hello" message is sent to. + local peer identifier : the name of the peer which sends this "hello" message. + process ID : the ID of the process handling this peer session. + relative process ID : the haproxy's relative process ID (0 if nbproc == 1). + + The "status" message is made of a unique line terminated by a line feed + character as follows: + + <status code>\n + + with these values as status code (a three-digit number): + + +-------------+---------------------------------+ + | status code | signification | + +-------------+---------------------------------+ + | 200 | Handshake succeeded | + +-------------+---------------------------------+ + | 300 | Try again later | + +-------------+---------------------------------+ + | 501 | Protocol error | + +-------------+---------------------------------+ + | 502 | Bad version | + +-------------+---------------------------------+ + | 503 | Local peer identifier mismatch | + +-------------+---------------------------------+ + | 504 | Remote peer identifier mismatch | + +-------------+---------------------------------+ + + As the protocol is symmetrical, some peers may connect to each other at the + same time. For efficiency reasons, the protocol ensures there may be only + one TCP session opened after the handshake succeeded and before transmitting + any stick-table data information. In fact, for each couple of peers, this is + the last connected peer which wins. Each time a peer A receives a "hello" + message from a peer B, peer A checks if it already managed to open a peer + session with peer B, so with a successful handshake. If it is the case, + peer A closes its peer session. So, this is the peer session opened by B + which stays opened. + + + Peer A Peer B + hello + ----------------------> + status 200 + <---------------------- + hello + <++++++++++++++++++++++ + TCP/FIN-ACK + ----------------------> + TCP/FIN-ACK + <---------------------- + status 200 + ++++++++++++++++++++++> + data + <++++++++++++++++++++++ + data + ++++++++++++++++++++++> + data + ++++++++++++++++++++++> + data + <++++++++++++++++++++++ + . + . + . + + As it is still possible that a couple of peers decide to close both their + peer sessions at the same time, the protocol ensures peers will not reconnect + at the same time, adding a random delay (50 up to 2050 ms) before any + reconnection. + + + Encoding + ++++++++ + + As some TCP data may be corrupted, for integrity reason, some data fields + are encoded at peer session level. + + The following algorithms explain how to encode/decode the data. + + encode: + input : val (64bits integer) + output: bitf (variable-length bitfield) + + if val has no bit set above bit 4 (or if val is less than 0xf0) + set the next byte of bitf to the value of val + return bitf + + set the next byte of bitf to the value of val OR'ed with 0xf0 + subtract 0xf0 from val + right shift val by 4 + + while val bit 7 is set (or if val is greater or equal to 0x80): + set the next byte of bitf to the value of the byte made of the last + 7 bits of val OR'ed with 0x80 + subtract 0x80 from val + right shift val by 7 + + set the next byte of bitf to the value of val + return bitf + + decode: + input : bitf (variable-length bitfield) + output: val (64bits integer) + + set val to the value of the first byte of bitf + if bit 4 up to 7 of val are not set + return val + + set loop to 0 + do + add to val the value of the next byte of bitf left shifted by (4 + 7*loop) + set loop to (loop + 1) + while the bit 7 of the next byte of bitf is set + return val + + Example: + + let's say that we must encode 0x1234. + + "set the next byte of bitf to the value of val OR'ed with 0xf0" + => bitf[0] = (0x1234 | 0xf0) & 0xff = 0xf4 + + "subtract 0xf0 from val" + => val = 0x1144 + + right shift val by 4 + => val = 0x114 + + "set the next byte of bitf to the value of the byte made of the last + 7 bits of val OR'ed with 0x80" + => bitf[1] = (0x114 | 0x80) & 0xff = 0x94 + + "subtract 0x80 from val" + => val= 0x94 + + "right shift val by 7" + => val = 0x1 + + => bitf[2] = 0x1 + + So, the encoded value of 0x1234 is 0xf49401. + + To decode this value: + + "set val to the value of the first byte of bitf" + => val = 0xf4 + + "add to val the value of the next byte of bitf left shifted by 4" + => val = 0xf4 + (0x94 << 4) = 0xf4 + 0x940 = 0xa34 + + "add to val the value of the next byte of bitf left shifted by (4 + 7)" + => val = 0xa34 + (0x01 << 11) = 0xa34 + 0x800 = 0x1234 + + + Messages + ++++++++ + + *** General *** + + After the handshake has successfully completed, peers are authorized to send + some messages to each others, possibly in both direction. + + All the messages are made at least of a two bytes length header. + + The first byte of this header identifies the class of the message. The next + byte identifies the type of message in the class. + + Some of these messages are variable-length. Others have a fixed size. + Variable-length messages are identified by the value of the message type + byte. For such messages, it is greater than or equal to 128. + + All variable-length message headers must be followed by the encoded length + of the remaining bytes (so the encoded length of the message minus 2 bytes + for the header and minus the length of the encoded length). + + There exist four classes of messages: + + +------------+---------------------+--------------+ + | class byte | signification | message size | + +------------+---------------------+--------------+ + | 0 | control | fixed (2) | + +------------+---------------------+--------------| + | 1 | error | fixed (2) | + +------------+---------------------+--------------| + | 10 | stick-table updates | variable | + +------------+---------------------+--------------| + | 255 | reserved | | + +------------+---------------------+--------------+ + + At this time of this writing, only control and error messages have a fixed + size of two bytes (header only). The stick-table updates messages are all + variable-length (their message type bytes are greater than 128). + + + *** Control message class *** + + At this time of writing, control messages are fixed-length messages used + only to control the synchronizations between local and/or remote processes + and to emit heartbeat messages. + + There exists five types of such control messages: + + +------------+--------------------------------------------------------+ + | type byte | signification | + +------------+--------------------------------------------------------+ + | 0 | synchronisation request: ask a remote peer for a full | + | | synchronization | + +------------+--------------------------------------------------------+ + | 1 | synchronization finished: signal a remote peer that | + | | local updates have been pushed and local is considered | + | | up to date. | + +------------+--------------------------------------------------------+ + | 2 | synchronization partial: signal a remote peer that | + | | local updates have been pushed and local is not | + | | considered up to date. | + +------------+--------------------------------------------------------+ + | 3 | synchronization confirmed: acknowledge a finished or | + | | partial synchronization message. | + +------------+--------------------------------------------------------+ + | 4 | Heartbeat message. | + +------------+--------------------------------------------------------+ + + About hearbeat messages: a peer sends heartbeat messages to peers it is + connected to after periods of 3s of inactivity (i.e. when there is no + stick-table to synchronize for 3s). After a successful peer protocol + handshake between two peers, if one of them does not send any other peer + protocol messages (i.e. no heartbeat and no stick-table update messages) + during a 5s period, it is considered as no more alive by its remote peer + which closes the session and then tries to reconnect to the peer which + has just disappeared. + + *** Error message class *** + + There exits two types of such error messages: + + +-----------+------------------+ + | type byte | signification | + +-----------+------------------+ + | 0 | protocol error | + +-----------+------------------+ + | 1 | size limit error | + +-----------+------------------+ + + + *** Stick-table update message class *** + + This class is the more important one because it is in relation with the + stick-table entries handling between peers which is at the core of peers + protocol. + + All the messages of this class are variable-length. Their type bytes are + all greater than or equal to 128. + + There exits five types of such stick-table update messages: + + +-----------+--------------------------------+ + | type byte | signification | + +-----------+--------------------------------+ + | 128 | Entry update | + +-----------+--------------------------------+ + | 129 | Incremental entry update | + +-----------+--------------------------------+ + | 130 | Stick-table definition | + +-----------+--------------------------------+ + | 131 | Stick-table switch (unused) | + +-----------+--------------------------------+ + | 133 | Update message acknowledgement | + +-----------+--------------------------------+ + + Note that entry update messages may be multiplexed. This means that different + entry update messages for different stick-tables may be sent over the same + peer session. + + To do so, each time entry update messages have to sent, they must be preceded + by a stick-table definition message. This remains true for incremental entry + update messages. + + As its name indicate, "Update message acknowledgement" messages are used to + acknowledge the entry update messages. + + In this following paragraph, we give some information about the format of + each stick-table update messages. This very simple following legend will + contribute in understanding it. The unit used is the octet. + + XX + +-----------+ + | foo | Unique fixed sized "foo" field, made of XX octets. + +-----------+ + + +===========+ + | foo | Variable-length "foo" field. + +===========+ + + +xxxxxxxxxxx+ + | foo | Encoded variable-length "foo" field. + +xxxxxxxxxxx+ + + +###########+ + | foo | hereunder described "foo" field. + +###########+ + + + With this legend, all the stick-table update messages have such a header: + + 1 1 + +--------------------+------------------------+xxxxxxxxxxxxxxxx+ + | Message Class (10) | Message type (128-133) | Message length | + +--------------------+------------------------+xxxxxxxxxxxxxxxx+ + + Note that to help in making communicate different versions of peers protocol, + such stick-table update messages may be extended adding non mandatory + fields at the end of such messages, announcing a total message length + which is greater than the message length of the previous versions of + peers protocol. After having parsed such messages, the remaining ones + will be skipped to parse the next message. + + - Definition message format: + + Before sending entry update messages, a peer must announce the configuration + of the stick-table in relation with these messages thanks to a + "Stick-table definition" message with such a following format: + + +xxxxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxxxxxxx+==================+ + | Stick-table ID | Stick-table name length | Stick-table name | + +xxxxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxxxxxxx+==================+ + + +xxxxxxxxxxxx+xxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxxxxx+xxxxxxxxx+ + | Key type | Key length | Data types bitfield | Expiry | + +xxxxxxxxxxxx+xxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxxxxx+xxxxxxxxx+ + + +xxxxxxxxxxxxxxxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx+ + | Frequency counter #1 | Frequency counter #1 period | + +xxxxxxxxxxxxxxxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx+ + + +xxxxxxxxxxxxxxxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx+ + | Frequency counter #2 | Frequency counter #2 period | + +xxxxxxxxxxxxxxxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx+ + . + . + . + + Note that "Stick-table ID" field is an encoded integer which is used to + identify the stick-table without using its name (or "Stick-table name" + field). It is local to the process handling the stick-table. So we can have + two peers attached to processes which generate stick-table updates for + the same stick-table (same name) but with different stick-table IDs. + + Also note that the list of "Frequency counter #X" and their associated + periods fields exists only if their underlying types are already defined + in "Data types bitfield" field. + + "Expiry" field and the remaining ones are not used by all the existing + version of haproxy peers. But they are MANDATORY, so that to make a + stick-table aggregator peer be able to autoconfigure itself. + + + - Entry update message format: + 4 + +-----------------+###########+############+ + | Local update ID | Key | Data | + +-----------------+###########+############+ + + with "Key" described as follows: + + +xxxxxxxxxxx+=======+ + | length | value | if key type is (non null terminated) "string", + +xxxxxxxxxxx+=======+ + + 4 + +-------+ + | value | if key type is "integer", + +-------+ + + +=======+ + | value | for other key types: the size is announced in + +=======+ the previous stick-table definition message. + + "Data" field is basically a list of encoded values for each type announced + by the "Data types bitfield" field of the previous "Stick-table definition" + message: + + +xxxxxxxxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxx+ +xxxxxxxxxxxxxxxxxxxx+ + | Data type #1 value | Data type #2 value | .... | Data type #n value | + +xxxxxxxxxxxxxxxxxxxx+xxxxxxxxxxxxxxxxxxxx+ +xxxxxxxxxxxxxxxxxxxx+ + + + Most of these fields are internally stored as uint32_t (see STD_T_SINT, + STD_T_UINT, STD_T_ULL C enumerations) or structures made of several uint32_t + (see STD_T_FRQP C enumeration). The remaining one STD_T_DICT is internally + used to store entries of LRU caches for others literal dictionary entries + (couples of IDs associated to strings). It is used to transmit these cache + entries as follows: + + +xxxxxxxxxxx+xxxx+xxxxxxxxxxxxxxx+========+ + | length | ID | string length | string | + +xxxxxxxxxxx+xxxx+xxxxxxxxxxxxxxx+========+ + + "length" is the length in bytes of the remaining data after this "length" field. + "string length" is the length of "string" field which follows. + + Here the cache is used so that not to have to send again and again an already + sent string. Indeed, the second time we have to send the same dictionary entry, + if still cached, a peer sends only its ID: + + +xxxxxxxxxxx+xxxx+ + | length | ID | + +xxxxxxxxxxx+xxxx+ + + - Update message acknowledgement format: + + These messages are responses to "Entry update" messages only. + + Its format is very basic for efficiency reasons: + + 4 + +xxxxxxxxxxxxxxxx+-----------+ + | Stick-table ID | Update ID | + +xxxxxxxxxxxxxxxx+-----------+ + + + Note that the "Stick-table ID" field value is in relation with the one which + has been previously announce by a "Stick-table definition" message. + + The following schema may help in understanding how to handle a stream of + stick-table update messages. The handshake step is not represented. + Stick-table IDs are preceded by a '#' character. + + + Peer A Peer B + + stkt def. #1 + ----------------------> + updates (1-5) + ----------------------> + stkt def. #3 + ----------------------> + updates (1000-1005) + ----------------------> + + stkt def. #2 + <---------------------- + updates (10-15) + <---------------------- + ack 5 for #1 + <---------------------- + ack 1005 for #3 + <---------------------- + stkt def. #4 + <---------------------- + updates (100-105) + <---------------------- + + ack 10 for #2 + ----------------------> + ack 105 for #4 + ----------------------> + (from here, on both sides, all stick-table updates + are considered as received) + diff --git a/doc/proxy-protocol.txt b/doc/proxy-protocol.txt new file mode 100644 index 0000000..fac0331 --- /dev/null +++ b/doc/proxy-protocol.txt @@ -0,0 +1,1051 @@ +2020/03/05 Willy Tarreau + HAProxy Technologies + The PROXY protocol + Versions 1 & 2 + +Abstract + + The PROXY protocol provides a convenient way to safely transport connection + information such as a client's address across multiple layers of NAT or TCP + proxies. It is designed to require little changes to existing components and + to limit the performance impact caused by the processing of the transported + information. + + +Revision history + + 2010/10/29 - first version + 2011/03/20 - update: implementation and security considerations + 2012/06/21 - add support for binary format + 2012/11/19 - final review and fixes + 2014/05/18 - modify and extend PROXY protocol version 2 + 2014/06/11 - fix example code to consider ver+cmd merge + 2014/06/14 - fix v2 header check in example code, and update Forwarded spec + 2014/07/12 - update list of implementations (add Squid) + 2015/05/02 - update list of implementations and format of the TLV add-ons + 2017/03/10 - added the checksum, noop and more SSL-related TLV types, + reserved TLV type ranges, added TLV documentation, clarified + string encoding. With contributions from Andriy Palamarchuk + (Amazon.com). + 2020/03/05 - added the unique ID TLV type (Tim DĂĽsterhus) + + +1. Background + +Relaying TCP connections through proxies generally involves a loss of the +original TCP connection parameters such as source and destination addresses, +ports, and so on. Some protocols make it a little bit easier to transfer such +information. For SMTP, Postfix authors have proposed the XCLIENT protocol [1] +which received broad adoption and is particularly suited to mail exchanges. +For HTTP, there is the "Forwarded" extension [2], which aims at replacing the +omnipresent "X-Forwarded-For" header which carries information about the +original source address, and the less common X-Original-To which carries +information about the destination address. + +However, both mechanisms require a knowledge of the underlying protocol to be +implemented in intermediaries. + +Then comes a new class of products which we'll call "dumb proxies", not because +they don't do anything, but because they're processing protocol-agnostic data. +Both Stunnel[3] and Stud[4] are examples of such "dumb proxies". They talk raw +TCP on one side, and raw SSL on the other one, and do that reliably, without +any knowledge of what protocol is transported on top of the connection. HAProxy +running in pure TCP mode obviously falls into that category as well. + +The problem with such a proxy when it is combined with another one such as +haproxy, is to adapt it to talk the higher level protocol. A patch is available +for Stunnel to make it capable of inserting an X-Forwarded-For header in the +first HTTP request of each incoming connection. HAProxy is able not to add +another one when the connection comes from Stunnel, so that it's possible to +hide it from the servers. + +The typical architecture becomes the following one : + + + +--------+ HTTP :80 +----------+ + | client | --------------------------------> | | + | | | haproxy, | + +--------+ +---------+ | 1 or 2 | + / / HTTPS | stunnel | HTTP :81 | listening| + <________/ ---------> | (server | ---------> | ports | + | mode) | | | + +---------+ +----------+ + + +The problem appears when haproxy runs with keep-alive on the side towards the +client. The Stunnel patch will only add the X-Forwarded-For header to the first +request of each connection and all subsequent requests will not have it. One +solution could be to improve the patch to make it support keep-alive and parse +all forwarded data, whether they're announced with a Content-Length or with a +Transfer-Encoding, taking care of special methods such as HEAD which announce +data without transferring them, etc... In fact, it would require implementing a +full HTTP stack in Stunnel. It would then become a lot more complex, a lot less +reliable and would not anymore be the "dumb proxy" that fits every purposes. + +In practice, we don't need to add a header for each request because we'll emit +the exact same information every time : the information related to the client +side connection. We could then cache that information in haproxy and use it for +every other request. But that becomes dangerous and is still limited to HTTP +only. + +Another approach consists in prepending each connection with a header reporting +the characteristics of the other side's connection. This method is simpler to +implement, does not require any protocol-specific knowledge on either side, and +completely fits the purpose since what is desired precisely is to know the +other side's connection endpoints. It is easy to perform for the sender (just +send a short header once the connection is established) and to parse for the +receiver (simply perform one read() on the incoming connection to fill in +addresses after an accept). The protocol used to carry connection information +across proxies was thus called the PROXY protocol. + + +2. The PROXY protocol header + +This document uses a few terms that are worth explaining here : + - "connection initiator" is the party requesting a new connection + - "connection target" is the party accepting a connection request + - "client" is the party for which a connection was requested + - "server" is the party to which the client desired to connect + - "proxy" is the party intercepting and relaying the connection + from the client to the server. + - "sender" is the party sending data over a connection. + - "receiver" is the party receiving data from the sender. + - "header" or "PROXY protocol header" is the block of connection information + the connection initiator prepends at the beginning of a connection, which + makes it the sender from the protocol point of view. + +The PROXY protocol's goal is to fill the server's internal structures with the +information collected by the proxy that the server would have been able to get +by itself if the client was connecting directly to the server instead of via a +proxy. The information carried by the protocol are the ones the server would +get using getsockname() and getpeername() : + - address family (AF_INET for IPv4, AF_INET6 for IPv6, AF_UNIX) + - socket protocol (SOCK_STREAM for TCP, SOCK_DGRAM for UDP) + - layer 3 source and destination addresses + - layer 4 source and destination ports if any + +Unlike the XCLIENT protocol, the PROXY protocol was designed with limited +extensibility in order to help the receiver parse it very fast. Version 1 was +focused on keeping it human-readable for better debugging possibilities, which +is always desirable for early adoption when few implementations exist. Version +2 adds support for a binary encoding of the header which is much more efficient +to produce and to parse, especially when dealing with IPv6 addresses that are +expensive to emit in ASCII form and to parse. + +In both cases, the protocol simply consists in an easily parsable header placed +by the connection initiator at the beginning of each connection. The protocol +is intentionally stateless in that it does not expect the sender to wait for +the receiver before sending the header, nor the receiver to send anything back. + +This specification supports two header formats, a human-readable format which +is the only format supported in version 1 of the protocol, and a binary format +which is only supported in version 2. Both formats were designed to ensure that +the header cannot be confused with common higher level protocols such as HTTP, +SSL/TLS, FTP or SMTP, and that both formats are easily distinguishable one from +each other for the receiver. + +Version 1 senders MAY only produce the human-readable header format. Version 2 +senders MAY only produce the binary header format. Version 1 receivers MUST at +least implement the human-readable header format. Version 2 receivers MUST at +least implement the binary header format, and it is recommended that they also +implement the human-readable header format for better interoperability and ease +of upgrade when facing version 1 senders. + +Both formats are designed to fit in the smallest TCP segment that any TCP/IP +host is required to support (576 - 40 = 536 bytes). This ensures that the whole +header will always be delivered at once when the socket buffers are still empty +at the beginning of a connection. The sender must always ensure that the header +is sent at once, so that the transport layer maintains atomicity along the path +to the receiver. The receiver may be tolerant to partial headers or may simply +drop the connection when receiving a partial header. Recommendation is to be +tolerant, but implementation constraints may not always easily permit this. It +is important to note that nothing forces any intermediary to forward the whole +header at once, because TCP is a streaming protocol which may be processed one +byte at a time if desired, causing the header to be fragmented when reaching +the receiver. But due to the places where such a protocol is used, the above +simplification generally is acceptable because the risk of crossing such a +device handling one byte at a time is close to zero. + +The receiver MUST NOT start processing the connection before it receives a +complete and valid PROXY protocol header. This is particularly important for +protocols where the receiver is expected to speak first (eg: SMTP, FTP or SSH). +The receiver may apply a short timeout and decide to abort the connection if +the protocol header is not seen within a few seconds (at least 3 seconds to +cover a TCP retransmit). + +The receiver MUST be configured to only receive the protocol described in this +specification and MUST not try to guess whether the protocol header is present +or not. This means that the protocol explicitly prevents port sharing between +public and private access. Otherwise it would open a major security breach by +allowing untrusted parties to spoof their connection addresses. The receiver +SHOULD ensure proper access filtering so that only trusted proxies are allowed +to use this protocol. + +Some proxies are smart enough to understand transported protocols and to reuse +idle server connections for multiple messages. This typically happens in HTTP +where requests from multiple clients may be sent over the same connection. Such +proxies MUST NOT implement this protocol on multiplexed connections because the +receiver would use the address advertised in the PROXY header as the address of +all forwarded requests's senders. In fact, such proxies are not dumb proxies, +and since they do have a complete understanding of the transported protocol, +they MUST use the facilities provided by this protocol to present the client's +address. + + +2.1. Human-readable header format (Version 1) + +This is the format specified in version 1 of the protocol. It consists in one +line of US-ASCII text matching exactly the following block, sent immediately +and at once upon the connection establishment and prepended before any data +flowing from the sender to the receiver : + + - a string identifying the protocol : "PROXY" ( \x50 \x52 \x4F \x58 \x59 ) + Seeing this string indicates that this is version 1 of the protocol. + + - exactly one space : " " ( \x20 ) + + - a string indicating the proxied INET protocol and family. As of version 1, + only "TCP4" ( \x54 \x43 \x50 \x34 ) for TCP over IPv4, and "TCP6" + ( \x54 \x43 \x50 \x36 ) for TCP over IPv6 are allowed. Other, unsupported, + or unknown protocols must be reported with the name "UNKNOWN" ( \x55 \x4E + \x4B \x4E \x4F \x57 \x4E ). For "UNKNOWN", the rest of the line before the + CRLF may be omitted by the sender, and the receiver must ignore anything + presented before the CRLF is found. Note that an earlier version of this + specification suggested to use this when sending health checks, but this + causes issues with servers that reject the "UNKNOWN" keyword. Thus is it + now recommended not to send "UNKNOWN" when the connection is expected to + be accepted, but only when it is not possible to correctly fill the PROXY + line. + + - exactly one space : " " ( \x20 ) + + - the layer 3 source address in its canonical format. IPv4 addresses must be + indicated as a series of exactly 4 integers in the range [0..255] inclusive + written in decimal representation separated by exactly one dot between each + other. Heading zeroes are not permitted in front of numbers in order to + avoid any possible confusion with octal numbers. IPv6 addresses must be + indicated as series of sets of 4 hexadecimal digits (upper or lower case) + delimited by colons between each other, with the acceptance of one double + colon sequence to replace the largest acceptable range of consecutive + zeroes. The total number of decoded bits must exactly be 128. The + advertised protocol family dictates what format to use. + + - exactly one space : " " ( \x20 ) + + - the layer 3 destination address in its canonical format. It is the same + format as the layer 3 source address and matches the same family. + + - exactly one space : " " ( \x20 ) + + - the TCP source port represented as a decimal integer in the range + [0..65535] inclusive. Heading zeroes are not permitted in front of numbers + in order to avoid any possible confusion with octal numbers. + + - exactly one space : " " ( \x20 ) + + - the TCP destination port represented as a decimal integer in the range + [0..65535] inclusive. Heading zeroes are not permitted in front of numbers + in order to avoid any possible confusion with octal numbers. + + - the CRLF sequence ( \x0D \x0A ) + + +The maximum line lengths the receiver must support including the CRLF are : + - TCP/IPv4 : + "PROXY TCP4 255.255.255.255 255.255.255.255 65535 65535\r\n" + => 5 + 1 + 4 + 1 + 15 + 1 + 15 + 1 + 5 + 1 + 5 + 2 = 56 chars + + - TCP/IPv6 : + "PROXY TCP6 ffff:f...f:ffff ffff:f...f:ffff 65535 65535\r\n" + => 5 + 1 + 4 + 1 + 39 + 1 + 39 + 1 + 5 + 1 + 5 + 2 = 104 chars + + - unknown connection (short form) : + "PROXY UNKNOWN\r\n" + => 5 + 1 + 7 + 2 = 15 chars + + - worst case (optional fields set to 0xff) : + "PROXY UNKNOWN ffff:f...f:ffff ffff:f...f:ffff 65535 65535\r\n" + => 5 + 1 + 7 + 1 + 39 + 1 + 39 + 1 + 5 + 1 + 5 + 2 = 107 chars + +So a 108-byte buffer is always enough to store all the line and a trailing zero +for string processing. + +The receiver must wait for the CRLF sequence before starting to decode the +addresses in order to ensure they are complete and properly parsed. If the CRLF +sequence is not found in the first 107 characters, the receiver should declare +the line invalid. A receiver may reject an incomplete line which does not +contain the CRLF sequence in the first atomic read operation. The receiver must +not tolerate a single CR or LF character to end the line when a complete CRLF +sequence is expected. + +Any sequence which does not exactly match the protocol must be discarded and +cause the receiver to abort the connection. It is recommended to abort the +connection as soon as possible so that the sender gets a chance to notice the +anomaly and log it. + +If the announced transport protocol is "UNKNOWN", then the receiver knows that +the sender speaks the correct PROXY protocol with the appropriate version, and +SHOULD accept the connection and use the real connection's parameters as if +there were no PROXY protocol header on the wire. However, senders SHOULD not +use the "UNKNOWN" protocol when they are the initiators of outgoing connections +because some receivers may reject them. When a load balancing proxy has to send +health checks to a server, it SHOULD build a valid PROXY line which it will +fill with a getsockname()/getpeername() pair indicating the addresses used. It +is important to understand that doing so is not appropriate when some source +address translation is performed between the sender and the receiver. + +An example of such a line before an HTTP request would look like this (CR +marked as "\r" and LF marked as "\n") : + + PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\n + GET / HTTP/1.1\r\n + Host: 192.168.0.11\r\n + \r\n + +For the sender, the header line is easy to put into the output buffers once the +connection is established. Note that since the line is always shorter than an +MSS, the sender is guaranteed to always be able to emit it at once and should +not even bother handling partial sends. For the receiver, once the header is +parsed, it is easy to skip it from the input buffers. Please consult section 9 +for implementation suggestions. + + +2.2. Binary header format (version 2) + +Producing human-readable IPv6 addresses and parsing them is very inefficient, +due to the multiple possible representation formats and the handling of compact +address format. It was also not possible to specify address families outside +IPv4/IPv6 nor non-TCP protocols. Another drawback of the human-readable format +is the fact that implementations need to parse all characters to find the +trailing CRLF, which makes it harder to read only the exact bytes count. Last, +the UNKNOWN address type has not always been accepted by servers as a valid +protocol because of its imprecise meaning. + +Version 2 of the protocol thus introduces a new binary format which remains +distinguishable from version 1 and from other commonly used protocols. It was +specially designed in order to be incompatible with a wide range of protocols +and to be rejected by a number of common implementations of these protocols +when unexpectedly presented (please see section 7). Also for better processing +efficiency, IPv4 and IPv6 addresses are respectively aligned on 4 and 16 bytes +boundaries. + +The binary header format starts with a constant 12 bytes block containing the +protocol signature : + + \x0D \x0A \x0D \x0A \x00 \x0D \x0A \x51 \x55 \x49 \x54 \x0A + +Note that this block contains a null byte at the 5th position, so it must not +be handled as a null-terminated string. + +The next byte (the 13th one) is the protocol version and command. + +The highest four bits contains the version. As of this specification, it must +always be sent as \x2 and the receiver must only accept this value. + +The lowest four bits represents the command : + - \x0 : LOCAL : the connection was established on purpose by the proxy + without being relayed. The connection endpoints are the sender and the + receiver. Such connections exist when the proxy sends health-checks to the + server. The receiver must accept this connection as valid and must use the + real connection endpoints and discard the protocol block including the + family which is ignored. + + - \x1 : PROXY : the connection was established on behalf of another node, + and reflects the original connection endpoints. The receiver must then use + the information provided in the protocol block to get original the address. + + - other values are unassigned and must not be emitted by senders. Receivers + must drop connections presenting unexpected values here. + +The 14th byte contains the transport protocol and address family. The highest 4 +bits contain the address family, the lowest 4 bits contain the protocol. + +The address family maps to the original socket family without necessarily +matching the values internally used by the system. It may be one of : + + - 0x0 : AF_UNSPEC : the connection is forwarded for an unknown, unspecified + or unsupported protocol. The sender should use this family when sending + LOCAL commands or when dealing with unsupported protocol families. The + receiver is free to accept the connection anyway and use the real endpoint + addresses or to reject it. The receiver should ignore address information. + + - 0x1 : AF_INET : the forwarded connection uses the AF_INET address family + (IPv4). The addresses are exactly 4 bytes each in network byte order, + followed by transport protocol information (typically ports). + + - 0x2 : AF_INET6 : the forwarded connection uses the AF_INET6 address family + (IPv6). The addresses are exactly 16 bytes each in network byte order, + followed by transport protocol information (typically ports). + + - 0x3 : AF_UNIX : the forwarded connection uses the AF_UNIX address family + (UNIX). The addresses are exactly 108 bytes each. + + - other values are unspecified and must not be emitted in version 2 of this + protocol and must be rejected as invalid by receivers. + +The transport protocol is specified in the lowest 4 bits of the 14th byte : + + - 0x0 : UNSPEC : the connection is forwarded for an unknown, unspecified + or unsupported protocol. The sender should use this family when sending + LOCAL commands or when dealing with unsupported protocol families. The + receiver is free to accept the connection anyway and use the real endpoint + addresses or to reject it. The receiver should ignore address information. + + - 0x1 : STREAM : the forwarded connection uses a SOCK_STREAM protocol (eg: + TCP or UNIX_STREAM). When used with AF_INET/AF_INET6 (TCP), the addresses + are followed by the source and destination ports represented on 2 bytes + each in network byte order. + + - 0x2 : DGRAM : the forwarded connection uses a SOCK_DGRAM protocol (eg: + UDP or UNIX_DGRAM). When used with AF_INET/AF_INET6 (UDP), the addresses + are followed by the source and destination ports represented on 2 bytes + each in network byte order. + + - other values are unspecified and must not be emitted in version 2 of this + protocol and must be rejected as invalid by receivers. + +In practice, the following protocol bytes are expected : + + - \x00 : UNSPEC : the connection is forwarded for an unknown, unspecified + or unsupported protocol. The sender should use this family when sending + LOCAL commands or when dealing with unsupported protocol families. When + used with a LOCAL command, the receiver must accept the connection and + ignore any address information. For other commands, the receiver is free + to accept the connection anyway and use the real endpoints addresses or to + reject the connection. The receiver should ignore address information. + + - \x11 : TCP over IPv4 : the forwarded connection uses TCP over the AF_INET + protocol family. Address length is 2*4 + 2*2 = 12 bytes. + + - \x12 : UDP over IPv4 : the forwarded connection uses UDP over the AF_INET + protocol family. Address length is 2*4 + 2*2 = 12 bytes. + + - \x21 : TCP over IPv6 : the forwarded connection uses TCP over the AF_INET6 + protocol family. Address length is 2*16 + 2*2 = 36 bytes. + + - \x22 : UDP over IPv6 : the forwarded connection uses UDP over the AF_INET6 + protocol family. Address length is 2*16 + 2*2 = 36 bytes. + + - \x31 : UNIX stream : the forwarded connection uses SOCK_STREAM over the + AF_UNIX protocol family. Address length is 2*108 = 216 bytes. + + - \x32 : UNIX datagram : the forwarded connection uses SOCK_DGRAM over the + AF_UNIX protocol family. Address length is 2*108 = 216 bytes. + + +Only the UNSPEC protocol byte (\x00) is mandatory to implement on the receiver. +A receiver is not required to implement other ones, provided that it +automatically falls back to the UNSPEC mode for the valid combinations above +that it does not support. + +The 15th and 16th bytes is the address length in bytes in network endian order. +It is used so that the receiver knows how many address bytes to skip even when +it does not implement the presented protocol. Thus the length of the protocol +header in bytes is always exactly 16 + this value. When a sender presents a +LOCAL connection, it should not present any address so it sets this field to +zero. Receivers MUST always consider this field to skip the appropriate number +of bytes and must not assume zero is presented for LOCAL connections. When a +receiver accepts an incoming connection showing an UNSPEC address family or +protocol, it may or may not decide to log the address information if present. + +So the 16-byte version 2 header can be described this way : + + struct proxy_hdr_v2 { + uint8_t sig[12]; /* hex 0D 0A 0D 0A 00 0D 0A 51 55 49 54 0A */ + uint8_t ver_cmd; /* protocol version and command */ + uint8_t fam; /* protocol family and address */ + uint16_t len; /* number of following bytes part of the header */ + }; + +Starting from the 17th byte, addresses are presented in network byte order. +The address order is always the same : + - source layer 3 address in network byte order + - destination layer 3 address in network byte order + - source layer 4 address if any, in network byte order (port) + - destination layer 4 address if any, in network byte order (port) + +The address block may directly be sent from or received into the following +union which makes it easy to cast from/to the relevant socket native structs +depending on the address type : + + union proxy_addr { + struct { /* for TCP/UDP over IPv4, len = 12 */ + uint32_t src_addr; + uint32_t dst_addr; + uint16_t src_port; + uint16_t dst_port; + } ipv4_addr; + struct { /* for TCP/UDP over IPv6, len = 36 */ + uint8_t src_addr[16]; + uint8_t dst_addr[16]; + uint16_t src_port; + uint16_t dst_port; + } ipv6_addr; + struct { /* for AF_UNIX sockets, len = 216 */ + uint8_t src_addr[108]; + uint8_t dst_addr[108]; + } unix_addr; + }; + +The sender must ensure that all the protocol header is sent at once. This block +is always smaller than an MSS, so there is no reason for it to be segmented at +the beginning of the connection. The receiver should also process the header +at once. The receiver must not start to parse an address before the whole +address block is received. The receiver must also reject incoming connections +containing partial protocol headers. + +A receiver may be configured to support both version 1 and version 2 of the +protocol. Identifying the protocol version is easy : + + - if the incoming byte count is 16 or above and the 13 first bytes match + the protocol signature block followed by the protocol version 2 : + + \x0D\x0A\x0D\x0A\x00\x0D\x0A\x51\x55\x49\x54\x0A\x20 + + - otherwise, if the incoming byte count is 8 or above, and the 5 first + characters match the US-ASCII representation of "PROXY" then the protocol + must be parsed as version 1 : + + \x50\x52\x4F\x58\x59 + + - otherwise the protocol is not covered by this specification and the + connection must be dropped. + +If the length specified in the PROXY protocol header indicates that additional +bytes are part of the header beyond the address information, a receiver may +choose to skip over and ignore those bytes, or attempt to interpret those +bytes. + +The information in those bytes will be arranged in Type-Length-Value (TLV +vectors) in the following format. The first byte is the Type of the vector. +The second two bytes represent the length in bytes of the value (not included +the Type and Length bytes), and following the length field is the number of +bytes specified by the length. + + struct pp2_tlv { + uint8_t type; + uint8_t length_hi; + uint8_t length_lo; + uint8_t value[0]; + }; + +A receiver may choose to skip over and ignore the TLVs it is not interested in +or it does not understand. Senders can generate the TLVs only for +the information they choose to publish. + +The following types have already been registered for the <type> field : + + #define PP2_TYPE_ALPN 0x01 + #define PP2_TYPE_AUTHORITY 0x02 + #define PP2_TYPE_CRC32C 0x03 + #define PP2_TYPE_NOOP 0x04 + #define PP2_TYPE_UNIQUE_ID 0x05 + #define PP2_TYPE_SSL 0x20 + #define PP2_SUBTYPE_SSL_VERSION 0x21 + #define PP2_SUBTYPE_SSL_CN 0x22 + #define PP2_SUBTYPE_SSL_CIPHER 0x23 + #define PP2_SUBTYPE_SSL_SIG_ALG 0x24 + #define PP2_SUBTYPE_SSL_KEY_ALG 0x25 + #define PP2_TYPE_NETNS 0x30 + + +2.2.1 PP2_TYPE_ALPN + +Application-Layer Protocol Negotiation (ALPN). It is a byte sequence defining +the upper layer protocol in use over the connection. The most common use case +will be to pass the exact copy of the ALPN extension of the Transport Layer +Security (TLS) protocol as defined by RFC7301 [9]. + + +2.2.2 PP2_TYPE_AUTHORITY + +Contains the host name value passed by the client, as an UTF8-encoded string. +In case of TLS being used on the client connection, this is the exact copy of +the "server_name" extension as defined by RFC3546 [10], section 3.1, often +referred to as "SNI". There are probably other situations where an authority +can be mentioned on a connection without TLS being involved at all. + + +2.2.3. PP2_TYPE_CRC32C + +The value of the type PP2_TYPE_CRC32C is a 32-bit number storing the CRC32c +checksum of the PROXY protocol header. + +When the checksum is supported by the sender after constructing the header +the sender MUST: + + - initialize the checksum field to '0's. + + - calculate the CRC32c checksum of the PROXY header as described in RFC4960, + Appendix B [8]. + + - put the resultant value into the checksum field, and leave the rest of + the bits unchanged. + +If the checksum is provided as part of the PROXY header and the checksum +functionality is supported by the receiver, the receiver MUST: + + - store the received CRC32c checksum value aside. + + - replace the 32 bits of the checksum field in the received PROXY header with + all '0's and calculate a CRC32c checksum value of the whole PROXY header. + + - verify that the calculated CRC32c checksum is the same as the received + CRC32c checksum. If it is not, the receiver MUST treat the TCP connection + providing the header as invalid. + +The default procedure for handling an invalid TCP connection is to abort it. + + +2.2.4. PP2_TYPE_NOOP + +The TLV of this type should be ignored when parsed. The value is zero or more +bytes. Can be used for data padding or alignment. Note that it can be used +to align only by 3 or more bytes because a TLV can not be smaller than that. + + +2.2.5. PP2_TYPE_UNIQUE_ID + +The value of the type PP2_TYPE_UNIQUE_ID is an opaque byte sequence of up to +128 bytes generated by the upstream proxy that uniquely identifies the +connection. + +The unique ID can be used to easily correlate connections across multiple +layers of proxies, without needing to look up IP addresses and port numbers. + + +2.2.6. The PP2_TYPE_SSL type and subtypes + +For the type PP2_TYPE_SSL, the value is itself a defined like this : + + struct pp2_tlv_ssl { + uint8_t client; + uint32_t verify; + struct pp2_tlv sub_tlv[0]; + }; + +The <verify> field will be zero if the client presented a certificate +and it was successfully verified, and non-zero otherwise. + +The <client> field is made of a bit field from the following values, +indicating which element is present : + + #define PP2_CLIENT_SSL 0x01 + #define PP2_CLIENT_CERT_CONN 0x02 + #define PP2_CLIENT_CERT_SESS 0x04 + +Note, that each of these elements may lead to extra data being appended to +this TLV using a second level of TLV encapsulation. It is thus possible to +find multiple TLV values after this field. The total length of the pp2_tlv_ssl +TLV will reflect this. + +The PP2_CLIENT_SSL flag indicates that the client connected over SSL/TLS. When +this field is present, the US-ASCII string representation of the TLS version is +appended at the end of the field in the TLV format using the type +PP2_SUBTYPE_SSL_VERSION. + +PP2_CLIENT_CERT_CONN indicates that the client provided a certificate over the +current connection. PP2_CLIENT_CERT_SESS indicates that the client provided a +certificate at least once over the TLS session this connection belongs to. + +The second level TLV PP2_SUBTYPE_SSL_CIPHER provides the US-ASCII string name +of the used cipher, for example "ECDHE-RSA-AES128-GCM-SHA256". + +The second level TLV PP2_SUBTYPE_SSL_SIG_ALG provides the US-ASCII string name +of the algorithm used to sign the certificate presented by the frontend when +the incoming connection was made over an SSL/TLS transport layer, for example +"SHA256". + +The second level TLV PP2_SUBTYPE_SSL_KEY_ALG provides the US-ASCII string name +of the algorithm used to generate the key of the certificate presented by the +frontend when the incoming connection was made over an SSL/TLS transport layer, +for example "RSA2048". + +In all cases, the string representation (in UTF8) of the Common Name field +(OID: 2.5.4.3) of the client certificate's Distinguished Name, is appended +using the TLV format and the type PP2_SUBTYPE_SSL_CN. E.g. "example.com". + + +2.2.7. The PP2_TYPE_NETNS type + +The type PP2_TYPE_NETNS defines the value as the US-ASCII string representation +of the namespace's name. + + +2.2.8. Reserved type ranges + +The following range of 16 type values is reserved for application-specific +data and will be never used by the PROXY Protocol. If you need more values +consider extending the range with a type field in your TLVs. + + #define PP2_TYPE_MIN_CUSTOM 0xE0 + #define PP2_TYPE_MAX_CUSTOM 0xEF + +This range of 8 values is reserved for temporary experimental use by +application developers and protocol designers. The values from the range will +never be used by the PROXY protocol and should not be used by production +functionality. + + #define PP2_TYPE_MIN_EXPERIMENT 0xF0 + #define PP2_TYPE_MAX_EXPERIMENT 0xF7 + +The following range of 8 values is reserved for future use, potentially to +extend the protocol with multibyte type values. + + #define PP2_TYPE_MIN_FUTURE 0xF8 + #define PP2_TYPE_MAX_FUTURE 0xFF + + +3. Implementations + +HAProxy 1.5 implements version 1 of the PROXY protocol on both sides : + - the listening sockets accept the protocol when the "accept-proxy" setting + is passed to the "bind" keyword. Connections accepted on such listeners + will behave just as if the source really was the one advertised in the + protocol. This is true for logging, ACLs, content filtering, transparent + proxying, etc... + + - the protocol may be used to connect to servers if the "send-proxy" setting + is present on the "server" line. It is enabled on a per-server basis, so it + is possible to have it enabled for remote servers only and still have local + ones behave differently. If the incoming connection was accepted with the + "accept-proxy", then the relayed information is the one advertised in this + connection's PROXY line. + + - HAProxy 1.5 also implements version 2 of the PROXY protocol as a sender. In + addition, a TLV with limited, optional, SSL information has been added. + +Stunnel added support for version 1 of the protocol for outgoing connections in +version 4.45. + +Stud added support for version 1 of the protocol for outgoing connections on +2011/06/29. + +Postfix added support for version 1 of the protocol for incoming connections +in smtpd and postscreen in version 2.10. + +A patch is available for Stud[5] to implement version 1 of the protocol on +incoming connections. + +Support for versions 1 and 2 of the protocol was added to Varnish 4.1 [6]. + +Exim added support for version 1 and version 2 of the protocol for incoming +connections on 2014/05/13, and will be released as part of version 4.83. + +Squid added support for versions 1 and 2 of the protocol in version 3.5 [7]. + +Jetty 9.3.0 supports protocol version 1. + +lighttpd added support for versions 1 and 2 of the protocol for incoming +connections in version 1.4.46 [11]. + +The protocol is simple enough that it is expected that other implementations +will appear, especially in environments such as SMTP, IMAP, FTP, RDP where the +client's address is an important piece of information for the server and some +intermediaries. In fact, several proprietary deployments have already done so +on FTP and SMTP servers. + +Proxy developers are encouraged to implement this protocol, because it will +make their products much more transparent in complex infrastructures, and will +get rid of a number of issues related to logging and access control. + + +4. Architectural benefits +4.1. Multiple layers + +Using the PROXY protocol instead of transparent proxy provides several benefits +in multiple-layer infrastructures. The first immediate benefit is that it +becomes possible to chain multiple layers of proxies and always present the +original IP address. for instance, let's consider the following 2-layer proxy +architecture : + + Internet + ,---. | client to PX1: + ( X ) | native protocol + `---' | + | V + +--+--+ +-----+ + | FW1 |------| PX1 | + +--+--+ +-----+ | PX1 to PX2: PROXY + native + | V + +--+--+ +-----+ + | FW2 |------| PX2 | + +--+--+ +-----+ | PX2 to SRV: PROXY + native + | V + +--+--+ + | SRV | + +-----+ + +Firewall FW1 receives traffic from internet-based clients and forwards it to +reverse-proxy PX1. PX1 adds a PROXY header then forwards to PX2 via FW2. PX2 +is configured to read the PROXY header and to emit it on output. It then joins +the origin server SRV and presents the original client's address there. Since +all TCP connections endpoints are real machines and are not spoofed, there is +no issue for the return traffic to pass via the firewalls and reverse proxies. +Using transparent proxy, this would be quite difficult because the firewalls +would have to deal with the client's address coming from the proxies in the DMZ +and would have to correctly route the return traffic there instead of using the +default route. + + +4.2. IPv4 and IPv6 integration + +The protocol also eases IPv4 and IPv6 integration : if only the first layer +(FW1 and PX1) is IPv6-capable, it is still possible to present the original +client's IPv6 address to the target server even though the whole chain is only +connected via IPv4. + + +4.3. Multiple return paths + +When transparent proxy is used, it is not possible to run multiple proxies +because the return traffic would follow the default route instead of finding +the proper proxy. Some tricks are sometimes possible using multiple server +addresses and policy routing but these are very limited. + +Using the PROXY protocol, this problem disappears as the servers don't need +to route to the client, just to the proxy that forwarded the connection. So +it is perfectly possible to run a proxy farm in front of a very large server +farm and have it working effortless, even when dealing with multiple sites. + +This is particularly important in Cloud-like environments where there is little +choice of binding to random addresses and where the lower processing power per +node generally requires multiple front nodes. + +The example below illustrates the following case : virtualized infrastructures +are deployed in 3 datacenters (DC1..DC3). Each DC uses its own VIP which is +handled by the hosting provider's layer 3 load balancer. This load balancer +routes the traffic to a farm of layer 7 SSL/cache offloaders which load balance +among their local servers. The VIPs are advertised by geolocalised DNS so that +clients generally stick to a given DC. Since clients are not guaranteed to +stick to one DC, the L7 load balancing proxies have to know the other DCs' +servers that may be reached via the hosting provider's LAN or via the internet. +The L7 proxies use the PROXY protocol to join the servers behind them, so that +even inter-DC traffic can forward the original client's address and the return +path is unambiguous. This would not be possible using transparent proxy because +most often the L7 proxies would not be able to spoof an address, and this would +never work between datacenters. + + Internet + + DC1 DC2 DC3 + ,---. ,---. ,---. + ( X ) ( X ) ( X ) + `---' `---' `---' + | +-------+ | +-------+ | +-------+ + +----| L3 LB | +----| L3 LB | +----| L3 LB | + | +-------+ | +-------+ | +-------+ + ------+------- ~ ~ ~ ------+------- ~ ~ ~ ------+------- + ||||| |||| ||||| |||| ||||| |||| + 50 SRV 4 PX 50 SRV 4 PX 50 SRV 4 PX + + +5. Security considerations + +Version 1 of the protocol header (the human-readable format) was designed so as +to be distinguishable from HTTP. It will not parse as a valid HTTP request and +an HTTP request will not parse as a valid proxy request. Version 2 add to use a +non-parsable binary signature to make many products fail on this block. The +signature was designed to cause immediate failure on HTTP, SSL/TLS, SMTP, FTP, +and POP. It also causes aborts on LDAP and RDP servers (see section 6). That +makes it easier to enforce its use under certain connections and at the same +time, it ensures that improperly configured servers are quickly detected. + +Implementers should be very careful about not trying to automatically detect +whether they have to decode the header or not, but rather they must only rely +on a configuration parameter. Indeed, if the opportunity is left to a normal +client to use the protocol, it will be able to hide its activities or make them +appear as coming from somewhere else. However, accepting the header only from a +number of known sources should be safe. + + +6. Validation + +The version 2 protocol signature has been sent to a wide variety of protocols +and implementations including old ones. The following protocol and products +have been tested to ensure the best possible behavior when the signature was +presented, even with minimal implementations : + + - HTTP : + - Apache 1.3.33 : connection abort => pass/optimal + - Nginx 0.7.69 : 400 Bad Request + abort => pass/optimal + - lighttpd 1.4.20 : 400 Bad Request + abort => pass/optimal + - thttpd 2.20c : 400 Bad Request + abort => pass/optimal + - mini-httpd-1.19 : 400 Bad Request + abort => pass/optimal + - haproxy 1.4.21 : 400 Bad Request + abort => pass/optimal + - Squid 3 : 400 Bad Request + abort => pass/optimal + - SSL : + - stud 0.3.47 : connection abort => pass/optimal + - stunnel 4.45 : connection abort => pass/optimal + - nginx 0.7.69 : 400 Bad Request + abort => pass/optimal + - FTP : + - Pure-ftpd 1.0.20 : 3*500 then 221 Goodbye => pass/optimal + - vsftpd 2.0.1 : 3*530 then 221 Goodbye => pass/optimal + - SMTP : + - postfix 2.3 : 3*500 + 221 Bye => pass/optimal + - exim 4.69 : 554 + connection abort => pass/optimal + - POP : + - dovecot 1.0.10 : 3*ERR + Logout => pass/optimal + - IMAP : + - dovecot 1.0.10 : 5*ERR + hang => pass/non-optimal + - LDAP : + - openldap 2.3 : abort => pass/optimal + - SSH : + - openssh 3.9p1 : abort => pass/optimal + - RDP : + - Windows XP SP3 : abort => pass/optimal + +This means that most protocols and implementations will not be confused by an +incoming connection exhibiting the protocol signature, which avoids issues when +facing misconfigurations. + + +7. Future developments + +It is possible that the protocol may slightly evolve to present other +information such as the incoming network interface, or the origin addresses in +case of network address translation happening before the first proxy, but this +is not identified as a requirement right now. Some deep thinking has been spent +on this and it appears that trying to add a few more information open a Pandora +box with many information from MAC addresses to SSL client certificates, which +would make the protocol much more complex. So at this point it is not planned. +Suggestions on improvements are welcome. + + +8. Contacts and links + +Please use w@1wt.eu to send any comments to the author. + +The following links were referenced in the document. + +[1] http://www.postfix.org/XCLIENT_README.html +[2] http://tools.ietf.org/html/rfc7239 +[3] http://www.stunnel.org/ +[4] https://github.com/bumptech/stud +[5] https://github.com/bumptech/stud/pull/81 +[6] https://www.varnish-cache.org/docs/trunk/phk/ssl_again.html +[7] http://wiki.squid-cache.org/Squid-3.5 +[8] https://tools.ietf.org/html/rfc4960#appendix-B +[9] https://tools.ietf.org/rfc/rfc7301.txt +[10] https://www.ietf.org/rfc/rfc3546.txt +[11] https://redmine.lighttpd.net/issues/2804 + +9. Sample code + +The code below is an example of how a receiver may deal with both versions of +the protocol header for TCP over IPv4 or IPv6. The function is supposed to be +called upon a read event. Addresses may be directly copied into their final +memory location since they're transported in network byte order. The sending +side is even simpler and can easily be deduced from this sample code. + + struct sockaddr_storage from; /* already filled by accept() */ + struct sockaddr_storage to; /* already filled by getsockname() */ + const char v2sig[12] = "\x0D\x0A\x0D\x0A\x00\x0D\x0A\x51\x55\x49\x54\x0A"; + + /* returns 0 if needs to poll, <0 upon error or >0 if it did the job */ + int read_evt(int fd) + { + union { + struct { + char line[108]; + } v1; + struct { + uint8_t sig[12]; + uint8_t ver_cmd; + uint8_t fam; + uint16_t len; + union { + struct { /* for TCP/UDP over IPv4, len = 12 */ + uint32_t src_addr; + uint32_t dst_addr; + uint16_t src_port; + uint16_t dst_port; + } ip4; + struct { /* for TCP/UDP over IPv6, len = 36 */ + uint8_t src_addr[16]; + uint8_t dst_addr[16]; + uint16_t src_port; + uint16_t dst_port; + } ip6; + struct { /* for AF_UNIX sockets, len = 216 */ + uint8_t src_addr[108]; + uint8_t dst_addr[108]; + } unx; + } addr; + } v2; + } hdr; + + int size, ret; + + do { + ret = recv(fd, &hdr, sizeof(hdr), MSG_PEEK); + } while (ret == -1 && errno == EINTR); + + if (ret == -1) + return (errno == EAGAIN) ? 0 : -1; + + if (ret >= 16 && memcmp(&hdr.v2, v2sig, 12) == 0 && + (hdr.v2.ver_cmd & 0xF0) == 0x20) { + size = 16 + ntohs(hdr.v2.len); + if (ret < size) + return -1; /* truncated or too large header */ + + switch (hdr.v2.ver_cmd & 0xF) { + case 0x01: /* PROXY command */ + switch (hdr.v2.fam) { + case 0x11: /* TCPv4 */ + ((struct sockaddr_in *)&from)->sin_family = AF_INET; + ((struct sockaddr_in *)&from)->sin_addr.s_addr = + hdr.v2.addr.ip4.src_addr; + ((struct sockaddr_in *)&from)->sin_port = + hdr.v2.addr.ip4.src_port; + ((struct sockaddr_in *)&to)->sin_family = AF_INET; + ((struct sockaddr_in *)&to)->sin_addr.s_addr = + hdr.v2.addr.ip4.dst_addr; + ((struct sockaddr_in *)&to)->sin_port = + hdr.v2.addr.ip4.dst_port; + goto done; + case 0x21: /* TCPv6 */ + ((struct sockaddr_in6 *)&from)->sin6_family = AF_INET6; + memcpy(&((struct sockaddr_in6 *)&from)->sin6_addr, + hdr.v2.addr.ip6.src_addr, 16); + ((struct sockaddr_in6 *)&from)->sin6_port = + hdr.v2.addr.ip6.src_port; + ((struct sockaddr_in6 *)&to)->sin6_family = AF_INET6; + memcpy(&((struct sockaddr_in6 *)&to)->sin6_addr, + hdr.v2.addr.ip6.dst_addr, 16); + ((struct sockaddr_in6 *)&to)->sin6_port = + hdr.v2.addr.ip6.dst_port; + goto done; + } + /* unsupported protocol, keep local connection address */ + break; + case 0x00: /* LOCAL command */ + /* keep local connection address for LOCAL */ + break; + default: + return -1; /* not a supported command */ + } + } + else if (ret >= 8 && memcmp(hdr.v1.line, "PROXY", 5) == 0) { + char *end = memchr(hdr.v1.line, '\r', ret - 1); + if (!end || end[1] != '\n') + return -1; /* partial or invalid header */ + *end = '\0'; /* terminate the string to ease parsing */ + size = end + 2 - hdr.v1.line; /* skip header + CRLF */ + /* parse the V1 header using favorite address parsers like inet_pton. + * return -1 upon error, or simply fall through to accept. + */ + } + else { + /* Wrong protocol */ + return -1; + } + + done: + /* we need to consume the appropriate amount of data from the socket */ + do { + ret = recv(fd, &hdr, size, 0); + } while (ret == -1 && errno == EINTR); + return (ret >= 0) ? 1 : -1; + } diff --git a/doc/queuing.fig b/doc/queuing.fig new file mode 100644 index 0000000..8d57504 --- /dev/null +++ b/doc/queuing.fig @@ -0,0 +1,192 @@ +#FIG 3.2 +Portrait +Center +Metric +A4 +100.00 +Single +-2 +1200 2 +6 900 4770 1575 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 900 4770 1125 4995 1125 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 1575 4770 1350 4995 1350 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1170 4995 1170 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1215 4995 1215 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1260 4995 1260 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1305 4995 1305 5220 +2 3 0 1 7 7 52 -1 20 0.000 2 0 -1 0 0 7 + 900 4770 1125 4995 1125 5220 1350 5220 1350 4995 1575 4770 + 900 4770 +-6 +6 2250 4770 2925 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 2250 4770 2475 4995 2475 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 3 + 2925 4770 2700 4995 2700 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 2520 4995 2520 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 2565 4995 2565 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 2610 4995 2610 5220 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 2655 4995 2655 5220 +2 3 0 1 7 7 52 -1 20 0.000 2 0 -1 0 0 7 + 2250 4770 2475 4995 2475 5220 2700 5220 2700 4995 2925 4770 + 2250 4770 +-6 +6 1710 3420 2115 3870 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1710 3780 2115 3780 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1710 3825 2115 3825 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1710 3735 2115 3735 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1710 3690 2115 3690 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1710 3645 2115 3645 +2 1 0 1 0 6 51 -1 20 0.000 0 0 -1 0 0 4 + 1710 3420 1710 3870 2115 3870 2115 3420 +-6 +1 2 0 1 0 7 51 -1 20 0.000 1 0.0000 1935 2182 450 113 1485 2182 2385 2182 +1 2 0 1 0 7 51 -1 20 0.000 0 0.0000 2790 3082 450 113 2340 3082 3240 3082 +1 2 0 1 0 7 51 -1 20 0.000 1 0.0000 1935 1367 450 113 1485 1367 2385 1367 +1 2 0 1 0 7 51 -1 20 0.000 1 0.0000 1035 3082 450 113 585 3082 1485 3082 +2 1 2 1 0 2 53 -1 -1 3.000 0 0 -1 0 0 2 + 2745 3870 3015 3870 +2 1 2 1 0 2 53 -1 -1 3.000 0 0 -1 0 0 2 + 2745 4320 3015 4320 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 60.00 + 2970 5085 2745 5085 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 60.00 + 2205 5085 2430 5085 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 60.00 + 1620 5085 1395 5085 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 1 1 1.00 60.00 60.00 + 855 5085 1080 5085 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3 + 0 0 1.00 60.00 60.00 + 1890 3870 1440 4320 1440 4770 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3 + 0 0 1.00 60.00 60.00 + 1935 3870 2385 4320 2385 4770 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 0 0 1.00 60.00 60.00 + 2610 4320 2610 4770 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 0 0 1.00 60.00 60.00 + 2835 3195 2835 4770 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3 + 0 0 1.00 60.00 60.00 + 2745 3195 2610 3330 2610 3870 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 0 0 1.00 60.00 60.00 + 1935 2295 1935 3420 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3 + 0 0 1.00 60.00 60.00 + 1080 3195 1215 3330 1215 3870 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3 + 0 0 1.00 60.00 60.00 + 1890 2295 1035 2745 1035 2970 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3 + 0 0 1.00 60.00 60.00 + 1980 2295 2790 2745 2790 2970 +2 1 0 1 0 2 50 -1 -1 0.000 0 0 -1 1 0 2 + 0 0 1.00 60.00 60.00 + 1935 1485 1935 2070 +2 1 1 1 0 2 50 -1 -1 4.000 0 0 -1 1 0 5 + 0 0 1.00 60.00 60.00 + 810 5220 450 5220 450 2160 1080 2160 1485 2160 +2 1 1 1 0 2 50 -1 -1 4.000 0 0 -1 1 0 5 + 0 0 1.00 60.00 60.00 + 3060 5220 3375 5220 3375 2160 2655 2160 2385 2160 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 0 0 1.00 60.00 60.00 + 1215 4320 1215 4770 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 2 + 0 0 1.00 60.00 60.00 + 990 3195 990 4770 +2 1 0 1 0 2 50 -1 -1 0.000 0 0 -1 1 0 2 + 0 0 1.00 60.00 60.00 + 1935 855 1935 1260 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3 + 0 0 1.00 60.00 60.00 + 1620 1440 900 2025 900 2970 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 1 0 3 + 0 0 1.00 60.00 60.00 + 2205 1440 2925 2025 2925 2970 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1125 4230 1350 4230 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1125 4275 1350 4275 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1125 4185 1350 4185 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1125 4140 1350 4140 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1125 4095 1350 4095 +2 1 0 1 0 3 51 -1 20 0.000 0 0 -1 0 0 4 + 1125 3870 1125 4320 1350 4320 1350 3870 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 2700 4230 2475 4230 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 2700 4275 2475 4275 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 2700 4185 2475 4185 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 2700 4140 2475 4140 +2 1 0 1 0 3 51 -1 20 0.000 0 0 -1 0 0 4 + 2700 3870 2700 4320 2475 4320 2475 3870 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1125 4050 1350 4050 +2 1 0 1 0 7 50 -1 -1 0.000 0 0 -1 0 0 2 + 1125 4005 1350 4005 +2 1 0 1 0 2 53 -1 -1 0.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 60.00 + 0 0 1.00 60.00 60.00 + 900 3870 900 4320 +2 1 2 1 0 2 53 -1 -1 3.000 0 0 -1 0 0 2 + 855 3870 1125 3870 +2 1 2 1 0 2 53 -1 -1 3.000 0 0 -1 0 0 2 + 855 4320 1125 4320 +2 1 0 1 0 2 53 -1 -1 0.000 0 0 -1 1 1 2 + 0 0 1.00 60.00 60.00 + 0 0 1.00 60.00 60.00 + 2970 3870 2970 4320 +4 0 0 53 -1 16 7 0.0000 4 75 195 1260 3510 Yes\001 +4 2 0 53 -1 16 7 0.0000 4 75 135 945 3510 No\001 +4 2 0 53 -1 16 7 0.0000 4 75 195 2565 3510 Yes\001 +4 0 0 53 -1 16 7 0.0000 4 75 135 2880 3510 No\001 +4 1 0 50 -1 16 6 0.0000 4 75 210 1935 4140 global\001 +4 1 0 50 -1 16 6 0.0000 4 60 225 1935 4230 queue\001 +4 1 0 50 -1 16 8 1.5708 4 120 1005 405 4680 Redispatch on error\001 +4 1 0 53 -1 14 6 1.5708 4 60 480 2205 3645 maxqueue\001 +4 1 0 50 -1 18 8 0.0000 4 90 165 1935 2205 LB\001 +4 1 0 53 -1 16 7 1.5708 4 90 870 2070 2880 server, all are full.\001 +4 1 0 50 -1 18 8 0.0000 4 90 360 1935 1395 cookie\001 +4 1 0 50 -1 16 10 0.0000 4 135 1200 1935 765 Incoming request\001 +4 1 0 53 -1 16 7 1.5708 4 75 480 1890 1755 no cookie\001 +4 1 0 53 -1 16 7 1.5708 4 75 600 1890 2880 no available\001 +4 0 0 53 -1 16 7 5.6200 4 75 735 2340 1530 SRV2 selected\001 +4 1 0 50 -1 16 10 0.0000 4 105 405 1260 5445 SRV1\001 +4 1 0 50 -1 16 10 0.0000 4 105 405 2610 5445 SRV2\001 +4 2 0 53 -1 16 7 0.4712 4 75 735 1665 2385 SRV1 selected\001 +4 0 0 53 -1 16 7 5.8119 4 75 735 2205 2385 SRV2 selected\001 +4 2 0 53 -1 16 7 0.6632 4 75 735 1485 1530 SRV1 selected\001 +4 0 0 53 -1 14 6 0.0000 4 45 420 2880 5040 maxconn\001 +4 1 0 50 -1 16 8 1.5708 4 120 1005 3510 4680 Redispatch on error\001 +4 1 0 50 -1 18 8 0.0000 4 90 615 2790 3105 SRV2 full ?\001 +4 1 0 50 -1 18 8 0.0000 4 90 615 1035 3105 SRV1 full ?\001 +4 1 0 53 -1 14 6 1.5708 4 60 480 855 4095 maxqueue\001 +4 1 0 53 -1 14 6 1.5708 4 60 480 3105 4095 maxqueue\001 diff --git a/doc/regression-testing.txt b/doc/regression-testing.txt new file mode 100644 index 0000000..d8d71b5 --- /dev/null +++ b/doc/regression-testing.txt @@ -0,0 +1,706 @@ + +---------------------------------------+ + | HAProxy regression testing with vtest | + +---------------------------------------+ + + +The information found in this file are a short starting guide to help you to +write VTC (Varnish Test Case) scripts (or VTC files) for haproxy regression testing. +Such VTC files are currently used to test Varnish cache application developed by +Poul-Henning Kamp. A very big thanks you to him for having helped you to add +our haproxy C modules to vtest tool. Note that vtest was formally developed for +varnish cache reg testing and was named varnishtest. vtest is an haproxy specific +version of varnishtest program which reuses the non varnish cache specific code. + +A lot of general information about how to write VTC files may be found in 'man/vtc.7' +manual of varnish cache sources directory or directly on the web here: + + https://varnish-cache.org/docs/trunk/reference/vtc.html + +It is *highly* recommended to read this manual before asking to haproxy ML. This +documentation only deals with the vtest support for haproxy. + + +vtest installation +------------------------ + +To use vtest you will have to download and compile the recent vtest +sources found at https://github.com/vtest/VTest. + +To compile vtest: + + $ cd VTest + $ make vtest + +Note that varnishtest may be also compiled but not without the varnish cache +sources already compiled: + + $ VARNISH_SRC=<...> make varnishtest + +After having compiled these sources, the vtest executable location is at the +root of the vtest sources directory. + + +vtest execution +--------------------- + +vtest is able to search for the haproxy executable file it is supposed to +launch thanks to the PATH environment variable. To force the executable to be used by +vtest, the HAPROXY_PROGRAM environment variable for vtest may be +typically set as follows: + + $ HAPROXY_PROGRAM=~/srcs/haproxy/haproxy vtest ... + +vtest program comes with interesting options. The most interesting are: + + -t Timeout in seconds to abort the test if some launched program + -v By default, vtest does not dump the outputs of processus it launched + when the test passes. With this option the outputs are dumped even + when the test passes. + -L to always keep the temporary VTC directories. + -l to keep the temporary VTC directories only when the test fails. + +About haproxy, when launched by vtest, -d option is enabled by default. + + +How to write VTC files +---------------------- + +A VTC file must start with a "varnishtest" or "vtest" command line followed by a +descriptive line enclosed by double quotes. This is not specific to the VTC files +for haproxy. + +The VTC files for haproxy must also contain a "feature ignore_unknown_macro" line +if any macro is used for haproxy in this file. This is due to the fact that +vtest parser code for haproxy commands generates macros the vtest +parser code for varnish cache has no knowledge of. This line prevents vtest from +failing in such cases. As a "cli" macro automatically generated, this +"feature ignore_unknown_macro" is mandatory for each VTC file for haproxy. + +To make vtest capable of testing haproxy, two new VTC commands have been +implemented: "haproxy" and "syslog". "haproxy" is used to start haproxy processus. +"syslog" is used to start syslog servers (at this time, only used by haproxy). + +As haproxy cannot work without configuration file, a VTC file for haproxy must +embed the configuration files contents for the haproxy instances it declares. +This may be done using the following intuitive syntax construction: -conf {...}. +Here -conf is an argument of "haproxy" VTC command to declare the configuration +file of the haproxy instances it also declares (see "Basic HAProxy test" VTC file +below). + +As for varnish VTC files, the parser of VTC files for haproxy automatically +generates macros for the declared frontends to be reused by the clients later +in the script, so after having written the "haproxy" command sections. +The syntax "fd@${my_frontend_fd_name}" must be used to bind the frontend +listeners to localhost address and random ports (see "Environment variables" +section of haproxy documentation). This is mandatory. + +Each time the haproxy command parser finds a "fd@${xyz}" string in a 'ABC' +"haproxy" command section, it generates three macros: 'ABC_xyz_addr', 'ABC_xyz_port' +and 'ABC_xyz_sock', with 'ABC_xyz_sock' being resolved as 'ABC_xyz_addr +ABC_xyz_port' typically used by clients -connect parameter. + +Each haproxy instance works in their own temporary working directories located +at '/tmp/vtc.<varnitest PID>.XXXXXXXX/<haproxy_instance_name>' (with XXXXXXXX +a random 8 digits hexadecimal integer. This is in this temporary directory that +the configuration file is temporarily written. + +A 'stats.sock' UNIX socket is also created in this directory. There is no need +to declare such stats sockets in the -conf {...} section. The name of the parent +directory of the haproxy instances working directories is stored in 'tmpdir'. In +fact this is the working directory of the current vtest processus. + +There also exists 'testdir' macro which is the parent directory of the VTC file. +It may be useful to use other files located in the same directory than the current +VTC file. + + + +VTC file examples +----------------- + +The following first VTC file is a real regression test case file for a bug which has +been fixed by 84c844e commit. We declare a basic configuration for a 'h1' haproxy +instance. + + varnishtest "SPOE bug: missing configuration file" + + #commit 84c844eb12b250aa86f2aadaff77c42dfc3cb619 + #Author: Christopher Faulet <cfaulet@haproxy.com> + #Date: Fri Mar 23 14:37:14 2018 +0100 + + # BUG/MINOR: spoe: Initialize variables used during conf parsing before any check + + # Some initializations must be done at the beginning of parse_spoe_flt to avoid + # segmentation fault when first errors are caught, when the "filter spoe" line is + # parsed. + + haproxy h1 -conf-BAD {} { + defaults + timeout connect 5000ms + timeout client 50000ms + timeout server 50000ms + + frontend my-front + filter spoe + } + + +-conf-BAD haproxy command argument is used. Its role it to launch haproxy with +-c option (configuration file checking) and check that 'h1' exit(3) with 1 as +status. Here is the output when running this VTC file: + + + **** top 0.0 extmacro def pwd=/home/fred/src/haproxy + **** top 0.0 extmacro def localhost=127.0.0.1 + **** top 0.0 extmacro def bad_backend=127.0.0.1 39564 + **** top 0.0 extmacro def bad_ip=192.0.2.255 + **** top 0.0 macro def testdir=//home/fred/src/varnish-cache-haproxy + **** top 0.0 macro def tmpdir=/tmp/vtc.6377.64329194 + * top 0.0 TEST /home/fred/src/varnish-cache-haproxy/spoe_bug.vtc starting + ** top 0.0 === varnishtest "SPOE bug: missing configuration file" + * top 0.0 TEST SPOE bug: missing configuration file + ** top 0.0 === haproxy h1 -conf-BAD {} { + **** h1 0.0 conf| global + **** h1 0.0 conf|\tstats socket /tmp/vtc.6377.64329194/h1/stats.sock level admin mode 600 + **** h1 0.0 conf| + **** h1 0.0 conf|\tdefaults + **** h1 0.0 conf| timeout connect 5000ms + **** h1 0.0 conf| timeout client 50000ms + **** h1 0.0 conf| timeout server 50000ms + **** h1 0.0 conf| + **** h1 0.0 conf|\tfrontend my-front + **** h1 0.0 conf|\t\tfilter spoe + **** h1 0.0 conf| + ** h1 0.0 haproxy_start + **** h1 0.0 opt_worker 0 opt_daemon 0 opt_check_mode 1 + **** h1 0.0 argv|exec /home/fred/src/haproxy/haproxy -c -f /tmp/vtc.6377.64329194/h1/cfg + **** h1 0.0 XXX 5 @277 + *** h1 0.0 PID: 6395 + **** h1 0.0 macro def h1_pid=6395 + **** h1 0.0 macro def h1_name=/tmp/vtc.6377.64329194/h1 + ** h1 0.0 Wait + ** h1 0.0 Stop HAProxy pid=6395 + **** h1 0.0 STDOUT poll 0x10 + ** h1 0.0 WAIT4 pid=6395 status=0x008b (user 0.000000 sys 0.000000) + * h1 0.0 Expected exit: 0x1 signal: 0 core: 0 + ---- h1 0.0 Bad exit status: 0x008b exit 0x0 signal 11 core 128 + * top 0.0 RESETTING after /home/fred/src/varnish-cache-haproxy/spoe_bug.vtc + ** h1 0.0 Reset and free h1 haproxy 6395 + ** h1 0.0 Wait + ---- h1 0.0 Assert error in haproxy_wait(), vtc_haproxy.c line 326: Condition(*(&h->fds[1]) >= 0) not true. + + * top 0.0 failure during reset + # top TEST /home/fred/src/varnish-cache-haproxy/spoe_bug.vtc FAILED (0.008) exit=2 + + +'h1' exited with (128 + 11) status and a core file was produced in +/tmp/vtc.6377.64329194/h1 directory. +With the patch provided by 84c844e commit, varnishtest makes this VTC file pass +as expected (verbose mode execution): + + **** top 0.0 extmacro def pwd=/home/fred/src/haproxy + **** top 0.0 extmacro def localhost=127.0.0.1 + **** top 0.0 extmacro def bad_backend=127.0.0.1 42264 + **** top 0.0 extmacro def bad_ip=192.0.2.255 + **** top 0.0 macro def testdir=//home/fred/src/varnish-cache-haproxy + **** top 0.0 macro def tmpdir=/tmp/vtc.25540.59b6ec5d + * top 0.0 TEST /home/fred/src/varnish-cache-haproxy/spoe_bug.vtc starting + ** top 0.0 === varnishtest "SPOE bug: missing configuration file" + * top 0.0 TEST SPOE bug: missing configuration file + ** top 0.0 === haproxy h1 -conf-BAD {} { + **** h1 0.0 conf| global + **** h1 0.0 conf|\tstats socket /tmp/vtc.25540.59b6ec5d/h1/stats.sock level admin mode 600 + **** h1 0.0 conf| + **** h1 0.0 conf|\tdefaults + **** h1 0.0 conf| timeout connect 5000ms + **** h1 0.0 conf| timeout client 50000ms + **** h1 0.0 conf| timeout server 50000ms + **** h1 0.0 conf| + **** h1 0.0 conf|\tfrontend my-front + **** h1 0.0 conf|\t\tfilter spoe + **** h1 0.0 conf| + ** h1 0.0 haproxy_start + **** h1 0.0 opt_worker 0 opt_daemon 0 opt_check_mode 1 + **** h1 0.0 argv|exec /home/fred/src/haproxy/haproxy -c -f /tmp/vtc.25540.59b6ec5d/h1/cfg + **** h1 0.0 XXX 5 @277 + *** h1 0.0 PID: 25558 + **** h1 0.0 macro def h1_pid=25558 + **** h1 0.0 macro def h1_name=/tmp/vtc.25540.59b6ec5d/h1 + ** h1 0.0 Wait + ** h1 0.0 Stop HAProxy pid=25558 + *** h1 0.0 debug|[ALERT] (25558) : parsing [/tmp/vtc.25540.59b6ec5d/h1/cfg:10] : 'filter' : ''spoe' : missing config file' + *** h1 0.0 debug|[ALERT] (25558) : Error(s) found in configuration file : /tmp/vtc.25540.59b6ec5d/h1/cfg + *** h1 0.0 debug|[ALERT] (25558) : Fatal errors found in configuration. + **** h1 0.0 STDOUT poll 0x10 + ** h1 0.0 WAIT4 pid=25558 status=0x0100 (user 0.000000 sys 0.000000) + ** h1 0.0 Found expected '' + * top 0.0 RESETTING after /home/fred/src/varnish-cache-haproxy/spoe_bug.vtc + ** h1 0.0 Reset and free h1 haproxy -1 + * top 0.0 TEST /home/fred/src/varnish-cache-haproxy/spoe_bug.vtc completed + # top TEST /home/fred/src/varnish-cache-haproxy/spoe_bug.vtc passed (0.004) + + +The following VTC file does almost nothing except running a shell to list +the contents of 'tmpdir' directory after having launched a haproxy instance +and 's1' HTTP server. This shell also prints the content of 'cfg' 'h1' configuration +file. + + varnishtest "List the contents of 'tmpdir'" + feature ignore_unknown_macro + + server s1 { + } -start + + haproxy h1 -conf { + defaults + mode http + timeout connect 5s + timeout server 30s + timeout client 30s + + backend be1 + server srv1 ${s1_addr}:${s1_port} + + frontend http1 + use_backend be1 + bind "fd@${my_frontend_fd}" + } -start + + shell { + echo "${tmpdir} working directory content:" + ls -lR ${tmpdir} + cat ${tmpdir}/h1/cfg + } + +We give only the output of the shell to illustrate this example: + + . + . + . + ** top 0.0 === shell { + **** top 0.0 shell_cmd|exec 2>&1 ; + **** top 0.0 shell_cmd| echo "tmpdir: /tmp/vtc.32092.479d521e" + **** top 0.0 shell_cmd| ls -lR /tmp/vtc.32092.479d521e + **** top 0.0 shell_cmd| cat /tmp/vtc.32092.479d521e/h1/cfg + . + . + . + **** top 0.0 shell_out|/tmp/vtc.3808.448cbfe0 working directory content: + **** top 0.0 shell_out|/tmp/vtc.32092.479d521e: + **** top 0.0 shell_out|total 8 + **** top 0.0 shell_out|drwxr-xr-x 2 users 4096 Jun 7 11:09 h1 + **** top 0.0 shell_out|-rw-r--r-- 1 me users 84 Jun 7 11:09 INFO + **** top 0.0 shell_out| + **** top 0.0 shell_out|/tmp/vtc.32092.479d521e/h1: + **** top 0.0 shell_out|total 4 + **** top 0.0 shell_out|-rw-r----- 1 fred users 339 Jun 7 11:09 cfg + **** top 0.0 shell_out|srw------- 1 fred users 0 Jun 7 11:09 stats.sock + **** top 0.0 shell_out| global + **** top 0.0 shell_out|\tstats socket /tmp/vtc.32092.479d521e/h1/stats.sock level admin mode 600 + **** top 0.0 shell_out| + **** top 0.0 shell_out| defaults + **** top 0.0 shell_out| mode http + **** top 0.0 shell_out| timeout connect 5s + **** top 0.0 shell_out| timeout server 30s + **** top 0.0 shell_out| timeout client 30s + **** top 0.0 shell_out| + **** top 0.0 shell_out| backend be1 + **** top 0.0 shell_out| server srv1 127.0.0.1:36984 + **** top 0.0 shell_out| + **** top 0.0 shell_out| frontend http1 + **** top 0.0 shell_out| use_backend be1 + **** top 0.0 shell_out| bind "fd@${my_frontend_fd}" + **** top 0.0 shell_status = 0x0000 + + +The following example illustrate how to run a basic HTTP transaction between 'c1' +client and 's1' server with 'http1' as haproxy frontend. This frontend listen +on TCP socket with 'my_frontend_fd' as file descriptor. + + # Mandatory line + varnishtest "Basic HAProxy test" + + # As some macros for haproxy are used in this file, this line is mandatory. + feature ignore_unknown_macro + + server s1 { + rxreq + txresp -body "s1 >>> Hello world!" + } -start + + haproxy h1 -conf { + # Configuration file of 'h1' haproxy instance. + defaults + mode http + timeout connect 5s + timeout server 30s + timeout client 30s + + backend be1 + # declare 'srv1' server to point to 's1' server instance declare above. + server srv1 ${s1_addr}:${s1_port} + + frontend http1 + use_backend be1 + bind "fd@${my_frontend_fd}" + } -start + + client c1 -connect ${h1_my_frontend_fd_sock} { + txreq -url "/" + rxresp + expect resp.status == 200 + expect resp.body == "s1 >>> Hello world!" + } -run + + +It is possible to shorten the previous VTC file haproxy command section as follows: + + haproxy h1 -conf { + # Configuration file of 'h1' haproxy instance. + defaults + mode http + timeout connect 5s + timeout server 30s + timeout client 30s + } + +In this latter example, "backend" and "frontend" sections are automatically +generated depending on the declarations of server instances. + + +Another interesting real regression test case is the following: we declare one +server 's1', a syslog server 'Slg_1' and a basic haproxy configuration for 'h1' +haproxy instance. Here we want to check that the syslog message are correctly +formatted thanks to "expect" "syslog" command (see syslog Slg_1 {...} command) +below. + + varnishtest "Wrong ip/port logging" + feature ignore_unknown_macro + + #commit d02286d6c866e5c0a7eb6fbb127fa57f3becaf16 + #Author: Willy Tarreau <w@1wt.eu> + #Date: Fri Jun 23 11:23:43 2017 +0200 + # + # BUG/MINOR: log: pin the front connection when front ip/ports are logged + # + # Mathias Weiersmueller reported an interesting issue with logs which Lukas + # diagnosed as dating back from commit 9b061e332 (1.5-dev9). When front + # connection information (ip, port) are logged in TCP mode and the log is + # emitted at the end of the connection (eg: because %B or any log tag + # requiring LW_BYTES is set), the log is emitted after the connection is + # closed, so the address and ports cannot be retrieved anymore. + # + # It could be argued that we'd make a special case of these to immediately + # retrieve the source and destination addresses from the connection, but it + # seems cleaner to simply pin the front connection, marking it "tracked" by + # adding the LW_XPRT flag to mention that we'll need some of these elements + # at the last moment. Only LW_FRTIP and LW_CLIP are affected. Note that after + # this change, LW_FRTIP could simply be removed as it's not used anywhere. + + # Note that the problem doesn't happen when using %[src] or %[dst] since + # all sample expressions set LW_XPRT. + + + server s1 { + rxreq + txresp + } -start + + syslog Slg_1 -level notice { + recv + recv + recv info + expect ~ \"dip\":\"${h1_fe_1_addr}\",\"dport\":\"${h1_fe_1_port}.*\"ts\":\"cD\",\" + } -start + + haproxy h1 -conf { + global + log ${Slg_1_addr}:${Slg_1_port} local0 + + defaults + log global + timeout connect 3000 + timeout client 5 + timeout server 10000 + + frontend fe1 + bind "fd@${fe_1}" + mode tcp + log-format {\"dip\":\"%fi\",\"dport\":\"%fp\",\"c_ip\":\"%ci\",\"c_port\":\"%cp\",\"fe_name\":\"%ft\",\"be_name\":\"%b\",\"s_name\":\"%s\",\"ts\":\"%ts\",\"bytes_read\":\"%B\"} + default_backend be_app + + backend be_app + server app1 ${s1_addr}:${s1_port} check + } -start + + client c1 -connect ${h1_fe_1_sock} { + txreq -url "/" + delay 0.02 + } -run + + syslog Slg_1 -wait + + +Here is the output produced by varnishtest with the latter VTC file: + + **** top 0.0 extmacro def pwd=/home/fred/src/haproxy + **** top 0.0 extmacro def localhost=127.0.0.1 + **** top 0.0 extmacro def bad_backend=127.0.0.1 40386 + **** top 0.0 extmacro def bad_ip=192.0.2.255 + **** top 0.0 macro def testdir=//home/fred/src/varnish-cache-haproxy + **** top 0.0 macro def tmpdir=/tmp/vtc.15752.560ca66b + * top 0.0 TEST /home/fred/src/varnish-cache-haproxy/d02286d.vtc starting + ** top 0.0 === varnishtest "HAPEE bug 2788" + * top 0.0 TEST HAPEE bug 2788 + ** top 0.0 === feature ignore_unknown_macro + ** top 0.0 === server s1 { + ** s1 0.0 Starting server + **** s1 0.0 macro def s1_addr=127.0.0.1 + **** s1 0.0 macro def s1_port=35564 + **** s1 0.0 macro def s1_sock=127.0.0.1 35564 + * s1 0.0 Listen on 127.0.0.1 35564 + ** top 0.0 === syslog Slg_1 -level notice { + ** Slg_1 0.0 Starting syslog server + ** s1 0.0 Started on 127.0.0.1 35564 + **** Slg_1 0.0 macro def Slg_1_addr=127.0.0.1 + **** Slg_1 0.0 macro def Slg_1_port=33012 + **** Slg_1 0.0 macro def Slg_1_sock=127.0.0.1 33012 + * Slg_1 0.0 Bound on 127.0.0.1 33012 + ** top 0.0 === haproxy h1 -conf { + ** Slg_1 0.0 Started on 127.0.0.1 33012 (level: 5) + ** Slg_1 0.0 === recv + **** h1 0.0 macro def h1_fe_1_sock=::1 51782 + **** h1 0.0 macro def h1_fe_1_addr=::1 + **** h1 0.0 macro def h1_fe_1_port=51782 + **** h1 0.0 setenv(fe_1, 7) + **** h1 0.0 conf| global + **** h1 0.0 conf|\tstats socket /tmp/vtc.15752.560ca66b/h1/stats.sock level admin mode 600 + **** h1 0.0 conf| + **** h1 0.0 conf| global + **** h1 0.0 conf| log 127.0.0.1:33012 local0 + **** h1 0.0 conf| + **** h1 0.0 conf| defaults + **** h1 0.0 conf| log global + **** h1 0.0 conf| timeout connect 3000 + **** h1 0.0 conf| timeout client 5 + **** h1 0.0 conf| timeout server 10000 + **** h1 0.0 conf| + **** h1 0.0 conf| frontend fe1 + **** h1 0.0 conf| bind "fd@${fe_1}" + **** h1 0.0 conf| mode tcp + **** h1 0.0 conf| log-format {\"dip\":\"%fi\",\"dport\":\"%fp\",\"c_ip\":\"%ci\",\"c_port\":\"%cp\",\"fe_name\":\"%ft\",\"be_name\":\"%b\",\"s_name\":\"%s\",\"ts\":\"%ts\",\"bytes_read\":\"%B\"} + **** h1 0.0 conf| default_backend be_app + **** h1 0.0 conf| + **** h1 0.0 conf| backend be_app + **** h1 0.0 conf| server app1 127.0.0.1:35564 check + ** h1 0.0 haproxy_start + **** h1 0.0 opt_worker 0 opt_daemon 0 opt_check_mode 0 + **** h1 0.0 argv|exec /home/fred/src/haproxy/haproxy -d -f /tmp/vtc.15752.560ca66b/h1/cfg + **** h1 0.0 XXX 9 @277 + *** h1 0.0 PID: 15787 + **** h1 0.0 macro def h1_pid=15787 + **** h1 0.0 macro def h1_name=/tmp/vtc.15752.560ca66b/h1 + ** top 0.0 === client c1 -connect ${h1_fe_1_sock} { + ** c1 0.0 Starting client + ** c1 0.0 Waiting for client + *** c1 0.0 Connect to ::1 51782 + *** c1 0.0 connected fd 8 from ::1 46962 to ::1 51782 + ** c1 0.0 === txreq -url "/" + **** c1 0.0 txreq|GET / HTTP/1.1\r + **** c1 0.0 txreq|Host: 127.0.0.1\r + **** c1 0.0 txreq|\r + ** c1 0.0 === delay 0.02 + *** c1 0.0 delaying 0.02 second(s) + *** h1 0.0 debug|Note: setting global.maxconn to 2000. + *** h1 0.0 debug|Available polling systems : + *** h1 0.0 debug| epoll : + *** h1 0.0 debug|pref=300, + *** h1 0.0 debug| test result OK + *** h1 0.0 debug| poll : pref=200, test result OK + *** h1 0.0 debug| select : + *** h1 0.0 debug|pref=150, test result FAILED + *** h1 0.0 debug|Total: 3 (2 usable), will use epoll. + *** h1 0.0 debug| + *** h1 0.0 debug|Available filters : + *** h1 0.0 debug|\t[SPOE] spoe + *** h1 0.0 debug|\t[COMP] compression + *** h1 0.0 debug|\t[TRACE] trace + **** Slg_1 0.0 syslog|<133>Jun 7 14:12:51 haproxy[15787]: Proxy fe1 started. + ** Slg_1 0.0 === recv + **** Slg_1 0.0 syslog|<133>Jun 7 14:12:51 haproxy[15787]: Proxy be_app started. + ** Slg_1 0.0 === recv info + *** h1 0.0 debug|00000000:fe1.accept(0007)=000a from [::1:46962] + *** s1 0.0 accepted fd 6 127.0.0.1 56770 + ** s1 0.0 === rxreq + **** s1 0.0 rxhdr|GET / HTTP/1.1\r + **** s1 0.0 rxhdr|Host: 127.0.0.1\r + **** s1 0.0 rxhdr|\r + **** s1 0.0 rxhdrlen = 35 + **** s1 0.0 http[ 0] |GET + **** s1 0.0 http[ 1] |/ + **** s1 0.0 http[ 2] |HTTP/1.1 + **** s1 0.0 http[ 3] |Host: 127.0.0.1 + **** s1 0.0 bodylen = 0 + ** s1 0.0 === txresp + **** s1 0.0 txresp|HTTP/1.1 200 OK\r + **** s1 0.0 txresp|Content-Length: 0\r + **** s1 0.0 txresp|\r + *** s1 0.0 shutting fd 6 + ** s1 0.0 Ending + *** h1 0.0 debug|00000000:be_app.srvcls[000a:000c] + *** h1 0.0 debug|00000000:be_app.clicls[000a:000c] + *** h1 0.0 debug|00000000:be_app.closed[000a:000c] + **** Slg_1 0.0 syslog|<134>Jun 7 14:12:51 haproxy[15787]: {"dip":"","dport":"0","c_ip":"::1","c_port":"46962","fe_name":"fe1","be_name":"be_app","s_name":"app1","ts":"cD","bytes_read":"38"} + ** Slg_1 0.0 === expect ~ \"dip\":\"${h1_fe_1_addr}\",\"dport\":\"${h1_fe_1_p... + ---- Slg_1 0.0 EXPECT FAILED ~ "\"dip\":\"::1\",\"dport\":\"51782.*\"ts\":\"cD\",\"" + *** c1 0.0 closing fd 8 + ** c1 0.0 Ending + * top 0.0 RESETTING after /home/fred/src/varnish-cache-haproxy/d02286d.vtc + ** h1 0.0 Reset and free h1 haproxy 15787 + ** h1 0.0 Wait + ** h1 0.0 Stop HAProxy pid=15787 + **** h1 0.0 Kill(2)=0: Success + **** h1 0.0 STDOUT poll 0x10 + ** h1 0.1 WAIT4 pid=15787 status=0x0002 (user 0.000000 sys 0.004000) + ** s1 0.1 Waiting for server (4/-1) + ** Slg_1 0.1 Waiting for syslog server (5) + * top 0.1 TEST /home/fred/src/varnish-cache-haproxy/d02286d.vtc FAILED + # top TEST /home/fred/src/varnish-cache-haproxy/d02286d.vtc FAILED (0.131) exit=2 + +This test does not pass without the bug fix of d02286d commit. Indeed the third syslog +message received by 'Slg_1' syslog server does not match the regular expression +of the "syslog" "expect" command: + + expect ~ \"dip\":\"${h1_fe_1_addr}\",\"dport\":\"${h1_fe_1_port}.*\"ts\":\"cD\",\" + +(the IP address and port are missing), contrary to what happens with the correct bug fix: + + **** top 0.0 extmacro def pwd=/home/fred/src/haproxy + **** top 0.0 extmacro def localhost=127.0.0.1 + **** top 0.0 extmacro def bad_backend=127.0.0.1 37284 + **** top 0.0 extmacro def bad_ip=192.0.2.255 + **** top 0.0 macro def testdir=//home/fred/src/varnish-cache-haproxy + **** top 0.0 macro def tmpdir=/tmp/vtc.12696.186b28b0 + * top 0.0 TEST /home/fred/src/varnish-cache-haproxy/d02286d.vtc starting + ** top 0.0 === varnishtest "HAPEE bug 2788" + * top 0.0 TEST HAPEE bug 2788 + ** top 0.0 === feature ignore_unknown_macro + ** top 0.0 === server s1 { + ** s1 0.0 Starting server + **** s1 0.0 macro def s1_addr=127.0.0.1 + **** s1 0.0 macro def s1_port=53384 + **** s1 0.0 macro def s1_sock=127.0.0.1 53384 + * s1 0.0 Listen on 127.0.0.1 53384 + ** top 0.0 === syslog Slg_1 -level notice { + ** Slg_1 0.0 Starting syslog server + **** Slg_1 0.0 macro def Slg_1_addr=127.0.0.1 + ** s1 0.0 Started on 127.0.0.1 53384 + **** Slg_1 0.0 macro def Slg_1_port=36195 + **** Slg_1 0.0 macro def Slg_1_sock=127.0.0.1 36195 + * Slg_1 0.0 Bound on 127.0.0.1 36195 + ** top 0.0 === haproxy h1 -conf { + ** Slg_1 0.0 Started on 127.0.0.1 36195 (level: 5) + ** Slg_1 0.0 === recv + **** h1 0.0 macro def h1_fe_1_sock=::1 39264 + **** h1 0.0 macro def h1_fe_1_addr=::1 + **** h1 0.0 macro def h1_fe_1_port=39264 + **** h1 0.0 setenv(fe_1, 7) + **** h1 0.0 conf| global + **** h1 0.0 conf|\tstats socket /tmp/vtc.12696.186b28b0/h1/stats.sock level admin mode 600 + **** h1 0.0 conf| + **** h1 0.0 conf| global + **** h1 0.0 conf| log 127.0.0.1:36195 local0 + **** h1 0.0 conf| + **** h1 0.0 conf| defaults + **** h1 0.0 conf| log global + **** h1 0.0 conf| timeout connect 3000 + **** h1 0.0 conf| timeout client 5 + **** h1 0.0 conf| timeout server 10000 + **** h1 0.0 conf| + **** h1 0.0 conf| frontend fe1 + **** h1 0.0 conf| bind "fd@${fe_1}" + **** h1 0.0 conf| mode tcp + **** h1 0.0 conf| log-format {\"dip\":\"%fi\",\"dport\":\"%fp\",\"c_ip\":\"%ci\",\"c_port\":\"%cp\",\"fe_name\":\"%ft\",\"be_name\":\"%b\",\"s_name\":\"%s\",\"ts\":\"%ts\",\"bytes_read\":\"%B\"} + **** h1 0.0 conf| default_backend be_app + **** h1 0.0 conf| + **** h1 0.0 conf| backend be_app + **** h1 0.0 conf| server app1 127.0.0.1:53384 check + ** h1 0.0 haproxy_start + **** h1 0.0 opt_worker 0 opt_daemon 0 opt_check_mode 0 + **** h1 0.0 argv|exec /home/fred/src/haproxy/haproxy -d -f /tmp/vtc.12696.186b28b0/h1/cfg + **** h1 0.0 XXX 9 @277 + *** h1 0.0 PID: 12728 + **** h1 0.0 macro def h1_pid=12728 + **** h1 0.0 macro def h1_name=/tmp/vtc.12696.186b28b0/h1 + ** top 0.0 === client c1 -connect ${h1_fe_1_sock} { + ** c1 0.0 Starting client + ** c1 0.0 Waiting for client + *** c1 0.0 Connect to ::1 39264 + *** c1 0.0 connected fd 8 from ::1 41245 to ::1 39264 + ** c1 0.0 === txreq -url "/" + **** c1 0.0 txreq|GET / HTTP/1.1\r + **** c1 0.0 txreq|Host: 127.0.0.1\r + **** c1 0.0 txreq|\r + ** c1 0.0 === delay 0.02 + *** c1 0.0 delaying 0.02 second(s) + *** h1 0.0 debug|Note: setting global.maxconn to 2000. + *** h1 0.0 debug|Available polling systems : + *** h1 0.0 debug| epoll : pref=300, + *** h1 0.0 debug| test result OK + *** h1 0.0 debug| poll : pref=200, test result OK + *** h1 0.0 debug| select : pref=150, test result FAILED + *** h1 0.0 debug|Total: 3 (2 usable), will use epoll. + *** h1 0.0 debug| + *** h1 0.0 debug|Available filters : + *** h1 0.0 debug|\t[SPOE] spoe + *** h1 0.0 debug|\t[COMP] compression + *** h1 0.0 debug|\t[TRACE] trace + *** h1 0.0 debug|Using epoll() as the polling mechanism. + **** Slg_1 0.0 syslog|<133>Jun 7 14:10:18 haproxy[12728]: Proxy fe1 started. + ** Slg_1 0.0 === recv + **** Slg_1 0.0 syslog|<133>Jun 7 14:10:18 haproxy[12728]: Proxy be_app started. + ** Slg_1 0.0 === recv info + *** h1 0.0 debug|00000000:fe1.accept(0007)=000c from [::1:41245] ALPN=<none> + *** s1 0.0 accepted fd 6 127.0.0.1 49946 + ** s1 0.0 === rxreq + **** s1 0.0 rxhdr|GET / HTTP/1.1\r + **** s1 0.0 rxhdr|Host: 127.0.0.1\r + **** s1 0.0 rxhdr|\r + **** s1 0.0 rxhdrlen = 35 + **** s1 0.0 http[ 0] |GET + **** s1 0.0 http[ 1] |/ + **** s1 0.0 http[ 2] |HTTP/1.1 + **** s1 0.0 http[ 3] |Host: 127.0.0.1 + **** s1 0.0 bodylen = 0 + ** s1 0.0 === txresp + **** s1 0.0 txresp|HTTP/1.1 200 OK\r + **** s1 0.0 txresp|Content-Length: 0\r + **** s1 0.0 txresp|\r + *** s1 0.0 shutting fd 6 + ** s1 0.0 Ending + *** h1 0.0 debug|00000000:be_app.srvcls[000c:adfd] + *** h1 0.0 debug|00000000:be_app.clicls[000c:adfd] + *** h1 0.0 debug|00000000:be_app.closed[000c:adfd] + **** Slg_1 0.0 syslog|<134>Jun 7 14:10:18 haproxy[12728]: {"dip":"::1","dport":"39264","c_ip":"::1","c_port":"41245","fe_name":"fe1","be_name":"be_app","s_name":"app1","ts":"cD","bytes_read":"38"} + ** Slg_1 0.0 === expect ~ \"dip\":\"${h1_fe_1_addr}\",\"dport\":\"${h1_fe_1_p... + **** Slg_1 0.0 EXPECT MATCH ~ "\"dip\":\"::1\",\"dport\":\"39264.*\"ts\":\"cD\",\"" + *** Slg_1 0.0 shutting fd 5 + ** Slg_1 0.0 Ending + *** c1 0.0 closing fd 8 + ** c1 0.0 Ending + ** top 0.0 === syslog Slg_1 -wait + ** Slg_1 0.0 Waiting for syslog server (-1) + * top 0.0 RESETTING after /home/fred/src/varnish-cache-haproxy/d02286d.vtc + ** h1 0.0 Reset and free h1 haproxy 12728 + ** h1 0.0 Wait + ** h1 0.0 Stop HAProxy pid=12728 + **** h1 0.0 Kill(2)=0: Success + **** h1 0.0 STDOUT poll 0x10 + ** h1 0.1 WAIT4 pid=12728 status=0x0002 (user 0.000000 sys 0.004000) + ** s1 0.1 Waiting for server (4/-1) + * top 0.1 TEST /home/fred/src/varnish-cache-haproxy/d02286d.vtc completed + # top TEST /home/fred/src/varnish-cache-haproxy/d02286d.vtc passed (0.128) + +In this latter execution the third syslog message is correct: + + **** Slg_1 0.0 syslog|<134>Jun 7 14:10:18 haproxy[12728]: {"dip":"::1","dport":"39264","c_ip":"::1","c_port":"41245","fe_name":"fe1","be_name":"be_app","s_name":"app1","ts":"cD","bytes_read":"38"} diff --git a/doc/seamless_reload.txt b/doc/seamless_reload.txt new file mode 100644 index 0000000..94df1bd --- /dev/null +++ b/doc/seamless_reload.txt @@ -0,0 +1,62 @@ +Reloading HAProxy without impacting server states +================================================= + +Of course, to fully understand below information please consult +doc/configuration.txt to understand how each HAProxy directive works. + +In the mean line, we update HAProxy's configuration to tell it where to +retrieve the last know trustable servers state. +Then, before reloading HAProxy, we simply dump servers state from running +process into the locations we pointed into the configuration. +And voilĂ :) + + +Using one file for all backends +------------------------------- + +HAProxy configuration +********************* + + global + [...] + stats socket /var/run/haproxy/socket + server-state-file global + server-state-base /var/state/haproxy/ + + defaults + [...] + load-server-state-from-file global + +HAProxy init script +******************* + +Run the following command BEFORE reloading: + + socat /var/run/haproxy/socket - <<< "show servers state" > /var/state/haproxy/global + + +Using one state file per backend +-------------------------------- + +HAProxy configuration +********************* + + global + [...] + stats socket /var/run/haproxy/socket + server-state-base /var/state/haproxy/ + + defaults + [...] + load-server-state-from-file local + +HAProxy init script +******************* + +Run the following command BEFORE reloading: + + for b in $(socat /var/run/haproxy/socket - <<< "show backend" | fgrep -v '#') + do + socat /var/run/haproxy/socket - <<< "show servers state $b" > /var/state/haproxy/$b + done + |