Adding upstream version 2.6.12. (refs: upstream/2.6.12, upstream)

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-28 09:35:11 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-28 09:35:11 +0000
commit: da76459dc21b5af2449af2d36eb95226cb186ce2
tree: 542ebb3c1e796fac2742495b8437331727bbbfa0 /doc/internals/api
parent: Initial commit.
10 files changed, 4266 insertions, 0 deletions
diff --git a/doc/internals/api/appctx.txt b/doc/internals/api/appctx.txt
new file mode 100644
index 0000000..137ec7b
--- /dev/null
+++ b/doc/internals/api/appctx.txt
@@ -0,0 +1,142 @@
+Instantiation of applet contexts (appctx) in 2.6.
+
+
+1. Background
+
+Most applets are in fact simplified services that are called by the CLI when a
+registered keyword is matched. Some of them only have a ->parse() function
+which immediately returns with a final result, while others will return zero
+asking for the ->io_handler() one to be called till the end. For these ones, a
+context is generally needed between calls to know where to restart from.
+
+Other applets are completely autonomous applets with their init function and
+an I/O handler, and these ones also need a persistent context between calls to
+the I/O handler. These ones are typically instantiated by "use-service" or by
+other means.
+
+Originally a few integers were provided to keep a trivial state (st0, st1, st2)
+and these ones progressively proved insufficient, leading to a "ctx.cli" sub-
+context that was allowed to use extra fields of various types. Other applets
+preferred to use their own context definition.
+
+All this resulted in appctx->ctx containing a myriad of definitions of various
+service contexts. Some services lazily reused other services' definitions,
+while others were extended to use their own definition after having run for a
+long time on the generic types. Some of these overlaps went unnoticed, so that
+distinct fields mistakenly shared the same storage locations. A massive
+cleanup was needed.
+
+
+2. New approach in 2.6
+
+In 2.6, there's an "svcctx" pointer that's initialized to NULL before any
+instantiation of an applet or of a CLI keyword's function. Applets and keyword
+handlers are free to make it point wherever they want, and to find it unaltered
+between subsequent calls, including up to the ->release() call. The "st2" state
+that was totally abused with random enums is not used anymore and was marked as
+deprecated. It's still initialized to zero before the first call though.
+
+One special area, "svc.storage[]", is large enough to contain any of the
+contexts that used to be present under "appctx->ctx". The "svcctx" may be set
+to point to this area so that a small structure can be allocated for free and
+without requiring error checking. In order to make this easier, a dedicated
+function is provided: "applet_reserve_svcctx()". This function will
+require the caller to indicate how large an area it needs, and will return a
+pointer to this area after checking that it fits. If it does not, haproxy will
+crash. This is purposely done so that it's known during development that if a
+small structure doesn't fit, a different approach is required.
+
+As such, for the vast majority of commands, the process is the following one:
+
+  struct foo_ctx {
+     int myfield1;
+     int myfield2;
+     char *myfield3;
+  };
+
+  int io_handler(struct appctx *appctx)
+  {
+      struct foo_ctx *ctx = applet_reserve_svcctx(appctx, sizeof(*ctx));
+
+      if (!ctx->myfield1) {
+          /* first call */
+          ctx->myfield1++;
+      }
+      ...
+  }
+
+The pointer may be directly accessed from the I/O handler if it's known that it
+was already reserved by the init handler or parsing function. Otherwise it's
+guaranteed to be NULL so that can also serve as a test for a first call:
+
+  int parse_handler(struct appctx *appctx)
+  {
+      struct foo_ctx *ctx = applet_reserve_svcctx(appctx, sizeof(*ctx));
+      ctx->myfield1 = 12;
+      return 0;
+  }
+
+  int io_handler(struct appctx *appctx)
+  {
+      struct foo_ctx *ctx = appctx->svcctx;
+
+      for (; ctx->myfield1; ctx->myfield1--) {
+          do_something();
+      }
+      ...
+  }
+
+There is no need to free anything because that space is not allocated but just
+points to a reserved area.
+
+If the area is too small (its size is APPLET_MAX_SVCCTX bytes), svcctx should
+instead point to a dynamically allocated structure (pool, malloc, etc). E.g.:
+
+  int io_handler(struct appctx *appctx)
+  {
+      struct foo_ctx *ctx = appctx->svcctx;
+
+      if (!ctx) {
+          /* first call: allocate the context and remember it in svcctx */
+          ctx = appctx->svcctx = pool_alloc(pool_foo_ctx);
+          if (!ctx)
+              return 1;
+      }
+      ...
+  }
+
+  void io_release(struct appctx *appctx)
+  {
+      pool_free(pool_foo_ctx, appctx->svcctx);
+  }
+
+The CLI code itself uses this mechanism for the cli_print_*() functions. Since
+these functions are terminal (i.e. not meant to be used in the middle of an I/O
+handler as they share the same contextual space), they always reset the svcctx
+pointer to make it point to the "cli_print_ctx" mapped in ->svc.storage.
+
+
+3. Transition for old code
+
+A lot of care was taken to make the transition as smooth as possible for
+out-of-tree code since that's an API change. A dummy "ctx.cli" struct still
+exists in the appctx struct, and it happens to map perfectly to the one set by
+cli_print_*, so that if some code uses a mix of both, it will still work.
+However, it will build with "deprecated" warnings that help spot the remaining
+places. A good exercise is to rename "ctx.cli" in "appctx" and see if the code
+still compiles.
+
+Regarding the "st2" sub-state, it will disappear as well after 2.6, but is
+still provided and initialized so that code relying on it will still work even
+if it builds with deprecation warnings. The correct approach is to move this
+state into the newly defined applet's context, and to stop using the stats
+enums STAT_ST_* that often barely match the needs and result in code that is
+more complicated than desired (the STAT_ST_* enum values have also been marked
+as deprecated).
+
+The code dealing with "show fd", "show sess" and the peers applet show good
+examples of how to convert a registered keyword or an applet.
+
+All this transition code requires complex layouts that will be removed during
+2.7-dev so there is no other long-term option but to update the code (or better
+get it merged if it can be useful to other users).
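
The reserve-then-reuse pattern described in appctx.txt above can be modelled outside of HAProxy with a small standalone sketch. Everything prefixed with "mock_" or "MOCK_" below is hypothetical and only imitates the documented behaviour: a zeroed appctx, an svcctx pointer that is NULL until a handler sets it, and a reserve function that refuses oversized requests. The 64-byte size, the abort() on overflow and the handler return values (0 meaning "call again", 1 meaning "done") are assumptions of this sketch, not values taken from the HAProxy sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size: the real APPLET_MAX_SVCCTX value is not given above. */
#define MOCK_MAX_SVCCTX 64

/* Simplified stand-in for struct appctx, limited to the fields discussed. */
struct mock_appctx {
    void *svcctx;                       /* NULL until a handler sets it */
    union {
        char storage[MOCK_MAX_SVCCTX];  /* the reserved area */
        long long align_ll;             /* only here to force alignment */
        void *align_ptr;
    } svc;
};

/* Per-command context living inside the reserved area. */
struct foo_ctx {
    int remaining;                      /* iterations left to perform */
};

/* Imitates the documented contract: check the size, then make svcctx
 * point to the reserved area. Nothing is allocated, so nothing is freed.
 */
static void *mock_reserve_svcctx(struct mock_appctx *appctx, size_t size)
{
    if (size > MOCK_MAX_SVCCTX)
        abort();                        /* assumed stand-in for the crash */
    appctx->svcctx = appctx->svc.storage;
    return appctx->svcctx;
}

/* Parser: reserves the context and stores the initial state. */
static int mock_parse(struct mock_appctx *appctx)
{
    struct foo_ctx *ctx = mock_reserve_svcctx(appctx, sizeof(*ctx));

    ctx->remaining = 3;
    return 0;                           /* ask for the I/O handler */
}

/* I/O handler: the context was reserved by the parser, so the pointer
 * can be used directly. Returns 1 once all the work is done.
 */
static int mock_io_handler(struct mock_appctx *appctx)
{
    struct foo_ctx *ctx = appctx->svcctx;

    if (ctx->remaining > 0)
        ctx->remaining--;
    return ctx->remaining == 0;
}

/* Drives both handlers and returns the number of I/O handler calls. */
static int mock_run(void)
{
    struct mock_appctx appctx;
    int calls = 0;

    memset(&appctx, 0, sizeof(appctx)); /* svcctx starts as NULL */
    mock_parse(&appctx);
    do {
        calls++;
    } while (!mock_io_handler(&appctx));

    /* the context still points into the reserved area at the end */
    assert(appctx.svcctx == (void *)appctx.svc.storage);
    return calls;
}
```

In this model the state written by the parser survives across the successive I/O handler calls because it lives inside the appctx itself, which is also why no release function is needed.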
diff --git a/doc/internals/api/buffer-api.txt b/doc/internals/api/buffer-api.txt
new file mode 100644
index 0000000..ac35300
--- /dev/null
+++ b/doc/internals/api/buffer-api.txt
@@ -0,0 +1,653 @@
+2018-07-13 - HAProxy Internal Buffer API
+
+
+1. Background
+
+HAProxy uses a "struct buffer" internally to store data received from external
+agents, as well as data to be sent to external agents. These buffers are also
+used during data transformation such as compression, header insertion or
+defragmentation, and are used to carry intermediary representations between the
+various internal layers. They support wrapping at the end, and they carry their
+own size information so that in theory it would be possible to use different
+buffer sizes in parallel even though this is not currently implemented.
+
+The format of this structure has evolved over time, to reach a point where it
+is convenient and versatile enough to have allowed several internal types to
+converge into a single one (specifically the struct chunk disappeared).
+
+
+2. Representation as of 1.9-dev1
+
+The current buffer representation consists of a linear storage area of known
+size, with a head position indicating the oldest data, and a total data count
+expressed in bytes. The head position, data count and size are expressed as
+integers and are positive or zero. By convention, the head position is strictly
+smaller than the buffer size and the data count is smaller than or equal to the
+size, so that wrapping can be resolved with a single subtract. A buffer not
+respecting these rules is said to be degenerate. Unless specified otherwise,
+the behaviour of the various API functions is undefined when they are passed
+such a degenerate buffer.
+
+ Buffer declaration :
+
+    struct buffer {
+        size_t size;     // size of the storage area (wrapping point)
+        char  *area;     // start of the storage area
+        size_t data;     // contents length after head
+        size_t head;     // start offset of remaining data relative to area
+    };
+
+
+ Linear buffer representation :
+
+    area
+      |
+      V<--------------------------------------------------------->| size
+      +-----------+---------------------------------+-------------+
+      |           |/////////////////////////////////|             |
+      +-----------+---------------------------------+-------------+
+      |<--------->|<------------------------------->|
+           head                 data                ^
+                                                    |
+                                                   tail
+
+
+ Wrapping buffer representation :
+
+    area
+      |
+      V<--------------------------------------------------------->| size
+      +---------------+------------------------+------------------+
+      |///////////////|                        |//////////////////|
+      +---------------+------------------------+------------------+
+      |<-------------------------------------->| head
+      |-------------->| ...data         data...|<-----------------|
+                      ^
+                      |
+                     tail
+
+
+3. Terminology
+
+Manipulating a buffer just based on a head and a wrapping data count is not
+very convenient, so we define a certain number of terms for important elements
+characterizing a buffer :
+
+   - origin    : pointer to relative position 0 in the storage area. Undefined
+                 when the buffer is not allocated.
+
+   - size      : the allocated size of the storage area starting at the origin,
+                 expressed in bytes. A buffer whose size is zero is said not to
+                 be allocated, and its origin in this case is undefined.
+
+   - data      : the amount of data the buffer contains, in bytes. It is always
+                 lower than or equal to the buffer's size, hence it is always 0
+                 for an unallocated buffer.
+
+   - emptiness : a buffer is said to be empty when it contains no data, hence
+                 data == 0. It is possible for such buffers not to be allocated
+                 and to have size == 0 as well.
+
+   - room      : the available space in the buffer. This is its size minus data.
+
+   - head      : position relative to origin where the oldest data byte is found
+                 (it typically is what send() uses to pick outgoing data). The
+                 head is strictly smaller than the size.
+
+   - tail      : position relative to origin where the first spare byte is found
+                 (it typically is what recv() uses to store incoming data). It
+                 is always equal to the buffer's data added to its head modulo
+                 the buffer's size.
+
+   - wrapping  : the byte following the last one of the storage area loops back
+                 to position 0. This is called wrapping. The wrapping point is
+                 the first position relative to origin which doesn't belong to
+                 the storage area. There is no wrapping when a buffer is not
+                 allocated. Wrapping requires special care and means that the
+                 regular string manipulation functions are not usable on most
+                 buffers, unless it is known that no wrapping happens. Free
+                 space may wrap as well if the buffer only contains data in the
+                 middle.
+
+   - alignment : a buffer is said to be aligned if its data do not wrap. That
+                 is, its head is strictly before the tail, or the buffer is
+                 empty and the head is zero. Aligning a buffer may be required
+                 to use regular string manipulation functions which have no
+                 support for wrapping.
+
+
+A buffer may be in three different states :
+   - unallocated : size == 0, area == 0 (b_is_null() is true)
+   - waiting     : size == 0, area != 0
+   - allocated   : size  > 0, area != 0
+
+It is not permitted to have area == 0 with a non-zero size. In addition, the
+waiting state may also be used to indicate a read-only buffer which does not
+wrap and which must not be freed (e.g. for use with error messages).
+
+The basic API only covers allocated buffers. Switching to/from the other states
+is covered by the management API since it requires specific allocation and free
+calls.
+
+
+4. Using buffers
+
+Buffers are defined in a few files :
+   - include/common/buf.h    : structure definition, and manipulation functions
+   - include/common/buffer.h : resource management (alloc/free/wait lists)
+   - include/common/istbuf.h : advanced string manipulation
+
+
+4.1. Basic API
+
+The basic API is made of the functions which abstract accesses to the buffers
+and which help calculating their state, free space or used space.
+
+====================+==================+=======================================
+Function            | Arguments/Return | Description
+--------------------+------------------+---------------------------------------
+b_is_null()         | const buffer *buf| returns true if (and only if) the
+                    | ret: int         | buffer is not yet allocated and thus
+                    |                  | points to a NULL area
+--------------------+------------------+---------------------------------------
+b_orig()            | const buffer *buf| returns the pointer to the origin of
+                    | ret: char *      | the storage, which is the location of
+                    |                  | byte at offset zero. This is mostly
+                    |                  | used by functions which handle the
+                    |                  | wrapping by themselves
+--------------------+------------------+---------------------------------------
+b_size()            | const buffer *buf| returns the size of the buffer
+                    | ret: size_t      |
+--------------------+------------------+---------------------------------------
+b_wrap()            | const buffer *buf| returns the pointer to the wrapping
+                    | ret: char *      | position of the buffer area, which is
+                    |                  | by definition the first byte not part
+                    |                  | of the buffer
+--------------------+------------------+---------------------------------------
+b_data()            | const buffer *buf| returns the number of bytes present in
+                    | ret: size_t      | the buffer
+--------------------+------------------+---------------------------------------
+b_room()            | const buffer *buf| returns the amount of room left in the
+                    | ret: size_t      | buffer
+--------------------+------------------+---------------------------------------
+b_full()            | const buffer *buf| returns true if the buffer is full
+                    | ret: int         |
+--------------------+------------------+---------------------------------------
+__b_stop()          | const buffer *buf| returns a pointer to the byte
+                    | ret: char *      | following the end of the buffer, which
+                    |                  | may be out of the buffer if the buffer
+                    |                  | ends on the last byte of the area. It
+                    |                  | is the caller's responsibility to
+                    |                  | either know that the buffer does not
+                    |                  | wrap or to check that the result does
+                    |                  | not wrap
+--------------------+------------------+---------------------------------------
+__b_stop_ofs()      | const buffer *buf| returns an origin-relative offset
+                    | ret: size_t      | pointing to the byte following the end
+                    |                  | of the buffer, which may be out of the
+                    |                  | buffer if the buffer ends on the last
+                    |                  | byte of the area. It's the caller's
+                    |                  | responsibility to either know that the
+                    |                  | buffer does not wrap or to check that
+                    |                  | the result does not wrap
+--------------------+------------------+---------------------------------------
+b_stop()            | const buffer *buf| returns the pointer to the byte
+                    | ret: char *      | following the end of the buffer, which
+                    |                  | may be out of the buffer if the buffer
+                    |                  | ends on the last byte of the area
+--------------------+------------------+---------------------------------------
+b_stop_ofs()        | const buffer *buf| returns an origin-relative offset
+                    | ret: size_t      | pointing to the byte following the end
+                    |                  | of the buffer, which may be out of the
+                    |                  | buffer if the buffer ends on the last
+                    |                  | byte of the area
+--------------------+------------------+---------------------------------------
+__b_peek()          | const buffer *buf| returns a pointer to the data at
+                    | size_t ofs       | position <ofs> relative to the head of
+                    | ret: char *      | the buffer. Will typically point to
+                    |                  | input data if called with the amount
+                    |                  | of output data. It's the caller's
+                    |                  | responsibility to either know that the
+                    |                  | buffer does not wrap or to check that
+                    |                  | the result does not wrap
+--------------------+------------------+---------------------------------------
+__b_peek_ofs()      | const buffer *buf| returns an origin-relative offset
+                    | size_t ofs       | pointing to the data at position <ofs>
+                    | ret: size_t      | relative to the head of the
+                    |                  | buffer. Will typically point to input
+                    |                  | data if called with the amount of
+                    |                  | output data. It's the caller's
+                    |                  | responsibility to either know that the
+                    |                  | buffer does not wrap or to check that
+                    |                  | the result does not wrap
+--------------------+------------------+---------------------------------------
+b_peek()            | const buffer *buf| returns a pointer to the data at
+                    | size_t ofs       | position <ofs> relative to the head of
+                    | ret: char *      | the buffer. Will typically point to
+                    |                  | input data if called with the amount
+                    |                  | of output data. If applying <ofs> to
+                    |                  | the buffer's head results in a
+                    |                  | position between <size> and 2*<size>-1
+                    |                  | included, a wrapping compensation is
+                    |                  | applied to the result
+--------------------+------------------+---------------------------------------
+b_peek_ofs()        | const buffer *buf| returns an origin-relative offset
+                    | size_t ofs       | pointing to the data at position <ofs>
+                    | ret: size_t      | relative to the head of the
+                    |                  | buffer. Will typically point to input
+                    |                  | data if called with the amount of
+                    |                  | output data. If applying <ofs> to the
+                    |                  | buffer's head results in a position
+                    |                  | between <size> and 2*<size>-1
+                    |                  | included, a wrapping compensation is
+                    |                  | applied to the result
+--------------------+------------------+---------------------------------------
+__b_head()          | const buffer *buf| returns the pointer to the buffer's
+                    | ret: char *      | head, which is the location of the
+                    |                  | next byte to be dequeued. The result
+                    |                  | is undefined for unallocated buffers
+--------------------+------------------+---------------------------------------
+__b_head_ofs()      | const buffer *buf| returns an origin-relative offset
+                    | ret: size_t      | pointing to the buffer's head, which
+                    |                  | is the location of the next byte to be
+                    |                  | dequeued. The result is undefined for
+                    |                  | unallocated buffers
+--------------------+------------------+---------------------------------------
+b_head()            | const buffer *buf| returns the pointer to the buffer's
+                    | ret: char *      | head, which is the location of the
+                    |                  | next byte to be dequeued. The result
+                    |                  | is undefined for unallocated
+                    |                  | buffers. If applying <ofs> to the
+                    |                  | buffer's head results in a position
+                    |                  | between <size> and 2*<size>-1
+                    |                  | included, a wrapping compensation is
+                    |                  | applied to the result
+--------------------+------------------+---------------------------------------
+b_head_ofs()        | const buffer *buf| returns an origin-relative offset
+                    | ret: size_t      | pointing to the buffer's head, which
+                    |                  | is the location of the next byte to be
+                    |                  | dequeued. The result is undefined for
+                    |                  | unallocated buffers. If applying
+                    |                  | <ofs> to the buffer's head results in
+                    |                  | a position between <size> and
+                    |                  | 2*<size>-1 included, a wrapping
+                    |                  | compensation is applied to the result
+--------------------+------------------+---------------------------------------
+__b_tail()          | const buffer *buf| returns the pointer to the tail of the
+                    | ret: char *      | buffer, which is the location of the
+                    |                  | first byte where it is possible to
+                    |                  | enqueue new data. The result is
+                    |                  | undefined for unallocated buffers
+--------------------+------------------+---------------------------------------
+__b_tail_ofs()      | const buffer *buf| returns an origin-relative offset
+                    | ret: size_t      | pointing to the tail of the buffer,
+                    |                  | which is the location of the first
+                    |                  | byte where it is possible to enqueue
+                    |                  | new data. The result is undefined for
+                    |                  | unallocated buffers
+--------------------+------------------+---------------------------------------
+b_tail()            | const buffer *buf| returns the pointer to the tail of the
+                    | ret: char *      | buffer, which is the location of the
+                    |                  | first byte where it is possible to
+                    |                  | enqueue new data. The result is
+                    |                  | undefined for unallocated buffers
+--------------------+------------------+---------------------------------------
+b_tail_ofs()        | const buffer *buf| returns an origin-relative offset
+                    | ret: size_t      | pointing to the tail of the buffer,
+                    |                  | which is the location of the first
+                    |                  | byte where it is possible to enqueue
+                    |                  | new data. The result is undefined for
+                    |                  | unallocated buffers
+--------------------+------------------+---------------------------------------
+b_next()            | const buffer *buf| for an absolute pointer <p> pointing
+                    | const char *p    | to a valid location within buffer <b>,
+                    | ret: char *      | returns the absolute pointer to the
+                    |                  | next byte, which usually is at (p + 1)
+                    |                  | unless p reaches the wrapping point
+                    |                  | and wrapping is needed
+--------------------+------------------+---------------------------------------
+b_next_ofs()        | const buffer *buf| for an origin-relative offset <o>
+                    | size_t o         | pointing to a valid location within
+                    | ret: size_t      | buffer <b>, returns the
+                    |                  | relative offset pointing to the next
+                    |                  | byte, which usually is at (o + 1)
+                    |                  | unless o reaches the wrapping point
+                    |                  | and wrapping is needed
+--------------------+------------------+---------------------------------------
+b_dist()            | const buffer *buf| returns the distance between two
+                    | const char *from | pointers, taking into account the
+                    | const char *to   | ability to wrap around the buffer's
+                    | ret: size_t      | end. The operation is not defined if
+                    |                  | either of the pointers does not belong
+                    |                  | to the buffer or if their distance is
+                    |                  | greater than the buffer's size
+--------------------+------------------+---------------------------------------
+b_almost_full()     | const buffer *buf| returns 1 if the buffer uses at least
+                    | ret: int         | 3/4 of its capacity, otherwise
+                    |                  | zero. Buffers of size zero are
+                    |                  | considered full
+--------------------+------------------+---------------------------------------
+b_space_wraps()     | const buffer *buf| returns non-zero only if the buffer's
+                    | ret: int         | free space wraps, which means that the
+                    |                  | buffer contains data that are not
+                    |                  | touching at least one edge
+--------------------+------------------+---------------------------------------
+b_contig_data()     | const buffer *buf| returns the amount of data that can
+                    | size_t start     | contiguously be read at once starting
+                    | ret: size_t      | from a relative offset <start> (which
+                    |                  | makes it easy to pre-compute blocks
+                    |                  | for memcpy). The start point will
+                    |                  | typically contain the amount of past
+                    |                  | data already returned by a previous
+                    |                  | call to this function
+--------------------+------------------+---------------------------------------
+b_contig_space()    | const buffer *buf| returns the amount of bytes that can
+                    | ret: size_t      | be appended to the buffer at once
+--------------------+------------------+---------------------------------------
+b_getblk()          | const buffer *buf| gets one full block of data at once
+                    | char *blk        | from a buffer, starting from offset
+                    | size_t len       | <offset> after the buffer's head, and
+                    | size_t offset    | limited to no more than <len> bytes.
+                    | ret: size_t      | The caller is responsible for ensuring
+                    |                  | that neither <offset> nor <offset> +
+                    |                  | <len> exceed the total number of bytes
+                    |                  | available in the buffer. Returns zero
+                    |                  | if not enough data was available, in
+                    |                  | which case blk is left undefined, or
+                    |                  | the number of bytes read which is
+                    |                  | equal to the requested size
+--------------------+------------------+---------------------------------------
+b_getblk_nc()       | const buffer *buf| gets one or two blocks of data at once
+                    | const char **blk1| from a buffer, starting from offset
+                    | size_t *len1     | <ofs> after the beginning of its
+                    | const char **blk2| output, and limited to no more than
+                    | size_t *len2     | <max> bytes. The caller is responsible
+                    | size_t ofs       | for ensuring that neither <ofs> nor
+                    | size_t max       | <ofs>+<max> exceed the total number of
+                    | ret: int         | bytes available in the buffer. Returns
+                    |                  | 0 if not enough data were available,
+                    |                  | or the number of blocks filled (1 or
+                    |                  | 2). <blk1> is always filled before
+                    |                  | <blk2>. The unused blocks are left
+                    |                  | undefined, and the buffer is left
+                    |                  | unaffected. Unused buffers are left in
+                    |                  | an undefined state
+--------------------+------------------+---------------------------------------
+b_reset()           | buffer *buf      | resets a buffer. The size is not
+                    | ret: void        | touched. In practice it resets the
+                    |                  | head and the data length
+--------------------+------------------+---------------------------------------
+b_sub()             | buffer *buf      | decreases the buffer length by <count>
+                    | size_t count     | without touching the head position
+                    | ret: void        | (only the tail moves). This may mostly
+                    |                  | be used to trim pending data before
+                    |                  | reusing a buffer. The caller is
+                    |                  | responsible for not removing more than
+                    |                  | the available data
+--------------------+------------------+---------------------------------------
+b_add()             | buffer *buf      | increases the buffer length by <count>
+                    | size_t count     | without touching the head position
+                    | ret: void        | (only the tail moves). This is used
+                    |                  | when adding data at the tail of a
+                    |                  | buffer. The caller is responsible for
+                    |                  | not adding more than the available
+                    |                  | room
+--------------------+------------------+---------------------------------------
+b_set_data()        | buffer *buf      | sets the buffer's length, by adjusting
+                    | size_t len       | the buffer's tail only. The caller is
+                    | ret: void        | responsible for passing a valid length
+--------------------+------------------+---------------------------------------
+b_del()             | buffer *buf      | deletes <del> bytes at the head of
+                    | size_t del       | buffer <b> and updates the head. The
+                    | ret: void        | caller is responsible for not removing
+                    |                  | more than the available data. This is
+                    |                  | used after sending data from the
+                    |                  | buffer
+--------------------+------------------+---------------------------------------
+b_realign_if_empty()| buffer *buf      | realigns a buffer if it's empty, does
+                    | ret: void        | nothing otherwise. This is mostly used
+                    |                  | after b_del() to make an empty
+                    |                  | buffer's free space contiguous
+--------------------+------------------+---------------------------------------
+b_slow_realign()    | buffer *buf      | realigns a possibly wrapping buffer so
+                    | char *swap       | that the part remaining to be parsed
+                    | size_t output    | is contiguous and starts at the
+                    | ret: void        | beginning of the buffer and the
+                    |                  | already parsed output part ends at the
+                    |                  | end of the buffer. This provides the
+                    |                  | best conditions since it allows the
+                    |                  | largest inputs to be processed at once
+                    |                  | and ensures that once the output data
+                    |                  | leaves, the whole buffer is available
+                    |                  | at once. The number of output bytes
+                    |                  | supposedly present at the beginning of
+                    |                  | the buffer and which need to be moved
+                    |                  | to the end must be passed in <output>.
+                    |                  | It will effectively make this offset
+                    |                  | the new wrapping point. A temporary
+                    |                  | swap area at least as large as b->size
+                    |                  | must be provided in <swap>.  It's up
+                    |                  | to the caller to ensure <output> is no
+                    |                  | larger than the difference between the
+                    |                  | whole buffer's length and its input
+--------------------+------------------+---------------------------------------
+b_putchar()         | buffer *buf      | tries to append char <c> at the end of
+                    | char c           | buffer <b>. Supports wrapping. New
+                    | ret: void        | data are silently discarded if the
+                    |                  | buffer is already full
+--------------------+------------------+---------------------------------------
+b_putblk()          | buffer *buf      | tries to append block <blk> at the end
+                    | const char *blk  | of buffer <b>. Supports wrapping. Data
+                    | size_t len       | are truncated if the buffer is too
+                    | ret: size_t      | short or if not enough space is
+                    |                  | available. It returns the number of
+                    |                  | bytes really copied
+--------------------+------------------+---------------------------------------
+b_move()            | buffer *buf      | moves block (src,len) left or right
+                    | size_t src       | by <shift> bytes, supporting wrapping
+                    | size_t len       | and overlapping.
+                    | ssize_t shift    |
+                    | ret: void        |
+--------------------+------------------+---------------------------------------
+b_rep_blk()         | buffer *buf      | writes the block <blk> at position
+                    | char *pos        | <pos> which must be in buffer <b>, and
+                    | char *end        | moves the part between <end> and the
+                    | const char *blk  | buffer's tail just after the end of
+                    | size_t len       | the copy of <blk>. This effectively
+                    | ret: int         | replaces the part located between
+                    |                  | <pos> and <end> with a copy of <blk>
+                    |                  | of length <len>. The buffer's length
+                    |                  | is automatically updated. This is used
+                    |                  | to replace a block with another one
+                    |                  | inside a buffer. The shift value
+                    |                  | (positive or negative) is returned. If
+                    |                  | there's no space left, the move is not
+                    |                  | done. If <len> is null, the <blk>
+                    |                  | pointer is allowed to be null, in
+                    |                  | order to erase a block
+--------------------+------------------+---------------------------------------
+b_xfer()            | buffer *src      | transfers at most <count> bytes from
+                    | buffer *dst      | buffer <src> to buffer <dst> and
+                    | size_t count     | returns the number of bytes copied.
+                    | ret: size_t      | The bytes are removed from <src> and
+                    |                  | added to <dst>. The caller guarantees
+                    |                  | that <count> is <= b_room(dst)
+====================+==================+=======================================
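+
+The head/tail semantics of b_add(), b_sub() and b_del() can be illustrated with
+the following self-contained toy model. It is only a sketch: the "toy_buf"
+structure and the toy_*() helpers are hypothetical names made up for this
+example, they merely mimic the behaviour described in the table above and are
+not HAProxy's implementation.
+
+    #include <assert.h>
+    #include <stddef.h>
+
+    /* hypothetical, simplified buffer: <head> is the offset of the first
+     * byte of data, <data> is the amount of data present.
+     */
+    struct toy_buf {
+        size_t size;
+        size_t head;
+        size_t data;
+    };
+
+    /* like b_add(): only the tail moves forward */
+    static void toy_add(struct toy_buf *b, size_t count)
+    {
+        b->data += count;
+    }
+
+    /* like b_sub(): only the tail moves backward */
+    static void toy_sub(struct toy_buf *b, size_t count)
+    {
+        b->data -= count;
+    }
+
+    /* like b_del(): the head moves forward (and may wrap), the tail stays */
+    static void toy_del(struct toy_buf *b, size_t del)
+    {
+        b->data -= del;
+        b->head += del;
+        if (b->head >= b->size)
+            b->head -= b->size;
+    }
+
+    /* tail offset, for inspection only */
+    static size_t toy_tail(const struct toy_buf *b)
+    {
+        return (b->head + b->data) % b->size;
+    }
+
+    int main(void)
+    {
+        struct toy_buf b = { .size = 16, .head = 0, .data = 0 };
+
+        toy_add(&b, 10);           /* 10 bytes appended at the tail */
+        assert(b.head == 0 && b.data == 10 && toy_tail(&b) == 10);
+
+        toy_del(&b, 4);            /* 4 bytes sent from the head */
+        assert(b.head == 4 && b.data == 6 && toy_tail(&b) == 10);
+
+        toy_sub(&b, 2);            /* 2 pending bytes trimmed at the tail */
+        assert(b.head == 4 && b.data == 4 && toy_tail(&b) == 8);
+
+        toy_add(&b, 12);           /* buffer now full, the data wrap */
+        assert(b.data == 16 && toy_tail(&b) == 4);
+        return 0;
+    }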
+
+
+4.2. String API
+
+The string API aims at providing both convenient and efficient ways to read and
+write to/from buffers using indirect strings (ist). These strings and some
+associated functions are defined in ist.h.
+
+====================+==================+=======================================
+Function            | Arguments/Return | Description
+--------------------+------------------+---------------------------------------
+b_isteq()           | const buffer *b  | b_isteq() : returns > 0 if the first
+                    | size_t o         | <n> characters of buffer <b> starting
+                    | size_t n         | at offset <o> relative to the buffer's
+                    | const ist ist    | head match <ist>. (empty strings do
+                    | ret: int         | match). It is designed to be used with
+                    |                  | reasonably small strings (it matches a
+                    |                  | single byte per loop iteration). It is
+                    |                  | expected to be used with an offset to
+                    |                  | skip old data. Returns the number of
+                    |                  | matching bytes if >0, not enough bytes
+                    |                  | or empty string if 0, or non-matching
+                    |                  | byte found if <0.
+--------------------+------------------+---------------------------------------
+b_isteat            | struct buffer *b | b_isteat() : "eats" string <ist> from
+                    | const ist ist    | the head of buffer <b>. Wrapping data
+                    | ret: ssize_t     | is explicitly supported. It matches a
+                    |                  | single byte per iteration so strings
+                    |                  | should remain reasonably small.
+                    |                  | Returns the number of bytes matched
+                    |                  | and eaten if >0, not enough bytes or
+                    |                  | matched empty string if 0, or non
+                    |                  | matching byte found if <0.
+--------------------+------------------+---------------------------------------
+b_istput            | struct buffer *b | b_istput() : injects string <ist> at
+                    | const ist ist    | the tail of output buffer <b> provided
+                    | ret: ssize_t     | that it fits. Wrapping is supported.
+                    |                  | It's designed for small strings as it
+                    |                  | only writes a single byte per
+                    |                  | iteration. Returns the number of
+                    |                  | characters copied (ist.len), 0 if it
+                    |                  | temporarily does not fit, or -1 if it
+                    |                  | will never fit. It will only modify
+                    |                  | the buffer upon success. In all cases,
+                    |                  | the contents are copied prior to
+                    |                  | reporting an error, so that the
+                    |                  | destination at least contains a valid
+                    |                  | but truncated string.
+--------------------+------------------+---------------------------------------
+b_putist            | struct buffer *b | b_putist() : tries to copy as much as
+                    | const ist ist    | possible of string <ist> into buffer
+                    | ret: size_t      | <b> and returns the number of bytes
+                    |                  | copied (truncation is possible). It
+                    |                  | uses b_putblk() and is suitable for
+                    |                  | large blocks.
+====================+==================+=======================================
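+
+The three possible outcomes of b_istput() can be handled as sketched below.
+This fragment is only an illustration and is meant to be built inside the
+HAProxy tree, with the buffer and ist headers already included. The function
+name my_emit_status() and the string used are assumptions made for the example.
+
+    /* hypothetical helper: tries to append a status line to <buf>.
+     * Returns 1 on success, 0 if the caller should retry once some room
+     * is released, -1 if the string can never fit in this buffer.
+     */
+    static int my_emit_status(struct buffer *buf)
+    {
+        ssize_t ret = b_istput(buf, ist("HTTP/1.1 200 OK\r\n"));
+
+        if (ret > 0)
+            return 1;   /* all bytes copied, buffer updated */
+        if (ret == 0)
+            return 0;   /* temporarily does not fit */
+        return -1;      /* will never fit */
+    }
+
+When truncation is acceptable, b_putist() may be used instead, in which case
+the returned byte count has to be compared with the string's length.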
+
+
+4.3. Management API
+
+The management API makes a distinction between an empty buffer, which by
+definition is not allocated but is ready to be allocated at any time, and a
+buffer which failed an allocation and is waiting for an available area to be
+offered. The functions allow to register on a list to be notified about buffer
+availability and to notify others of a number of buffers just released. All
+allocations are made through the standard buffer pools.
+
+====================+==================+=======================================
+Function            | Arguments/Return | Description
+--------------------+------------------+---------------------------------------
+buffer_almost_full  | const buffer *buf| returns true if the buffer is not null
+                    | ret: int         | and at least 3/4 of the buffer's space
+                    |                  | is used. A waiting buffer will match.
+--------------------+------------------+---------------------------------------
+b_alloc             | buffer *buf      | ensures that <buf> is allocated or
+                    | ret: buffer *    | allocates a buffer and assigns it to
+                    |                  | *buf. If no memory is available, (1)
+                    |                  | is assigned instead with a zero size.
+                    |                  | The allocated buffer is returned, or
+                    |                  | NULL in case no memory is available
+--------------------+------------------+---------------------------------------
+__b_free            | buffer *buf      | releases <buf> which must be allocated
+                    | ret: void        | and marks it empty
+--------------------+------------------+---------------------------------------
+b_free              | buffer *buf      | releases <buf> only if it is allocated
+                    | ret: void        | and marks it empty
+--------------------+------------------+---------------------------------------
+offer_buffers()     | void *from       | offers a buffer currently belonging to
+                    | uint threshold   | target <from> to whoever needs
+                    | ret: void        | one. Any pointer is valid for <from>,
+                    |                  | including NULL. Its purpose is to
+                    |                  | avoid passing a buffer to oneself in
+                    |                  | case of failed allocations (e.g. need
+                    |                  | two buffers, get one, fail, release it
+                    |                  | and wake up self again). In case of
+                    |                  | normal buffer release where it is
+                    |                  | expected that the caller is not
+                    |                  | waiting for a buffer, NULL is fine
+====================+==================+=======================================
+
+
+5. Porting code from older versions
+
+The previous buffer API introduced in 1.5-dev9 (May 2012) used to look like the
+following (with the struct renamed to old_buffer here to avoid confusion during
+quick lookups at the doc). It's worth noting that the "data" field used to be
+part of the struct but with a different type and meaning. It's important to be
+careful about potential code making use of &b->data as it will silently compile
+but fail.
+
+ Previous buffer declaration :
+
+    struct old_buffer {
+        char *p;                        /* buffer's start pointer, separates in and out data */
+        unsigned int size;              /* buffer size in bytes */
+        unsigned int i;                 /* number of input bytes pending for analysis in the buffer */
+        unsigned int o;                 /* number of out bytes the sender can consume from this buffer */
+        char data[0];                   /* <size> bytes */
+    };
+
+ Previous linear buffer representation :
+
+    data                               p
+      |                                |
+      V                                V
+      +-----------+--------------------+------------+-------------+
+      |           |////////////////////|////////////|             |
+      +-----------+--------------------+------------+-------------+
+       <---------------------------------------------------------> size
+                   <------------------> <---------->
+                            o                i
+
+The old and new fields correspond as follows (some require knowledge of the
+channel when the output byte count is needed) :
+
+     Old    | New
+    --------+----------------------------------------------------
+     p      | data + head + co_data(channel) // ci_head(channel)
+     size   | size
+     i      | data - co_data(channel) // ci_data(channel)
+     o      | co_data(channel) // channel->output
+     data   | area
+    --------+-----------------------------------------------------
+
+Then some common expressions can be mapped like this :
+
+     Old                   | New
+    -----------------------+---------------------------------------
+     b->data               | b_orig(b)
+     &b->data              | b_orig(b)
+     bi_ptr(b)             | ci_head(channel)
+     bi_end(b)             | b_tail(b)
+     bo_ptr(b)             | b_head(b)
+     bo_end(b)             | co_tail(channel)
+     bi_putblk(b,s,l)      | b_putblk(b,s,l)
+     bo_getblk(b,s,l,o)    | b_getblk(b,s,l,o)
+     bo_getblk_nc(b,s,l,o) | b_getblk_nc(b,s,l,o,0,co_data(channel))
+     b->i + b->o           | b_data(b)
+     b->data + b->size     | b_wrap(b)
+     b->i += len           | b_add(b, len)
+     b->i -= len           | b_sub(b, len)
+     b->i = len            | b_set_data(b, co_data(channel) + len)
+     b->o += len           | b_add(b, len); channel->output += len
+     b->o -= len           | b_del(b, len); channel->output -= len
+    -----------------------+---------------------------------------
+
+The buffer modification functions are less straightforward and depend a lot on
+the context where they are used. It is strongly advised to look in the list of
+functions above for what is available, based on what the existing code is
+trying to do.
+
+Note that it is very likely that any out-of-tree code relying on buffers will
+not use both ->i and ->o but instead will use exclusively ->i on the side
+producing data and use exclusively ->o on the side consuming data (such as in a
+mux or in an applet). In both cases, it should be assumed that the other side
+is always zero and that either ->i or ->o is replaced with ->data, making the
+remaining code much simpler (no more code duplication based on the data
+direction).
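+
+As an illustration of the last point, a consumer that used to rely on ->o only
+could be converted as sketched below. This is a fragment meant to be built
+inside the HAProxy tree, not a complete program: my_send() and my_flush() are
+hypothetical names, and b_contig_data() is assumed to be available from the
+same buffer API.
+
+    /* hypothetical low-level send function, returns the number of bytes
+     * sent, or a value <= 0 if nothing could be sent.
+     */
+    ssize_t my_send(const char *ptr, size_t len);
+
+    static int my_flush(struct buffer *b)
+    {
+        while (b_data(b)) {             /* was: while (b->o) */
+            /* only send what is contiguous from the head */
+            ssize_t ret = my_send(b_head(b), b_contig_data(b, 0));
+
+            if (ret <= 0)
+                return -1;
+            b_del(b, ret);              /* was: b->o -= ret */
+        }
+        b_realign_if_empty(b);
+        return 0;
+    }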
diff --git a/doc/internals/api/filters.txt b/doc/internals/api/filters.txt
new file mode 100644
index 0000000..eee74cf
--- /dev/null
+++ b/doc/internals/api/filters.txt
@@ -0,0 +1,1186 @@
+                   -----------------------------------------
+                          Filters Guide - version 2.5
+                          ( Last update: 2021-02-24 )
+                   ------------------------------------------
+                          Author : Christopher Faulet
+              Contact : christopher dot faulet at capflam dot org
+
+
+ABSTRACT
+--------
+
+The filters support is a new feature of HAProxy 1.7. It is a way to extend
+HAProxy without touching its core code and, to a certain extent, without knowing
+its internals. This feature will ease contributions, reducing impact of
+changes. Another advantage will be to simplify HAProxy by replacing some parts
+by filters. As we will see, and as an example, the HTTP compression is the first
+feature moved into a filter.
+
+This document describes how to write a filter and what to keep in mind to do
+so. It also talks about the known limits and the pitfalls to avoid.
+
+As said, filters are quite new for now. The API is not frozen and will be
+updated/modified/improved/extended as needed.
+
+
+
+SUMMARY
+-------
+
+  1.    Filters introduction
+  2.    How to use filters
+  3.    How to write a new filter
+  3.1.      API Overview
+  3.2.      Defining the filter name and its configuration
+  3.3.      Managing the filter lifecycle
+  3.3.1.        Dealing with threads
+  3.4.      Handling the streams activity
+  3.5.      Analyzing the channels activity
+  3.6.      Filtering the data exchanged
+  4.    FAQ
+
+
+
+1. FILTERS INTRODUCTION
+-----------------------
+
+First of all, to fully understand how filters work and how to create one, it is
+best to know, at least from a distance, what a proxy (frontend/backend), a
+stream and a channel are in HAProxy and how these entities are linked to each
+other. doc/internals/entities.pdf is a good overview.
+
+Then, to support filters, many callbacks have been added to HAProxy at different
+places, mainly around channel analyzers. Their purpose is to allow filters to
+be involved in the data processing, from the stream creation/destruction to
+the data forwarding. Depending on what it should do, a filter can implement all
+or part of these callbacks. For now, existing callbacks are focused on
+streams. But future improvements could enlarge filters scope. For instance, it
+could be useful to handle events at the connection level.
+
+In HAProxy configuration file, a filter is declared in a proxy section, except
+defaults. So the configuration corresponding to a filter declaration is attached
+to a specific proxy, and will be shared by all its instances. It is opaque from
+the HAProxy point of view, it is the filter's responsibility to manage it. Each
+filter declaration matches a unique configuration. Several declarations of the
+same filter in the same proxy will be handled as different filters by HAProxy.
+
+A filter instance is represented by a partially opaque context (or a state)
+attached to a stream and passed as an argument to callbacks. Through this
+context, filter instances are stateful. Depending on whether the filter is
+declared in a frontend or a backend section, its instances will be created,
+respectively, when a stream is created or when a backend is selected. Their
+behaviors will also be different. Only instances of filters declared in a
+frontend section will be aware of the creation and the destruction of the
+stream, and will take part in the channels analyzing before the backend is
+defined.
+
+It is important to remember the configuration of a filter is shared by all its
+instances, while the context of an instance is owned by a unique stream.
+
+Filters are designed to be chained. It is possible to declare several filters in
+the same proxy section. The declaration order is important because filters will
+be called one after the other respecting this order. Frontend and backend
+filters are also chained, frontend ones called first. Even if the filters
+processing is serialized, each filter will behave as if it were alone (unless it
+was developed to be aware of other filters). For all that, some constraints are
+imposed on filters, especially when data exchanged between the client and the
+server are processed. We will discuss these constraints again when we tackle
+the subject of writing a filter.
+
+
+
+2. HOW TO USE FILTERS
+---------------------
+
+To use a filter, the parameter 'filter' should be used, followed by the filter
+name and, optionally, its configuration in the desired listen, frontend or
+backend section. For instance :
+
+    listen test
+        ...
+        filter trace name TST
+        ...
+
+
+See doc/configuration.txt for a formal definition of the parameter 'filter'.
+Note that additional parameters on the filter line must be parsed by the filter
+itself.
+
+The list of available filters is reported by 'haproxy -vv' :
+
+    $> haproxy -vv
+    HAProxy version 1.7-dev2-3a1d4a-33 2016/03/21
+    Copyright 2000-2016 Willy Tarreau <willy@haproxy.org>
+
+    [...]
+
+    Available filters :
+            [COMP] compression
+            [TRACE] trace
+
+
+Multiple filter lines can be used in a proxy section to chain filters. Filters
+will be called in the declaration order.
+
+Some filters can support implicit declarations in certain circumstances
+(without the filter line). This is not recommended for new features but is
+useful for existing ones moved into a filter, for backward compatibility
+reasons. Implicit declarations are supported when there is only one filter used
+on a proxy. When several filters are used, explicit declarations are mandatory.
+The HTTP compression filter is one of these filters. Alone, using 'compression'
+keywords is enough to use it. But when at least a second filter is used, a
+filter line must be added.
+
+    # filter line is optional
+    listen t1
+        bind *:80
+        compression algo gzip
+        compression offload
+        server srv x.x.x.x:80
+
+    # filter line is mandatory for the compression filter
+    listen t2
+        bind *:81
+        filter trace name T2
+        filter compression
+        compression algo gzip
+        compression offload
+        server srv x.x.x.x:80
+
+
+
+
+3. HOW TO WRITE A NEW FILTER
+----------------------------
+
+To write a filter, there are 2 header files to explore :
+
+  * include/haproxy/filters-t.h : This is the main header file, containing all
+                                  important structures to use. It represents the
+                                  filter API.
+
+  * include/haproxy/filters.h : This header file contains helper functions that
+                                may be used. It also contains the internal API
+                                used by HAProxy to handle filters.
+
+To ease the filters integration, it is better to follow some conventions :
+
+  * Use 'flt_' prefix to name the filter (e.g. flt_http_comp or flt_trace).
+
+  * Keep everything related to the filter in the same file.
+
+The filter 'trace' can be used as a template to write a new filter. It is a good
+start to see how filters really work.
+
+3.1. API OVERVIEW
+-----------------
+
+Writing a filter comes down to writing functions and attaching them to the
+existing callbacks. Available callbacks are listed in the following structure :
+
+    struct flt_ops {
+        /*
+         * Callbacks to manage the filter lifecycle
+         */
+        int  (*init)             (struct proxy *p, struct flt_conf *fconf);
+        void (*deinit)           (struct proxy *p, struct flt_conf *fconf);
+        int  (*check)            (struct proxy *p, struct flt_conf *fconf);
+        int  (*init_per_thread)  (struct proxy *p, struct flt_conf *fconf);
+        void (*deinit_per_thread)(struct proxy *p, struct flt_conf *fconf);
+
+        /*
+         * Stream callbacks
+         */
+        int  (*attach)            (struct stream *s, struct filter *f);
+        int  (*stream_start)      (struct stream *s, struct filter *f);
+        int  (*stream_set_backend)(struct stream *s, struct filter *f, struct proxy *be);
+        void (*stream_stop)       (struct stream *s, struct filter *f);
+        void (*detach)            (struct stream *s, struct filter *f);
+        void (*check_timeouts)    (struct stream *s, struct filter *f);
+
+        /*
+         * Channel callbacks
+         */
+        int  (*channel_start_analyze)(struct stream *s, struct filter *f,
+                                      struct channel *chn);
+        int  (*channel_pre_analyze)  (struct stream *s, struct filter *f,
+                                      struct channel *chn,
+                                      unsigned int an_bit);
+        int  (*channel_post_analyze) (struct stream *s, struct filter *f,
+                                      struct channel *chn,
+                                      unsigned int an_bit);
+        int  (*channel_end_analyze)  (struct stream *s, struct filter *f,
+                                      struct channel *chn);
+
+        /*
+         * HTTP callbacks
+         */
+        int  (*http_headers)       (struct stream *s, struct filter *f,
+                                    struct http_msg *msg);
+        int  (*http_payload)       (struct stream *s, struct filter *f,
+                                    struct http_msg *msg, unsigned int offset,
+                                    unsigned int len);
+        int  (*http_end)           (struct stream *s, struct filter *f,
+                                    struct http_msg *msg);
+
+        void (*http_reset)         (struct stream *s, struct filter *f,
+                                    struct http_msg *msg);
+        void (*http_reply)         (struct stream *s, struct filter *f,
+                                    short status,
+                                    const struct buffer *msg);
+
+        /*
+         * TCP callbacks
+         */
+        int  (*tcp_payload)     (struct stream *s, struct filter *f,
+                                 struct channel *chn, unsigned int offset,
+                                 unsigned int len);
+    };
+
+
+We will explain in the following parts when these callbacks are called and what
+they should do.
+
+Filters are declared in proxy sections. So each proxy has an ordered list of
+filters, possibly empty if no filter is used. When the configuration of a proxy
+is parsed, each filter line represents an entry in this list. In the structure
+'proxy', the filters configurations are stored in the field 'filter_configs',
+each one of type 'struct flt_conf *' :
+
+    /*
+     * Structure representing the filter configuration, attached to a proxy and
+     * accessible from a filter when instantiated in a stream
+     */
+    struct flt_conf {
+        const char     *id;   /* The filter id */
+        struct flt_ops *ops;  /* The filter callbacks */
+        void           *conf; /* The filter configuration */
+        struct list     list; /* Next filter for the same proxy */
+        unsigned int    flags; /* FLT_CFG_FL_* */
+    };
+
+  * 'flt_conf.id' is an identifier, defined by the filter. It can be
+    NULL. HAProxy does not use this field. Filters can use it in log messages or
+    as a unique identifier to check multiple declarations. It is the filter's
+    responsibility to free it, if necessary.
+
+  * 'flt_conf.conf' is opaque. It is the internal configuration of a filter,
+    generally allocated and filled by its parsing function (See § 3.2). It is
+    the filter's responsibility to free it.
+
+  * 'flt_conf.ops' references the callbacks implemented by the filter. This
+    field must be set during the parsing phase (See § 3.2) and can be refined
+    during the initialization phase (See § 3.3). If it is dynamically allocated,
+    it is the filter's responsibility to free it.
+
+  * 'flt_conf.flags' is a bitfield to specify the filter capabilities. For now,
+    only FLT_CFG_FL_HTX may be set when a filter is able to process HTX
+    streams. If not set, the filter is excluded from the HTTP filtering.
+
+
+The filter configuration is global and shared by all its instances. A filter
+instance is created in the context of a stream and attached to this stream. In
+the structure 'stream', the field 'strm_flt' is the state of all filter
+instances attached to a stream :
+
+    /*
+     * Structure representing the "global" state of filters attached to a
+     * stream.
+     */
+    struct strm_flt {
+        struct list    filters;             /* List of filters attached to a stream */
+        struct filter *current[2];          /* From which filter resume processing, for a specific channel.
+                                             * This is used for resumable callbacks only,
+                                             * If NULL, we start from the first filter.
+                                             * 0: request channel, 1: response channel */
+        unsigned short flags;               /* STRM_FL_* */
+        unsigned char  nb_req_data_filters; /* Number of data filters registered on the request channel */
+        unsigned char  nb_rsp_data_filters; /* Number of data filters registered on the response channel */
+        unsigned long long offset[2];       /* global offset of input data already filtered for a specific channel
+                                             * 0: request channel, 1: response channel */
+    };
+
+
+Filter instances attached to a stream are stored in the field
+'strm_flt.filters', each instance is of type 'struct filter *' :
+
+    /*
+     * Structure representing a filter instance attached to a stream
+     *
+     * 2D-Array fields are used to store info per channel. The first index
+     * stands for the request channel, and the second one for the response
+     * channel.  Especially, <next> and <fwd> are offsets representing amount of
+     * data that the filter are, respectively, parsed and forwarded on a
+     * channel. Filters can access these values using FLT_NXT and FLT_FWD
+     * macros.
+     */
+    struct filter {
+        struct flt_conf *config; /* the filter's configuration */
+        void           *ctx;     /* The filter context (opaque) */
+        unsigned short  flags;   /* FLT_FL_* */
+        unsigned long long offset[2];   /* Offset of input data already filtered for a specific channel
+                                         * 0: request channel, 1: response channel */
+        unsigned int    pre_analyzers;  /* bit field indicating analyzers to
+                                         * pre-process */
+        unsigned int    post_analyzers; /* bit field indicating analyzers to
+                                         * post-process */
+        struct list     list;    /* Next filter for the same proxy/stream */
+    };
+
+  * 'filter.config' is the filter configuration previously described. All
+    instances of a filter share it.
+
+  * 'filter.ctx' is an opaque context. It is managed by the filter, so it is its
+    responsibility to free it.
+
+  * 'filter.pre_analyzers' and 'filter.post_analyzers' will be described later
+    (See § 3.5).
+
+  * 'filter.offset' will be described later (See § 3.6).
+
+
+3.2. DEFINING THE FILTER NAME AND ITS CONFIGURATION
+---------------------------------------------------
+
+During the filter development, the first thing to do is to add it in the
+supported filters. To do so, its name must be registered as a valid keyword on
+the filter line :
+
+    /* Declare the filter parser for "my_filter" keyword */
+    static struct flt_kw_list flt_kws = { "MY_FILTER_SCOPE", { }, {
+            { "my_filter", parse_my_filter_cfg, NULL /* private data */ },
+            { NULL, NULL, NULL },
+        }
+    };
+    INITCALL1(STG_REGISTER, flt_register_keywords, &flt_kws);
+
+
+Then the filter internal configuration must be defined. For instance :
+
+    struct my_filter_config {
+        struct proxy *proxy;
+        char         *name;
+        /* ... */
+    };
+
+
+All callbacks implemented by the filter must then be declared. Here, a global
+variable is used :
+
+    struct flt_ops my_filter_ops = {
+        .init   = my_filter_init,
+        .deinit = my_filter_deinit,
+        .check  = my_filter_config_check,
+
+        /* ... */
+    };
+
+
+Finally, the function to parse the filter configuration must be written, here
+'parse_my_filter_cfg'. This function must parse all remaining keywords on the
+filter line :
+
+    /* Return -1 on error, else 0 */
+    static int
+    parse_my_filter_cfg(char **args, int *cur_arg, struct proxy *px,
+                        struct flt_conf *flt_conf, char **err, void *private)
+    {
+        struct my_filter_config *my_conf;
+        int pos = *cur_arg;
+
+        /* Allocate the internal configuration used by the filter */
+        my_conf = calloc(1, sizeof(*my_conf));
+        if (!my_conf) {
+            memprintf(err, "%s : out of memory", args[*cur_arg]);
+            return -1;
+        }
+        my_conf->proxy = px;
+
+        /* ... */
+
+        /* Parse all keywords supported by the filter and fill the internal
+         * configuration */
+        pos++; /* Skip the filter name */
+        while (*args[pos]) {
+            if (!strcmp(args[pos], "name")) {
+                if (!*args[pos + 1]) {
+                    memprintf(err, "'%s' : '%s' option without value",
+                              args[*cur_arg], args[pos]);
+                    goto error;
+                }
+                my_conf->name = strdup(args[pos + 1]);
+                if (!my_conf->name) {
+                    memprintf(err, "%s : out of memory", args[*cur_arg]);
+                    goto error;
+                }
+                pos += 2;
+            }
+
+            /* ... parse other keywords ... */
+        }
+        *cur_arg = pos;
+
+        /* Set callbacks supported by the filter */
+        flt_conf->ops  = &my_filter_ops;
+
+        /* Last, save the internal configuration */
+        flt_conf->conf = my_conf;
+        return 0;
+
+      error:
+        free(my_conf->name); /* free(NULL) is a no-op */
+        free(my_conf);
+        return -1;
+    }
+
+
+WARNING : In this parsing function, 'flt_conf->ops' must be initialized. All
+          arguments of the filter line must also be parsed. This is mandatory.
+
+In the previous example, the filter line should be read as follows :
+
+    filter my_filter name MY_NAME ...
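+
+As a hedged illustration of the cursor handling above, here is a minimal
+standalone sketch that does not use any HAProxy type. 'demo_conf',
+'demo_strdup' and 'demo_parse' are hypothetical names introduced for this
+example only. Unlike the skeleton above, it rejects unknown keywords so that
+the loop always terminates :
+
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the filter's internal configuration. */
struct demo_conf {
    char *name;
};

/* Portable duplicate of a string; returns NULL if the allocation fails. */
static char *demo_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);

    if (p)
        memcpy(p, s, n);
    return p;
}

/* Parse the arguments following the filter name. <args> must end with an
 * empty string. On success, <*cur_arg> points past the last consumed
 * argument. Returns 0 on success, -1 on error (the cursor is then left
 * untouched). */
static int demo_parse(char **args, int *cur_arg, struct demo_conf *conf)
{
    int pos = *cur_arg + 1; /* skip the filter name */

    while (*args[pos]) {
        if (strcmp(args[pos], "name") == 0) {
            if (!*args[pos + 1])
                return -1; /* option without value */
            free(conf->name);
            conf->name = demo_strdup(args[pos + 1]);
            if (!conf->name)
                return -1; /* out of memory */
            pos += 2;
        }
        else
            return -1; /* unknown keyword */
    }
    *cur_arg = pos;
    return 0;
}
```
+
+With the arguments { "my_filter", "name", "MY_NAME", "" } and a cursor at 0,
+'demo_parse' returns 0, stores "MY_NAME" and leaves the cursor at 3.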
+
+
+Optionally, by implementing the 'flt_ops.check' callback, an extra step is added
+to check the internal configuration of the filter after the parsing phase, when
+the HAProxy configuration is fully defined. For instance :
+
+    /* Check configuration of the filter for a specified proxy.
+     * Return 1 on error, else 0. */
+    static int
+    my_filter_config_check(struct proxy *px, struct flt_conf *my_conf)
+    {
+        if (px->mode != PR_MODE_HTTP) {
+            ha_alert("The filter 'my_filter' cannot be used in non-HTTP mode.\n");
+            return 1;
+        }
+
+        /* ... */
+
+        return 0;
+    }
+
+
+
+3.3. MANAGING THE FILTER LIFECYCLE
+----------------------------------
+
+Once the configuration is parsed and checked, filters are ready to be used. There
+are two main callbacks to manage the filter lifecycle :
+
+  * 'flt_ops.init' : It initializes the filter for a proxy. This callback may be
+                     defined to finish the filter configuration.
+
+  * 'flt_ops.deinit' : It cleans up what the parsing function and the init
+                       callback have done. This callback is useful to release
+                       memory allocated for the filter configuration.
+
+Here is an example :
+
+    /* Initialize the filter. Returns -1 on error, else 0. */
+    static int
+    my_filter_init(struct proxy *px, struct flt_conf *fconf)
+    {
+        struct my_filter_config *my_conf = fconf->conf;
+
+        /* ... */
+
+        return 0;
+    }
+
+    /* Free resources allocated by the filter. */
+    static void
+    my_filter_deinit(struct proxy *px, struct flt_conf *fconf)
+    {
+        struct my_filter_config *my_conf = fconf->conf;
+
+        if (my_conf) {
+            free(my_conf->name);
+            /* ... */
+            free(my_conf);
+        }
+        fconf->conf = NULL;
+    }
+
+
+3.3.1 DEALING WITH THREADS
+--------------------------
+
+When HAProxy is compiled with the threads support and started with more than one
+thread (global.nbthread > 1), then it is possible to manage the filter per
+thread with the following callbacks :
+
+  * 'flt_ops.init_per_thread': It initializes the filter for each thread. It
+                               works the same way as 'flt_ops.init' but in the
+                               context of a thread. This callback is called
+                               after the thread creation.
+
+  * 'flt_ops.deinit_per_thread': It cleans up what the init_per_thread callback
+                                 has done. It is called in the context of a
+                                 thread, before exiting it.
+
+It is the filter's responsibility to deal with concurrency. The check, init and
+deinit callbacks are called on the main thread. All others are called on a
+"worker" thread (not always the same). It is also the filter's responsibility to
+know if HAProxy is started with more than one thread. If it is started with one
+thread (or compiled without the threads support), the per-thread callbacks will
+be silently ignored (in this case, global.nbthread will always be equal to one).
+
+
+3.4. HANDLING THE STREAMS ACTIVITY
+----------------------------------
+
+It may be interesting to handle streams activity. For now, there are three
+callbacks that may be defined to do so :
+
+  * 'flt_ops.stream_start' : It is called when a stream is started. This
+                             callback can fail by returning a negative value. It
+                             will be considered as a critical error by HAProxy
+                             which disables the listener for a short time.
+
+  * 'flt_ops.stream_set_backend' : It is called when a backend is set for a
+                                   stream. This callback is called for all
+                                   filters attached to a stream (frontend and
+                                   backend). Note this callback is not called if
+                                   the frontend and the backend are the same.
+
+  * 'flt_ops.stream_stop' : It is called when a stream is stopped. This callback
+                            always succeeds. Anyway, it is too late to return an
+                            error.
+
+For instance :
+
+    /* Called when a stream is created. Returns -1 on error, else 0. */
+    static int
+    my_filter_stream_start(struct stream *s, struct filter *filter)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... */
+
+        return 0;
+    }
+
+    /* Called when a backend is set for a stream */
+    static int
+    my_filter_stream_set_backend(struct stream *s, struct filter *filter,
+                                 struct proxy *be)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... */
+
+        return 0;
+    }
+
+    /* Called when a stream is destroyed */
+    static void
+    my_filter_stream_stop(struct stream *s, struct filter *filter)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... */
+    }
+
+
+WARNING : Handling the streams creation and destruction is only possible for
+          filters defined on proxies with the frontend capability.
+
+In addition, it is possible to handle creation and destruction of filter
+instances using following callbacks:
+
+  * 'flt_ops.attach' : It is called after a filter instance creation, when it is
+                       attached to a stream. This happens when the stream is
+                       started for filters defined on the stream's frontend and
+                       when the backend is set for filters declared on the
+                       stream's backend. It is possible to ignore the filter, if
+                       needed, by returning 0. This could be useful to have
+                       conditional filtering.
+
+  * 'flt_ops.detach' : It is called when a filter instance is detached from a
+                       stream, before its destruction. This happens when the
+                       stream is stopped for filters defined on the stream's
+                       frontend and when the analyze ends for filters defined on
+                       the stream's backend.
+
+For instance :
+
+    /* Called when a filter instance is created and attached to a stream */
+    static int
+    my_filter_attach(struct stream *s, struct filter *filter)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        if (/* ... */)
+            return 0; /* Ignore the filter here */
+        return 1;
+    }
+
+    /* Called when a filter instance is detached from a stream, just before its
+     * destruction */
+    static void
+    my_filter_detach(struct stream *s, struct filter *filter)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... */
+    }
+
+Finally, it may be interesting to notify the filter when the stream is woken up
+because of an expired timer. This gives the filter a chance to check some
+internal timeouts, if any. To do so the following callback must be used :
+
+  * 'flt_ops.check_timeouts' : It is called when a stream is woken up because of
+                               an expired timer.
+
+For instance :
+
+    /* Called when a stream is woken up because of an expired timer */
+    static void
+    my_filter_check_timeouts(struct stream *s, struct filter *filter)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... */
+    }
+
+
+3.5. ANALYZING THE CHANNELS ACTIVITY
+------------------------------------
+
+The main purpose of filters is to take part in the channels analyzing. To do so,
+there are 2 callbacks, 'flt_ops.channel_pre_analyze' and
+'flt_ops.channel_post_analyze', called respectively before and after each
+analyzer attached to a channel, except analyzers responsible for the data
+forwarding (TCP or HTTP). Concretely, on the request channel, these callbacks
+could be called around the following analyzers :
+
+  * tcp_inspect_request        (AN_REQ_INSPECT_FE and AN_REQ_INSPECT_BE)
+  * http_wait_for_request      (AN_REQ_WAIT_HTTP)
+  * http_wait_for_request_body (AN_REQ_HTTP_BODY)
+  * http_process_req_common    (AN_REQ_HTTP_PROCESS_FE)
+  * process_switching_rules    (AN_REQ_SWITCHING_RULES)
+  * http_process_req_common    (AN_REQ_HTTP_PROCESS_BE)
+  * http_process_tarpit        (AN_REQ_HTTP_TARPIT)
+  * process_server_rules       (AN_REQ_SRV_RULES)
+  * http_process_request       (AN_REQ_HTTP_INNER)
+  * tcp_persist_rdp_cookie     (AN_REQ_PRST_RDP_COOKIE)
+  * process_sticking_rules     (AN_REQ_STICKING_RULES)
+
+And on the response channel :
+
+  * tcp_inspect_response     (AN_RES_INSPECT)
+  * http_wait_for_response   (AN_RES_WAIT_HTTP)
+  * process_store_rules      (AN_RES_STORE_RULES)
+  * http_process_res_common  (AN_RES_HTTP_PROCESS_BE)
+
+Unlike the callbacks seen so far, 'flt_ops.channel_pre_analyze' can interrupt
+the stream processing. So a filter can decide to not execute the analyzer that
+follows and wait for the next iteration. If there is more than one filter, the
+following ones are skipped. On the next iteration, the filtering resumes where
+it was stopped, i.e. on the filter that has previously stopped the processing.
+So it is possible for a filter to stop the stream processing on a specific
+analyzer for a while before continuing. Moreover, this callback can be called
+many times for the same analyzer, until it finishes its processing. For
+instance :
+
+    /* Called before a processing happens on a given channel.
+     * Returns a negative value if an error occurs, 0 if it needs to wait,
+     * any other value otherwise. */
+    static int
+    my_filter_chn_pre_analyze(struct stream *s, struct filter *filter,
+                              struct channel *chn, unsigned an_bit)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        switch (an_bit) {
+            case AN_REQ_WAIT_HTTP:
+                if (/* wait that a condition is verified before continuing */)
+                    return 0;
+                break;
+            /* ... */
+        }
+        return 1;
+    }
+
+  * 'an_bit' is the analyzer id. All analyzers are listed in
+    'include/haproxy/channel-t.h'.
+
+  * 'chn' is the channel on which the analyzing is done. It is possible to
+    determine if it is the request or the response channel by testing if
+    CF_ISRESP flag is set :
+
+      │ ((chn->flags & CF_ISRESP) == CF_ISRESP)
+
+
+In the previous example, the stream processing is blocked before receipt of the HTTP
+request until a condition is verified.
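+
+The resume mechanism can be pictured with a small standalone model. This is
+only a sketch under stated assumptions, not HAProxy's implementation :
+'demo_chain', 'demo_run_pre_analyze' and both callbacks are hypothetical names,
+and the 'current' index plays the role of 'strm_flt.current' for one channel :
+
```c
#include <stddef.h>

#define DEMO_MAX_FILTERS 4

/* Hypothetical callback: <0 on error, 0 to wait, >0 to continue. */
typedef int (*demo_pre_analyze_cb)(void *ctx);

struct demo_chain {
    demo_pre_analyze_cb cb[DEMO_MAX_FILTERS];
    void *ctx[DEMO_MAX_FILTERS];
    size_t nb;      /* number of filters in the chain */
    size_t current; /* filter to resume from, 0 = start from the first one */
};

/* Call the pre-analyze callbacks in order, starting from the filter that
 * previously asked to wait. Returns 1 if the analyzer may run, 0 if the
 * stream must wait for the next iteration, -1 on error. */
static int demo_run_pre_analyze(struct demo_chain *c)
{
    size_t i;

    for (i = c->current; i < c->nb; i++) {
        int ret = c->cb[i](c->ctx[i]);

        if (ret < 0) {
            c->current = 0;
            return -1;
        }
        if (ret == 0) {
            c->current = i; /* following filters are skipped for now */
            return 0;
        }
    }
    c->current = 0;
    return 1;
}

/* Example callback: counts its calls and never waits. */
static int demo_count_calls(void *ctx)
{
    ++*(int *)ctx;
    return 1;
}

/* Example callback: asks to wait during its first two calls. */
static int demo_wait_twice(void *ctx)
{
    int *calls = ctx;

    return ++*calls > 2;
}
```
+
+With a chain made of 'demo_count_calls', 'demo_wait_twice' and
+'demo_count_calls', the first two iterations return 0 and the first filter is
+called only once; the third iteration resumes on the second filter, runs the
+last one and returns 1.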
+
+'flt_ops.channel_post_analyze', for its part, is not resumable. It returns a
+negative value if an error occurs, any other value otherwise. It is called when
+a filterable analyzer finishes its processing, so once for the same analyzer.
+For instance :
+
+    /* Called after a processing happens on a given channel.
+     * Returns a negative value if an error occurs, any other
+     * value otherwise. */
+    static int
+    my_filter_chn_post_analyze(struct stream *s, struct filter *filter,
+                               struct channel *chn, unsigned an_bit)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+        struct http_msg         *msg;
+
+        switch (an_bit) {
+            case AN_REQ_WAIT_HTTP:
+                if (/* A test on received headers before any other treatment */) {
+                    msg = ((chn->flags & CF_ISRESP) ? &s->txn->rsp : &s->txn->req);
+                    s->txn->status = 400;
+                    msg->msg_state = HTTP_MSG_ERROR;
+                    http_reply_and_close(s, s->txn->status, http_error_message(s));
+                    return -1; /* This is an error ! */
+                }
+                break;
+            /* ... */
+        }
+        return 1;
+    }
+
+
+Pre and post analyzer callbacks of a filter are not automatically called. They
+must be registered explicitly on analyzers, updating the value of
+'filter.pre_analyzers' and 'filter.post_analyzers' bit fields. All analyzer bits
+are listed in 'include/haproxy/channel-t.h'. Here is an example :
+
+    static int
+    my_filter_stream_start(struct stream *s, struct filter *filter)
+    {
+        /* ... */
+
+        /* Register the pre analyzer callback on all request and response
+         * analyzers */
+        filter->pre_analyzers |= (AN_REQ_ALL | AN_RES_ALL);
+
+        /* Register the post analyzer callback only on AN_REQ_WAIT_HTTP and
+         * AN_RES_WAIT_HTTP analyzers */
+        filter->post_analyzers |= (AN_REQ_WAIT_HTTP | AN_RES_WAIT_HTTP);
+
+        /* ... */
+        return 0;
+    }
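+
+How such bit fields are typically consulted can be sketched as follows. The
+DEMO_AN_* values and the 'demo_*' helpers are made up for the example; the real
+AN_* bits and the dispatch code live in HAProxy itself :
+
```c
/* Hypothetical analyzer bits, one bit per analyzer. */
#define DEMO_AN_REQ_INSPECT   0x01u
#define DEMO_AN_REQ_WAIT_HTTP 0x02u
#define DEMO_AN_REQ_HTTP_BODY 0x04u
#define DEMO_AN_REQ_ALL       (DEMO_AN_REQ_INSPECT | DEMO_AN_REQ_WAIT_HTTP | \
                               DEMO_AN_REQ_HTTP_BODY)

/* Hypothetical stand-in for the two bit fields of 'struct filter'. */
struct demo_filter {
    unsigned int pre_analyzers;
    unsigned int post_analyzers;
};

/* Returns 1 if the pre-analyze callback is registered on <an_bit>, else 0. */
static int demo_wants_pre(const struct demo_filter *f, unsigned int an_bit)
{
    return (f->pre_analyzers & an_bit) != 0;
}

/* Returns 1 if the post-analyze callback is registered on <an_bit>, else 0. */
static int demo_wants_post(const struct demo_filter *f, unsigned int an_bit)
{
    return (f->post_analyzers & an_bit) != 0;
}
```
+
+A filter that sets 'pre_analyzers' to DEMO_AN_REQ_ALL and 'post_analyzers' to
+DEMO_AN_REQ_WAIT_HTTP is called before every analyzer of this model, but after
+the DEMO_AN_REQ_WAIT_HTTP one only.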
+
+
+To surround activity of a filter during the channel analyzing, two new analyzers
+have been added :
+
+  * 'flt_start_analyze' (AN_REQ/RES_FLT_START_FE and AN_REQ/RES_FLT_START_BE) :
+    For a specific filter, this analyzer is called before any call to the
+    'flt_ops.channel_pre_analyze' callback. From the filter point of view, it
+    calls the 'flt_ops.channel_start_analyze' callback.
+
+  * 'flt_end_analyze' (AN_REQ/RES_FLT_END) : For a specific filter, this
+    analyzer is called when all other analyzers have finished their
+    processing. From the filter point of view, it calls the
+    'flt_ops.channel_end_analyze' callback.
+
+These analyzers are called only once per stream.
+
+'flt_ops.channel_start_analyze' and 'flt_ops.channel_end_analyze' callbacks can
+interrupt the stream processing, like 'flt_ops.channel_pre_analyze'. Here is an
+example :
+
+    /* Called when analyze starts for a given channel
+     * Returns a negative value if an error occurs, 0 if it needs to wait,
+     * any other value otherwise. */
+    static int
+    my_filter_chn_start_analyze(struct stream *s, struct filter *filter,
+                                struct channel *chn)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... TODO ... */
+
+        return 1;
+    }
+
+    /* Called when analyze ends for a given channel
+     * Returns a negative value if an error occurs, 0 if it needs to wait,
+     * any other value otherwise. */
+    static int
+    my_filter_chn_end_analyze(struct stream *s, struct filter *filter,
+                              struct channel *chn)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* ... TODO ... */
+
+        return 1;
+    }
+
+
+Workflow on channels can be summarized as follows :
+
+   FE: Called for filters defined on the stream's frontend
+   BE: Called for filters defined on the stream's backend
+
+                                          +------->---------+
+                |                         |                 |
+    +----------------------+              |      +----------------------+
+    |  flt_ops.attach (FE) |              |      |  flt_ops.attach (BE) |
+    +----------------------+              |      +----------------------+
+                |                         |                  |
+                V                         |                  V
+  +--------------------------+            | +------------------------------------+
+  | flt_ops.stream_start (FE)|            | | flt_ops.stream_set_backend (FE+BE) |
+  +--------------------------+            | +------------------------------------+
+                |                         |                  |
+               ...                        |                 ...
+                |                         |                  |
+                |                         ^                  |
+                |                 --+     |                  |                 --+
+                +------<----------+ |     |                  +--------<--------+ |
+                |                 | |     |                  |                 | |
+                V                 | |     |                  V                 | |
++-------------------------------+ | |     |  +-------------------------------+ | |
+|      flt_start_analyze (FE)   +-+ |     |  |      flt_start_analyze (BE)   +-+ |
+|(flt_ops.channel_start_analyze)|   | F   |  |(flt_ops.channel_start_analyze)|   |
++---------------+---------------+   | R   |  +-------------------------------+   |
+                |                   | O   |                  |                   |
+                +------<---------+  | N   ^                  +--------<-------+  | B
+                |                |  | T   |                  |                |  | A
++---------------|------------+   |  | E   |  +---------------|------------+   |  | C
+|+--------------V-------------+  |  | N   |  |+--------------V-------------+  |  | K
+||+----------------------------+ |  | D   |  ||+----------------------------+ |  | E
+|||flt_ops.channel_pre_analyze | |  |     |  |||flt_ops.channel_pre_analyze | |  | N
+|||             V              | |  |     |  |||             V              | |  | D
+|||        analyzer (FE)       +-+  |     |  |||     analyzer (FE+BE)       +-+  |
++||             V              |    |     |  +||             V              |    |
+ +|flt_ops.channel_post_analyze|    |     |   +|flt_ops.channel_post_analyze|    |
+  +----------------------------+    |     |    +----------------------------+    |
+                |                 --+     |                  |                   |
+                +------------>------------+                 ...                  |
+                                                             |                   |
+                                                 [ data filtering (see below) ]  |
+                                                             |                   |
+                                                            ...                  |
+                                                             |                   |
+                                                             +--------<--------+ |
+                                                             |                 | |
+                                                             V                 | |
+                                             +-------------------------------+ | |
+                                             |    flt_end_analyze (FE+BE)    +-+ |
+                                             | (flt_ops.channel_end_analyze) |   |
+                                             +---------------+---------------+   |
+                                                             |                 --+
+                                                             V
+                                                  +----------------------+
+                                                  |  flt_ops.detach (BE) |
+                                                  +----------------------+
+                                                             |
+                                                             V
+                                                +--------------------------+
+                                                | flt_ops.stream_stop (FE) |
+                                                +--------------------------+
+                                                             |
+                                                             V
+                                                  +----------------------+
+                                                  |  flt_ops.detach (FE) |
+                                                  +----------------------+
+                                                             |
+                                                             V
+
+By zooming on an analyzer box we have:
+
+                     ...
+                      |
+                      V
+                      |
+                      +-----------<-----------+
+                      |                       |
+    +-----------------+--------------------+  |
+    |                 |                    |  |
+    |                 +--------<---------+ |  |
+    |                 |                  | |  |
+    |                 V                  | |  |
+    |     flt_ops.channel_pre_analyze ->-+ |  ^
+    |                 |                    |  |
+    |                 |                    |  |
+    |                 V                    |  |
+    |              analyzer --------->-----+--+
+    |                 |                    |
+    |                 |                    |
+    |                 V                    |
+    |     flt_ops.channel_post_analyze     |
+    |                 |                    |
+    |                 |                    |
+    +-----------------+--------------------+
+                      |
+                      V
+                     ...
+
+
+3.6. FILTERING THE DATA EXCHANGED
+---------------------------------
+
+WARNING : To fully understand this part, it is important to be aware of how the
+          buffers work in HAProxy. For the HTTP part, it is also important to
+          understand how data are parsed and structured, and how the internal
+          representation, called HTX, works. See doc/internals/buffer-api.txt
+          and doc/internals/htx-api.txt for details.
+
+An extended feature of the filters is the data filtering. By default a filter
+does not look into data exchanged between the client and the server because it
+is expensive. Indeed, instead of forwarding data without any processing, each
+byte needs to be buffered.
+
+So, to enable the data filtering on a channel, at any time, in one of the
+previous callbacks, the 'register_data_filter' function must be called.
+Conversely, to disable it, the 'unregister_data_filter' function must be called.
+For instance :
+
+    static int
+    my_filter_http_headers(struct stream *s, struct filter *filter,
+                           struct http_msg *msg)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+
+        /* Only the request channel is considered here */
+        if (!(msg->chn->flags & CF_ISRESP)) {
+            struct htx *htx;
+            struct ist hdr;
+            struct http_hdr_ctx ctx;
+
+            htx = htxbuf(&msg->chn->buf);
+
+            /* Enable the data filtering for the request if 'X-Filter' header
+             * is set to 'true'. */
+            hdr = ist("X-Filter");
+            ctx.blk = NULL;
+            if (http_find_header(htx, hdr, &ctx, 0) &&
+                ctx.value.len >= 4 && memcmp(ctx.value.ptr, "true", 4) == 0)
+                register_data_filter(s, msg->chn, filter);
+        }
+
+        return 1;
+    }
+
+Here, the data filtering is enabled if the HTTP header 'X-Filter' is found and
+set to 'true'.
+
+If several filters are declared, the evaluation order remains the same,
+regardless of the order of the registrations to the data filtering. Data
+registrations must be performed before the data forwarding step. However, a
+filter may be unregistered from the data filtering at any time.
+
+Depending on the stream type, TCP or HTTP, the way to handle data filtering is
+different. HTTP data are structured while TCP data are raw. And there are more
+callbacks for HTTP streams to fully handle all steps of an HTTP transaction. But
+the main part is the same. The data filtering is performed in one callback,
+called in a loop on input data starting at a specific offset for a given
+length. Data analyzed by a filter are considered as forwarded from its point of
+view. Because filters are chained, a filter never analyzes more data than its
+predecessors. Thus only data analyzed by the last filter are effectively
+forwarded. This means, at any time, any filter may choose to not analyze all
+available data (available from its point of view), blocking the data forwarding.
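+
+The chaining rule can be illustrated by a simplified standalone model, given
+here as a sketch only. 'demo_filter_chain' and its parameters are hypothetical:
+'limit[i]' is the number of bytes filter <i> accepts to analyze during one
+pass, and 'analyzed[i]' receives what it actually analyzed :
+
```c
#include <stddef.h>

/* Run <nb> chained filters over <avail> bytes of input data. Each filter
 * only sees what its predecessor analyzed, and may analyze less. Returns the
 * number of bytes analyzed by the last filter, i.e. the amount that can
 * effectively be forwarded. Without any filter, everything is forwarded. */
static size_t demo_filter_chain(const size_t *limit, size_t *analyzed,
                                size_t nb, size_t avail)
{
    size_t i;

    for (i = 0; i < nb; i++) {
        if (limit[i] < avail)
            avail = limit[i]; /* this filter holds back part of the data */
        analyzed[i] = avail;
    }
    return avail;
}
```
+
+With limits { 100, 40, 60 } and 80 available bytes, the filters analyze 80, 40
+and 40 bytes respectively, and 40 bytes are forwarded: the third filter cannot
+go beyond what the second one released.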
+
+Internally, filters own 2 offsets representing the number of bytes already
+analyzed in the available input data, one per channel. There is also an offset
+couple at the stream level, in the strm_flt object, representing the total
+number of bytes already forwarded. These offsets may be retrieved and updated
+using following macros :
+
+  * FLT_OFF(flt, chn)
+
+  * FLT_STRM_OFF(s, chn)
+
+where 'flt' is the 'struct filter' passed as argument in all callbacks, 's' the
+filtered stream and 'chn' is the considered channel. However, there is no reason
+for a filter to use these macros or take care of these offsets.
+
+
+3.6.1 FILTERING DATA ON TCP STREAMS
+-----------------------------------
+
+Data filtering for TCP streams is the easy case, because HAProxy does not
+parse these data. Data are stored raw in the buffer. So there is only one
+callback to consider:
+
+  * 'flt_ops.tcp_payload' : This callback is called when input data are
+    available. If not defined, all available data will be considered as analyzed
+    and forwarded from the filter point of view.
+
+This callback is called only if the filter is registered to analyze TCP
+data. Here is an example :
+
+    /* Returns a negative value if an error occurs, else the number of
+     * consumed bytes. */
+    static int
+    my_filter_tcp_payload(struct stream *s, struct filter *filter,
+                          struct channel *chn, unsigned int offset,
+                          unsigned int len)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+        int ret = len;
+
+        /* Do not parse more than 'my_conf->max_parse' bytes at a time */
+        if (my_conf->max_parse != 0 && ret > my_conf->max_parse)
+            ret = my_conf->max_parse;
+
+        /* if available data are not completely parsed, wake up the stream to
+         * be sure to not freeze it. The best is probably to set a
+         * chn->analyse_exp timer */
+        if (ret != len)
+            task_wakeup(s->task, TASK_WOKEN_MSG);
+        return ret;
+    }
+
+But it is important to note that tunnelled data of an HTTP stream may also be
+filtered via this callback. Tunnelled data are data exchanged after an HTTP
+tunnel is established between the client and the server, via an HTTP CONNECT or
+via a protocol upgrade. In this case, the data are structured. Of course, to do
+so, the filter must be able to parse HTX data and must have the FLT_CFG_FL_HTX
+flag set. At any time, the IS_HTX_STRM() macro may be used on the stream to know
+if it is an HTX stream or a TCP stream.
+
+
+3.6.2 FILTERING DATA ON HTTP STREAMS
+------------------------------------
+
+The HTTP data filtering is a bit more complex because HTTP data are
+structured and represented in an internal format, called HTX. So basically
+there is the HTTP counterpart to the previous callback :
+
+  * 'flt_ops.http_payload' : This callback is called when input data are
+    available. If not defined, all available data will be considered as analyzed
+    and forwarded for the filter.
+
+But the prototype for this callback is slightly different. Instead of having
+the channel as parameter, we have the HTTP message (struct http_msg). This
+callback is called only if the filter is registered to analyze HTTP data. Here
+is an example :
+
+    /* Returns a negative value if an error occurs, else the number of
+     * consumed bytes. */
+    static int
+    my_filter_http_payload(struct stream *s, struct filter *filter,
+                           struct http_msg *msg, unsigned int offset,
+                           unsigned int len)
+    {
+        struct my_filter_config *my_conf = FLT_CONF(filter);
+        struct htx *htx = htxbuf(&msg->chn->buf);
+        struct htx_ret htxret = htx_find_offset(htx, offset);
+        struct htx_blk *blk;
+
+        blk = htxret.blk;
+        offset = htxret.ret;
+        for (; blk; blk = htx_get_next_blk(htx, blk)) {
+            enum htx_blk_type type = htx_get_blk_type(blk);
+
+            if (type == HTX_BLK_UNUSED)
+                continue;
+            else if (type == HTX_BLK_DATA) {
+                /* filter data */
+            }
+            else
+                break;
+        }
+
+        return len;
+    }
+
+In addition, there are two other callbacks :
+
+  * 'flt_ops.http_headers' : This callback is called just before the HTTP body
+    forwarding and after any processing on the request/response HTTP
+    headers. When defined, this callback is always called for HTTP streams
+    (i.e. without needing to register for data filtering).
+    Here is an example :
+
+
+        /* Returns a negative value if an error occurs, 0 if it needs to wait,
+         * any other value otherwise. */
+        static int
+        my_filter_http_headers(struct stream *s, struct filter *filter,
+                               struct http_msg *msg)
+        {
+            struct my_filter_config *my_conf = FLT_CONF(filter);
+            struct htx *htx = htxbuf(&msg->chn->buf);
+            struct htx_sl *sl = http_get_stline(htx);
+            int32_t pos;
+
+            for (pos = htx_get_first(htx); pos != -1; pos = htx_get_next(htx, pos)) {
+                struct htx_blk *blk = htx_get_blk(htx, pos);
+                enum htx_blk_type type = htx_get_blk_type(blk);
+                struct ist n, v;
+
+                if (type == HTX_BLK_EOH)
+                    break;
+                if (type != HTX_BLK_HDR)
+                    continue;
+
+                n = htx_get_blk_name(htx, blk);
+                v = htx_get_blk_value(htx, blk);
+                /* Do something on the header name/value */
+            }
+
+            return 1;
+        }
+
+  * 'flt_ops.http_end' : This callback is called when the whole HTTP message was
+    processed. It may interrupt the stream processing. So, it could be used to
+    synchronize the HTTP request with the HTTP response, for instance :
+
+        /* Returns a negative value if an error occurs, 0 if it needs to wait,
+         * any other value otherwise. */
+        static int
+        my_filter_http_end(struct stream *s, struct filter *filter,
+                           struct http_msg *msg)
+        {
+            struct my_filter_ctx *my_ctx = filter->ctx;
+
+            if (!(msg->chn->flags & CF_ISRESP)) /* The request */
+                my_ctx->end_of_req = 1;
+            else /* The response */
+                my_ctx->end_of_rsp = 1;
+
+            /* Both the request and the response are finished */
+            if (my_ctx->end_of_req == 1 && my_ctx->end_of_rsp == 1)
+                return 1;
+
+            /* Wait */
+            return 0;
+        }
+
+Then, to finish, there are 2 informational callbacks :
+
+  * 'flt_ops.http_reset' : This callback is called when an HTTP message is
+    reset. This happens either when a 1xx informational response is received, or
+    if we're retrying to send the request to the server after it failed. It
+    could be useful to reset the filter context before receiving the true
+    response.
+    By checking s->txn->status, it is possible to know why this callback is
+    called. If it's a 1xx, we're called because of an informational
+    message. Otherwise, it is an L7 retry.
+
+  * 'flt_ops.http_reply' : This callback is called when, at any time, HAProxy
+    decides to stop the processing on an HTTP message and to send an internal
+    response to the client. This mainly happens when an error or a redirect
+    occurs.
+
+
+3.6.3 REWRITING DATA
+--------------------
+
+The last part, and the trickiest one about the data filtering, is about the data
+rewriting. For now, the filter API does not offer a lot of functions to handle
+it. There are only functions to notify HAProxy that the data size has changed to
+let it update the internal state of filters. It is the developer's
+responsibility to update the data itself, i.e. the buffer offsets, using the
+following function :
+
+  * 'flt_update_offsets()' : This function must be called when a filter alters
+    incoming data. It updates the offsets of the stream and of all filters
+    preceding the calling one. Not calling this function when a filter changes
+    the size of incoming data leads to undefined behavior.
+
+A good example of filter changing the data size is the HTTP compression filter.
diff --git a/doc/internals/api/htx-api.txt b/doc/internals/api/htx-api.txt
new file mode 100644
index 0000000..971328b
--- /dev/null
+++ b/doc/internals/api/htx-api.txt
@@ -0,0 +1,570 @@
+                -----------------------------------------------
+                                   HTX API
+                                  Version 1.1
+                          ( Last update: 2021-02-24 )
+                -----------------------------------------------
+                          Author : Christopher Faulet
+                      Contact : cfaulet at haproxy dot com
+
+1. Background
+
+Historically, HAProxy stored HTTP messages in a raw fashion in buffers, keeping
+parsing information separately in a "struct http_msg" owned by the stream. It was
+optimized for data transfer, but not so much for rewrites. It was also HTTP/1
+centered. While it was the only HTTP version supported, it was not a
+problem. But with the rise of HTTP/2, it became hard to keep using this
+representation.
+
+In the early days of HTTP/2 in HAProxy, H2 messages were converted into
+H1. This was terribly inefficient because it required two parsing passes, a
+first one in H2 and a second one in H1, with a conversion in the middle. And of
+course, the same was also true in the opposite direction: outgoing H1 messages
+had to be converted back to H2 to be sent. Even worse, because of the H2->H1
+conversion, only client H2 connections were supported.
+
+So, to address all these problems, we decided to replace the old raw
+representation by a version-agnostic and self-structured internal HTTP
+representation, the HTX. As an additional benefit, with this new representation,
+the message parsing and its processing are now separated, making all the HTTP
+analysis simpler and cleaner. The parsing of HTTP messages is now handled by
+the multiplexers (h1 or h2).
+
+
+2. The HTX message
+
+The HTX is a structure containing useful information about an HTTP message
+followed by a contiguous array with some parts of the message. These parts are
+called blocks. A block is composed of metadata (htx_blk) and an associated
+payload. Blocks' metadata are stored starting from the end of the array while
+their payloads are stored at the beginning. Blocks' metadata are often simply
+called blocks. It is a misuse of language that simplifies explanations.
+
+Internally, this structure is "hidden" in a buffer. This way, there are few
+changes in the intermediate layers (stream-interface and channels). They still
+manipulate buffers. Only the multiplexer and the stream have to know how data
+are really stored. From the HTX perspective, a buffer is just a memory
+area. When an HTX message is stored in a buffer, this one appears as full.
+
+  * General view of an HTX message :
+
+
+  buffer->area
+    |
+    |<------------ buffer->size == buffer->data ----------------------|
+    |                                                                 |
+    |     |<------------- Blocks array (htx->size) ------------------>|
+    V     |                                                           |
+    +-----+-----------------+-------------------------+---------------+
+    | HTX |   PAYLOADS ==>  |                         |  <== HTX_BLKs |
+    +-----+-----------------+-------------------------+---------------+
+          |                 |                         |               |
+          |<-payloads part->|<----- free space ------>|<-blocks part->|
+              (htx->data)
+
+
+The blocks part remains linear and sorted. It may be seen as an array with
+negative indexes. But, instead of using negative indexes, we use positive
+positions to identify a block. This position is then converted to an address
+relative to the beginning of the blocks array.
+
+                tail                                 head
+                 |                                     |
+                 V                                     V
+     .....--+----+-----------------------+------+------+
+            | Bn |       ...             |  B1  |  B0  |
+     .....--+----+-----------------------+------+------+
+                 ^                       ^      ^
+ Addr of the block       Addr of the block      Addr of the block
+ at the position N       at the position 1      at the position 0
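+
+As an illustration only, the conversion from a position to an address may be
+sketched as follows. This is a simplified model, not HAProxy's code : the names
+blk_meta and pos_to_addr() are hypothetical, and the metadata size is assumed
+to be 8 bytes (two 32-bit fields).
+
+    #include <stdint.h>
+    #include <stdio.h>
+
+    /* Hypothetical model of a block's metadata: two 32-bit fields. */
+    struct blk_meta {
+        uint32_t addr;
+        uint32_t info;
+    };
+
+    /* Offset of the metadata at position <pos>, relative to the beginning
+     * of a blocks array of <size> bytes. Position 0 is the last slot of
+     * the array, position 1 the slot just before it, and so on.
+     */
+    static uint32_t pos_to_addr(uint32_t size, uint32_t pos)
+    {
+        return size - (pos + 1) * (uint32_t)sizeof(struct blk_meta);
+    }
+
+    int main(void)
+    {
+        uint32_t size = 1024; /* arbitrary array size for the example */
+
+        printf("pos 0 -> %u\n", (unsigned)pos_to_addr(size, 0));
+        printf("pos 1 -> %u\n", (unsigned)pos_to_addr(size, 1));
+        return 0;
+    }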
+
+
+In the HTX structure, 3 "special" positions are stored :
+
+    - tail  : Position of the newest inserted block
+    - head  : Position of the oldest inserted block
+    - first : Position of the first block to (re)start the analysis
+
+The blocks part never wraps. If we have no space to allocate a new block and if
+there is a hole at the beginning of the blocks part (so at the end of the blocks
+array), we move back all blocks.
+
+
+      tail           head                                tail           head
+       |              |                                   |              |
+       V              V                                   V              V
+    ...+--------------+---------+    blocks  ...----------+--------------+
+       | X== HTX_BLKS |         |    defrag               | <== HTX_BLKS |
+    ...+--------------+---------+    =====>  ...----------+--------------+
+
+
+The payloads part is a raw space that may wrap. A block's payload must never be
+accessed directly. Instead a block must be selected to retrieve the address of
+its payload.
+
+
+          +------------------------( B0.addr )--------------------------+
+          |    +-------------------( B1.addr )----------------------+   |
+          |    |       +-----------( B2.addr )----------------+     |   |
+          V    V       V                                      |     |   |
+    +-----+----+-------+----+--------+-------------+-------+----+----+----+
+    | HTX | P0 |   P1  | P2 | ...==> |             | <=... | B2 | B1 | B0 |
+    +-----+----+-------+----+--------+-------------+-------+----+----+----+
+
+
+Because the payloads part may wrap, there are 2 usable free spaces :
+
+    - The free space in front of the blocks part. This one is used if and only if
+      the other one was not used yet.
+
+    - The free space at the beginning of the message. Once this one is used, the
+      other one is never used again, until a message defragmentation.
+
+
+  * Linear payloads part :
+
+
+      head_addr             end_addr     tail_addr
+          |                    |             |
+          V                    V             V
+    +-----+--------------------+-------------+--------------------+-------...
+    | HTX |                    |   PAYLOADS  |                    | HTX_BLKs
+    +-----+--------------------+-------------+--------------------+-------...
+          |<-- free space 2 -->|             |<-- free space 1 -->|
+     (used if the other is too small)          (used in priority)
+
+
+  * Wrapping payloads part :
+
+
+                            head_addr end_addr        tail_addr
+                                |        |                |
+                                V        V                V
+    +-----+----+----------------+--------+----------------+-------+-------...
+    | HTX |    | PAYLOADS part2 |        | PAYLOADS part1 |       | HTX_BLKs
+    +-----+----+----------------+--------+----------------+-------+-------...
+          |<-->|                |<------>|                |<----->|
+         unusable               free space                 unusable
+        free space                                        free space
+
+
+Finally, when the usable free space is not enough to store a new block, unusable
+parts may be reclaimed with a full defragmentation. The payloads part is then
+realigned at the beginning of the blocks array and the free space becomes
+continuous again.
+
+
+3. The HTX blocks
+
+An HTX block can be a start-line, a header, a body part or a trailer. For all
+these types of block, a payload is attached to the block. It can also be a
+marker, the end-of-headers or end-of-trailers. For these blocks, there is no
+payload but each one counts for a byte. It is important not to skip them when
+data are forwarded.
+
+As already said, a block is composed of metadata and a payload. Metadata are
+stored in the blocks part and are composed of 2 fields :
+
+    - info : It is a 32-bit field containing the block's type on 4 bits
+             followed by the payload length. See below for details.
+
+    - addr : The payload's address, if any, relative to the beginning of the
+             array used to store part of the HTTP message itself.
+
+
+  * Block's info representation :
+
+    0b 0000 0000 0000 0000 0000 0000 0000 0000
+       ---- ------------------------ ---------
+       type     value (1 MB max)     name length (header/trailer - 256B max)
+            ----------------------------------
+                 data length (256 MB max)
+     (body, method, path, version, status, reason)
+
+
+Supported types are :
+
+    - 0000  (0) : The request start-line
+    - 0001  (1) : The response start-line
+    - 0010  (2) : A header block
+    - 0011  (3) : The end-of-headers marker
+    - 0100  (4) : A data block
+    - 0101  (5) : A trailer block
+    - 0110  (6) : The end-of-trailers marker
+    - 1111 (15) : An unused block
+
+Other types are unused for now and reserved for future extensions.
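+
+The info layout may be illustrated by the following sketch. It is not taken
+from HAProxy's sources : the function names are hypothetical and only the bit
+positions shown in the block's info representation are assumed.
+
+    #include <stdint.h>
+    #include <stdio.h>
+
+    /* Header/trailer: 4-bit type, 20-bit value length, 8-bit name length */
+    static uint32_t mk_hdr_info(uint32_t type, uint32_t vlen, uint32_t nlen)
+    {
+        return (type << 28) | ((vlen & 0xfffff) << 8) | (nlen & 0xff);
+    }
+
+    /* Other blocks with a payload: 4-bit type, 28-bit data length */
+    static uint32_t mk_data_info(uint32_t type, uint32_t len)
+    {
+        return (type << 28) | (len & 0xfffffff);
+    }
+
+    int main(void)
+    {
+        uint32_t hdr  = mk_hdr_info(2, 9, 4);  /* e.g. "host: localhost" */
+        uint32_t data = mk_data_info(4, 1500); /* a 1500-byte body part */
+
+        printf("type=%u vlen=%u nlen=%u\n", (unsigned)(hdr >> 28),
+               (unsigned)((hdr >> 8) & 0xfffff), (unsigned)(hdr & 0xff));
+        printf("type=%u len=%u\n", (unsigned)(data >> 28),
+               (unsigned)(data & 0xfffffff));
+        return 0;
+    }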
+
+An HTX message is typically composed of the following blocks, in this order :
+
+    - a start-line
+    - zero or more header blocks
+    - an end-of-headers marker
+    - zero or more data blocks
+    - zero or more trailer blocks (optional)
+    - an end-of-trailers marker (optional but always set if there is at least
+      one trailer block)
+
+Only one HTTP request at a time can be stored in an HTX message. For HTTP
+responses, it is more complicated. Only one "final" response can be stored in an
+HTX message. It is a response with a status code equal to 101, or greater than
+or equal to 200. But it may be preceded by several 1xx informational responses.
+Such responses are part of the same HTX message.
+
+When the end of the message is reached a special flag is set on the message
+(HTX_FL_EOM). It means no more data are expected for this message, except
+tunneled data. But tunneled data will never be mixed with message data to avoid
+ambiguities. Thus once the flag marking the end of the message is set, it is
+easy to know where the message ends. The end is reached if the HTX message is
+empty or on the tail HTX block of the HTX message. Once all blocks of the HTX
+message are consumed, tunneled data, if any, may be transferred.
+
+
+3.1. The start-line
+
+Every HTX message starts with a start-line. Its payload is a "struct htx_sl". In
+addition to the parts of the HTTP start-line, this structure contains some
+information about the represented HTTP message, mainly in the form of flags
+(HTX_SL_F_*). For instance, if an HTTP message contains the header
+"content-length", then the flag HTX_SL_F_CLEN is set.
+
+Each HTTP message has its own start-line. So an HTX request has one and only one
+start-line because it must contain only one HTTP request at a time. But an HTX
+response may have more than one start-line if the final HTTP response is
+preceded by some 1xx informational responses.
+
+In HTTP/2, there is no start-line. So the H2 multiplexer must create one when it
+converts an H2 message to HTX :
+
+    - For the request, it uses the pseudo headers ":method", ":path" or
+      ":authority" depending on the method and the hardcoded version "HTTP/2.0".
+
+    - For the response, it uses the hardcoded version "HTTP/2.0", the
+      pseudo-header ":status" and an empty reason.
+
+
+3.2. The headers and trailers
+
+HTX Headers and trailers are quite similar. Different types are used to simplify
+headers processing. But from the HTX point of view, there is no real difference,
+except their position in the HTX message. The header blocks always follow an HTX
+start-line while trailer blocks come after the data. If there is no data, they
+follow the end-of-headers marker.
+
+Headers and trailers are the only blocks containing a Key/Value payload. The
+corresponding end-of marker must always be placed after each group to mark, as
+its name suggests, the end.
+
+In HTTP/1, trailers are only present on chunked messages. But chunked messages
+do not always have trailers. In this case, the end-of-trailers block may or may
+not be present. Multiplexers must be able to handle both situations. In HTTP/2,
+trailers are only present if a HEADERS frame is sent after DATA frames.
+
+
+3.3. The data
+
+The payload body of an HTTP message is stored as DATA blocks in the HTX
+message. For HTTP/1 messages, it is the message body without the chunks
+formatting, if any. For HTTP/2, it is the payload of DATA frames.
+
+The DATA blocks are the only HTX blocks that may be partially processed (copied
+or removed). All other types of block must be entirely processed. This means
+DATA blocks can be resized.
+
+
+3.4. The end-of markers
+
+These blocks are used to delimit parts of an HTX message. There are two
+markers :
+
+    - end-of-headers (EOH)
+    - end-of-trailers (EOT)
+
+EOH is always present in an HTX message. EOT is optional.
+
+
+4. The HTX API
+
+
+4.1. Get/set HTX message from/to the underlying buffer
+
+The first thing to do to process an HTX message is to get it from the underlying
+buffer. There are 2 functions to do so, the second one relying on the first :
+
+    - htxbuf() returns an HTX message from a buffer. It does not modify the
+      buffer. It only initializes the HTX message if the buffer is empty.
+
+    - htx_from_buf() uses htxbuf(). But it also updates the underlying buffer so
+      that it appears as full.
+
+Both functions return a "zero-sized" HTX message if the buffer is null. This
+way, the HTX message is always valid. The first function is the default function
+to use. The second one is only useful when some content will be added. For
+instance, it is used by the HTX analyzers when HAProxy generates a response.
+Thus, the buffer is in the right state.
+
+Once the processing is done, if the HTX message has been modified, the
+underlying buffer must also be updated, except if htx_from_buf() was used _AND_
+data was only added. For all other cases, the function htx_to_buf() must be
+called.
+
+Finally, the function htx_reset() may be called at any time to reset an HTX
+message. And the function buf_room_for_htx_data() may be called to know if a raw
+buffer is full from the HTX perspective. It is used during conversion from/to
+the HTX.
+
+
+4.2. Helpers to deal with free space in an HTX message
+
+Once an HTX message is available, the following functions may help to process
+it :
+
+    - htx_used_space() and htx_meta_space() return, respectively, the total
+      space used in an HTX message and the space used by block's metadata only.
+
+    - htx_free_space() and htx_free_data_space() return, respectively, the total
+      free space in an HTX message and the free space available for the payload
+      if a new HTX block is stored (so it is the total free space minus the size
+      of an HTX block).
+
+    - htx_is_empty() and htx_is_not_empty() are boolean functions to know if an
+      HTX message is empty or not.
+
+    - htx_get_max_blksz() returns the maximum size available for the payload,
+      not exceeding a maximum, metadata included.
+
+    - htx_almost_full() should be used to know if an HTX message uses at least
+      3/4 of its capacity.
+
+
+4.3. HTX Blocks manipulations
+
+Once the available space in an HTX message is known, the next step is to add HTX
+blocks. First of all the function htx_nbblks() returns the number of blocks
+allocated in an HTX message. Then, there is an add function per block's type :
+
+    - htx_add_stline() adds a start-line. The type (request or response) and the
+      flags of the start-line must be provided, as well as its three parts
+      (method,uri,version or version,status-code,reason).
+
+    - htx_add_header() and htx_add_trailer() are similar. The name and the
+      value must be provided. The inserted HTX block is returned on success or
+      NULL if an error occurred.
+
+    - htx_add_endof() must be used to add any end-of marker. The block's type
+      (EOH or EOT) must be specified. The inserted HTX block is returned on
+      success or NULL if an error occurred.
+
+    - htx_add_all_headers() and htx_add_all_trailers() add, respectively, a list
+      of headers and a list of trailers, followed by the appropriate end-of
+      marker. On success, this marker is returned. Otherwise, NULL is
+      returned. Note there is no rollback on the HTX message when an error
+      occurred. Some headers or trailers may have been added. So it is the
+      caller's responsibility to take care of that.
+
+    - htx_add_data() must be used to add a DATA block. Unlike previous
+      functions, this one returns the number of bytes copied or 0 if nothing was
+      copied. If possible, the data are appended to the tail block if it is a
+      DATA block. Only a part of the payload may be copied because this function
+      will try to limit the message defragmentation and the wrapping of blocks
+      as far as possible.
+
+    - htx_add_data_atonce() must be used if all data must be added or nothing.
+      It tries to insert the whole payload and returns the inserted block on
+      success. Otherwise it returns NULL.
+
+When an HTX block is added, it is always the last one (the tail). But, if a
+block must be added at a specific place, it is not really handy. 2 functions may
+help (others could be added) :
+
+    - htx_add_last_data() adds a DATA block just after all other DATA blocks and
+      before any trailers and EOT marker. It relies on htx_add_data_atonce(), so
+      a defragmentation may be performed.
+
+    - htx_move_blk_before() moves a specific block just before another one. Both
+      blocks must already be in the HTX message and the block to move must
+      always be placed after the "pivot".
+
+Once added, the following functions may be used to update or remove a block :
+
+    - htx_replace_stline() updates a start-line. The HTX block must be passed as
+      argument. Only string parts of the start-line are updated by this
+      function. On success, it returns the new start-line. So it is pretty easy
+      to update its flags. NULL is returned if an error occurred.
+
+    - htx_replace_header() fully replaces a header (its name and its value) by a
+      new one. The HTX block must be passed as argument, as well as its new name
+      and its new value. The new header can be smaller or larger than the old
+      one. This function returns the new HTX block on success, or NULL if an
+      error occurred.
+
+    - htx_replace_blk_value() replaces a part of a block's payload or its
+      totality. It works for HEADERS, TRAILERS or DATA blocks. The HTX block
+      must be provided with the part to remove and the new one. The new part can
+      be smaller or larger than the old one. This function returns the new HTX
+      block on success, or NULL if an error occurred.
+
+    - htx_change_blk_value_len() changes the size of the value. It is the
+      caller's responsibility to change the value itself, make sure there is
+      enough space and update allocated value. This function updates the HTX
+      message accordingly.
+
+    - htx_set_blk_value_len() changes the size of the value. It is the caller's
+      responsibility to change the value itself, make sure there is enough space
+      and update allocated value. Unlike the function
+      htx_change_blk_value_len(), this one does not update the HTX message. So
+      it should be used with caution.
+
+    - htx_cut_data_blk() removes <n> bytes from the beginning of a DATA
+      block. The block's start address and its length are adjusted, and the
+      htx's total data count is updated. This is used to mark that part of some
+      data were transferred from a DATA block without removing this DATA
+      block. No sanity check is performed, the caller is responsible for doing
+      this exclusively on DATA blocks, and never removing more than the block's
+      size.
+
+    - htx_remove_blk() removes a block from an HTX message. It returns the
+      following block or NULL if it is the tail block.
+
+Finally, a block may be removed using the function htx_remove_blk(). This
+function returns the block following the one removed or NULL if it is the tail
+block.
+
+
+4.4. The HTX start-line
+
+Unlike other HTX blocks, the start-line is a bit special because its payload is
+a structure followed by its three parts :
+
+        +--------+-------+-------+-------+
+        | HTX_SL | PART1 | PART2 | PART3 |
+        +--------+-------+-------+-------+
+
+Some macros and functions may help to manipulate these parts :
+
+    - HTX_SL_P{N}_LEN() and HTX_SL_P{N}_PTR() are macros to get the length of a
+      part and a pointer on it. {N} should be 1, 2 or 3.
+
+    - HTX_SL_REQ_MLEN(), HTX_SL_REQ_ULEN(), HTX_SL_REQ_VLEN(),
+      HTX_SL_REQ_MPTR(), HTX_SL_REQ_UPTR() and HTX_SL_REQ_VPTR() are macros to
+      get info about a request start-line. These macros only wrap HTX_SL_P*
+      ones.
+
+    - HTX_SL_RES_VLEN(), HTX_SL_RES_CLEN(), HTX_SL_RES_RLEN(),
+      HTX_SL_RES_VPTR(), HTX_SL_RES_CPTR() and HTX_SL_RES_RPTR() are macros to
+      get info about a response start-line. These macros only wrap HTX_SL_P*
+      ones.
+
+    - htx_sl_p1(), htx_sl_p2() and htx_sl_p3() are functions to get the ist
+      corresponding to the right part of a start-line.
+
+    - htx_sl_req_meth(), htx_sl_req_uri() and htx_sl_req_vsn() get the ist
+      corresponding to the right part of a request start-line.
+
+    - htx_sl_res_vsn(), htx_sl_res_code() and htx_sl_res_reason() get the ist
+      corresponding to the right part of a response start-line.
+
+
+4.5. Iterate on the HTX message
+
+To iterate on an HTX message, the first thing to do is to get the HTX block to
+start the loop. There are three special blocks in an HTX message that may be
+good candidates to start a loop :
+
+    - the head block. It is the oldest inserted block. Multiplexers always start
+      to consume an HTX message from this block. The function htx_get_head()
+      returns its position and htx_get_head_blk() returns the block itself. In
+      addition, the function htx_get_head_type() returns its block's type.
+
+    - the tail block. It is the newest inserted block. The function
+      htx_get_tail() returns its position and htx_get_tail_blk() returns the
+      block itself. In addition, the function htx_get_tail_type() returns its
+      block's type.
+
+    - the first block. It is the block where to (re)start the analysis. It is
+      used as start point by HTX analyzers. The function htx_get_first() returns
+      its position and htx_get_first_blk() returns the block itself. In
+      addition, the function htx_get_first_type() returns its block's type.
+
+For all these functions, if the HTX message is empty, -1 is returned for the
+block's position, NULL instead of a block and HTX_BLK_UNUSED for its type.
+
+Then to iterate on blocks, forward or backward :
+
+    - htx_get_prev() and htx_get_next() return, respectively, the position of
+      the previous block or the next block, given a specific position. Or -1 if
+      an edge is reached.
+
+    - htx_get_prev_blk() and htx_get_next_blk() return, respectively, the
+      previous block or the next one, given a specific block. Or NULL if an edge
+      is reached.
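+
+As a sketch, counting the header blocks of a message from its head may look
+like the code below. It only uses functions mentioned in this documentation and
+assumes it is built inside the HAProxy source tree ; count_hdr_blocks() is a
+hypothetical name used for illustration.
+
+    #include <haproxy/htx.h>
+
+    static int count_hdr_blocks(const struct htx *htx)
+    {
+        int32_t pos;
+        int count = 0;
+
+        /* htx_get_head() returns -1 on an empty message, and
+         * htx_get_next() returns -1 once the tail is passed.
+         */
+        for (pos = htx_get_head(htx); pos != -1; pos = htx_get_next(htx, pos)) {
+            struct htx_blk *blk = htx_get_blk(htx, pos);
+
+            if (htx_get_blk_type(blk) == HTX_BLK_HDR)
+                count++;
+        }
+        return count;
+    }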
+
+4.6. Access block content and info
+
+Following functions may be used to retrieve information about a specific HTX
+block :
+
+    - htx_get_blk_pos() returns the position of a block. It must be in the HTX
+      message.
+
+    - htx_get_blk_ptr() returns a pointer on the payload of a block.
+
+    - htx_get_blk_type() returns the type of a block.
+
+    - htx_get_blksz() returns the payload size of a block.
+
+    - htx_get_blk_name() returns the name of a block, only if it is a header or
+      a trailer. Otherwise, it returns an empty string.
+
+    - htx_get_blk_value() returns the value of a block, depending on its
+      type. For header and trailer blocks, it is the value field. For markers
+      (EOH or EOT), an empty string is returned. For other blocks an ist
+      pointing on the block payload is returned.
+
+    - htx_is_unique_blk() may be used to know if a block is the only one
+      remaining inside an HTX message, excluding unused blocks. This function is
+      pretty useful to determine the end of an HTX message, in conjunction with
+      the HTX_FL_EOM flag.
+
+4.7. Advanced functions
+
+Some more advanced functions may be used to do complex processing on the HTX
+message. These functions are used by HTX analyzers or by multiplexers.
+
+    - htx_truncate() removes all blocks after the one containing a specific
+      offset relative to the head block of the HTX message. If the offset is
+      inside a DATA block, it is truncated. For all other blocks, the removal
+      starts at the next block.
+
+    - htx_drain() tries to remove a specific amount of bytes of payload. If the
+      tail block is a DATA block, it may be truncated if necessary. All other
+      blocks are removed at once or kept. This function returns a mixed value,
+      with the first block not removed, or NULL if everything was removed, and
+      the amount of data drained.
+
+    - htx_xfer_blks() transfers HTX blocks from an HTX message to another,
+      stopping on the first block of a specified type or when a specific amount
+      of bytes, including meta-data, was moved. If the tail block is a DATA
+      block, it may be partially moved. All other blocks are transferred at once
+      or kept. This function returns a mixed value, with the last block moved,
+      or NULL if nothing was moved, and the amount of data transferred. When
+      HEADERS or TRAILERS blocks must be transferred, this function transfers
+      all of them. Otherwise, if it is not possible, it triggers an error. It is
+      the caller's responsibility to transfer all headers or trailers at once.
+
+    - htx_append_msg() appends an HTX message to another one. The whole message
+      is copied, or nothing. So, if an error occurred, a rollback is performed.
+      This function returns 1 on success and 0 on error.
+
+    - htx_reserve_max_data() reserves the maximum possible size for an HTX data
+      block, by extending an existing one or by creating a new one. It returns a
+      compound result with the HTX block and the position where new data must be
+      inserted (0 for a new block). If an error occurs or if there is no space
+      left, NULL is returned instead of a pointer on an HTX block.
+
+    - htx_find_offset() looks for the HTX block containing a specific offset,
+      starting at the HTX message's head. The function returns the found HTX
+      block and the position inside this block where the offset is. If the
+      offset is outside of the HTX message, NULL is returned.
+
+    - htx_defrag() defragments an HTX message. It removes unused blocks and
+      unwraps the payloads part. A temporary buffer is used to do so. This
+      function never fails. A referenced block may be provided. If so, the
+      corresponding new block is returned. Otherwise, NULL is returned.
diff --git a/doc/internals/api/initcalls.txt b/doc/internals/api/initcalls.txt
new file mode 100644
index 0000000..30d8737
--- /dev/null
+++ b/doc/internals/api/initcalls.txt
@@ -0,0 +1,360 @@
+Initialization stages aka how to get your code initialized at the right moment
+
+
+1. Background
+
+Originally all subsystems were initialized via a dedicated function call
+from the huge main() function. Then some code started to become conditional
+or a bit more modular and the #ifdef placed there became a mess, resulting
+in init code being moved to function constructors in each subsystem's own
+file. Then pools of various things were introduced, starting to make the
+whole init sequence more complicated due to some forms of internal
+dependencies. Later epoll was introduced, requiring a post-fork callback,
+and finally threads arrived also requiring some post-thread init/deinit
+and allocation, marking the old architecture's last breath. Finally the
+whole thing resulted in lots of init code duplication and was simplified
+in 1.9 with the introduction of initcalls and initialization stages.
+
+
+2. New architecture
+
+The new architecture relies on two layers :
+  - the registration functions
+  - the INITCALL macros and initialization stages
+
+The first ones are mostly used to add a callback to a list. The second ones
+are used to specify when to call a function. Both are totally independent,
+however they are generally combined via another set consisting in the REGISTER
+macros which make some registration functions be called at some specific points
+during the init sequence.
+
+
+3. Registration functions
+
+Registration functions never fail. Or more precisely, if they fail it will only
+be on out-of-memory condition, and they will cause the process to immediately
+exit. As such they do not return any status and the caller doesn't have to care
+about their success.
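+
+As a rough illustration of the rules above, here is a minimal self-contained
+sketch of what such a registration function may look like. Everything prefixed
+with "demo_" is a hypothetical stand-in and not HAProxy's actual code: the
+registration appends a callback to a list, exits on allocation failure, and
+returns nothing to its caller.
+
```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical list element; HAProxy's real structures differ. */
struct demo_check_fct {
	int (*fct)(void);              /* non-zero on success, zero on failure */
	struct demo_check_fct *next;
};

static struct demo_check_fct *demo_post_checks;
static struct demo_check_fct **demo_post_checks_tail = &demo_post_checks;

/* Registration never reports a failure: it either succeeds or exits. */
void demo_register_post_check(int (*fct)(void))
{
	struct demo_check_fct *e = calloc(1, sizeof(*e));

	if (!e) {
		fprintf(stderr, "out of memory while registering a callback\n");
		exit(1);
	}
	e->fct = fct;
	*demo_post_checks_tail = e;
	demo_post_checks_tail = &e->next;
}

/* Runs the callbacks in registration order, stops at the first failure.
 * Returns 1 if all of them succeeded, otherwise 0.
 */
int demo_run_post_checks(void)
{
	struct demo_check_fct *e;

	for (e = demo_post_checks; e; e = e->next)
		if (!e->fct())
			return 0;
	return 1;
}

/* Two sample callbacks counting how many times they were called. */
int demo_calls;
int demo_ok(void)   { demo_calls++; return 1; }
int demo_fail(void) { demo_calls++; return 0; }
```
+
+The caller never checks anything after demo_register_post_check(); only the
+code walking the list later looks at the callbacks' return values.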
+
+All available functions are described below in alphanumeric ordering. Please
+make sure to respect this ordering when adding new ones.
+
+- void hap_register_build_opts(const char *str, int must_free)
+
+  This appends the zero-terminated constant string <str> to the list of known
+  build options that will be reported on the output of "haproxy -vv". A line
+  feed character ('\n') will automatically be appended after the string when it
+  is displayed. The <must_free> argument must be zero, unless the string was
+  allocated by any malloc-compatible function such as malloc()/calloc()/
+  realloc()/strdup() or memprintf(), in which case it's better to pass a
+  non-null value so that the string is freed upon exit. Note that despite the
+  function's prototype taking a "const char *", the pointer will actually be
+  cast and freed. The const char* is here to leave more freedom to use consts
+  when making such options lists.
+
+- void hap_register_per_thread_alloc(int (*fct)())
+
+  This adds a call to function <fct> to the list of functions to be called when
+  threads are started, at the beginning of the polling loop. This is also valid
+  for the main thread and will be called even if threads are disabled, so that
+  it is guaranteed that this function will be called in any circumstance. Each
+  thread will first call all these functions exactly once when it starts. Calls
+  are serialized by the init_mutex, so that locking is not necessary in these
+  functions. There is no relation between the thread numbers and the callback
+  ordering. The function is expected to return non-zero on success, or zero on
+  failure. A failure will make the process emit a succinct error message and
+  immediately exit. See also hap_register_per_thread_init() for functions
+  called after these ones.
+
+- void hap_register_per_thread_deinit(void (*fct)())
+
+  This adds a call to function <fct> to the list of functions to be called when
+  threads are gracefully stopped, at the end of the polling loop. This is also
+  valid for the main thread and will be called even if threads are disabled, so
+  that it is guaranteed that this function will be called in any circumstance
+  if the process experiences a soft stop. Each thread will call this function
+  exactly once when it stops. However contrary to _alloc() and _init(), the
+  calls are made without any protection, thus if any shared resource is touched
+  by the function, the function is responsible for protecting it. The reason
+  behind this is that such resources are very likely to be still in use in one
+  other thread and that most of the time the functions will in fact only touch
+  a refcount or deinitialize their private resources. See also
+  hap_register_per_thread_free() for functions called after these ones.
+
+- void hap_register_per_thread_free(void (*fct)())
+
+  This adds a call to function <fct> to the list of functions to be called when
+  threads are gracefully stopped, at the end of the polling loop, after all calls
+  to _deinit() callbacks are done for this thread. This is also valid for the
+  main thread and will be called even if threads are disabled, so that it is
+  guaranteed that this function will be called in any circumstance if the
+  process experiences a soft stop. Each thread will call this function exactly
+  once when it stops. However contrary to _alloc() and _init(), the calls are
+  made without any protection, thus if any shared resource is touched by the
+  function, the function is responsible for protecting it. The reason behind
+  this is that such resources are very likely to be still in use in one other
+  thread and that most of the time the functions will in fact only touch a
+  refcount or deinitialize their private resources. See also
+  hap_register_per_thread_deinit() for functions called before these ones.
+
+- void hap_register_per_thread_init(int (*fct)())
+
+  This adds a call to function <fct> to the list of functions to be called when
+  threads are started, at the beginning of the polling loop, right after the
+  list of _alloc() functions. This is also valid for the main thread and will
+  be called even if threads are disabled, so that it is guaranteed that this
+  function will be called in any circumstance. Each thread will call this
+  function exactly once when it starts, and calls are serialized by the
+  init_mutex which is held over all _alloc() and _init() calls, so that locking
+  is not necessary in these functions. In other words for all threads but the
+  current one, the sequence of _alloc() and _init() calls will be atomic. There
+  is no relation between the thread numbers and the callback ordering. The
+  function is expected to return non-zero on success, or zero on failure. A
+  failure will make the process emit a succinct error message and immediately
+  exit. See also hap_register_per_thread_alloc() for functions called before
+  these ones.
+
+- void hap_register_pre_check(int (*fct)())
+
+  This adds a call to function <fct> to the list of functions to be called at
+  the step just before the configuration validity checks. This is useful when
+  things need to be created as if they had been declared during configuration
+  parsing, with their initialization continuing in the configuration check.
+  It could be used for example to generate a proxy with multiple servers using
+  the configuration parser itself. At this step the trash buffers are allocated.
+  Threads are not yet started so no protection is required. The function is
+  expected to return non-zero on success, or zero on failure. A failure will make
+  the process emit a succinct error message and immediately exit.
+
+- void hap_register_post_check(int (*fct)())
+
+  This adds a call to function <fct> to the list of functions to be called at
+  the end of the configuration validity checks, just at the point where the
+  program either forks or exits depending whether it's called with "-c" or not.
+  Such calls are suited for memory allocation or internal table pre-computation
+  that would preferably not be done on the fly to avoid inducing extra time to
+  a pure configuration check. Threads are not yet started so no protection is
+  required. The function is expected to return non-zero on success, or zero on
+  failure. A failure will make the process emit a succinct error message and
+  immediately exit.
+
+- void hap_register_post_deinit(void (*fct)())
+
+  This adds a call to function <fct> to the list of functions to be called when
+  freeing the global sections at the end of deinit(), after everything is
+  stopped. The process is single-threaded at this point, thus these functions
+  are suitable for releasing configuration elements provided that no other
+  _deinit() function uses them, i.e. only close/release what is strictly
+  private to the subsystem. Since such functions are mostly only called during
+  soft stops (reloads) or failed startups, they tend to experience much less
+  test coverage than others despite being more exposed, and as such a lot of
+  care must be taken to test them especially when facing partial subsystem
+  initializations followed by errors.
+
+- void hap_register_post_proxy_check(int (*fct)(struct proxy *))
+
+  This adds a call to function <fct> to the list of functions to be called for
+  each proxy, after the calls to _post_server_check(). This makes it possible,
+  for example, to pre-configure default values for an option in a frontend
+  based on the "bind" lines, or something in a backend based on the "server"
+  lines. Such a function must be careful not to waste too much time, in order
+  not to significantly slow down configurations with tens of thousands of
+  backends. The function is expected to return non-zero on success, or zero on
+  failure. A failure will make the process emit a succinct error message and
+  immediately exit.
+
+- void hap_register_post_server_check(int (*fct)(struct server *))
+
+  This adds a call to function <fct> to the list of functions to be called for
+  each server, after the call to check_config_validity(). This makes it
+  possible, for example, to preset a health state on a server or to allocate a
+  protocol-specific memory area. Such a function must be careful not to waste
+  too much time, in order not to significantly slow down configurations with
+  tens of thousands of servers. The function is expected to return non-zero on
+  success, or zero on failure. A failure will make the process emit a succinct
+  error message and immediately exit.
+
+- void hap_register_proxy_deinit(void (*fct)(struct proxy *))
+
+  This adds a call to function <fct> to the list of functions to be called when
+  freeing the resources during deinit(). These functions will be called as part
+  of the proxy's resource cleanup. Note that some of the proxy's fields will
+  already have been freed and others not, so such a function must not use any
+  information from the proxy that is subject to being released. In particular,
+  all servers have already been deleted. Since such functions are mostly only
+  called during soft stops (reloads) or failed startups, they tend to
+  experience much less test coverage than others despite being more exposed,
+  and as such a lot of care must be taken to test them especially when facing
+  partial subsystem initializations followed by errors. It's worth mentioning
+  that too slow functions could have a significant impact on the configuration
+  check or exit time especially on large configurations.
+
+- void hap_register_server_deinit(void (*fct)(struct server *))
+
+  This adds a call to function <fct> to the list of functions to be called when
+  freeing the resources during deinit(). These functions will be called as part
+  of the server's resource cleanup. Note that some of the server's fields will
+  already have been freed and others not, so such a function must not use any
+  information from the server that is subject to being released. Since such
+  functions are mostly only called during soft stops (reloads) or failed
+  startups, they tend to experience much less test coverage than others despite
+  being more exposed, and as such a lot of care must be taken to test them
+  especially when facing partial subsystem initializations followed by errors.
+  It's worth mentioning that too slow functions could have a significant impact
+  on the configuration check or exit time especially on large configurations.
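+
+The relative ordering of the four per-thread callback families described above
+can be summarized by the following single-threaded sketch. It is only an
+illustration under simplifying assumptions: the "demo_" functions are
+hypothetical, there is one callback per family, and the init mutex is only
+mentioned in a comment instead of being taken.
+
```c
#include <string.h>

/* Hypothetical trace buffer used to observe the calling order. */
static char demo_trace[64];

static void demo_note(const char *step)
{
	strncat(demo_trace, step, sizeof(demo_trace) - strlen(demo_trace) - 1);
}

/* Stand-ins for callbacks registered through the four per-thread functions. */
static int  demo_alloc(void)  { demo_note("alloc ");  return 1; }
static int  demo_init(void)   { demo_note("init ");   return 1; }
static void demo_deinit(void) { demo_note("deinit "); }
static void demo_free(void)   { demo_note("free");    }

/* What one thread goes through; returns 0 if its startup failed. */
int demo_thread_lifecycle(void)
{
	/* Startup: per the text above, both steps are serialized by the
	 * init mutex, and a zero return value is fatal.
	 */
	if (!demo_alloc() || !demo_init())
		return 0;

	demo_note("poll ");   /* the polling loop would run here */

	/* Soft stop: no lock is held, so callbacks touching shared
	 * resources have to protect them themselves.
	 */
	demo_deinit();
	demo_free();
	return 1;
}

const char *demo_get_trace(void)
{
	return demo_trace;
}
```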
+
+
+4. Initialization stages
+
+In order to offer some guarantees, the startup of the program is split into
+several stages. Some callbacks can be placed into each of these stages using
+an INITCALL macro, with 0 to 3 arguments, respectively called INITCALL0 to
+INITCALL3. These macros must be placed anywhere at the top level of a C file,
+preferably at the end so that the referenced symbols have already been met,
+but it may also be fine to place them right after the callbacks themselves.
+
+Such callbacks are referenced into small structures containing a pointer to the
+function and 3 arguments. NULL replaces unused arguments. The callbacks are
+cast to (void (*)(void *, void *, void *)) and the arguments to (void *).
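+
+A descriptor of this kind may be pictured as in the sketch below. The
+structure and function names are hypothetical ("demo_" prefix) and the layout
+is only inferred from the description above. To keep the example strictly
+valid C, the sample callback really takes three "void *" arguments instead of
+being cast from a shorter prototype.
+
```c
#include <stddef.h>

/* Hypothetical descriptor: one function pointer and three arguments. */
struct demo_initcall {
	void (*fct)(void *arg1, void *arg2, void *arg3);
	void *arg1;
	void *arg2;
	void *arg3;
};

static int demo_counter;
static int demo_step = 5;

/* Adds *a2 to *a1; the third argument is unused. */
static void demo_add(void *a1, void *a2, void *a3)
{
	int *dst = a1;
	const int *inc = a2;

	(void)a3;
	*dst += *inc;
}

/* Equivalent in spirit to a two-argument initcall: NULL fills the gap. */
static const struct demo_initcall demo_entry = {
	.fct  = demo_add,
	.arg1 = &demo_counter,
	.arg2 = &demo_step,
	.arg3 = NULL,
};

/* Calls the function referenced by the descriptor with its arguments. */
void demo_run_initcall(const struct demo_initcall *ic)
{
	ic->fct(ic->arg1, ic->arg2, ic->arg3);
}
```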
+
+The first argument to the INITCALL macro is the initialization stage. The
+second one is the callback function, and others if any are the arguments.
+The init stage must be among the values of the "init_stage" enum, currently,
+and in this execution order:
+
+  - STG_PREPARE  : used to preset variables, pre-initialize lookup tables and
+                   pre-initialize list heads
+  - STG_LOCK     : used to pre-initialize locks
+  - STG_REGISTER : used to register static lists such as keywords
+  - STG_ALLOC    : used to allocate the required structures
+  - STG_POOL     : used to create pools
+  - STG_INIT     : used to initialize subsystems
+
+Each stage is guaranteed that previous stages have successfully completed. This
+means that an INITCALL placed at stage STG_INIT is guaranteed that all pools
+were already created and will be usable. Conversely, an INITCALL placed at
+stage STG_REGISTER must not rely on any field that requires preliminary
+allocation nor initialization. A callback cannot rely on other callbacks of the
+same stage, as the execution order within a stage is undefined and essentially
+depends on the linking order.
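+
+The stage ordering guarantee can be sketched as follows. The enum values are
+the ones listed above, re-declared locally; DEMO_STG_COUNT, the table and all
+"demo_" functions are assumptions made for this example. The table is
+deliberately declared out of order to show that only the stage decides when a
+callback runs.
+
```c
#include <stddef.h>
#include <string.h>

enum init_stage {
	STG_PREPARE, STG_LOCK, STG_REGISTER, STG_ALLOC, STG_POOL, STG_INIT,
	DEMO_STG_COUNT   /* sentinel, specific to this sketch */
};

struct demo_call {
	enum init_stage stg;
	void (*fct)(void);
};

/* Records one letter per executed callback. */
static char demo_order[32];

static void demo_mark(char c)
{
	size_t len = strlen(demo_order);

	if (len + 1 < sizeof(demo_order)) {
		demo_order[len] = c;
		demo_order[len + 1] = 0;
	}
}

static void demo_setup_tables(void) { demo_mark('P'); }
static void demo_register_kw(void)  { demo_mark('R'); }
static void demo_create_pool(void)  { demo_mark('O'); }
static void demo_start_subsys(void) { demo_mark('I'); }

static const struct demo_call demo_calls[] = {
	{ STG_INIT,     demo_start_subsys },
	{ STG_POOL,     demo_create_pool  },
	{ STG_PREPARE,  demo_setup_tables },
	{ STG_REGISTER, demo_register_kw  },
};

/* Walks the stages in order and runs the callbacks attached to each. */
void demo_run_all_stages(void)
{
	size_t i;
	int stg;

	for (stg = STG_PREPARE; stg < DEMO_STG_COUNT; stg++)
		for (i = 0; i < sizeof(demo_calls) / sizeof(demo_calls[0]); i++)
			if (demo_calls[i].stg == (enum init_stage)stg)
				demo_calls[i].fct();
}

const char *demo_get_order(void)
{
	return demo_order;
}
```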
+
+The STG_REGISTER level is made for run-time linking of the various modules that
+compose the executable. Keywords, protocols and various other elements that are
+locally known to each compilation unit can be appended into common lists at
+boot time. This is why this call is placed just before STG_ALLOC.
+
+Example: register a very early call to init_log() with no argument, and another
+         call to cli_register_kw(&cli_kws) much later:
+
+   INITCALL0(STG_PREPARE, init_log);
+   INITCALL1(STG_REGISTER, cli_register_kw, &cli_kws);
+
+Technically speaking, each call to such a macro adds a distinct local symbol
+whose dynamic name involves the line number. These symbols are placed into a
+separate section and the beginning and end section pointers are provided by the
+linker. When too old a linker is used, a fallback is applied consisting in
+placing them into a linked list which is built by a constructor function for
+each initcall (this takes more room).
+
+Due to the symbols internally using the line number, it is very important not
+to place more than one INITCALL per line in the source file.
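+
+The mechanism can be approximated by the sketch below, which assumes GCC or
+Clang targeting an ELF platform with a GNU-compatible linker. The macro, the
+"demo_init" section name and the "demo_" symbols are hypothetical and much
+simpler than the real INITCALL macros. It shows why the line number matters:
+each use defines a symbol named after __LINE__, so two uses on the same line
+would define the same symbol twice.
+
```c
#include <stddef.h>

struct demo_initcall {
	void (*fct)(void);
};

#define DEMO_CAT2(a, b) a##b
#define DEMO_CAT(a, b)  DEMO_CAT2(a, b)

/* Emits one descriptor named demo_ic_<line> into the "demo_init" section. */
#define DEMO_INITCALL0(function)                                      \
	static struct demo_initcall DEMO_CAT(demo_ic_, __LINE__)      \
	__attribute__((used, section("demo_init"))) = { (function) }

/* Section boundaries, provided by the linker for sections whose name is a
 * valid C identifier.
 */
extern struct demo_initcall __start_demo_init[];
extern struct demo_initcall __stop_demo_init[];

static int demo_hits;
static void demo_first(void)  { demo_hits += 1; }
static void demo_second(void) { demo_hits += 10; }

DEMO_INITCALL0(demo_first);
DEMO_INITCALL0(demo_second);

/* Calls every descriptor found in the section, returns how many were run. */
int demo_run_initcalls(void)
{
	struct demo_initcall *ic;
	int count = 0;

	for (ic = __start_demo_init; ic < __stop_demo_init; ic++) {
		ic->fct();
		count++;
	}
	return count;
}

int demo_get_hits(void)
{
	return demo_hits;
}
```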
+
+It is also strongly recommended that functions and referenced arguments are
+static symbols local to the source file, unless they are global registration
+functions like in the example above with cli_register_kw(), where only the
+argument is a local keywords table.
+
+INITCALLs do not expect the callback function to return anything and as such
+do not perform any error check. As such, they are very similar to constructors
+offered by the compiler except that they are segmented in stages. It is thus
+the responsibility of the called functions to perform their own error checking
+and to exit in case of error. This may change in the future.
+
+
+5. REGISTER family of macros
+
+The association of INITCALLs and registration functions makes it possible to
+perform some early dynamic registration of functions to be used anywhere, and
+to add values to existing lists without having to manipulate list elements.
+For the sake of simplification, these combinations are available as a set of
+REGISTER macros which register calls to certain functions at the appropriate
+init stage. Such macros must be used at the top level in a file, just like
+INITCALL macros. The following macros are currently supported. Please keep them
+alphanumerically ordered:
+
+- REGISTER_BUILD_OPTS(str)
+
+  Adds the constant string <str> to the list of build options. This is done by
+  registering a call to hap_register_build_opts(str, 0) at stage STG_REGISTER.
+  The string will not be freed.
+
+- REGISTER_CONFIG_POSTPARSER(name, parser)
+
+  Adds a call to function <parser> at the end of the config parsing. The
+  function is called at the very end of check_config_validity() and may be used
+  to initialize a subsystem based on global settings for example. This is done
+  by registering a call to cfg_register_postparser(name, parser) at stage
+  STG_REGISTER.
+
+- REGISTER_CONFIG_SECTION(name, parse, post)
+
+  Registers a new config section name <name> which will be parsed by function
+  <parse> (if not null), and with an optional call to function <post> at the
+  end of the section. Function <parse> must be of type (int (*parse)(const char
+  *file, int linenum, char **args, int inv)), and returns 0 on success or an
+  error code among the ERR_* set on failure. The <post> callback takes no
+  argument and returns a similar error code. This is achieved by registering a
+  call to cfg_register_section() with the three arguments at stage
+  STG_REGISTER.
+
+- REGISTER_PER_THREAD_ALLOC(fct)
+
+  Registers a call to hap_register_per_thread_alloc(fct) at stage STG_REGISTER.
+
+- REGISTER_PER_THREAD_DEINIT(fct)
+
+  Registers a call to hap_register_per_thread_deinit(fct) at stage
+  STG_REGISTER.
+
+- REGISTER_PER_THREAD_FREE(fct)
+
+  Registers a call to hap_register_per_thread_free(fct) at stage STG_REGISTER.
+
+- REGISTER_PER_THREAD_INIT(fct)
+
+  Registers a call to hap_register_per_thread_init(fct) at stage STG_REGISTER.
+
+- REGISTER_POOL(ptr, name, size)
+
+  Used internally to declare a new pool. This is made by calling function
+  create_pool_callback() with these arguments at stage STG_POOL. Do not use it
+  directly, use either DECLARE_POOL() or DECLARE_STATIC_POOL() instead.
+
+- REGISTER_PRE_CHECK(fct)
+
+  Registers a call to hap_register_pre_check(fct) at stage STG_REGISTER.
+
+- REGISTER_POST_CHECK(fct)
+
+  Registers a call to hap_register_post_check(fct) at stage STG_REGISTER.
+
+- REGISTER_POST_DEINIT(fct)
+
+  Registers a call to hap_register_post_deinit(fct) at stage STG_REGISTER.
+
+- REGISTER_POST_PROXY_CHECK(fct)
+
+  Registers a call to hap_register_post_proxy_check(fct) at stage STG_REGISTER.
+
+- REGISTER_POST_SERVER_CHECK(fct)
+
+  Registers a call to hap_register_post_server_check(fct) at stage
+  STG_REGISTER.
+
+- REGISTER_PROXY_DEINIT(fct)
+
+  Registers a call to hap_register_proxy_deinit(fct) at stage STG_REGISTER.
+
+- REGISTER_SERVER_DEINIT(fct)
+
+  Registers a call to hap_register_server_deinit(fct) at stage STG_REGISTER.
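+
+To make the combination more concrete, the sketch below builds a REGISTER-like
+macro on top of a one-argument initcall. It assumes GCC or Clang (constructor
+attribute) and uses the linked-list style described in section 4 as the
+fallback for old linkers. All "DEMO_"/"demo_" names are hypothetical, and the
+argument type is narrowed to one callback prototype to keep the code short.
+
```c
#include <stddef.h>

enum init_stage {
	STG_PREPARE, STG_LOCK, STG_REGISTER, STG_ALLOC, STG_POOL, STG_INIT
};

typedef int (*demo_check_cb)(void);

struct demo_initcall {
	enum init_stage stg;
	void (*fct)(demo_check_cb);    /* registration function to call */
	demo_check_cb arg;             /* callback passed to it */
	struct demo_initcall *next;
};

static struct demo_initcall *demo_initcalls;

#define DEMO_CAT2(a, b) a##b
#define DEMO_CAT(a, b)  DEMO_CAT2(a, b)

/* Declares a descriptor and a constructor linking it into the list. */
#define DEMO_INITCALL1(stage, function, argument)                     \
	static struct demo_initcall DEMO_CAT(demo_ic_, __LINE__) = {  \
		.stg = (stage), .fct = (function), .arg = (argument), \
	};                                                            \
	static void __attribute__((constructor))                      \
	DEMO_CAT(demo_ic_link_, __LINE__)(void)                       \
	{                                                             \
		DEMO_CAT(demo_ic_, __LINE__).next = demo_initcalls;   \
		demo_initcalls = &DEMO_CAT(demo_ic_, __LINE__);       \
	}

/* Registration layer: a bounded table of post-check callbacks. */
static demo_check_cb demo_post_checks[8];
static size_t demo_nb_post_checks;

static void demo_register_post_check(demo_check_cb cb)
{
	if (demo_nb_post_checks < 8)
		demo_post_checks[demo_nb_post_checks++] = cb;
}

/* REGISTER-like macro: registration function + init stage in one place. */
#define DEMO_REGISTER_POST_CHECK(fct) \
	DEMO_INITCALL1(STG_REGISTER, demo_register_post_check, (fct))

static int demo_check_tables(void) { return 1; }
DEMO_REGISTER_POST_CHECK(demo_check_tables)

/* Runs the initcalls attached to one stage. */
void demo_run_stage(enum init_stage stg)
{
	struct demo_initcall *ic;

	for (ic = demo_initcalls; ic; ic = ic->next)
		if (ic->stg == stg)
			ic->fct(ic->arg);
}

size_t demo_count_post_checks(void)
{
	return demo_nb_post_checks;
}
```
+
+Nothing is registered until demo_run_stage(STG_REGISTER) is called, which is
+the point of the two-layer design: the macro only says what to register and
+when.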
+
diff --git a/doc/internals/api/ist.txt b/doc/internals/api/ist.txt
new file mode 100644
index 0000000..0f118d6
--- /dev/null
+++ b/doc/internals/api/ist.txt
@@ -0,0 +1,167 @@
+2021-11-08 - Indirect Strings (IST) API
+
+
+1. Background
+-------------
+
+When parsing traffic, most of the standard C string functions are unusable
+since they rely on a trailing zero. In addition, for the rare ones that support
+a length, we have to constantly maintain both the pointer and the length. But
+then, it's easy to come up with complex lengths and offsets calculations all
+over the place, rendering the code hard to read and bugs hard to avoid or spot.
+
+IST provides a solution to this by defining a structure made of exactly two
+word size elements, that most C ABIs know how to handle as a register when
+used as a function argument or a function's return value. The functions are
+inlined to leave a maximum set of opportunities to the compiler for optimization
+and expression reduction, and as a result they are often inexpensive to use. It
+is important however to keep in mind that all of these are designed for minimal
+code size when dealing with short strings (i.e. parsing tokens in protocols),
+and they are not optimal for processing large blocks.
+
+
+2. API description
+------------------
+
+IST are defined like this:
+
+  struct ist {
+          char  *ptr;  // pointer to the string's first byte
+          size_t len;  // number of valid bytes starting from ptr
+  };
+
+A string is not set if its ->ptr member is NULL. In this case .len is undefined
+and is recommended to be zero.
+
+Declaring a function returning an IST:
+
+  struct ist produce_ist(int ok)
+  {
+      return ok ? IST("OK") : IST("KO");
+  }
+
+Declaring a function consuming an IST:
+
+  void say_ist(struct ist i)
+  {
+      write(1, istptr(i), istlen(i));
+  }
+
+Chaining the two:
+
+  void say_ok(int ok)
+  {
+      say_ist(produce_ist(ok));
+  }
+
+Notes:
+  - the arguments are passed by value, not reference, so there's no need for
+    any "const" in their declaration (except to catch coding mistakes).
+    Pointers to ist may benefit from being marked "const" however.
+
+  - similarly for the return value, there's no point in marking it "const" as
+    this would protect the pointer and length, not the data.
+
+  - use ist0() to append a trailing zero to a variable string for use with
+    printf()'s "%s" format, or for use with functions that work on NUL-
+    terminated strings, but beware of not doing this with constants.
+
+  - the API provides a starting pointer and current length, but does not
+    provide an allocated size. It remains up to the caller to know how large
+    the allocated area is when adding data, though most functions make this
+    easy.
+
+The following macros and functions are defined. Those whose name starts with
+underscores require special care and must not be used without being certain
+they are properly used (typically subject to buffer overflows if misused). Note
+that most functions were added over time depending on instant needs, and some
+are very close to each other. Many useful functions are still missing and would
+deserve being added.
+
+Below, arguments "i1","i2" are all of type "ist". Arguments "s" are
+NUL-terminated strings of type "char*", and "cs" are of type "const char *".
+Arguments "c" are of type "char", and "n" are of type size_t.
+
+  IST(cs):ist            make constant IST from a NUL-terminated const string
+  IST_NULL:ist           return an unset IST = ist2(NULL,0)
+  __istappend(i1,c):ist  append character <c> at the end of ist <i1>
+  ist(s):ist             return an IST from a nul-terminated string
+  ist0(i1):char*         write a \0 at the end of an IST, return the string
+  ist2(cs,l):ist         return a variable IST from a const string and length
+  ist2bin(s,i1):ist      copy IST into a buffer, return the result
+  ist2bin_lc(s,i1):ist   like ist2bin() but turning to lower case
+  ist2bin_uc(s,i1):ist   like ist2bin() but turning to upper case
+  ist2str(s,i1):ist      copy IST into a buffer, add NUL and return the result
+  ist2str_lc(s,i1):ist   like ist2str() but turning to lower case
+  ist2str_uc(s,i1):ist   like ist2str() but turning to upper case
+  ist_find(i1,c):ist     return first occurrence of char <c> in <i1>
+  ist_find_ctl(i1):char* return pointer to first CTL char in <i1> or NULL
+  ist_skip(i1,c):ist     return first occurrence of char not <c> in <i1>
+  istadv(i1,n):ist       advance the string by <n> characters
+  istalloc(n):ist        return allocated string of zero initial length
+  istcat(d,s,n):ssize_t  copy <s> after <d> for <n> chars max, return len or -1
+  istchr(i1,c):char*     return pointer to first occurrence of <c> in <i1>
+  istclear(i1*):size_t   return previous size and set size to zero
+  istcpy(d,s,n):ssize_t  copy <s> over <d> for <n> chars max, return len or -1
+  istdiff(i1,i2):int     return the ordinal difference, like strcmp()
+  istdup(i1):ist         allocate new ist and copy original one into it
+  istend(i1):char*       return pointer to first character after the IST
+  isteq(i1,i2):int       return non-zero if strings are equal
+  isteqi(i1,i2):int      like isteq() but case-insensitive
+  istfree(i1*)           free of allocated <i1>/IST_NULL and set it to IST_NULL
+  istissame(i1,i2):int   return true if pointers and lengths are equal
+  istist(i1,i2):ist      return first occurrence of <i2> in <i1>
+  istlen(i1):size_t      return the length of the IST (number of characters)
+  istmatch(i1,i2):int    return non-zero if i1 starts like i2 (empty OK)
+  istmatchi(i1,i2):int   like istmatch() but case insensitive
+  istneq(i1,i2,n):int    like isteq() but limited to the first <n> chars
+  istnext(i1):ist        return the IST advanced by one character
+  istnmatch(i1,i2,n):int like istmatch() but limited to the first <n> chars
+  istpad(s,i1):ist       copy IST into a buffer, add a NUL, return the result
+  istptr(i1):char*       return the starting pointer of the IST
+  istscat(d,s,n):ssize_t same as istcat() but always place a NUL at the end
+  istscpy(d,s,n):ssize_t same as istcpy() but always place a NUL at the end
+  istshift(i1*):char     return the first character and advance the IST by one
+  istsplit(i1*,c):ist    return part before <c>, make ist start from <c>
+  iststop(i1,c):ist      truncate ist before first occurrence of <c>
+  isttest(i1):int        return true if ist is not NULL, false otherwise
+  isttrim(i1,n):ist      return ist trimmed to no more than <n> characters
+  istzero(i1,n):ist      trim to <n> chars, trailing zero included.
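+
+The way such functions chain together can be illustrated with the simplified
+sketch below, which splits a "name: value" line without copying anything nor
+needing a trailing zero. The "demo_" helpers are local re-implementations
+written from the one-line descriptions above; they are not the real functions,
+whose exact behavior (for instance when a character is not found) may differ.
+
```c
#include <stddef.h>
#include <string.h>

struct ist {
	char  *ptr;  /* pointer to the string's first byte */
	size_t len;  /* number of valid bytes starting from ptr */
};

static inline struct ist demo_ist2(char *ptr, size_t len)
{
	struct ist ret = { ptr, len };
	return ret;
}

static inline struct ist demo_ist(char *str)
{
	return demo_ist2(str, strlen(str));
}

/* First occurrence of <c>; returns an unset ist when absent (a choice made
 * for this sketch).
 */
static inline struct ist demo_ist_find(struct ist i, char c)
{
	char *p = i.len ? memchr(i.ptr, c, i.len) : NULL;

	return p ? demo_ist2(p, i.len - (size_t)(p - i.ptr)) : demo_ist2(NULL, 0);
}

/* Advances by at most <n> characters. */
static inline struct ist demo_istadv(struct ist i, size_t n)
{
	if (n > i.len)
		n = i.len;
	return demo_ist2(i.ptr + n, i.len - n);
}

/* Skips all leading occurrences of <c>. */
static inline struct ist demo_ist_skip(struct ist i, char c)
{
	while (i.len && *i.ptr == c) {
		i.ptr++;
		i.len--;
	}
	return i;
}

static inline int demo_isteq(struct ist a, struct ist b)
{
	return a.len == b.len && (!a.len || memcmp(a.ptr, b.ptr, a.len) == 0);
}

/* Splits "name: value"; returns 0 if there is no colon, otherwise 1. */
int demo_parse_header(struct ist line, struct ist *name, struct ist *value)
{
	struct ist colon = demo_ist_find(line, ':');

	if (!colon.ptr)
		return 0;
	*name  = demo_ist2(line.ptr, (size_t)(colon.ptr - line.ptr));
	*value = demo_ist_skip(demo_istadv(colon, 1), ' ');
	return 1;
}
```
+
+Note how no length computation appears in demo_parse_header() except for the
+name, and how both results still point into the original buffer.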
+
+
+3. Quick index by typical C construct or function
+-------------------------------------------------
+
+Some common C constructs may be adjusted to use ist instead. The mapping is not
+always one-to-one, but usually the computations on the length part tend to
+disappear in the refactoring, allowing function calls to be chained directly.
+The entries below are hints to figure out what function to look for in order
+to rewrite some common use cases.
+
+  char*                  IST equivalent
+
+  strchr()               istchr(), ist_find(), iststop()
+  strstr()               istist()
+  strcpy()               istcpy()
+  strscpy()              istscpy()
+  strlcpy()              istscpy()
+  strcat()               istcat()
+  strscat()              istscat()
+  strlcat()              istscat()
+  strcmp()               istdiff()
+  strdup()               istdup()
+  !strcmp()              isteq()
+  !strncmp()             istneq(), istmatch(), istnmatch()
+  !strcasecmp()          isteqi()
+  !strncasecmp()         istneqi(), istmatchi()
+  strtok()               istsplit()
+  return NULL            return IST_NULL
+  s = malloc()           s = istalloc()
+  free(s); s = NULL      istfree(&s)
+  p != NULL              isttest(p)
+  c = *(p++)             c = istshift(p)
+  *(p++) = c             __istappend(p, c)
+  p += n                 istadv(p, n)
+  p + strlen(p)          istend(p)
+  p[max] = 0             isttrim(p, max)
+  p[max+1] = 0           istzero(p, max)
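+
+As an example of such a rewrite, the sketch below converts a decimal parser
+based on "c = *(p++)" into a length-bounded one. demo_istshift() is a local
+stand-in written from the description in section 2, not the real function; in
+particular, returning 0 on an empty string is an assumption of this sketch.
+
```c
#include <stddef.h>

struct ist {
	char  *ptr;
	size_t len;
};

/* Returns the first character and advances by one; 0 when empty. */
static inline char demo_istshift(struct ist *i)
{
	if (!i->len)
		return 0;
	i->len--;
	return *i->ptr++;
}

/* Classic version: relies on a trailing zero to stop. */
unsigned int demo_atou_cstr(const char *p)
{
	unsigned int v = 0;

	while (*p >= '0' && *p <= '9')
		v = v * 10 + (unsigned int)(*p++ - '0');
	return v;
}

/* IST version: stops at the end of the string or at the first non-digit. */
unsigned int demo_atou_ist(struct ist i)
{
	unsigned int v = 0;

	while (i.len && *i.ptr >= '0' && *i.ptr <= '9')
		v = v * 10 + (unsigned int)(demo_istshift(&i) - '0');
	return v;
}
```
+
+The IST version can safely parse a token located in the middle of a buffer,
+where the byte following the token is unrelated data.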
diff --git a/doc/internals/api/layers.txt b/doc/internals/api/layers.txt
new file mode 100644
index 0000000..b5c35f4
--- /dev/null
+++ b/doc/internals/api/layers.txt
@@ -0,0 +1,190 @@
+2022-05-27 - Stream layers in HAProxy 2.6
+
+
+1. Background
+
+There are streams at plenty of levels in haproxy, essentially due to the
+introduction of multiplexed protocols which provide high-level streams on top
+of low-level streams, themselves either based on stream-oriented protocols or
+datagram-oriented protocols.
+
+The refactoring of the appctx and muxes that allowed to drop a lot of duplicate
+code between 2.5 and 2.6-dev6 raised another concern with some entities like
+"conn_stream" that were not specific to connections anymore, "endpoints" that
+became entities on their own, and "targets" whose life had been extended to
+last all along a connection.
+
+It was time to rename all such legacy entities introduced in 1.8 and which had
+turned particularly confusing over time as their roles evolved.
+
+
+2. Naming principles
+
+The global renaming of some entities between streams and connections was
+articulated around several principles:
+
+  - avoid the confusing use of "context" in shared places. For example, the
+    endpoint's connection is in "ctx" and nothing makes it obvious that the
+    endpoint's context is a connection, especially when an applet is there.
+
+  - reserve relative nouns for pointers and not for types. "endpoint", just
+    like "owner" or "peer" is relative, but when accessed from a different
+    layer it starts to make no sense at all, or to make one believe it's
+    something else, particularly with void*.
+
+  - avoid too generic terms that have multiple meanings, or words that are
+    synonyms in a same place (e.g. "peer" and "remote", or "endpoint" and
+    "target"). If two synonyms are needed to designate two distinct entities,
+    there's probably a problem elsewhere, or the problem is poorly defined.
+
+  - make it clearer that all that is manipulated is related to streams. This
+    is particularly important in sample fetch functions for example, which
+    tend to require low-level access and could be misled into following the
+    wrong chain when trying to get information about a connection.
+
+  - use easily spellable short names that abbreviate unambiguously when used
+    together in adjacent contexts
+
+
+3. Current state as of 2.6
+
+- when a name is required to designate the lower block that starts at the mux
+  stream or the appctx, it is spoken of as a "stream endpoint", and abbreviated
+  "se". It's okay because while "endpoint" itself is relative, "stream
+  endpoint" unequivocally designates one extremity of a stream. If a type is
+  needed for this in the future (e.g. via obj_type), then the type "stendp"
+  may be used. Before 2.6-dev6 there was no name for this, it was known as
+  conn_stream->ctx.
+
+- the 2.6-dev6 cs_endpoint which preserves the state of a mux stream or an
+  appctx and abstracts them in front of a conn_stream becomes a "stream
+  endpoint descriptor", of type "sedesc" and often abbreviated "sd", "sed"
+  or "ed". Its "target" pointer became "se" as per the rule above. Before
+  2.6-dev6, these elements were mixed with others inside conn_stream. From
+  the appctx it's called "sedesc" (few occurrences hence long name OK).
+
+- the conn_stream which is always attached to either a stream or a health check
+  and that is used to reach a mux or an applet becomes a "stream connector" of
+  type "stconn", generally abbreviated "sc". Its "endp" pointer becomes
+  "sedesc" as per the rule above, and that one has a back pointer "sc". The
+  stream uses "scf" and "scb" as the respective front and back pointers to the
+  stconns. Prior to 2.6-dev6, these parts were split between conn_stream and
+  stream_interface.
+
+- the sedesc's "ctx" which is solely used to store the connection as of now, is
+  renamed "conn" to avoid any doubt in the context of applets or even muxes. In
+  the future the connection should be attached to the "se" instead and this
+  pointer should disappear (or be recycled for anything else).
+
+The new 2.6 model looks like this:
+
+                  +------------------------+
+                  | stream or health check |
+                  +------------------------+
+                            ^   \ scf, scb
+                           /     \
+                          |       |
+                           \     /
+                        app \   v
+                         +----------+
+                         |  stconn  |
+                         +----------+
+                            ^   \ sedesc
+                           /     \
+                  . . . . | . . . | . . . . . split point (retries etc)
+                           \     /
+                         sc \   v
+                         +----------+
+                flags <--|  sedesc  |                      :  sedesc  :
+                         +----------+              ...     +----------+
+                   conn /   ^   \ se                           ^ \
+     +------------+    /   /     \                             |  \
+     | connection |<--'   |       |            ... OR ...      |   |
+     +------------+        \     /                              \  |
+      mux|  ^ |ctx       sd \   v                       : sedesc \ v
+         |  | |   +----------------------+  \           #  +----------+ svcctx
+         |  | |   | mux stream or appctx |   |          #  |  appctx  |--.
+         |  | |   +----------------------+   |          #  +----------+  |
+         |  | |           ^  |              /  private  #  :          :  |
+         v  | |           |  v              >  to the   #  +----------+  |
+   mux_ops  | |     +----------------+      \   mux     #  |  svcctx  |<-'
+            | +---->| mux connection |       )          #  +----------+
+            +------ +----------------+      /           #
+
+Stream descriptors may exist in the following modes:
+  - .conn = NULL, .se = NULL     : backend, no connection attempt yet
+  - .conn = NULL, .se = <appctx> : frontend or backend, applet
+  - .conn = <conn>, .se = NULL   : backend, connection in progress
+  - .conn = <conn>, .se = <muxs> : frontend or backend, connected
+
+Notes:
+  - for historical reasons (connect, forced protocol upgrades, etc), during a
+    connection setup or a rule-based protocol upgrade, the connection's "ctx"
+    may temporarily point to the stconn
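+
+The four modes above can be told apart from the two pointers alone. The sketch
+below is only an illustration: "struct demo_sedesc" and demo_sedesc_mode() are
+hypothetical names that model just the ->conn and ->se fields, not the real
+sedesc structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the descriptor: only the two
 * pointers discussed above are modelled.
 */
struct demo_sedesc {
    void *conn;   /* connection, or NULL */
    void *se;     /* mux stream or appctx, or NULL */
};

/* Return a short description of the mode a descriptor is in. */
static const char *demo_sedesc_mode(const struct demo_sedesc *sd)
{
    if (!sd->conn && !sd->se)
        return "backend, no connection attempt yet";
    if (!sd->conn)
        return "frontend or backend, applet";
    if (!sd->se)
        return "backend, connection in progress";
    return "frontend or backend, connected";
}
```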
+
+
+4. Invariants and cardinalities
+
+Usually a stream is created from an existing stconn from a mux or some applets,
+but may also be allocated first by other applets schedulers. After stream_new()
+a stream always has exactly one stconn per side (scf, scb), each of which has
+one ->sedesc. Each side is initialized with either one or no stream endpoint
+attached to the descriptor.
+
+Both applets and a mux stream always have a stream endpoint descriptor. AS SUCH
+IT IS NEVER NECESSARY TO TEST FOR THE EXISTENCE OF THE SEDESC FROM ANY SIDE, IT
+ALWAYS EXISTS. This explains why as much as possible it's preferable to use the
+sedesc to access flags and statuses from any side, rather than bouncing via the
+stconn.
+
+An applet's app layer is always a stream, which means that there are always
+channels accessible above, and there is always an opposite stream connector and
+a stream endpoint descriptor. As such, it's always safe for an applet to access
+the other side using sc_opposite().
+
+When an outgoing connection is in the process of being established, the backend
+side sedesc has its ->conn pointer pointing to the pending connection, and no
+->se. Once the connection is established and a mux is chosen, it's attached to
+the ->se. If an applet is used instead of a mux, the appctx is attached to the
+sedesc's ->se and ->conn remains NULL.
+
+If either side wants to detach from the other, it must allocate a new virgin
+sedesc to replace the existing one, and leave the existing one to the endpoint,
+since it continues to describe the stream endpoint. The stconn keeps its state
+(modulo the updates related to the disconnection). The previous sedesc points
+to a NULL stconn. For example, disconnecting from a backend mux will leave the
+entities like this:
+
+                                              +------------------------+
+                                              | stream or health check |
+                                              +------------------------+
+                                                        ^   \ scf, scb
+                                                       /     \
+                                                      |       |
+                                                       \     /
+                                                    app \   v
+                                                     +----------+
+                                                     |  stconn  |
+                                                     +----------+
+                                                        ^   \ sedesc
+                                                       /     \
+                             NULL                     |       |
+                              ^                        \     /
+                           sc |               /      sc \   v
+                         +----------+        /       +----------+
+                flags <--|  sedesc1 |   . . . . .    |  sedesc2 |--> flags
+                         +----------+      /         +----------+
+                   conn /   ^   \ se      /       conn /     \ se
+     +------------+    /   /     \                    |       |
+     | connection |<--'   |       |                   v       v
+     +------------+        \     /                  NULL     NULL
+      mux|  ^ |ctx       sd \   v
+         |  | |   +----------------------+
+         |  | |   | mux stream or appctx |
+         |  | |   +----------------------+
+         |  | |           ^  |
+         v  | |           |  v
+   mux_ops  | |     +----------------+
+            | +---->| mux connection |
+            +------ +----------------+
+
diff --git a/doc/internals/api/list.txt b/doc/internals/api/list.txt
new file mode 100644
index 0000000..d03cf03
--- /dev/null
+++ b/doc/internals/api/list.txt
@@ -0,0 +1,195 @@
+2021-11-09 - List API
+
+
+1. Background
+-------------
+
+HAProxy's lists are almost all doubly-linked and circular so that it is always
+possible to insert at the beginning, append at the end, scan them in any order
+and delete any element without having to scan to search the predecessor nor the
+successor.
+
+A list's head is just a regular list element, and an element always points to
+another list element. Such elements only have two pointers, the next and the
+previous elements. The object being pointed to is retrieved by subtracting the
+list element's offset in its structure from the list element's pointer. This
+way there is no need for any separate allocation for the list element, for a
+pointer to the object in the list, nor for a pointer to the list element from
+the object, as the list is embedded into the object.
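+
+As a minimal sketch of the offset subtraction described above, assuming a
+hypothetical "struct demo_item" that embeds a list element in a member called
+"link" (both names are made up for this example):

```c
#include <assert.h>
#include <stddef.h>

struct list {
    struct list *n;    /* next */
    struct list *p;    /* prev */
};

/* Hypothetical object embedding a list element; names are illustrative. */
struct demo_item {
    int value;
    struct list link;
};

/* Recover the enclosing object from a pointer to its embedded element by
 * subtracting the member's offset within the structure.
 */
static struct demo_item *demo_item_from_link(struct list *l)
{
    return (struct demo_item *)((char *)l - offsetof(struct demo_item, link));
}
```

+No extra allocation or back pointer is involved: the address arithmetic alone
+is enough to go from the list element to the object.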
+
+All basic operations are provided, as well as some iterators. Some iterators
+are safe for removal of the current element within the loop, others not. In any
+case a list cannot be freely modified while iterating over it (e.g. the current
+element's successor cannot be freed if it's saved as the restart point).
+
+Extreme care is taken nowadays in HAProxy to make sure that no dangling
+pointers are left in elements, so it is important to always initialize list
+heads and list elements, as well as elements that are removed from a list if
+they are not immediately freed, so that their deletion is idempotent. A rule of
+thumb is that a list pointer's validity never has to be checked, it is always
+valid to dereference it. A lot of complex bugs have been caused in the past by
+incorrect list manipulation, such as an element being deleted twice, resulting
+in damaging previously adjacent elements' neighbours. This usually has serious
+consequences at locations that are totally different from the one of the bug,
+and that are only detected much later, so it is required to be particularly
+strict on using lists safely.
+
+The lists are not thread-safe, but mt_lists may be used instead.
+
+
+2. API description
+------------------
+
+A list is defined like this, both for the list's head, and for any other
+element:
+
+    struct list {
+        struct list *n;    /* next */
+        struct list *p;    /* prev */
+    };
+
+An empty list points to itself for both pointers. I.e. a list's head is both
+its own successor and its own predecessor. This guarantees that insertions
+and deletions can be done without any check and that deletion is idempotent.
+For this reason and by convention, a detached element ought to be represented
+like an empty head.
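+
+The following sketch models why a self-pointing element makes insertion
+check-free and deletion idempotent. The demo_*() helpers are plain functions
+written for this example; they only imitate the behaviour described here and
+are not the macros documented below.

```c
#include <assert.h>

struct list {
    struct list *n;    /* next */
    struct list *p;    /* prev */
};

/* Make <l> an empty head (or a detached element): it points to itself. */
static void demo_init(struct list *l)
{
    l->n = l->p = l;
}

static int demo_isempty(const struct list *l)
{
    return l->n == l;
}

/* Insert <e> right after <head>, without any emptiness check. */
static void demo_insert(struct list *head, struct list *e)
{
    e->n = head->n;
    e->p = head;
    head->n->p = e;
    head->n = e;
}

/* Unlink <e> and make it point to itself so that unlinking it again only
 * rewrites its own pointers and leaves former neighbours untouched.
 */
static void demo_del_init(struct list *e)
{
    e->n->p = e->p;
    e->p->n = e->n;
    e->n = e->p = e;
}
```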
+
+Lists are manipulated using a set of macros which are used to initialize, add,
+remove, or iterate over elements. Most of these macros are extremely simple and
+are not even protected against multiple evaluation, so it is fundamentally
+important that the expressions used in the arguments are idempotent and that
+the result does not depend on the evaluation order of the arguments.
+
+Macro   Description
+
+ILH
+        Initialized List Head : this is a non-NULL, non-empty list element used
+        to prevent the compiler from moving an empty list head declaration to
+        BSS, typically when it appears in an array of keywords. Without this,
+        some older versions of gcc tend to trim all the array and cause
+        corruption.
+
+LIST_INIT(l)
+        Initialize the list as an empty list head
+
+LIST_HEAD_INIT(l)
+        Return a valid initialized empty list head pointing to this
+        element. Essentially used with assignments in declarations.
+
+LIST_INSERT(l, e)
+        Add an element at the beginning of a list and return it
+
+LIST_APPEND(l, e)
+        Add an element at the end of a list and return it
+
+LIST_SPLICE(n, o)
+        Add the contents of a list <o> at the beginning of another list <n>.
+        The old list head remains untouched.
+
+LIST_SPLICE_END_DETACHED(n, o)
+        Add the contents of a list whose first element is <o> and last one
+        is <o->p> at the end of another list <n>. The old list DOES NOT have
+        any head here.
+
+LIST_DELETE(e)
+        Remove an element from a list and return it. Safe to call on
+        initialized elements, but will not change the element itself so it is
+        not idempotent. Consider using LIST_DEL_INIT() instead unless the
+        element is freed immediately afterwards.
+
+LIST_DEL_INIT(e)
+        Remove an element from a list, initialize it and return it so that a
+        subsequent LIST_DELETE() is safe. This is faster than performing a
+        LIST_DELETE() followed by a LIST_INIT() as pointers are not reloaded.
+
+LIST_ELEM(l, t, m)
+        Return a pointer of type <t> to a structure containing a list head
+        member called <m> at address <l>. Note that <l> can be the result of a
+        function or macro since it's used only once.
+
+LIST_ISEMPTY(l)
+        Check if the list head <l> is empty (=initialized) or not, and return
+        non-zero only if so.
+
+LIST_INLIST(e)
+        Check if the list element <e> was added to a list or not, thus return
+        true unless the element was initialized.
+
+LIST_INLIST_ATOMIC(e)
+        Atomically check if the list element's next pointer points to anything
+        different from itself, implying the element should be part of a
+        list. This usually is similar to LIST_INLIST() except that while that
+        one might be instrumented using debugging code to perform further
+        consistency checks, this macro guarantees to always perform a
+        single atomic test and is safe to use with barriers.
+
+LIST_NEXT(l, t, m)
+        Return a pointer of type <t> to a structure following the element which
+        contains list head <l>, which is known as member <m> in struct <t>.
+
+LIST_PREV(l, t, m)
+        Return a pointer of type <t> to a structure preceding the element which
+        contains list head <l>, which is known as member <m> in struct <t>.
+        Note that this macro is first undefined as it happened to already exist
+        on some old OSes.
+
+list_for_each_entry(i, l, m)
+        Iterate local variable <i> through a list of items of type "typeof(*i)"
+        which are linked via a "struct list" member named <m>. A pointer to the
+        head of the list is passed in <l>. No temporary variable is needed.
+        Note that <i> must not be modified during the loop.
+
+list_for_each_entry_from(i, l, m)
+        Same as list_for_each_entry() but starting from current value of <i>
+        instead of the list's head.
+
+list_for_each_entry_from_rev(i, l, m)
+        Same as list_for_each_entry_rev() but starting from current value of <i>
+        instead of the list's head.
+
+list_for_each_entry_rev(i, l, m)
+        Iterate backwards local variable <i> through a list of items of type
+        "typeof(*i)" which are linked via a "struct list" member named <m>. A
+        pointer to the head of the list is passed in <l>. No temporary variable
+        is needed. Note that <i> must not be modified during the loop.
+
+list_for_each_entry_safe(i, b, l, m)
+        Iterate variable <i> through a list of items of type "typeof(*i)" which
+        are linked via a "struct list" member named <m>. A pointer to the head
+        of the list is passed in <l>. A temporary backup variable <b> of same
+        type as <i> is needed so that <i> may safely be deleted if needed. Note
+        that it is only permitted to delete <i> and no other element during
+        this operation!
+
+list_for_each_entry_safe_from(i, b, l, m)
+        Same as list_for_each_entry_safe() but starting from current value of
+        <i> instead of the list's head.
+
+list_for_each_entry_safe_from_rev(i, b, l, m)
+        Same as list_for_each_entry_safe_rev() but starting from current value
+        of <i> instead of the list's head.
+
+list_for_each_entry_safe_rev(i, b, l, m)
+        Iterate backwards local variable <i> through a list of items of type
+        "typeof(*i)" which are linked via a "struct list" member named <m>. A
+        pointer to the head of the list is passed in <l>. A temporary variable
+        <b> of same type as <i> is needed so that <i> may safely be deleted if
+        needed. Note that it is only permitted to delete <i> and no other
+        element during this operation!
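+
+To illustrate why the "safe" iterators need a backup variable, here is a
+hand-written loop that deletes elements while walking the list. It is a sketch
+in portable C that does not use the macros above; "struct demo_node",
+DEMO_ELEM() and the demo_*() functions are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

struct list {
    struct list *n;    /* next */
    struct list *p;    /* prev */
};

/* Hypothetical object linked through its "link" member. */
struct demo_node {
    int value;
    struct list link;
};

/* Object containing list element <l> (offset subtraction). */
#define DEMO_ELEM(l) \
    ((struct demo_node *)((char *)(l) - offsetof(struct demo_node, link)))

/* Add <e> at the end of the list whose head is <head>. */
static void demo_append(struct list *head, struct list *e)
{
    e->p = head->p;
    e->n = head;
    head->p->n = e;
    head->p = e;
}

/* Remove all nodes holding an odd value and return how many were removed.
 * <back> always holds the successor of <cur>, saved before <cur> may be
 * unlinked, so only <cur> itself may be deleted inside the loop.
 */
static int demo_remove_odd(struct list *head)
{
    struct list *cur, *back;
    int removed = 0;

    for (cur = head->n, back = cur->n; cur != head; cur = back, back = cur->n) {
        if (DEMO_ELEM(cur)->value & 1) {
            cur->n->p = cur->p;
            cur->p->n = cur->n;
            cur->n = cur->p = cur;   /* leave it detached and initialized */
            removed++;
        }
    }
    return removed;
}
```

+Deleting the saved successor instead of the current element would leave
+<back> pointing to a detached element, which is why only the current element
+may be removed.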
+
+3. Notes
+--------
+
+- This API is quite old and some macros are missing. For example there's still
+  no list_first() so it's common to use LIST_ELEM(head->n, ...) instead. Some
+  older parts of the code also used to rely on list_for_each() followed by a
+  break to stop on the first element.
+
+- Some parts were recently renamed because LIST_ADD() used to do what
+  LIST_INSERT() currently does and was often mistaken with LIST_ADDQ() which is
+  what LIST_APPEND() now is. As such it is not totally impossible that some
+  places use a LIST_INSERT() where a LIST_APPEND() would be desired.
+
+- The structure must not be modified at all (even to add debug info). Some
+  parts of the code assume that its layout is exactly this one, particularly
+  the parts ensuring the casting between MT lists and lists.
diff --git a/doc/internals/api/pools.txt b/doc/internals/api/pools.txt
new file mode 100644
index 0000000..2c54409
--- /dev/null
+++ b/doc/internals/api/pools.txt
@@ -0,0 +1,577 @@
+2022-02-24 - Pools structure and API
+
+1. Background
+-------------
+
+Memory allocation is a complex problem covered by a massive amount of
+literature. Memory allocators found in the field cover a broad spectrum of
+capabilities, performance, fragmentation, efficiency etc.
+
+The main difficulty of memory allocation comes from finding the optimal chunks
+for arbitrary sized requests, that will still preserve a low fragmentation
+level. Doing this well is often expensive in CPU usage and/or memory usage.
+
+In programs like HAProxy that deal with a large number of fixed size objects,
+there is no point having to endure all this risk of fragmentation, and the
+associated costs (sometimes up to several milliseconds with certain minimalist
+allocators) are simply not acceptable. A better approach consists in grouping
+frequently used objects by size, knowing that due to the high repetitiveness of
+operations, a freed object will immediately be needed for another operation.
+
+This grouping of objects by size is what is called a pool. Pools are created
+for certain frequently allocated objects, are usually merged together when they
+are of the same size (or almost the same size), and significantly reduce the
+number of calls to the memory allocator.
+
+With the arrival of threads, pools started to become a bottleneck so they now
+implement an optional thread-local lockless cache. Finally with the arrival of
+really efficient memory allocators in modern operating systems, the shared part
+has also become optional so that it doesn't consume memory if it does not bring
+any value.
+
+In 2.6-dev2, a number of debugging options that used to be configured at build
+time only changed to boot-time and can be modified using keywords passed after
+"-dM" on the command line, which sets or clears bits in the pool_debugging
+variable. The build-time options still affect the default settings however.
+Default values may be consulted using "haproxy -dMhelp".
+
+
+2. Principles
+-------------
+
+The pools architecture is selected at build time. The main options are:
+
+  - thread-local caches and process-wide shared pool enabled (1)
+
+    This is the default situation on most operating systems. Each thread has
+    its own local cache, and when depleted it refills from the process-wide
+    pool that avoids calling the standard allocator too often. It is possible
+    to force this mode at build time by setting CONFIG_HAP_GLOBAL_POOLS or at
+    boot time with "-dMglobal".
+
+  - thread-local caches only are enabled (2)
+
+    This is the situation on operating systems where a fast and modern memory
+    allocator is detected and when it is estimated that the process-wide shared
+    pool will not bring any benefit. This detection is automatic at build time,
+    but may also be forced at build time by setting CONFIG_HAP_NO_GLOBAL_POOLS
+    or at boot time with "-dMno-global".
+
+  - pass-through to the standard allocator (3)
+
+    This is used when one absolutely wants to disable pools and rely on regular
+    malloc() and free() calls, essentially in order to trace memory allocations
+    by call points, either internally via DEBUG_MEM_STATS, or externally via
+    tools such as Valgrind. This mode of operation may be forced at build time
+    by setting DEBUG_NO_POOLS or at boot time with "-dMno-cache".
+
+  - pass-through to an mmap-based allocator for debugging (4)
+
+    This is used only during deep debugging when trying to detect various
+    conditions such as use-after-free. In this case each allocated object's
+    size is rounded up to a multiple of a page size (4096 bytes) and an
+    integral number of pages is allocated for each object using mmap(),
+    surrounded by two inaccessible holes that aim to detect some out-of-bounds
+    accesses. Released objects are instantly freed using munmap() so that any
+    immediate subsequent access to the memory area crashes the process if the
+    area had not been reallocated yet. This mode can be enabled at build time
+    by setting DEBUG_UAF. It tends to consume a lot of memory and not to scale
+    at all with concurrent calls, which tends to make the system stall. The
+    watchdog may even trigger on some slow allocations.
+
+There are no more provisions for running with a shared pool but no thread-local
+cache: the shared pool's main goal is to compensate for the expensive calls to
+the memory allocator. This gain may be huge on tiny systems using basic
+allocators, but the thread-local cache will already achieve this. And on larger
+threaded systems, the shared pool's benefit is visible when the underlying
+allocator scales poorly, but in this case the shared pool would suffer from
+the same limitations without its thread-local cache and wouldn't provide any
+benefit.
+
+Summary of the various operation modes:
+
+                  (1)            (2)            (3)            (4)
+
+                  User           User           User           User
+                   |              |              |              |
+      pool_alloc() V              V              |              |
+              +---------+    +---------+         |              |
+              | Thread  |    | Thread  |         |              |
+              |  Local  |    |  Local  |         |              |
+              |  Cache  |    |  Cache  |         |              |
+              +---------+    +---------+         |              |
+                   |              |              |              |
+    pool_refill*() V              |              |              |
+              +---------+         |              |              |
+              | Shared  |         |              |              |
+              |  Pool   |         |              |              |
+              +---------+         |              |              |
+                   |              |              |              |
+          malloc() V              V              V              |
+              +---------+    +---------+    +---------+         |
+              | Library |    | Library |    | Library |         |
+              +---------+    +---------+    +---------+         |
+                   |              |              |              |
+            mmap() V              V              V              V
+              +---------+    +---------+    +---------+    +---------+
+              |   OS    |    |   OS    |    |   OS    |    |   OS    |
+              +---------+    +---------+    +---------+    +---------+
+
+One extra build define, DEBUG_FAIL_ALLOC, is used to enforce random allocation
+failure in pool_alloc() by randomly returning NULL, to test that callers
+properly handle allocation failures. It may also be enabled at boot time using
+"-dMfail". In this case the desired average rate of allocation failures can be
+fixed by global setting "tune.fail-alloc" expressed in percent.
+
+The thread-local caches contain the freshest objects whose total size amounts
+to CONFIG_HAP_POOL_CACHE_SIZE bytes, which was typically 1MB before 2.6 and
+is 512kB after. The aim is to keep hot objects that still fit in the CPU core's
+private L2 cache. Once these objects do not fit into the cache anymore, there's
+no benefit keeping them local to the thread, so they'd rather be returned to
+the shared pool or the main allocator so that any other thread may make use of
+them.
+
+
+3. Storage in thread-local caches
+---------------------------------
+
+This section describes how objects are linked in thread local caches. This is
+not meant to be a concern for users of the pools API but it can be useful when
+inspecting post-mortem dumps or when trying to figure out certain size
+constraints.
+
+Objects are stored in the local cache using a doubly-linked list. This ensures
+that they can be visited by freshness order like a stack, while at the same
+time being able to access them from oldest to newest when it is needed to
+evict coldest ones first:
+
+  - releasing an object to the cache always puts it on the top.
+
+  - allocating an object from the cache always takes the topmost one, hence the
+    freshest one.
+
+  - scanning for older objects to evict starts from the bottom, where the
+    oldest ones are located.
+
+To that end, each thread-local cache keeps a list head in the "list" member of
+its "pool_cache_head" descriptor, that links all objects cast to type
+"pool_cache_item" via their "by_pool" member.
+
+Note that the mechanism described above only works for a single pool. When
+trying to limit the total cache size to a certain value, all pools included,
+there is also a need to arrange all objects from all pools together in the
+local caches. For this, each thread_ctx maintains a list head of recently
+released objects, all pools included, in its member "pool_lru_head". All items
+in a thread-local cache are linked there via their "by_lru" member.
+
+This means that releasing an object using pool_free() consists in inserting
+it at the beginning of two lists:
+  - the local pool_cache_head's "list" list head
+  - the thread context's "pool_lru_head" list head
+
+Allocating an object consists in picking the first entry from the pool's "list"
+and deleting its "by_pool" and "by_lru" links.
+
+Evicting an object consists in scanning the thread context's "pool_lru_head"
+backwards and deleting the object's "by_pool" and "by_lru" links.
+
+Given that entries are both inserted and removed synchronously, we have the
+guarantee that the oldest object in the thread's LRU list is always the oldest
+object in its pool, and that the next element is the cache's list head. This is
+what allows the LRU eviction mechanism to figure what pool an object belongs to
+when releasing it.
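+
+The release, allocation and eviction operations described above can be
+sketched as follows. This is a simplified model under stated assumptions:
+"struct demo_item" only carries the two links, and the demo_*() functions are
+hypothetical stand-ins rather than the actual pool functions.

```c
#include <assert.h>
#include <stddef.h>

struct list {
    struct list *n;    /* next */
    struct list *p;    /* prev */
};

/* Hypothetical simplified cache item: two links, as described above. */
struct demo_item {
    struct list by_pool;   /* position in its pool's local cache */
    struct list by_lru;    /* position in the thread-wide LRU */
};

static void demo_init(struct list *l)
{
    l->n = l->p = l;
}

static void demo_insert(struct list *h, struct list *e)
{
    e->n = h->n;
    e->p = h;
    h->n->p = e;
    h->n = e;
}

static void demo_del_init(struct list *e)
{
    e->n->p = e->p;
    e->p->n = e->n;
    e->n = e->p = e;
}

/* Release: the item becomes the freshest entry of both lists. */
static void demo_put(struct list *pool_head, struct list *lru_head,
                     struct demo_item *it)
{
    demo_insert(pool_head, &it->by_pool);
    demo_insert(lru_head, &it->by_lru);
}

/* Allocate: take the freshest item of this pool, or NULL if it is empty. */
static struct demo_item *demo_get(struct list *pool_head)
{
    struct demo_item *it;

    if (pool_head->n == pool_head)
        return NULL;
    it = (struct demo_item *)((char *)pool_head->n -
                              offsetof(struct demo_item, by_pool));
    demo_del_init(&it->by_pool);
    demo_del_init(&it->by_lru);
    return it;
}

/* Evict: take the oldest item, all pools included, from the LRU's tail. */
static struct demo_item *demo_evict(struct list *lru_head)
{
    struct demo_item *it;

    if (lru_head->p == lru_head)
        return NULL;
    it = (struct demo_item *)((char *)lru_head->p -
                              offsetof(struct demo_item, by_lru));
    demo_del_init(&it->by_pool);
    demo_del_init(&it->by_lru);
    return it;
}
```

+In this model, the oldest entry of the LRU is also the last entry of its own
+pool's list, so its "by_pool.n" pointer designates that pool's list head.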
+
+Note:
+ | Since a pool_cache_item has two list entries, on 64-bit systems it will be
+ | 32-bytes long. This is the smallest size that a pool may be, and any smaller
+ | size will automatically be rounded up to this size.
+
+When build option DEBUG_POOL_INTEGRITY is set, or the boot-time option
+"-dMintegrity" is passed on the command line, the area of the object between
+the two list elements and the end according to pool->size will be filled with
+pseudo-random words during pool_put_to_cache(), and these words will be
+compared between each other during pool_get_from_cache(), and the process will
+crash in case any bit differs, as this would indicate that the memory area was
+modified after the free. The pseudo-random pattern is in fact incremented by
+(~0)/3 upon each free so that roughly half of the bits change each time and we
+maximize the likelihood of detecting a single bit flip in either direction. In
+order to avoid an immediate reuse and maximize the time the object spends in
+the cache, when this option is set, objects are picked from the cache from the
+oldest one instead of the freshest one. This way even late memory corruptions
+have a chance to be detected.
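+
+As a small worked example of the (~0)/3 increment, assuming 32-bit words purely
+for illustration (the helper names below are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Increment used between two successive fill patterns: all bits set,
 * divided by 3, i.e. 0x55555555 when computed on 32 bits.
 */
#define DEMO_STEP ((uint32_t)~(uint32_t)0 / 3)

/* Hypothetical helper: fill pattern to use after one more free. */
static uint32_t demo_next_pattern(uint32_t prev)
{
    return prev + DEMO_STEP;   /* unsigned arithmetic wraps on overflow */
}

/* Count the bits that differ between two patterns. */
static int demo_bits_changed(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;
    int n = 0;

    while (x) {
        n += x & 1;
        x >>= 1;
    }
    return n;
}
```

+Starting from zero, the first increment yields 0x55555555, which flips 16 of
+the 32 bits.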
+
+When build option DEBUG_MEMORY_POOLS is set, or the boot-time option "-dMtag"
+is passed on the executable's command line, pool objects are allocated with
+one extra pointer compared to the requested size, so that the bytes that follow
+the memory area point to the pool descriptor itself as long as the object is
+allocated via pool_alloc(). Upon releasing via pool_free(), the pointer is
+compared and the code will crash if it differs. This allows detecting both
+memory overflows and objects released to the wrong pool (code bug resulting
+from a copy-paste error typically).
+
+Thus an object will look like this depending on whether it's in the cache or is
+currently in use:
+
+             in cache                 in use
+          +------------+          +------------+
+       <--+  by_pool.p |          |   N bytes  |
+          |  by_pool.n +-->       |            |
+          +------------+          |N=16 min on |
+       <--+  by_lru.p  |          |  32-bit,   |
+          |  by_lru.n  +-->       |  32 min on |
+          +------------+          |  64-bit    |
+          :            :          :            :
+          |   N bytes  |          |            |
+          +------------+          +------------+ \   optional, only if
+          :  (unused)  :          :  pool ptr  :  >  DEBUG_MEMORY_POOLS
+          +------------+          +------------+ /   is set at build time
+                                                     or -dMtag at boot time
+
+Right now no provisions are made to return objects aligned on larger boundaries
+than those currently covered by malloc() (i.e. two pointers). This need appears
+from time to time and the layout above might evolve a little bit if needed.
+
+
+4. Storage in the process-wide shared pool
+------------------------------------------
+
+In order for the shared pool not to be a contention point in a multi-threaded
+environment, objects are allocated from or released to shared pools by clusters
+of a few objects at once. The maximum number of objects that may be moved to or
+from a shared pool at once is defined by CONFIG_HAP_POOL_CLUSTER_SIZE at build
+time, and currently defaults to 8.
+
+In order to remain scalable, the shared pool has to make some tradeoffs to
+limit the number of atomic operations and the duration of any locked operation.
+As such, it's composed of a single-linked list of clusters, themselves made of
+a single-linked list of objects.
+
+Clusters and objects are of the same type "pool_item" and are accessed from the
+pool's "free_list" member. This member points to the latest pool_item inserted
+into the pool by a release operation. And the pool_item's "next" member points
+to the next pool_item, which was the one present in the pool's free_list just
+before the pool_item was inserted, and the last pool_item in the list simply
+has a NULL "next" field.
+
+The pool_item's "down" pointer points down to the next objects part of the same
+cluster, that will be released or allocated at the same time as the first one.
+Each of these items also has a NULL "next" field, and are chained by their
+respective "down" pointers until the last one is detected by a NULL value.
+
+This results in the following layout:
+
+      pool           pool_item   pool_item   pool_item
+    +-----------+    +------+    +------+    +------+
+    | free_list +--> | next +--> | next +--> | NULL |
+    +-----------+    +------+    +------+    +------+
+                     | down |    | NULL |    | down |
+                     +--+---+    +------+    +--+---+
+                        |                       |
+                        V                       V
+                     +------+                +------+
+                     | NULL |                | NULL |
+                     +------+                +------+
+                     | down |                | NULL |
+                     +--+---+                +------+
+                        |
+                        V
+                     +------+
+                     | NULL |
+                     +------+
+                     | NULL |
+                     +------+
+
+Allocating an entry is only a matter of performing two atomic operations on
+the free_list and reading the first pool_item's "next" value:
+
+  - atomically mark the free_list as being updated by writing a "magic" pointer
+  - read the first pool_item's "next" field
+  - atomically replace the free_list with this value
+
+This results in a fast operation that instantly retrieves a cluster at once.
+Then outside of the critical section entries are walked over and inserted into
+the local cache one at a time. In order to keep the code simple and efficient,
+objects allocated from the shared pool are all placed into the local cache, and
+only then the first one is allocated from the cache. This operation is
+performed by the dedicated function pool_refill_local_from_shared() which is
+called from pool_get_from_cache() when the cache is empty. It means there is an
+overhead of two list insert/delete operations for the first object and that
+could be avoided at the expense of more complex code in the fast path, but this
+is negligible since it only concerns objects that need to be visited anyway.
+
+Freeing a group of objects consists in performing the operation the other way
+around:
+
+  - atomically mark the free_list as being updated by writing a "magic" pointer
+  - write the free_list value to the to-be-released item's "next" entry
+  - atomically replace the free_list with the pool_item's pointer
+
+The cluster will simply have to be prepared before being sent to the shared
+pool. The operation of releasing a cluster at once is performed by function
+pool_put_to_shared_cache() which is called from pool_evict_last_items() which
+itself is responsible for building the clusters.
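+
+The allocation and release sequences above may be sketched with C11 atomics as
+shown below. This is a minimal model, not the actual implementation: the
+"demo_" types and functions, the value chosen for the "magic" pointer and the
+busy-wait loop are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_pool_item {
    struct demo_pool_item *next;   /* next cluster in the pool */
    struct demo_pool_item *down;   /* next object of the same cluster */
};

struct demo_pool {
    _Atomic(struct demo_pool_item *) free_list;
};

/* Arbitrary non-NULL marker meaning "free_list is being updated"
 * (assumed value, chosen for this sketch only).
 */
#define DEMO_BUSY ((struct demo_pool_item *)1)

/* Atomically mark free_list as being updated and return its previous
 * value, retrying for as long as another thread holds the marker.
 */
static struct demo_pool_item *demo_lock_free_list(struct demo_pool *pool)
{
    struct demo_pool_item *cur;

    do {
        cur = atomic_exchange(&pool->free_list, DEMO_BUSY);
    } while (cur == DEMO_BUSY);
    return cur;
}

/* Release a prepared cluster (chained via ->down) to the shared pool. */
static void demo_put_cluster(struct demo_pool *pool,
                             struct demo_pool_item *cluster)
{
    cluster->next = demo_lock_free_list(pool);
    atomic_store(&pool->free_list, cluster);
}

/* Retrieve a whole cluster at once, or NULL if the pool is empty. */
static struct demo_pool_item *demo_get_cluster(struct demo_pool *pool)
{
    struct demo_pool_item *first = demo_lock_free_list(pool);

    if (!first) {
        atomic_store(&pool->free_list, (struct demo_pool_item *)NULL);
        return NULL;
    }
    atomic_store(&pool->free_list, first->next);
    first->next = NULL;
    return first;
}
```

+The objects of a retrieved cluster are then walked through their "down"
+pointers outside of the critical section, which keeps the time during which
+the free_list holds the marker very short.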
+
+Due to the way objects are stored, it is important to try to group objects as
+much as possible when releasing them because this is what will condition their
+retrieval as groups as well. This is the reason why pool_evict_last_items()
+uses the LRU to find a first entry but tries to pick several items at once from
+a single cache. Tests have shown that CONFIG_HAP_POOL_CLUSTER_SIZE set to 8
+achieves up to 6-6.5 objects on average per operation, which effectively
+divides by as much the average time spent per object by each thread and pushes
+the contention point further.
+
+Also, grouping items in clusters is a property of the process-wide shared pool
+and not of the thread-local caches. This means that there is no grouped
+operation when not using the shared pool (mode "2" in the diagram above).
+
+
+5. API
+------
+
+The following functions are public and available for user code:
+
+struct pool_head *create_pool(char *name, uint size, uint flags)
+        Create a new pool named <name> for objects of size <size> bytes. Pool
+        names are truncated to their first 11 characters. Pools of very similar
+        size will usually be merged if both have set the flag MEM_F_SHARED in
+        <flags>. When DEBUG_DONT_SHARE_POOLS was set at build time, or
+        "-dMno-merge" is passed on the executable's command line, the pools
+        also need to have the exact same name to be merged. In addition, unless
+        MEM_F_EXACT is set in <flags>, the object size will usually be rounded
+        up to a multiple of the size of four pointers (16 or 32 bytes). The
+        name that will appear in the pool upon merging is the name of the first
+        created pool. The returned pointer is the new (or reused) pool head, or
+        NULL upon error.
+        Pools created this way must be destroyed using pool_destroy().
+
+void *pool_destroy(struct pool_head *pool)
+        Destroy pool <pool>, that is, all of its unused objects are freed and
+        the structure is freed as well if the pool didn't have any used objects
+        anymore. In this case NULL is returned. If some objects remain in use,
+        the pool is preserved and its pointer is returned. This ought to be
+        used essentially on exit or in rare situations where some internal
+        entities that hold pools have to be destroyed.
+
+void pool_destroy_all(void)
+        Destroy all pools, without checking which ones still have used entries.
+        This is only meant for use on exit.
+
+void *__pool_alloc(struct pool_head *pool, uint flags)
+        Allocate an entry from the pool <pool>. The allocator will first look
+        for an object in the thread-local cache if enabled, then in the shared
+        pool if enabled, then will fall back to the operating system's default
+        allocator. NULL is returned if the object couldn't be allocated (due to
+        configured limits or lack of memory). Objects allocated this way have
+        to be released using pool_free(). Like with malloc(), by default the
+        contents of the returned object are undefined. If memory poisoning is
+        enabled, the object will be filled with the poisoning byte. If the
+        global "tune.fail-alloc" setting is non-zero and DEBUG_FAIL_ALLOC is
+        enabled, a random number generator will be called to randomly return a
+        NULL. The allocator's behavior may be adjusted using a few flags passed
+        in <flags>:
+           - POOL_F_NO_POISON : when set, disables memory poisoning (e.g. when
+             pointless and expensive, like for buffers)
+           - POOL_F_MUST_ZERO : when set, the memory area will be zeroed before
+             being returned, similar to what calloc() does
+           - POOL_F_NO_FAIL : when set, disables the random allocation failure,
+             e.g. for use during early init code or critical sections.
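+
+The lookup order described above (thread-local cache, then shared pool, then
+the system allocator) and the effect of a zeroing flag may be illustrated by
+the toy allocator below. All names (toy_pool, toy_alloc, TOY_F_MUST_ZERO) are
+hypothetical stand-ins; this is not the real pools code and it ignores threads
+entirely.
+
```c
#include <stdlib.h>
#include <string.h>

#define TOY_F_MUST_ZERO 0x1u   /* hypothetical stand-in for POOL_F_MUST_ZERO */

struct toy_obj {
    struct toy_obj *next;
};

struct toy_pool {
    size_t size;               /* object size, >= sizeof(struct toy_obj) */
    struct toy_obj *cache;     /* stands for the thread-local cache */
    struct toy_obj *shared;    /* stands for the shared pool */
};

static void *toy_alloc(struct toy_pool *p, unsigned int flags)
{
    struct toy_obj *o;

    if ((o = p->cache))            /* 1. local cache first */
        p->cache = o->next;
    else if ((o = p->shared))      /* 2. then the shared pool */
        p->shared = o->next;
    else                           /* 3. finally the system allocator */
        o = malloc(p->size);

    if (o && (flags & TOY_F_MUST_ZERO))
        memset(o, 0, p->size);     /* calloc()-like behavior on request */
    return o;
}

/* Released objects go back to the local cache in this toy model. */
static void toy_free(struct toy_pool *p, void *ptr)
{
    struct toy_obj *o = ptr;

    if (!o)
        return;
    o->next = p->cache;
    p->cache = o;
}
```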
+
+void *pool_alloc(struct pool_head *pool)
+        This is an exact equivalent of __pool_alloc(pool, 0). It is the regular
+        way to allocate entries from a pool.
+
+void *pool_alloc_nocache(struct pool_head *pool)
+        Allocate an entry from the pool <pool>, bypassing the cache. If shared
+        pools are enabled, they will be consulted first. Otherwise the object
+        is allocated using the operating system's default allocator. This is
+        essentially used during early boot to pre-allocate a number of objects
+        for pools which require a minimum number of entries to exist.
+
+void *pool_zalloc(struct pool_head *pool)
+        This is an exact equivalent of __pool_alloc(pool, POOL_F_MUST_ZERO).
+
+void pool_free(struct pool_head *pool, void *ptr)
+        Free an entry allocated from one of the pool_alloc() functions above
+        from pool <pool>. The object will be placed into the thread-local cache
+        if enabled, or in the shared pool if enabled, or will be released using
+        the operating system's default allocator. When a local cache is
+        enabled, if the local cache size becomes larger than 75% of the maximum
+        size configured at build time, some objects will be evicted to the
+        shared pool. Such objects are taken first from the same pool, but if
+        the total size is really huge, other pools might be checked as well.
+        Some debugging options enabled at build time may enforce extra checks
+        so that the process will immediately crash if the object was not
+        allocated from this pool or experienced an overflow or some memory
+        corruption.
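+
+The 75% eviction threshold mentioned above can be expressed as a simple
+comparison. The helper below is a hypothetical sketch, with both quantities
+expressed in bytes; it only shows the arithmetic, not the eviction itself.
+
```c
#include <stddef.h>

/* Hypothetical helper: non-zero once the local cache holds more than 75%
 * of its configured maximum size.
 */
static int cache_needs_eviction(size_t cache_bytes, size_t cache_max)
{
    return cache_bytes > cache_max * 3 / 4;
}
```
+
+With a 524288-byte cache, eviction would thus start once the cache exceeds
+393216 bytes.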
+
+void pool_flush(struct pool_head *pool)
+        Free all unused objects from shared pool <pool>. Thread-local caches
+        are not affected. This is essentially used when running low on memory
+        or when stopping, in order to release a maximum amount of memory for
+        the new process.
+
+void pool_gc(struct pool_head *pool)
+        Free all unused objects from all pools, but respecting the minimum
+        number of spare objects required for each of them. Then, for operating
+        systems which support it, indicate to the system that all unused memory
+        can be released. Thread-local caches are not affected. This operation
+        differs from pool_flush() in that it is run locklessly, under thread
+        isolation, and on all pools in a row. It is called by the SIGQUIT
+        signal handler and upon exit. Note that the obsolete argument <pool> is
+        not used and the convention is to pass NULL there.
+
+void dump_pools_to_trash(void)
+        Dump the current status of all pools into the trash buffer. This is
+        essentially used by the "show pools" CLI command or the SIGQUIT signal
+        handler to dump them on stderr. The total report size may not exceed
+        the size of the trash buffer. If it does, some entries will be missing.
+
+void dump_pools(void)
+        Dump the current status of all pools to stderr. This just calls
+        dump_pools_to_trash() and writes the trash to stderr.
+
+int pool_total_failures(void)
+        Report the total number of failed allocations. This is solely used to
+        report the "PoolFailed" metrics of the "show info" output. The total
+        is calculated on the fly by summing the number of failures in all pools
+        and is only meant to be used as an indicator rather than a precise
+        measure.
+
+ullong pool_total_allocated(void)
+        Report the total number of bytes allocated in all pools, for reporting
+        in the "PoolAlloc_MB" field of the "show info" output. The total is
+        calculated on the fly by summing the number of allocated bytes in all
+        pools and is only meant to be used as an indicator rather than a
+        precise measure.
+
+ullong pool_total_used(void)
+        Report the total number of bytes used in all pools, for reporting in
+        the "PoolUsed_MB" field of the "show info" output. The total is
+        calculated on the fly by summing the number of used bytes in all pools
+        and is only meant to be used as an indicator rather than a precise
+        measure. Note that objects present in caches are accounted as used.
+
+Some other functions exist and are only used by the pools code itself. While
+not strictly forbidden to use outside of this code, it is generally recommended
+to avoid touching them in order not to create undesired dependencies that will
+complicate maintenance.
+
+A few macros exist to ease the declaration of pools:
+
+DECLARE_POOL(ptr, name, size)
+        Placed at the top level of a file, this declares a global memory pool
+        as variable <ptr>, name <name> and size <size> bytes per element. This
+        is made via a call to REGISTER_POOL() and by assigning the resulting
+        pointer to variable <ptr>. <ptr> will be created of type "struct
+        pool_head *". If the pool needs to be visible outside of the file
+        (which is likely), it will also need to be declared somewhere as
+        "extern struct pool_head *<ptr>;". It is recommended to place such
+        declarations very early in the source file so that the variable is
+        already known to all subsequent functions which may use it.
+
+DECLARE_STATIC_POOL(ptr, name, size)
+        Placed at the top level of a file, this declares a static memory pool
+        as variable <ptr>, name <name> and size <size> bytes per element. This
+        is made via a call to REGISTER_POOL() and by assigning the resulting
+        pointer to local variable <ptr>. <ptr> will be created of type "static
+        struct pool_head *". It is recommended to place such declarations very
+        early in the source file so that the variable is already known to all
+        subsequent functions which may use it.
+
+
+6. Build options
+----------------
+
+A number of build-time defines allow tuning the pools' behavior. All of them
+have to be enabled using "-Dxxx" or "-Dxxx=yyy" in the makefile's DEBUG
+variable.
+
+DEBUG_NO_POOLS
+        When this is set, pools are entirely disabled, and allocations are made
+        using malloc() instead. This is not recommended for production but may
+        be useful for tracing allocations. It corresponds to "-dMno-cache" at
+        boot time.
+
+DEBUG_MEMORY_POOLS
+        When this is set, an extra pointer is allocated at the end of each
+        object to reference the pool the object was allocated from and detect
+        buffer overflows. Then, pool_free() will provoke a crash in case it
+        detects an anomaly (pointer at the end not matching the pool). It
+        corresponds to "-dMtag" at boot time.
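+
+The principle of the trailing tag can be sketched as below. This is a
+simplified, hypothetical model (toy_pool, tagged_alloc and tag_is_valid are
+not HAProxy functions) which stores the owner pool's address right after the
+object and compares it again later, as a release function would do.
+
```c
#include <stdlib.h>
#include <string.h>

struct toy_pool {
    size_t size;                 /* usable object size in bytes */
};

/* Allocate an object followed by a pointer to its owner pool. */
static void *tagged_alloc(struct toy_pool *p)
{
    unsigned char *obj = malloc(p->size + sizeof(void *));

    if (obj) {
        void *tag = p;
        memcpy(obj + p->size, &tag, sizeof(tag));
    }
    return obj;
}

/* Return non-zero if the tag after the object still designates <p>. */
static int tag_is_valid(const struct toy_pool *p, const void *ptr)
{
    const void *tag;

    memcpy(&tag, (const unsigned char *)ptr + p->size, sizeof(tag));
    return tag == p;
}
```
+
+A mismatch means either that the object is being released to the wrong pool
+or that something wrote past its end.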
+
+DEBUG_FAIL_ALLOC
+        When enabled, a global setting "tune.fail-alloc" may be set to a non-
+        zero value representing a percentage of memory allocations that will be
+        made to fail in order to stress the calling code. It corresponds to
+        "-dMfail" at boot time.
+
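+A percentage-based failure decision such as the one described for
+DEBUG_FAIL_ALLOC may be sketched as follows. The function name is
+hypothetical and rand() merely stands for whatever random source is used.
+
```c
#include <stdlib.h>

/* Return 1 when the allocation should be made to fail. <rate> is a
 * percentage between 0 (never fail) and 100 (always fail).
 */
static int toy_should_fail(int rate)
{
    if (rate <= 0)
        return 0;
    return rate > rand() % 100;
}
```
+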
+DEBUG_DONT_SHARE_POOLS
+        When enabled, pools of similar sizes are not merged unless they have
+        the exact same name. It corresponds to "-dMno-merge" at boot time.
+
+DEBUG_UAF
+        When enabled, pools are disabled and all allocations and releases pass
+        through mmap() and munmap(). The memory usage significantly inflates
+        and the performance degrades, but this allows detecting a lot of
+        use-after-free conditions by crashing the program at the first abnormal
+        access. This should not be used in production.
+
+DEBUG_POOL_INTEGRITY
+        When enabled, objects picked from the cache are checked for corruption
+        by comparing their contents against a pattern that was placed when they
+        were inserted into the cache. Objects are also allocated in the reverse
+        order, from the oldest one to the most recent, so as to maximize the
+        ability to detect such a corruption. The goal is to detect writes after
+        free (or possibly hardware memory corruptions). Contrary to DEBUG_UAF
+        this cannot detect reads after free, but may possibly detect later
+        corruptions and will not consume extra memory. The CPU usage will
+        increase a bit due to the cost of filling/checking the area and for the
+        preference for cold cache instead of hot cache, though not as much as
+        with DEBUG_UAF. This option is meant to be usable in production. It
+        corresponds to boot-time options "-dMcold-first,integrity".
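+
+The fill-then-check principle behind this option can be illustrated with a
+fixed byte pattern. This is a deliberately simplified, hypothetical sketch:
+the pattern value and function names are assumptions and do not reflect the
+actual implementation.
+
```c
#include <stddef.h>
#include <string.h>

#define FILL_BYTE 0xA5   /* arbitrary pattern chosen for this sketch */

/* Called when an object enters the cache: overwrite its contents. */
static void fill_on_release(void *obj, size_t size)
{
    memset(obj, FILL_BYTE, size);
}

/* Called when an object leaves the cache: return non-zero if the pattern
 * is intact, zero if something wrote to the object in the meantime.
 */
static int intact_on_pick(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    size_t i;

    for (i = 0; i < size; i++)
        if (p[i] != FILL_BYTE)
            return 0;
    return 1;
}
```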
+
+DEBUG_POOL_TRACING
+        When enabled, the callers of pool_alloc() and pool_free() will be
+        recorded into an extra memory area placed after the end of the object.
+        This may only be required by developers who want to get a few more
+        hints about code paths involved in some crashes, but will serve no
+        purpose outside of this. It remains compatible with (and complements
+        well) DEBUG_POOL_INTEGRITY above. Such information becomes meaningless
+        once the objects leave the thread-local cache. It corresponds to
+        boot-time option "-dMcaller".
+
+DEBUG_MEM_STATS
+        When enabled, all malloc/calloc/realloc/strdup/free calls are accounted
+        for per call place (file+line number), and may be displayed or reset on
+        the CLI using "debug dev memstats". This is essentially used to detect
+        potential leaks or abnormal usages. When pools are enabled (default),
+        such calls are rare and the output will mostly contain calls induced by
+        libraries. When pools are disabled, about all calls to pool_alloc() and
+        pool_free() will also appear since they will be remapped to standard
+        functions.
+
+CONFIG_HAP_GLOBAL_POOLS
+        When enabled, process-wide shared pools will be forcefully enabled even
+        if not considered useful on the platform. The default is to let haproxy
+        decide based on the OS and C library. It corresponds to boot-time
+        option "-dMglobal".
+
+CONFIG_HAP_NO_GLOBAL_POOLS
+        When enabled, process-wide shared pools will be forcefully disabled
+        even if considered useful on the platform. The default is to let
+        haproxy decide based on the OS and C library. It corresponds to
+        boot-time option "-dMno-global".
+
+CONFIG_HAP_POOL_CACHE_SIZE
+        This allows one to define the size of the per-thread cache, in bytes.
+        The default value is 512 kB (524288). Smaller values will use less
+        memory at the expense of a possibly higher CPU usage when using many
+        threads. Higher values will give diminishing returns on performance
+        while using much more memory. Usually there is no benefit in using
+        more than a per-core L2 cache size. It would be better not to set this
+        value lower than a few times the size of a buffer (bufsize, defaults to
+        16 kB).
+
+CONFIG_HAP_POOL_CLUSTER_SIZE
+        This allows one to define the maximum number of objects that will be
+        grouped together in an allocation from the shared pool. Values 4 to 8
+        have experimentally shown good results with 16 threads. On systems with
+        more cores or loosely coupled caches exhibiting slow atomic operations,
+        it could possibly make sense to slightly increase this value.
diff --git a/doc/internals/api/scheduler.txt b/doc/internals/api/scheduler.txt
new file mode 100644
index 0000000..3469543
--- /dev/null
+++ b/doc/internals/api/scheduler.txt
@@ -0,0 +1,226 @@
+2021-11-17 - Scheduler API
+
+
+1. Background
+-------------
+
+The scheduler relies on two major parts:
+  - the wait queue or timers queue, which contains an ordered tree of the next
+    timers to expire
+
+  - the run queue, which contains tasks that were already woken up and are
+    waiting for a CPU slot to execute.
+
+There are two types of schedulable objects in HAProxy:
+  - tasks: they contain one timer and can be in the run queue without leaving
+    their place in the timers queue.
+
+  - tasklets: they do not have the timers part and are either sleeping or
+    running.
+
+Both the timers queue and the run queue in fact exist in two forms: one shared
+between all threads, and one per thread. A task or tasklet may only be queued
+in one queue of each kind at a time. The thread-local queues are not
+thread-safe while the shared ones are. This means that it is only permitted to
+manipulate an object which is in the local queue, or one which is in a shared
+queue, but then only after locking it. As such, tasks and tasklets are usually
+pinned to threads and do not move, or only in very specific ways not detailed
+here.
+
+In case of doubt, keep in mind that it's not permitted to manipulate another
+thread's private task or tasklet, and that any task held by another thread
+might vanish while it's being looked at.
+
+Internally a large part of the task and tasklet struct is shared between
+the two types, which reduces code duplication and eases the preservation
+of fairness in the run queue by interleaving all of them. As such, some
+fields or flags may not always be relevant to tasklets and may be ignored.
+
+
+Tasklets do not use a thread mask but use a thread ID instead, to which they
+are bound. If the thread ID is negative, the tasklet is not bound but may only
+be run on the calling thread.
+
+
+2. API
+------
+
+There are few functions exposed by the scheduler. A few more are in fact
+accessible, but if not documented here they should rather be avoided, or used
+only when absolutely certain they're suitable, as some have delicate corner
+cases. When in doubt, checking the sched.pdf diagram may help.
+
+int total_run_queues()
+        Return the approximate number of tasks in run queues. This is racy
+        and a bit inaccurate as it iterates over all queues, but it is
+        sufficient for stats reporting.
+
+int task_in_rq(t)
+        Return non-zero if the designated task is in the run queue (i.e. it was
+        already woken up).
+
+int task_in_wq(t)
+        Return non-zero if the designated task is in the timers queue (i.e. it
+        has a valid timeout and will eventually expire).
+
+int thread_has_tasks()
+        Return non-zero if the current thread has some work to be done in the
+        run queue. This is used to decide whether or not to sleep in poll().
+
+void task_wakeup(t, f)
+        Make sure that task <t> will wake up, that is, will execute at least
+        once after the function is called. The task flags <f> will
+        be ORed on the task's state, among TASK_WOKEN_* flags exclusively. In
+        multi-threaded environments it is safe to wake up another thread's task
+        and even if the thread is sleeping it will be woken up. Users have to
+        keep in mind that a task running on another thread might very well
+        finish and go back to sleep before the function returns. It is
+        permitted to wake the current task up, in which case it will be
+        scheduled to run another time after it returns to the scheduler.
+
+struct task *task_unlink_wq(t)
+        Remove the task from the timers queue if it was in it, and return it.
+        It may only be done for the local thread, or for a shared task that
+        might be in the shared queue. It must not be done for another thread's
+        task.
+
+void task_queue(t)
+        Place or update task <t> into the timers queue, where it may already
+        be, scheduling it for an expiration at date t->expire. If t->expire is
+        infinite, nothing is done, so it's safe to call this function without
+        prior checking the expiration date. It is only valid to call this
+        function for local tasks or for shared tasks which have the calling
+        thread in their thread mask.
+
+void task_set_affinity(t, m)
+        Change task <t>'s thread_mask to new value <m>. This may only be
+        performed by the task itself while running. This is only used to let a
+        task voluntarily migrate to another thread.
+
+void tasklet_wakeup(tl)
+        Make sure that tasklet <tl> will wake up, that is, will execute at
+        least once. The tasklet will run on its assigned thread, or on any
+        thread if its TID is negative.
+
+void tasklet_wakeup_on(tl, thr)
+        Make sure that tasklet <tl> will wake up on thread <thr>, that is, will
+        execute at least once. The designated thread may only differ from the
+        calling one if the tasklet is already configured to run on another
+        thread, and it is not permitted to self-assign a tasklet if its tid is
+        negative, as it may already be scheduled to run somewhere else. When in
+        doubt, only use tasklet_wakeup() which will pick the tasklet's assigned
+        thread ID.
+
+struct tasklet *tasklet_new()
+        Allocate a new tasklet and set it to run by default on the calling
+        thread. The caller may change its tid to another one before using it.
+        The new tasklet is returned.
+
+struct task *task_new_anywhere()
+        Allocate a new task to run on any thread, and return the task, or NULL
+        in case of allocation issue. Note that such tasks will be marked as
+        shared and will go through the locked queues, thus their activity will
+        be heavier than for other ones. See also task_new_here().
+
+struct task *task_new_here()
+        Allocate a new task to run on the calling thread, and return the task,
+        or NULL in case of allocation issue.
+
+struct task *task_new_on(t)
+        Allocate a new task to run on thread <t>, and return the task, or NULL
+        in case of allocation issue.
+
+void task_destroy(t)
+        Destroy this task. The task will be unlinked from any timers queue,
+        and either immediately freed, or asynchronously killed if currently
+        running. This may only be done by one of the threads this task is
+        allowed to run on. Developers must not forget that the task's memory
+        area is not always immediately freed, and that certain misuses could
+        only have effect later down the chain (e.g. use-after-free).
+
+void tasklet_free(tl)
+        Free tasklet <tl>, which must not be running. As such, this may only
+        be called by the thread responsible for the tasklet, typically from
+        the tasklet's process() function itself.
+
+void task_schedule(t, d)
+        Schedule task <t> to run no later than date <d>. If the task is already
+        running, or scheduled for an earlier instant, nothing is done. If the
+        task was not queued or was scheduled to run later, its timer entry
+        will be updated. This function assumes that it will never be called
+        with a timer in the past nor with TICK_ETERNITY. Only one of the
+        threads assigned to the task may call this function.
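+
+The "no later than" semantics of task_schedule() can be modeled as below.
+This is a toy model only: the structure, the TOY_ETERNITY value and the
+wrapping comparison helper are assumptions made for this sketch, and no queue
+is actually manipulated.
+
```c
#define TOY_ETERNITY 0u    /* assumed marker for "no expiration date" */

struct toy_task {
    unsigned int expire;   /* expiration date in ticks, or TOY_ETERNITY */
    int running;           /* non-zero if already woken up */
};

/* Wrapping comparison: non-zero if tick <a> is before tick <b>. */
static int toy_tick_is_lt(unsigned int a, unsigned int b)
{
    return (int)(a - b) < 0;
}

/* Make sure <t> expires no later than <when>; never postpone it. */
static void toy_task_schedule(struct toy_task *t, unsigned int when)
{
    if (t->running)
        return;            /* already running: nothing to do */

    if (t->expire != TOY_ETERNITY && !toy_tick_is_lt(when, t->expire))
        return;            /* already scheduled for an earlier instant */

    t->expire = when;      /* not queued, or queued later: update */
}
```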
+
+The task's ->process() function receives the following arguments:
+
+  - struct task *t: a pointer to the task itself. It is always valid.
+
+  - void *ctx     : a copy of the task's ->context pointer at the moment
+                    the ->process() function was called by the scheduler. A
+                    function must use this and not task->context, because
+                    task->context might possibly be changed by another thread.
+                    For instance, the muxes' takeover() functions do this.
+
+  - uint state    : a copy of the task's ->state field at the moment the
+                    ->process() function was executed. A function must use
+                    this and not task->state as the latter misses the wakeup
+                    reasons and may constantly change during execution along
+                    concurrent wakeups (threads or signals).
+
+The possible state flags to use during a call to task_wakeup() or seen by the
+task being called are the following; they're automatically cleaned from the
+state field before the call to ->process():
+
+  - TASK_WOKEN_INIT    each creation of a task causes a first wakeup with this
+                       flag set. Applications should not set it themselves.
+
+  - TASK_WOKEN_TIMER   this indicates the task's expire date was reached in the
+                       timers queue. Applications should not set it themselves.
+
+  - TASK_WOKEN_IO      indicates the wake-up happened due to I/O activity. Now
+                       that all low-level I/O processing happens on tasklets,
+                       this notion of I/O is now application-defined (for
+                       example stream-interfaces use it to notify the stream).
+
+  - TASK_WOKEN_SIGNAL  indicates that a signal the task was subscribed to was
+                       received. Applications should not set it themselves.
+
+  - TASK_WOKEN_MSG     any application-defined wake-up reason, usually for
+                       inter-task communication (e.g. filters vs streams).
+
+  - TASK_WOKEN_RES     a resource the task was waiting for was finally made
+                       available, allowing the task to continue its work. This
+                       is essentially used by buffers and queues. Applications
+                       may carefully use it for their own purpose if they're
+                       certain not to rely on existing ones.
+
+  - TASK_WOKEN_OTHER   any other application-defined wake-up reason.
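+
+A handler following the rules above works on the <ctx> and <state> arguments
+rather than on the task's own fields. The sketch below uses hypothetical
+types and flag values (the real ones come from HAProxy's headers), and the
+choice of returning the task pointer is an assumption of this example.
+
```c
/* Hypothetical flag values, for illustration only. */
#define TOY_WOKEN_TIMER 0x01u
#define TOY_WOKEN_IO    0x02u
#define TOY_WOKEN_MSG   0x04u

struct toy_task {
    void *context;
    unsigned int state;
};

struct toy_ctx {
    int timeouts;
    int io_events;
    int messages;
};

static struct toy_task *toy_process(struct toy_task *t, void *ctx,
                                    unsigned int state)
{
    struct toy_ctx *c = ctx;      /* use <ctx>, not t->context */

    if (state & TOY_WOKEN_TIMER)  /* use <state>, not t->state */
        c->timeouts++;
    if (state & TOY_WOKEN_IO)
        c->io_events++;
    if (state & TOY_WOKEN_MSG)
        c->messages++;
    return t;
}
```
+
+Several wake-up reasons may be reported in a single call, which is why each
+flag is tested independently.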
+
+
+In addition, a few persistent flags may be observed or manipulated by the
+application, both for tasks and tasklets:
+
+  - TASK_SELF_WAKING   when set, indicates that this task was found waking
+                       itself up, and its class will change to bulk processing.
+                       If this behavior is under control, only temporarily
+                       expected, and not expected to happen again, it may make
+                       sense to reset this flag from the ->process() function
+                       itself.
+
+  - TASK_HEAVY         when set, indicates that this task does such heavy
+                       processing that it will become mandatory to give back
+                       control to I/Os otherwise big latencies might occur. It
+                       may be set by an application that expects something
+                       heavy to happen (tens to hundreds of microseconds), and
+                       reset once finished. One example user is the TLS stack,
+                       which sets it when an imminent crypto operation is
+                       expected.
+
+  - TASK_F_USR1        This is the first application-defined persistent flag.
+                       It is always zero unless the application changes it. An
+                       example of use cases is the I/O handler for backend
+                       connections, to mention whether the connection is safe
+                       to use or might have recently been migrated.
+
+Finally, when built with -DDEBUG_TASK, an extra sub-structure "debug" is added
+to both tasks and tasklets to note the code locations of the last two calls to
+task_wakeup() and tasklet_wakeup().