From: Willy Tarreau
Date: Tue, 8 Aug 2023 19:35:25 +0200
Subject: DOC: clarify the handling of URL fragments in requests
Origin: https://git.haproxy.org/?p=haproxy-2.6.git;a=commit;h=c47814a58ec153a526e8e9e822cda6e66cef5cc2

We indicate in path/pathq/url that they may contain '#' if the frontend is
configured with "option accept-invalid-http-request", and that option
mentions the fragment as well.

(cherry picked from commit 7ab4949ef107a7088777f954de800fe8cf727796)
 [ad: backported as a companion to BUG/MINOR: h1: do not accept '#' as part of
  the URI component]
Signed-off-by: Amaury Denoyelle
(cherry picked from commit 965fb74eb180ab4f275ef907e018128e7eee0e69)
Signed-off-by: Amaury Denoyelle
(cherry picked from commit e9903d6073ce9ff0ed8b304700e9d2b435ed8050)
Signed-off-by: Amaury Denoyelle
---
 doc/configuration.txt | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/doc/configuration.txt b/doc/configuration.txt
index 7219c489ef9a..c01abb2d0a66 100644
--- a/doc/configuration.txt
+++ b/doc/configuration.txt
@@ -8609,6 +8609,8 @@ no option accept-invalid-http-request
   option also relaxes the test on the HTTP version, it allows HTTP/0.9 requests
   to pass through (no version specified), as well as different protocol names
   (e.g. RTSP), and multiple digits for both the major and the minor version.
+  Finally, this option also allows incoming URLs to contain fragment references
+  ('#' after the path).
 
   This option should never be enabled by default as it hides application bugs
   and open security breaches. It should only be deployed after a problem has
@@ -20991,7 +20993,11 @@ path : string
   information from databases and keep them in caches. Note that with outgoing
   caches, it would be wiser to use "url" instead. With ACLs, it's typically
   used to match exact file names (e.g. "/login.php"), or directory parts using
-  the derivative forms. See also the "url" and "base" fetch methods.
+  the derivative forms. See also the "url" and "base" fetch methods. Please
+  note that any fragment reference in the URI ('#' after the path) is strictly
+  forbidden by the HTTP standard and will be rejected. However, if the frontend
+  receiving the request has "option accept-invalid-http-request", then this
+  fragment part will be accepted and will also appear in the path.
 
   ACL derivatives :
     path     : exact string match
@@ -21009,7 +21015,11 @@ pathq : string
   relative URI, excluding the scheme and the authority part, if any. Indeed,
   while it is the common representation for an HTTP/1.1 request target, in
   HTTP/2, an absolute URI is often used. This sample fetch will return the same
-  result in both cases.
+  result in both cases. Please note that any fragment reference in the URI ('#'
+  after the path) is strictly forbidden by the HTTP standard and will be
+  rejected. However, if the frontend receiving the request has "option
+  accept-invalid-http-request", then this fragment part will be accepted and
+  will also appear in the path.
 
 query : string
   This extracts the request's query string, which starts after the first
@@ -21242,7 +21252,11 @@ url : string
   "path" is preferred over using "url", because clients may send a full URL as
   is normally done with proxies. The only real use is to match "*" which does
   not match in "path", and for which there is already a predefined ACL. See
-  also "path" and "base".
+  also "path" and "base". Please note that any fragment reference in the URI
+  ('#' after the path) is strictly forbidden by the HTTP standard and will be
+  rejected. However, if the frontend receiving the request has "option
+  accept-invalid-http-request", then this fragment part will be accepted and
+  will also appear in the url.
 
   ACL derivatives :
     url     : exact string match
-- 
2.43.0
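
The behaviour documented by this patch can be illustrated with a small toy
model. This is only a sketch written for this note, not HAProxy code: the
names fetch_path, InvalidRequest and accept_invalid are hypothetical, and the
parsing is deliberately simplified (the path is taken as everything up to the
first '?', after dropping any scheme and authority).

```python
class InvalidRequest(ValueError):
    """Raised by the toy model when a request target is rejected."""


def fetch_path(target: str, accept_invalid: bool = False) -> str:
    """Toy model of the "path" fetch as described in the patched text.

    accept_invalid stands in for a frontend configured with
    "option accept-invalid-http-request" (an assumption of this sketch).
    """
    # A '#' in the request target is rejected unless the option is set.
    if "#" in target and not accept_invalid:
        raise InvalidRequest("fragment reference not allowed in request target")

    # Absolute URI: drop "scheme://authority" and keep what follows.
    if not target.startswith("/") and "://" in target:
        rest = target.split("://", 1)[1]
        slash = rest.find("/")
        target = rest[slash:] if slash != -1 else "/"

    # The path ends before the first '?'. With the option set, a fragment
    # placed right after the path is therefore returned as part of it.
    return target.split("?", 1)[0]


if __name__ == "__main__":
    print(fetch_path("/login.php"))
    print(fetch_path("/index.html#top", accept_invalid=True))
    try:
        fetch_path("/index.html#top")
    except InvalidRequest as exc:
        print("rejected:", exc)
```

In this model, a rule that compares the path against an exact file name such as
"/login.php" would no longer match a request for "/login.php#x" once the option
is enabled, since the fragment becomes part of the returned value.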