From fe39ffb8b90ae4e002ed73fe98617cd590abb467 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sat, 27 Apr 2024 08:33:50 +0200 Subject: Adding upstream version 2.4.56. Signed-off-by: Daniel Baumann --- docs/manual/rewrite/tech.html.en | 205 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 205 insertions(+) create mode 100644 docs/manual/rewrite/tech.html.en (limited to 'docs/manual/rewrite/tech.html.en') diff --git a/docs/manual/rewrite/tech.html.en b/docs/manual/rewrite/tech.html.en new file mode 100644 index 0000000..5207ec7 --- /dev/null +++ b/docs/manual/rewrite/tech.html.en @@ -0,0 +1,205 @@ + + + + + +Apache mod_rewrite Technical Details - Apache HTTP Server Version 2.4 + + + + + + + +
<-
+

Apache mod_rewrite Technical Details

+
+

Available Languages:  en  | + fr 

+
+ +

This document discusses some of the technical details of mod_rewrite +and URL matching.

+
+ +
top
+
+

API Phases

+ +

The Apache HTTP Server handles requests in several phases. At + each of these phases, one or more modules may be called upon to + handle that portion of the request lifecycle. Phases include things + like URL-to-filename translation, authentication, authorization, + content, and logging. (This is not an exhaustive list.)

+ +

mod_rewrite acts in two of these phases (or "hooks", as they are + often called) to influence how URLs may be rewritten.

+ +

First, it uses the URL-to-filename translation hook, which occurs + after the HTTP request has been read, but before any authorization + starts. Secondly, it uses the Fixup hook, which is after the + authorization phases, and after per-directory configuration files + (.htaccess files) have been read, but before the + content handler is called.

+ +

So, after a request comes in and a corresponding server or + virtual host has been determined, the rewriting engine starts + processing any mod_rewrite directives appearing in the + per-server configuration. (i.e., in the main server configuration file + and <Virtualhost> + sections.) This happens in the URL-to-filename phase.

+ +

A few steps later, once the final data directories have been found, + the per-directory configuration directives (.htaccess + files and <Directory> blocks) are applied. This + happens in the Fixup phase.

+ +

In each of these cases, mod_rewrite rewrites the + REQUEST_URI either to a new URL, or to a filename.

+ +

In per-directory context (i.e., within .htaccess files + and Directory blocks), these rules are being applied + after a URL has already been translated to a filename. Because of + this, the URL-path that mod_rewrite initially compares RewriteRule directives against + is the full filesystem path to the translated filename with the current + directories path (including a trailing slash) removed from the front.

+ +

To illustrate: If rules are in /var/www/foo/.htaccess and a request + for /foo/bar/baz is being processed, an expression like ^bar/baz$ would + match.

+ +

If a substitution is made in per-directory context, a new internal + subrequest is issued with the new URL, which restarts processing of the + request phases. If the substitution is a relative path, the RewriteBase directive + determines the URL-path prefix prepended to the substitution. + In per-directory context, care must be taken to + create rules which will eventually (in some future "round" of per-directory + rewrite processing) not perform a substitution to avoid looping. + (See RewriteLooping + for further discussion of this problem.)

+ +

Because of this further manipulation of the URL in per-directory + context, you'll need to take care to craft your rewrite rules + differently in that context. In particular, remember that the + leading directory path will be stripped off of the URL that your + rewrite rules will see. Consider the examples below for further + clarification.

+ + + + + + + + + + + + + + + + + + + + + + + +
Location of ruleRule
VirtualHost sectionRewriteRule "^/images/(.+)\.jpg" "/images/$1.gif"
.htaccess file in document rootRewriteRule "^images/(.+)\.jpg" "images/$1.gif"
.htaccess file in images directoryRewriteRule "^(.+)\.jpg" "$1.gif"
+ +

For even more insight into how mod_rewrite manipulates URLs in + different contexts, you should consult the log entries made during + rewriting.

+ +
top
+
+

Ruleset Processing

+ +

Now when mod_rewrite is triggered in these two API phases, it + reads the configured rulesets from its configuration + structure (which itself was either created on startup for + per-server context or during the directory walk of the Apache + kernel for per-directory context). Then the URL rewriting + engine is started with the contained ruleset (one or more + rules together with their conditions). The operation of the + URL rewriting engine itself is exactly the same for both + configuration contexts. Only the final result processing is + different.

+ +

The order of rules in the ruleset is important because the + rewriting engine processes them in a special (and not very + obvious) order. The rule is this: The rewriting engine loops + through the ruleset rule by rule (RewriteRule directives) and + when a particular rule matches it optionally loops through + existing corresponding conditions (RewriteCond + directives). For historical reasons the conditions are given + first, and so the control flow is a little bit long-winded. See + Figure 1 for more details.

+

+ Flow of RewriteRule and RewriteCond matching
+ Figure 1:The control flow through the rewriting ruleset +

+

First the URL is matched against the + Pattern of each rule. If it fails, mod_rewrite + immediately stops processing this rule, and continues with the + next rule. If the Pattern matches, mod_rewrite looks + for corresponding rule conditions (RewriteCond directives, + appearing immediately above the RewriteRule in the configuration). + If none are present, it substitutes the URL with a new value, which is + constructed from the string Substitution, and goes on + with its rule-looping. But if conditions exist, it starts an + inner loop for processing them in the order that they are + listed. For conditions, the logic is different: we don't match + a pattern against the current URL. Instead we first create a + string TestString by expanding variables, + back-references, map lookups, etc. and then we try + to match CondPattern against it. If the pattern + doesn't match, the complete set of conditions and the + corresponding rule fails. If the pattern matches, then the + next condition is processed until no more conditions are + available. If all conditions match, processing is continued + with the substitution of the URL with + Substitution.

+ +
+
+

Available Languages:  en  | + fr 

+
top

Comments

Notice:
This is not a Q&A section. Comments placed here should be pointed towards suggestions on improving the documentation or server, and may be removed by our moderators if they are either implemented or considered invalid/off-topic. Questions on how to manage the Apache HTTP Server should be directed at either our IRC channel, #httpd, on Libera.chat, or sent to our mailing lists.
+
+ \ No newline at end of file -- cgit v1.2.3