From 6beeb1b708550be0d4a53b272283e17e5e35fe17 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Sun, 7 Apr 2024 17:01:30 +0200
Subject: Adding upstream version 2.4.57.

Signed-off-by: Daniel Baumann
---
 docs/manual/developer/API.html | 5 +
 docs/manual/developer/API.html.en | 1245 +++++++++++++++
 docs/manual/developer/debugging.html | 5 +
 docs/manual/developer/debugging.html.en | 60 +
 docs/manual/developer/documenting.html | 9 +
 docs/manual/developer/documenting.html.en | 112 ++
 docs/manual/developer/documenting.html.zh-cn.utf8 | 109 ++
 docs/manual/developer/filters.html | 5 +
 docs/manual/developer/filters.html.en | 234 +++
 docs/manual/developer/hooks.html | 5 +
 docs/manual/developer/hooks.html.en | 261 ++++
 docs/manual/developer/index.html | 9 +
 docs/manual/developer/index.html.en | 89 ++
 docs/manual/developer/index.html.zh-cn.utf8 | 88 ++
 docs/manual/developer/modguide.html | 5 +
 docs/manual/developer/modguide.html.en | 1739 +++++++++++++++++++++
 docs/manual/developer/modules.html | 9 +
 docs/manual/developer/modules.html.en | 306 ++++
 docs/manual/developer/modules.html.ja.utf8 | 301 ++++
 docs/manual/developer/new_api_2_4.html | 5 +
 docs/manual/developer/new_api_2_4.html.en | 601 +++++++
 docs/manual/developer/output-filters.html | 5 +
 docs/manual/developer/output-filters.html.en | 585 +++++++
 docs/manual/developer/request.html | 5 +
 docs/manual/developer/request.html.en | 248 +++
 docs/manual/developer/thread_safety.html | 5 +
 docs/manual/developer/thread_safety.html.en | 307 ++++
 27 files changed, 6357 insertions(+)
 create mode 100644 docs/manual/developer/API.html
 create mode 100644 docs/manual/developer/API.html.en
 create mode 100644 docs/manual/developer/debugging.html
 create mode 100644 docs/manual/developer/debugging.html.en
 create mode 100644 docs/manual/developer/documenting.html
 create mode 100644 docs/manual/developer/documenting.html.en
 create mode 100644 docs/manual/developer/documenting.html.zh-cn.utf8
 create mode 100644 docs/manual/developer/filters.html
 create mode 100644 docs/manual/developer/filters.html.en
 create mode 100644 docs/manual/developer/hooks.html
 create mode 100644 docs/manual/developer/hooks.html.en
 create mode 100644 docs/manual/developer/index.html
 create mode 100644 docs/manual/developer/index.html.en
 create mode 100644 docs/manual/developer/index.html.zh-cn.utf8
 create mode 100644 docs/manual/developer/modguide.html
 create mode 100644 docs/manual/developer/modguide.html.en
 create mode 100644 docs/manual/developer/modules.html
 create mode 100644 docs/manual/developer/modules.html.en
 create mode 100644 docs/manual/developer/modules.html.ja.utf8
 create mode 100644 docs/manual/developer/new_api_2_4.html
 create mode 100644 docs/manual/developer/new_api_2_4.html.en
 create mode 100644 docs/manual/developer/output-filters.html
 create mode 100644 docs/manual/developer/output-filters.html.en
 create mode 100644 docs/manual/developer/request.html
 create mode 100644 docs/manual/developer/request.html.en
 create mode 100644 docs/manual/developer/thread_safety.html
 create mode 100644 docs/manual/developer/thread_safety.html.en

(limited to 'docs/manual/developer')

diff --git a/docs/manual/developer/API.html b/docs/manual/developer/API.html
new file mode 100644
index 0000000..f178e90
--- /dev/null
+++ b/docs/manual/developer/API.html
@@ -0,0 +1,5 @@
+# GENERATED FROM XML -- DO NOT EDIT
+
+URI: API.html.en
+Content-Language: en
+Content-type: text/html; charset=UTF-8
diff --git a/docs/manual/developer/API.html.en b/docs/manual/developer/API.html.en
new file mode 100644
index 0000000..60be1bc
--- /dev/null
+++ b/docs/manual/developer/API.html.en
@@ -0,0 +1,1245 @@
+Apache 1.3 API notes - Apache HTTP Server Version 2.4
+

Apache 1.3 API notes

+
+

Available Languages:  en 

+
+ +

Warning

+

This document has not been updated to take into account changes made + in the 2.0 version of the Apache HTTP Server. Some of the information may + still be relevant, but please use it with care.

+
+ +

These are some notes on the Apache API and the data structures you have + to deal with, etc. They are not yet nearly complete, but hopefully, + they will help you get your bearings. Keep in mind that the API is still + subject to change as we gain experience with it. (See the TODO file for + what might be coming). However, it will be easy to adapt modules + to any changes that are made. (We have more modules to adapt than you + do).

+ +

A few notes on general pedagogical style here. In the interest of + conciseness, all structure declarations here are incomplete -- the real + ones have more slots that I'm not telling you about. For the most part, + these are reserved to one component of the server core or another, and + should be altered by modules with caution. However, in some cases, they + really are things I just haven't gotten around to yet. Welcome to the + bleeding edge.

+ +

Finally, here's an outline, to give you some bare idea of what's coming + up, and in what order:

+ + +
+ +
+
+

Basic concepts

+

We begin with an overview of the basic concepts behind the API, and how + they are manifested in the code.

+ +

Handlers, Modules, and Requests

+

Apache breaks down request handling into a series of steps, more or + less the same way the Netscape server API does (although this API has a + few more stages than NetSite does, as hooks for stuff I thought might be + useful in the future). These are:

+ +
    +
  • URI -> Filename translation
  • +
  • Auth ID checking [is the user who they say they are?]
  • +
  • Auth access checking [is the user authorized here?]
  • +
  • Access checking other than auth
  • +
  • Determining MIME type of the object requested
  • +
  • `Fixups' -- there aren't any of these yet, but the phase is intended + as a hook for possible extensions like SetEnv, which don't really fit well elsewhere.
  • +
  • Actually sending a response back to the client.
  • +
  • Logging the request
  • +
+ +

These phases are handled by looking at each of a succession of + modules, looking to see if each of them has a handler for the + phase, and attempting to invoke it if so. The handler can typically do one + of three things:

+ +
    +
  • Handle the request, and indicate that it has done so by + returning the magic constant OK.
  • + +
  • Decline to handle the request, by returning the magic integer + constant DECLINED. In this case, the server behaves in all + respects as if the handler simply hadn't been there.
  • + +
  • Signal an error, by returning one of the HTTP error codes. This + terminates normal handling of the request, although an ErrorDocument may + be invoked to try to mop up, and it will be logged in any case.
  • +
+ +

Most phases are terminated by the first module that handles them; + however, for logging, `fixups', and non-access authentication checking, + all handlers always run (barring an error). Also, the response phase is + unique in that modules may declare multiple handlers for it, via a + dispatch table keyed on the MIME type of the requested object. Modules may + declare a response-phase handler which can handle any request, + by giving it the key */* (i.e., a wildcard MIME type + specification). However, wildcard handlers are only invoked if the server + has already tried and failed to find a more specific response handler for + the MIME type of the requested object (either none existed, or they all + declined).
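The wildcard fallback for response handlers can be sketched in plain C. This is a minimal illustration, not the server's code: the `mock_*` types, the `MOCK_*` constants, and the sample handlers are all hypothetical stand-ins for `request_rec`, `handler_rec`, and the real return codes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real request_rec and handler_rec
 * carry many more fields. */
typedef struct { const char *content_type; } mock_request;
typedef int (*mock_handler_fn)(mock_request *);
typedef struct {
    const char *content_type;   /* dispatch key, NULL ends the table */
    mock_handler_fn handler;
} mock_handler_rec;

enum { MOCK_OK = 0, MOCK_DECLINED = -1 };   /* illustrative values */

/* Try every handler registered under `key`; stop at the first one
 * that does not decline. */
int mock_run_matching(const mock_handler_rec *tbl, const char *key,
                      mock_request *r)
{
    for (; tbl->content_type != NULL; ++tbl) {
        if (strcmp(tbl->content_type, key) == 0) {
            int rv = tbl->handler(r);
            if (rv != MOCK_DECLINED)
                return rv;
        }
    }
    return MOCK_DECLINED;
}

/* Specific handlers first; wildcard handlers only if none existed
 * or they all declined. */
int mock_invoke_response(const mock_handler_rec *tbl, mock_request *r)
{
    int rv = mock_run_matching(tbl, r->content_type, r);
    if (rv == MOCK_DECLINED)
        rv = mock_run_matching(tbl, "*/*", r);
    return rv;
}

/* Sample handlers, purely illustrative. */
int mock_cgi_handler(mock_request *r)   { (void)r; return MOCK_OK; }
int mock_picky_handler(mock_request *r) { (void)r; return MOCK_DECLINED; }
int mock_any_handler(mock_request *r)   { (void)r; return 404; }

const mock_handler_rec mock_sample_handlers[] = {
    { "text/x-picky",            mock_picky_handler },
    { "application/x-httpd-cgi", mock_cgi_handler },
    { "*/*",                     mock_any_handler },
    { NULL, NULL }
};
```

With this table, a request typed `application/x-httpd-cgi` is served by its specific handler. A request typed `text/x-picky` (whose handler declines) and a request of an unlisted type both fall through to the `*/*` entry.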

+ +

The handlers themselves are functions of one argument (a + request_rec structure; vide infra), which return an integer, + as above.
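Since every handler has the same shape, the per-phase walk over modules can be sketched as a loop over function pointers. The sketch below assumes hypothetical `ph_*` names and return values; it only models the two behaviours described above (stop at the first handler that accepts, or run all handlers barring an error).

```c
#include <stddef.h>

enum { PH_OK = 0, PH_DECLINED = -1 };   /* illustrative values */

/* Hypothetical, heavily reduced request structure. */
typedef struct { const char *uri; int visits; } ph_request;
typedef int (*ph_handler)(ph_request *);

/* Walk the handlers the modules registered for one phase.
 * run_all == 0: the first handler that returns PH_OK ends the phase.
 * run_all != 0: every handler runs, unless one signals an error.
 * A NULL entry means the module has no handler for this phase. */
int ph_run_phase(ph_handler const *handlers, size_t n, int run_all,
                 ph_request *r)
{
    int result = PH_DECLINED;
    for (size_t i = 0; i < n; ++i) {
        if (handlers[i] == NULL)
            continue;
        int rv = handlers[i](r);
        if (rv == PH_DECLINED)
            continue;               /* as if the handler were absent */
        if (rv != PH_OK)
            return rv;              /* error code aborts the phase */
        result = PH_OK;
        if (!run_all)
            return PH_OK;
    }
    return result;
}

/* Sample handlers, purely illustrative. */
int ph_count(ph_request *r)   { r->visits++; return PH_OK; }
int ph_decline(ph_request *r) { (void)r; return PH_DECLINED; }
int ph_forbid(ph_request *r)  { (void)r; return 403; }
```

Calling `ph_run_phase` with `run_all` set to 0 models phases such as filename translation. Setting it to 1 models logging and fixups, where all handlers get a turn.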

+ + +

A brief tour of a module

+

At this point, we need to explain the structure of a module. Our + candidate will be one of the messier ones, the CGI module -- this handles + both CGI scripts and the ScriptAlias config file command. It's actually a great deal + more complicated than most modules, but if we're going to have only one + example, it might as well be the one with its fingers in every place.

+ +

Let's begin with handlers. In order to handle the CGI scripts, the + module declares a response handler for them. Because of ScriptAlias, it also has handlers for the + name translation phase (to recognize ScriptAliased URIs) and the type-checking phase (any + ScriptAliased request is typed + as a CGI script).

+ +

The module needs to maintain some per (virtual) server information, + namely, the ScriptAliases in + effect; the module structure therefore contains pointers to a function + which builds these structures, and to another which combines two of them + (in case the main server and a virtual server both have ScriptAliases declared).

+ +

Finally, this module contains code to handle the ScriptAlias command itself. This particular + module only declares one command, but there could be more, so modules have + command tables which declare their commands, and describe where + they are permitted, and how they are to be invoked.

+ +

A final note on the declared types of the arguments of some of these + commands: a pool is a pointer to a resource pool + structure; these are used by the server to keep track of the memory which + has been allocated, files opened, etc., either to service a + particular request, or to handle the process of configuring itself. That + way, when the request is over (or, for the configuration pool, when the + server is restarting), the memory can be freed, and the files closed, + en masse, without anyone having to write explicit code to track + them all down and dispose of them. Also, a cmd_parms + structure contains various information about the config file being read, + and other status information, which is sometimes of use to the function + which processes a config-file command (such as ScriptAlias). With no further ado, the + module itself:

+ +

+ /* Declarations of handlers. */
+
+ int translate_scriptalias (request_rec *);
+ int type_scriptalias (request_rec *);
+ int cgi_handler (request_rec *);
+
+ /* Subsidiary dispatch table for response-phase
+  * handlers, by MIME type */
+
+ handler_rec cgi_handlers[] = {
+ + { "application/x-httpd-cgi", cgi_handler },
+ { NULL }
+
+ };
+
+ /* Declarations of routines to manipulate the
+  * module's configuration info. Note that these are
+  * returned, and passed in, as void *'s; the server
+  * core keeps track of them, but it doesn't, and can't,
+  * know their internal structure.
+  */
+
+ void *make_cgi_server_config (pool *);
+ void *merge_cgi_server_config (pool *, void *, void *);
+
+ /* Declarations of routines to handle config-file commands */
+
+ extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake, + char *real);
+
+ command_rec cgi_cmds[] = {
+ + { "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
+ "a fakename and a realname"},
+ { NULL }
+
+ };
+
+ module cgi_module = { +

  STANDARD_MODULE_STUFF,
+  NULL,                     /* initializer */
+  NULL,                     /* dir config creator */
+  NULL,                     /* dir merger */
+  make_cgi_server_config,   /* server config */
+  merge_cgi_server_config,  /* merge server config */
+  cgi_cmds,                 /* command table */
+  cgi_handlers,             /* handlers */
+  translate_scriptalias,    /* filename translation */
+  NULL,                     /* check_user_id */
+  NULL,                     /* check auth */
+  NULL,                     /* check access */
+  type_scriptalias,         /* type_checker */
+  NULL,                     /* fixups */
+  NULL,                     /* logger */
+  NULL                      /* header parser */
+};
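The two server-config slots in the module structure can be illustrated with a reduced sketch of a "make" and a "merge" routine. Everything named `sk_*` is hypothetical. The sketch uses `calloc` where a real module would allocate from the `pool` it is handed, and the merge order (virtual-server aliases take precedence over the main server's) is an assumption made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define SK_MAX_ALIASES 8

typedef struct { const char *fake; const char *real; } sk_alias;
typedef struct { sk_alias aliases[SK_MAX_ALIASES]; int count; } sk_server_conf;

/* Build an empty per-server configuration. Returned as void *, since
 * the server core never looks inside it. */
void *sk_make_server_config(void)
{
    return calloc(1, sizeof(sk_server_conf));
}

/* What a ScriptAlias-style command handler would do with its two
 * arguments. Returns -1 when the fixed-size table is full. */
int sk_add_alias(void *conf, const char *fake, const char *real)
{
    sk_server_conf *c = conf;
    if (c->count >= SK_MAX_ALIASES)
        return -1;
    c->aliases[c->count].fake = fake;
    c->aliases[c->count].real = real;
    c->count++;
    return 0;
}

/* Combine the main server's config with a virtual server's.
 * Assumption: virtual-server entries go first so they win lookups. */
void *sk_merge_server_config(void *basev, void *overridesv)
{
    sk_server_conf *base = basev, *over = overridesv;
    sk_server_conf *m = calloc(1, sizeof *m);
    if (m == NULL)
        return NULL;
    for (int i = 0; i < over->count && m->count < SK_MAX_ALIASES; ++i)
        m->aliases[m->count++] = over->aliases[i];
    for (int i = 0; i < base->count && m->count < SK_MAX_ALIASES; ++i)
        m->aliases[m->count++] = base->aliases[i];
    return m;
}

/* Return the real path prefix of the first alias matching the URI,
 * or NULL if no alias applies. */
const char *sk_lookup(const void *conf, const char *uri)
{
    const sk_server_conf *c = conf;
    for (int i = 0; i < c->count; ++i) {
        size_t len = strlen(c->aliases[i].fake);
        if (strncmp(uri, c->aliases[i].fake, len) == 0)
            return c->aliases[i].real;
    }
    return NULL;
}
```

A name-translation handler would consult the merged structure through something like `sk_lookup` to recognize aliased URIs.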
+ +
+
+

How handlers work

+

The sole argument to handlers is a request_rec structure. + This structure describes a particular request which has been made to the + server, on behalf of a client. In most cases, each connection to the + client generates only one request_rec structure.

+ +

A brief tour of the request_rec

+

The request_rec contains pointers to a resource pool + which will be cleared when the server is finished handling the request; + to structures containing per-server and per-connection information, and + most importantly, information on the request itself.

+ +

The most important such information is a small set of character strings + describing attributes of the object being requested, including its URI, + filename, content-type and content-encoding (these being filled in by the + translation and type-check handlers which handle the request, + respectively).

+ +

Other commonly used data items are tables giving the MIME headers on + the client's original request, MIME headers to be sent back with the + response (which modules can add to at will), and environment variables for + any subprocesses which are spawned off in the course of servicing the + request. These tables are manipulated using the ap_table_get + and ap_table_set routines.
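The get/set behaviour of such tables can be modelled with a small key/value array. This is only a sketch under stated assumptions: the `tb_*` names are hypothetical, keys are compared without regard to case (as HTTP header names are case-insensitive), and the sketch stores the caller's pointers rather than copying strings into a pool.

```c
#include <ctype.h>
#include <stddef.h>

#define TB_MAX 16

typedef struct { const char *key; const char *val; } tb_entry;
typedef struct { tb_entry e[TB_MAX]; int n; } tb_table;

/* Case-insensitive key comparison. */
static int tb_same_key(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b;   /* both must end together */
}

/* Return the value stored under key, or NULL if there is none. */
const char *tb_get(const tb_table *t, const char *key)
{
    for (int i = 0; i < t->n; ++i)
        if (tb_same_key(t->e[i].key, key))
            return t->e[i].val;
    return NULL;
}

/* Replace the value under key, or append a new entry.
 * Returns -1 when the fixed-size table is full. */
int tb_set(tb_table *t, const char *key, const char *val)
{
    for (int i = 0; i < t->n; ++i) {
        if (tb_same_key(t->e[i].key, key)) {
            t->e[i].val = val;
            return 0;
        }
    }
    if (t->n >= TB_MAX)
        return -1;
    t->e[t->n].key = key;
    t->e[t->n].val = val;
    t->n++;
    return 0;
}
```

A module adding a response header would do the equivalent of `tb_set(headers_out, "Location", url)`. Setting the same key again replaces the earlier value instead of adding a second entry.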

+ +
+

Note that the Content-type header value cannot + be set by module content-handlers using the ap_table_*() + routines. Rather, it is set by pointing the content_type + field in the request_rec structure to an appropriate + string, e.g.,

+

+ r->content_type = "text/html"; +

+
+ +

Finally, there are pointers to two data structures which, in turn, + point to per-module configuration structures. Specifically, these hold + pointers to the data structures which the module has built to describe + the way it has been configured to operate in a given directory (via + .htaccess files or <Directory> sections), and for private data it has built in the + course of servicing the request (so modules' handlers for one phase can + pass `notes' to their handlers for other phases). There is another such + configuration vector in the server_rec data structure pointed + to by the request_rec, which contains per (virtual) server + configuration data.
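A configuration vector of this kind can be pictured as an array with one `void *` slot per module. The sketch below uses hypothetical `cv_*` names and a fixed module count; it assumes each module knows its own slot index, and it is not the server's actual accessor code.

```c
#include <stddef.h>

#define CV_MAX_MODULES 4

/* Hypothetical module descriptor: only the slot index matters here. */
typedef struct { int module_index; const char *name; } cv_module;

/* One opaque pointer per module; what it points to is the
 * module's own business. */
typedef struct { void *slot[CV_MAX_MODULES]; } cv_vector;

void *cv_get(const cv_vector *v, const cv_module *m)
{
    return v->slot[m->module_index];
}

void cv_set(cv_vector *v, const cv_module *m, void *conf)
{
    v->slot[m->module_index] = conf;
}
```

A handler for an early phase could store a private note with `cv_set` on the request's vector, and the same module's handler for a later phase could read it back with `cv_get`, casting the result to its own structure type.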

+ +

Here is an abridged declaration, giving the fields most commonly + used:

+ +

+ struct request_rec {
+
+ pool *pool;
+ conn_rec *connection;
+ server_rec *server;
+
+ /* What object is being requested */
+
+ char *uri;
+ char *filename;
+ char *path_info; +

char *args;           /* QUERY_ARGS, if any */
+struct stat finfo;    /* Set by server core;
+                       * st_mode set to zero if no such file */

+ char *content_type;
+ char *content_encoding;
+
+ /* MIME header environments, in and out. Also,
+  * an array containing environment variables to
+  * be passed to subprocesses, so people can write
+  * modules to add to that environment.
+  *
+  * The difference between headers_out and
+  * err_headers_out is that the latter are printed
+  * even on error, and persist across internal
+  * redirects (so the headers printed for
+  * ErrorDocument handlers will have + them).
+  */
+
+ table *headers_in;
+ table *headers_out;
+ table *err_headers_out;
+ table *subprocess_env;
+
+ /* Info about the request itself... */
+
+

int header_only;     /* HEAD request, as opposed to GET */
+char *protocol;      /* Protocol, as given to us, or HTTP/0.9 */
+char *method;        /* GET, HEAD, POST, etc. */
+int method_number;   /* M_GET, M_POST, etc. */

+ /* Info for logging */
+
+ char *the_request;
+ int bytes_sent;
+
+ /* A flag which modules can set, to indicate that
+  * the data being returned is volatile, and clients
+  * should be told not to cache it.
+  */
+
+ int no_cache;
+
+ /* Various other config info which may change
+  * with .htaccess files
+  * These are config vectors, with one void*
+  * pointer for each module (the thing pointed
+  * to being the module's business).
+  */
+
+

void *per_dir_config;   /* Options set in config files, etc. */
+void *request_config;   /* Notes on *this* request */

+ }; +

+ + +

Where request_rec structures come from

+

Most request_rec structures are built by reading an HTTP + request from a client, and filling in the fields. However, there are a + few exceptions:

+ +
    +
  • If the request is to an imagemap, a type map (i.e., a + *.var file), or a CGI script which returned a local + `Location:', then the resource which the user requested is going to be + ultimately located by some URI other than what the client originally + supplied. In this case, the server does an internal redirect, + constructing a new request_rec for the new URI, and + processing it almost exactly as if the client had requested the new URI + directly.
  • + +
  • If some handler signaled an error, and an ErrorDocument + is in scope, the same internal redirect machinery comes into play.
  • + +
  • Finally, a handler occasionally needs to investigate `what would + happen if' some other request were run. For instance, the directory + indexing module needs to know what MIME type would be assigned to a + request for each directory entry, in order to figure out what icon to + use.

    + +

    Such handlers can construct a sub-request, using the + functions ap_sub_req_lookup_file, + ap_sub_req_lookup_uri, and ap_sub_req_method_uri; + these construct a new request_rec structure and process it + as you would expect, up to but not including the point of actually sending + a response. (These functions skip over the access checks if the + sub-request is for a file in the same directory as the original + request).

    + +

    (Server-side includes work by building sub-requests and then actually + invoking the response handler for them, via the function + ap_run_sub_req).

    +
  • +
+ + +

Handling requests, declining, and returning + error codes

+

As discussed above, each handler, when invoked to handle a particular + request_rec, has to return an int to indicate + what happened. That can either be

+ +
    +
  • OK -- the request was handled successfully. This may or + may not terminate the phase.
  • + +
  • DECLINED -- no erroneous condition exists, but the module + declines to handle the phase; the server tries to find another.
  • + +
  • an HTTP error code, which aborts handling of the request.
  • +
+ +

Note that if the error code returned is REDIRECT, then + the module should put a Location in the request's + headers_out, to indicate where the client should be + redirected to.

+ + +

Special considerations for response + handlers

+

Handlers for most phases do their work by simply setting a few fields + in the request_rec structure (or, in the case of access + checkers, simply by returning the correct error code). However, response + handlers have to actually send a response back to the client.

+ +

They should begin by sending an HTTP response header, using the + function ap_send_http_header. (You don't have to do anything + special to skip sending the header for HTTP/0.9 requests; the function + figures out on its own that it shouldn't do anything). If the request is + marked header_only, that's all they should do; they should + return after that, without attempting any further output.

+ +

Otherwise, they should produce a response body which responds to the + client as appropriate. The primitives for this are ap_rputc + and ap_rprintf, for internally generated output, and + ap_send_fd, to copy the contents of some FILE * + straight to the client.

+ +

At this point, you should more or less understand the following piece + of code, which is the handler which handles GET requests + which have no more specific handler; it also shows how conditional + GETs can be handled, if it's desirable to do so in a + particular response handler -- ap_set_last_modified checks + against the If-modified-since value supplied by the client, + if any, and returns an appropriate code (which will, if nonzero, be + USE_LOCAL_COPY). No similar considerations apply for + ap_set_content_length, but it returns an error code for + symmetry.

+ +

+ int default_handler (request_rec *r)
+ {
+ + int errstatus;
+ FILE *f;
+
+ if (r->method_number != M_GET) return DECLINED;
+ if (r->finfo.st_mode == 0) return NOT_FOUND;
+
+ if ((errstatus = ap_set_content_length (r, r->finfo.st_size))
+     || + (errstatus = ap_set_last_modified (r, r->finfo.st_mtime)))
+ return errstatus;
+
+ f = ap_pfopen (r->pool, r->filename, "r");
+
+ if (f == NULL) {
+ + log_reason("file permissions deny server access", r->filename, r);
+ return FORBIDDEN;
+
+ }
+
+ register_timeout ("send", r);
+ ap_send_http_header (r);
+
+ if (!r->header_only) ap_send_fd (f, r);
+ ap_pfclose (r->pool, f);
+ return OK;
+
+ } +

+ +

Finally, if all of this is too much of a challenge, there are a few + ways out of it. First off, as shown above, a response handler which has + not yet produced any output can simply return an error code, in which + case the server will automatically produce an error response. Secondly, + it can punt to some other handler by invoking + ap_internal_redirect, which is how the internal redirection + machinery discussed above is invoked. A response handler which has + internally redirected should always return OK.

+ +

(Invoking ap_internal_redirect from handlers which are + not response handlers will lead to serious confusion).

+ + +

Special considerations for authentication + handlers

+

Stuff that should be discussed here in detail:

+ +
    +
  • Authentication-phase handlers not invoked unless auth is + configured for the directory.
  • + +
  • Common auth configuration stored in the core per-dir + configuration; it has accessors ap_auth_type, + ap_auth_name, and ap_requires.
  • + +
  • Common routines, to handle the protocol end of things, at + least for HTTP basic authentication + (ap_get_basic_auth_pw, which sets the + connection->user structure field + automatically, and ap_note_basic_auth_failure, + which arranges for the proper WWW-Authenticate: + header to be sent back).
  • +
+ + +

Special considerations for logging + handlers

+

When a request has internally redirected, there is the question of + what to log. Apache handles this by bundling the entire chain of redirects + into a list of request_rec structures which are threaded + through the r->prev and r->next pointers. + The request_rec which is passed to the logging handlers in + such cases is the one which was originally built for the initial request + from the client; note that the bytes_sent field will only be + correct in the last request in the chain (the one for which a response was + actually sent).
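A logging handler that wants the byte count therefore has to walk from the request it was given to the end of the chain. A minimal sketch, assuming a hypothetical `lg_request` type that keeps only the fields discussed here:

```c
#include <stddef.h>

/* Hypothetical reduced request record: just the redirect links and
 * the logging field discussed in the text. */
typedef struct lg_request {
    const char *uri;
    long bytes_sent;
    struct lg_request *prev;
    struct lg_request *next;
} lg_request;

/* Follow the next pointers from the original request to the one for
 * which a response was actually sent. */
const lg_request *lg_last(const lg_request *r)
{
    while (r->next != NULL)
        r = r->next;
    return r;
}
```

A logger would typically report the `uri` of the request it was handed and the `bytes_sent` of `lg_last(r)`.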

+ +
+
+

Resource allocation and resource pools

+

One of the problems of writing and designing a server-pool server is + that of preventing leakage, that is, allocating resources (memory, open + files, etc.), without subsequently releasing them. The resource + pool machinery is designed to make it easy to prevent this from happening, + by allowing resources to be allocated in such a way that they are + automatically released when the server is done with them.

+ +

The way this works is as follows: the memory which is allocated, files + opened, etc., to deal with a particular request are tied to a + resource pool which is allocated for the request. The pool is a + data structure which itself tracks the resources in question.

+ +

When the request has been processed, the pool is cleared. At + that point, all the memory associated with it is released for reuse, all + files associated with it are closed, and any other clean-up functions which + are associated with the pool are run. When this is over, we can be confident + that all the resources tied to the pool have been released, and that none of + them have leaked.

+ +

Server restarts, and allocation of memory and resources for per-server + configuration, are handled in a similar way. There is a configuration + pool, which keeps track of resources which were allocated while reading + the server configuration files, and handling the commands therein (for + instance, the memory that was allocated for per-server module configuration, + log files and other files that were opened, and so forth). When the server + restarts, and has to reread the configuration files, the configuration pool + is cleared, and so the memory and file descriptors which were taken up by + reading them the last time are made available for reuse.

+ +

It should be noted that use of the pool machinery isn't generally + obligatory, except for situations like logging handlers, where you really + need to register cleanups to make sure that the log file gets closed when + the server restarts (this is most easily done by using the function ap_pfopen, which also arranges for the + underlying file descriptor to be closed before any child processes, such as + for CGI scripts, are execed), or in case you are using the + timeout machinery (which isn't yet even documented here). However, there are + two benefits to using it: resources allocated to a pool never leak (even if + you allocate a scratch string, and just forget about it); also, for memory + allocation, ap_palloc is generally faster than + malloc.

+ +

We begin here by describing how memory is allocated to pools, and then + discuss how other resources are tracked by the resource pool machinery.

+ +

Allocation of memory in pools

+

Memory is allocated to pools by calling the function + ap_palloc, which takes two arguments, one being a pointer to + a resource pool structure, and the other being the amount of memory to + allocate (in chars). Within handlers for handling requests, + the most common way of getting a resource pool structure is by looking at + the pool slot of the relevant request_rec; hence + the repeated appearance of the following idiom in module code:

+ +

+ int my_handler(request_rec *r)
+ {
+ + struct my_structure *foo;
+ ...
+
+ foo = (struct my_structure *)ap_palloc (r->pool, sizeof(struct my_structure));
+
+ } +

+ +

Note that there is no ap_pfree -- + ap_palloced memory is freed only when the associated resource + pool is cleared. This means that ap_palloc does not have to + do as much accounting as malloc(); all it does in the typical + case is to round up the size, bump a pointer, and do a range check.
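The "round up, bump a pointer, range check" fast path can be sketched as follows. The `bp_*` names and the 8-byte alignment are assumptions chosen for illustration. The real allocator fetches a fresh block when the current one is full, whereas this sketch simply returns `NULL`.

```c
#include <stddef.h>

#define BP_ALIGN 8u   /* assumed alignment; must be a power of two */

/* One contiguous block of pool memory. The caller supplies the
 * storage, which should itself be suitably aligned. */
typedef struct {
    unsigned char *base;
    size_t used;
    size_t cap;
} bp_block;

void *bp_alloc(bp_block *b, size_t n)
{
    /* Round the request up to a multiple of BP_ALIGN. */
    size_t rounded = (n + (BP_ALIGN - 1)) & ~(size_t)(BP_ALIGN - 1);

    /* Range check: reject arithmetic overflow and lack of room. */
    if (rounded < n || rounded > b->cap - b->used)
        return NULL;

    /* Bump the pointer. There is no per-allocation header, which is
     * why individual allocations cannot be freed. */
    void *p = b->base + b->used;
    b->used += rounded;
    return p;
}
```

Releasing everything at once amounts to resetting `used` to zero, which is the sketch's analogue of clearing the pool.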

+ +

(It also raises the possibility that heavy use of + ap_palloc could cause a server process to grow excessively + large. There are two ways to deal with this, which are dealt with below; + briefly, you can use malloc, and try to be sure that all of + the memory gets explicitly freed, or you can allocate a + sub-pool of the main pool, allocate your memory in the sub-pool, and clear + it out periodically. The latter technique is discussed in the section + on sub-pools below, and is used in the directory-indexing code, in order + to avoid excessive storage allocation when listing directories with + thousands of files).

+ + +

Allocating initialized memory

+

There are functions which allocate initialized memory, and are + frequently useful. The function ap_pcalloc has the same + interface as ap_palloc, but clears out the memory it + allocates before it returns it. The function ap_pstrdup + takes a resource pool and a char * as arguments, and + allocates memory for a copy of the string the pointer points to, returning + a pointer to the copy. Finally ap_pstrcat is a varargs-style + function, which takes a pointer to a resource pool, and at least two + char * arguments, the last of which must be + NULL. It allocates enough memory to fit copies of each of + the strings, as a unit; for instance:

+ +

+ ap_pstrcat (r->pool, "foo", "/", "bar", NULL); +

+ +

returns a pointer to 8 bytes worth of memory, initialized to + "foo/bar".
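The varargs convention (a `NULL`-terminated list of strings, measured first and then copied into a single allocation) can be sketched in standard C. The function name `sk_strcat_all` is hypothetical, and the sketch uses `malloc` where `ap_pstrcat` would allocate from the pool passed as its first argument.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate a NULL-terminated argument list of strings into one
 * freshly allocated buffer. Returns NULL if allocation fails. */
char *sk_strcat_all(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t total = 1;               /* room for the terminating NUL */

    /* First pass: add up the lengths. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        total += strlen(s);
    va_end(ap);

    char *out = malloc(total);
    if (out == NULL)
        return NULL;

    /* Second pass: copy each piece into place. */
    char *p = out;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *)) {
        size_t len = strlen(s);
        memcpy(p, s, len);
        p += len;
    }
    va_end(ap);
    *p = '\0';
    return out;
}
```

Called as `sk_strcat_all("foo", "/", "bar", (const char *)NULL)`, the sketch allocates 8 bytes: seven characters plus the terminating NUL. The explicit cast on the sentinel keeps the call portable on platforms where a bare `NULL` is an integer constant.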

+ + +

Commonly-used pools in the Apache Web + server

+

A pool is really defined by its lifetime more than anything else. + There are some static pools in http_main which are passed to various + non-http_main functions as arguments at opportune times. Here they + are:

+ +
+
permanent_pool
+
never passed to anything else, this is the ancestor of all pools
+ +
pconf
+
+
    +
  • subpool of permanent_pool
  • + +
  • created at the beginning of a config "cycle"; exists + until the server is terminated or restarts; passed to all + config-time routines, either via cmd->pool, or as the + "pool *p" argument on those which don't take pools
  • + +
  • passed to the module init() functions
  • +
+
+ +
ptemp
+
+
    +
  • sorry I lie, this pool isn't called this currently in + 1.3, I renamed it this in my pthreads development. I'm + referring to the use of ptrans in the parent... contrast + this with the later definition of ptrans in the + child.
  • + +
  • subpool of permanent_pool
  • + +
  • created at the beginning of a config "cycle"; exists + until the end of config parsing; passed to config-time + routines via cmd->temp_pool. Somewhat of a + "bastard child" because it isn't available everywhere. + Used for temporary scratch space which may be needed by + some config routines but which is deleted at the end of + config.
  • +
+
+ +
pchild
+
+
    +
  • subpool of permanent_pool
  • + +
  • created when a child is spawned (or a thread is + created); lives until that child (thread) is + destroyed
  • + +
  • passed to the module child_init functions
  • + +
  • destruction happens right after the child_exit + functions are called... (which may explain why I think + child_exit is redundant and unneeded)
  • +
+
+ +
ptrans
+
+
    +
  • should be a subpool of pchild, but currently is a + subpool of permanent_pool, see above
  • + +
  • cleared by the child before going into the accept() + loop to receive a connection
  • + +
  • used as connection->pool
  • +
+
+ +
r->pool
+
+
    +
  • for the main request this is a subpool of + connection->pool; for subrequests it is a subpool of + the parent request's pool.
  • + +
  • exists until the end of the request (i.e., + ap_destroy_sub_req, or in child_main after + process_request has finished)
  • + +
  • note that r itself is allocated from r->pool; + i.e., r->pool is first created and then r is + the first thing palloc()d from it
  • +
+
+
+ +

For almost everything folks do, r->pool is the pool to + use. But you can see how other lifetimes, such as pchild, are useful to + some modules... such as modules that need to open a database connection + once per child, and wish to clean it up when the child dies.

+ +

You can also see how some bugs have manifested themselves, such as + setting connection->user to a value from + r->pool -- in this case connection exists for the + lifetime of ptrans, which is longer than + r->pool (especially if r->pool is a + subrequest!). So the correct thing to do is to allocate from + connection->pool.

+ +

And there was another interesting bug in mod_include + / mod_cgi. You'll see in those that they do this test + to decide if they should use r->pool or + r->main->pool. In this case the resource that they are + registering for cleanup is a child process. If it were registered in + r->pool, then the code would wait() for the + child when the subrequest finishes. With mod_include this + could be any old #include, and the delay can be up to 3 + seconds... and happened quite frequently. Instead the subprocess is + registered in r->main->pool which causes it to be + cleaned up when the entire request is done -- i.e., after the + output has been sent to the client and logging has happened.

+ + +

Tracking open files, etc.

+

As indicated above, resource pools are also used to track other sorts + of resources besides memory. The most common are open files. The routine + which is typically used for this is ap_pfopen, which takes a + resource pool and two strings as arguments; the strings are the same as + the typical arguments to fopen, e.g.,

+ +

+ ...
+ FILE *f = ap_pfopen (r->pool, r->filename, "r");
+
+ if (f == NULL) { ... } else { ... }
+

+ +

There is also an ap_popenf routine, which parallels the + lower-level open system call. Both of these routines arrange + for the file to be closed when the resource pool in question is + cleared.

+ +

Unlike the case for memory, there are functions to close files + allocated with ap_pfopen, and ap_popenf, namely + ap_pfclose and ap_pclosef. (This is because, on + many systems, the number of files which a single process can have open is + quite limited). It is important to use these functions to close files + allocated with ap_pfopen and ap_popenf, since to + do otherwise could cause fatal errors on systems such as Linux, which + react badly if the same FILE* is closed more than once.

+ +

(Using the close functions is not mandatory, since the + file will eventually be closed regardless, but you should consider it in + cases where your module is opening, or could open, a lot of files).

+ + +

Other sorts of resources -- cleanup functions

+

More text goes here. Describe the cleanup primitives in terms of + which the file stuff is implemented; also, spawn_process.

+ +

Pool cleanups live until clear_pool() is called: clear_pool(a) recursively calls destroy_pool() on all subpools of a; then calls all the cleanups for a; then releases all the memory for a. destroy_pool(a) calls clear_pool(a) and then releases the pool structure itself. That is, clear_pool(a) doesn't delete a; it just frees up all the resources, and you can start using it again immediately.
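The ordering described above can be sketched with a toy pool in plain C. This is only an illustration of the clear/destroy semantics, not the server's implementation: every toy_ name is invented for the sketch, and memory blocks are left out so that only sub-pools and cleanups are modelled.

```c
#include <stdlib.h>

/* Toy model only: the toy_ names are hypothetical, not Apache API. */
typedef void (*toy_cleanup_fn)(void *data);

typedef struct toy_cleanup {
    toy_cleanup_fn fn;
    void *data;
    struct toy_cleanup *next;
} toy_cleanup;

typedef struct toy_pool {
    struct toy_pool *parent;
    struct toy_pool *first_child;
    struct toy_pool *sibling;
    toy_cleanup *cleanups;
} toy_pool;

/* Create a pool; a NULL parent makes a root pool. */
toy_pool *toy_make_sub_pool(toy_pool *parent)
{
    toy_pool *p = calloc(1, sizeof(*p));
    if (p == NULL) return NULL;
    p->parent = parent;
    if (parent != NULL) {
        p->sibling = parent->first_child;
        parent->first_child = p;
    }
    return p;
}

/* Register a cleanup; it is silently dropped if malloc fails. */
void toy_register_cleanup(toy_pool *p, void *data, toy_cleanup_fn fn)
{
    toy_cleanup *c = malloc(sizeof(*c));
    if (c == NULL) return;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
}

void toy_destroy_pool(toy_pool *p);

/* Clear: destroy sub-pools, then run cleanups; the pool stays usable. */
void toy_clear_pool(toy_pool *p)
{
    while (p->first_child != NULL)
        toy_destroy_pool(p->first_child);   /* unlinks the child */
    while (p->cleanups != NULL) {
        toy_cleanup *c = p->cleanups;
        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
    /* A real pool would release its memory blocks here. */
}

/* Destroy: clear, unlink from the parent, free the pool structure. */
void toy_destroy_pool(toy_pool *p)
{
    toy_clear_pool(p);
    if (p->parent != NULL) {
        toy_pool **link = &p->parent->first_child;
        while (*link != p) link = &(*link)->sibling;
        *link = p->sibling;
    }
    free(p);
}

/* Sample cleanup used to observe the ordering: bumps an int counter. */
void toy_count(void *data)
{
    ++*(int *)data;
}
```

Clearing a root pool with one sub-pool runs the cleanups of both and leaves the root ready for new registrations; destroying it afterwards runs whatever was registered since the clear.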

+ + +

Fine control -- creating and dealing with sub-pools, with + a note on sub-requests

+

On rare occasions, too-free use of ap_palloc() and the + associated primitives may result in undesirably profligate resource + allocation. You can deal with such a case by creating a sub-pool, + allocating within the sub-pool rather than the main pool, and clearing or + destroying the sub-pool, which releases the resources which were + associated with it. (This really is a rare situation; the only + case in which it comes up in the standard module set is in case of listing + directories, and then only with very large directories. + Unnecessary use of the primitives discussed here can hair up your code + quite a bit, with very little gain).

+ +

The primitive for creating a sub-pool is ap_make_sub_pool, + which takes another pool (the parent pool) as an argument. When the main + pool is cleared, the sub-pool will be destroyed. The sub-pool may also be + cleared or destroyed at any time, by calling the functions + ap_clear_pool and ap_destroy_pool, respectively. + (The difference is that ap_clear_pool frees resources + associated with the pool, while ap_destroy_pool also + deallocates the pool itself. In the former case, you can allocate new + resources within the pool, and clear it again, and so forth; in the + latter case, it is simply gone).

+ +

One final note -- sub-requests have their own resource pools, which are sub-pools of the resource pool for the main request. The polite way to reclaim the resources associated with a sub-request which you have allocated (using the ap_sub_req_... functions) is ap_destroy_sub_req, which frees the resource pool. Before calling this function, be sure to copy anything that you care about which might be allocated in the sub-request's resource pool into someplace a little less volatile (for instance, the filename in its request_rec structure).

+ +

(Again, under most circumstances, you shouldn't feel obliged to call this function; only 2K of memory or so are allocated for a typical sub-request, and it will be freed anyway when the main request pool is cleared. It is only when you are allocating many, many sub-requests for a single main request that you should seriously consider the ap_destroy_... functions).

+ +
top
+
+

Configuration, commands and the like

+

One of the design goals for this server was to maintain external + compatibility with the NCSA 1.3 server --- that is, to read the same + configuration files, to process all the directives therein correctly, and + in general to be a drop-in replacement for NCSA. On the other hand, another + design goal was to move as much of the server's functionality into modules + which have as little as possible to do with the monolithic server core. The + only way to reconcile these goals is to move the handling of most commands + from the central server into the modules.

+ +

However, just giving the modules command tables is not enough to divorce + them completely from the server core. The server has to remember the + commands in order to act on them later. That involves maintaining data which + is private to the modules, and which can be either per-server, or + per-directory. Most things are per-directory, including in particular access + control and authorization information, but also information on how to + determine file types from suffixes, which can be modified by + AddType and ForceType directives, and so forth. In general, + the governing philosophy is that anything which can be made + configurable by directory should be; per-server information is generally + used in the standard set of modules for information like + Aliases and Redirects which come into play before the + request is tied to a particular place in the underlying file system.

+ +

Another requirement for emulating the NCSA server is being able to handle + the per-directory configuration files, generally called + .htaccess files, though even in the NCSA server they can + contain directives which have nothing at all to do with access control. + Accordingly, after URI -> filename translation, but before performing any + other phase, the server walks down the directory hierarchy of the underlying + filesystem, following the translated pathname, to read any + .htaccess files which might be present. The information which + is read in then has to be merged with the applicable information + from the server's own config files (either from the <Directory> sections in + access.conf, or from defaults in srm.conf, which + actually behaves for most purposes almost exactly like <Directory + />).

+ +

Finally, after having served a request which involved reading + .htaccess files, we need to discard the storage allocated for + handling them. That is solved the same way it is solved wherever else + similar problems come up, by tying those structures to the per-transaction + resource pool.

+ +

Per-directory configuration structures

+

Let's look at how all of this plays out in mod_mime.c, which defines the file typing handler which emulates the NCSA server's behavior of determining file types from suffixes. What we'll be looking at, here, is the code which implements the AddType and AddEncoding commands. These commands can appear in .htaccess files, so they must be handled in the module's private per-directory data, which in fact consists of two separate tables for MIME types and encoding information, and is declared as follows:

+ +
typedef struct {
+    table *forced_types;      /* Additional AddTyped stuff */
+    table *encoding_types;    /* Added with AddEncoding... */
+} mime_dir_config;
+ +

When the server is reading a configuration file, or <Directory> section, which includes + one of the MIME module's commands, it needs to create a + mime_dir_config structure, so those commands have something + to act on. It does this by invoking the function it finds in the module's + `create per-dir config slot', with two arguments: the name of the + directory to which this configuration information applies (or + NULL for srm.conf), and a pointer to a + resource pool in which the allocation should happen.

+ +

(If we are reading a .htaccess file, that resource pool + is the per-request resource pool for the request; otherwise it is a + resource pool which is used for configuration data, and cleared on + restarts. Either way, it is important for the structure being created to + vanish when the pool is cleared, by registering a cleanup on the pool if + necessary).

+ +

For the MIME module, the per-dir config creation function just ap_pallocs the structure above, and creates a couple of tables to fill it. That looks like this:

+ +

+void *create_mime_dir_config (pool *p, char *dummy)
+{
+    mime_dir_config *new =
+        (mime_dir_config *) ap_palloc (p, sizeof(mime_dir_config));
+
+    new->forced_types = ap_make_table (p, 4);
+    new->encoding_types = ap_make_table (p, 4);
+
+    return new;
+}

+ +

Now, suppose we've just read in a .htaccess file. We + already have the per-directory configuration structure for the next + directory up in the hierarchy. If the .htaccess file we just + read in didn't have any AddType + or AddEncoding commands, its + per-directory config structure for the MIME module is still valid, and we + can just use it. Otherwise, we need to merge the two structures + somehow.

+ +

To do that, the server invokes the module's per-directory config merge + function, if one is present. That function takes three arguments: the two + structures being merged, and a resource pool in which to allocate the + result. For the MIME module, all that needs to be done is overlay the + tables from the new per-directory config structure with those from the + parent:

+ +

+void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
+{
+    mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
+    mime_dir_config *subdir = (mime_dir_config *)subdirv;
+    mime_dir_config *new =
+        (mime_dir_config *)ap_palloc (p, sizeof(mime_dir_config));
+
+    new->forced_types = ap_overlay_tables (p, subdir->forced_types,
+                                           parent_dir->forced_types);
+    new->encoding_types = ap_overlay_tables (p, subdir->encoding_types,
+                                             parent_dir->encoding_types);
+
+    return new;
+}

+ +

As a note -- if there is no per-directory merge function present, the + server will just use the subdirectory's configuration info, and ignore + the parent's. For some modules, that works just fine (e.g., for + the includes module, whose per-directory configuration information + consists solely of the state of the XBITHACK), and for those + modules, you can just not declare one, and leave the corresponding + structure slot in the module itself NULL.
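To make the overlay step of the merge concrete, here is a small self-contained sketch. It assumes that an overlay places the subdirectory's entries ahead of the parent's and that lookups return the first match; the toy_table type and its functions are invented for the illustration and are not the server's table API.

```c
#include <stddef.h>
#include <string.h>

/* Toy fixed-size table; hypothetical, not the Apache table type. */
typedef struct { const char *key; const char *val; } toy_entry;

#define TOY_MAX 16
typedef struct { toy_entry e[TOY_MAX]; size_t n; } toy_table;

/* Append an entry; extra entries are dropped once the table is full. */
void toy_table_add(toy_table *t, const char *key, const char *val)
{
    if (t->n < TOY_MAX) {
        t->e[t->n].key = key;
        t->e[t->n].val = val;
        t->n++;
    }
}

/* First match wins, so earlier entries shadow later ones. */
const char *toy_table_get(const toy_table *t, const char *key)
{
    size_t i;
    for (i = 0; i < t->n; i++)
        if (strcmp(t->e[i].key, key) == 0) return t->e[i].val;
    return NULL;
}

/* Overlay entries go first, then the base entries. */
toy_table toy_overlay(const toy_table *overlay, const toy_table *base)
{
    toy_table out;
    size_t i;
    memset(&out, 0, sizeof(out));
    for (i = 0; i < overlay->n; i++)
        toy_table_add(&out, overlay->e[i].key, overlay->e[i].val);
    for (i = 0; i < base->n; i++)
        toy_table_add(&out, base->e[i].key, base->e[i].val);
    return out;
}
```

With a parent mapping txt and html and a subdirectory remapping only txt, the merged table answers txt from the subdirectory and html from the parent, which is the behavior the merge function above relies on.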

+ + +

Command handling

+

Now that we have these structures, we need to be able to figure out how to fill them. That involves processing the actual AddType and AddEncoding commands. To find commands, the server looks in the module's command table. That table contains information on how many arguments the commands take, and in what formats, where they are permitted, and so forth. That information is sufficient to allow the server to invoke most command-handling functions with pre-parsed arguments. Without further ado, let's look at the AddType command handler, which looks like this (the AddEncoding command looks basically the same, and won't be shown here):

+ +

+char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
+{
+    if (*ext == '.') ++ext;
+    ap_table_set (m->forced_types, ext, ct);
+    return NULL;
+}

+ +

This command handler is unusually simple. As you can see, it takes four arguments: a pointer to a cmd_parms structure, the per-directory configuration structure for the module in question, and two pre-parsed arguments. The cmd_parms structure contains a bunch of arguments which are frequently of use to some, but not all, commands, including a resource pool (from which memory can be allocated, and to which cleanups should be tied), and the (virtual) server being configured, from which the module's per-server configuration data can be obtained if required.

+ +

Another way in which this particular command handler is unusually + simple is that there are no error conditions which it can encounter. If + there were, it could return an error message instead of NULL; + this causes an error to be printed out on the server's + stderr, followed by a quick exit, if it is in the main config + files; for a .htaccess file, the syntax error is logged in + the server error log (along with an indication of where it came from), and + the request is bounced with a server error response (HTTP error status, + code 500).

+ +

The MIME module's command table has entries for these commands, which + look like this:

+ +

+command_rec mime_cmds[] = {
+    { "AddType", add_type, NULL, OR_FILEINFO, TAKE2,
+      "a mime type followed by a file extension" },
+    { "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,
+      "an encoding (e.g., gzip), followed by a file extension" },
+    { NULL }
+};

+ +

The entries in these tables are:

+
    +
  • The name of the command
  • +
  • The function which handles it
  • +
  • a (void *) pointer, which is passed in the + cmd_parms structure to the command handler --- + this is useful in case many similar commands are handled by + the same function.
  • + +
  • A bit mask indicating where the command may appear. There + are mask bits corresponding to each + AllowOverride option, and an additional mask + bit, RSRC_CONF, indicating that the command may + appear in the server's own config files, but not in + any .htaccess file.
  • + +
  • A flag indicating how many arguments the command handler + wants pre-parsed, and how they should be passed in. + TAKE2 indicates two pre-parsed arguments. Other + options are TAKE1, which indicates one + pre-parsed argument, FLAG, which indicates that + the argument should be On or Off, + and is passed in as a boolean flag, RAW_ARGS, + which causes the server to give the command the raw, unparsed + arguments (everything but the command name itself). There is + also ITERATE, which means that the handler looks + the same as TAKE1, but that if multiple + arguments are present, it should be called multiple times, + and finally ITERATE2, which indicates that the + command handler looks like a TAKE2, but if more + arguments are present, then it should be called multiple + times, holding the first argument constant.
  • + +
  • Finally, we have a string which describes the arguments + that should be present. If the arguments in the actual config + file are not as required, this string will be used to help + give a more specific error message. (You can safely leave + this NULL).
  • +
+ +
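The ITERATE2 behavior described in the list above is the least obvious of the argument styles, so here is a minimal sketch of it in plain C. The dispatcher, the handler type, and the error string are all assumptions made for the illustration; the real server performs this dispatch internally, and none of the toy_ names exist in its API.

```c
#include <stddef.h>

/* Hypothetical TAKE2-shaped handler: returns NULL on success,
   or an error message. */
typedef const char *(*toy_take2_fn)(void *conf, const char *a, const char *b);

/* ITERATE2-style dispatch: one call per extra argument, with the
   first argument held constant. */
const char *toy_iterate2(toy_take2_fn fn, void *conf,
                         const char *const *argv, size_t argc)
{
    size_t i;
    if (argc < 2) return "requires at least two arguments";
    for (i = 1; i < argc; i++) {
        const char *err = fn(conf, argv[0], argv[i]);
        if (err != NULL) return err;   /* stop at the first failure */
    }
    return NULL;
}

/* Sample handler that records how it was called. */
typedef struct {
    int calls;
    const char *last_first;
    const char *last_second;
} toy_log;

const char *toy_record(void *conf, const char *a, const char *b)
{
    toy_log *log = conf;
    log->calls++;
    log->last_first = a;
    log->last_second = b;
    return NULL;
}
```

Given the arguments text/html, html, htm and shtml, the handler is called three times, each time with text/html as its first argument.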

Finally, having set this all up, we have to use it. This is ultimately + done in the module's handlers, specifically for its file-typing handler, + which looks more or less like this; note that the per-directory + configuration structure is extracted from the request_rec's + per-directory configuration vector by using the + ap_get_module_config function.

+ +

+int find_ct(request_rec *r)
+{
+    int i;
+    char *fn = ap_pstrdup (r->pool, r->filename);
+    mime_dir_config *conf = (mime_dir_config *)
+        ap_get_module_config(r->per_dir_config, &mime_module);
+    char *type;
+
+    if (S_ISDIR(r->finfo.st_mode)) {
+        r->content_type = DIR_MAGIC_TYPE;
+        return OK;
+    }
+
+    if((i=ap_rind(fn,'.')) < 0) return DECLINED;
+    ++i;
+
+    if ((type = ap_table_get (conf->encoding_types, &fn[i])))
+    {
+        r->content_encoding = type;
+
+        /* go back to previous extension to try to use it as a type */
+        fn[i-1] = '\0';
+        if((i=ap_rind(fn,'.')) < 0) return OK;
+        ++i;
+    }
+
+    if ((type = ap_table_get (conf->forced_types, &fn[i])))
+    {
+        r->content_type = type;
+    }
+
+    return OK;
+}
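The two-step suffix lookup in the handler above can be tried outside the server with a small stand-alone sketch. It uses only the C standard library; the toy_map arrays stand in for the per-directory tables, and all toy_ names are invented for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the encoding and type tables. */
typedef struct { const char *ext; const char *value; } toy_map;
typedef struct { const char *type; const char *encoding; } toy_result;

static const char *toy_lookup(const toy_map *m, size_t n, const char *ext)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(m[i].ext, ext) == 0) return m[i].value;
    return NULL;
}

/* Check the last suffix as an encoding first; if it matches, strip
   it and use the previous suffix for the type lookup. */
toy_result toy_find_ct(const char *filename,
                       const toy_map *types, size_t ntypes,
                       const toy_map *encs, size_t nencs)
{
    toy_result r = { NULL, NULL };
    char buf[256];
    char *dot;

    if (strlen(filename) >= sizeof(buf)) return r;  /* too long for the toy */
    strcpy(buf, filename);

    dot = strrchr(buf, '.');
    if (dot == NULL) return r;                      /* no suffix at all */

    r.encoding = toy_lookup(encs, nencs, dot + 1);
    if (r.encoding != NULL) {
        *dot = '\0';                                /* drop the encoding suffix */
        dot = strrchr(buf, '.');
        if (dot == NULL) return r;
    }

    r.type = toy_lookup(types, ntypes, dot + 1);
    return r;
}
```

For a name such as index.html.gz, with gz registered as an encoding and html as a type, both fields are filled in; for index.html only the type is set.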

+ + +

Side notes -- per-server configuration, + virtual servers, etc.

+

The ideas behind per-server module configuration are basically the same as those for per-directory configuration; there is a creation function and a merge function, the latter being invoked where a virtual server has partially overridden the base server configuration, and a combined structure must be computed. (As with per-directory configuration, the default if no merge function is specified, and a module is configured in some virtual server, is that the base configuration is simply ignored).

+ +

The only substantial difference is that when a command needs to + configure the per-server private module data, it needs to go to the + cmd_parms data to get at it. Here's an example, from the + alias module, which also indicates how a syntax error can be returned + (note that the per-directory configuration argument to the command + handler is declared as a dummy, since the module doesn't actually have + per-directory config data):

+ +

+char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)
+{
+    server_rec *s = cmd->server;
+    alias_server_conf *conf = (alias_server_conf *)
+        ap_get_module_config(s->module_config,&alias_module);
+    alias_entry *new;
+
+    /* Validate before pushing, so a rejected URL leaves no empty entry */
+    if (!ap_is_url (url)) return "Redirect to non-URL";
+
+    new = ap_push_array (conf->redirects);
+    new->fake = f; new->real = url;
+    return NULL;
+}

+ +
+
+

Available Languages:  en 

+
top

Comments

Notice:
This is not a Q&A section. Comments placed here should be pointed towards suggestions on improving the documentation or server, and may be removed by our moderators if they are either implemented or considered invalid/off-topic. Questions on how to manage the Apache HTTP Server should be directed at either our IRC channel, #httpd, on Libera.chat, or sent to our mailing lists.
+
+ \ No newline at end of file diff --git a/docs/manual/developer/debugging.html b/docs/manual/developer/debugging.html new file mode 100644 index 0000000..83dcee2 --- /dev/null +++ b/docs/manual/developer/debugging.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: debugging.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/debugging.html.en b/docs/manual/developer/debugging.html.en new file mode 100644 index 0000000..00ce08c --- /dev/null +++ b/docs/manual/developer/debugging.html.en @@ -0,0 +1,60 @@ + + + + + +Debugging Memory Allocation in APR - Apache HTTP Server Version 2.4 + + + + + + + +
+

Debugging Memory Allocation in APR

+
+

Available Languages:  en 

+
+ +

+ This document has been removed. +

+
+
+
+

Available Languages:  en 

+
top

+
+ \ No newline at end of file diff --git a/docs/manual/developer/documenting.html b/docs/manual/developer/documenting.html new file mode 100644 index 0000000..fef7894 --- /dev/null +++ b/docs/manual/developer/documenting.html @@ -0,0 +1,9 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: documenting.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 + +URI: documenting.html.zh-cn.utf8 +Content-Language: zh-cn +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/documenting.html.en b/docs/manual/developer/documenting.html.en new file mode 100644 index 0000000..4902eb7 --- /dev/null +++ b/docs/manual/developer/documenting.html.en @@ -0,0 +1,112 @@ + + + + + +Documenting code in Apache 2.4 - Apache HTTP Server Version 2.4 + + + + + + + +
+

Documenting code in Apache 2.4

+
+

Available Languages:  en  | + zh-cn 

+
+ +

Apache 2.4 uses Doxygen to + document the APIs and global variables in the code. This will explain + the basics of how to document using Doxygen.

+
+
top
+
+

Brief Description

+

To start a documentation block, use /**
+ To end a documentation block, use */

+ +

In the middle of the block, there are multiple tags we can + use:

+ +

+ Description of this function's purpose
+ @param parameter_name description
+ @return description
+ @deffunc signature of the function
+

+ +

The deffunc is not always necessary. Doxygen does not have a full parser in it, so any prototype that uses a macro in the return type declaration is too complex for scandoc. Those functions require a deffunc. An example (using &gt; rather than >):

+ +

+ /**
+  * return the final element of the pathname
+  * @param pathname The path to get the final element of
+  * @return the final element of the path
+  * @tip Examples:
+  * <pre>
+  * "/foo/bar/gum" -&gt; "gum"
+  * "/foo/bar/gum/" -&gt; ""
+  * "gum" -&gt; "gum"
+  * "wi\\n32\\stuff" -&gt; "stuff"
+  * </pre>
+  * @deffunc const char * ap_filename_of_pathname(const char *pathname)
+  */ +
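As an illustration of how such a block sits on top of real code, here is a minimal sketch that attaches a documentation comment to a small function. The function toy_filename_of_pathname is a hypothetical stand-in written for this example, not the server's ap_filename_of_pathname, and only the @param and @return tags are used.

```c
#include <string.h>

/**
 * Return the final element of a pathname.
 *
 * @param pathname The path to get the final element of
 * @return Pointer to the final element inside pathname; an empty
 *         string if the path ends in a slash
 */
const char *toy_filename_of_pathname(const char *pathname)
{
    /* Everything after the last '/' is the final element. */
    const char *slash = strrchr(pathname, '/');
    return slash ? slash + 1 : pathname;
}
```

The first three cases listed in the comment block above hold for this stand-in as well: a trailing slash yields an empty string, and a path with no slash is returned unchanged.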

+ +

At the top of the header file, always include:

+

+ /**
+  * @package Name of library header
+  */ +

+ +

Doxygen uses a new HTML file for each package. The HTML files are named + {Name_of_library_header}.html, so try to be concise with your names.

+ +

For a further discussion of the possibilities please refer to + the Doxygen site.

+
+
+

Available Languages:  en  | + zh-cn 

+
top

+
+ \ No newline at end of file diff --git a/docs/manual/developer/documenting.html.zh-cn.utf8 b/docs/manual/developer/documenting.html.zh-cn.utf8 new file mode 100644 index 0000000..dab18a1 --- /dev/null +++ b/docs/manual/developer/documenting.html.zh-cn.utf8 @@ -0,0 +1,109 @@ + + + + + +Apache 2.0 文档 - Apache HTTP 服务器 版本 2.4 + + + + + + + +
+

Apache 2.0 Documentation

+
+

Available Languages:  en  | + zh-cn 

+
+
This translation may be out of date. Check the English version for recent changes.
+ +

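Apache 2.0 uses Doxygen to generate documentation for the APIs and global variables from the code. What follows is a brief introduction to generating documentation with Doxygen.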

+
+
top
+
+

Brief Description

+

To start a documentation block, use /**
+ To end a documentation block, use */

+ +

Inside a documentation block, there are multiple tags we can use:

+ +

+ Description of this function's purpose
+ @param parameter_name description
+ @return description
+ @deffunc signature of the function
+

+ +

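The deffunc is usually not necessary. Doxygen does not have a full parser, so any prototype that uses a macro in the return type declaration is too complex for it. Those functions require a deffunc. For example (using &gt; rather than >):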

+ +

+ /**
+  * return the final element of the pathname
+  * @param pathname The path to get the final element of
+  * @return the final element of the path
+  * @tip Examples:
+  * <pre>
+  * "/foo/bar/gum" -&gt; "gum"
+  * "/foo/bar/gum/" -&gt; ""
+  * "gum" -&gt; "gum"
+  * "wi\\n32\\stuff" -&gt; "stuff"
+  * </pre>
+  * @deffunc const char * ap_filename_of_pathname(const char *pathname)
+  */ +

+ +

At the top of the header file, always include:

+

+ /**
+  * @package Name of library header
+  */ +

+ +

Doxygen generates a new HTML file for each package, named {Name_of_library_header}.html, so please keep the names short.

+ +

For a more in-depth discussion, see the Doxygen site.

+
+
+

Available Languages:  en  | + zh-cn 

+
top

+
+ \ No newline at end of file diff --git a/docs/manual/developer/filters.html b/docs/manual/developer/filters.html new file mode 100644 index 0000000..48559da --- /dev/null +++ b/docs/manual/developer/filters.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: filters.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/filters.html.en b/docs/manual/developer/filters.html.en new file mode 100644 index 0000000..61971b5 --- /dev/null +++ b/docs/manual/developer/filters.html.en @@ -0,0 +1,234 @@ + + + + + +How filters work in Apache 2.0 - Apache HTTP Server Version 2.4 + + + + + + + +
+

How filters work in Apache 2.0

+
+

Available Languages:  en 

+
+ +

Warning

+

This is a cut 'n paste job from an email + (<022501c1c529$f63a9550$7f00000a@KOJ>) and only reformatted for + better readability. It's not up to date but may be a good start for + further research.

+
+
+ +
top
+
+

Filter Types

+

There are three basic filter types (each of these is actually broken + down into two categories, but that comes later).

+ +
+
CONNECTION
+
Filters of this type are valid for the lifetime of this connection. + (AP_FTYPE_CONNECTION, AP_FTYPE_NETWORK)
+ +
PROTOCOL
+
Filters of this type are valid for the lifetime of this request from the point of view of the client; this means that the request is valid from the time that the request is sent until the time that the response is received. (AP_FTYPE_PROTOCOL, AP_FTYPE_TRANSCODE)
+ +
RESOURCE
+
Filters of this type are valid for the time that this content is used + to satisfy a request. For simple requests, this is identical to + PROTOCOL, but internal redirects and sub-requests can change + the content without ending the request. (AP_FTYPE_RESOURCE, + AP_FTYPE_CONTENT_SET)
+
+ +

It is important to make the distinction between a protocol and a resource filter. A resource filter is tied to a specific resource. It may also be tied to header information, but the main binding is to a resource. If you are writing a filter and you want to know if it is resource or protocol, the correct question to ask is: "Can this filter be removed if the request is redirected to a different resource?" If the answer is yes, then it is a resource filter. If it is no, then it is most likely a protocol or connection filter. I won't go into connection filters, because they seem to be well understood. With this definition, a few examples might help:

+ +
+
Byterange
+
We have coded it to be inserted for all requests, and it is removed + if not used. Because this filter is active at the beginning of all + requests, it can not be removed if it is redirected, so this is a + protocol filter.
+ +
http_header
+
This filter actually writes the headers to the network. This is + obviously a required filter (except in the asis case which is special + and will be dealt with below) and so it is a protocol filter.
+ +
Deflate
+
The administrator configures this filter based on which file has been + requested. If we do an internal redirect from an autoindex page to an + index.html page, the deflate filter may be added or removed based on + config, so this is a resource filter.
+
+ +

The further breakdown of each category into two more filter types is + strictly for ordering. We could remove it, and only allow for one + filter type, but the order would tend to be wrong, and we would need to + hack things to make it work. Currently, the RESOURCE filters + only have one filter type, but that should change.

+
top
+
+

How are filters inserted?

+

This is actually rather simple in theory, but the code is + complex. First of all, it is important that everybody realize that + there are three filter lists for each request, but they are all + concatenated together:

+
    +
  • r->output_filters (corresponds to RESOURCE)
  • +
  • r->proto_output_filters (corresponds to PROTOCOL)
  • +
  • r->connection->output_filters (corresponds to CONNECTION)
  • +
+ +

Previously, the problem was that we used a singly linked list to create the filter stack, and we started from the "correct" location. This means that if I had a RESOURCE filter on the stack, and I added a CONNECTION filter, the CONNECTION filter would be ignored. This should make sense, because we would insert the connection filter at the top of the c->output_filters list, but the end of r->output_filters pointed to the filter that used to be at the front of c->output_filters. This is obviously wrong. The new insertion code uses a doubly linked list. This has the advantage that we never lose a filter that has been inserted. Unfortunately, it comes with a separate set of headaches.
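The lost-filter problem with the singly linked list can be reproduced in a few lines of plain C. This is a toy model under stated assumptions: the toy_filter type and both helper functions are invented for the illustration, and the real filter structures carry far more state.

```c
#include <stddef.h>

/* Hypothetical singly linked filter node. */
typedef struct toy_filter {
    const char *name;
    struct toy_filter *next;
} toy_filter;

/* Return 1 if target can be reached by walking from head. */
int toy_reaches(const toy_filter *head, const toy_filter *target)
{
    for (; head != NULL; head = head->next)
        if (head == target) return 1;
    return 0;
}

/* Insert at the front of a list, the way a new connection filter
   was added to the connection's list. */
void toy_push_front(toy_filter **list, toy_filter *f)
{
    f->next = *list;
    *list = f;
}
```

If the last resource filter already points at the old head of the connection list, a filter pushed onto the front of the connection list afterwards is reachable from the connection list but not from the request's list, so data sent down the request's chain never passes through it.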

+ +

The problem is that we have two different cases where we use subrequests. The first is to insert more data into a response. The second is to replace the existing response with an internal redirect. These are two different cases and need to be treated as such.

+ +

In the first case, we are creating the subrequest from within a handler or filter. This means that the next filter should be passed to the make_sub_request function, and the last resource filter in the sub-request will point to the next filter in the main request. This makes sense, because the sub-request's data needs to flow through the same set of filters as the main request. A graphical representation might help:

+ +
Default_handler --> includes_filter --> byterange --> ...
+ +

If the includes filter creates a sub request, then we don't want the + data from that sub-request to go through the includes filter, because it + might not be SSI data. So, the subrequest adds the following:

+ +
Default_handler --> includes_filter -/-> byterange --> ...
+                                    /
+Default_handler --> sub_request_core
+ +

What happens if the subrequest is SSI data? Well, that's easy, the + includes_filter is a resource filter, so it will be added to + the sub request in between the Default_handler and the + sub_request_core filter.

+ +

The second case for sub-requests is when one sub-request is going to + become the real request. This happens whenever a sub-request is created + outside of a handler or filter, and NULL is passed as the next filter to + the make_sub_request function.

+ +

In this case, the resource filters no longer make sense for the new + request, because the resource has changed. So, instead of starting from + scratch, we simply point the front of the resource filters for the + sub-request to the front of the protocol filters for the old request. + This means that we won't lose any of the protocol filters, neither will + we try to send this data through a filter that shouldn't see it.

+ +

The problem is that we are using a doubly-linked list for our filter stacks now. But, you should notice that it is possible for two lists to intersect in this model. So, how do you handle the previous pointer? This is a very difficult question to answer, because there is no "right" answer; either method is equally valid. I looked at why we use the previous pointer. The only reason for it is to allow for easier addition of new servers. With that being said, the solution I chose was to make the previous pointer always stay on the original request.

+ +

This causes some more complex logic, but it works for all cases. My + concern in having it move to the sub-request, is that for the more + common case (where a sub-request is used to add data to a response), the + main filter chain would be wrong. That didn't seem like a good idea to + me.

+
top
+
+

Asis

+

The final topic. :-) Mod_Asis is a bit of a hack, but the + handler needs to remove all filters except for connection filters, and + send the data. If you are using mod_asis, all other + bets are off.

+
top
+
+

Explanations

+

The absolutely last point is that the reason this code was so hard to get right was that we had hacked so much to force it to work. I wrote most of the hacks originally, so I am very much to blame. However, now that the code is right, I have started to remove some hacks. Most people should have seen that the reset_filters and add_required_filters functions are gone. Those inserted protocol level filters for error conditions; in fact, both functions did the same thing, one after the other, which was really strange. Because we don't lose protocol filters for error cases any more, those hacks went away. The HTTP_HEADER, Content-length, and Byterange filters are all added in the insert_filters phase, because if they were added earlier, we had some interesting interactions. Now, those could all be moved to be inserted with the HTTP_IN, CORE, and CORE_IN filters. That would make the code easier to follow.

+
+
+

Available Languages:  en 

+
top

+
+ \ No newline at end of file diff --git a/docs/manual/developer/hooks.html b/docs/manual/developer/hooks.html new file mode 100644 index 0000000..75c3cad --- /dev/null +++ b/docs/manual/developer/hooks.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: hooks.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/hooks.html.en b/docs/manual/developer/hooks.html.en new file mode 100644 index 0000000..30aa6f9 --- /dev/null +++ b/docs/manual/developer/hooks.html.en @@ -0,0 +1,261 @@ + + + + + +Hook Functions in the Apache HTTP Server 2.x - Apache HTTP Server Version 2.4 + + + + + + + +
<-
+

Hook Functions in the Apache HTTP Server 2.x

+
+

Available Languages:  en 

+
+ +

Warning

+

This document is still in development and may be partially out of + date.

+
+ +

In general, a hook function is one that the Apache HTTP Server + will call at some point during the processing of a request. + Modules can provide functions that are called, and specify when + they get called in comparison to other modules.

+
+ +
top
+
+

Core Hooks

+

The httpd's core modules offer a predefined list of hooks used during the standard request processing phase. Creating a new hook will expose a function that implements it (see sections below), but it is essential to understand that you will not extend the httpd's core hooks. Their presence and order in the request processing is in fact a consequence of how they are called in server/request.c (check this section for an overview). The core hooks are listed in the doxygen documentation.

+ +

Reading the guide for developing modules and the request processing guide before proceeding is highly recommended.

+
top
+
+

Creating a hook function

+

In order to create a new hook, four things need to be + done:

+ +

Declare the hook function

+

Use the AP_DECLARE_HOOK macro, which needs to be given + the return type of the hook function, the name of the hook, and the + arguments. For example, if the hook returns an int and + takes a request_rec * and an int and is + called do_something, then declare it like this:

+
AP_DECLARE_HOOK(int, do_something, (request_rec *r, int n))
+ + +

This should go in a header which modules will include if + they want to use the hook.

+ + +

Create the hook structure

+

Each source file that exports a hook has a private structure + which is used to record the module functions that use the hook. + This is declared as follows:

+ +
APR_HOOK_STRUCT(
+  APR_HOOK_LINK(do_something)
+  ...
+)
+ + + +

Implement the hook caller

+

The source file that exports the hook has to implement a + function that will call the hook. There are currently three + possible ways to do this. In all cases, the calling function is + called ap_run_hookname().

+ +

Void hooks

+

If the return value of a hook is void, then all the + hooks are called, and the caller is implemented like this:

+ +
AP_IMPLEMENT_HOOK_VOID(do_something, (request_rec *r, int n), (r, n))
+ + +

The second and third arguments are the dummy argument + declaration and the dummy arguments as they will be used when + calling the hook. In other words, this macro expands to + something like this:

+ +
void ap_run_do_something(request_rec *r, int n)
+{
+    ...
+    do_something(r, n);
+}
+ + + +

Hooks that return a value

+

If the hook returns a value, then it can either be run until + the first hook that does something interesting, like so:

+ +
AP_IMPLEMENT_HOOK_RUN_FIRST(int, do_something, (request_rec *r, int n), (r, n), DECLINED)
+ + +

The first hook that does not return DECLINED + stops the loop and its return value is returned from the hook + caller. Note that DECLINED is the traditional + hook return value meaning "I didn't do anything", but it can be + whatever suits you.

+ +

Alternatively, all hooks can be run until an error occurs. + This boils down to permitting two return values, one of + which means "I did something, and it was OK" and the other + meaning "I did nothing". The first function that returns a + value other than one of those two stops the loop, and its + return is the return value. Declare these like so:

+ +
AP_IMPLEMENT_HOOK_RUN_ALL(int, do_something, (request_rec *r, int n), (r, n), OK, DECLINED)
+ + +

Again, OK and DECLINED are the traditional + values. You can use what you want.
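The difference between the two value-returning styles can be modelled in plain C. The following is only a sketch of the loop semantics described above, not the real macro expansion: `hook_fn`, `add_hook`, `run_first` and `run_all` are hypothetical names, and the real hook functions take whatever arguments the hook declares.

```c
#include <assert.h>
#include <stddef.h>

/* Values as defined in httpd.h. */
#define OK        0
#define DECLINED -1

#define MAX_HOOKS 8

/* Hypothetical stand-in for the hook's argument list. */
typedef int (*hook_fn)(int n);

static hook_fn hooks[MAX_HOOKS];
static size_t  hook_count;

/* Hypothetical registration helper; real modules call ap_hook_<name>(). */
static void add_hook(hook_fn fn)
{
    assert(hook_count < MAX_HOOKS);
    hooks[hook_count++] = fn;
}

/* RUN_FIRST style: stop at the first function that does not return `decline`. */
static int run_first(int n, int decline)
{
    size_t i;
    for (i = 0; i < hook_count; i++) {
        int rv = hooks[i](n);
        if (rv != decline)
            return rv;
    }
    return decline;
}

/* RUN_ALL style: keep going while functions return `ok` or `decline`;
 * the first other value stops the loop and is returned. */
static int run_all(int n, int ok, int decline)
{
    size_t i;
    for (i = 0; i < hook_count; i++) {
        int rv = hooks[i](n);
        if (rv != ok && rv != decline)
            return rv;
    }
    return ok;
}

/* Three sample hook functions for experimenting with the two loops. */
static int declines(int n) { (void)n; return DECLINED; }
static int accepts(int n)  { (void)n; return OK; }
static int fails(int n)    { (void)n; return 500; }
```

Registering `declines`, `accepts` and `fails` in that order makes the contrast visible: the run-first loop stops at `accepts`, while the run-all loop carries on until `fails` reports an error.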

+ + + +

Call the hook callers

+

At appropriate moments in the code, call the hook caller, + like so:

+ +
int n, ret;
+request_rec *r;
+
+ret=ap_run_do_something(r, n);
+ + +
top
+
+

Hooking the hook

+

A module that wants a hook to be called needs to do two + things.

+ +

Implement the hook function

+

Include the appropriate header, and define a static function + of the correct type:

+ +
static int my_something_doer(request_rec *r, int n)
+{
+    ...
+    return OK;
+}
+ + + +

Add a hook registering function

+

During initialisation, the server will call each module's hook registering function, which is included in the module structure:

+ +
static void my_register_hooks(apr_pool_t *p)
+{
+    ap_hook_do_something(my_something_doer, NULL, NULL, APR_HOOK_MIDDLE);
+}
+
+module AP_MODULE_DECLARE_DATA my_module =
+{
+    ...
+    my_register_hooks       /* register hooks */
+};
+ + + +

Controlling hook calling order

+

In the example above, we didn't use the three arguments in the hook registration function that control the calling order of all the functions registered within the hook. There are two mechanisms for doing this. The first, rather crude, method allows us to specify roughly where the hook is run relative to other modules. The final argument controls this. There are three possible values: APR_HOOK_FIRST, APR_HOOK_MIDDLE and APR_HOOK_LAST.

+ +

All modules using any particular value may be run in any order relative to each other, but, of course, all modules using APR_HOOK_FIRST will be run before those using APR_HOOK_MIDDLE, which in turn are run before those using APR_HOOK_LAST. Modules that don't care when they are run should use APR_HOOK_MIDDLE. These values are spaced out, so that positions like APR_HOOK_FIRST-2 can be used to hook slightly earlier than other functions.

+ +

Note that there are two more values, + APR_HOOK_REALLY_FIRST and APR_HOOK_REALLY_LAST. These + should only be used by the hook exporter.

+ +

The other method allows finer control. When a module knows + that it must be run before (or after) some other modules, it + can specify them by name. The second (third) argument is a + NULL-terminated array of strings consisting of the names of + modules that must be run before (after) the current module. For + example, suppose we want "mod_xyz.c" and "mod_abc.c" to run + before we do, then we'd hook as follows:

+ +
static void register_hooks(apr_pool_t *p)
+{
+    static const char * const aszPre[] = { "mod_xyz.c", "mod_abc.c", NULL };
+
+    ap_hook_do_something(my_something_doer, aszPre, NULL, APR_HOOK_MIDDLE);
+}
+ + +

Note that the sort used to achieve this is stable, so ordering set by APR_HOOK_ORDER (that is, APR_HOOK_FIRST, APR_HOOK_MIDDLE or APR_HOOK_LAST) is preserved as far as possible.
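To illustrate what "stable" means for the coarse ordering values, here is a sketch in plain C. It models only the numeric ordering, not the predecessor and successor lists, and everything in it is hypothetical: the `hook_entry` type, the `sort_hooks` helper, and the numeric values chosen for the constants.

```c
#include <assert.h>
#include <stddef.h>

/* Spaced-out order values standing in for APR_HOOK_FIRST,
 * APR_HOOK_MIDDLE and APR_HOOK_LAST (the numbers are an assumption). */
enum { HOOK_FIRST = 0, HOOK_MIDDLE = 10, HOOK_LAST = 20 };

/* Hypothetical record of one registered hook function. */
typedef struct {
    const char *module;  /* e.g. "mod_xyz.c" */
    int         order;   /* HOOK_FIRST, HOOK_MIDDLE - 2, ... */
} hook_entry;

/* Stable insertion sort: entries with an equal order value keep the
 * relative position in which they were registered. */
static void sort_hooks(hook_entry *e, size_t n)
{
    size_t i, j;
    assert(e != NULL || n == 0);
    for (i = 1; i < n; i++) {
        hook_entry cur = e[i];
        for (j = i; j > 0 && e[j - 1].order > cur.order; j--)
            e[j] = e[j - 1];
        e[j] = cur;
    }
}
```

An entry registered with `HOOK_MIDDLE - 2` sorts ahead of every plain `HOOK_MIDDLE` entry, while two `HOOK_MIDDLE` entries stay in registration order.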

+ + +
+
+

Available Languages:  en 

+
+
+ \ No newline at end of file diff --git a/docs/manual/developer/index.html b/docs/manual/developer/index.html new file mode 100644 index 0000000..d79f31b --- /dev/null +++ b/docs/manual/developer/index.html @@ -0,0 +1,9 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: index.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 + +URI: index.html.zh-cn.utf8 +Content-Language: zh-cn +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/index.html.en b/docs/manual/developer/index.html.en new file mode 100644 index 0000000..48b834d --- /dev/null +++ b/docs/manual/developer/index.html.en @@ -0,0 +1,89 @@ + + + + + +Developer Documentation for the Apache HTTP Server 2.4 - Apache HTTP Server Version 2.4 + + + + + + + +
<-
+

Developer Documentation for the Apache HTTP Server 2.4

+
+

Available Languages:  en  | + zh-cn 

+
+ +

Warning

+

Many of the documents listed here are in need of update. + They are in different stages of progress. + Please be patient and follow this link + to propose a fix or point out any error/discrepancy.

+
+
+ +
top
+
top
+
top
+
+
+

Available Languages:  en  | + zh-cn 

+
+ \ No newline at end of file diff --git a/docs/manual/developer/index.html.zh-cn.utf8 b/docs/manual/developer/index.html.zh-cn.utf8 new file mode 100644 index 0000000..b4e21ae --- /dev/null +++ b/docs/manual/developer/index.html.zh-cn.utf8 @@ -0,0 +1,88 @@ + + + + + +Apache 2.0 开发者文档 - Apache HTTP 服务器 版本 2.4 + + + + + + + +
<-
+

Apache 2.0 Developer Documentation

+
+

Available Languages:  en  | zh-cn 

+
+
This translation may be out of date. Check the English version for recent changes.
+ +

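Many of the documents on the developer pages come from Apache 1.3. They may be at different stages of being updated for Apache 2. Please be patient, or report discrepancies and errors in the developer pages directly to the dev@httpd.apache.org mailing list.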

+
+ +
top
+
top
+
+
+

Available Languages:  en  | zh-cn 

+
+ \ No newline at end of file diff --git a/docs/manual/developer/modguide.html b/docs/manual/developer/modguide.html new file mode 100644 index 0000000..3e5c834 --- /dev/null +++ b/docs/manual/developer/modguide.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: modguide.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/modguide.html.en b/docs/manual/developer/modguide.html.en new file mode 100644 index 0000000..3ac127e --- /dev/null +++ b/docs/manual/developer/modguide.html.en @@ -0,0 +1,1739 @@ + + + + + +Developing modules for the Apache HTTP Server 2.4 - Apache HTTP Server Version 2.4 + + + + + + + +
<-
+

Developing modules for the Apache HTTP Server 2.4

+
+

Available Languages:  en 

+
+ +

This document explains how you can develop modules for the Apache HTTP Server 2.4.

+
+ +
top
+
+

Introduction

+

What we will be discussing in this document

+

+This document will discuss how you can create modules for the Apache +HTTP Server 2.4, by exploring an example module called +mod_example. In the first part of this document, the purpose +of this module will be to calculate and print out various digest values for +existing files on your web server, whenever we access the URL +http://hostname/filename.sum. For instance, if we want to know the +MD5 digest value of the file located at +http://www.example.com/index.html, we would visit +http://www.example.com/index.html.sum. +

+ +

+In the second part of this document, which deals with configuration +directive and context awareness, we will be looking at a module that simply +writes out its own configuration to the client. +

+ + +

Prerequisites

+

+First and foremost, you are expected to have a basic knowledge of how the C programming language works. In most cases, we will try to be as pedagogical as possible and link to documents describing the functions used in the examples, but there are also many cases where it is necessary to either just assume that "it works" or do some digging yourself into the hows and whys of various function calls.

+

+Lastly, you will need to have a basic understanding of how modules are +loaded and configured in the Apache HTTP Server, as well as how to get the headers for +Apache if you do not have them already, as these are needed for compiling +new modules. +

+ +

Compiling your module

+

+To compile the source code we are building in this document, we will be +using APXS. Assuming your source file +is called mod_example.c, compiling, installing and activating the module is +as simple as: +

+
apxs -i -a -c mod_example.c
+ + +
top
+
+

Defining a module

+

+Module name tags
+Every module starts with the same declaration, or name tag if you will, +that defines a module as a separate entity within Apache:

+ + + +
module AP_MODULE_DECLARE_DATA   example_module =
+{ 
+    STANDARD20_MODULE_STUFF,
+    create_dir_conf, /* Per-directory configuration handler */
+    merge_dir_conf,  /* Merge handler for per-directory configurations */
+    create_svr_conf, /* Per-server configuration handler */
+    merge_svr_conf,  /* Merge handler for per-server configurations */
+    directives,      /* Any directives we may have for httpd */
+    register_hooks   /* Our hook registering function */
+};
+ + + +

+This bit of code lets the server know that we have now registered a new module +in the system, and that its name is example_module. The name +of the module is used primarily for two things:
+

+
    +
  • Letting the server know how to load the module using the LoadModule directive
  • Setting up a namespace for the module to use in configurations
+

+For now, we're only concerned with the first purpose of the module name, +which comes into play when we need to load the module: +

+
LoadModule example_module modules/mod_example.so
+ +

+In essence, this tells the server to open up mod_example.so and look for a module +called example_module. +

+

+Within this name tag of ours is also a bunch of references to how we would +like to handle things: Which directives do we respond to in a configuration +file or .htaccess, how do we operate within specific contexts, and what +handlers are we interested in registering with the Apache HTTP service. We'll +return to all these elements later in this document. +

+
top
+
+

Getting started: Hooking into the server

+

An introduction to hooks

+

+When handling requests in Apache HTTP Server 2.4, the first thing you will need to do is create a hook into the request handling process. A hook is essentially a message telling the server that you are willing to either serve or at least take a glance at certain requests given by clients. All handlers, whether it's mod_rewrite, mod_authn_*, mod_proxy and so on, are hooked into specific parts of the request process. As you are probably aware, modules serve different purposes; some are authentication/authorization handlers, others are file or script handlers, while others still rewrite URIs or proxy content. Furthermore, in the end, it is up to the user of the server how and when each module will come into play. Thus, the server itself does not presume to know which module is responsible for handling a specific request, and will ask each module whether it has an interest in a given request or not. It is then up to each module to either gently decline serving a request, accept serving it, or flat out deny the request from being served, as authentication/authorization modules do:
+Hook handling in httpd
+To make it a bit easier for handlers such as our mod_example to know +whether the client is requesting content we should handle or not, the server +has directives for hinting to modules whether their assistance is needed or +not. Two of these are AddHandler +and SetHandler. Let's take a look at +an example using AddHandler. In +our example case, we want every request ending with .sum to be served by +mod_example, so we'll add a configuration directive that tells +the server to do just that: +

+
AddHandler example-handler .sum
+ +

+What this tells the server is the following: Whenever we receive a request for a URI ending in .sum, we are to let all modules know that we are looking for whoever goes by the name of "example-handler". Thus, when a request is being served that ends in .sum, the server will let all modules know that this request should be served by "example-handler". As you will see later, when we start building mod_example, we will check for this handler tag relayed by AddHandler and reply to the server based on the value of this tag.
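The two halves of this exchange can be sketched in plain C. The sketch is deliberately simplified and assumes a pure "ends with" match, whereas the server's real extension handling is richer; `has_suffix`, `handler_for` and `wants_request` are hypothetical names, not httpd functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns nonzero when `uri` ends with `ext`. */
static int has_suffix(const char *uri, const char *ext)
{
    size_t ulen, elen;
    assert(uri != NULL && ext != NULL);
    ulen = strlen(uri);
    elen = strlen(ext);
    return ulen >= elen && strcmp(uri + (ulen - elen), ext) == 0;
}

/* Server side, modelling "AddHandler example-handler .sum":
 * choose the handler tag that is relayed to the modules. */
static const char *handler_for(const char *uri)
{
    return has_suffix(uri, ".sum") ? "example-handler" : NULL;
}

/* Module side: decide from the relayed tag whether the request is ours. */
static int wants_request(const char *handler)
{
    return handler != NULL && strcmp(handler, "example-handler") == 0;
}
```

A module that gets a tag it does not recognise, or no tag at all, simply declines and lets the server ask the next module.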

+ +

Hooking into httpd

+

+To begin with, we only want to create a simple handler that replies to the +client browser when a specific URL is requested, so we won't bother setting +up configuration handlers and directives just yet. Our initial module +definition will look like this:

+ + + +
module AP_MODULE_DECLARE_DATA   example_module =
+{
+    STANDARD20_MODULE_STUFF,
+    NULL,
+    NULL,
+    NULL,
+    NULL,
+    NULL,
+    register_hooks   /* Our hook registering function */
+};
+ + + + +

This lets the server know that we are not interested in anything fancy; we just want to hook onto the requests and possibly handle some of them.

+ +

The reference in our example declaration, register_hooks, is the name of a function we will create to manage how we hook onto the request process. In this example module, the function has just one purpose: to create a simple hook that gets called after all the rewrites, access control, etc. have been handled. Thus, we will let the server know that we want to hook into its process as one of the last modules:

+ + +
static void register_hooks(apr_pool_t *pool)
+{
+    /* Create a hook in the request handler, so we get called when a request arrives */
+    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
+}
+ + + +

+The example_handler reference is the function that will handle +the request. We will discuss how to create a handler in the next chapter. +

+ +

Other useful hooks

+

+Hooking into the request handling phase is but one of many hooks that you +can create. Some other ways of hooking are: +

+
    +
  • ap_hook_child_init: Place a hook that executes when a child process is spawned (commonly used for initializing modules after the server has forked)
  • ap_hook_pre_config: Place a hook that executes before any configuration data has been read (very early hook)
  • ap_hook_post_config: Place a hook that executes after configuration has been parsed, but before the server has forked
  • ap_hook_pre_translate_name: Place a hook that executes when a URI needs to be translated into a filename on the server, before decoding
  • ap_hook_translate_name: Place a hook that executes when a URI needs to be translated into a filename on the server (think mod_rewrite)
  • ap_hook_quick_handler: Similar to ap_hook_handler, except it is run before any other request hooks (translation, auth, fixups etc)
  • ap_hook_log_transaction: Place a hook that executes when the server is about to add a log entry of the current request
+ + +
top
+
+

Building a handler

+

+A handler is essentially a function that receives a callback when a request to the server is made. It is passed a record of the current request (how it was made, which headers and requests were passed along, who's giving the request and so on), and is put in charge of either telling the server that it's not interested in the request or handling the request with the tools provided.

+

A simple "Hello, world!" +handler

+

Let's start off by making a very simple request handler +that does the following: +

+
    +
  1. Check that this is a request that should be served by "example-handler"
  2. Set the content type of our output to text/html
  3. Write "Hello, world!" back to the client browser
  4. Let the server know that we took care of this request and everything went fine
+

+In C code, our example handler will now look like this: +

+ + +
static int example_handler(request_rec *r)
+{
+    /* First off, we need to check if this is a call for the "example-handler" handler.
+     * If it is, we accept it and do our things, if not, we simply return DECLINED,
+     * and the server will try somewhere else.
+     */
+    if (!r->handler || strcmp(r->handler, "example-handler")) return (DECLINED);
+    
+    /* Now that we are handling this request, we'll write out "Hello, world!" to the client.
+     * To do so, we must first set the appropriate content type, followed by our output.
+     */
+    ap_set_content_type(r, "text/html");
+    ap_rprintf(r, "Hello, world!");
+    
+    /* Lastly, we must tell the server that we took care of this request and everything went fine.
+     * We do so by simply returning the value OK to the server.
+     */
+    return OK;
+}
+ + + +

+Now, we put all we have learned together and end up with a program that +looks like +mod_example_1.c +. The functions used in this example will be explained later in the section +"Some useful functions you should know". +

+ +

The request_rec structure

+

The most essential part of any request is the request record. In a call to a handler function, this is represented by the request_rec* structure passed along with every call that is made. This struct, typically just referred to as r in modules, contains all the information you need for your module to fully process any HTTP request and respond accordingly.

Some key elements of the +request_rec structure are: +

+
    +
  • r->handler (char*): Contains the name of the handler the server is currently asking to do the handling of this request
  • r->method (char*): Contains the HTTP method being used, for example GET or POST
  • r->filename (char*): Contains the translated filename the client is requesting
  • r->args (char*): Contains the query string of the request, if any
  • r->headers_in (apr_table_t*): Contains all the headers sent by the client
  • r->connection (conn_rec*): A record containing information about the current connection
  • r->user (char*): If the URI requires authentication, this is set to the username provided
  • r->useragent_ip (char*): The IP address of the client connecting to us
  • r->pool (apr_pool_t*): The memory pool of this request. We'll discuss this in the "Memory management" chapter.
+

+A complete list of all the values contained within the request_rec structure can be found in +the httpd.h header +file or at http://ci.apache.org/projects/httpd/trunk/doxygen/structrequest__rec.html. +

+ + +

+Let's try out some of these variables in another example handler:
+

+ + +
static int example_handler(request_rec *r)
+{
+    /* Set the appropriate content type */
+    ap_set_content_type(r, "text/html");
+
+    /* Print out the IP address of the client connecting to us: */
+    ap_rprintf(r, "<h2>Hello, %s!</h2>", r->useragent_ip);
+    
+    /* If we were reached through a GET or a POST request, be happy, else sad. */
+    if ( !strcmp(r->method, "POST") || !strcmp(r->method, "GET") ) {
+        ap_rputs("You used a GET or a POST method, that makes us happy!<br/>", r);
+    }
+    else {
+        ap_rputs("You did not use POST or GET, that makes us sad :(<br/>", r);
+    }
+
+    /* Lastly, if there was a query string, let's print that too! */
+    if (r->args) {
+        ap_rprintf(r, "Your query string was: %s", r->args);
+    }
+    return OK;
+}
+ + + + + +

Return values

+

+Apache relies on return values from handlers to signify whether a request +was handled or not, and if so, whether the request went well or not. If a +module is not interested in handling a specific request, it should always +return the value DECLINED. If it is handling a request, it +should either return the generic value OK, or a specific HTTP +status code, for example: +

+ + +
static int example_handler(request_rec *r)
+{
+    /* Return 404: Not found */
+    return HTTP_NOT_FOUND;
+}
+ + + +

+Returning OK or an HTTP status code does not necessarily mean that the request will end. The server may still have other handlers that are interested in this request, for instance the logging modules which, upon a successful request, will write down a summary of what was requested and how it went. To do a full stop and prevent any further processing after your module is done, you can return the value DONE to let the server know that it should cease all activity on this request and carry on with the next, without informing other handlers.
+General response codes: +

+
    +
  • DECLINED: We are not handling this request
  • OK: We handled this request and it went well
  • DONE: We handled this request and the server should just close this thread without further processing
+

+HTTP specific return codes (excerpt): +

+
    +
  • HTTP_OK (200): Request was okay
  • HTTP_MOVED_PERMANENTLY (301): The resource has moved to a new URL
  • HTTP_UNAUTHORIZED (401): Client is not authorized to visit this page
  • HTTP_FORBIDDEN (403): Permission denied
  • HTTP_NOT_FOUND (404): File not found
  • HTTP_INTERNAL_SERVER_ERROR (500): Internal server error (self explanatory)
+ + +
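As a rough mental model of how the return values listed above differ, the following plain C sketch sorts a handler's return value into one of four follow-up actions. The `next_step` enum and the `classify` function are hypothetical and only cover the values discussed in this section; the numeric values of OK, DECLINED and DONE are those found in httpd.h.

```c
#include <assert.h>

/* Values as defined in httpd.h. */
#define OK                0
#define DECLINED         -1
#define DONE             -2
#define HTTP_NOT_FOUND  404

/* Hypothetical summary of what each kind of return value asks for. */
typedef enum {
    TRY_NEXT_HANDLER,   /* DECLINED: someone else should look at it */
    FINISH_NORMALLY,    /* OK: handled, carry on with logging etc. */
    STOP_PROCESSING,    /* DONE: handled, no further processing */
    SEND_STATUS         /* an HTTP status code such as 404 */
} next_step;

static next_step classify(int rv)
{
    /* This sketch only models the values discussed in the text. */
    assert(rv == DECLINED || rv == DONE || rv >= 0);
    switch (rv) {
    case DECLINED: return TRY_NEXT_HANDLER;
    case OK:       return FINISH_NORMALLY;
    case DONE:     return STOP_PROCESSING;
    default:       return SEND_STATUS;
    }
}
```

Note that OK (0) and HTTP_OK (200) are different values: the first is a generic "handled" signal, the second an explicit HTTP status.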

Some useful functions you should know

+ +
    +
  • ap_rputs(const char *string, request_rec *r):
    Sends a string of text to the client. This is a shorthand version of ap_rwrite.
    ap_rputs("Hello, world!", r);
  • ap_rprintf:
    This function works just like printf, except it sends the result to the client.
    ap_rprintf(r, "Hello, %s!", r->useragent_ip);
  • ap_set_content_type(request_rec *r, const char *type):
    Sets the content type of the output you are sending.
    ap_set_content_type(r, "text/plain"); /* force a raw text output */
+ + +

Memory management

+

+Managing your resources in Apache HTTP Server 2.4 is quite easy, thanks to the memory pool system. In essence, each server, connection and request has its own memory pool that gets cleaned up when its scope ends, e.g. when a request is done or when a server process shuts down. All your module needs to do is latch onto this memory pool, and you won't have to worry about having to clean up after yourself - pretty neat, huh?

+ +

+In our module, we will primarily be allocating memory for each request, so +it's appropriate to use the r->pool +reference when creating new objects. A few of the functions for allocating +memory within a pool are: +

+
    +
  • void* apr_palloc(apr_pool_t *p, apr_size_t size): Allocates size number of bytes in the pool for you
  • void* apr_pcalloc(apr_pool_t *p, apr_size_t size): Allocates size number of bytes in the pool for you and sets all bytes to 0
  • char* apr_pstrdup(apr_pool_t *p, const char *s): Creates a duplicate of the string s. This is useful for copying constant values so you can edit them
  • char* apr_psprintf(apr_pool_t *p, const char *fmt, ...): Similar to sprintf, except the server supplies you with an appropriately allocated target variable
+ +

Let's put these functions into an example handler:

+ + + +
static int example_handler(request_rec *r)
+{
+    const char *original = "You can't edit this!";
+    char *copy;
+    int *integers;
+    
+    /* Allocate space for 10 integer values and set them all to zero. */
+    integers = apr_pcalloc(r->pool, sizeof(int)*10); 
+    
+    /* Create a copy of the 'original' variable that we can edit. */
+    copy = apr_pstrdup(r->pool, original);
+    return OK;
+}
+ + + +

+This is all well and good for our module, which won't need any +pre-initialized variables or structures. However, if we wanted to +initialize something early on, before the requests come rolling in, we +could simply add a call to a function in our register_hooks +function to sort it out: +

+ + +
static void register_hooks(apr_pool_t *pool)
+{
+    /* Call a function that initializes some stuff */
+    example_init_function(pool);
+    /* Create a hook in the request handler, so we get called when a request arrives */
+    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
+}
+ + + +

+In this pre-request initialization function we would not be using the same pool as we did when allocating resources for request-based functions. Instead, we would use the pool given to us by the server for allocating memory at a per-process level.
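The idea behind pools, namely that allocations are tied to a scope and released together, can be shown with a toy allocator in plain C. This is only an illustration of the concept and makes no claim about how APR implements its pools; `toy_pool`, `toy_pcalloc`, `toy_pstrdup` and `toy_pool_clear` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation; the union keeps the
 * user memory that follows it suitably aligned. */
typedef union block {
    union block *next;
    long double  align;
} block;

/* Hypothetical toy pool: a list of blocks that are freed in one sweep. */
typedef struct {
    block *head;
    size_t live;   /* number of allocations not yet released */
} toy_pool;

/* Allocate `size` zeroed bytes owned by the pool (compare apr_pcalloc). */
static void *toy_pcalloc(toy_pool *p, size_t size)
{
    block *b;
    assert(p != NULL);
    b = calloc(1, sizeof(block) + size);
    if (b == NULL)
        return NULL;
    b->next = p->head;
    p->head = b;
    p->live++;
    return b + 1;   /* user memory starts right after the header */
}

/* Duplicate a string into the pool (compare apr_pstrdup). */
static char *toy_pstrdup(toy_pool *p, const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = toy_pcalloc(p, len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Release everything at once, as happens when a pool's scope ends. */
static void toy_pool_clear(toy_pool *p)
{
    assert(p != NULL);
    while (p->head != NULL) {
        block *next = p->head->next;
        free(p->head);
        p->head = next;
        p->live--;
    }
}
```

The calling code never frees individual allocations; it only decides which pool, and therefore which lifetime, each allocation belongs to.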

+ + +

Parsing request data

+

+In our example module, we would like to add a feature that checks which type of digest, MD5 or SHA1, the client would like to see. This could be solved by adding a query string to the request. A query string is typically composed of several keys and values put together in a string, for instance valueA=yes&valueB=no&valueC=maybe. It is up to the module itself to parse these and get the data it requires. In our example, we'll be looking for a key called digest, and if set to md5, we'll produce an MD5 digest, otherwise we'll produce a SHA1 digest.
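To make the key/value structure of a query string concrete, here is a hand-rolled lookup in plain C. It is a sketch for illustration only and does not use the server's API: `query_get` is a hypothetical helper, and it does no URL-decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the value of `key` from a query string such as
 * "valueA=yes&valueB=no" into `out`. Returns 1 if the key was found and
 * 0 otherwise. Values longer than the buffer are truncated. */
static int query_get(const char *query, const char *key,
                     char *out, size_t outlen)
{
    size_t keylen;
    const char *p = query;

    assert(key != NULL && out != NULL && outlen > 0);
    if (query == NULL)
        return 0;
    keylen = strlen(key);

    while (*p != '\0') {
        /* Length of the current "key=value" pair, up to the next '&'. */
        size_t pairlen = strcspn(p, "&");

        if (pairlen > keylen && strncmp(p, key, keylen) == 0
            && p[keylen] == '=') {
            size_t vallen = pairlen - keylen - 1;
            if (vallen >= outlen)
                vallen = outlen - 1;   /* truncate rather than overflow */
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p += pairlen;
        if (*p == '&')
            p++;   /* step over the separator */
    }
    return 0;
}
```

A caller would fall back to a default digest type whenever the function returns 0.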

+

+Since the introduction of Apache HTTP Server 2.4, parsing request data from GET and POST requests has never been easier. All we require to parse both GET and POST data is four simple lines:

+ + + +
+apr_table_t *GET;
+apr_array_header_t *POST;
+
+ap_args_to_table(r, &GET);
+ap_parse_form_data(r, NULL, &POST, -1, 8192);
+ + + +

+In our specific example module, we're looking for the digest +value from the query string, which now resides inside a table called +GET. To extract this value, we need only perform a simple operation: +

+ + + +
/* Get the "digest" key from the query string, if any. */
+const char *digestType = apr_table_get(GET, "digest");
+
+/* If no key was returned, we will set a default value instead. */
+if (!digestType) digestType = "sha1";
+ + + +

+The structures used for the POST and GET data are not exactly the same, so +if we were to fetch a value from POST data instead of the query string, we +would have to resort to a few more lines, as outlined in this example in the last chapter of this document. +

+ + +

Making an advanced handler

+

+Now that we have learned how to parse form data and manage our resources, we can move on to creating an advanced version of our module that spits out the MD5 or SHA1 digest of files:

+ + + +
static int example_handler(request_rec *r)
+{
+    int rc, exists;
+    apr_finfo_t finfo;
+    apr_file_t *file;
+    char *filename;
+    char buffer[256];
+    apr_size_t readBytes;
+    int n;
+    apr_table_t *GET;
+    apr_array_header_t *POST;
+    const char *digestType;
+    
+    
+    /* Check that the "example-handler" handler is being called. */
+    if (!r->handler || strcmp(r->handler, "example-handler")) return (DECLINED);
+    
+    /* Figure out which file is being requested by removing the .sum from it */
+    filename = apr_pstrdup(r->pool, r->filename);
+    filename[strlen(filename)-4] = 0; /* Cut off the last 4 characters. */
+    
+    /* Figure out if the file we request a sum on exists and isn't a directory */
+    rc = apr_stat(&finfo, filename, APR_FINFO_MIN, r->pool);
+    if (rc == APR_SUCCESS) {
+        exists =
+        (
+            (finfo.filetype != APR_NOFILE)
+        &&  (finfo.filetype != APR_DIR) /* filetype is an enum, not a bit mask */
+        );
+        if (!exists) return HTTP_NOT_FOUND; /* Return a 404 if not found. */
+    }
+    /* If apr_stat failed, we're probably not allowed to check this file. */
+    else return HTTP_FORBIDDEN;
+    
+    /* Parse the GET and, optionally, the POST data sent to us */
+    
+    ap_args_to_table(r, &GET);
+    ap_parse_form_data(r, NULL, &POST, -1, 8192);
+    
+    /* Set the appropriate content type */
+    ap_set_content_type(r, "text/html");
+    
+    /* Print a title and some general information */
+    ap_rprintf(r, "<h2>Information on %s:</h2>", filename);
+    ap_rprintf(r, "<b>Size:</b> %" APR_OFF_T_FMT " bytes<br/>", finfo.size);
+    
+    /* Get the digest type the client wants to see */
+    digestType = apr_table_get(GET, "digest");
+    if (!digestType) digestType = "MD5";
+    
+    
+    rc = apr_file_open(&file, filename, APR_READ, APR_OS_DEFAULT, r->pool);
+    if (rc == APR_SUCCESS) {
+        
+        /* Are we trying to calculate the MD5 or the SHA1 digest? */
+        if (!strcasecmp(digestType, "md5")) {
+            /* Calculate the MD5 sum of the file */
+            unsigned char digest[APR_MD5_DIGESTSIZE];
+            apr_md5_ctx_t md5;
+            apr_md5_init(&md5);
+            readBytes = sizeof(buffer);
+            while ( apr_file_read(file, buffer, &readBytes) == APR_SUCCESS ) {
+                apr_md5_update(&md5, buffer, readBytes);
+                /* apr_file_read stored the number of bytes read; ask for a full buffer again */
+                readBytes = sizeof(buffer);
+            }
+            apr_md5_final(digest, &md5);
+            
+            /* Print out the MD5 digest byte by byte, so the output does not
+             * depend on the byte order of the machine */
+            ap_rputs("<b>MD5: </b><code>", r);
+            for (n = 0; n < APR_MD5_DIGESTSIZE; n++) {
+                ap_rprintf(r, "%02x", digest[n]);
+            }
+            ap_rputs("</code>", r);
+            /* Print a link to the SHA1 version */
+            ap_rputs("<br/><a href='?digest=sha1'>View the SHA1 hash instead</a>", r);
+        }
+        else {
+            /* Calculate the SHA1 sum of the file */
+            union {
+                char      chr[20];
+                uint32_t  num[5];
+            } digest;
+            apr_sha1_ctx_t sha1;
+            apr_sha1_init(&sha1);
+            readBytes = 256;
+            while ( apr_file_read(file, buffer, &readBytes) == APR_SUCCESS ) {
+                apr_sha1_update(&sha1, buffer, readBytes);
+            }
+            apr_sha1_final(digest.chr, &sha1);
+            
+            /* Print out the SHA1 digest */
+            ap_rputs("<b>SHA1: </b><code>", r);
+            for (n = 0; n < APR_SHA1_DIGESTSIZE/4; n++) {
+                ap_rprintf(r, "%08x", digest.num[n]);
+            }
+            ap_rputs("</code>", r);
+            
+            /* Print a link to the MD5 version */
+            ap_rputs("<br/><a href='?digest=md5'>View the MD5 hash instead</a>", r);
+        }
+        apr_file_close(file);
+        
+    }    
+    /* Let the server know that we responded to this request. */
+    return OK;
+}
+ + + +
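The loops above print each digest as 32-bit words with `%08x`, so the order of the printed bytes depends on the byte order of the host. A minimal sketch of a byte-wise alternative follows, assuming a hypothetical helper named `digest_to_hex`, which is not part of APR or httpd:

```c
#include <stddef.h>

/* Hypothetical helper (not part of APR or httpd): writes 2*len lowercase hex
 * characters plus a terminating NUL into out, which must be able to hold
 * at least 2*len + 1 bytes. */
static void digest_to_hex(const unsigned char *digest, size_t len, char *out)
{
    static const char hex[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i]     = hex[digest[i] >> 4];   /* high nibble */
        out[2 * i + 1] = hex[digest[i] & 0x0f]; /* low nibble */
    }
    out[2 * len] = '\0';
}
```

In the handler, one could call this with the raw digest bytes, APR_MD5_DIGESTSIZE as the length and a 33-byte output buffer, and then print the result with ap_rputs.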

+This version in its entirety can be found here: +mod_example_2.c. +

+ + +
+
+

Adding configuration options

+

+In this next segment, we will turn away from the +digest module and create a new example module whose only function is to +write out its own configuration. The purpose is to examine how +the server works with configuration, and what happens when you start writing +advanced configurations +for your modules. +

+

An introduction to configuration +directives

+

+If you are reading this, then you probably already know +what a configuration directive is. Simply put, a directive is a way of +telling an individual module (or a set of modules) how to behave. For example, +these directives control how mod_rewrite works: +

+
RewriteEngine On
+RewriteCond "%{REQUEST_URI}" "^/foo/bar"
+RewriteRule "^/foo/bar/(.*)$" "/foobar?page=$1"
+ +

+Each of these configuration directives is handled by a separate function +that parses the parameters given and sets up a configuration accordingly. +

+ +

Making an example configuration

+

To begin with, we'll create a basic configuration in C-space:

+ + + +
typedef struct {
+    int         enabled;      /* Enable or disable our module */
+    const char *path;         /* Some path to...something */
+    int         typeOfAction; /* 1 means action A, 2 means action B and so on */
+} example_config;
+ + + +

+Now, let's put this into perspective by creating a very small module that +just prints out a hard-coded configuration. You'll notice that we use the +register_hooks function for initializing the configuration +values to their defaults: +

+ + +
typedef struct {
+    int         enabled;      /* Enable or disable our module */
+    const char *path;         /* Some path to...something */
+    int         typeOfAction; /* 1 means action A, 2 means action B and so on */
+} example_config;
+
+static example_config config;
+
+static int example_handler(request_rec *r)
+{
+    if (!r->handler || strcmp(r->handler, "example-handler")) return(DECLINED);
+    ap_set_content_type(r, "text/plain");
+    ap_rprintf(r, "Enabled: %u\n", config.enabled);
+    ap_rprintf(r, "Path: %s\n", config.path);
+    ap_rprintf(r, "TypeOfAction: %x\n", config.typeOfAction);
+    return OK;
+}
+
+static void register_hooks(apr_pool_t *pool) 
+{
+    config.enabled = 1;
+    config.path = "/foo/bar";
+    config.typeOfAction = 0x00;
+    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
+}
+
+/* Define our module as an entity and assign a function for registering hooks  */
+
+module AP_MODULE_DECLARE_DATA   example_module =
+{
+    STANDARD20_MODULE_STUFF,
+    NULL,            /* Per-directory configuration handler */
+    NULL,            /* Merge handler for per-directory configurations */
+    NULL,            /* Per-server configuration handler */
+    NULL,            /* Merge handler for per-server configurations */
+    NULL,            /* Any directives we may have for httpd */
+    register_hooks   /* Our hook registering function */
+};
+ + + +

+So far so good. To access our new handler, we could add the following to +our configuration: +

+
<Location "/example">
+    SetHandler example-handler
+</Location>
+ +

+When we visit /example, we'll see our current configuration being printed by our +module. +

+ + +

Registering directives with the server

+

+What if we want to change our configuration, not by hard-coding new values +into the module, but by using either the httpd.conf file or possibly a +.htaccess file? It's time to let the server know that we want this to be +possible. To do so, we must first change our name tag to include a +reference to the configuration directives we want to register with the server: +

+ + +
module AP_MODULE_DECLARE_DATA   example_module =
+{
+    STANDARD20_MODULE_STUFF,
+    NULL,               /* Per-directory configuration handler */
+    NULL,               /* Merge handler for per-directory configurations */
+    NULL,               /* Per-server configuration handler */
+    NULL,               /* Merge handler for per-server configurations */
+    example_directives, /* Any directives we may have for httpd */
+    register_hooks      /* Our hook registering function */
+};
+ + + +

+This will tell the server that we are now accepting directives from the +configuration files, and that the structure called example_directives + holds information on what our directives are and how they work. +Since we have three different variables in our module configuration, we +will add a structure with three directives and a NULL at the end: +

+ + +
static const command_rec        example_directives[] =
+{
+    AP_INIT_TAKE1("exampleEnabled", example_set_enabled, NULL, RSRC_CONF, "Enable or disable mod_example"),
+    AP_INIT_TAKE1("examplePath", example_set_path, NULL, RSRC_CONF, "The path to whatever"),
+    AP_INIT_TAKE2("exampleAction", example_set_action, NULL, RSRC_CONF, "Special action value!"),
+    { NULL }
+};
+ + + +

+Directives structure
+As you can see, each directive needs at least 5 parameters set: +

+
    +
  1. AP_INIT_TAKE1: This is a macro that tells the server that this directive takes one and only one argument. +If we required two arguments, we could use the macro AP_INIT_TAKE2 and so on (refer to http_config.h +for more macros).
  2. exampleEnabled: This is the name of our directive. More precisely, it is what the user must put in their +configuration in order to invoke a configuration change in our module.
  3. example_set_enabled: This is a reference to a C function that parses the directive and sets the configuration +accordingly. We will discuss how to make this in the following paragraph.
  4. RSRC_CONF: This tells the server where the directive is permitted. We'll go into details on this value in the +later chapters, but for now, RSRC_CONF means that the server will only accept these directives in a server context.
  5. "Enable or disable....": This is simply a brief description of what the directive does.
+

+(The "missing" parameter in our definition, which is usually set to +NULL, is an optional pointer to extra data that the server passes along to the +directive handler function (available there as cmd->info). This is useful when one function +handles several directives. In a simple module such as ours it is +omitted.) +
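To make the role of the directive table more concrete, here is a minimal standalone sketch of a table-driven lookup. The names `mini_directive`, `mini_directives` and `mini_dispatch` are hypothetical and much simpler than the server's real structures. The sketch only illustrates the idea of a NULL-terminated table that maps a directive name to a handler function:

```c
#include <stddef.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical, simplified stand-in for a directive table entry. */
typedef const char *(*directive_fn)(const char *arg);

typedef struct {
    const char   *name;        /* what the user writes in the config file */
    directive_fn  handler;     /* function that parses the argument */
    const char   *description; /* short help text */
} mini_directive;

static int mini_enabled = 0;

static const char *mini_set_enabled(const char *arg)
{
    mini_enabled = !strcasecmp(arg, "on");
    return NULL; /* NULL signals success */
}

static const mini_directive mini_directives[] = {
    { "exampleEnabled", mini_set_enabled, "Enable or disable the example" },
    { NULL, NULL, NULL } /* terminator entry */
};

/* Look up a directive by name (case-insensitively) and run its handler.
 * Returns NULL on success or an error message. */
static const char *mini_dispatch(const char *name, const char *arg)
{
    const mini_directive *d;

    for (d = mini_directives; d->name != NULL; d++) {
        if (!strcasecmp(d->name, name)) {
            return d->handler(arg);
        }
    }
    return "unknown directive";
}
```

The terminator entry plays the same role as the { NULL } element in example_directives: it tells the lookup loop where the table ends.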

+ +

The directive handler function

+

+Now that we have told the server to expect some directives for our module, it's +time to make a few functions for handling these. What the server reads in the +configuration file(s) is text, and so naturally, what it passes along to +our directive handler is one or more strings that we ourselves need to +recognize and act upon. You'll notice that since we set our +exampleAction directive to accept two arguments, its C function also +has an additional parameter defined:

+ + +
/* Handler for the "exampleEnabled" directive */
+const char *example_set_enabled(cmd_parms *cmd, void *cfg, const char *arg)
+{
+    if(!strcasecmp(arg, "on")) config.enabled = 1;
+    else config.enabled = 0;
+    return NULL;
+}
+
+/* Handler for the "examplePath" directive */
+const char *example_set_path(cmd_parms *cmd, void *cfg, const char *arg)
+{
+    config.path = arg;
+    return NULL;
+}
+
+/* Handler for the "exampleAction" directive */
+/* Let's pretend this one takes one argument (file or db), and a second (deny or allow), */
+/* and we store it in a bit-wise manner. */
+const char *example_set_action(cmd_parms *cmd, void *cfg, const char *arg1, const char *arg2)
+{
+    if(!strcasecmp(arg1, "file")) config.typeOfAction = 0x01;
+    else config.typeOfAction = 0x02;
+    
+    if(!strcasecmp(arg2, "deny")) config.typeOfAction += 0x10;
+    else config.typeOfAction += 0x20;
+    return NULL;
+}
+ + + + + +
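The bit-wise storage used by example_set_action can be read back just as easily. The following standalone sketch assumes hypothetical constant and function names (`ACTION_*`, `encode_action`, `action_source`, `action_policy`) and uses a bitwise OR where the handler above uses addition. For these values the result is the same:

```c
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical names for the values used by example_set_action. */
#define ACTION_SRC_FILE 0x01
#define ACTION_SRC_DB   0x02
#define ACTION_DENY     0x10
#define ACTION_ALLOW    0x20

/* Low nibble: source (file or db). High nibble: policy (deny or allow). */
static int encode_action(const char *source, const char *policy)
{
    int value = !strcasecmp(source, "file") ? ACTION_SRC_FILE : ACTION_SRC_DB;

    value |= !strcasecmp(policy, "deny") ? ACTION_DENY : ACTION_ALLOW;
    return value;
}

static const char *action_source(int value)
{
    return (value & 0x0f) == ACTION_SRC_FILE ? "file" : "db";
}

static const char *action_policy(int value)
{
    return (value & 0xf0) == ACTION_DENY ? "deny" : "allow";
}
```

With this layout, a directive such as "file allow" is stored as 0x21, which a handler printing the value with %x would show as 21.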

Putting it all together

+

+Now that we have our directives set up, and handlers configured for them, +we can assemble our module into one big file: +

+ + +
/* mod_example_config_simple.c: */
+#include <stdio.h>
+#include "apr_hash.h"
+#include "ap_config.h"
+#include "ap_provider.h"
+#include "httpd.h"
+#include "http_core.h"
+#include "http_config.h"
+#include "http_log.h"
+#include "http_protocol.h"
+#include "http_request.h"
+
+/*
+ ==============================================================================
+ Our configuration prototype and declaration:
+ ==============================================================================
+ */
+typedef struct {
+    int         enabled;      /* Enable or disable our module */
+    const char *path;         /* Some path to...something */
+    int         typeOfAction; /* 1 means action A, 2 means action B and so on */
+} example_config;
+
+static example_config config;
+
+/*
+ ==============================================================================
+ Our directive handlers:
+ ==============================================================================
+ */
+/* Handler for the "exampleEnabled" directive */
+const char *example_set_enabled(cmd_parms *cmd, void *cfg, const char *arg)
+{
+    if(!strcasecmp(arg, "on")) config.enabled = 1;
+    else config.enabled = 0;
+    return NULL;
+}
+
+/* Handler for the "examplePath" directive */
+const char *example_set_path(cmd_parms *cmd, void *cfg, const char *arg)
+{
+    config.path = arg;
+    return NULL;
+}
+
+/* Handler for the "exampleAction" directive */
+/* Let's pretend this one takes one argument (file or db), and a second (deny or allow), */
+/* and we store it in a bit-wise manner. */
+const char *example_set_action(cmd_parms *cmd, void *cfg, const char *arg1, const char *arg2)
+{
+    if(!strcasecmp(arg1, "file")) config.typeOfAction = 0x01;
+    else config.typeOfAction = 0x02;
+    
+    if(!strcasecmp(arg2, "deny")) config.typeOfAction += 0x10;
+    else config.typeOfAction += 0x20;
+    return NULL;
+}
+
+/*
+ ==============================================================================
+ The directive structure for our name tag:
+ ==============================================================================
+ */
+static const command_rec        example_directives[] =
+{
+    AP_INIT_TAKE1("exampleEnabled", example_set_enabled, NULL, RSRC_CONF, "Enable or disable mod_example"),
+    AP_INIT_TAKE1("examplePath", example_set_path, NULL, RSRC_CONF, "The path to whatever"),
+    AP_INIT_TAKE2("exampleAction", example_set_action, NULL, RSRC_CONF, "Special action value!"),
+    { NULL }
+};
+/*
+ ==============================================================================
+ Our module handler:
+ ==============================================================================
+ */
+static int example_handler(request_rec *r)
+{
+    if(!r->handler || strcmp(r->handler, "example-handler")) return(DECLINED);
+    ap_set_content_type(r, "text/plain");
+    ap_rprintf(r, "Enabled: %u\n", config.enabled);
+    ap_rprintf(r, "Path: %s\n", config.path);
+    ap_rprintf(r, "TypeOfAction: %x\n", config.typeOfAction);
+    return OK;
+}
+
+/*
+ ==============================================================================
+ The hook registration function (also initializes the default config values):
+ ==============================================================================
+ */
+static void register_hooks(apr_pool_t *pool) 
+{
+    config.enabled = 1;
+    config.path = "/foo/bar";
+    config.typeOfAction = 3;
+    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
+}
+/*
+ ==============================================================================
+ Our module name tag:
+ ==============================================================================
+ */
+module AP_MODULE_DECLARE_DATA   example_module =
+{
+    STANDARD20_MODULE_STUFF,
+    NULL,               /* Per-directory configuration handler */
+    NULL,               /* Merge handler for per-directory configurations */
+    NULL,               /* Per-server configuration handler */
+    NULL,               /* Merge handler for per-server configurations */
+    example_directives, /* Any directives we may have for httpd */
+    register_hooks      /* Our hook registering function */
+};
+ + + + +

+In our httpd.conf file, we can now change the hard-coded configuration by +adding a few lines: +

+
ExampleEnabled On
+ExamplePath "/usr/bin/foo"
+ExampleAction file allow
+ +

+And thus we apply the configuration, visit /example on our +web site, and we see the configuration has adapted to what we wrote in our +configuration file. +

+ + + +
+
+

Context aware configurations

+

Introduction to context aware configurations

+

+In Apache HTTP Server 2.4, different URLs, virtual hosts, directories, etc. can have very +different meanings to the user of the server, and thus different contexts +within which modules must operate. For example, let's assume you have this +configuration set up for mod_rewrite: +

+
<Directory "/var/www">
+    RewriteCond "%{HTTP_HOST}" "^example.com$"
+    RewriteRule "(.*)" "http://www.example.com/$1"
+</Directory>
+<Directory "/var/www/sub">
+    RewriteRule "^foobar$" "index.php?foobar=true"
+</Directory>
+ +

+In this example, you will have set up two different contexts for +mod_rewrite:

+
    +
  1. Inside /var/www, all requests for http://example.com must go to http://www.example.com
  2. Inside /var/www/sub, all requests for foobar must go to index.php?foobar=true
+

+If mod_rewrite (or the entire server for that matter) wasn't context aware, then +these rewrite rules would apply to each and every request made, +regardless of where and how it was made. Since the module can pull +the context-specific configuration straight from the server, it does not need +to work out for itself which of the directives are valid in this context; +the server takes care of this.

+ +

+So how does a module get the specific configuration for the server, +directory or location in question? It does so by making one simple call: +

+ + +
example_config *config = (example_config*) ap_get_module_config(r->per_dir_config, &example_module);
+ + + +

+That's it! Of course, a whole lot goes on behind the scenes, which we will +discuss in this chapter, starting with how the server came to know what our +configuration looks like, and how it came to be set up as it is in the +specific context. +
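Conceptually, such a lookup can be thought of as indexing a per-context array of opaque pointers with a number that identifies the module. The following standalone sketch illustrates that idea only. The types and functions (`mini_module`, `mini_conf_vector`, `mini_get_module_config`, `mini_set_module_config`) are hypothetical and are not the server's actual data structures:

```c
#include <stddef.h>

/* Hypothetical, simplified model: each module gets a unique index, and every
 * context holds one opaque pointer per module. */
typedef struct {
    const char *name;
    int         index; /* slot assigned when the module is loaded */
} mini_module;

#define MINI_MAX_MODULES 8

typedef struct {
    void *slots[MINI_MAX_MODULES];
} mini_conf_vector;

static void mini_set_module_config(mini_conf_vector *v, const mini_module *m, void *cfg)
{
    v->slots[m->index] = cfg;
}

static void *mini_get_module_config(const mini_conf_vector *v, const mini_module *m)
{
    return v->slots[m->index];
}
```

Because the slot holds a void pointer, the caller casts the result back to its own configuration type, just as the call above casts to example_config.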

+ + +

Our basic configuration setup

+

In this chapter, we will be working with a slightly modified version of +our previous configuration structure. We will add a context +variable that we can use to track which context configuration is being +used by the server in various places: +

+ +
typedef struct {
+    char        context[256];
+    char        path[256];
+    int         typeOfAction;
+    int         enabled;
+} example_config;
+ + + +

Our handler for requests will also be modified, yet still very simple:

+ + + +
static int example_handler(request_rec *r)
+{
+    if(!r->handler || strcmp(r->handler, "example-handler")) return(DECLINED);
+    example_config *config = (example_config*) ap_get_module_config(r->per_dir_config, &example_module);
+    ap_set_content_type(r, "text/plain");
+    ap_rprintf(r, "Enabled: %u\n", config->enabled);
+    ap_rprintf(r, "Path: %s\n", config->path);
+    ap_rprintf(r, "TypeOfAction: %x\n", config->typeOfAction);
+    ap_rprintf(r, "Context: %s\n", config->context);
+    return OK;
+}
+ + + + + +

Choosing a context

+

+Before we can start making our module context aware, we must first define +which contexts we will accept. As we saw in the previous chapter, defining +a directive required five elements to be set:

+ + + +
AP_INIT_TAKE1("exampleEnabled", example_set_enabled, NULL, RSRC_CONF, "Enable or disable mod_example"),
+ + + + +

The RSRC_CONF definition told the server that we would only allow +this directive in a global server context, but since we are now trying out +a context aware version of our module, we should set this to something +more lenient, namely the value ACCESS_CONF, which lets us use +the directive inside <Directory> and <Location> blocks. For more +control over the placement of your directives, you can combine the following +restrictions together to form a specific rule: +

+
    +
  • RSRC_CONF: Allow in .conf files (not .htaccess) outside <Directory> or <Location>
  • ACCESS_CONF: Allow in .conf files (not .htaccess) inside <Directory> or <Location>
  • OR_OPTIONS: Allow in .conf files and .htaccess when AllowOverride Options is set
  • OR_FILEINFO: Allow in .conf files and .htaccess when AllowOverride FileInfo is set
  • OR_AUTHCFG: Allow in .conf files and .htaccess when AllowOverride AuthConfig is set
  • OR_INDEXES: Allow in .conf files and .htaccess when AllowOverride Indexes is set
  • OR_ALL: Allow anywhere in .conf files and .htaccess
+ + +
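Combining restrictions works like combining bit flags. The standalone sketch below uses made-up flag values (the `EX_` names and their numbers are illustrative assumptions, not the server's real constants) to show how a combined mask can be tested against the place where a directive appears:

```c
/* Illustrative flag values only; the real constants are defined in the
 * server's headers and may differ. */
enum {
    EX_OR_OPTIONS  = 1 << 0,
    EX_OR_FILEINFO = 1 << 1,
    EX_ACCESS_CONF = 1 << 2,
    EX_RSRC_CONF   = 1 << 3
};

/* A directive is permitted when the place it appears in (exactly one flag)
 * is among the places its definition allows (a combination of flags). */
static int directive_permitted(int allowed_mask, int current_context)
{
    return (allowed_mask & current_context) != 0;
}
```

In a real directive table, the combined value would go into the fourth argument of the AP_INIT_* macro, for example RSRC_CONF|ACCESS_CONF to allow a directive both outside and inside <Directory> or <Location> blocks.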

Using the server to allocate configuration slots

+

A much smarter way to manage your configurations is by letting the server +help you create them. To do so, we must first change our +name tag to let the server know that it should assist us in creating +and managing our configurations. Since we have chosen the per-directory +(or per-location) context for our module configurations, we'll add a +per-directory creator and merger function reference in our tag:

+ + +
module AP_MODULE_DECLARE_DATA   example_module =
+{
+    STANDARD20_MODULE_STUFF,
+    create_dir_conf, /* Per-directory configuration handler */
+    merge_dir_conf,  /* Merge handler for per-directory configurations */
+    NULL,            /* Per-server configuration handler */
+    NULL,            /* Merge handler for per-server configurations */
+    directives,      /* Any directives we may have for httpd */
+    register_hooks   /* Our hook registering function */
+};
+ + + + + + + +

Creating new context configurations

+

+Now that we have told the server to help us create and manage configurations, +our first step is to make a function for creating new, blank +configurations. We do so by creating the function we just referenced in +our name tag as the Per-directory configuration handler:

+ +
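void *create_dir_conf(apr_pool_t *pool, char *context) {
+    context = context ? context : "(undefined context)";
+    example_config *cfg = apr_pcalloc(pool, sizeof(example_config));
+    if(cfg) {
+        /* Set some default values */
+        /* context and path are fixed-size arrays, so copy with a length limit (apr_cpystrn is declared in apr_strings.h) */
+        apr_cpystrn(cfg->context, context, sizeof(cfg->context));
+        cfg->enabled = 0;
+        apr_cpystrn(cfg->path, "/foo/bar", sizeof(cfg->path));
+        cfg->typeOfAction = 0x11;
+    }
+    return cfg;
+}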
+ + + + + + +
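Because context and path are fixed-size character arrays, any copy into them has to respect their 256-byte limit. APR provides apr_cpystrn for this purpose. The standalone helper below, with the hypothetical name `copy_bounded`, is only a sketch of the same idea for readers who want to see what a bounded copy involves:

```c
#include <stddef.h>

/* Hypothetical helper: copy src into dst, never writing more than dst_size
 * bytes and always NUL-terminating. Returns 1 if src fit, 0 if truncated. */
static int copy_bounded(char *dst, const char *src, size_t dst_size)
{
    size_t i;

    if (dst_size == 0) {
        return 0;
    }
    for (i = 0; i + 1 < dst_size && src[i] != '\0'; i++) {
        dst[i] = src[i];
    }
    dst[i] = '\0';
    return src[i] == '\0';
}
```

Reporting truncation lets a caller decide whether a shortened value is acceptable or should be rejected as a configuration error.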

Merging configurations

+

+Our next step in creating a context aware configuration is merging +configurations. This part of the process particularly applies to scenarios +where you have a parent configuration and a child, such as the following: +

+
<Directory "/var/www">
+    ExampleEnabled On
+    ExamplePath "/foo/bar"
+    ExampleAction file allow
+</Directory>
+<Directory "/var/www/subdir">
+    ExampleAction file deny
+</Directory>
+ +

+In this example, it is natural to assume that the directory +/var/www/subdir should inherit the values set for the /var/www + directory, as we did not specify an ExampleEnabled or +an ExamplePath for this directory. The server does not presume to +know if this is true, but cleverly does the following: +

+
    +
  1. Creates a new configuration for /var/www
  2. Sets the configuration values according to the directives given for /var/www
  3. Creates a new configuration for /var/www/subdir
  4. Sets the configuration values according to the directives given for /var/www/subdir
  5. Proposes a merge of the two configurations into a new configuration for /var/www/subdir
+

+This proposal is handled by the merge_dir_conf function we +referenced in our name tag. The purpose of this function is to assess the +two configurations and decide how they are to be merged:

+ + + +
void *merge_dir_conf(apr_pool_t *pool, void *BASE, void *ADD) {
+    example_config *base = (example_config *) BASE ; /* This is what was set in the parent context */
+    example_config *add = (example_config *) ADD ;   /* This is what is set in the new context */
+    example_config *conf = (example_config *) create_dir_conf(pool, "Merged configuration"); /* This will be the merged configuration */
+    
+    /* Merge configurations */
+    conf->enabled = ( add->enabled == 0 ) ? base->enabled : add->enabled ;
+    conf->typeOfAction = add->typeOfAction ? add->typeOfAction : base->typeOfAction;
+    strcpy(conf->path, strlen(add->path) ? add->path : base->path);
+    
+    return conf ;
+}
+ + + + + + +

Trying out our new context aware configurations

+

+Now, let's try putting it all together to create a new module that is +context aware. First off, we'll create a configuration that lets us test +how the module works: +

+
<Location "/a">
+    SetHandler example-handler
+    ExampleEnabled on
+    ExamplePath "/foo/bar"
+    ExampleAction file allow
+</Location>
+
+<Location "/a/b">
+    ExampleAction file deny
+    ExampleEnabled off
+</Location>
+
+<Location "/a/b/c">
+    ExampleAction db deny
+    ExamplePath "/foo/bar/baz"
+    ExampleEnabled on
+</Location>
+ +
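To see what to expect from this test configuration, the merge rule from merge_dir_conf can be replayed by hand. The standalone sketch below assumes that unset fields start out as zero or empty and that the three blocks are merged in order from /a to /a/b/c. The `mini_config` type, `mini_merge` and the `loc_*` values are hypothetical stand-ins for the module's own data:

```c
#include <string.h>

typedef struct {
    char path[256];
    int  typeOfAction;
    int  enabled;
} mini_config;

/* Same rule as merge_dir_conf: a zero or empty value in the child means
 * "not set here", so the parent's value is inherited. */
static mini_config mini_merge(const mini_config *base, const mini_config *add)
{
    mini_config conf;

    conf.enabled      = (add->enabled == 0) ? base->enabled : add->enabled;
    conf.typeOfAction = add->typeOfAction ? add->typeOfAction : base->typeOfAction;
    strcpy(conf.path, strlen(add->path) ? add->path : base->path);
    return conf;
}

/* Values each Location block sets on its own, before any merging. */
static const mini_config loc_a   = { "/foo/bar",     0x21, 1 }; /* file allow, on  */
static const mini_config loc_ab  = { "",             0x11, 0 }; /* file deny,  off */
static const mini_config loc_abc = { "/foo/bar/baz", 0x12, 1 }; /* db deny,    on  */
```

Under this rule, merging loc_a with loc_ab yields the path /foo/bar and the action 0x11, but enabled stays 1: since 0 doubles as "not set", ExampleEnabled off cannot override a parent that is on. A module that needs to distinguish "off" from "unset" would have to use a separate sentinel value.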

+Then we'll assemble our module code. Note that since we are now using our +name tag as a reference when fetching configurations in our handler, I have +added some prototypes to keep the compiler happy: +

+ + +
/*$6
+ +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ * mod_example_config.c
+ +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+ */
+
+
+#include <stdio.h>
+#include "apr_hash.h"
+#include "ap_config.h"
+#include "ap_provider.h"
+#include "httpd.h"
+#include "http_core.h"
+#include "http_config.h"
+#include "http_log.h"
+#include "http_protocol.h"
+#include "http_request.h"
+
+/*$1
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    Configuration structure
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ */
+
+typedef struct
+{
+    char    context[256];
+    char    path[256];
+    int     typeOfAction;
+    int     enabled;
+} example_config;
+
+/*$1
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    Prototypes
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ */
+
+static int    example_handler(request_rec *r);
+const char    *example_set_enabled(cmd_parms *cmd, void *cfg, const char *arg);
+const char    *example_set_path(cmd_parms *cmd, void *cfg, const char *arg);
+const char    *example_set_action(cmd_parms *cmd, void *cfg, const char *arg1, const char *arg2);
+void          *create_dir_conf(apr_pool_t *pool, char *context);
+void          *merge_dir_conf(apr_pool_t *pool, void *BASE, void *ADD);
+static void   register_hooks(apr_pool_t *pool);
+
+/*$1
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    Configuration directives
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ */
+
+static const command_rec    directives[] =
+{
+    AP_INIT_TAKE1("exampleEnabled", example_set_enabled, NULL, ACCESS_CONF, "Enable or disable mod_example"),
+    AP_INIT_TAKE1("examplePath", example_set_path, NULL, ACCESS_CONF, "The path to whatever"),
+    AP_INIT_TAKE2("exampleAction", example_set_action, NULL, ACCESS_CONF, "Special action value!"),
+    { NULL }
+};
+
+/*$1
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    Our name tag
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ */
+
+module AP_MODULE_DECLARE_DATA    example_module =
+{
+    STANDARD20_MODULE_STUFF,
+    create_dir_conf,    /* Per-directory configuration handler */
+    merge_dir_conf,     /* Merge handler for per-directory configurations */
+    NULL,               /* Per-server configuration handler */
+    NULL,               /* Merge handler for per-server configurations */
+    directives,         /* Any directives we may have for httpd */
+    register_hooks      /* Our hook registering function */
+};
+
+/*
+ =======================================================================================================================
+    Hook registration function
+ =======================================================================================================================
+ */
+static void register_hooks(apr_pool_t *pool)
+{
+    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
+}
+
+/*
+ =======================================================================================================================
+    Our example web service handler
+ =======================================================================================================================
+ */
+static int example_handler(request_rec *r)
+{
+    if(!r->handler || strcmp(r->handler, "example-handler")) return(DECLINED);
+
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+    example_config    *config = (example_config *) ap_get_module_config(r->per_dir_config, &example_module);
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+
+    ap_set_content_type(r, "text/plain");
+    ap_rprintf(r, "Enabled: %u\n", config->enabled);
+    ap_rprintf(r, "Path: %s\n", config->path);
+    ap_rprintf(r, "TypeOfAction: %x\n", config->typeOfAction);
+    ap_rprintf(r, "Context: %s\n", config->context);
+    return OK;
+}
+
+/*
+ =======================================================================================================================
+    Handler for the "exampleEnabled" directive
+ =======================================================================================================================
+ */
+const char *example_set_enabled(cmd_parms *cmd, void *cfg, const char *arg)
+{
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+    example_config    *conf = (example_config *) cfg;
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+
+    if(conf)
+    {
+        if(!strcasecmp(arg, "on"))
+            conf->enabled = 1;
+        else
+            conf->enabled = 0;
+    }
+
+    return NULL;
+}
+
+/*
+ =======================================================================================================================
+    Handler for the "examplePath" directive
+ =======================================================================================================================
+ */
+const char *example_set_path(cmd_parms *cmd, void *cfg, const char *arg)
+{
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+    example_config    *conf = (example_config *) cfg;
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+
+    if(conf)
+    {
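+        /* path is a fixed-size array: limit the copy and keep it NUL-terminated */
+        strncpy(conf->path, arg, sizeof(conf->path) - 1);
+        conf->path[sizeof(conf->path) - 1] = '\0';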
+    }
+
+    return NULL;
+}
+
+/*
+ =======================================================================================================================
+    Handler for the "exampleAction" directive ;
+    Let's pretend this one takes one argument (file or db), and a second (deny or allow), ;
+    and we store it in a bit-wise manner.
+ =======================================================================================================================
+ */
+const char *example_set_action(cmd_parms *cmd, void *cfg, const char *arg1, const char *arg2)
+{
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+    example_config    *conf = (example_config *) cfg;
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+
+    if(conf)
+    {
+        {
+            if(!strcasecmp(arg1, "file"))
+                conf->typeOfAction = 0x01;
+            else
+                conf->typeOfAction = 0x02;
+            if(!strcasecmp(arg2, "deny"))
+                conf->typeOfAction += 0x10;
+            else
+                conf->typeOfAction += 0x20;
+        }
+    }
+
+    return NULL;
+}
+
+/*
+ =======================================================================================================================
+    Function for creating new configurations for per-directory contexts
+ =======================================================================================================================
+ */
+void *create_dir_conf(apr_pool_t *pool, char *context)
+{
+    context = context ? context : "Newly created configuration";
+
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+    example_config    *cfg = apr_pcalloc(pool, sizeof(example_config));
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+
+    if(cfg)
+    {
+        {
+            /* Set some default values */
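+            /* apr_pcalloc zeroed the struct, so a copy limited to size - 1 stays NUL-terminated */
+            strncpy(cfg->context, context, sizeof(cfg->context) - 1);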
+            cfg->enabled = 0;
+            memset(cfg->path, 0, 256);
+            cfg->typeOfAction = 0x00;
+        }
+    }
+
+    return cfg;
+}
+
+/*
+ =======================================================================================================================
+    Merging function for configurations
+ =======================================================================================================================
+ */
+void *merge_dir_conf(apr_pool_t *pool, void *BASE, void *ADD)
+{
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+    example_config    *base = (example_config *) BASE;
+    example_config    *add = (example_config *) ADD;
+    example_config    *conf = (example_config *) create_dir_conf(pool, "Merged configuration");
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+
+    conf->enabled = (add->enabled == 0) ? base->enabled : add->enabled;
+    conf->typeOfAction = add->typeOfAction ? add->typeOfAction : base->typeOfAction;
+    strcpy(conf->path, strlen(add->path) ? add->path : base->path);
+    return conf;
+}
+ + + + + + + +
+
+

Summing up

+

+We have now looked at how to create simple modules for Apache HTTP Server 2.4 and +configure them. What you do next is entirely up to you, but it is my +hope that something valuable has come out of reading this documentation. +If you have questions on how to further develop modules, you are welcome +to join our mailing lists +or check out the rest of our documentation for further tips. +

+
+
+

Some useful snippets of code

+ +

Retrieve variables from POST form data

+ + + +
typedef struct {
+    const char *key;
+    const char *value;
+} keyValuePair;
+
+keyValuePair *readPost(request_rec *r) {
+    apr_array_header_t *pairs = NULL;
+    apr_off_t len;
+    apr_size_t size;
+    int res;
+    int i = 0;
+    char *buffer;
+    keyValuePair *kvp;
+
+    res = ap_parse_form_data(r, NULL, &pairs, -1, HUGE_STRING_LEN);
+    if (res != OK || !pairs) return NULL; /* Return NULL if we failed or if there is no POST data */
+    kvp = apr_pcalloc(r->pool, sizeof(keyValuePair) * (pairs->nelts + 1));
+    while (pairs && !apr_is_empty_array(pairs)) {
+        ap_form_pair_t *pair = (ap_form_pair_t *) apr_array_pop(pairs);
+        apr_brigade_length(pair->value, 1, &len);
+        size = (apr_size_t) len;
+        buffer = apr_palloc(r->pool, size + 1);
+        apr_brigade_flatten(pair->value, buffer, &size);
+        buffer[len] = 0;
+        kvp[i].key = apr_pstrdup(r->pool, pair->name);
+        kvp[i].value = buffer;
+        i++;
+    }
+    return kvp;
+}
+
+static int example_handler(request_rec *r)
+{
+    /*~~~~~~~~~~~~~~~~~~~~~~*/
+    keyValuePair *formData;
+    /*~~~~~~~~~~~~~~~~~~~~~~*/
+
+    formData = readPost(r);
+    if (formData) {
+        int i;
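+        /* readPost() terminates the array with an entry whose key and value are both NULL */
+        for (i = 0; formData[i].key || formData[i].value; i++) {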
+            if (formData[i].key && formData[i].value) {
+                ap_rprintf(r, "%s = %s\n", formData[i].key, formData[i].value);
+            } else if (formData[i].key) {
+                ap_rprintf(r, "%s\n", formData[i].key);
+            } else if (formData[i].value) {
+                ap_rprintf(r, "= %s\n", formData[i].value);
+            } else {
+                break;
+            }
+        }
+    }
+    return OK;
+}
+ + + + + + +

Printing out every HTTP header received

+ + + +
static int example_handler(request_rec *r)
+{
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+    const apr_array_header_t    *fields;
+    int                         i;
+    apr_table_entry_t           *e = 0;
+    /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+
+    fields = apr_table_elts(r->headers_in);
+    e = (apr_table_entry_t *) fields->elts;
+    for(i = 0; i < fields->nelts; i++) {
+        ap_rprintf(r, "%s: %s\n", e[i].key, e[i].val);
+    }
+    return OK;
+}
+ + + + + + +

Reading the request body into memory

+ + + +
static int util_read(request_rec *r, const char **rbuf, apr_off_t *size)
+{
+    /*~~~~~~~~*/
+    int rc = OK;
+    /*~~~~~~~~*/
+
+    if((rc = ap_setup_client_block(r, REQUEST_CHUNKED_ERROR))) {
+        return(rc);
+    }
+
+    if(ap_should_client_block(r)) {
+
+        /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+        char         argsbuffer[HUGE_STRING_LEN];
+        apr_off_t    rsize, len_read, rpos = 0;
+        apr_off_t length = r->remaining;
+        /*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/
+
+        *rbuf = (const char *) apr_pcalloc(r->pool, (apr_size_t) (length + 1));
+        *size = length;
+        while((len_read = ap_get_client_block(r, argsbuffer, sizeof(argsbuffer))) > 0) {
+            if((rpos + len_read) > length) {
+                rsize = length - rpos;
+            }
+            else {
+                rsize = len_read;
+            }
+
+            memcpy((char *) *rbuf + rpos, argsbuffer, (size_t) rsize);
+            rpos += rsize;
+        }
+    }
+    return(rc);
+}
+
+static int example_handler(request_rec *r) 
+{
+    /*~~~~~~~~~~~~~~~~*/
+    apr_off_t   size;
+    const char  *buffer;
+    /*~~~~~~~~~~~~~~~~*/
+
+    if(util_read(r, &buffer, &size) == OK) {
+        ap_rprintf(r, "We read a request body that was %" APR_OFF_T_FMT " bytes long", size);
+    }
+    return OK;
+}
+ + + + + + + +
+
+

Available Languages:  en 

+

Comments

Notice:
This is not a Q&A section. Comments placed here should be pointed towards suggestions on improving the documentation or server, and may be removed by our moderators if they are either implemented or considered invalid/off-topic. Questions on how to manage the Apache HTTP Server should be directed at either our IRC channel, #httpd, on Libera.chat, or sent to our mailing lists.
+
+ \ No newline at end of file diff --git a/docs/manual/developer/modules.html b/docs/manual/developer/modules.html new file mode 100644 index 0000000..ebc705b --- /dev/null +++ b/docs/manual/developer/modules.html @@ -0,0 +1,9 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: modules.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 + +URI: modules.html.ja.utf8 +Content-Language: ja +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/modules.html.en b/docs/manual/developer/modules.html.en new file mode 100644 index 0000000..fb7ccef --- /dev/null +++ b/docs/manual/developer/modules.html.en @@ -0,0 +1,306 @@ + + + + + +Converting Modules from Apache 1.3 to Apache 2.0 - Apache HTTP Server Version 2.4 + + + + + + + +
+

Converting Modules from Apache 1.3 to Apache 2.0

+
+

Available Languages:  en  | + ja 

+
+ +

This is a first attempt at writing the lessons I learned + when trying to convert the mod_mmap_static module to Apache + 2.0. It's by no means definitive and probably won't even be + correct in some ways, but it's a start.

+
+ +
+
+

The easier changes ...

+ +

Cleanup Routines

+

These now need to be of type apr_status_t and return a + value of that type. Normally the return value will be + APR_SUCCESS unless there is some need to signal an error in + the cleanup. Be aware that even if you signal an error, not all code + yet checks and acts upon it.
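As a rough illustration of the shape such a cleanup takes, here is a minimal sketch. It assumes stand-in definitions for apr_status_t and APR_SUCCESS so that it compiles without APR; example_state and example_cleanup are hypothetical names, and a real module would include the APR headers and register the function with apr_pool_cleanup_register().

```c
#include <stdio.h>

/* Stand-ins so the sketch compiles without APR; real code would
 * include <apr_pools.h> and use the library's own definitions. */
typedef int apr_status_t;
#define APR_SUCCESS 0

/* Hypothetical per-module state released by the cleanup. */
typedef struct {
    FILE *fp;
} example_state;

/* A 2.x-style cleanup: takes an opaque pointer and returns a status. */
static apr_status_t example_cleanup(void *data)
{
    example_state *st = data;

    if (st->fp != NULL) {
        int rc = fclose(st->fp);
        st->fp = NULL;
        if (rc != 0) {
            return 1;   /* any value other than APR_SUCCESS signals an error */
        }
    }
    return APR_SUCCESS;
}
```

Calling the cleanup a second time is harmless in this sketch because the pointer is cleared after the first call.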

+ + +

Initialisation Routines

+

These should now be renamed to better signify where they sit + in the overall process. So the name gets a small change from + mmap_init to mmap_post_config. The arguments + passed have undergone a radical change and now look like

+ +
    +
  • apr_pool_t *p
  • +
  • apr_pool_t *plog
  • +
  • apr_pool_t *ptemp
  • +
  • server_rec *s
  • +
+ + +
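Put together, the renamed routine might look like the sketch below. The opaque typedefs and the OK constant are stand-ins so that the fragment compiles on its own, the int return type is the form used by released 2.x versions, and the call counter is purely illustrative.

```c
/* Opaque stand-ins; real modules get these from httpd.h and apr_pools.h. */
typedef struct apr_pool_t apr_pool_t;
typedef struct server_rec server_rec;
#define OK 0

/* Illustrative only: counts how often the hook has run. */
static int post_config_calls = 0;

/* Formerly mmap_init; now receives three pools and the server record. */
static int mmap_post_config(apr_pool_t *p, apr_pool_t *plog,
                            apr_pool_t *ptemp, server_rec *s)
{
    (void)p; (void)plog; (void)ptemp; (void)s;   /* unused in this sketch */
    post_config_calls++;
    return OK;
}
```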

Data Types

+

A lot of the data types have been moved into the APR. This means that some have had + a name change, such as the one shown above. The following is a brief + list of some of the changes that you are likely to have to make.

+ +
    +
  • pool becomes apr_pool_t
  • +
  • table becomes apr_table_t
  • +
+ +
+
+

The messier changes...

+ +

Register Hooks

+

The new architecture uses a series of hooks to provide for + calling your functions. These you'll need to add to your module + by way of a new function, static void register_hooks(void). + The function is reasonably straightforward once you + understand what needs to be done. Each function that needs + calling at some stage in the processing of a request needs to + be registered; handlers do not. There are a number of phases + where functions can be added, and for each you can specify with + a high degree of control the relative order in which the function + will be called.

+ +

This is the code that was added to mod_mmap_static:

+
static void register_hooks(void)
+{
+    /* Modules whose translate_name hook must run before ours */
+    static const char * const aszPre[] = { "http_core.c", NULL };
+
+    ap_hook_post_config(mmap_post_config, NULL, NULL, HOOK_MIDDLE);
+    ap_hook_translate_name(mmap_static_xlat, aszPre, NULL, HOOK_LAST);
+}
+ +

This registers two functions that need to be called: one in + the post_config stage (virtually every module will need this + one) and one for the translate_name phase. Note that while + the function names differ, the format of each is + identical. So what is the format?

+ +

+ ap_hook_phase_name(function_name, + predecessors, successors, position); +

+ +

There are 3 hook positions defined...

+ +
    +
  • HOOK_FIRST
  • +
  • HOOK_MIDDLE
  • +
  • HOOK_LAST
  • +
+ +

To define the position you use the position and then modify + it with the predecessors and successors. Each of the modifiers + can be a list of functions that should be called, either before + the function is run (predecessors) or after the function has + run (successors).

+ +

In the mod_mmap_static case I didn't care about the + post_config stage, but the mmap_static_xlat + must be called after the core module had done its name + translation, hence the use of the aszPre to define a modifier to the + position HOOK_LAST.
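To make the interplay of position and predecessors concrete, the following self-contained sketch orders a handful of hook registrations. It is not the server's own sorting code: hook_entry, sort_hooks and the numeric values of the positions are assumptions made for illustration, and successors are left out for brevity. (Released 2.x versions spell the positions with an APR_ prefix, for example APR_HOOK_LAST.)

```c
#include <stddef.h>
#include <string.h>

/* Illustrative positions: lower values are called earlier. */
enum { HOOK_FIRST = 0, HOOK_MIDDLE = 10, HOOK_LAST = 20 };

#define MAX_HOOKS 8

typedef struct {
    const char *name;          /* source file that registered the hook */
    int position;              /* HOOK_FIRST, HOOK_MIDDLE or HOOK_LAST */
    const char *const *pre;    /* NULL-terminated predecessor list, or NULL */
} hook_entry;

/* Is a module of this name among the registrations at all? */
static int is_registered(const char *name, const hook_entry *in, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(in[i].name, name) == 0) return 1;
    }
    return 0;
}

/* Has a module of this name already been placed in the call order? */
static int is_placed(const char *name, const hook_entry **out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(out[i]->name, name) == 0) return 1;
    }
    return 0;
}

/* Fill out[] with pointers into in[] in call order. Each step picks the
 * earliest-position entry whose registered predecessors are all placed.
 * Returns the number of entries placed (less than n on a cycle). */
static size_t sort_hooks(const hook_entry *in, size_t n, const hook_entry **out)
{
    int used[MAX_HOOKS] = {0};
    size_t done = 0;

    if (n > MAX_HOOKS) return 0;
    while (done < n) {
        size_t best = n, i;
        for (i = 0; i < n; i++) {
            const char *const *p;
            int ready = 1;
            if (used[i]) continue;
            for (p = in[i].pre; p != NULL && *p != NULL; p++) {
                if (is_registered(*p, in, n) && !is_placed(*p, out, done)) {
                    ready = 0;
                    break;
                }
            }
            if (ready && (best == n || in[i].position < in[best].position)) {
                best = i;
            }
        }
        if (best == n) break;   /* unsatisfiable predecessors: give up */
        used[best] = 1;
        out[done++] = &in[best];
    }
    return done;
}
```

With mod_mmap_static.c and http_core.c both at HOOK_LAST, the predecessor list is what guarantees that the core translation runs first, regardless of registration order.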

+ + +

Module Definition

+

There are now a lot fewer stages to worry about when + creating your module definition. The old definition looked + like

+ +
module MODULE_VAR_EXPORT module_name_module =
+{
+    STANDARD_MODULE_STUFF,
+    /* initializer */
+    /* dir config creater */
+    /* dir merger --- default is to override */
+    /* server config */
+    /* merge server config */
+    /* command handlers */
+    /* handlers */
+    /* filename translation */
+    /* check_user_id */
+    /* check auth */
+    /* check access */
+    /* type_checker */
+    /* fixups */
+    /* logger */
+    /* header parser */
+    /* child_init */
+    /* child_exit */
+    /* post read-request */
+};
+ +

The new structure is a great deal simpler...

+
module MODULE_VAR_EXPORT module_name_module =
+{
+    STANDARD20_MODULE_STUFF,
+    /* create per-directory config structures */
+    /* merge per-directory config structures  */
+    /* create per-server config structures    */
+    /* merge per-server config structures     */
+    /* command handlers */
+    /* handlers */
+    /* register hooks */
+};
+ +

Some of these read directly across, some don't. I'll try to + summarise what should be done below.

+ +

The stages that read directly across :

+ +
+
/* dir config creater */
+
/* create per-directory config structures */
+ +
/* server config */
+
/* create per-server config structures */
+ +
/* dir merger */
+
/* merge per-directory config structures */
+ +
/* merge server config */
+
/* merge per-server config structures */
+ +
/* command handlers */
+
/* command handlers */
+ +
/* handlers */
+
/* handlers */
+
+ +

The remainder of the old functions should be registered as + hooks. There are the following hook stages defined so + far...

+ +
+
ap_hook_pre_config
+
do any setup required prior to processing configuration + directives
+ +
ap_hook_check_config
+
review configuration directive interdependencies
+ +
ap_hook_test_config
+
executes only with -t option
+ +
ap_hook_open_logs
+
open any specified logs
+ +
ap_hook_post_config
+
this is where the old _init routines get + registered
+ +
ap_hook_http_method
+
retrieve the http method from a request. (legacy)
+ +
ap_hook_auth_checker
+
check if the resource requires authorization
+ +
ap_hook_access_checker
+
check for module-specific restrictions
+ +
ap_hook_check_user_id
+
check the user-id and password
+ +
ap_hook_default_port
+
retrieve the default port for the server
+ +
ap_hook_pre_connection
+
do any setup required just before processing, but after + accepting
+ +
ap_hook_process_connection
+
run the correct protocol
+ +
ap_hook_child_init
+
call as soon as the child is started
+ +
ap_hook_create_request
+
called when a request is created, so that modules can set up their per-request configuration
+ +
ap_hook_fixups
+
last chance to modify things before generating content
+ +
ap_hook_handler
+
generate the content
+ +
ap_hook_header_parser
+
lets modules look at the headers, not used by most modules, because + they use post_read_request for this
+ +
ap_hook_insert_filter
+
to insert filters into the filter chain
+ +
ap_hook_log_transaction
+
log information about the request
+ +
ap_hook_optional_fn_retrieve
+
retrieve any functions registered as optional
+ +
ap_hook_post_read_request
+
called after reading the request, before any other phase
+ +
ap_hook_quick_handler
+
called before any request processing, used by cache modules.
+ +
ap_hook_translate_name
+
translate the URI into a filename
+ +
ap_hook_type_checker
+
determine and/or set the doc type
+
+ +
+
+

+
+ \ No newline at end of file diff --git a/docs/manual/developer/modules.html.ja.utf8 b/docs/manual/developer/modules.html.ja.utf8 new file mode 100644 index 0000000..097e6bc --- /dev/null +++ b/docs/manual/developer/modules.html.ja.utf8 @@ -0,0 +1,301 @@ + + + + + +モジュールの Apache 1.3 から Apache 2.0 への移植 - Apache HTTP サーバ バージョン 2.4 + + + + + + + +
+

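Porting Modules from Apache 1.3 to Apache 2.0

+
+

Available Languages:  en  | + ja 

+
+
This Japanese translation may already be out of date. + Please refer to the English version for recently updated content. +
+ 

This document is a first guide, written from the experience gained + when porting the mod_mmap_static module to Apache 2.0. It is far from complete + and may even be wrong in places, + but it should serve as a starting point.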

+
+ +
+
+

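The easier changes

+ 

Cleanup Routines

+

Cleanup routines must be of type apr_status_t + and must return a value of that type. + Unless an error needs to be signalled during the cleanup, the return value is normally + APR_SUCCESS. Be aware that even if you signal an error, + not all code checks the result + or acts upon it.

+ + + 

Initialisation Routines

+ 

Initialisation routines have been renamed to better reflect where they sit in the overall process. + So the name changes slightly, from mmap_init to mmap_post_config. + The arguments passed have changed radically and now look like this: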

+ +
    +
  • apr_pool_t *p
  • +
  • apr_pool_t *plog
  • +
  • apr_pool_t *ptemp
  • +
  • server_rec *s
  • +
+ + +

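Data Types

+

Most of the data types have been moved into the APR. This means that + some names have changed, as shown above. + The following is a brief list of the changes you will need to make.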

+ +
    +
  • pool becomes apr_pool_t
  • +
  • table becomes apr_table_t
  • +
+ +
+
+

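The messier changes...

+ 

Register Hooks

+

The new architecture uses a series of hooks + to call the functions you write. You need to add these hooks to your module + by way of a new function, static void register_hooks(void). + Once you understand what needs to be done, + this function is reasonably straightforward. + Functions that must be called at some stage of request processing + need to be registered; handlers do not. + There are many phases at which functions can be registered. + For each phase, you have a high degree of control over + the relative order in which functions are called.

+ 

This is the code that was added to mod_mmap_static: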

+ +
static void register_hooks(void)
+{
+    /* Modules whose translate_name hook must run before ours */
+    static const char * const aszPre[] = { "http_core.c", NULL };
+
+    ap_hook_post_config(mmap_post_config, NULL, NULL, HOOK_MIDDLE);
+    ap_hook_translate_name(mmap_static_xlat, aszPre, NULL, HOOK_LAST);
+}
+ +

This registers two functions that need to be called: one for the + post_config stage (virtually every module + needs this one) and one for the translate_name phase. + Note that although the function names differ, the format of each is the same. + So what is the format?

+ +

+ ap_hook_phase_name(function_name, + predecessors, successors, position); +

+ +

There are three hook positions defined...

+ +
    +
  • HOOK_FIRST
  • +
  • HOOK_MIDDLE
  • +
  • HOOK_LAST
  • +
+ +

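To define the position, you specify one of the positions above + and then modify it with the predecessors and successors. + Each modifier is a list of functions that should be called: + predecessors are called before the function runs, + and successors are called after it has run.

+ 

In the mod_mmap_static case, nothing special is needed for the post_config + stage, but + mmap_static_xlat must be called after the core module has performed its name translation. + Hence aszPre is used to define a modifier to the position HOOK_LAST.

+ + 

Module Definition

+

There are now far fewer stages to worry about + when creating your module definition. The old definition looked like this: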

+ +
module MODULE_VAR_EXPORT module_name_module =
+{
+    STANDARD_MODULE_STUFF,
+    /* initializer */
+    /* dir config creater */
+    /* dir merger --- default is to override */
+    /* server config */
+    /* merge server config */
+    /* command handlers */
+    /* handlers */
+    /* filename translation */
+    /* check_user_id */
+    /* check auth */
+    /* check access */
+    /* type_checker */
+    /* fixups */
+    /* logger */
+    /* header parser */
+    /* child_init */
+    /* child_exit */
+    /* post read-request */
+};
+ +

The new structure is a great deal simpler...

+
module MODULE_VAR_EXPORT module_name_module =
+{
+    STANDARD20_MODULE_STUFF,
+    /* create per-directory config structures */
+    /* merge per-directory config structures  */
+    /* create per-server config structures    */
+    /* merge per-server config structures     */
+    /* command handlers */
+    /* handlers */
+    /* register hooks */
+};
+ +

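Some of these map directly from the old structure to the new one, + and some do not. I'll try to summarise what should be done below.

+ 

The stages that map directly across:

+ 
+
/* dir config creater */
+
/* create per-directory config structures */
+ 
/* server config */
+
/* create per-server config structures */
+ 
/* dir merger */
+
/* merge per-directory config structures */
+ 
/* merge server config */
+
/* merge per-server config structures */
+ 
/* command handlers */
+
/* command handlers */
+ 
/* handlers */
+
/* handlers */
+
+ 

The remainder of the old functions should be registered as hooks. + The following hook stages are defined so far...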

+ +
+
ap_hook_post_config
+
(this is where the old _init routines get registered)
+ 
ap_hook_http_method
+
(retrieve the HTTP method from a request (legacy))
+ 
ap_hook_open_logs
+
(open any specified logs)
+ 
ap_hook_auth_checker
+
(check if the resource requires authorization)
+ 
ap_hook_access_checker
+
(check for module-specific restrictions)
+ 
ap_hook_check_user_id
+
(check the user-id and password)
+ 
ap_hook_default_port
+
(retrieve the default port for the server)
+ 
ap_hook_pre_connection
+
(do any setup required just before processing, but after accepting)
+ 
ap_hook_process_connection
+
(run the protocol)
+ 
ap_hook_child_init
+
(called as soon as the child process is started)
+ 
ap_hook_create_request
+
(??)
+ 
ap_hook_fixups
+
(last chance to modify things before generating content)
+ 
ap_hook_handler
+
(generate the content)
+ 
ap_hook_header_parser
+
(lets modules look at the headers; not used by most modules, because they use post_read_request for this)
+ 
ap_hook_insert_filter
+
(insert filters into the filter chain)
+ 
ap_hook_log_transaction
+
(log information about the request)
+ 
ap_hook_optional_fn_retrieve
+
(retrieve any functions registered as optional)
+ 
ap_hook_post_read_request
+
(called after reading the request, before any other phase)
+ 
ap_hook_quick_handler
+
called before any request processing; used by cache + modules
+ 
ap_hook_translate_name
+
(translate the URI into a filename)
+ 
ap_hook_type_checker
+
(determine and/or set the document type)
+
+ +
+
+

+
+ \ No newline at end of file diff --git a/docs/manual/developer/new_api_2_4.html b/docs/manual/developer/new_api_2_4.html new file mode 100644 index 0000000..e79fd3c --- /dev/null +++ b/docs/manual/developer/new_api_2_4.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: new_api_2_4.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/new_api_2_4.html.en b/docs/manual/developer/new_api_2_4.html.en new file mode 100644 index 0000000..6354e85 --- /dev/null +++ b/docs/manual/developer/new_api_2_4.html.en @@ -0,0 +1,601 @@ + + + + + +API Changes in Apache HTTP Server 2.4 since 2.2 - Apache HTTP Server Version 2.4 + + + + + + + +
+

API Changes in Apache HTTP Server 2.4 since 2.2

+
+

Available Languages:  en 

+
+ +

This document describes changes to the Apache HTTPD API from + version 2.2 to 2.4 that may be of interest to module/application + developers and core hackers. As of the first GA release of the + 2.4 branch, API compatibility is preserved for the life of the + 2.4 branch. (The + VERSIONING + description for the 2.4 release provides more information about API + compatibility.)

+ +

API changes fall into two categories: APIs that are altogether new, + and existing APIs that are expanded or changed. The latter are + further divided into those where all changes are backwards-compatible + (so existing modules can ignore them), and those that might + require attention by maintainers. As with the transition from + HTTPD 2.0 to 2.2, existing modules and applications will require + recompiling and may call for some attention, but most should not + require any substantial updating (although some may be able to + take advantage of API changes to offer significant improvements).

+

For the purpose of this document, the API is split according + to the public header files. These headers are themselves the + reference documentation, and can be used to generate a browsable + HTML reference with make docs.

+
+ +
+
+

Changed APIs

+ + +

ap_expr (NEW!)

+ +

Introduces a new API to parse and evaluate boolean and algebraic + expressions, including provision for a standard syntax and + customised variants.

+ + +

ap_listen (changed; backwards-compatible)

+ +

Introduces a new API to enable httpd child processes to serve + different purposes.

+ + +

ap_mpm (changed)

+ +

ap_mpm_run is replaced by a new mpm hook. + ap_graceful_stop_signalled has been removed, and + ap_mpm_register_timed_callback is new.

+ + +

ap_regex (changed)

+ +

In addition to the existing regexp wrapper, a new higher-level API + ap_rxplus is now provided. This provides the capability to + compile Perl-style expressions like s/regexp/replacement/flags + and to execute them against arbitrary strings. Support for regexp + backreferences is also added.
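To show what a Perl-style expression of this form consists of, here is a small self-contained sketch that splits s/regexp/replacement/flags into its three parts. It is not the ap_rxplus implementation (the real entry points are declared in ap_regex.h); subst_expr, copy_part and parse_subst are hypothetical names, the buffer sizes are arbitrary, and escaped delimiters are deliberately not handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the three parts of an s/// expression. */
typedef struct {
    char pattern[64];
    char replacement[64];
    char flags[8];
} subst_expr;

/* Copy len bytes of src into dst as a C string; -1 if it does not fit. */
static int copy_part(char *dst, size_t cap, const char *src, size_t len)
{
    if (len >= cap) return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}

/* Parse "s<d>pattern<d>replacement<d>flags", where <d> is whatever
 * character follows the leading 's'. Returns 0 on success, -1 on
 * malformed or oversized input. */
static int parse_subst(const char *expr, subst_expr *out)
{
    const char *p1, *p2;
    char delim;

    if (expr == NULL || expr[0] != 's' || expr[1] == '\0') return -1;
    delim = expr[1];

    p1 = strchr(expr + 2, delim);          /* end of the pattern */
    if (p1 == NULL) return -1;
    p2 = strchr(p1 + 1, delim);            /* end of the replacement */
    if (p2 == NULL) return -1;

    if (copy_part(out->pattern, sizeof out->pattern,
                  expr + 2, (size_t)(p1 - (expr + 2))) != 0) return -1;
    if (copy_part(out->replacement, sizeof out->replacement,
                  p1 + 1, (size_t)(p2 - (p1 + 1))) != 0) return -1;
    if (copy_part(out->flags, sizeof out->flags,
                  p2 + 1, strlen(p2 + 1)) != 0) return -1;
    return 0;
}
```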

+ + +

ap_slotmem (NEW!)

+ +

Introduces an API for modules to allocate and manage memory slots, + most commonly for shared memory.

+ + +

ap_socache (NEW!)

+ +

API to manage a shared object cache.

+ + +

heartbeat (NEW!)

+ +

Common structures for heartbeat modules.

+ + +

ap_parse_htaccess (changed)

+ +

The function signature for ap_parse_htaccess has been + changed. An apr_table_t of individual directives allowed + for override must now be passed (override remains).

+ + +

http_config (changed)

+ +
    +
  • Introduces per-module, per-directory loglevels, including macro wrappers.
  • +
  • New AP_DECLARE_MODULE macro to declare all modules.
  • +
  • New APLOG_USE_MODULE macro necessary for per-module loglevels in + multi-file modules.
  • +
  • New API to retain data across module unload/load
  • +
  • New check_config hook
  • +
  • New ap_process_fnmatch_configs() function to process wildcards
  • +
  • Change ap_configfile_t, ap_cfg_getline(), + ap_cfg_getc() to return error codes, and add + ap_pcfg_strerror() for retrieving an error description.
  • +
  • Any config directive permitted in ACCESS_CONF context must now + correctly handle being called from an .htaccess file via the new + AllowOverrideList directive. + ap_check_cmd_context() accepts a new flag NOT_IN_HTACCESS to detect + this case.
  • +
+ + +

http_core (changed)

+ +
    +
  • REMOVED ap_default_type, ap_requires, all + 2.2 authnz API
  • +
  • Introduces Optional Functions for logio and authnz
  • +
  • New function ap_get_server_name_for_url to support IPv6 + literals.
  • +
  • New function ap_register_errorlog_handler to register error log + format string handlers.
  • +
  • Arguments of error_log hook have changed. Declaration has moved to + http_core.h.
  • +
  • New function ap_state_query to determine if the server is in the + initial configuration preflight phase or not. This is both easier to + use and more correct than the old method of creating a pool userdata + entry in the process pool.
  • +
  • New function ap_get_conn_socket to get the socket descriptor for a + connection. This should be used instead of accessing the core + connection config directly.
  • +
+ + +

httpd (changed)

+ +
    +
  • Introduce per-directory, per-module loglevel
  • +
  • New loglevels APLOG_TRACEn
  • +
  • Introduce errorlog ids for requests and connections
  • +
  • Support for mod_request kept_body
  • +
  • Support buffering filter data for async requests
  • +
  • New CONN_STATE values
  • +
  • Function changes: ap_escape_html updated; + ap_unescape_all, ap_escape_path_segment_buffer
  • +
  • Modules that load other modules later than the EXEC_ON_READ config + reading stage need to call ap_reserve_module_slots() or + ap_reserve_module_slots_directive() in their + pre_config hook.
  • +
  • The useragent IP address per request can now be tracked + independently of the client IP address of the connection, for + support of deployments with load balancers.
  • +
+ + +

http_log (changed)

+ +
    +
  • Introduce per-directory, per-module loglevel
  • +
  • New loglevels APLOG_TRACEn
  • +
  • ap_log_*error become macro wrappers (backwards-compatible if + APLOG_MARK macro is used, except that it is no longer possible to + use #ifdef inside the argument list)
  • +
  • piped logging revamped
  • +
  • module_index added to error_log hook
  • +
  • new function: ap_log_command_line
  • +
+ + +

http_request (changed)

+ +
    +
  • New auth_internal API and auth_provider API
  • +
  • New EOR bucket type
  • +
  • New function ap_process_async_request
  • +
  • New flags AP_AUTH_INTERNAL_PER_CONF and + AP_AUTH_INTERNAL_PER_URI
  • +
  • New access_checker_ex hook to apply additional access control + and/or bypass authentication.
  • +
  • New functions ap_hook_check_access_ex, + ap_hook_check_access, ap_hook_check_authn, + ap_hook_check_authz which accept + AP_AUTH_INTERNAL_PER_* flags
  • +
  • DEPRECATED direct use of ap_hook_access_checker, + access_checker_ex, ap_hook_check_user_id, + ap_hook_auth_checker
  • +
+

When possible, registering all access control hooks (including + authentication and authorization hooks) using AP_AUTH_INTERNAL_PER_CONF + is recommended. If all modules' access control hooks are registered + with this flag, then whenever the server handles an internal + sub-request that matches the same set of access control configuration + directives as the initial request (which is the common case), it can + avoid invoking the access control hooks another time.

+

If your module requires the old behavior and must perform access + control checks on every sub-request with a different URI from the + initial request, even if that URI matches the same set of access + control configuration directives, then use + AP_AUTH_INTERNAL_PER_URI.

+ + +

mod_auth (NEW!)

+ +

Introduces the new provider framework for authn and authz

+ + +

mod_cache (changed)

+ +

Introduces a commit_entity() function to the cache provider + interface, allowing atomic writes to cache. Add a cache_status() + hook to report the cache decision. All private structures and functions were + removed.

+ + +

mod_core (NEW!)

+ +

This introduces low-level APIs to send arbitrary headers, + and exposes functions to handle HTTP OPTIONS and TRACE.

+ + +

mod_cache_disk (changed)

+ +

Changes the disk format of the disk cache to support atomic cache + updates without locking. The device/inode pair of the body file is + embedded in the header file, allowing confirmation that the header + and body belong to one another.
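The idea of checking a device/inode pair can be sketched with plain POSIX calls. This is only an illustration of the technique, not the module's on-disk format: body_identity, record_body and body_matches are hypothetical names, and the sketch assumes a POSIX system with stat().

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical record of what a header file remembers about its body. */
typedef struct {
    dev_t device;
    ino_t inode;
} body_identity;

/* Capture the identity of the body file; returns 0 on success. */
static int record_body(const char *path, body_identity *id)
{
    struct stat st;

    if (stat(path, &st) != 0) return -1;
    id->device = st.st_dev;
    id->inode = st.st_ino;
    return 0;
}

/* Returns 1 if path is still the recorded file, 0 if it is a different
 * file (for example, replaced by a concurrent update), and -1 if it
 * cannot be examined. */
static int body_matches(const char *path, const body_identity *id)
{
    struct stat st;

    if (stat(path, &st) != 0) return -1;
    return st.st_dev == id->device && st.st_ino == id->inode;
}
```

Because a replaced file gets a new inode, a reader holding a stale header can detect the mismatch without taking a lock.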

+ + +

mod_disk_cache (renamed)

+ +

The mod_disk_cache module has been renamed to mod_cache_disk in + order to be consistent with the naming of other modules within the + server.

+ + +

mod_request (NEW!)

+ +

Provides the API for mod_request, which makes input data + available to multiple application/handler modules where required + and parses HTML form data.

+ + +

mpm_common (changed)

+ +
    +
  • REMOVES: accept, lockfile, lock_mech, + set_scoreboard (locking uses the new ap_mutex API)
  • +
  • NEW API to drop privileges (delegates this platform-dependent + function to modules)
  • +
  • NEW Hooks: mpm_query, timed_callback, and + get_name
  • +
  • CHANGED interfaces: monitor hook, + ap_reclaim_child_processes, + ap_relieve_child_processes
  • +
+ + +

scoreboard (changed)

+ +

ap_get_scoreboard_worker is made non-backwards-compatible + as an alternative version is introduced. Additional proxy_balancer + support. Child status stuff revamped.

+ + +

util_cookies (NEW!)

+ +

Introduces a new API for managing HTTP Cookies.

+ + +

util_ldap (changed)

+ +

no description available

+ + +

util_mutex (NEW!)

+ +

A wrapper for APR proc and global mutexes in httpd, providing + common configuration for the underlying mechanism and location + of lock files.

+ + +

util_script (changed)

+ +

NEW: ap_args_to_table

+ + +

util_time (changed)

+ +

NEW: ap_recent_ctime_ex

+ + +
+
+

Specific information on upgrading modules from 2.2

+ + +

Logging

+ +

In order to take advantage of per-module loglevel configuration, any + source file that calls the ap_log_* functions should declare + which module it belongs to. If the module's module_struct is called + foo_module, the following code can be used to remain + backward compatible with HTTPD 2.0 and 2.2:

+

+ #include <http_log.h>
+
+ #ifdef APLOG_USE_MODULE
+ APLOG_USE_MODULE(foo);
+ #endif +

+

Note: This is absolutely required for C++-language modules. It + can be skipped for C-language modules, though that breaks + module-specific log level support for files without it.

+

The number of parameters of the ap_log_* functions and the + definition of APLOG_MARK have changed. Normally, the change + is completely transparent. However, changes are required if a + module uses APLOG_MARK as a parameter to its own functions + or if a module calls ap_log_* without passing + APLOG_MARK. A module which uses wrappers + around ap_log_* typically uses both of these constructs.

+ +

The easiest way to change code which passes APLOG_MARK to + its own functions is to define and use a different macro that expands to + the parameters required by those functions, as APLOG_MARK + should only be used when calling ap_log_* + directly. In this way, the code will remain compatible with HTTPD 2.0 + and 2.2.
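A minimal sketch of that approach follows. EXAMPLE_LOG_MARK and example_format_log are hypothetical names; the helper merely formats into a buffer so that the fragment is self-contained, whereas a real module's wrapper would forward the file and line to ap_log_* with whatever extra arguments the target httpd version requires.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical module-private replacement for APLOG_MARK: expands to
 * exactly the arguments our own helper expects, whatever httpd's
 * definition of APLOG_MARK happens to be. */
#define EXAMPLE_LOG_MARK __FILE__, __LINE__

/* Stand-in for a module's logging wrapper. Returns the snprintf result. */
static int example_format_log(char *buf, size_t cap,
                              const char *file, int line, const char *msg)
{
    return snprintf(buf, cap, "%s(%d): %s", file, line, msg);
}
```

A call site then reads example_format_log(buf, sizeof buf, EXAMPLE_LOG_MARK, "message") and does not need to change when the server's macro does.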

+ +

Code which calls ap_log_* without passing + APLOG_MARK will necessarily differ between 2.4 and earlier + releases, as 2.4 requires a new third argument, + APLOG_MODULE_INDEX.

+ +

+ /* code for httpd 2.0/2.2 */
+ ap_log_perror(file, line, APLOG_ERR, 0, p, "Failed to allocate dynamic lock structure");
+
+ /* code for httpd 2.4 */
+ ap_log_perror(file, line, APLOG_MODULE_INDEX, APLOG_ERR, 0, p, "Failed to allocate dynamic lock structure");
+
+

+ +

ap_log_*error are now implemented as macros. This means + that it is no longer possible to use #ifdef inside the + argument list of ap_log_*error, as this would cause + undefined behavior according to C99.

+ +

A server_rec pointer must be passed to + ap_log_error() when called after startup. This + was always appropriate, but there are even more limitations with + a NULL server_rec in 2.4 than in + previous releases. Beginning with 2.3.12, the global variable + ap_server_conf can always be used as + the server_rec parameter, as it will be + NULL only when it is valid to pass NULL + to ap_log_error(). ap_server_conf + should be used only when a more appropriate server_rec + is not available.

+ +

Consider the following changes to take advantage of the new + APLOG_TRACE1..8 log levels:

+
    +
  • Check current use of APLOG_DEBUG and + consider if one of the APLOG_TRACEn levels is + more appropriate.
  • +
  • If your module currently has a mechanism for configuring + the amount of debug logging which is performed, consider + eliminating that mechanism and relying on the use of + different APLOG_TRACEn levels. If expensive + trace processing needs to be bypassed depending on the + configured log level, use the APLOGtracen + and APLOGrtracen macros to first check + if tracing is enabled.
  • +
+ +
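The guard described in the second item can be sketched as follows. The level constants, configured_level and EX_IS_LEVEL are stand-ins so that the fragment compiles without httpd; a module would use the APLOGtracen or APLOGrtracen macros and its real logging calls instead.

```c
/* Stand-in levels; httpd defines APLOG_DEBUG and APLOG_TRACE1..8 itself. */
enum { EX_DEBUG = 7, EX_TRACE1 = 8, EX_TRACE4 = 11, EX_TRACE8 = 15 };

/* Hypothetical configured level; httpd tracks this per module. */
static int configured_level = EX_DEBUG;

/* Counts how often the expensive work was actually performed. */
static int dumps_built = 0;

/* True when messages at the given level would be logged. */
#define EX_IS_LEVEL(level) ((level) <= configured_level)

/* Pretend this walks a large structure to build a trace message. */
static const char *expensive_dump(void)
{
    dumps_built++;
    return "request state dump";
}

/* Only pay for the dump when TRACE4 output is enabled.
 * Returns 1 if tracing ran, 0 if it was skipped. */
static int trace_request(void)
{
    if (EX_IS_LEVEL(EX_TRACE4)) {
        const char *dump = expensive_dump();
        (void)dump;   /* a module would pass this to its logging call */
        return 1;
    }
    return 0;
}
```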

Modules sometimes add process id and/or thread id to their log + messages. These ids are now logged by default, so it may not + be necessary for the module to log them explicitly. (Users may + remove them from the error log format, but they can be + instructed to add them back if necessary for problem diagnosis.)

+ + +

If your module uses these existing APIs...

+ + +
+
ap_default_type()
+
This is no longer available; Content-Type must be configured + explicitly or added by the application.
+ +
ap_get_server_name()
+
If the returned server name is used in a URL, + use ap_get_server_name_for_url() instead. This new + function handles the odd case where the server name is an IPv6 + literal address.
+ +
ap_get_server_version()
+
For logging purposes, where detailed information is + appropriate, use ap_get_server_description(). + When generating output, where the amount of information + should be configurable by ServerTokens, use + ap_get_server_banner().
+ +
ap_graceful_stop_signalled()
+
Replace with a call + to ap_mpm_query(AP_MPMQ_MPM_STATE) and checking for + state AP_MPMQ_STOPPING.
+ +
ap_max_daemons_limit, ap_my_generation, + and ap_threads_per_child
+
Use ap_mpm_query() query codes + AP_MPMQ_MAX_DAEMON_USED, AP_MPMQ_GENERATION, + and AP_MPMQ_MAX_THREADS, respectively.
+ +
ap_mpm_query()
+
Ensure that it is not used until after the register-hooks + hook has completed. Otherwise, an MPM built as a DSO + would not have had a chance to enable support for this + function.
+ +
ap_requires()
+
The core server now provides better infrastructure for handling + Require configuration. + Register an auth provider function for each supported entity using + ap_register_auth_provider(). The function will be + called as necessary during Require + processing. (Consult bundled modules for detailed examples.)
+ +
ap_server_conf->process->pool + userdata
+
+ Optional: +
  • If your module uses this to determine which pass of the startup hooks is being run, use ap_state_query(AP_SQ_MAIN_STATE).
  • If your module uses this to maintain data across the unloading and reloading of your module, use ap_retained_data_create() and ap_retained_data_get().
+
+ +
apr_global_mutex_create(), + apr_proc_mutex_create()
+
Optional: See ap_mutex_register(), ap_global_mutex_create(), and ap_proc_mutex_create(); these allow your mutexes to be configurable with the Mutex directive. You can also remove any configuration mechanisms in your module for such mutexes.
+ +
CORE_PRIVATE
+
This is now unnecessary and ignored.
+ +
dav_new_error() + and dav_new_error_tag()
+
Previously, these assumed that errno contained + information describing the failure. Now, + an apr_status_t parameter must be provided. Pass + 0/APR_SUCCESS if there is no such error information, or a valid + apr_status_t value otherwise.
+ +
mpm_default.h, DEFAULT_LOCKFILE, + DEFAULT_THREAD_LIMIT, DEFAULT_PIDLOG, + etc.
+
The header file and most of the default configuration + values set in it are no longer visible to modules. (Most can + still be overridden at build time.) DEFAULT_PIDLOG + and DEFAULT_REL_RUNTIMEDIR are now universally + available via ap_config.h.
+ +
unixd_config
+
This has been renamed to ap_unixd_config.
+ +
unixd_setup_child()
+
This has been renamed to ap_unixd_setup_child(), but most callers + should call the added ap_run_drop_privileges() hook.
+ +
conn_rec->remote_ip and + conn_rec->remote_addr
+
These fields have been renamed in order to distinguish between + the client IP address of the connection and the useragent IP address + of the request (potentially overridden by a load balancer or proxy). + References to either of these fields must be updated with one of the + following options, as appropriate for the module: +
  • When you require the IP address of the user agent, which might be connected directly to the server, or might optionally be separated from the server by a transparent load balancer or proxy, use request_rec->useragent_ip and request_rec->useragent_addr.
  • When you require the IP address of the client that is connected directly to the server, which might be the useragent or might be the load balancer or proxy itself, use conn_rec->client_ip and conn_rec->client_addr.
+
+
+ + +

If your module interfaces with this feature...

+ +
+
suEXEC
+
Optional: If your module logs an error + when ap_unixd_config.suexec_enabled is 0, + also log the value of the new + field suexec_disabled_reason, which contains an + explanation of why it is not available.
+ +
Extended status data in the scoreboard
+
In previous releases, ExtendedStatus had to be + set to On, which in turn required that + mod_status was loaded. In 2.4, just + set ap_extended_status to 1 in a + pre-config hook and the extended status data will be + available.
+
+ + +

Does your module...

+ +
+
Parse query args
+
Consider if ap_args_to_table() would be + helpful.
+ +
Parse form data...
+
Use ap_parse_form_data().
+ +
Check for request header fields Content-Length + and Transfer-Encoding to see if a body was + specified
+
Use ap_request_has_body().
+ +
Implement cleanups which clear pointer variables
+
Use ap_pool_cleanup_set_null().
+ +
Create run-time files such as shared memory files, pid files, + etc.
+
Use ap_runtime_dir_relative() so that the global + configuration for the location of such files, either by the + DEFAULT_REL_RUNTIMEDIR compile setting or the + DefaultRuntimeDir directive, + will be respected. Apache httpd 2.4.2 and above.
+ +
+ + +
+
+

Available Languages:  en 

+

Comments

Notice:
This is not a Q&A section. Comments placed here should be pointed towards suggestions on improving the documentation or server, and may be removed by our moderators if they are either implemented or considered invalid/off-topic. Questions on how to manage the Apache HTTP Server should be directed at either our IRC channel, #httpd, on Libera.chat, or sent to our mailing lists.
+
+ \ No newline at end of file diff --git a/docs/manual/developer/output-filters.html b/docs/manual/developer/output-filters.html new file mode 100644 index 0000000..ee632a6 --- /dev/null +++ b/docs/manual/developer/output-filters.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: output-filters.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/output-filters.html.en b/docs/manual/developer/output-filters.html.en new file mode 100644 index 0000000..cd5cf8c --- /dev/null +++ b/docs/manual/developer/output-filters.html.en @@ -0,0 +1,585 @@ + + + + + +Guide to writing output filters - Apache HTTP Server Version 2.4 + + + + + + + +
+

Guide to writing output filters

+
+


+
+ +

There are a number of common pitfalls encountered when writing + output filters; this page aims to document best practice for + authors of new or existing filters.

+ +

This document is applicable to both version 2.0 and version 2.2 + of the Apache HTTP Server; it specifically targets + RESOURCE-level or CONTENT_SET-level + filters though some advice is generic to all types of filter.

+
+ +
+
+

Filters and bucket brigades

+ + +

Each time a filter is invoked, it is passed a bucket + brigade, containing a sequence of buckets which + represent both data content and metadata. Every bucket has a + bucket type; a number of bucket types are defined and + used by the httpd core modules (and the + apr-util library which provides the bucket brigade + interface), but modules are free to define their own types.

+ +
Output filters must be prepared to process + buckets of non-standard types; with a few exceptions, a filter + need not care about the types of buckets being filtered.
+ +

A filter can tell whether a bucket represents either data or + metadata using the APR_BUCKET_IS_METADATA macro. + Generally, all metadata buckets should be passed down the filter + chain by an output filter. Filters may transform, delete, and + insert data buckets as appropriate.

+ +

There are two metadata bucket types which all filters must pay + attention to: the EOS bucket type, and the + FLUSH bucket type. An EOS bucket + indicates that the end of the response has been reached and no + further buckets need be processed. A FLUSH bucket + indicates that the filter should flush any buffered buckets (if + applicable) down the filter chain immediately.

+ +
FLUSH buckets are sent when the + content generator (or an upstream filter) knows that there may be + a delay before more content can be sent. By passing + FLUSH buckets down the filter chain immediately, + filters ensure that the client is not kept waiting for pending + data longer than necessary.
+ +

Filters can create FLUSH buckets and pass these + down the filter chain if desired. Generating FLUSH + buckets unnecessarily, or too frequently, can harm network + utilisation since it may force large numbers of small packets to + be sent, rather than a small number of larger packets. The + section on Non-blocking bucket reads + covers a case where filters are encouraged to generate + FLUSH buckets.

+ +

Example bucket brigade

+ HEAP FLUSH FILE EOS

+ +

This shows a bucket brigade which may be passed to a filter; it + contains two metadata buckets (FLUSH and + EOS), and two data buckets (HEAP and + FILE).

+ +
+
+

Filter invocation

+ + +

For any given request, an output filter might be invoked only + once and be given a single brigade representing the entire response. + It is also possible that the number of times a filter is invoked + for a single response is proportional to the size of the content + being filtered, with the filter being passed a brigade containing + a single bucket each time. Filters must operate correctly in + either case.

+ +
An output filter which allocates long-lived + memory every time it is invoked may consume memory proportional to + response size. Output filters which need to allocate memory + should do so once per response; see Maintaining + state below.
+ +

An output filter can distinguish the final invocation for a + given response by the presence of an EOS bucket in + the brigade. Any buckets in the brigade after an EOS should be + ignored.

+ +

An output filter should never pass an empty brigade down the + filter chain. To be defensive, filters should be prepared to + accept an empty brigade, and should return success without passing + this brigade on down the filter chain. The handling of an empty + brigade should have no side effects (such as changing any state + private to the filter).

+ +

How to handle an empty brigade

apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
+{
+    if (APR_BRIGADE_EMPTY(bb)) {
+        return APR_SUCCESS;
+    }
+    ...
+
+ +
+
+

Brigade structure

+ + +

A bucket brigade is a doubly-linked list of buckets. The list is terminated (at both ends) by a sentinel which can be distinguished from a normal bucket by comparing it with the pointer returned by APR_BRIGADE_SENTINEL. The list sentinel is in fact not a valid bucket structure; any attempt to call normal bucket functions (such as apr_bucket_read) on the sentinel will have undefined behaviour (in practice, it is likely to crash the process).
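The sentinel-terminated list can be pictured with the following simplified model. It is only a sketch under stated assumptions: the bucket and brigade types and the brigade_* helpers are hypothetical stand-ins written for this page, not the APR definitions, and here the sentinel is an ordinary struct embedded in the brigade, whereas the real sentinel is not a valid bucket at all.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for apr_bucket and apr_bucket_brigade. */
typedef struct bucket {
    struct bucket *next;
    struct bucket *prev;
    size_t length;
} bucket;

typedef struct brigade {
    bucket sentinel;   /* marks both ends of the list; never holds data */
} brigade;

/* An empty brigade is a sentinel that points at itself. */
void brigade_init(brigade *bb)
{
    bb->sentinel.next = &bb->sentinel;
    bb->sentinel.prev = &bb->sentinel;
    bb->sentinel.length = 0;
}

bucket *brigade_sentinel(brigade *bb) { return &bb->sentinel; }
bucket *brigade_first(brigade *bb)    { return bb->sentinel.next; }

int brigade_empty(brigade *bb)
{
    return brigade_first(bb) == brigade_sentinel(bb);
}

/* Link bucket e in just before the sentinel, i.e. at the tail. */
void brigade_insert_tail(brigade *bb, bucket *e)
{
    assert(e != &bb->sentinel);
    e->prev = bb->sentinel.prev;
    e->next = &bb->sentinel;
    bb->sentinel.prev->next = e;
    bb->sentinel.prev = e;
}

/* Walk the list, stopping when the sentinel is reached; the sentinel
 * itself is compared against but never read as a bucket. */
size_t brigade_total_length(brigade *bb)
{
    size_t total = 0;
    bucket *e;

    for (e = brigade_first(bb); e != brigade_sentinel(bb); e = e->next) {
        total += e->length;
    }
    return total;
}
```

The loop condition has the same shape as the e != APR_BRIGADE_SENTINEL(bb) test used in filter code: the sentinel is only ever used for comparison.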

+ +

There are a variety of functions and macros for traversing and + manipulating bucket brigades; see the apr_buckets.h + header for complete coverage. Commonly used macros include:

+ +
+
APR_BRIGADE_FIRST(bb)
+
returns the first bucket in brigade bb
+ +
APR_BRIGADE_LAST(bb)
+
returns the last bucket in brigade bb
+ +
APR_BUCKET_NEXT(e)
+
gives the next bucket after bucket e
+ +
APR_BUCKET_PREV(e)
+
gives the bucket before bucket e
+ +
+ +

The apr_bucket_brigade structure itself is + allocated out of a pool, so if a filter creates a new brigade, it + must ensure that memory use is correctly bounded. A filter which + allocates a new brigade out of the request pool + (r->pool) on every invocation, for example, will fall + foul of the warning above concerning + memory use. Such a filter should instead create a brigade on the + first invocation per request, and store that brigade in its state structure.

+ +

It is rarely advisable to use apr_brigade_destroy to "destroy" a brigade; use it only if you know for certain that the brigade will never be used again, and even then sparingly. The memory used by the brigade structure will not be released by calling this function (since it comes from a pool), but the associated pool cleanup is unregistered. Using apr_brigade_destroy can in fact cause memory leaks; if a "destroyed" brigade contains buckets when its containing pool is destroyed, those buckets will not be immediately destroyed.

+ +

In general, filters should use apr_brigade_cleanup + in preference to apr_brigade_destroy.

+ +
+
+

Processing buckets

+ + + +

When dealing with non-metadata buckets, it is important to + understand that the "apr_bucket *" object is an + abstract representation of data:

+ +
  1. The amount of data represented by the bucket may or may not have a determinate length; for a bucket which represents data of indeterminate length, the ->length field is set to the value (apr_size_t)-1. For example, buckets of the PIPE bucket type have an indeterminate length; they represent the output from a pipe.
  2. The data represented by a bucket may or may not be mapped into memory. The FILE bucket type, for example, represents data stored in a file on disk.
+ +

Filters read the data from a bucket using the + apr_bucket_read function. When this function is + invoked, the bucket may morph into a different bucket + type, and may also insert a new bucket into the bucket brigade. + This must happen for buckets which represent data not mapped into + memory.

+ +

As an example, consider a bucket brigade containing a single FILE bucket representing an entire file, 24 kilobytes in size:

+ +

FILE(0K-24K)

+ +

When this bucket is read, it will read a block of data from the + file, morph into a HEAP bucket to represent that + data, and return the data to the caller. It also inserts a new + FILE bucket representing the remainder of the file; + after the apr_bucket_read call, the brigade looks + like:

+ +

HEAP(8K) FILE(8K-24K)
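The morph-and-split behaviour shown above can be modelled in a few lines of plain C. This is a sketch only: the bucket type, the bucket_read function, and the READ_BLOCK size of 8 KiB are assumptions chosen to match the figures in the example, not the APR implementation, and no file data is actually read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum btype { B_FILE, B_HEAP };

/* Simplified bucket: a type plus the byte range it represents. */
typedef struct bucket {
    enum btype type;
    size_t start;          /* offset into the file */
    size_t length;         /* number of bytes represented */
    struct bucket *next;   /* singly linked for brevity */
} bucket;

/* Assumed block size, chosen to match the 8K figure in the text. */
#define READ_BLOCK ((size_t)8 * 1024)

/* Model of reading bucket e. A FILE bucket larger than one block is
 * split: e becomes a HEAP bucket holding the first block, and a new
 * FILE bucket covering the rest is inserted after it.
 * Returns 0 on success, -1 if the new bucket cannot be allocated. */
int bucket_read(bucket *e, size_t *len)
{
    assert(e != NULL && len != NULL);

    if (e->type == B_FILE) {
        if (e->length > READ_BLOCK) {
            bucket *rest = malloc(sizeof *rest);
            if (rest == NULL) {
                return -1;
            }
            rest->type = B_FILE;
            rest->start = e->start + READ_BLOCK;
            rest->length = e->length - READ_BLOCK;
            rest->next = e->next;
            e->next = rest;
            e->length = READ_BLOCK;
        }
        e->type = B_HEAP;   /* the block is now "in memory" */
    }
    *len = e->length;
    return 0;
}
```

Reading a 24 KiB FILE bucket once leaves HEAP(8K) followed by FILE(8K-24K); reading every bucket without passing any downstream would turn the whole file into HEAP buckets, which is the memory problem the following sections warn about.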

+ +
+
+

Filtering brigades

+ + +

The basic function of any output filter will be to iterate + through the passed-in brigade and transform (or simply examine) + the content in some manner. The implementation of the iteration + loop is critical to producing a well-behaved output filter.

+ +

Consider an example which loops through the entire brigade as follows:

+ +

Bad output filter -- do not imitate!

apr_bucket *e = APR_BRIGADE_FIRST(bb);
+const char *data;
+apr_size_t length;
+
+while (e != APR_BRIGADE_SENTINEL(bb)) {
+    apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
+    e = APR_BUCKET_NEXT(e);
+}
+
+return ap_pass_brigade(f->next, bb);
+
+ +

The above implementation would consume memory proportional to + content size. If passed a FILE bucket, for example, + the entire file contents would be read into memory as each + apr_bucket_read call morphed a FILE + bucket into a HEAP bucket.

+ +

In contrast, the implementation below will consume a fixed amount of memory to filter any brigade. A temporary brigade is needed and must be allocated only once per response; see the Maintaining state section.

+ +

Better output filter

apr_bucket *e;
+const char *data;
+apr_size_t length;
+apr_status_t rv;
+
+while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
+    rv = apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
+    if (rv) ...;
+    /* Remove bucket e from bb. */
+    APR_BUCKET_REMOVE(e);
+    /* Insert it into the temporary brigade. */
+    APR_BRIGADE_INSERT_HEAD(tmpbb, e);
+    /* Pass brigade downstream. */
+    rv = ap_pass_brigade(f->next, tmpbb);
+    if (rv) ...;
+    apr_brigade_cleanup(tmpbb);
+}
+
+ +
+
+

Maintaining state

+ + + +

A filter which needs to maintain state over multiple + invocations per response can use the ->ctx field of + its ap_filter_t structure. It is typical to store a + temporary brigade in such a structure, to avoid having to allocate + a new brigade per invocation as described in the Brigade structure section.

+ +

Example code to maintain filter state

struct dummy_state {
+    apr_bucket_brigade *tmpbb;
+    int filter_state;
+    ...
+};
+
+apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
+{
+    struct dummy_state *state;
+
+    state = f->ctx;
+    if (state == NULL) {
+
+        /* First invocation for this response: initialise state structure.
+         */
+        f->ctx = state = apr_palloc(f->r->pool, sizeof *state);
+
+        state->tmpbb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
+        state->filter_state = ...;
+    }
+    ...
+
+ +
+
+

Buffering buckets

+ + +

If a filter decides to store buckets beyond the duration of a + single filter function invocation (for example storing them in its + ->ctx state structure), those buckets must be set + aside. This is necessary because some bucket types provide + buckets which represent temporary resources (such as stack memory) + which will fall out of scope as soon as the filter chain completes + processing the brigade.

+ +

To set aside a bucket, the apr_bucket_setaside function can be called. Not all bucket types can be set aside, but if successful, the bucket will have morphed to ensure it has a lifetime at least as long as the pool given as an argument to the apr_bucket_setaside function.

+ +

Alternatively, the ap_save_brigade function can be + used, which will move all the buckets into a separate brigade + containing buckets with a lifetime as long as the given pool + argument. This function must be used with care, taking into + account the following points:

+ +
  1. On return, ap_save_brigade guarantees that all the buckets in the returned brigade will represent data mapped into memory. If given an input brigade containing, for example, a PIPE bucket, ap_save_brigade will consume an arbitrary amount of memory to store the entire output of the pipe.
  2. When ap_save_brigade reads from buckets which cannot be set aside, it will always perform blocking reads, removing the opportunity to use Non-blocking bucket reads.
  3. If ap_save_brigade is used without passing a non-NULL "saveto" (destination) brigade parameter, the function will create a new brigade, which may cause memory use to be proportional to content size as described in the Brigade structure section.
+ +
Filters must ensure that any buffered data is + processed and passed down the filter chain during the last + invocation for a given response (a brigade containing an EOS + bucket). Otherwise such data will be lost.
+ +
+
+

Non-blocking bucket reads

+ + +

The apr_bucket_read function takes an + apr_read_type_e argument which determines whether a + blocking or non-blocking read will be performed + from the data source. A good filter will first attempt to read + from every data bucket using a non-blocking read; if that fails + with APR_EAGAIN, then send a FLUSH + bucket down the filter chain, and retry using a blocking read.

+ +

This mode of operation ensures that any filters further down the + filter chain will flush any buffered buckets if a slow content + source is being used.

+ +

A CGI script is an example of a slow content source which is + implemented as a bucket type. mod_cgi will send + PIPE buckets which represent the output from a CGI + script; reading from such a bucket will block when waiting for the + CGI script to produce more output.

+ +

Example code using non-blocking bucket reads

apr_bucket *e;
+apr_read_type_e mode = APR_NONBLOCK_READ;
+const char *data;
+apr_size_t length;
+
+while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
+    apr_status_t rv;
+
+    rv = apr_bucket_read(e, &data, &length, mode);
+    if (rv == APR_EAGAIN && mode == APR_NONBLOCK_READ) {
+
+        /* Pass down a brigade containing a flush bucket: */
+        APR_BRIGADE_INSERT_TAIL(tmpbb, apr_bucket_flush_create(...));
+        rv = ap_pass_brigade(f->next, tmpbb);
+        apr_brigade_cleanup(tmpbb);
+        if (rv != APR_SUCCESS) return rv;
+
+        /* Retry, using a blocking read. */
+        mode = APR_BLOCK_READ;
+        continue;
+    }
+    else if (rv != APR_SUCCESS) {
+        /* handle errors */
+    }
+
+    /* Next time, try a non-blocking read first. */
+    mode = APR_NONBLOCK_READ;
+    ...
+}
+
+ +
+
+

Ten rules for output filters

+ + +

In summary, here is a set of rules for all output filters to + follow:

+ +
  1. Output filters should not pass empty brigades down the filter chain, but should be tolerant of being passed empty brigades.
  2. Output filters must pass all metadata buckets down the filter chain; FLUSH buckets should be respected by passing any pending or buffered buckets down the filter chain.
  3. Output filters should ignore any buckets following an EOS bucket.
  4. Output filters must process a fixed amount of data at a time, to ensure that memory consumption is not proportional to the size of the content being filtered.
  5. Output filters should be agnostic with respect to bucket types, and must be able to process buckets of unfamiliar type.
  6. After calling ap_pass_brigade to pass a brigade down the filter chain, output filters should call apr_brigade_cleanup to ensure the brigade is empty before reusing that brigade structure; output filters should never use apr_brigade_destroy to "destroy" brigades.
  7. Output filters must set aside any buckets which are preserved beyond the duration of the filter function.
  8. Output filters must not ignore the return value of ap_pass_brigade, and must return appropriate errors back up the filter chain.
  9. Output filters must only create a fixed number of bucket brigades for each response, rather than one per invocation.
  10. Output filters should first attempt non-blocking reads from each data bucket, and send a FLUSH bucket down the filter chain if the read blocks, before retrying with a blocking read.
+ +
+
+

Use case: buffering in mod_ratelimit

+ +

The r1833875 change is a good example to show what buffering and keeping state means in the context of an output filter. In this use case, a user asked an interesting question on the users' mailing list about why mod_ratelimit seemed not to honor its setting with proxied content (either rate limiting at a different speed or simply not doing it at all). Before diving deep into the solution, it is better to explain on a high level how mod_ratelimit works. The trick is really simple: take the rate limit settings and calculate a chunk size of data to flush every 200ms to the client. For example, let's imagine that we set rate-limit 60 in our config; these are the high level steps to find the chunk size:

+
/* milliseconds to wait between each flush of data */
+RATE_INTERVAL_MS = 200;
+/* rate limit speed in b/s */
+speed = 60 * 1024;
+/* final chunk size is 12288 bytes */
+chunk_size = (speed / (1000 / RATE_INTERVAL_MS));
+

If we apply this calculation to a bucket brigade carrying 38400 bytes, it means that the filter will try to do the following:

  1. Split the 38400 bytes into chunks of at most 12288 bytes each.
  2. Flush the first chunk of 12288 bytes and sleep 200ms.
  3. Flush the second chunk of 12288 bytes and sleep 200ms.
  4. Flush the third chunk of 12288 bytes and sleep 200ms.
  5. Flush the remaining 1536 bytes.

The above pseudo code works fine if the output filter handles only one brigade for each response, but it might happen that it needs to be called multiple times with different brigade sizes as well. The former case arises, for example, when httpd directly serves some content, like a static file: the bucket brigade abstraction takes care of handling the whole content, and rate limiting works nicely. But if the same static content is served via mod_proxy_http (for example, a backend is serving it rather than httpd), then the content generator (in this case mod_proxy_http) may use a maximum buffer size and then regularly send data as bucket brigades to the output filter chain, triggering multiple calls to mod_ratelimit. If the reader tries to execute the pseudo code assuming multiple calls to the output filter, each one processing a bucket brigade of 38400 bytes, then it is easy to spot some anomalies:

  1. Between the last flush of a brigade and the first one of the next, there is no sleep.
  2. Even if the sleep was forced after the last flush, that chunk would not be the ideal size (1536 bytes instead of 12288) and the final client's speed would quickly become different from what is set in the httpd config.
+
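The chunk arithmetic and the split described earlier in this section can be checked with a short, self-contained sketch. The function names chunk_size_for and split_into_chunks are hypothetical and are not taken from mod_ratelimit; only RATE_INTERVAL_MS and the formula come from the text. With a limit of 60, the speed is 60 * 1024 = 61440 bytes per second and there are 1000 / 200 = 5 intervals per second, so the chunk size evaluates to 12288 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Milliseconds to wait between each flush of data. */
#define RATE_INTERVAL_MS 200

/* Bytes to send per interval for a rate limit given in KiB/s. */
size_t chunk_size_for(size_t rate_kib_per_s)
{
    size_t speed = rate_kib_per_s * 1024;       /* bytes per second */
    return speed / (1000 / RATE_INTERVAL_MS);   /* five intervals per second */
}

/* Returns the number of full chunks contained in `total` bytes and
 * stores the number of leftover bytes in *remainder. */
size_t split_into_chunks(size_t total, size_t chunk, size_t *remainder)
{
    assert(chunk > 0 && remainder != NULL);
    *remainder = total % chunk;
    return total / chunk;
}
```

For a brigade of 38400 bytes this gives three full chunks and a remainder; the remainder is the part that causes the anomalies listed above when the filter is invoked more than once per response.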

In this case, three things might help:

  1. Use the ctx internal data structure, initialized by mod_ratelimit for each response handling cycle, to "remember" when the last sleep was performed across multiple invocations, and act accordingly.
  2. If a bucket brigade cannot be split into a whole number of chunk_size blocks, store the remaining bytes (located in the tail of the bucket brigade) in a temporary holding area (namely another bucket brigade) and then use ap_save_brigade to set them aside. These bytes will be prepended to the next bucket brigade that will be handled in the subsequent invocation.
  3. Avoid the previous logic if the bucket brigade that is currently being processed contains the end of stream bucket (EOS). There is no need to sleep or buffer data if the end of stream is reached.
+
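The carry-over idea in the list above can be sketched as a small state machine. This is a model of the bookkeeping only, under stated assumptions: rl_state and rl_filter are hypothetical names, byte counts stand in for bucket brigades, and the timing (sleeping between chunks) is left out entirely. It is not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Per-response state, playing the role of the filter's ctx structure. */
typedef struct rl_state {
    size_t chunk;   /* bytes to send per interval */
    size_t held;    /* bytes set aside by the previous invocation */
} rl_state;

/* One filter invocation: `incoming` bytes arrive and `eos` is non-zero
 * if this brigade contains the end-of-stream bucket.
 * Returns the number of bytes passed downstream. */
size_t rl_filter(rl_state *st, size_t incoming, int eos)
{
    size_t avail;
    size_t out;

    assert(st != NULL && st->chunk > 0);

    /* Held bytes are prepended to the new data. */
    avail = st->held + incoming;

    if (eos) {
        out = avail;                         /* end of stream: send everything */
    }
    else {
        out = avail - (avail % st->chunk);   /* whole chunks only */
    }
    st->held = avail - out;                  /* remainder waits for next call */
    return out;
}
```

With a chunk of 12288 bytes, two invocations of 38400 bytes each send 36864 bytes apiece and leave 1536 and then 3072 bytes held; a final invocation with EOS sends whatever is left, so no data is lost.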

The commit linked at the beginning of the section also contains a bit of code refactoring, so it is not trivial to read on the first pass, but the overall idea is basically what has been described so far. The goal of this section is not to give the reader a headache from reading C code, but to convey the mindset needed to use the tools offered by httpd's filter chain efficiently.

+
+
+

+
+ \ No newline at end of file diff --git a/docs/manual/developer/request.html b/docs/manual/developer/request.html new file mode 100644 index 0000000..92c1bee --- /dev/null +++ b/docs/manual/developer/request.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: request.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/request.html.en b/docs/manual/developer/request.html.en new file mode 100644 index 0000000..2ea780d --- /dev/null +++ b/docs/manual/developer/request.html.en @@ -0,0 +1,248 @@ + + + + + +Request Processing in the Apache HTTP Server 2.x - Apache HTTP Server Version 2.4 + + + + + + + +
+

Request Processing in the Apache HTTP Server 2.x

+
+


+
+ +

Warning

+

Warning - this is a first (fast) draft that needs further + revision!

+
+ +

Several changes in 2.0 and above affect the internal request + processing mechanics. Module authors need to be aware of these + changes so they may take advantage of the optimizations and + security enhancements.

+ +

The first major change is to the subrequest and redirect + mechanisms. There were a number of different code paths in + the Apache HTTP Server 1.3 to attempt to optimize subrequest + or redirect behavior. As patches were introduced to 2.0, these + optimizations (and the server behavior) were quickly broken due + to this duplication of code. All duplicate code has been folded + back into ap_process_request_internal() to prevent + the code from falling out of sync again.

+ +

This means that much of the existing code was 'unoptimized'. + It is the Apache HTTP Project's first goal to create a robust + and correct implementation of the HTTP server RFC. Additional + goals include security, scalability and optimization. New + methods were sought to optimize the server (beyond the + performance of 1.3) without introducing fragile or + insecure code.

+
+ +
+
+

The Request Processing Cycle

+

All requests pass through ap_process_request_internal() + in server/request.c, including subrequests and redirects. If a module + doesn't pass generated requests through this code, the author is cautioned + that the module may be broken by future changes to request + processing.

+ +

To streamline requests, the module author can take advantage of the hooks offered to drop out of the request cycle early, or to bypass core hooks which are irrelevant (and costly in terms of CPU).

+
+
+

The Request Parsing Phase

+

Unescapes the URL

+

The request's parsed_uri path is unescaped, once and only + once, at the beginning of internal request processing.

+ +

This step is bypassed if the proxyreq flag is set, or the parsed_uri.path element is unset. The module has no further control of this one-time unescape operation; either failing to unescape or unescaping the URL more than once has security repercussions.
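To see why the path must be unescaped exactly once, consider the minimal percent-decoder below. It is a sketch written for this page, not httpd's own unescaping routine: unescape_once and hexval are hypothetical names, and the decoder performs none of the additional checks a real server needs.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Value of a single hexadecimal digit; the caller has already
 * checked the character with isxdigit(). */
int hexval(int c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    return tolower(c) - 'a' + 10;
}

/* Decode %XX escapes in place, exactly one level deep.
 * Returns 0 on success, or -1 on a malformed escape (in which case
 * the string may be left partially decoded). */
int unescape_once(char *s)
{
    char *out = s;

    assert(s != NULL);

    while (*s != '\0') {
        if (*s == '%') {
            if (!isxdigit((unsigned char)s[1])
                || !isxdigit((unsigned char)s[2])) {
                return -1;
            }
            *out++ = (char)(hexval((unsigned char)s[1]) * 16
                            + hexval((unsigned char)s[2]));
            s += 3;
        }
        else {
            *out++ = *s++;
        }
    }
    *out = '\0';
    return 0;
}
```

A path such as /a/%252e%252e/secret decodes once to /a/%2e%2e/secret, which contains no parent-directory element. Decoding it a second time yields /a/../secret, so any check made between the two passes would have been bypassed.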

+ + +

Strips Parent and This Elements from the + URI

+

All /../ and /./ elements are + removed by ap_getparents(), as well as any trailing + /. or /.. element. This helps to ensure + the path is (nearly) absolute before the request processing + continues. (See RFC 1808 section 4 for further discussion.)

+ +

This step cannot be bypassed.
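The effect of this step can be sketched with a simplified in-place normaliser. It is an illustration only and makes no claim to match ap_getparents() in every edge case: remove_dot_segments is a hypothetical name, and the function assumes an absolute path that starts with a slash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Collapse "." and ".." segments of an absolute path in place.
 * A ".." at the root is simply dropped, so the result never
 * climbs above "/". */
void remove_dot_segments(char *path)
{
    char *in = path;    /* always points at a '/' or at the terminator */
    char *out = path;   /* end of the normalised output written so far */

    assert(path != NULL && path[0] == '/');

    while (*in != '\0') {
        char *seg = in + 1;
        size_t len = strcspn(seg, "/");
        int is_dot = (len == 1 && seg[0] == '.');
        int is_dotdot = (len == 2 && seg[0] == '.' && seg[1] == '.');

        if (is_dotdot) {
            /* Drop the previously written segment, if there is one. */
            while (out > path) {
                out--;
                if (*out == '/') {
                    break;
                }
            }
        }
        if (is_dot || is_dotdot) {
            in = seg + len;
            if (*in == '\0') {
                *out++ = '/';   /* a trailing "/." or "/.." leaves a slash */
            }
        }
        else {
            /* Ordinary segment: copy the slash and the segment. */
            memmove(out, in, len + 1);
            out += len + 1;
            in = seg + len;
        }
    }
    if (out == path) {
        *out++ = '/';
    }
    *out = '\0';
}
```

For example, /a/./b/../c becomes /a/c, and /../x becomes /x because there is nothing above the root to step back into.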

+ + +

Initial URI Location Walk

+

Every request is subject to an + ap_location_walk() call. This ensures that + <Location> sections + are consistently enforced for all requests. If the request is an internal + redirect or a sub-request, it may borrow some or all of the processing + from the previous or parent request's ap_location_walk, so this step + is generally very efficient after processing the main request.

+ + +

translate_name

+

Modules can determine the file name, or alter the given URI + in this step. For example, mod_vhost_alias will + translate the URI's path into the configured virtual host, + mod_alias will translate the path to an alias path, + and if the request falls back on the core, the DocumentRoot is prepended to the request resource.

+ +

If all modules DECLINE this phase, an error 500 is + returned to the browser, and a "couldn't translate name" error is logged + automatically.
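The behaviour described for this phase (the first module that does not decline decides the outcome, and a phase that every module declines becomes an error 500) can be modelled as follows. The constant values mirror httpd's OK, DECLINED, DONE, and HTTP_INTERNAL_SERVER_ERROR; everything else (hook_fn, run_first, translate_name_phase, and the two sample hooks) is hypothetical and stands in for the real hook machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OK 0
#define DECLINED -1
#define DONE -2
#define HTTP_INTERNAL_SERVER_ERROR 500

/* A hook receives the request URI and returns a status code. */
typedef int (*hook_fn)(const char *uri);

/* Sample hook that never handles anything. */
int always_decline(const char *uri)
{
    (void)uri;
    return DECLINED;
}

/* Sample hook that only handles URIs under /icons/. */
int alias_like(const char *uri)
{
    return strncmp(uri, "/icons/", 7) == 0 ? OK : DECLINED;
}

/* Call each hook in order; the first result other than DECLINED wins. */
int run_first(const hook_fn *hooks, size_t n, const char *uri)
{
    size_t i;

    assert(hooks != NULL || n == 0);

    for (i = 0; i < n; i++) {
        int rv = hooks[i](uri);
        if (rv != DECLINED) {
            return rv;
        }
    }
    return DECLINED;
}

/* If every hook declined, the phase fails with an error 500. */
int translate_name_phase(const hook_fn *hooks, size_t n, const char *uri)
{
    int rv = run_first(hooks, n, uri);
    return rv == DECLINED ? HTTP_INTERNAL_SERVER_ERROR : rv;
}
```

In a running server the core's own translate_name hook normally answers last, which is why the all-declined case is rarely seen in practice.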

+ + +

Hook: map_to_storage

+

After the file or correct URI has been determined, the appropriate per-dir configurations are merged together. For example, mod_proxy compares and merges the appropriate <Proxy> sections. If the URI is nothing more than a local (non-proxy) TRACE request, the core handles the request and returns DONE. If no module answers this hook with OK or DONE, the core will run the request filename against the <Directory> and <Files> sections. If the request 'filename' isn't an absolute, legal filename, a note is set for later termination.

+ + +

URI Location Walk

+

Every request is hardened by a second ap_location_walk() call. This ensures that a translated request is still subjected to the configured <Location> sections. The request again borrows some or all of the processing from its previous location_walk above, so this step is almost always very efficient unless the translated URI mapped to a substantially different path or Virtual Host.

+ + +

Hook: header_parser

+

The main request then parses the client's headers. This + prepares the remaining request processing steps to better serve + the client's request.

+ +
+
+

The Security Phase

+

Needs Documentation. Code is:

+ +
if ((access_status = ap_run_access_checker(r)) != 0) {
+    return decl_die(access_status, "check access", r);
+}
+
+if ((access_status = ap_run_check_user_id(r)) != 0) {
+    return decl_die(access_status, "check user", r);
+}
+
+if ((access_status = ap_run_auth_checker(r)) != 0) {
+    return decl_die(access_status, "check authorization", r);
+}
+ +
+
+

The Preparation Phase

+

Hook: type_checker

+

The modules have an opportunity to test the URI or filename + against the target resource, and set mime information for the + request. Both mod_mime and + mod_mime_magic use this phase to compare the file + name or contents against the administrator's configuration and set the + content type, language, character set and request handler. Some modules + may set up their filters or other request handling parameters at this + time.

+ +

If all modules DECLINE this phase, an error 500 is + returned to the browser, and a "couldn't find types" error is logged + automatically.

+ + +

Hook: fixups

+

Many modules are 'trounced' by some phase above. The fixups + phase is used by modules to 'reassert' their ownership or force + the request's fields to their appropriate values. It isn't + always the cleanest mechanism, but occasionally it's the only + option.

+ +
top
+
+

The Handler Phase

+

This phase is not part of the processing in ap_process_request_internal(). Many modules prepare one or more subrequests prior to creating any content at all. After the core or a module calls ap_process_request_internal(), it then calls ap_invoke_handler() to generate the response.

+ +

Hook: insert_filter

+

Modules that transform the content in some way can insert their filters and override existing filters, so that if the user configured a more advanced filter out of order, the module can adjust its position as needed. There is no result code, so actions in this hook must be trusted to always succeed.

+ + +

Hook: handler

+

The module finally has a chance to serve the request in its handler hook. Note that not every prepared request is sent to the handler hook. Many modules, such as mod_autoindex, will create subrequests for a given URI and then never serve the subrequest, but simply list it for the user. Remember not to put required teardown from the hooks above into the handler hook; instead, register pool cleanups against the request pool to free resources as required.

+ +
+
+

Available Languages:  en 

+
top

Comments

Notice:
This is not a Q&A section. Comments placed here should be pointed towards suggestions on improving the documentation or server, and may be removed by our moderators if they are either implemented or considered invalid/off-topic. Questions on how to manage the Apache HTTP Server should be directed at either our IRC channel, #httpd, on Libera.chat, or sent to our mailing lists.
+
+ \ No newline at end of file diff --git a/docs/manual/developer/thread_safety.html b/docs/manual/developer/thread_safety.html new file mode 100644 index 0000000..8196302 --- /dev/null +++ b/docs/manual/developer/thread_safety.html @@ -0,0 +1,5 @@ +# GENERATED FROM XML -- DO NOT EDIT + +URI: thread_safety.html.en +Content-Language: en +Content-type: text/html; charset=UTF-8 diff --git a/docs/manual/developer/thread_safety.html.en b/docs/manual/developer/thread_safety.html.en new file mode 100644 index 0000000..7e842d8 --- /dev/null +++ b/docs/manual/developer/thread_safety.html.en @@ -0,0 +1,307 @@ + + + + + +Apache HTTP Server 2.x Thread Safety Issues - Apache HTTP Server Version 2.4 + + + + + + + +
+

Apache HTTP Server 2.x Thread Safety Issues

+
+

Available Languages:  en 

+
+ +

When using any of the threaded MPMs in the Apache HTTP Server 2.x, it is important that every function called from Apache be thread safe. When linking in 3rd party extensions, it can be difficult to determine whether the resulting server will be thread safe. Casual testing generally won't tell you this either, as thread safety problems can lead to subtle race conditions that may only show up in certain conditions under heavy load.

+
+ +
top
+
+

Global and static variables

+

When writing your module or when trying to determine if a module or + 3rd party library is thread safe there are some common things to keep in + mind.

+ +

First, you need to recognize that in a threaded model each individual + thread has its own program counter, stack and registers. Local variables + live on the stack, so those are fine. You need to watch out for any + static or global variables. This doesn't mean that you are absolutely not + allowed to use static or global variables. There are times when you + actually want something to affect all threads, but generally you need to + avoid using them if you want your code to be thread safe.

+ +

In the case where you have a global variable that needs to be global and accessed by all threads, be very careful when you update it. If, for example, it is an incrementing counter, you need to atomically increment it to avoid race conditions with other threads. You do this using a mutex (mutual exclusion). Lock the mutex, read the current value, increment it, write it back, and then unlock the mutex. Any other thread that wants to modify the value must first acquire the mutex, blocking until it is released.

+ +

If you are using APR, have a look + at the apr_atomic_* functions and the + apr_thread_mutex_* functions.

+ +
top
+
+

errno

+

This is a common global variable that holds the error number of the + last error that occurred. If one thread calls a low-level function that + sets errno and then another thread checks it, we are bleeding error + numbers from one thread into another. To solve this, make sure your + module or library defines _REENTRANT or is compiled with + -D_REENTRANT. This will make errno a per-thread variable + and should hopefully be transparent to the code. It does this by doing + something like this:

+ +

+ #define errno (*(__errno_location())) +

+ +

which means that accessing errno will call + __errno_location() which is provided by the libc. Setting + _REENTRANT also forces redefinition of some other functions + to their *_r equivalents and sometimes changes + the common getc/putc macros into safer function + calls. Check your libc documentation for specifics. Instead of, or in + addition to _REENTRANT the symbols that may affect this are + _POSIX_C_SOURCE, _THREAD_SAFE, + _SVID_SOURCE, and _BSD_SOURCE.

+
top
+
+

Common standard troublesome functions

+

Not only do things have to be thread safe, but they also have to be reentrant. strtok() is an obvious one. You call it the first time with the string to tokenize, and it remembers its position in that string in static storage; on each subsequent call it returns the next token. Obviously, if multiple threads are calling it, you will have a problem. Most systems have a reentrant version of the function called strtok_r(), where you pass in an extra argument, the address of a char * that you supply, which the function will use instead of its own static storage for maintaining the tokenizing state. If you are using APR you can use apr_strtok().

+ +

crypt() is another function that tends not to be reentrant, so if you run across calls to that function in a library, watch out. On some systems it is reentrant, though, so it is not always a problem. If your system has crypt_r(), chances are you should be using that, or if possible simply avoid the whole mess by using MD5 instead.

+ +
top
+
+

Common 3rd Party Libraries

+

The following is a list of common libraries that are used by 3rd party + Apache modules. You can check to see if your module is using a potentially + unsafe library by using tools such as ldd(1) and + nm(1). For PHP, for example, + try this:

+ +

+ % ldd libphp4.so
+ libsablot.so.0 => /usr/local/lib/libsablot.so.0 (0x401f6000)
+ libexpat.so.0 => /usr/lib/libexpat.so.0 (0x402da000)
+ libsnmp.so.0 => /usr/lib/libsnmp.so.0 (0x402f9000)
+ libpdf.so.1 => /usr/local/lib/libpdf.so.1 (0x40353000)
+ libz.so.1 => /usr/lib/libz.so.1 (0x403e2000)
+ libpng.so.2 => /usr/lib/libpng.so.2 (0x403f0000)
+ libmysqlclient.so.11 => /usr/lib/libmysqlclient.so.11 (0x40411000)
+ libming.so => /usr/lib/libming.so (0x40449000)
+ libm.so.6 => /lib/libm.so.6 (0x40487000)
+ libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0x404a8000)
+ libjpeg.so.62 => /usr/lib/libjpeg.so.62 (0x404e7000)
+ libcrypt.so.1 => /lib/libcrypt.so.1 (0x40505000)
+ libssl.so.2 => /lib/libssl.so.2 (0x40532000)
+ libcrypto.so.2 => /lib/libcrypto.so.2 (0x40560000)
+ libresolv.so.2 => /lib/libresolv.so.2 (0x40624000)
+ libdl.so.2 => /lib/libdl.so.2 (0x40634000)
+ libnsl.so.1 => /lib/libnsl.so.1 (0x40637000)
+ libc.so.6 => /lib/libc.so.6 (0x4064b000)
+ /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000) +

+ +

In addition to these libraries you will need to have a look at any + libraries linked statically into the module. You can use nm(1) + to look for individual symbols in the module.

+
top
+
+

Library List

+

Please drop a note to dev@httpd.apache.org + if you have additions or corrections to this list.

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Library | Version | Thread Safe? | Notes
ASpell/PSpell | | ? |
Berkeley DB | 3.x, 4.x | Yes | Be careful about sharing a connection across threads.
bzip2 | | Yes | Both low-level and high-level APIs are thread-safe. However, high-level API requires thread-safe access to errno.
cdb | | ? |
C-Client | | Perhaps | c-client uses strtok() and gethostbyname() which are not thread-safe on most C library implementations. c-client's static data is meant to be shared across threads. If strtok() and gethostbyname() are thread-safe on your OS, c-client may be thread-safe.
libcrypt | | ? |
Expat | | Yes | Need a separate parser instance per thread
FreeTDS | | ? |
FreeType | | ? |
GD 1.8.x | | ? |
GD 2.0.x | | ? |
gdbm | | No | Errors returned via a static gdbm_error variable
ImageMagick | 5.2.2 | Yes | ImageMagick docs claim it is thread safe since version 5.2.2 (see Change log).
Imlib2 | | ? |
libjpeg | v6b | ? |
libmysqlclient | | Yes | Use mysqlclient_r library variant to ensure thread-safety. For more information, please read http://dev.mysql.com/doc/mysql/en/Threaded_clients.html.
Ming | 0.2a | ? |
Net-SNMP | 5.0.x | ? |
OpenLDAP | 2.1.x | Yes | Use ldap_r library variant to ensure thread-safety.
OpenSSL | 0.9.6g | Yes | Requires proper usage of CRYPTO_num_locks, CRYPTO_set_locking_callback, CRYPTO_set_id_callback
liboci8 (Oracle 8+) | 8.x, 9.x | ? |
pdflib | 5.0.x | Yes | PDFLib docs claim it is thread safe; changes.txt indicates it has been partially thread-safe since V1.91: http://www.pdflib.com/products/pdflib-family/pdflib/.
libpng | 1.0.x | ? |
libpng | 1.2.x | ? |
libpq (PostgreSQL) | 8.x | Yes | Don't share connections across threads and watch out for crypt() calls
Sablotron | 0.95 | ? |
zlib | 1.1.4 | Yes | Relies upon thread-safe zalloc and zfree functions. Default is to use libc's calloc/free which are thread-safe.
+
+
+

Available Languages:  en 

+
top

+
+ \ No newline at end of file -- cgit v1.2.3