Diffstat (limited to '')
-rw-r--r-- ROADMAP 229
1 files changed, 229 insertions, 0 deletions
diff --git a/ROADMAP b/ROADMAP
new file mode 100644
index 0000000..4b3f662
--- /dev/null
+++ b/ROADMAP
@@ -0,0 +1,229 @@
+APACHE 2.x ROADMAP
+==================
+Last modified at [$Date: 2020-02-20 19:33:40 -0500 (Thu, 20 Feb 2020) $]
+
+
+WORKS IN PROGRESS
+-----------------
+
+ * Source code should follow style guidelines.
+ OK, we all agree pretty code is good. Probably best to clean this
+ up by hand immediately upon branching a 2.1 tree.
+ Status: Justin volunteers to hand-edit the entire source tree ;)
+
+ Justin says:
+ Recall when the release plan for 2.0 was written:
+ Absolute Enforcement of an "Apache Style" for code.
+ Watch this slip into 3.0.
+
+ David says:
+ The style guide needs to be reviewed before this can be done.
+ http://httpd.apache.org/dev/styleguide.html
+ The current file is dated April 20th 1998!
+
+ OtherBill offers:
+ It's survived since '98 because it's well done :-) Suggest we
+ simply follow whatever is documented in styleguide.html as we
+ branch the next tree. Really sort of straightforward, if you
+ dislike a bit within that doc, bring it up on the dev@httpd
+ list prior to the next branch.
+
+ So Bill sums up ... let's get the code cleaned up in CVS head.
+ Remember, it just takes cvs diff -b (that is, --ignore-space-change)
+ to see the code changes and ignore that cruft. Get editing Justin :)
+
+ * Replace stat [deferred open] with open/fstat in directory_walk.
+ Justin, Ian, OtherBill all interested in this. Implies setting up
+ the apr_file_t member in request_rec, having all modules use
+ that file, and allowing the cleanup to close it [if it isn't a
+ shared, cached file handle.]
+
+ * The Async Apache Server implemented in terms of APR.
+ [Bill Stoddard's pet project.]
+ Message-ID: <008301c17d42$9b446970$01000100@sashimi> (dev@apr)
+
+ OtherBill notes that this can proceed in two parts...
+
+ Async accept, setup, and tear-down of the request
+ e.g. dealing with the incoming request headers, prior to
+ dispatching the request to a thread for processing.
+ This doesn't need to wait for a 2.x/3.0 bump.
+
+ Async delegation of the entire request processing chain
+ Too many handlers use stack storage and presume it is
+ available for the life of the request, so a complete
+ async implementation would need to happen in a 3.0 release.
+
+ Brian notes that async writes will provide a bigger
+ scalability win than async reads for most servers.
+ We may want to try a hybrid sync-read/async-write MPM
+ as a next step. This should be relatively easy to
+ build: start with the current worker or leader/followers
+ model, but hand off each response brigade to a "completion
+ thread" that multiplexes writes on many connections, so
+ that the worker thread doesn't have to wait around for
+ the sendfile to complete.
+
+
+MAKING APACHE REPOSITORY-AGNOSTIC
+(or: remove knowledge of the filesystem)
+
+[ 2002/10/01: discussion in progress on items below; this isn't
+ planned yet ]
+
+ * dav_resource concept for an HTTP resource ("ap_resource")
+
+ * r->filename, r->canonical_filename, r->finfo need to
+ disappear. All users need to use new APIs on the ap_resource
+ object.
+
+ (backwards compat: today, when this occurs with mod_dav and a
+ custom backend, the above items refer to the topmost directory
+ mapped by a location; e.g. docroot)
+
+ Need to preserve a 'filename'-like string for mime-by-name
+ sorts of operations. But this only needs to be the name itself
+ and not a full path.
+
+ Justin: Can we leverage the path info, or do we not trust the
+ user?
+
+ gstein: well, it isn't the "path info", but the actual URI of
+ the resource. And of course we trust the user... that is
+ the resource they requested.
+
+ dav_resource->uri is the field you want. path_info might
+ still exist, but that portion might be related to the
+ CGI concept of "path translated" or some other further
+ resolution.
+
+ To continue, I would suggest that "path translated" and
+ having *any* path info is Badness. It means that you did
+ not fully resolve a resource for the given URI. The
+ "abs_path" in a URI identifies a resource, and that
+ should get fully resolved. None of this "resolve to
+ <here> and then we have a magical second resolution
+ (inside the CGI script)" or somesuch.
+
+ Justin: Well, let's consider mod_mbox for a second. It is sort of
+ a virtual filesystem in its own right - as it introduces
+ its own notion of a URI space, but it is intrinsically
+ tied to the filesystem to do the lookups. But, for the
+ portion that isn't resolved on the file system, it has
+ its own addressing scheme. Do we need the ability to
+ layer resolution?
+
+ * The translate_name hook goes away
+
+ Wrowe altogether disagrees. translate_name today even operates
+ on URIs ... this mechanism needs to be preserved.
+
+ * The doc for map_to_storage is totally opaque to me. It has
+ something to do with filesystems, but it also talks about
+ security and per_dir_config and other stuff. I presume something
+ needs to happen there -- at least better doc.
+
+ Wrowe agrees and will write it up.
+
+ * The directory_walk concept disappears. All configuration is
+ tagged to Locations. The "mod_filesystem" module might have some
+ internal concept of the same config appearing in multiple
+ places, but that is handled internally rather than by Apache
+ core.
+
+ Wrowe suggests this is wrong; instead, it's private to filesystem
+ requests, and is already invoked from map_to_storage, not the core
+ handler. <Directory > and <Files > blocks are preserved as-is,
+ but <Directory > sections become specific to the filesystem handler
+ alone. Because alternate filesystem schemes could be loaded, this
+ should be exposed, from the core, for other file-based stores to
+ share. Consider an archive store where the layers become
+ <Directory path> -> <Archive store> -> <File name>
+
+ Justin: How do we map Directory entries to Locations?
+
+ * The "Location tree" is an in-memory representation of the URL
+ namespace. Nodes of the tree have configuration specific to that
+ location in the namespace.
+
+ Something like:
+
+ typedef struct {
+     const char *name; /* name of this node relative to parent */
+
+     struct ap_conf_vector_t *locn_config;
+
+     apr_hash_t *children; /* NULL if no child configs */
+ } ap_locn_node;
+
+ The following config:
+
+ <Location /server-status>
+ SetHandler server-status
+ Order deny,allow
+ Deny from all
+ Allow from 127.0.0.1
+ </Location>
+
+ Creates a node with name=="server-status", and the node is a
+ child of the "/" node. (hmm. node->name is redundant with the
+ hash key; maybe drop node->name)
+
+ In the config vector, mod_access has stored its Order, Deny, and
+ Allow configs. mod_core has stored the SetHandler.
+
+ During the Location walk, we merge the config vectors normally.
+
+ Note that an Alias simply associates a filesystem path (in
+ mod_filesystem) with that Location in the tree. Merging
+ continues with child locations, but a merge is never done
+ through filesystem locations. Config on a specific subdir needs
+ to be mapped back into the corresponding point in the Location
+ tree for proper merging.
+
+ * Config is parsed into a tree, as we did for the 2.0 timeframe,
+ but that tree is just a representation of the config (for
+ multiple runs and for in-memory manipulation and usage). It is
+ unrelated to the "Location tree".
+
+ * Calls to apr_file_io functions generally need to be replaced
+ with operations against the ap_resource. For example, rather
+ than calling apr_dir_open/read/close(), a caller uses
+ resource->repos->get_children() or somesuch.
+
+ Note that things like mod_dir, mod_autoindex, and mod_negotiation
+ need to be converted to use these mechanisms so that their
+ functions will work on logical repositories rather than just
+ filesystems.
+
+ * How do we handle CGI scripts? Especially when the resource may
+ not be backed by a file? Ideally, we should be able to come up
+ with some mechanism to allow CGIs to work in a
+ repository-independent manner.
+
+ - Writing the virtual data as a file and then executing it?
+ - Can a shell be executed in a streamy manner? (Portably?)
+ - Have an 'execute_resource' hook/func that allows the
+ repository to choose its manner - be it exec() or whatever.
+ - Won't this approach lead to duplication of code? Helper fns?
+
+ gstein: PHP, Perl, and Python scripts are nominally executed by
+ a filter inserted by mod_php/perl/python. I'd suggest
+ that shell/batch scripts are similar.
+
+ But to ask further: what if it is an executable
+ *program* rather than just a script? Do we yank that out
+ of the repository, drop it onto the filesystem, and run
+ it? eeewwwww...
+
+ I'll vote -0.9 for CGIs as a filter. Keep 'em handlers.
+
+ Justin: So, do we give up executing CGIs from virtual repositories?
+ That seems like a sad tradeoff to make. I'd like to have
+ my CGI scripts under DAV (SVN) control.
+
+ * How do we handle overlaying of Location and Directory entries?
+ Right now, we have a problem when /cgi-bin/ is ScriptAlias'd and
+ mod_dav has control over /. Some people believe that /cgi-bin/
+ shouldn't be under DAV control, while others do believe it
+ should be. What's the right strategy?