diff options
Diffstat (limited to 'src/lib/hooks/hooks_maintenance.dox')
-rw-r--r-- | src/lib/hooks/hooks_maintenance.dox | 381 |
1 files changed, 381 insertions, 0 deletions
diff --git a/src/lib/hooks/hooks_maintenance.dox b/src/lib/hooks/hooks_maintenance.dox new file mode 100644 index 0000000..75e1227 --- /dev/null +++ b/src/lib/hooks/hooks_maintenance.dox @@ -0,0 +1,381 @@ +// Copyright (C) 2013-2020 Internet Systems Consortium, Inc. ("ISC") +// +// This Source Code Form is subject to the terms of the Mozilla Public +// License, v. 2.0. If a copy of the MPL was not distributed with this +// file, You can obtain one at http://mozilla.org/MPL/2.0/. + +// Note: the prefix "hooksmg" to all labels is an abbreviation for "Hooks +// Maintenance Guide" and is used to prevent a clash with symbols in any +// other Doxygen file. + +/** + @page hooksmgMaintenanceGuide Hooks Maintenance Guide + + @section hooksmgIntroduction Introduction + + This document is aimed at Kea maintainers responsible for the hooks + system. It provides an overview of the classes that make up the hooks + framework and notes important aspects of processing. More detailed + information can be found in the source code. + + It is assumed that the reader is familiar with the contents of the @ref + hooksdgDevelopersGuide and the @ref hooksComponentDeveloperGuide. + + @section hooksmgObjects Hooks Framework Objects + + The relationships between the various objects in the hooks framework + is shown below: + + @image html HooksUml.png "High-Level Class Diagram of the Hooks Framework" + + (To avoid clutter, the @ref hooksmgServerHooks object, used to pass + information about registered hooks to the components, is not shown on + the diagram.) + + The hooks framework objects can be split into user-side objects and + server-side objects. The former are those objects used or referenced + by user-written hooks libraries. The latter are those objects used in + the hooks framework. + + @subsection hooksmgUserObjects User-Side Objects + + The user-side code is able to access two objects in the framework, + the @ref hooksmgCalloutHandle and the @ref hooksmgLibraryHandle. + The @ref hooksmgCalloutHandle is used to pass data between the Kea + component and the loaded library; the @ref hooksmgLibraryHandle is used + for registering callouts. + + @subsubsection hooksmgCalloutHandle Callout Handle + + The @ref isc::hooks::CalloutHandle has two functions: passing arguments + between the Kea component and the user-written library, and storing + per-request context between library calls. In both cases the data is + stored in a @c std::map structure, keyed by argument (or context item) name. + The actual data is stored in a @c boost::any object, which allows any + data type to be stored, although a penalty for this flexibility is + the restriction (mentioned in the @ref hooksdgDevelopersGuide) that + the type of data retrieved must be identical (and not just compatible) + with that stored. + + The storage of context data is slightly complex because there is + separate context for each user library. For this reason, the @ref + hooksmgCalloutHandle has multiple maps, one for each library loaded. + The maps are stored in another map, the appropriate map being identified + by the "current library index" (this index is explained further below). + The reason for the second map (rather than a structure such as a vector) + is to avoid creating individual context maps unless needed; given the + key to the map (in this case the current library index) accessing an + element in a map using the operator[] method returns the element in + question if it exists, or creates a new one (and stores it in the map) + if its doesn't. + + @subsubsection hooksmgLibraryHandle Library Handle + + Little more than a restricted interface to the @ref + hooksmgCalloutManager, the @ref isc::hooks::LibraryHandle allows a + callout to register and deregister callouts. However, there are some + quirks to callout registration which, although the processing involved + is in the @ref hooksmgCalloutManager, are best described here. + + Firstly, a callout can be deregistered by a function within a user + library only if it was registered by a function within that library. That + is to say, if library A registers the callout A_func() on hook "alpha" + and library B registers B_func(), functions within library A are only + able to remove A_func() (and functions in library B remove B_func()). + The restriction - here to prevent one library interfering with the + callouts of another - is enforced by means of the current library index. + As described below, each entry in the vector of callouts associated with + a hook is a pair object, comprising a pointer to the callout and + the index of the library with which it is associated. A callout + can only modify entries in that vector where the current library index + matches the index element of the pair. + + A second quirk is that when dynamically modifying the list of callouts, + the change only takes effect when the current call out from the server + completes. To clarify this, suppose that functions A_func(), B_func() + and C_func() are registered on a hook, and the server executes a callout + on the hook. Suppose also during this call, A_func() removes the callout + C_func() and that B_func() adds D_func(). As changes only take effect + when the current call out completes, the user callouts executed will be + A_func(), B_func() then C_func(). When the server calls the hook callouts + again, the functions executed will be A_func(), B_func() and D_func(). + + This restriction is down to implementation. When a set of callouts on a hook + is being called, the @ref hooksmgCalloutManager iterates through a + vector (the "callout vector") of (index, callout pointer) pairs. Since + registration or deregistration of a callout on that hook would change the + vector (and so potentially invalidate the iterators used to access the it), + a copy of the vector is taken before the iteration starts. The @ref + hooksmgCalloutManager iterates over this copy while any changes made + by the callout registration functions affect the relevant callout vector. + Such approach was chosen because of performance considerations. + + @subsection hooksmgServerObjects Server-Side Objects + + Those objects are not accessible by user libraries. Please do not + attempt to use them if you are developing user callouts. + + @subsubsection hooksmgServerHooks Server Hooks + + The singleton @ref isc::hooks::ServerHooks object is used to register + hooks. It is little more than a wrapper around a map of (hook index, + hook name), generating a unique number (the hook index) for each + hook registered. It also handles the registration of the pre-defined + context_create and context_destroy hooks. + + In operation, the @ref hooksmgHooksManager provides a thin wrapper + around it, so that the Kea component developer does not have to + worry about another object. + + @subsubsection hooksmgLibraryManager Library Manager + + An @ref isc::hooks::LibraryManager is created by the @ref + hooksmgHooksManager object for each shared library loaded. It + controls the loading and unloading of the library and in essence + represents the library in the hooks framework. It also handles the + registration of the standard callouts (functions in the library with + the same name as the hook name). + + Of particular importance is the "library's index", a number associated + with the library. This is passed to the LibraryManager at creation + time and is used to tag the callout pointers. It is discussed + further below. + + As the @c LibraryManager provides all the methods needed to manage the + shared library, it is the natural home for the static @c validateLibrary() + method. The function called the parsing of the Kea configuration, when + the "hooks-libraries" element is processed. It checks that shared library + exists, that it can be opened, that it contains the @c version() function + and that that function returns a valid value. It then closes the shared + library and returns an appropriate indication as to the library status. + + @subsubsection hooksmgLibraryManagerCollection Library Manager Collection + + The hooks framework can handle multiple libraries and as + a result will create a @ref hooksmgLibraryManager for each + of them. The collection of LibraryManagers is managed by the + @ref isc::hooks::LibraryManagerCollection object which, in most + cases has a method corresponding to a @ref hooksmgLibraryManager + method, e.g. it has a @c loadLibraries() that corresponds to the @ref + hooksmgLibraryManager's loadLibrary() call. As would be expected, methods + on the @c LibraryManagerCollection iterate through all specified libraries, + calling the corresponding LibraryManager method for each library. + + One point of note is that @c LibraryManagerCollection operates on an "all + or none" principle. When @c loadLibraries() is called, on exit either all + libraries have been successfully opened or none of them have. There + is no use-case in Kea where, after a user has specified the shared + libraries they want to load, the system will operate with only some of + them loaded. + + The @c LibraryManagerCollection is the place where each library's index is set. + Each library is assigned a number ranging from 1 through to the number + of libraries being loaded. As mentioned in the previous section, this + index is used to tag callout pointers, something that is discussed + in the next section. + + (Whilst on the subject of library index numbers, two additional + numbers - 0 and @c INT_MAX - are also valid as "current library index". + For flexibility, the Kea component is able to register its own + functions as hook callouts. It does this by obtaining a suitable @ref + hooksmgLibraryHandle from the @ref hooksmgHooksManager. A choice + of two is available: one @ref hooksmgLibraryHandle (with an index + of 0) can be used to register a callout on a hook to execute before + any user-supplied callouts. The second (with an index of @c INT_MAX) + is used to register a callout to run after user-specified callouts. + Apart from the index number, the hooks framework does not treat these + callouts any differently from user-supplied ones.) + + @subsubsection hooksmgCalloutManager Callout Manager + + The @ref isc::hooks::CalloutManager is the core of the framework insofar + as the registration and calling of callouts is concerned. + + It maintains a "hook vector" - a vector with one element for + each registered hook. Each element in this vector is itself a + vector (the callout vector), each element of which is a pair of + (library index, callback pointer). When a callout is registered, the + CalloutManager's current library index is used to supply the "library + index" part of the pair. The library index is set explicitly by the + @ref hooksmgLibraryManager prior to calling the user library's load() + function (and prior to registering the standard callbacks). + + The situation is slightly more complex when a callout is executing. In + order to execute a callout, the CalloutManager's @c callCallouts() + method must be called. This iterates through the callout vector for + a hook and for each element in the vector, uses the "library index" + part of the pair to set the "current library index" before calling the + callout function recorded in the second part of the pair. In most cases, + the setting of the library index has no effect on the callout. However, + if the callout wishes to dynamically register or deregister a callout, + the @ref hooksmgLibraryHandle (see above) calls a method on the + @ref hooksmgCalloutManager which in turn uses that information. + + @subsubsection hooksmgHooksManager Hooks Manager + + The @ref isc::hooks::HooksManager is the main object insofar as the + server is concerned. It controls the creation of the library-related + objects and provides the framework in which they interact. It also + provides a shell around objects such as @ref hooksmgServerHooks so that all + interaction with the hooks framework by the server is through the + HooksManager object. Apart from this, it supplies no functionality to + the hooks framework. + + @section hooksmgOtherIssues Other Issues + + @subsection hooksmgMemoryAllocation Memory Allocation + + Unloading a shared library works by unmapping the part of the process's + virtual address space in which the library lies. This may lead to + problems if there are still references to that address space elsewhere + in the process. + + In many operating systems, heap storage allowed by a shared library + will lie in the virtual address allocated to the library. This has + implications in the hooks framework because: + + - Argument information stored in a @ref hooksmgCalloutHandle by a + callout in a library may lie in the library's address space. + + - Data modified in objects passed as arguments may lie in the address + space. For example, it is common for a DHCP callout to add "options" + to a packet: the memory allocated for those options will most likely + lie in library address space. + + The problem really arises because of the extensive use by Kea of + boost smart pointers. When the pointer is destroyed, the pointed-to + memory is deallocated. If the pointer points to address space that is + unmapped because a library has been unloaded, the deletion causes a + segmentation fault. + + The hooks framework addresses the issue for the @ref hooksmgCalloutHandle + by keeping in that object a shared pointer to the object controlling + library unloading (the @ref hooksmgLibraryManagerCollection). Although + the libraries can be unloaded at any time, it is only when every + @ref hooksmgCalloutHandle that could possibly reference address space in the + library have been deleted that the library will actually be unloaded + and the address space unmapped. + + The hooks framework cannot solve the second issue as the objects in + question are under control of the Kea server incorporating the + hooks. It is up to the server developer to ensure that all such objects + have been destroyed before libraries are reloaded. In extreme cases + this may mean the server suspending all processing of incoming requests + until all currently executing requests have completed and data object + destroyed, reloading the libraries, then resuming processing. + + Since Kea 1.7.10 the unload() entry point is called as the first phase + of unloading. This gives more chance to hooks writer to perform + necessary cleanup actions so the second phase, memory unmapping + can safely happen. The @c isc::hooks::unloadLibraries() function + was updated too to return false when at least one active callout + handle remained. + + @subsection hooksmgStaticLinking Hooks and Statically-Linked Kea + + Kea has the configuration option to allow static linking. What this + means is that it links against the static Kea libraries and not + the sharable ones - although it links against the sharable system + libraries like "libc" and "libstdc++" and well as the sharable libraries + for third-party packages such as log4cplus and MySql. + + Static linking poses a problem for dynamically-loaded hooks libraries + as some of the code in them - in particular the hooks framework and + the logging code - depend on global objects created within the Kea + libraries. In the normal course of events (Kea linked against + shared libraries), when Kea is run and the operating system loads + a Kea shared library containing a global object, address space + is assigned for it. When the hooks framework loads a user-library + linked against the same Kea shared library, the operating system + recognizes that the library is already loaded (and initialized) and + uses its definition of the global object. Thus both the code in the + Kea image and the code in the user-written shared library + reference the same object. + + If Kea is statically linked, the linker allocates address space + in the Kea image for the global object and does not include any + reference to the shared library containing it. When Kea now loads + the user-written shared library - and so loads the Kea library code + containing the global object - the operating system does not know that + the object already exists. Instead, it allocates new address space. + The version of Kea in memory therefore has two copies of the object: + one referenced by code in the Kea image, and one referenced by code + in the user-written hooks library. This causes problems - information + put in one copy is not available to the other. + + Particular problems were encountered with global objects the hooks library + and in the logging library, so some code to alleviate the problem has been + included. + + The issue in the hooks library is the singleton @ref + isc::hooks::ServerHooks object, used by the user-written hooks library + if it attempts to register or deregister callouts. The contents of the + singleton - the names of the hook points and their index - are set by + the relevant Kea server; this information is not available in the + singleton created in the user's hooks library. + + Within the code users by the user's hooks library, the @c ServerHooks + object is used by @ref isc::hooks::CalloutHandle and @ref + isc::hooks::CalloutManager objects. Both these objects are passed to the + hooks library code when a callout is called: the former directly through + the callout argument list, the latter indirectly as a pointer to it is + stored in the CalloutHandle. This allows a solution to the problem: + instead of accessing the singleton via @c ServerHooks::getServerHooks(), + the constructors of these objects store a reference to the singleton + @c ServerHooks when they are created and use that reference to access + @c ServerHooks data. Since both @c CalloutHandle and @c CalloutManager are + created in the statically-linked Kea server, use of the reference + means that it is the singleton within the server - and not the one + within the user's hooks library - that is referenced. + + The solution of the logging problem is not so straightforward. Within + Kea, there are two logging components, the Kea logging framework + and the log4cplus libraries. Owing to static linking, there are two + instances of the former; but as static linking uses shared libraries of + third-party products, there is one instance of the latter. What further + complicates matters is that initialization of the logging framework is + in two parts: static initialization and run-time initialization. + + The logging initialization comprises the following: + + -# Static initialization of the log4cplus global variables. + -# Static initialization of messages in the various Kea libraries. + -# Static initialization of logging framework. + -# Run-time initialization of the logging framework. + -# Run-time initialization of log4cplus + + As both the Kea server and the user-written hooks libraries use the + log4cplus shared library, item 1 - the static initialization of the log4cplus + global variables is performed once. + + The next two tasks - static initialization of the messages in the Kea + libraries and the static initialization of the logging framework - + are performed twice, once in the context of the Kea server and + once in the context of the hooks library. For this reason, run-time + initialization of the logging framework needs to be performed twice, + once in the context of the Kea server and once in the context of the + user-written hooks library. However, the standard logging framework + initialization code also performs the last task, initialization of + log4cplus, something that causes problems if executed more than once. + + To get round this, the function @ref isc::hooks::hooksStaticLinkInit() + has been written. It executes the only part of the logging framework + run-time initialization that actually pertains to the logging framework + and not log4cplus, namely loading the message dictionary with the + statically-initialized messages in the Kea libraries. + This should be executed by any hooks library linking against a statically + initialized Kea. (In fact, running it against a dynamically-linked + Kea should have no effect, as the load operation discards any duplicate + message entries.) The hooks library tests do this, the code being + conditionally compiled within a test of the @c USE_STATIC_LINK macro, set + by the configure script. + + @note Not everything is completely rosy with logging and static linking. + In particular, there appears to be an issue with the scenario where a + user-written hooks library is run by a statically-linked Kea and then + unloaded. As far as can be determined, on unload the system attempts to + delete the same logger twice. This is alleviated by explicitly clearing + the loggerptr_ variable in the @ref isc::log::Logger destructor, but there + is a suspicion that some memory might be lost in these circumstances. + This is still under investigation. +*/ |