Diffstat (limited to 'src/hooks/dhcp/high_availability/ha.dox')
-rw-r--r-- | src/hooks/dhcp/high_availability/ha.dox | 414 |
1 files changed, 414 insertions, 0 deletions
diff --git a/src/hooks/dhcp/high_availability/ha.dox b/src/hooks/dhcp/high_availability/ha.dox
new file mode 100644
index 0000000..257f551
--- /dev/null
+++ b/src/hooks/dhcp/high_availability/ha.dox
@@ -0,0 +1,414 @@
+// Copyright (C) 2017-2021 Internet Systems Consortium, Inc. ("ISC")
+//
+// This Source Code Form is subject to the terms of the Mozilla Public
+// License, v. 2.0. If a copy of the MPL was not distributed with this
+// file, You can obtain one at http://mozilla.org/MPL/2.0/.
+/**
+
+@page libdhcp_ha Kea High Availability Hooks Library
+
+Welcome to the Kea High Availability Hooks Library. This documentation is
+addressed to developers who are interested in the internal operation of the
+library. This file provides the information needed to understand and perhaps
+extend this library.
+
+@section haOverview Overview
+
+The High Availability (HA) hooks library is intended for DHCP deployments
+in which there is a need to sustain the DHCP service in the event that one
+of the servers becomes unavailable as a result of a crash, power outage or
+other unexpected situation. The other server belonging to this setup should
+be able to handle the entire DHCP traffic directed to the system, including
+the traffic that would normally be handled by the server which became
+unavailable.
+
+Many of the concepts behind the HA hooks library are derived from the
+DHCP Failover protocol. However, this solution has a different architecture
+and uses a different state machine and different message formats for
+communication between the participating servers. This solution is not a DHCP
+Failover implementation and, therefore, this documentation purposely avoids
+using the word "Failover" in the context of this library.
+
+The HA feature design can be found at
+<a href="https://gitlab.isc.org/isc-projects/kea/wikis/designs/High-Availability-Design">Kea HA Design page</a>.
+
+@section haWhyHookLibrary Why Hook Library?
+
+High Availability is a very important requirement for various DHCP
+deployments. It is a valid question why such a generic feature is
+placed in a hook library rather than implemented as an integral part of
+the Kea DHCP servers. If HA is implemented in a loadable library,
+users who don't use HA or who don't want to use this particular
+solution for HA will simply not load this library. The server code
+without the HA implementation is lighter, easier to understand and
+debug. High Availability is a fairly complex feature and will certainly
+keep growing both in size and complexity. Keeping it in a separate
+code base makes it easier to maintain and use. Also, the HA hooks
+library requires the Kea lease_cmds hook library to be loaded on the
+participating servers. It would clearly be a bad design to introduce
+a feature in the main Kea code that relies on the presence of a loadable
+(lease_cmds) module.
+
+@section haNotableDifferences Notable Differences to ISC DHCP
+
+It is worth briefly explaining the major differences between the Kea HA
+implementation and the failover implemented in ISC DHCP.
+
+The first difference concerns protocol coverage. The IETF attempted to
+standardize two protocols. One is the
+<a href="https://datatracker.ietf.org/doc/html/draft-ietf-dhc-failover">
+DHCPv4 Failover draft</a>, an Internet Draft that expired in Sept. 2003.
+The other one is <a href="https://tools.ietf.org/html/rfc8156">
+RFC8156: DHCPv6 Failover</a>, which was published as a Proposed Standard.
+ISC DHCP implemented the former, but not the latter. As such, ISC DHCP
+is able to provide failover for DHCPv4 only, not DHCPv6. The Kea HA hooks
+library works with both the DHCPv4 and the DHCPv6 servers.
+
+The second major difference is that both IETF failover protocols are based on
+MCLT (Maximum Client Lead Time), sometimes referred to as lazy
+updates. This mechanism lets a server respond immediately, which improves
+latency, but it does so at the cost of greatly increased complexity.
+The lease is assigned with a very short lifetime, then an update is sent to
+the other server with a lifetime greater than the client requested. Once the
+other server confirms the lease, the client's renewal is granted a longer
+lifetime. This approach generates more traffic and causes lease lifetimes to
+fluctuate greatly, despite an administrator setting them to a specific value.
+Kea HA does not implement this complexity. Its operation is much simpler and
+easier to use and understand, although the price to pay for this relative
+simplicity is a longer response time and somewhat decreased performance.
+
+The third difference is that in ISC DHCP the failover relationship is strictly
+a pair (i.e. two) of servers. Kea HA, on the other hand, is able to define
+additional backup servers. While they're not technically participating in the
+HA relationship, their databases are kept up to date and they can be used as
+replacements that are almost ready to take over the traffic. However, replacing
+the primary or secondary server with a backup requires manual intervention by
+an administrator.
+
+The fourth difference is that Kea HA does not support pool rebalancing yet.
+When running in load balancing mode, Kea uses a hashing mechanism to segregate
+clients into one of two pools. It is unlikely, but possible, that a network
+would be visited by clients that are predominantly assigned to one server.
+As a result, this server could run out of addresses, while its underutilized
+partner could still have many addresses available. This limitation, although
+unlikely to be encountered, will be removed in future Kea releases.
+
+@section haAyncCommunication Asynchronous Communication with Boost Asio
+
+One of the major technical problems with High Availability is that the
+participating servers must constantly communicate with each other.
+When one of the servers allocates a lease it must notify its peer about
+this allocation and provide it with full information about the
+allocated lease. The server which has allocated the lease must not
+respond to the client until its partner confirms that it has saved
+the lease in its database. This guarantees that, at any given time,
+both servers hold the most current lease information and any of the
+servers can take responsibility for managing existing leases if the
+partner server becomes unavailable. This is similar to the requirement
+on a single DHCP server which must store the lease information on
+the persistent storage before responding to the client. Failing to do
+so may cause the lease information to get lost if the server crashes
+before writing it to the lease file.
+
+The requirement for the partner to store the lease in its lease database
+and confirm this fact to the server allocating the lease results in
+increased latency of the DHCP responses to the clients. In order to
+minimize the latency the idea of "parking" DHCP packets has been introduced.
+This is a solution for pseudo-parallel processing of multiple DHCP packets
+which prevents a blocking wait during the communication with the other server.
+When the HA hooks library needs to send a lease update to the partner,
+the client's packet associated with this lease is "parked", waiting for
+the communication with the partner to complete. Meanwhile, other incoming
+DHCP packets are processed (and also parked if necessary). The client
+which sent the DHCP packet still has to wait for the communication with
+the partner to complete, but it doesn't have to wait for the server to
+receive its packet (and start processing it) while a previous DHCP
+transaction is still in progress.
+
+This solution requires that the communication between the servers is
+asynchronous and the most obvious framework for this was Boost ASIO,
+as it is already used in many different areas of the code.
+
+The DHCP servers process incoming packets synchronously (in a
+loop), but each loop pass contains a call to:
+
+@code
+getIOService()->poll();
+@endcode
+
+which executes callbacks for completed asynchronous operations, such as
+timers, asynchronous sends and receives. The instance of the IOService
+is owned by the DHCP servers, but hooks libraries must have access to it
+and must use this instance to schedule asynchronous tasks. This is why
+the new hook points "dhcp4_srv_configured" and "dhcp6_srv_configured"
+have been introduced. These hook points are used by the DHCPv4 and the
+DHCPv6 servers respectively, to pass the instance of the IOService
+(via the "io_context" argument) to the hooks libraries which need to
+schedule asynchronous tasks.
+
+It is also worth noting that the blocking reception of the DHCP packets
+may cause delays of up to 1 second in the asynchronous operations. This is
+due to the structure of the main server loop:
+
+@code
+bool
+Dhcpv4Srv::run() {
+    while (!shutdown_) {
+        try {
+            run_one();
+            getIOService()->poll();
+        } catch (const std::exception& e) {
+            // General catch-all for exceptions that are not caught by more
+            // specific catches. This one is for exceptions derived from
+            // std::exception.
+            LOG_ERROR(packet4_logger, DHCP4_PACKET_PROCESS_STD_EXCEPTION)
+                .arg(e.what());
+        } catch (...) {
+            // General catch-all for exceptions that are not caught by more
+            // specific catches. This one is for other exceptions, not
+            // derived from std::exception.
+            LOG_ERROR(packet4_logger, DHCP4_PACKET_PROCESS_EXCEPTION);
+        }
+    }
+
+    return (true);
+}
+@endcode
+
+The @c run_one() call includes a @c select() invocation with a timeout of
+1 second. The @c poll() is not invoked for at most 1 second while the server
+is performing this blocking @c select(). Future Kea releases should mitigate
+this problem by introducing some mechanisms for concurrent reception and
+processing of the DHCP packets.
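
To make the interaction between the synchronous loop and the deferred callbacks concrete, here is a small sketch which uses only the standard library. The @c SketchIOService class and the @c loopPass function are hypothetical stand-ins, not the real IOService or Boost ASIO types. The sketch assumes a single thread and models only the property discussed above: handlers that have become ready are not executed until the loop reaches @c poll().

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical, single-threaded stand-in for an IOService (an
// assumption made for illustration only).
class SketchIOService {
public:
    // Queues a handler, as a completed asynchronous operation would.
    void post(std::function<void()> handler) {
        ready_.push_back(std::move(handler));
    }

    // Runs all handlers that are ready, without blocking, and returns
    // the number of handlers executed.
    std::size_t poll() {
        std::size_t executed = 0;
        while (!ready_.empty()) {
            std::function<void()> handler = std::move(ready_.front());
            ready_.pop_front();
            handler();
            ++executed;
        }
        return (executed);
    }

private:
    std::deque<std::function<void()> > ready_;
};

// One pass of a server-like loop: handle one packet (possibly blocking
// while waiting for it), then run the ready handlers.
std::size_t loopPass(SketchIOService& io,
                     const std::function<void()>& run_one) {
    run_one();
    return (io.poll());
}
```

If the @c run_one argument blocks for a second waiting for a packet, every handler queued in the meantime waits for that second as well, which is the delay described above.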
+
+
+@section haClientClassification Client Classification in Load Balancing
+
+One of the top requirements for HA was to support load balancing between
+two participating servers. Even though the current implementation supports
+only a 50/50 split of packets between two servers, it can
+easily be extended to support different splits.
+
+Another supported mode of operation is the "hot-standby" mode in which
+one of the servers handles the entire traffic and the other server
+simply receives lease updates from it. If the first server fails,
+the standby server can automatically switch to handling the
+DHCP traffic directed to the system.
+
+The "load-balancing" mode is more complex in that it requires isolation
+of the address/prefix pools from which the respective servers allocate
+leases for the clients. If the two servers shared address pools
+they would frequently run into conflicts whereby both of them would
+allocate the same address to different clients. This is not a problem in
+the "hot-standby" mode because there is only one server allocating leases
+at any given time.
+
+The most challenging part of load balancing is the configuration
+of the address pools on the respective servers. At the time when the HA
+design was created, there was no requirement for the HA hooks library to be
+able to rebalance the pools, e.g. in case one of the pools is nearly exhausted
+and the other pool includes many available addresses or prefixes. This
+requirement may come in the future, in which case the current approach
+to the configuration may be enhanced.
+
+The current approach uses the existing client classification mechanism to
+statically split allocations across multiple pools. Client classification
+was designed to serve as a generic framework to support various scenarios
+in which clients need to be segregated and associated with selected
+pools, subnets and shared networks.
+The load balancing in the HA hooks library is simply another use case for
+client classification. Should new requirements be created for the HA hooks
+library in the future (e.g. rebalancing), the client classification will
+need to be extended to accommodate those requirements.
+
+In fact, client classification was already extended for the Kea 1.4.0
+release to allow for selecting a specific pool based on combinations
+of classes, rather than a single class associated with the server
+by the HA load balancing algorithm. Examples of pools split
+between different device types (e.g. laptops and telephones) and
+between load balancing servers (e.g. "server1" and "server2") can
+be found in the Kea Administrator's Manual.
+
+@section haCodeStructure HA Hooks Library Code Structure
+
+@subsection haService HA Service Class
+
+The @c isc::ha::HAService class is the heart of the HA system. It implements
+the HA state machine. It is derived from the @c isc::util::StateModel
+class. The states are documented both in the Kea Administrator's
+Manual and the HA design. The declarations of the states can be
+found in the @c ha_service_states.h header file because they are
+used by multiple C++ classes.
+
+Besides running the state machine transitions, the @c HAService
+class serves the following purposes:
+
+- Assigns to the received DHCP packet the class appropriate for the server
+  selected to process the packet as a result of load balancing.
+- Measures the clock skew between the active servers. If the clock skew
+  is too high, it can either log an error or stop the HA function.
+- Sends lease updates to the partner and receives responses.
+- Sends the heartbeat command to the partner to verify the partner's state
+  and its notion of time (for clock skew).
+- Controls whether the DHCP server should respond to the queries
+  from clients or not.
+- Synchronizes the local lease database by fetching the leases from the
+  partner server.
+- Controls which packets the server responds to (HA scopes).
+
+As of the Kea 1.4.0 release, there is only one instance of the @c HAService
+class created by the HA hooks library. In the future, multiple
+@c HAService instances may co-exist, each handling an independent HA
+relationship with another server. For example, a server could be
+configured to respond to devices in two subnets and establish a
+connection with two different servers for the respective subnets. Lease
+updates pertaining to the first subnet would be sent via the first
+connection and those pertaining to the second subnet would be sent
+via the second connection. As of the Kea 1.4.0 release, there is exactly
+one relationship that the Kea server instance can participate in.
+
+@subsection haImplementation HA Implementation Class
+
+The @c isc::ha::HAImpl class implements callouts and command handlers supported
+by the HA hooks library. Its methods expect @c isc::hooks::CalloutHandle
+as arguments and are usually directly called by the callout functions
+such as @c pkt4_receive. This makes it more natural to unit test
+those implementations because the tests can invoke methods of the @c HAImpl
+class, rather than the "extern" functions.
+
+Internally, the @c HAImpl class methods call methods of the @c HAService
+class to perform certain actions, such as triggering lease updates or
+sending a heartbeat to another server. However, the @c HAImpl still
+includes a fair amount of logic to retrieve and validate the arguments
+provided within the @c isc::hooks::CalloutHandle.
+
+The @c isc::ha::HAImpl::buffer4Receive and @c isc::ha::HAImpl::buffer6Receive
+functions deserve a more detailed explanation, because not only do they
+retrieve the arguments provided to the callouts but they also parse the
+received DHCP queries.
+
+The DHCP query parsing is normally performed by the server. In most
+cases a hooks library would not have to parse the DHCP packets on
+its own.
+If the hooks library needs to access some information, e.g.
+DHCP options or BOOTP message fields, it is sufficient to
+implement the @c pkt4_receive or @c pkt6_receive callout, which is
+invoked after the server has parsed the packet. However, this
+approach would not work in the case of the HA hooks library. This
+library assigns classes to the incoming packets as a result of the
+load balancing. This assignment must take place before the server
+evaluates classes specified in the configuration file, i.e.
+before the @c pkt4_receive and @c pkt6_receive hook points. This
+implies that the HA specific classification must be performed within
+the @c buffer4_receive or @c buffer6_receive callouts. These callouts
+must parse (unpack) the received buffers to gain access to the
+data used by the load balancing algorithm, such as the MAC address, client
+identifier or DUID.
+
+@subsection haQueryFilter Query Filter Class
+
+The @c isc::ha::QueryFilter class is used to control which DHCP queries are
+to be processed by the respective servers. It implements the load
+balancing algorithm which the cooperating servers run against
+each incoming packet and which results in assigning the packet to one of the
+served "scopes". Scopes are associated with the servers and are named
+after the servers. In the load balancing case there are two scopes,
+e.g. "server1" and "server2". The load balancing algorithm selects
+one of the scopes for the packet. During normal operation,
+each server handles its own scope. In the "partner-down" state, the
+surviving server handles both scopes. The selection of the
+scopes to be served by the server instance is usually made
+automatically as a result of transitioning to some new state within
+the @c HAService class. However, the scope assignment can also be
+made via the control channel as a result of an administrative action.
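
The scope selection described above can be sketched as follows. This is an illustration under stated assumptions, not the library's algorithm: the FNV-1a hash is a stand-in picked for brevity, and the function names @c sketchHash, @c selectScope and @c inScope are hypothetical. Only the scope names "server1" and "server2" and the 50/50 split come from the text.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// FNV-1a (32-bit) over the client's identifier. Assumption: a stand-in
// hash used for illustration; the library's own hash is different.
std::uint32_t sketchHash(const std::vector<std::uint8_t>& id) {
    std::uint32_t hash = 2166136261u;
    for (std::uint8_t byte : id) {
        hash ^= byte;
        hash *= 16777619u;
    }
    return (hash);
}

// Maps a client (MAC address, client identifier or DUID bytes) to one
// of the two scopes. The same identifier always yields the same scope.
std::string selectScope(const std::vector<std::uint8_t>& client_id) {
    return ((sketchHash(client_id) % 2 == 0) ? "server1" : "server2");
}

// Returns true if a server which serves the given scopes should
// process the query from this client.
bool inScope(const std::set<std::string>& served_scopes,
             const std::vector<std::uint8_t>& client_id) {
    return (served_scopes.count(selectScope(client_id)) > 0);
}
```

With both scopes in @c served_scopes, as in the "partner-down" state, @c inScope returns true for every client. With a single scope, each client is handled by exactly one of the two servers.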
+
+@subsection haCommunicationState Communication State Class
+
+The @c CommunicationState class is used by the @c HAService to
+control all aspects of the communication between the active servers,
+i.e.:
+
+- Scheduling periodic heartbeat commands using Boost ASIO timers.
+- Holding the state of the partner returned in response to the
+  heartbeat command.
+- Recording when the last successful heartbeat was sent, which indicates
+  how long the partner server has been unresponsive.
+- Analyzing DHCP queries to detect whether the partner server is
+  not responsive by checking whether the values in the 'secs' field
+  or Elapsed Time option are too high.
+- Monitoring the clock skew between the active servers, which is
+  calculated by subtracting the current time (on the local
+  server) from the time returned by the partner in response to the
+  heartbeat command.
+
+A large part of this class is common to the DHCPv4 and DHCPv6 servers.
+However, there are differences in how the DHCPv4 and the DHCPv6 messages
+are analyzed to detect whether the partner server has stopped responding:
+
+- The DHCPv4 server uses the 'secs' field, while the DHCPv6 server looks
+  into the DHCPv6 specific Elapsed Time option.
+- When the DHCPv4 server records information about a client whose query
+  went unanswered, it records both the client identifier and the MAC
+  address. The DHCPv6 server uses the DUID to record the client.
+
+Those differences led to the creation of DHCPv4 and DHCPv6 specific
+derivations of the @c CommunicationState class, which deal differently
+with the analysis of the queries.
+
+The clock skew is checked by the @c HAService class every time
+it is updated as a result of receiving a response to the heartbeat.
+If the clock skew is in the range of 30 to 60 seconds, the
+@c clockSkewShouldWarn method returns true to indicate to the @c HAService
+that a warning should be logged.
+In order to prevent too-frequent warnings (especially when heartbeats are
+sent frequently), this method implements a simple gating algorithm, which
+does not return true (trigger the warning) more often than once every
+60 seconds.
+
+The @c isc::ha::CommunicationState::clockSkewShouldTerminate method indicates
+whether the clock skew has exceeded 60 seconds, in which case the
+@c HAService class transitions to the "terminated" state.
+
+@subsection haCommandCreator Command Creator Class
+
+The @c CommandCreator is a collection of static methods which
+create commands issued between the HA-enabled DHCP servers. These
+JSON commands are sent over the @c isc::http::HttpClient from the
+@c HAService class.
+
+@section haShortcomings Future HA Hooks Library Improvement Ideas
+
+The HA hooks library was first released with Kea 1.4.0. There are
+numerous enhancements to this library considered for future releases.
+Some of them are briefly described in this section.
+
+@subsection haStateMachineControl Controlling State Machine
+
+As of Kea 1.4.0, there are no control commands allowing for setting or
+influencing the transitions between states. In particular, there is no
+way to pause the HA state machine in a selected state to perform
+some administrative actions before transitioning to the normal
+operation state.
+
+@subsection haNameUpdates DNS Updates are not Coordinated
+
+When one of the servers allocates a lease, this server is responsible
+for sending a DNS update if configured to send such updates. The partner
+server receives the lease update (including the inserted hostname) so
+it knows that the hostname was stored in the DNS. When this lease
+subsequently expires, the hostname must be removed from the DNS. The
+HA hooks library, however, has no means to record which server has
+allocated this lease in the lease database.
+If recording such information were possible, the same server which
+allocated the lease would send the removal name change request (NCR)
+to D2. Because this information is unavailable, both servers send the
+removal NCRs. One of those NCRs will succeed and the other one will fail.
+
+Addressing this issue requires two enhancements:
+
+- Implementing a "user context" for leases, which could be used for storing
+  custom information, e.g. a server identifier, along with the leases.
+- Implementing callouts for the "lease4_expire" and "lease6_expire" hook
+  points via which the server removing the lease from the database could
+  notify the partner about such removal.
+
+@section haMTCompatibility Multi-Threading Compatibility
+
+The High Availability hooks library is compatible with multi-threading.
+
+*/