Diffstat (limited to '')
-rw-r--r-- | doc/devel/fuzz.dox | 303 |
1 files changed, 303 insertions, 0 deletions
diff --git a/doc/devel/fuzz.dox b/doc/devel/fuzz.dox
new file mode 100644
index 0000000..f4ec931
--- /dev/null
+++ b/doc/devel/fuzz.dox
@@ -0,0 +1,303 @@
+// Copyright (C) 2017-2021 Internet Systems Consortium, Inc. ("ISC")
+//
+// This Source Code Form is subject to the terms of the Mozilla Public
+// License, v. 2.0. If a copy of the MPL was not distributed with this
+// file, You can obtain one at http://mozilla.org/MPL/2.0/.
+
+/**
+@page fuzzer Fuzzing Kea
+
+@section fuzzIntro Introduction
+
+Fuzzing is a software-testing technique whereby a program is presented with a
+variety of generated data as input and is monitored for abnormal conditions
+such as crashes or hangs.
+
+Fuzz testing of Kea uses the AFL (American Fuzzy Lop) program. In this, Kea is
+built using an AFL-supplied program that not only compiles the software but
+also instruments it. When run, AFL generates test cases and monitors the
+execution of Kea as it processes them. AFL will adjust the input based on
+these measurements, seeking to discover and test new execution paths.
+
+@section fuzzTypes Types of Kea Fuzzing
+
+@subsection fuzzTypeNetwork Fuzzing with Network Packets
+
+In this mode, AFL will start an instance of Kea and send it a packet of data.
+Kea reads this packet and processes it in the normal way. AFL monitors code
+paths taken by Kea and, based on this, will vary the data sent in subsequent
+packets.
+
+@subsection fuzzTypeConfig Fuzzing with Configuration Files
+
+Kea has a configuration file check mode whereby it will read a configuration
+file, report whether the file is valid, then immediately exit. Operation of
+the configuration parsing code can be tested with AFL by fuzzing the
+configuration file: AFL generates example configuration files based on a
+dictionary of valid keywords and runs Kea in configuration file check mode on
+them. As with network packet fuzzing, the behaviour of Kea is monitored and
+the content of subsequent files adjusted accordingly.
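As a rough illustration of how the keyword dictionary used for configuration file fuzzing might be prepared, the sketch below writes entries in AFL's `NAME="keyword"` dictionary format, with names formed by uppercasing the keyword and replacing hyphens with underscores. The function names `dict_entry` and `build_dictionary` and the sample keyword list are hypothetical and are not part of Kea or AFL; the sketch also assumes that keywords contain no double quotes or backslashes, which would need escaping.

```python
def dict_entry(keyword):
    """Return one dictionary line of the form NAME="keyword"."""
    # Assumption: keywords are plain tokens; anything needing escapes is rejected.
    if '"' in keyword or "\\" in keyword:
        raise ValueError("keyword needs escaping: " + keyword)
    name = keyword.upper().replace("-", "_")
    return f'{name}="{keyword}"'


def build_dictionary(keywords):
    """Return dictionary text, one line per distinct keyword, in sorted order."""
    entries = {}
    for keyword in sorted(set(keywords)):
        line = dict_entry(keyword)
        name = line.split("=", 1)[0]
        # Two keywords such as "a-b" and "a_b" would map to the same name.
        if name in entries:
            raise ValueError("duplicate entry name: " + name)
        entries[name] = line
    return "".join(line + "\n" for line in entries.values())


if __name__ == "__main__":
    # Hypothetical sample; a real list would be taken from the Kea parsers.
    print(build_dictionary(["pd-pools", "peeraddr", "persist", "pkt", "pkt4"]), end="")
```

Redirecting the output to a file gives something that could be passed to afl-fuzz with its "-x" switch. Rejecting duplicate names keeps every entry name distinct even when two keywords differ only in hyphens versus underscores.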
+
+@section fuzzBuild Building Kea for Fuzzing
+
+Whatever tests are done, Kea needs to be built with fuzzing in mind. The steps
+for this are:
+
+-# Install AFL on the system on which you plan to build Kea and do the fuzzing.
+   AFL may be downloaded from http://lcamtuf.coredump.cx/afl. At the time of
+   writing (August 2019), the latest version is 2.52b. AFL should be built as
+   per the instructions in the README file in the distribution. The LLVM-based
+   instrumentation programs should also be built, as per the instructions in
+   the file llvm_mode/README.llvm (also in the distribution). Note that this
+   requires that LLVM be installed on the machine used for the fuzzing.
+
+-# Build Kea. Kea should be compiled and built as usual, although the
+   following additional steps should be observed:
+   - Set the environment variable CXX to point to the afl-clang-fast++
+     compiler.
+   - Specify a value of "--prefix" on the command line to set the directory
+     into which Kea is installed.
+   - Add the "--enable-fuzz" switch to the "configure" command line.
+   .
+   For example:
+   @code
+   CXX=/opt/afl/afl-clang-fast++ ./configure --enable-fuzz --prefix=$HOME/installed
+   make
+   @endcode
+
+-# Install Kea to the directory specified by "--prefix":
+   @code
+   make install
+   @endcode
+   This step is not strictly necessary, but makes running AFL easier.
+   "libtool", used by the Kea build procedure to build executable images, puts
+   the executable in a hidden ".libs" subdirectory of the target directory and
+   creates a shell script in the target directory for running it. The wrapper
+   script handles the fact that the Kea libraries on which the executable depends
+   are not installed by fixing up the LD_LIBRARY_PATH environment variable to
+   point to them. It is possible to set the variable appropriately and use AFL
+   to run the image from the ".libs" directory; in practice, it is a lot
+   simpler to install the programs in the directories set by "--prefix" and run
+   them from there.
+
+@section fuzzRun Running the Fuzzer
+
+@subsection fuzzRunNetwork Fuzzing with Network Packets
+
+-# In this type of fuzzing, Kea is processing packets from the fuzzer over a
+   network interface. This interface could be a physical interface or it could
+   be the loopback interface. Either way, it needs to be configured with a
+   suitable IPv4 or IPv6 address depending on whether kea-dhcp4 or kea-dhcp6 is
+   being fuzzed.
+
+-# Once the interface and address have been decided, they need to be set in the
+   configuration file used for the test. For example, to fuzz kea-dhcp4
+   using the loopback interface "lo" and IPv4 address 10.53.0.1, the
+   configuration file would contain the following snippet:
+   @code
+   "Dhcp4": {
+       :
+       "interfaces-config": {
+           "interfaces": ["lo/10.53.0.1"]
+       },
+       "subnet4": [
+           {
+               :
+               "interface": "lo",
+               :
+           }
+       ],
+       :
+   }
+   @endcode
+
+-# The specification of the interface and address in the configuration file
+   is used by the main Kea code. Owing to the way that the fuzzing interface
+   between Kea and AFL is implemented, the address and interface also need to
+   be specified by the environment variables KEA_AFL_INTERFACE and
+   KEA_AFL_ADDRESS. With a configuration file containing the statements listed
+   above, the relevant commands are:
+   @code
+   export KEA_AFL_INTERFACE="lo"
+   export KEA_AFL_ADDRESS="10.53.0.1"
+   @endcode
+   (If kea-dhcp6 is being fuzzed, then KEA_AFL_ADDRESS should specify an IPv6
+   address.)
+
+-# The fuzzer can now be run. A suitable command line is:
+   @code
+   afl-fuzz -m 4096 -i seeds -o fuzz-out -- ./kea-dhcp6 -c kea.conf -p 9001 -P 9002
+   @endcode
+   In the above:
+   - It is assumed that the directory holding the "afl-fuzz" program is in
+     the path; otherwise, include the path name when invoking it.
+   - "-m 4096" allows Kea to take up to 4096 MB of memory. (Use "ulimit" to
+     check and optionally modify the amount of virtual memory that can be used.)
+   - The "-i" switch specifies a directory (in this example, one named "seeds")
+     holding "seed" files. These are binary files that AFL will use as its
+     source for generating new packets. They can be generated from a real packet
+     stream with Wireshark: right click on a packet, then export as binary
+     data. Ensure that only the payload of the UDP packet is exported.
+   - The "-o" switch specifies a directory (in this example called "fuzz-out")
+     that AFL will use to hold packets it has generated and packets that it has
+     found cause crashes or hangs.
+   - "--" separates the AFL command line from that of Kea.
+   - "./kea-dhcp6" is the program being fuzzed. As mentioned above, this
+     should be an executable image, and it will be simpler to fuzz one
+     that has been installed.
+   - The "-c" switch sets the configuration file Kea should use while being
+     fuzzed.
+   - "-p 9001 -P 9002" sets the port on which Kea should listen and the port to
+     which it should send replies. If omitted, Kea will try to use the default
+     DHCP ports, which are in the privileged range. Unless run with "sudo",
+     Kea will fail to open the port and will exit early on: no useful
+     information will be obtained from the fuzzer.
+
+-# Check that the fuzzer is working. If run from a terminal (with a black
+   background - AFL is particular about this), AFL will bring up a curses-style
+   interface showing the progress of the fuzzing. A good indication that
+   everything is working is the "total paths" figure. Initially,
+   this should increase reasonably rapidly. If not, it is likely that Kea is
+   failing to start or initialize properly and the logging output (assuming
+   this has been configured) should be examined.
+
+@subsection fuzzRunConfig Fuzzing with Configuration Files
+
+AFL can be used to check the parsing of the configuration files. In this type
+of fuzzing, AFL generates configuration files which it passes to Kea to check.
+Steps for this fuzzing are:
+
+-# Build Kea as described above.
+
+-# Create a dictionary of keywords. Although AFL will mutate the files by
+   byte swaps, bit flips and the like, better results are obtained if it can
+   create new files based on keywords that could appear in the file. The
+   dictionary is described in the AFL documentation, but in brief, the file
+   contains successive lines of the form 'variable="keyword"', e.g.
+   @code
+   PD_POOLS="pd-pools"
+   PEERADDR="peeraddr"
+   PERSIST="persist"
+   PKT="pkt"
+   PKT4="pkt4"
+   @endcode
+   "variable" can be anything, as its name is ignored by AFL. However, all the
+   variable names in the file must be different. "keyword" is a valid keyword
+   that could appear in the configuration file. The convention adopted in the
+   example above seems to work well - variables have the same name as keywords,
+   but are in uppercase and have hyphens replaced by underscores.
+
+-# Run the fuzzer with a command line of the form:
+   @code
+   afl-fuzz -m 4096 -i seeds -o fuzz-out -x dict.dat -- ./kea-dhcp4 -t @@
+   @endcode
+   In the above command line:
+   - Everything up to and including the "--" is the AFL command. The switches
+     are as described in the previous section apart from the "-x" switch: this
+     specifies the dictionary file ("dict.dat" in this example) described
+     above.
+   - The Kea command line uses the "-t" switch to specify the configuration
+     file to check. This is specified by two consecutive "@" signs: AFL
+     will replace these with the name of a file it has created when starting
+     Kea.
+
+@section fuzzInternals Fuzzing Internals
+
+@subsection fuzzInternalNetwork Fuzzing with Network Packets
+
+The AFL fuzzer delivers packets to Kea's stdin. Although the part of Kea
+concerning the reception of packets could have been modified to accept input
+from stdin and have Kea pick them up in the normal way, a less-intrusive method
+was adopted.
+
+The packet loop in the main server code for kea-dhcp4 and kea-dhcp6 is
+essentially:
+@code{.unparsed}
+while (not shutting down) {
+    Read and process one packet
+}
+@endcode
+When --enable-fuzz is specified, this is conceptually modified to:
+@code{.unparsed}
+while (not shutting down) {
+    Read stdin and copy data to address/port on which Kea is listening
+    Read and process one packet
+}
+@endcode
+
+Implementation is via an object of class "Fuzz". When created, it identifies
+an interface, address and port on which Kea is listening and creates the
+appropriate address structures for these. The port is passed as an argument to
+the constructor because at the point at which the object is constructed, that
+information is readily available. The interface and address are picked up from
+the environment variables mentioned above. Consideration was given to
+extracting the interface and address information from the configuration file,
+but it was decided not to do this:
+
+-# The configuration file can contain the definition of multiple interfaces;
+   if this is the case, the one being used for fuzzing is unclear.
+-# The code is much simpler if the data is extracted from environment
+   variables.
+
+Every time through the loop, the object reads the data from stdin and writes it
+to the identified address/port. Control then returns to the main Kea code,
+which finds data available on the address/port on which it is listening and
+handles the data in the normal way.
+
+In practice, the "while" line is actually:
+@code{.unparsed}
+while (__AFL_LOOP(count)) {
+@endcode
+__AFL_LOOP is a token recognized and expanded by the AFL compiler (so no need
+to "#include" a file defining it) that implements the logic for the fuzzing.
+Each time through the loop (apart from the first), it raises a SIGSTOP signal
+telling AFL that the packet has been processed and instructing it to provide
+more data.
+The "count" value is the number of times through the loop before
+the loop terminates and the process is allowed to exit normally. When this
+happens, AFL will start the process anew. The purpose of periodically shutting
+down the process is to avoid issues raised by the fuzzing being confused with
+any issues associated with the process running for a long time (e.g. memory
+leaks).
+
+@subsection fuzzInternalConfig Fuzzing with Configuration Files
+
+No changes were required to Kea source code to fuzz configuration files. In
+fact, other than compiling with afl-clang++ and installing the resultant
+executable, no other steps are required. In particular, there is no need to
+use the "--enable-fuzz" switch in the "configure" command line (although
+doing so will not cause any problems).
+
+@subsection fuzzThreads Changes Required for Multi-Threaded Kea
+
+The early versions of the fuzzing code used a separate thread to receive the
+packets from AFL and to write them to the socket on which Kea is listening.
+The lack of synchronization proved a problem, with Kea hanging in some
+instances. Although some experiments with thread synchronization were
+successful, in the end the far simpler single-threaded implementation described
+above was adopted for the single-threaded Kea 1.6. Should Kea be modified to
+become multi-threaded, the fuzzing code will need to be changed back to reading
+the AFL input in the background.
+
+@section fuzzNotes Notes
+
+@subsection fuzzNotesUnitTests Unit Test Failures
+
+If unit tests are built when --enable-fuzz is specified, note that tests
+which check or use the DHCP servers (i.e. the unit tests in src/bin/dhcp4,
+src/bin/dhcp6 and src/bin/kea-admin) will fail. With no AFL-related
+environment variables defined, a C++ exception will be thrown with the
+description "no fuzzing interface has been set". However, if the
+KEA_AFL_INTERFACE and KEA_AFL_ADDRESS variables are set to valid values, the
+tests will hang.
+
+Both these results are expected and should cause no concern. The exception is
+thrown by the fuzzing object constructor when it attempts to create the address
+structures for routing packets between AFL and Kea but discovers it does not
+have the necessary information. The hang is due to the fact that the AFL
+processing loop does a synchronous read from stdin, something not expected by
+the test. (Should random input be supplied on stdin, e.g. from the keyboard,
+the test will most likely fail as the input is unlikely to be that expected by
+the test.)
+
+
+*/
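The forwarding step described under "Fuzzing Internals" (read data from stdin, then write it to the address and port on which the server is listening) can be sketched roughly as follows. This is an illustration in Python rather than the C++ "Fuzz" class itself: the names `forward_once`, `open_sender` and `MAX_PACKET` are hypothetical, the size limit is an assumed value, and the real implementation may differ in how it reads, bounds and sends the data.

```python
import socket

# Assumed upper bound on the data forwarded per iteration (largest IPv4 UDP payload).
MAX_PACKET = 65507


def open_sender(address):
    """Create a UDP socket of the family matching a literal IPv4/IPv6 address."""
    family = socket.AF_INET6 if ":" in address else socket.AF_INET
    return socket.socket(family, socket.SOCK_DGRAM)


def forward_once(stream, sock, dest, max_size=MAX_PACKET):
    """Read one chunk from a binary stream and send it to dest as one datagram.

    Returns the number of bytes sent, or 0 if the stream had no data.
    """
    data = stream.read(max_size)
    if not data:
        return 0
    return sock.sendto(data, dest)
```

With the example values used earlier, one iteration of the loop would correspond to something like `forward_once(sys.stdin.buffer, open_sender("10.53.0.1"), ("10.53.0.1", 9001))`, after which the server's normal receive path picks up the datagram. Passing the stream and socket as arguments keeps the function easy to exercise without a live network.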