Adding upstream version 110.0.1.upstream/110.0.1 upstream

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-07 09:22:09 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-07 09:22:09 +0000
commit: 43a97878ce14b72f0981164f87f2e35e14151312 (patch)
tree: 620249daf56c0258faa40cbdcf9cfba06de2a846 /tools/fuzzing/docs
parent: Initial commit. (diff)
download: firefox-upstream.tar.xz
firefox-upstream.zip
2 files changed, 934 insertions, 0 deletions
diff --git a/tools/fuzzing/docs/fuzzing_interface.rst b/tools/fuzzing/docs/fuzzing_interface.rst
new file mode 100644
index 0000000000..561727d540
--- /dev/null
+++ b/tools/fuzzing/docs/fuzzing_interface.rst
@@ -0,0 +1,496 @@
+Fuzzing Interface
+=================
+
+The fuzzing interface is glue code living in mozilla-central in order to
+make it easier for developers and security researchers to test C/C++
+code with either `libFuzzer <https://llvm.org/docs/LibFuzzer.html>`__ or
+`afl-fuzz <http://lcamtuf.coredump.cx/afl/>`__.
+
+These fuzzing tools, are based on *compile-time instrumentation* to measure
+things like branch coverage and more advanced heuristics per fuzzing test.
+Doing so allows these tools to progress through code with little to no custom
+logic/knowledge implemented in the fuzzer itself. Usually, the only thing
+these tools need is a code "shim" that provides the entry point for the fuzzer
+to the code to be tested. We call this additional code a *fuzzing target* and
+the rest of this manual describes how to implement and work with these targets.
+
+As for the tools used with these targets, we currently recommend the use of
+libFuzzer over afl-fuzz, as the latter is no longer maintained while libFuzzer
+is being actively developed. Furthermore, libFuzzer has some advanced
+instrumentation features (e.g. value profiling to deal with complicated
+comparisons in code), making it overall more effective.
+
+What can be tested?
+~~~~~~~~~~~~~~~~~~~
+
+The interface can be used to test all C/C++ code that either ends up in
+``libxul`` (more precisely, the gtest version of ``libxul``) **or** is
+part of the JS engine.
+
+Note that this is not the right testing approach for testing the full
+browser as a whole. It is rather meant for component-based testing
+(especially as some components cannot be easily separated out of the
+full build).
+
+.. note::
+
+   **Note:** If you are working on the JS engine (trying to reproduce a
+   bug or seeking to develop a new fuzzing target), then please also read
+   the :ref:`JS Engine Specifics Section <JS Engine Specifics>` at the end
+   of this documentation, as the JS engine offers additional options for
+   implementing and running fuzzing targets.
+
+
+Reproducing bugs for existing fuzzing targets
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If you are working on a bug that involves an existing fuzzing interface target,
+you have two options for reproducing the issue:
+
+
+Using existing builds
+^^^^^^^^^^^^^^^^^^^^^
+
+We have several fuzzing builds in CI that you can simply download. We recommend
+using ``fuzzfetch`` for this purpose, as it makes downloading and unpacking
+these builds much easier.
+
+You can install ``fuzzfetch`` from
+`Github <https://github.com/MozillaSecurity/fuzzfetch>`__ or
+`via pip <https://pypi.org/project/fuzzfetch/>`__.
+
+Afterwards, you can run
+
+::
+
+   $ python -m fuzzfetch -a --fuzzing --gtest -n firefox-fuzzing
+
+to fetch the latest optimized build. Alternatively, we offer non-ASan debug builds
+which you can download using
+
+::
+
+   $ python -m fuzzfetch -d --fuzzing --gtest -n firefox-fuzzing
+
+In both commands, ``firefox-fuzzing`` indicates the name of the directory that
+will be created for the download.
+
+Afterwards, you can reproduce the bug using
+
+::
+
+   $ FUZZER=TargetName firefox-fuzzing/firefox test.bin
+
+assuming that ``TargetName`` is the name of the fuzzing target specified in the
+bug you are working on and ``test.bin`` is the attached testcase.
+
+.. note::
+
+   **Note:** You should not export the ``FUZZER`` variable permanently
+   in your shell, especially if you plan to do local builds. If the ``FUZZER``
+   variable is exported, it will affect the build process.
+
+If the CI builds don't meet your requirements and you need a local build instead,
+you can follow the steps below to create one:
+
+.. _Local build requirements and flags:
+
+Local build requirements and flags
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You will need a Linux environment with a recent Clang. Using the Clang downloaded
+by ``./mach bootstrap`` or a newer version is recommended.
+
+The only build flag required to enable the fuzzing targets is ``--enable-fuzzing``,
+so adding
+
+::
+
+  ac_add_options --enable-fuzzing
+
+to your ``.mozconfig`` is already sufficient for producing a fuzzing build.
+However, for improved crash handling capabilities and to detect additional errors,
+it is strongly recommended to combine libFuzzer with :ref:`AddressSanitizer <Address Sanitizer>`
+at least for optimized builds and bugs requiring ASan to reproduce at all
+(e.g. you are working on a bug where ASan reports a memory safety violation
+of some sort).
+
+Once your build is complete, if you want to run gtests, you **must** additionally run
+
+::
+
+  $ ./mach gtest dontruntests
+
+to force the gtest libxul to be built.
+
+.. note::
+
+   **Note:** If you modify any code, please ensure that you run **both** build
+   commands to ensure that the gtest libxul is also rebuilt. It is a common mistake
+   to only run ``./mach build`` and miss the second command.
+
+Once these steps are complete, you can reproduce the bug locally using the same
+steps as described above for the downloaded builds.
+
+
+Developing new fuzzing targets
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Developing a new fuzzing target using the fuzzing interface only requires a few steps.
+
+
+Determine if the fuzzing interface is the right tool
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The fuzzing interface is not suitable for every kind of testing. In particular
+if your testing requires the full browser to be running, then you might want to
+look into other testing methods.
+
+The interface uses the ``ScopedXPCOM`` implementation to provide an environment
+in which XPCOM is available and initialized. You can initialize further subsystems
+that you might require, but you are responsible yourself for any kind of
+initialization steps.
+
+There is (in theory) no limit as to how far you can take browser initialization.
+However, the more subsystems are involved, the more problems might occur due to
+non-determinism and loss of performance.
+
+If you are unsure if the fuzzing interface is the right approach for you or you
+require help in evaluating what could be done for your particular task, please
+don't hestitate to :ref:`contact us <Fuzzing#contact-us>`.
+
+
+Develop the fuzzing code
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Where to put your fuzzing code
+''''''''''''''''''''''''''''''
+
+The code using the fuzzing interface usually lives in a separate directory
+called ``fuzztest`` that is on the same level as gtests. If your component
+has no gtests, then a subdirectory either in tests or in your main directory
+will work. If such a directory does not exist yet in your component, then you
+need to create one with a suitable ``moz.build``. See  `the transport target
+for an example <https://searchfox.org/mozilla-central/source/dom/media/webrtc/transport/fuzztest/moz.build>`__
+
+In order to include the new subdirectory into the build process, you will
+also have to modify the toplevel ``moz.build`` file accordingly. For this
+purpose, you should add your directory to ``TEST_DIRS`` only if ``FUZZING_INTERFACES``
+is set. See again `the transport target for an example
+<https://searchfox.org/mozilla-central/rev/de7676288a78b70d2b9927c79493adbf294faad5/media/mtransport/moz.build#18-24>`__.
+
+How your code should look like
+''''''''''''''''''''''''''''''
+
+In order to define your fuzzing target ``MyTarget``, you only need to implement 2 functions:
+
+1. A one-time initialization function.
+
+   At startup, the fuzzing interface calls this function **once**, so this can
+   be used to perform one-time operations like initializing subsystems or parsing
+   extra fuzzing options.
+
+   This function is the equivalent of the `LLVMFuzzerInitialize <https://llvm.org/docs/LibFuzzer.html#startup-initialization>`__
+   function and has the same signature. However, with our fuzzing interface,
+   it won't be resolved by its name, so it can be defined ``static`` and called
+   whatever you prefer. Note that the function should always ``return 0`` and
+   can (except for the return), remain empty.
+
+   For the sake of this documentation, we assume that you have ``static int FuzzingInitMyTarget(int* argc, char*** argv);``
+
+2. The fuzzing iteration function.
+
+   This is where the actual fuzzing happens, and this function is the equivalent
+   of `LLVMFuzzerTestOneInput <https://llvm.org/docs/LibFuzzer.html#fuzz-target>`__.
+   Again, the difference to the fuzzing interface is that the function won't be
+   resolved by its name. In addition, we offer two different possible signatures
+   for this function, either
+
+   ``static int FuzzingRunMyTarget(const uint8_t* data, size_t size);``
+
+   or
+
+   ``static int FuzzingRunMyTarget(nsCOMPtr<nsIInputStream> inputStream);``
+
+   The latter is just a wrapper around the first one for implementations that
+   usually work with streams. No matter which of the two signatures you choose
+   to work with, the only thing you need to implement inside the function
+   is the use of the provided data with your target implementation. This can
+   mean to simply feed the data to your target, using the data to drive operations
+   on the target API, or a mix of both.
+
+   While doing so, you should avoid altering global state in a permanent way,
+   using additional sources of data/randomness or having code run beyond the
+   lifetime of the iteration function (e.g. on another thread), for one simple
+   reason: Coverage-guided fuzzing tools depend on the **deterministic** nature
+   of the iteration function. If the same input to this function does not lead
+   to the same execution when run twice (e.g. because the resulting state depends
+   on multiple successive calls or because of additional external influences),
+   then the tool will not be able to reproduce its fuzzing progress and perform
+   badly. Dealing with this restriction can be challenging e.g. when dealing
+   with asynchronous targets that run multi-threaded, but can usually be managed
+   by synchronizing execution on all threads at the end of the iteration function.
+   For implementations accumulating global state, it might be necessary to
+   (re)initialize this global state in each iteration, rather than doing it once
+   in the initialization function, even if this costs additional performance.
+
+   Note that unlike the vanilla libFuzzer approach, you are allowed to ``return 1``
+   in this function to indicate that an input is "bad". Doing so will cause
+   libFuzzer to discard the input, no matter if it generated new coverage or not.
+   This is particularly useful if you have means to internally detect and catch
+   bad testcase behavior such as timeouts/excessive resource usage etc. to avoid
+   these tests to end up in your corpus.
+
+
+Once you have implemented the two functions, the only thing remaining is to
+register them with the fuzzing interface. For this purpose, we offer two
+macros, depending on which iteration function signature you used. If you
+sticked to the classic signature using buffer and size, you can simply use
+
+::
+
+  #include "FuzzingInterface.h"
+
+  // Your includes and code
+
+  MOZ_FUZZING_INTERFACE_RAW(FuzzingInitMyTarget, FuzzingRunMyTarget, MyTarget);
+
+where ``MyTarget`` is the name of the target and will be used later to decide
+at runtime which target should be used.
+
+If instead you went for the streaming interface, you need a different include,
+but the macro invocation is quite similar:
+
+::
+
+  #include "FuzzingInterfaceStream.h"
+
+  // Your includes and code
+
+  MOZ_FUZZING_INTERFACE_STREAM(FuzzingInitMyTarget, FuzzingRunMyTarget, MyTarget);
+
+For a live example, see also the `implementation of the STUN fuzzing target
+<https://searchfox.org/mozilla-central/source/dom/media/webrtc/transport/fuzztest/stun_parser_libfuzz.cpp>`__.
+
+Add instrumentation to the code being tested
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+libFuzzer requires that the code you are trying to test is instrumented
+with special compiler flags. Fortunately, adding these on a per-directory basis
+can be done just by including the following directive in each ``moz.build``
+file that builds code under test:
+
+::
+
+  # Add libFuzzer configuration directives
+  include('/tools/fuzzing/libfuzzer-config.mozbuild')
+
+
+The include already does the appropriate configuration checks to be only
+active in fuzzing builds, so you don't have to guard this in any way.
+
+.. note::
+
+   **Note:** This include modifies `CFLAGS` and `CXXFLAGS` accordingly
+   but this only works for source files defined in this particular
+   directory. The flags are **not** propagated to subdirectories automatically
+   and you have to ensure that each directory that builds source files
+   for your target has the include added to its ``moz.build`` file.
+
+By keeping the instrumentation limited to the parts that are actually being
+tested using this tool, you not only increase the performance but also potentially
+reduce the amount of noise that libFuzzer sees.
+
+
+Build your code
+^^^^^^^^^^^^^^^
+
+See the :ref:`Build instructions above <Local build requirements and flags>` for instructions
+how to modify your ``.mozconfig`` to create the appropriate build.
+
+
+Running your code and building a corpus
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+You need to set the following environment variable to enable running the
+fuzzing code inside Firefox instead of the regular browser.
+
+-  ``FUZZER=name``
+
+Where ``name`` is the name of your fuzzing module that you specified
+when calling the ``MOZ_FUZZING_INTERFACE_RAW`` macro. For the example
+above, this would be ``MyTarget`` or ``StunParser`` for the live example.
+
+Now when you invoke the firefox binary in your build directory with the
+``-help=1`` parameter, you should see the regular libFuzzer help. On
+Linux for example:
+
+::
+
+   $ FUZZER=StunParser obj-asan/dist/bin/firefox -help=1
+
+You should see an output similar to this:
+
+::
+
+   Running Fuzzer tests...
+   Usage:
+
+   To run fuzzing pass 0 or more directories.
+   obj-asan/dist/bin/firefox [-flag1=val1 [-flag2=val2 ...] ] [dir1 [dir2 ...] ]
+
+   To run individual tests without fuzzing pass 1 or more files:
+   obj-asan/dist/bin/firefox [-flag1=val1 [-flag2=val2 ...] ] file1 [file2 ...]
+
+   Flags: (strictly in form -flag=value)
+    verbosity                      1       Verbosity level.
+    seed                           0       Random seed. If 0, seed is generated.
+    runs                           -1      Number of individual test runs (-1 for infinite runs).
+    max_len                        0       Maximum length of the test input. If 0, libFuzzer tries to guess a good value based on the corpus and reports it.
+   ...
+
+
+Reproducing a Crash
+'''''''''''''''''''
+
+In order to reproduce a crash from a given test file, simply put the
+file as the only argument on the command line, e.g.
+
+::
+
+   $ FUZZER=StunParser obj-asan/dist/bin/firefox test.bin
+
+This should reproduce the given problem.
+
+
+FuzzManager and libFuzzer
+'''''''''''''''''''''''''
+
+Our FuzzManager project comes with a harness for running libFuzzer with
+an optional connection to a FuzzManager server instance. Note that this
+connection is not mandatory, even without a server you can make use of
+the local harness.
+
+You can find the harness
+`here <https://github.com/MozillaSecurity/FuzzManager/tree/master/misc/afl-libfuzzer>`__.
+
+An example invocation for the harness to use with StunParser could look
+like this:
+
+::
+
+   FUZZER=StunParser python /path/to/afl-libfuzzer-daemon.py --fuzzmanager \
+       --stats libfuzzer-stunparser.stats --libfuzzer-auto-reduce-min 500 --libfuzzer-auto-reduce 30 \
+       --tool libfuzzer-stunparser --libfuzzer --libfuzzer-instances 6 obj-asan/dist/bin/firefox \
+       -max_len=256 -use_value_profile=1 -rss_limit_mb=3000 corpus-stunparser
+
+What this does is
+
+-  run libFuzzer on the ``StunParser`` target with 6 parallel instances
+   using the corpus in the ``corpus-stunparser`` directory (with the
+   specified libFuzzer options such as ``-max_len`` and
+   ``-use_value_profile``)
+-  automatically reduce the corpus and restart if it grew by 30% (and
+   has at least 500 files)
+-  use FuzzManager (need a local ``.fuzzmanagerconf`` and a
+   ``firefox.fuzzmanagerconf`` binary configuration as described in the
+   FuzzManager manual) and submit crashes as ``libfuzzer-stunparser``
+   tool
+-  write statistics to the ``libfuzzer-stunparser.stats`` file
+
+.. _JS Engine Specifics:
+
+JS Engine Specifics
+~~~~~~~~~~~~~~~~~~~
+
+The fuzzing interface can also be used for testing the JS engine, in fact there
+are two separate options to implement and run fuzzing targets:
+
+Implementing in C++
+^^^^^^^^^^^^^^^^^^^
+
+Similar to the fuzzing interface in Firefox, you can implement your target in
+entirely C++ with very similar interfaces compared to what was described before.
+
+There are a few minor differences though:
+
+1. All of the fuzzing targets live in `js/src/fuzz-tests`.
+
+2. All of the code is linked into a separate binary called `fuzz-tests`,
+   similar to how all JSAPI tests end up in `jsapi-tests`. In order for this
+   binary to be built, you must build a JS shell with ``--enable-fuzzing``
+   **and** ``--enable-tests``. Again, this can and should be combined with
+   AddressSanitizer for maximum effectiveness. This also means that there is no
+   need to (re)build gtests when dealing with a JS fuzzing target and using
+   a shell as part of a full browser build.
+
+3. The harness around the JS implementation already provides you with an
+   initialized ``JSContext`` and global object. You can access these in
+   your target by declaring
+
+   ``extern JS::PersistentRootedObject gGlobal;``
+
+   and
+
+   ``extern JSContext* gCx;``
+
+   but there is no obligation for you to use these.
+
+For a live example, see also the `implementation of the StructuredCloneReader target
+<https://searchfox.org/mozilla-central/source/js/src/fuzz-tests/testStructuredCloneReader.cpp>`__.
+
+
+Implementing in JS
+^^^^^^^^^^^^^^^^^^
+
+In addition to the C++ targets, you can also implement targets in JavaScript
+using the JavaScript Runtime (JSRT) fuzzing approach. Using this approach is
+not only much simpler (since you don't need to know anything about the
+JSAPI or engine internals), but it also gives you full access to everything
+defined in the JS shell, including handy functions such as ``timeout()``.
+
+Of course, this approach also comes with disadvantages: Calling into JS and
+performing the fuzzing operations there costs performance. Also, there is more
+chance for causing global side-effects or non-determinism compared to a
+fairly isolated C++ target.
+
+As a rule of thumb, you should implement the target in JS if
+
+* you don't know C++ and/or how to use the JSAPI (after all, a JS fuzzing target is better than none),
+* your target is expected to have lots of hangs/timeouts (you can catch these internally),
+* or your target is not isolated enough for a C++ target and/or you need specific JS shell functions.
+
+
+There is an `example target <https://searchfox.org/mozilla-central/source/js/src/shell/jsrtfuzzing/jsrtfuzzing-example.js>`__
+in-tree that shows roughly how to implement such a fuzzing target.
+
+To run such a target, you must run the ``js`` (shell) binary instead of the
+``fuzz-tests`` binary and point the ``FUZZER`` variable to the file containing
+your fuzzing target, e.g.
+
+::
+
+   $ FUZZER=/path/to/jsrtfuzzing-example.js obj-asan/dist/bin/js --fuzzing-safe --no-threads -- <libFuzzer options here>
+
+More elaborate targets can be found in `js/src/fuzz-tests/ <https://searchfox.org/mozilla-central/source/js/src/fuzz-tests/>`__.
+
+Troubleshooting
+~~~~~~~~~~~~~~~
+
+
+Fuzzing Interface: Error: No testing callback found
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This error means that the fuzzing callback with the name you specified
+using the ``FUZZER`` environment variable could not be found. Reasons
+for are typically either a misspelled name or that your code wasn't
+built (check your ``moz.build`` file and build log).
+
+
+``mach build`` doesn't seem to update my fuzzing code
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Keep in mind you always need to run both the ``mach build`` and
+``mach gtest dontruntests`` commands in order to update your fuzzing
+code. The latter rebuilds the gtest version of ``libxul``, containing
+your code.
diff --git a/tools/fuzzing/docs/index.rst b/tools/fuzzing/docs/index.rst
new file mode 100644
index 0000000000..9a4e2d01c4
--- /dev/null
+++ b/tools/fuzzing/docs/index.rst
@@ -0,0 +1,438 @@
+Fuzzing
+=======
+
+.. toctree::
+  :maxdepth: 1
+  :hidden:
+  :glob:
+  :reversed:
+
+  *
+
+This section focuses on explaining the software testing technique called
+“Fuzzing” or “Fuzz Testing” and its application to the Mozilla codebase.
+The overall goal is to educate developers about the capabilities and
+usefulness of fuzzing and also allow them to write their own fuzzing
+targets. Note that not all fuzzing tools used at Mozilla are open
+source. Some tools are for internal use only because they can easily
+find critical security vulnerabilities.
+
+What is Fuzzing?
+----------------
+
+Fuzzing (or Fuzz Testing) is a technique to randomly use a program or
+parts of it with the goal to uncover bugs. Random usage can have a wide
+variety of forms, a few common ones are
+
+-  random input data (e.g. file formats, network data, source code, etc.)
+
+-  random API usage
+
+-  random UI interaction
+
+with the first two being the most practical methods used in the field.
+Of course, these methods are not entirely separate, combinations are
+possible. Fuzzing is a great way to find quality issues, some of them
+being also security issues.
+
+Random input data
+~~~~~~~~~~~~~~~~~
+
+This is probably the most obvious fuzzing method: You have code that
+processes data and you provide it with random or mutated data, hoping
+that it will uncover bugs in your implementation. Examples are media
+formats like JPEG or H.264, but basically anything that involves
+processing a “blob” of data can be a valuable target. Countless security
+vulnerabilities in a variety of libraries and programs have been found
+using this method (the AFLFuzz
+`bug-o-rama <http://lcamtuf.coredump.cx/afl/#bugs>`__ gives a good
+impression).
+
+Common tools for this task are e.g.
+`libFuzzer <https://llvm.org/docs/LibFuzzer.html>`__ and
+`AFLFuzz <http://lcamtuf.coredump.cx/afl/>`__, but also specialized
+tools with custom logic like
+`LangFuzz <https://www.usenix.org/system/files/conference/usenixsecurity12/sec12-final73.pdf>`__
+and `Avalanche <https://github.com/MozillaSecurity/avalanche>`__.
+
+Random API Usage
+~~~~~~~~~~~~~~~~
+
+Randomly testing APIs is especially helpful with parts of software that
+expose a well-defined interface (see also :ref:`Well-defined
+behavior and Safety <Well defined behaviour and safety>`). If this interface is additionally exposed to
+untrusted parties/content, then this is a strong sign that random API
+testing would be worthwhile here, also for security reasons. APIs can be
+anything from C++ layer code to APIs offered in the browser.
+
+A good example for a fuzzing target here is the DOM (Document Object
+Model) and various other browser APIs. The browser exposes a variety of
+different APIs for working with documents, media, communication,
+storage, etc. with a growing complexity. Each of these APIs has
+potential bugs that can be uncovered with fuzzing. At Mozilla, we
+currently use domino (internal tool) for this purpose.
+
+Random UI Interaction
+~~~~~~~~~~~~~~~~~~~~~
+
+A third way to test programs and in particular user interfaces is by
+directly interacting with the UI in a random way, typically in
+combination with other actions the program has to perform. Imagine for
+example an automated browser that surfs through the web and randomly
+performs actions such as scrolling, zooming and clicking links. The nice
+thing about this approach is that you likely find many issues that the
+end-user also experiences. However, this approach typically suffers from
+bad reproducibility (see also :ref:`Reproducibility <Reproducibility>`) and is therefore
+often of limited use.
+
+An example for a fuzzing tool using this technique is `Android
+Monkey <https://developer.android.com/studio/test/monkey>`__. At
+Mozilla however, we currently don’t make much use of this approach.
+
+Why Fuzzing Helps You
+---------------------
+
+Understanding the value of fuzzing for you as a developer and software
+quality in general is important to justify the support this testing
+method might need from you. When your component is fuzzed for the first
+time there are two common things you will be confronted with:
+
+**Bug reports that don’t seem real bugs or not important:** Fuzzers
+find all sorts of bugs in various corners of your component, even
+obscure ones. This automatically leads to a larger number of bugs that
+either don’t seem to be bugs (see also the :ref:`Well-defined behavior and
+safety <Well defined behaviour and safety>` section below) or that don’t seem to be important bugs.
+
+Fixing these bugs is still important for the fuzzers because ignoring them
+in fuzzing costs resources (performance, human resources) and might even
+prevent the fuzzer from hitting other bugs. For example certain fuzzing tools
+like libFuzzer run in-process and have to restart on every crash, involving a
+costly re-read of the fuzzing samples.
+
+Also, as some of our code evolves quickly, a corner case might become a
+hot code path in a few months.
+
+**New steps to reproduce:** Fuzzing tools are very likely to exercise
+your component using different methods than an average end-user. A
+common technique is modify existing parts of a program or write entirely
+new code to yield a fuzzing "target". This target is specifically
+designed to work with the fuzzing tools in use. Reproducing the reported
+bugs might require you to learn these new steps to reproduce, including
+building/acquiring that target and having the right environment.
+
+Both of these issues might seem like a waste of time in some cases,
+however, realizing that both steps are a one-time investment for a
+constant stream of valuable bug reports is paramount here. Helping your
+security engineers to overcome these issues will ensure that future
+regressions in your code can be detected at an earlier stage and in a
+form that is more easily actionable. Especially if you are dealing with
+regressions in your code already, fuzzing has the potential to make your
+job as a developer easier.
+
+One of the best examples at Mozilla is the JavaScript engine. The JS
+team has put great quite some effort into getting fuzzing started and
+supporting our work. Here’s what Jan de Mooij, a senior platform
+engineer for the JavaScript engine, has to say about it:
+
+*“Bugs in the engine can cause mysterious browser crashes and bugs that
+are incredibly hard to track down. Fortunately, we don't have to deal
+with these time consuming browser issues very often: usually the fuzzers
+find a reliable shell test long before the bug makes it into a release.
+Fuzzing is invaluable to us and I cannot imagine working on this project
+without it.”*
+
+Levels of Fuzzing in Firefox/Gecko
+----------------------------------
+
+Applying fuzzing to e.g. Firefox happens at different "levels", similar
+to the different types of automated tests we have:
+
+Full Browser Fuzzing
+~~~~~~~~~~~~~~~~~~~~
+
+The most obvious method of testing would be to test the full browser and
+doing so is required for certain features like the DOM and other APIs.
+The advantage here is that we have all the features of the browser
+available and testing happens closely to what we actually ship. The
+downside here though is that browser testing is by far the slowest of
+all testing methods. In addition, it has the most amount of
+non-determinism involved (resulting e.g. in intermittent testcases).
+Browser fuzzing at Mozilla is largely done with the `Grizzly
+framework <https://blog.mozilla.org/security/2019/07/10/grizzly/>`__
+(`meta bug <https://bugzilla.mozilla.org/show_bug.cgi?id=grizzly>`__)
+and one of the most successful fuzzers is the Domino tool (`meta
+bug <https://bugzilla.mozilla.org/show_bug.cgi?id=domino>`__).
+
+Summarizing, full browser fuzzing is the right technique to investigate
+if your feature really requires it. Consider using other methods (see
+below) if your code can be exercised in this way.
+
+The Fuzzing Interface
+~~~~~~~~~~~~~~~~~~~~~
+
+**Fuzzing Interface**
+
+The fuzzing interface is glue code living in mozilla-central in order to make it
+easier for developers and security researchers to test C/C++ code with either libFuzzer or afl-fuzz.
+
+This interface offers a gtest (C++ unit test) level component based
+fuzzing approach and is suitable for anything that could also be
+tested/exercised using a gtest. This method is by far the fastest, but
+usually limited to testing isolated components that can be instantiated
+on this level. Utilizing this method requires you to write a fuzzing
+target similar to writing a gtest. This target will automatically be
+usable with libFuzzer and AFLFuzz. We offer a :ref:`comprehensive manual <Fuzzing Interface>`
+that describes how to write and utilize your own target.
+
+A simple example here is the `SDP parser
+target <https://searchfox.org/mozilla-central/rev/efdf9bb55789ea782ae3a431bda6be74a87b041e/media/webrtc/signaling/fuzztest/sdp_parser_libfuzz.cpp#30>`__,
+which tests the SipccSdpParser in our codebase.
+
+Shell-based Fuzzing
+~~~~~~~~~~~~~~~~~~~
+
+Some of our fuzzing, e.g. JS Engine testing, happens in a separate shell
+program. For JS, this is the JS shell also used for most of the JS tests
+and development. In theory, xpcshell could also be used for testing but
+so far, there has not been a use case for this (most things that can be
+reached through xpcshell can also be tested on the gtest level).
+
+Identifying the right level of fuzzing is the first step towards
+continuous fuzz testing of your code.
+
+Code/Process Requirements for Fuzzing
+-------------------------------------
+
+In this section, we are going to discuss how code should be written in
+order to yield optimal results with fuzzing.
+
+Defect Oracles
+~~~~~~~~~~~~~~
+
+Fuzzing is only effective if you are able to know when a problem has
+been found. Crashes are typically problems if the unit being tested is
+safe for fuzzing (see Well-defined behavior and Safety). But there are
+many more problems that you would want to find, correctness issues,
+corruptions that don’t necessarily crash etc. For this, you need an
+*oracle* that tells you something is wrong.
+
+The simplest defect oracle is the assertion (ex: ``MOZ_ASSERT``).
+Assertions are a very powerful instrument because they can be used to
+determine if your program is performing correctly, even if the bug would
+not lead to any sort of crash. They can encode arbitrarily complex
+information about what is considered correct, information that might
+otherwise only exist in the developers’ minds.
+
+External tools like the sanitizers (AddressSanitizer aka ASan,
+ThreadSanitizer aka TSan, MemorySanitizer aka MSan and
+UndefinedBehaviorSanitizer - UBSan) can also serve as oracles for
+sometimes severe issues that would not necessarily crash. Making sure
+that these tools can be used on your code is highly useful.
+
+Examples for bugs found with sanitizers are `bug
+1419608 <https://bugzilla.mozilla.org/show_bug.cgi?id=1419608>`__,
+`bug 1580288 <https://bugzilla.mozilla.org/show_bug.cgi?id=1580288>`__
+and `bug 922603 <https://bugzilla.mozilla.org/show_bug.cgi?id=922603>`__,
+but since we started using sanitizers, we have found over 1000 bugs with
+these tools.
+
+Another defect oracle can be a reference implementation. Comparing
+program behavior (typically output) between two programs or two modes of
+the same program that should produce the same outputs can find complex
+correctness issues. This method is often called differential testing.
+
+One example where this is regularly used to find issues is the Mozilla
+JavaScript engine: Running random programs with and without JIT
+compilation enabled finds lots of problems with the JIT implementation.
+One example for such a bug is `Bug
+1404636 <https://bugzilla.mozilla.org/show_bug.cgi?id=1404636>`__.
+
+Component Decoupling
+~~~~~~~~~~~~~~~~~~~~
+
+Being able to test components in isolation can be an advantage for
+fuzzing (both for performance and reproducibility). Clear boundaries
+between different components and documentation that explains the
+contracts usually help with this goal. Sometimes it might be useful to
+mock a certain component that the target component is interacting with
+and that is much harder if the components are tightly coupled and their
+contracts unclear. Of course, this does not mean that one should only
+test components in isolation. Sometimes, testing the interaction between
+them is even desirable and does not hurt performance at all.
+
+Avoiding external I/O
+~~~~~~~~~~~~~~~~~~~~~
+
+External I/O like network or file interactions are bad for performance
+and can introduce additional non-determinism. Providing interfaces to
+process data directly from memory instead is usually much more helpful.
+
+.. _Well defined behaviour and safety:
+
+Well-defined Behavior and Safety
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This requirement mostly ties in where defect oracles ended and is one of
+the most important problems seen in the wild nowadays with fuzzing. If a
+part of your program’s behavior is unspecified, then this potentially
+leads to bad times if the behavior is considered a defect by fuzzing.
+For example, if your code has crashes that are not considered bugs, then
+your code might be unsuitable for fuzzing. Your component should be
+fuzzing safe, meaning that any defect oracle (e.g. assertion or crash)
+triggered by the fuzzer is considered a bug. This important aspect is
+often neglected. Be aware that any false positives cause both
+performance degradation and additional manual work for your fuzzing
+team. The Mozilla JS developers for example have implemented this
+concept in a “--fuzzing-safe” switch which disables harmful functions.
+Sometimes, crashes cannot be avoided for handling certain error
+conditions. In such situations, it is important to mark these crashes in
+a way the fuzzer can recognize and distinguish them from undesired
+crashes. However, keep in mind that crashes in general can be disruptive
+to the fuzzing process. Performance is an important aspect of fuzzing
+and frequent crashes can severely degrade performance.
+
+.. _Reproducibility:
+
+Reproducibility
+~~~~~~~~~~~~~~~
+
+Being able to reproduce issues found with fuzzing is necessary for
+several reasons: First, you as the developer probably want a test that
+reproduces the issue so you can debug it better. Our feedback from most
+developers is that traces without a reproducible test can help to find a
+problem, but it makes the whole process very complicated. Some of these
+non-reproducible bugs never get fixed. Second, having a reproducible
+test also helps the triage process by allowing an automated bisection to
+find the responsible developer. Last but not least, the test can be
+added to a test suite, used for automated verification of fixes and even
+serve as a basis for more fuzzing.
+
+Adding functionality to the program that improve reproducibility is
+therefore a good idea in case non-reproducible issues are found. Some
+examples are shown in the next section.
+
+While many problems with reproducibility are specific for the project
+you are working on, there is one source of these problems that many
+programs have in common: Threading. While some bugs only occur in the
+first place due to concurrency, some other bugs would be perfectly
+reproducible without threads, but are intermittent and hard to with
+threading enabled. If the bug is indeed caused by a data race, then
+tools like ThreadSanitizer will help and we are currently working on
+making ThreadSanitizer usable on Firefox. For bugs that are not caused
+by threading, it sometimes makes sense to be able to disable threading
+or limit the amount of worker threads involved.
+
+Supporting Code
+~~~~~~~~~~~~~~~
+
+Some possibilities of what support implementations for fuzzing can do
+have already been named in the previous sections: Additional defect
+oracles and functionality to improve reproducibility and safety. In
+fact, many features added specifically for fuzzing fit into one of these
+categories. However, there’s room for more: Often, there are ways to
+make it easier for fuzzers to exercise complex and hard to reach parts
+of your code. For example, if a certain optimization feature is only
+turned on under very specific conditions (that are not a requirement for
+the optimization), then it makes sense to add a functionality to force
+it on. Then, a fuzzer can hit the optimization code much more
+frequently, increasing the chance to find issues. Some examples from
+Firefox and SpiderMonkey:
+
+- The `FuzzingFunctions <https://searchfox.org/mozilla-central/rev/efdf9bb55789ea782ae3a431bda6be74a87b041e/dom/webidl/FuzzingFunctions.webidl#15>`__
+  interface in the browser allows fuzzing tools to perform GC/CC, tune various
+  settings related to garbage collection or enable features like accessibility
+  mode. Being able to force a garbage collection at a specific time helped
+  identifying lots of problems in the past.
+
+- The --ion-eager and --baseline-eager flags for the JS shell force JIT
+  compilation at various stages, rather than using the builtin
+  heuristic to enable it only for hot functions.
+
+- The --no-threads flag disables all threading (if possible) in the JS shell.
+  This makes some bugs reproduce deterministically that would otherwise be
+  intermittent and harder to find. However, some bugs that only occur with
+  threading can’t be found with this option enabled.
+
+Another important feature that must be turned off for fuzzing is
+checksums. Many file formats use checksums to validate a file before
+processing it. If a checksum feature is still enabled, fuzzers are
+likely never going to produce valid files. The same often holds for
+cryptographic signatures. Being able to turn off the validation of these
+features as part of a fuzzing switch is extremely helpful.
+
+An example for such a checksum can be found in the
+`FlacDemuxer <https://searchfox.org/mozilla-central/rev/efdf9bb55789ea782ae3a431bda6be74a87b041e/dom/media/flac/FlacDemuxer.cpp#494>`__.
+
+Test Samples
+~~~~~~~~~~~~
+
+Some fuzzing strategies make use of existing data that is mutated to
+produce the new random data. In fact, mutation-based strategies are
+typically superior to others if the original samples are of good quality
+because the originals carry a lot of semantics that the fuzzer does not
+have to know about or implement. However, success here really stands and
+falls with the quality of the samples. If the originals don’t cover
+certain parts of the implementation, then the fuzzer will also have to
+do more work to get there.
+
+
+Fuzz Blockers
+~~~~~~~~~~~~~
+
+Fuzz blockers are issues that prevent fuzzers from being as
+effective as possible. Depending on the fuzzer and its scope a fuzz blocker
+in one area (or component) can impede performance in other areas and in
+some cases block the fuzzer all together. Some examples are:
+
+- Frequent crashes - These can block code paths and waste compute
+  resources due to the need to relaunch the fuzzing target and handle
+  the results (regardless of whether it is ignored or reported). This can also
+  include assertions that are mostly benign in many cases are but easily
+  triggered by fuzzers.
+
+- Frequent hangs / timeouts - This includes any issue that slows down
+  or blocks execution of the fuzzer or the target.
+
+- Hard to bucket - This includes crashes such as stack overflows or any issue
+  that crashes in an inconsistent location. This also includes issues that
+  corrupt logs/debugger output or provide a broken/invalid crash report.
+
+- Broken builds - This is fairly straightforward, without up-to-date builds
+  fuzzers are unable to run or verify fixes.
+
+- Missing instrumentation - In some cases tools such as ASan are used as
+  defect oracles and are required by the fuzzing tools to allow for proper
+  automation. In other cases incomplete instrumentation can give a false sense
+  of stability or make investigating issues much more time consuming. Although
+  this is not necessarily blocking the fuzzers it should be prioritized
+  appropriately.
+
+Since these types of crashes harm the overall fuzzing progress, it is important
+for them to be addressed in a timely manner. Even if the bug itself might seem
+trivial and low priority for the product, it can still have devastating effects
+on fuzzing and hence prevent finding other critical issues.
+
+Issues in Bugzilla are marked as fuzz blockers by adding “[fuzzblocker]”
+to the “Whiteboard” field. A list of open issues marked as fuzz blockers
+can be found on `Bugzilla <https://bugzilla.mozilla.org/buglist.cgi?cmdtype=dorem&remaction=run&namedcmd=fuzzblockers&sharer_id=486634>`__.
+
+
+Documentation
+~~~~~~~~~~~~~
+
+It is important for the fuzzing team to know how your software, tests
+and designs work. Even obvious tasks, like how a test program is
+supposed to be invoked, which options are safe, etc. might be hard to
+figure out for the person doing the testing, just as you are reading
+this manual right now to find out what is important in fuzzing.
+
+Contact Us
+~~~~~~~~~~
+
+The fuzzing team can be reached at
+`fuzzing@mozilla.com <mailto:fuzzing@mozilla.com>`__ or
+`on Matrix <https://chat.mozilla.org/#/room/#fuzzing:mozilla.org>`__
+and will be happy to help you with any questions about fuzzing
+you might have. We can help you find the right method of fuzzing for
+your feature, collaborate on the implementation and provide the
+infrastructure to run it and process the results accordingly.
author	Daniel Baumann <daniel.baumann@progress-linux.org>	2024-04-07 09:22:09 +0000
committer	Daniel Baumann <daniel.baumann@progress-linux.org>	2024-04-07 09:22:09 +0000
commit	43a97878ce14b72f0981164f87f2e35e14151312 (patch)
tree	620249daf56c0258faa40cbdcf9cfba06de2a846 /tools/fuzzing/docs
parent	Initial commit. (diff)
download	firefox-upstream.tar.xz firefox-upstream.zip