summaryrefslogtreecommitdiffstats
path: root/Documentation/dev-tools
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-27 10:05:51 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-27 10:05:51 +0000
commit5d1646d90e1f2cceb9f0828f4b28318cd0ec7744 (patch)
treea94efe259b9009378be6d90eb30d2b019d95c194 /Documentation/dev-tools
parentInitial commit. (diff)
downloadlinux-5d1646d90e1f2cceb9f0828f4b28318cd0ec7744.tar.xz
linux-5d1646d90e1f2cceb9f0828f4b28318cd0ec7744.zip
Adding upstream version 5.10.209.upstream/5.10.209upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'Documentation/dev-tools')
-rw-r--r--Documentation/dev-tools/coccinelle.rst512
-rw-r--r--Documentation/dev-tools/gcov.rst270
-rw-r--r--Documentation/dev-tools/gdb-kernel-debugging.rst179
-rw-r--r--Documentation/dev-tools/index.rst36
-rw-r--r--Documentation/dev-tools/kasan.rst355
-rw-r--r--Documentation/dev-tools/kcov.rst334
-rw-r--r--Documentation/dev-tools/kcsan.rst316
-rw-r--r--Documentation/dev-tools/kgdb.rst940
-rw-r--r--Documentation/dev-tools/kmemleak.rst259
-rw-r--r--Documentation/dev-tools/kselftest.rst350
-rw-r--r--Documentation/dev-tools/kunit/api/index.rst16
-rw-r--r--Documentation/dev-tools/kunit/api/test.rst11
-rw-r--r--Documentation/dev-tools/kunit/faq.rst103
-rw-r--r--Documentation/dev-tools/kunit/index.rst94
-rw-r--r--Documentation/dev-tools/kunit/kunit-tool.rst57
-rw-r--r--Documentation/dev-tools/kunit/start.rst237
-rw-r--r--Documentation/dev-tools/kunit/style.rst205
-rw-r--r--Documentation/dev-tools/kunit/usage.rst617
-rw-r--r--Documentation/dev-tools/sparse.rst102
-rw-r--r--Documentation/dev-tools/ubsan.rst88
20 files changed, 5081 insertions, 0 deletions
diff --git a/Documentation/dev-tools/coccinelle.rst b/Documentation/dev-tools/coccinelle.rst
new file mode 100644
index 000000000..74c5e6aee
--- /dev/null
+++ b/Documentation/dev-tools/coccinelle.rst
@@ -0,0 +1,512 @@
+.. Copyright 2010 Nicolas Palix <npalix@diku.dk>
+.. Copyright 2010 Julia Lawall <julia@diku.dk>
+.. Copyright 2010 Gilles Muller <Gilles.Muller@lip6.fr>
+
+.. highlight:: none
+
+.. _devtools_coccinelle:
+
+Coccinelle
+==========
+
+Coccinelle is a tool for pattern matching and text transformation that has
+many uses in kernel development, including the application of complex,
+tree-wide patches and detection of problematic programming patterns.
+
+Getting Coccinelle
+------------------
+
+The semantic patches included in the kernel use features and options
+which are provided by Coccinelle version 1.0.0-rc11 and above.
+Using earlier versions will fail as the option names used by
+the Coccinelle files and coccicheck have been updated.
+
+Coccinelle is available through the package manager
+of many distributions, e.g. :
+
+ - Debian
+ - Fedora
+ - Ubuntu
+ - OpenSUSE
+ - Arch Linux
+ - NetBSD
+ - FreeBSD
+
+Some distribution packages are obsolete and it is recommended
+to use the latest version released from the Coccinelle homepage at
+http://coccinelle.lip6.fr/
+
+Or from Github at:
+
+https://github.com/coccinelle/coccinelle
+
+Once you have it, run the following commands::
+
+ ./autogen
+ ./configure
+ make
+
+as a regular user, and install it with::
+
+ sudo make install
+
+More detailed installation instructions to build from source can be
+found at:
+
+https://github.com/coccinelle/coccinelle/blob/master/install.txt
+
+Supplemental documentation
+--------------------------
+
+For supplemental documentation refer to the wiki:
+
+https://bottest.wiki.kernel.org/coccicheck
+
+The wiki documentation always refers to the linux-next version of the script.
+
+For Semantic Patch Language(SmPL) grammar documentation refer to:
+
+http://coccinelle.lip6.fr/documentation.php
+
+Using Coccinelle on the Linux kernel
+------------------------------------
+
+A Coccinelle-specific target is defined in the top level
+Makefile. This target is named ``coccicheck`` and calls the ``coccicheck``
+front-end in the ``scripts`` directory.
+
+Four basic modes are defined: ``patch``, ``report``, ``context``, and
+``org``. The mode to use is specified by setting the MODE variable with
+``MODE=<mode>``.
+
+- ``patch`` proposes a fix, when possible.
+
+- ``report`` generates a list in the following format:
+ file:line:column-column: message
+
+- ``context`` highlights lines of interest and their context in a
+ diff-like style. Lines of interest are indicated with ``-``.
+
+- ``org`` generates a report in the Org mode format of Emacs.
+
+Note that not all semantic patches implement all modes. For easy use
+of Coccinelle, the default mode is "report".
+
+Two other modes provide some common combinations of these modes.
+
+- ``chain`` tries the previous modes in the order above until one succeeds.
+
+- ``rep+ctxt`` runs successively the report mode and the context mode.
+ It should be used with the C option (described later)
+ which checks the code on a file basis.
+
+Examples
+~~~~~~~~
+
+To make a report for every semantic patch, run the following command::
+
+ make coccicheck MODE=report
+
+To produce patches, run::
+
+ make coccicheck MODE=patch
+
+
+The coccicheck target applies every semantic patch available in the
+sub-directories of ``scripts/coccinelle`` to the entire Linux kernel.
+
+For each semantic patch, a commit message is proposed. It gives a
+description of the problem being checked by the semantic patch, and
+includes a reference to Coccinelle.
+
+As with any static code analyzer, Coccinelle produces false
+positives. Thus, reports must be carefully checked, and patches
+reviewed.
+
+To enable verbose messages set the V= variable, for example::
+
+ make coccicheck MODE=report V=1
+
+Coccinelle parallelization
+--------------------------
+
+By default, coccicheck tries to run as parallel as possible. To change
+the parallelism, set the J= variable. For example, to run across 4 CPUs::
+
+ make coccicheck MODE=report J=4
+
+As of Coccinelle 1.0.2 Coccinelle uses Ocaml parmap for parallelization;
+if support for this is detected you will benefit from parmap parallelization.
+
+When parmap is enabled coccicheck will enable dynamic load balancing by using
+``--chunksize 1`` argument. This ensures we keep feeding threads with work
+one by one, so that we avoid the situation where most work gets done by only
+a few threads. With dynamic load balancing, if a thread finishes early we keep
+feeding it more work.
+
+When parmap is enabled, if an error occurs in Coccinelle, this error
+value is propagated back, and the return value of the ``make coccicheck``
+command captures this return value.
+
+Using Coccinelle with a single semantic patch
+---------------------------------------------
+
+The optional make variable COCCI can be used to check a single
+semantic patch. In that case, the variable must be initialized with
+the name of the semantic patch to apply.
+
+For instance::
+
+ make coccicheck COCCI=<my_SP.cocci> MODE=patch
+
+or::
+
+ make coccicheck COCCI=<my_SP.cocci> MODE=report
+
+
+Controlling Which Files are Processed by Coccinelle
+---------------------------------------------------
+
+By default the entire kernel source tree is checked.
+
+To apply Coccinelle to a specific directory, ``M=`` can be used.
+For example, to check drivers/net/wireless/ one may write::
+
+ make coccicheck M=drivers/net/wireless/
+
+To apply Coccinelle on a file basis, instead of a directory basis, the
+C variable is used by the makefile to select which files to work with.
+This variable can be used to run scripts for the entire kernel, a
+specific directory, or for a single file.
+
+For example, to check drivers/bluetooth/bfusb.c, the value 1 is
+passed to the C variable to check files that make considers
+need to be compiled.::
+
+ make C=1 CHECK=scripts/coccicheck drivers/bluetooth/bfusb.o
+
+The value 2 is passed to the C variable to check files regardless of
+whether they need to be compiled or not.::
+
+ make C=2 CHECK=scripts/coccicheck drivers/bluetooth/bfusb.o
+
+In these modes, which work on a file basis, there is no information
+about semantic patches displayed, and no commit message proposed.
+
+This runs every semantic patch in scripts/coccinelle by default. The
+COCCI variable may additionally be used to only apply a single
+semantic patch as shown in the previous section.
+
+The "report" mode is the default. You can select another one with the
+MODE variable explained above.
+
+Debugging Coccinelle SmPL patches
+---------------------------------
+
+Using coccicheck is best as it provides in the spatch command line
+include options matching the options used when we compile the kernel.
+You can learn what these options are by using V=1; you could then
+manually run Coccinelle with debug options added.
+
+Alternatively you can debug running Coccinelle against SmPL patches
+by asking for stderr to be redirected to stderr. By default stderr
+is redirected to /dev/null; if you'd like to capture stderr you
+can specify the ``DEBUG_FILE="file.txt"`` option to coccicheck. For
+instance::
+
+ rm -f cocci.err
+ make coccicheck COCCI=scripts/coccinelle/free/kfree.cocci MODE=report DEBUG_FILE=cocci.err
+ cat cocci.err
+
+You can use SPFLAGS to add debugging flags; for instance you may want to
+add both --profile --show-trying to SPFLAGS when debugging. For example
+you may want to use::
+
+ rm -f err.log
+ export COCCI=scripts/coccinelle/misc/irqf_oneshot.cocci
+ make coccicheck DEBUG_FILE="err.log" MODE=report SPFLAGS="--profile --show-trying" M=./drivers/mfd/arizona-irq.c
+
+err.log will now have the profiling information, while stdout will
+provide some progress information as Coccinelle moves forward with
+work.
+
+DEBUG_FILE support is only supported when using coccinelle >= 1.0.2.
+
+.cocciconfig support
+--------------------
+
+Coccinelle supports reading .cocciconfig for default Coccinelle options that
+should be used every time spatch is spawned. The order of precedence for
+variables for .cocciconfig is as follows:
+
+- Your current user's home directory is processed first
+- Your directory from which spatch is called is processed next
+- The directory provided with the --dir option is processed last, if used
+
+Since coccicheck runs through make, it naturally runs from the kernel
+proper dir; as such the second rule above would be implied for picking up a
+.cocciconfig when using ``make coccicheck``.
+
+``make coccicheck`` also supports using M= targets. If you do not supply
+any M= target, it is assumed you want to target the entire kernel.
+The kernel coccicheck script has::
+
+ if [ "$KBUILD_EXTMOD" = "" ] ; then
+ OPTIONS="--dir $srctree $COCCIINCLUDE"
+ else
+ OPTIONS="--dir $KBUILD_EXTMOD $COCCIINCLUDE"
+ fi
+
+KBUILD_EXTMOD is set when an explicit target with M= is used. For both cases
+the spatch --dir argument is used, as such third rule applies when whether M=
+is used or not, and when M= is used the target directory can have its own
+.cocciconfig file. When M= is not passed as an argument to coccicheck the
+target directory is the same as the directory from where spatch was called.
+
+If not using the kernel's coccicheck target, keep the above precedence
+order logic of .cocciconfig reading. If using the kernel's coccicheck target,
+override any of the kernel's .coccicheck's settings using SPFLAGS.
+
+We help Coccinelle when used against Linux with a set of sensible default
+options for Linux with our own Linux .cocciconfig. This hints to coccinelle
+that git can be used for ``git grep`` queries over coccigrep. A timeout of 200
+seconds should suffice for now.
+
+The options picked up by coccinelle when reading a .cocciconfig do not appear
+as arguments to spatch processes running on your system. To confirm what
+options will be used by Coccinelle run::
+
+ spatch --print-options-only
+
+You can override with your own preferred index option by using SPFLAGS. Take
+note that when there are conflicting options Coccinelle takes precedence for
+the last options passed. Using .cocciconfig is possible to use idutils, however
+given the order of precedence followed by Coccinelle, since the kernel now
+carries its own .cocciconfig, you will need to use SPFLAGS to use idutils if
+desired. See below section "Additional flags" for more details on how to use
+idutils.
+
+Additional flags
+----------------
+
+Additional flags can be passed to spatch through the SPFLAGS
+variable. This works as Coccinelle respects the last flags
+given to it when options are in conflict. ::
+
+ make SPFLAGS=--use-glimpse coccicheck
+
+Coccinelle supports idutils as well but requires coccinelle >= 1.0.6.
+When no ID file is specified coccinelle assumes your ID database file
+is in the file .id-utils.index on the top level of the kernel. Coccinelle
+carries a script scripts/idutils_index.sh which creates the database with::
+
+ mkid -i C --output .id-utils.index
+
+If you have another database filename you can also just symlink with this
+name. ::
+
+ make SPFLAGS=--use-idutils coccicheck
+
+Alternatively you can specify the database filename explicitly, for
+instance::
+
+ make SPFLAGS="--use-idutils /full-path/to/ID" coccicheck
+
+See ``spatch --help`` to learn more about spatch options.
+
+Note that the ``--use-glimpse`` and ``--use-idutils`` options
+require external tools for indexing the code. None of them is
+thus active by default. However, by indexing the code with
+one of these tools, and according to the cocci file used,
+spatch could proceed the entire code base more quickly.
+
+SmPL patch specific options
+---------------------------
+
+SmPL patches can have their own requirements for options passed
+to Coccinelle. SmPL patch-specific options can be provided by
+providing them at the top of the SmPL patch, for instance::
+
+ // Options: --no-includes --include-headers
+
+SmPL patch Coccinelle requirements
+----------------------------------
+
+As Coccinelle features get added some more advanced SmPL patches
+may require newer versions of Coccinelle. If an SmPL patch requires
+a minimum version of Coccinelle, this can be specified as follows,
+as an example if requiring at least Coccinelle >= 1.0.5::
+
+ // Requires: 1.0.5
+
+Proposing new semantic patches
+------------------------------
+
+New semantic patches can be proposed and submitted by kernel
+developers. For sake of clarity, they should be organized in the
+sub-directories of ``scripts/coccinelle/``.
+
+
+Detailed description of the ``report`` mode
+-------------------------------------------
+
+``report`` generates a list in the following format::
+
+ file:line:column-column: message
+
+Example
+~~~~~~~
+
+Running::
+
+ make coccicheck MODE=report COCCI=scripts/coccinelle/api/err_cast.cocci
+
+will execute the following part of the SmPL script::
+
+ <smpl>
+ @r depends on !context && !patch && (org || report)@
+ expression x;
+ position p;
+ @@
+
+ ERR_PTR@p(PTR_ERR(x))
+
+ @script:python depends on report@
+ p << r.p;
+ x << r.x;
+ @@
+
+ msg="ERR_CAST can be used with %s" % (x)
+ coccilib.report.print_report(p[0], msg)
+ </smpl>
+
+This SmPL excerpt generates entries on the standard output, as
+illustrated below::
+
+ /home/user/linux/crypto/ctr.c:188:9-16: ERR_CAST can be used with alg
+ /home/user/linux/crypto/authenc.c:619:9-16: ERR_CAST can be used with auth
+ /home/user/linux/crypto/xts.c:227:9-16: ERR_CAST can be used with alg
+
+
+Detailed description of the ``patch`` mode
+------------------------------------------
+
+When the ``patch`` mode is available, it proposes a fix for each problem
+identified.
+
+Example
+~~~~~~~
+
+Running::
+
+ make coccicheck MODE=patch COCCI=scripts/coccinelle/api/err_cast.cocci
+
+will execute the following part of the SmPL script::
+
+ <smpl>
+ @ depends on !context && patch && !org && !report @
+ expression x;
+ @@
+
+ - ERR_PTR(PTR_ERR(x))
+ + ERR_CAST(x)
+ </smpl>
+
+This SmPL excerpt generates patch hunks on the standard output, as
+illustrated below::
+
+ diff -u -p a/crypto/ctr.c b/crypto/ctr.c
+ --- a/crypto/ctr.c 2010-05-26 10:49:38.000000000 +0200
+ +++ b/crypto/ctr.c 2010-06-03 23:44:49.000000000 +0200
+ @@ -185,7 +185,7 @@ static struct crypto_instance *crypto_ct
+ alg = crypto_attr_alg(tb[1], CRYPTO_ALG_TYPE_CIPHER,
+ CRYPTO_ALG_TYPE_MASK);
+ if (IS_ERR(alg))
+ - return ERR_PTR(PTR_ERR(alg));
+ + return ERR_CAST(alg);
+
+ /* Block size must be >= 4 bytes. */
+ err = -EINVAL;
+
+Detailed description of the ``context`` mode
+--------------------------------------------
+
+``context`` highlights lines of interest and their context
+in a diff-like style.
+
+ **NOTE**: The diff-like output generated is NOT an applicable patch. The
+ intent of the ``context`` mode is to highlight the important lines
+ (annotated with minus, ``-``) and gives some surrounding context
+ lines around. This output can be used with the diff mode of
+ Emacs to review the code.
+
+Example
+~~~~~~~
+
+Running::
+
+ make coccicheck MODE=context COCCI=scripts/coccinelle/api/err_cast.cocci
+
+will execute the following part of the SmPL script::
+
+ <smpl>
+ @ depends on context && !patch && !org && !report@
+ expression x;
+ @@
+
+ * ERR_PTR(PTR_ERR(x))
+ </smpl>
+
+This SmPL excerpt generates diff hunks on the standard output, as
+illustrated below::
+
+ diff -u -p /home/user/linux/crypto/ctr.c /tmp/nothing
+ --- /home/user/linux/crypto/ctr.c 2010-05-26 10:49:38.000000000 +0200
+ +++ /tmp/nothing
+ @@ -185,7 +185,6 @@ static struct crypto_instance *crypto_ct
+ alg = crypto_attr_alg(tb[1], CRYPTO_ALG_TYPE_CIPHER,
+ CRYPTO_ALG_TYPE_MASK);
+ if (IS_ERR(alg))
+ - return ERR_PTR(PTR_ERR(alg));
+
+ /* Block size must be >= 4 bytes. */
+ err = -EINVAL;
+
+Detailed description of the ``org`` mode
+----------------------------------------
+
+``org`` generates a report in the Org mode format of Emacs.
+
+Example
+~~~~~~~
+
+Running::
+
+ make coccicheck MODE=org COCCI=scripts/coccinelle/api/err_cast.cocci
+
+will execute the following part of the SmPL script::
+
+ <smpl>
+ @r depends on !context && !patch && (org || report)@
+ expression x;
+ position p;
+ @@
+
+ ERR_PTR@p(PTR_ERR(x))
+
+ @script:python depends on org@
+ p << r.p;
+ x << r.x;
+ @@
+
+ msg="ERR_CAST can be used with %s" % (x)
+ msg_safe=msg.replace("[","@(").replace("]",")")
+ coccilib.org.print_todo(p[0], msg_safe)
+ </smpl>
+
+This SmPL excerpt generates Org entries on the standard output, as
+illustrated below::
+
+ * TODO [[view:/home/user/linux/crypto/ctr.c::face=ovl-face1::linb=188::colb=9::cole=16][ERR_CAST can be used with alg]]
+ * TODO [[view:/home/user/linux/crypto/authenc.c::face=ovl-face1::linb=619::colb=9::cole=16][ERR_CAST can be used with auth]]
+ * TODO [[view:/home/user/linux/crypto/xts.c::face=ovl-face1::linb=227::colb=9::cole=16][ERR_CAST can be used with alg]]
diff --git a/Documentation/dev-tools/gcov.rst b/Documentation/dev-tools/gcov.rst
new file mode 100644
index 000000000..9e989baae
--- /dev/null
+++ b/Documentation/dev-tools/gcov.rst
@@ -0,0 +1,270 @@
+Using gcov with the Linux kernel
+================================
+
+gcov profiling kernel support enables the use of GCC's coverage testing
+tool gcov_ with the Linux kernel. Coverage data of a running kernel
+is exported in gcov-compatible format via the "gcov" debugfs directory.
+To get coverage data for a specific file, change to the kernel build
+directory and use gcov with the ``-o`` option as follows (requires root)::
+
+ # cd /tmp/linux-out
+ # gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c
+
+This will create source code files annotated with execution counts
+in the current directory. In addition, graphical gcov front-ends such
+as lcov_ can be used to automate the process of collecting data
+for the entire kernel and provide coverage overviews in HTML format.
+
+Possible uses:
+
+* debugging (has this line been reached at all?)
+* test improvement (how do I change my test to cover these lines?)
+* minimizing kernel configurations (do I need this option if the
+ associated code is never run?)
+
+.. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html
+.. _lcov: http://ltp.sourceforge.net/coverage/lcov.php
+
+
+Preparation
+-----------
+
+Configure the kernel with::
+
+ CONFIG_DEBUG_FS=y
+ CONFIG_GCOV_KERNEL=y
+
+and to get coverage data for the entire kernel::
+
+ CONFIG_GCOV_PROFILE_ALL=y
+
+Note that kernels compiled with profiling flags will be significantly
+larger and run slower. Also CONFIG_GCOV_PROFILE_ALL may not be supported
+on all architectures.
+
+Profiling data will only become accessible once debugfs has been
+mounted::
+
+ mount -t debugfs none /sys/kernel/debug
+
+
+Customization
+-------------
+
+To enable profiling for specific files or directories, add a line
+similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)::
+
+ GCOV_PROFILE_main.o := y
+
+- For all files in one directory::
+
+ GCOV_PROFILE := y
+
+To exclude files from being profiled even when CONFIG_GCOV_PROFILE_ALL
+is specified, use::
+
+ GCOV_PROFILE_main.o := n
+
+and::
+
+ GCOV_PROFILE := n
+
+Only files which are linked to the main kernel image or are compiled as
+kernel modules are supported by this mechanism.
+
+
+Files
+-----
+
+The gcov kernel support creates the following files in debugfs:
+
+``/sys/kernel/debug/gcov``
+ Parent directory for all gcov-related files.
+
+``/sys/kernel/debug/gcov/reset``
+ Global reset file: resets all coverage data to zero when
+ written to.
+
+``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda``
+ The actual gcov data file as understood by the gcov
+ tool. Resets file coverage data to zero when written to.
+
+``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno``
+ Symbolic link to a static data file required by the gcov
+ tool. This file is generated by gcc when compiling with
+ option ``-ftest-coverage``.
+
+
+Modules
+-------
+
+Kernel modules may contain cleanup code which is only run during
+module unload time. The gcov mechanism provides a means to collect
+coverage data for such code by keeping a copy of the data associated
+with the unloaded module. This data remains available through debugfs.
+Once the module is loaded again, the associated coverage counters are
+initialized with the data from its previous instantiation.
+
+This behavior can be deactivated by specifying the gcov_persist kernel
+parameter::
+
+ gcov_persist=0
+
+At run-time, a user can also choose to discard data for an unloaded
+module by writing to its data file or the global reset file.
+
+
+Separated build and test machines
+---------------------------------
+
+The gcov kernel profiling infrastructure is designed to work out-of-the
+box for setups where kernels are built and run on the same machine. In
+cases where the kernel runs on a separate machine, special preparations
+must be made, depending on where the gcov tool is used:
+
+a) gcov is run on the TEST machine
+
+ The gcov tool version on the test machine must be compatible with the
+ gcc version used for kernel build. Also the following files need to be
+ copied from build to test machine:
+
+ from the source tree:
+ - all C source files + headers
+
+ from the build tree:
+ - all C source files + headers
+ - all .gcda and .gcno files
+ - all links to directories
+
+ It is important to note that these files need to be placed into the
+ exact same file system location on the test machine as on the build
+ machine. If any of the path components is symbolic link, the actual
+ directory needs to be used instead (due to make's CURDIR handling).
+
+b) gcov is run on the BUILD machine
+
+ The following files need to be copied after each test case from test
+ to build machine:
+
+ from the gcov directory in sysfs:
+ - all .gcda files
+ - all links to .gcno files
+
+ These files can be copied to any location on the build machine. gcov
+ must then be called with the -o option pointing to that directory.
+
+ Example directory setup on the build machine::
+
+ /tmp/linux: kernel source tree
+ /tmp/out: kernel build directory as specified by make O=
+ /tmp/coverage: location of the files copied from the test machine
+
+ [user@build] cd /tmp/out
+ [user@build] gcov -o /tmp/coverage/tmp/out/init main.c
+
+
+Note on compilers
+-----------------
+
+GCC and LLVM gcov tools are not necessarily compatible. Use gcov_ to work with
+GCC-generated .gcno and .gcda files, and use llvm-cov_ for Clang.
+
+.. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html
+.. _llvm-cov: https://llvm.org/docs/CommandGuide/llvm-cov.html
+
+Build differences between GCC and Clang gcov are handled by Kconfig. It
+automatically selects the appropriate gcov format depending on the detected
+toolchain.
+
+
+Troubleshooting
+---------------
+
+Problem
+ Compilation aborts during linker step.
+
+Cause
+ Profiling flags are specified for source files which are not
+ linked to the main kernel or which are linked by a custom
+ linker procedure.
+
+Solution
+ Exclude affected source files from profiling by specifying
+ ``GCOV_PROFILE := n`` or ``GCOV_PROFILE_basename.o := n`` in the
+ corresponding Makefile.
+
+Problem
+ Files copied from sysfs appear empty or incomplete.
+
+Cause
+ Due to the way seq_file works, some tools such as cp or tar
+ may not correctly copy files from sysfs.
+
+Solution
+ Use ``cat`` to read ``.gcda`` files and ``cp -d`` to copy links.
+ Alternatively use the mechanism shown in Appendix B.
+
+
+Appendix A: gather_on_build.sh
+------------------------------
+
+Sample script to gather coverage meta files on the build machine
+(see 6a):
+
+.. code-block:: sh
+
+ #!/bin/bash
+
+ KSRC=$1
+ KOBJ=$2
+ DEST=$3
+
+ if [ -z "$KSRC" ] || [ -z "$KOBJ" ] || [ -z "$DEST" ]; then
+ echo "Usage: $0 <ksrc directory> <kobj directory> <output.tar.gz>" >&2
+ exit 1
+ fi
+
+ KSRC=$(cd $KSRC; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
+ KOBJ=$(cd $KOBJ; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
+
+ find $KSRC $KOBJ \( -name '*.gcno' -o -name '*.[ch]' -o -type l \) -a \
+ -perm /u+r,g+r | tar cfz $DEST -P -T -
+
+ if [ $? -eq 0 ] ; then
+ echo "$DEST successfully created, copy to test system and unpack with:"
+ echo " tar xfz $DEST -P"
+ else
+ echo "Could not create file $DEST"
+ fi
+
+
+Appendix B: gather_on_test.sh
+-----------------------------
+
+Sample script to gather coverage data files on the test machine
+(see 6b):
+
+.. code-block:: sh
+
+ #!/bin/bash -e
+
+ DEST=$1
+ GCDA=/sys/kernel/debug/gcov
+
+ if [ -z "$DEST" ] ; then
+ echo "Usage: $0 <output.tar.gz>" >&2
+ exit 1
+ fi
+
+ TEMPDIR=$(mktemp -d)
+ echo Collecting data..
+ find $GCDA -type d -exec mkdir -p $TEMPDIR/\{\} \;
+ find $GCDA -name '*.gcda' -exec sh -c 'cat < $0 > '$TEMPDIR'/$0' {} \;
+ find $GCDA -name '*.gcno' -exec sh -c 'cp -d $0 '$TEMPDIR'/$0' {} \;
+ tar czf $DEST -C $TEMPDIR sys
+ rm -rf $TEMPDIR
+
+ echo "$DEST successfully created, copy to build system and unpack with:"
+ echo " tar xfz $DEST"
diff --git a/Documentation/dev-tools/gdb-kernel-debugging.rst b/Documentation/dev-tools/gdb-kernel-debugging.rst
new file mode 100644
index 000000000..10cdd990b
--- /dev/null
+++ b/Documentation/dev-tools/gdb-kernel-debugging.rst
@@ -0,0 +1,179 @@
+.. highlight:: none
+
+Debugging kernel and modules via gdb
+====================================
+
+The kernel debugger kgdb, hypervisors like QEMU or JTAG-based hardware
+interfaces allow to debug the Linux kernel and its modules during runtime
+using gdb. Gdb comes with a powerful scripting interface for python. The
+kernel provides a collection of helper scripts that can simplify typical
+kernel debugging steps. This is a short tutorial about how to enable and use
+them. It focuses on QEMU/KVM virtual machines as target, but the examples can
+be transferred to the other gdb stubs as well.
+
+
+Requirements
+------------
+
+- gdb 7.2+ (recommended: 7.4+) with python support enabled (typically true
+ for distributions)
+
+
+Setup
+-----
+
+- Create a virtual Linux machine for QEMU/KVM (see www.linux-kvm.org and
+ www.qemu.org for more details). For cross-development,
+ https://landley.net/aboriginal/bin keeps a pool of machine images and
+ toolchains that can be helpful to start from.
+
+- Build the kernel with CONFIG_GDB_SCRIPTS enabled, but leave
+ CONFIG_DEBUG_INFO_REDUCED off. If your architecture supports
+ CONFIG_FRAME_POINTER, keep it enabled.
+
+- Install that kernel on the guest, turn off KASLR if necessary by adding
+ "nokaslr" to the kernel command line.
+ Alternatively, QEMU allows to boot the kernel directly using -kernel,
+ -append, -initrd command line switches. This is generally only useful if
+ you do not depend on modules. See QEMU documentation for more details on
+ this mode. In this case, you should build the kernel with
+ CONFIG_RANDOMIZE_BASE disabled if the architecture supports KASLR.
+
+- Build the gdb scripts (required on kernels v5.1 and above)::
+
+ make scripts_gdb
+
+- Enable the gdb stub of QEMU/KVM, either
+
+ - at VM startup time by appending "-s" to the QEMU command line
+
+ or
+
+ - during runtime by issuing "gdbserver" from the QEMU monitor
+ console
+
+- cd /path/to/linux-build
+
+- Start gdb: gdb vmlinux
+
+ Note: Some distros may restrict auto-loading of gdb scripts to known safe
+ directories. In case gdb reports to refuse loading vmlinux-gdb.py, add::
+
+ add-auto-load-safe-path /path/to/linux-build
+
+ to ~/.gdbinit. See gdb help for more details.
+
+- Attach to the booted guest::
+
+ (gdb) target remote :1234
+
+
+Examples of using the Linux-provided gdb helpers
+------------------------------------------------
+
+- Load module (and main kernel) symbols::
+
+ (gdb) lx-symbols
+ loading vmlinux
+ scanning for modules in /home/user/linux/build
+ loading @0xffffffffa0020000: /home/user/linux/build/net/netfilter/xt_tcpudp.ko
+ loading @0xffffffffa0016000: /home/user/linux/build/net/netfilter/xt_pkttype.ko
+ loading @0xffffffffa0002000: /home/user/linux/build/net/netfilter/xt_limit.ko
+ loading @0xffffffffa00ca000: /home/user/linux/build/net/packet/af_packet.ko
+ loading @0xffffffffa003c000: /home/user/linux/build/fs/fuse/fuse.ko
+ ...
+ loading @0xffffffffa0000000: /home/user/linux/build/drivers/ata/ata_generic.ko
+
+- Set a breakpoint on some not yet loaded module function, e.g.::
+
+ (gdb) b btrfs_init_sysfs
+ Function "btrfs_init_sysfs" not defined.
+ Make breakpoint pending on future shared library load? (y or [n]) y
+ Breakpoint 1 (btrfs_init_sysfs) pending.
+
+- Continue the target::
+
+ (gdb) c
+
+- Load the module on the target and watch the symbols being loaded as well as
+ the breakpoint hit::
+
+ loading @0xffffffffa0034000: /home/user/linux/build/lib/libcrc32c.ko
+ loading @0xffffffffa0050000: /home/user/linux/build/lib/lzo/lzo_compress.ko
+ loading @0xffffffffa006e000: /home/user/linux/build/lib/zlib_deflate/zlib_deflate.ko
+ loading @0xffffffffa01b1000: /home/user/linux/build/fs/btrfs/btrfs.ko
+
+ Breakpoint 1, btrfs_init_sysfs () at /home/user/linux/fs/btrfs/sysfs.c:36
+ 36 btrfs_kset = kset_create_and_add("btrfs", NULL, fs_kobj);
+
+- Dump the log buffer of the target kernel::
+
+ (gdb) lx-dmesg
+ [ 0.000000] Initializing cgroup subsys cpuset
+ [ 0.000000] Initializing cgroup subsys cpu
+ [ 0.000000] Linux version 3.8.0-rc4-dbg+ (...
+ [ 0.000000] Command line: root=/dev/sda2 resume=/dev/sda1 vga=0x314
+ [ 0.000000] e820: BIOS-provided physical RAM map:
+ [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
+ [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
+ ....
+
+- Examine fields of the current task struct::
+
+ (gdb) p $lx_current().pid
+ $1 = 4998
+ (gdb) p $lx_current().comm
+ $2 = "modprobe\000\000\000\000\000\000\000"
+
+- Make use of the per-cpu function for the current or a specified CPU::
+
+ (gdb) p $lx_per_cpu("runqueues").nr_running
+ $3 = 1
+ (gdb) p $lx_per_cpu("runqueues", 2).nr_running
+ $4 = 0
+
+- Dig into hrtimers using the container_of helper::
+
+ (gdb) set $next = $lx_per_cpu("hrtimer_bases").clock_base[0].active.next
+ (gdb) p *$container_of($next, "struct hrtimer", "node")
+ $5 = {
+ node = {
+ node = {
+ __rb_parent_color = 18446612133355256072,
+ rb_right = 0x0 <irq_stack_union>,
+ rb_left = 0x0 <irq_stack_union>
+ },
+ expires = {
+ tv64 = 1835268000000
+ }
+ },
+ _softexpires = {
+ tv64 = 1835268000000
+ },
+ function = 0xffffffff81078232 <tick_sched_timer>,
+ base = 0xffff88003fd0d6f0,
+ state = 1,
+ start_pid = 0,
+ start_site = 0xffffffff81055c1f <hrtimer_start_range_ns+20>,
+ start_comm = "swapper/2\000\000\000\000\000\000"
+ }
+
+
+List of commands and functions
+------------------------------
+
+The number of commands and convenience functions may evolve over the time,
+this is just a snapshot of the initial version::
+
+ (gdb) apropos lx
+ function lx_current -- Return current task
+ function lx_module -- Find module by name and return the module variable
+ function lx_per_cpu -- Return per-cpu variable
+ function lx_task_by_pid -- Find Linux task by PID and return the task_struct variable
+ function lx_thread_info -- Calculate Linux thread_info from task variable
+ lx-dmesg -- Print Linux kernel log buffer
+ lx-lsmod -- List currently loaded modules
+ lx-symbols -- (Re-)load symbols of Linux kernel and currently loaded modules
+
+Detailed help can be obtained via "help <command-name>" for commands and "help
+function <function-name>" for convenience functions.
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
new file mode 100644
index 000000000..f7809c7b1
--- /dev/null
+++ b/Documentation/dev-tools/index.rst
@@ -0,0 +1,36 @@
+================================
+Development tools for the kernel
+================================
+
+This document is a collection of documents about development tools that can
+be used to work on the kernel. For now, the documents have been pulled
+together without any significant effort to integrate them into a coherent
+whole; patches welcome!
+
+.. class:: toc-title
+
+ Table of contents
+
+.. toctree::
+ :maxdepth: 2
+
+ coccinelle
+ sparse
+ kcov
+ gcov
+ kasan
+ ubsan
+ kmemleak
+ kcsan
+ gdb-kernel-debugging
+ kgdb
+ kselftest
+ kunit/index
+
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
new file mode 100644
index 000000000..2b68addaa
--- /dev/null
+++ b/Documentation/dev-tools/kasan.rst
@@ -0,0 +1,355 @@
+The Kernel Address Sanitizer (KASAN)
+====================================
+
+Overview
+--------
+
+KernelAddressSANitizer (KASAN) is a dynamic memory error detector designed to
+find out-of-bound and use-after-free bugs. KASAN has two modes: generic KASAN
+(similar to userspace ASan) and software tag-based KASAN (similar to userspace
+HWASan).
+
+KASAN uses compile-time instrumentation to insert validity checks before every
+memory access, and therefore requires a compiler version that supports that.
+
+Generic KASAN is supported in both GCC and Clang. With GCC it requires version
+8.3.0 or later. Any supported Clang version is compatible, but detection of
+out-of-bounds accesses for global variables is only supported since Clang 11.
+
+Tag-based KASAN is only supported in Clang.
+
+Currently generic KASAN is supported for the x86_64, arm64, xtensa, s390 and
+riscv architectures, and tag-based KASAN is supported only for arm64.
+
+Usage
+-----
+
+To enable KASAN configure kernel with::
+
+ CONFIG_KASAN = y
+
+and choose between CONFIG_KASAN_GENERIC (to enable generic KASAN) and
+CONFIG_KASAN_SW_TAGS (to enable software tag-based KASAN).
+
+You also need to choose between CONFIG_KASAN_OUTLINE and CONFIG_KASAN_INLINE.
+Outline and inline are compiler instrumentation types. The former produces
+smaller binary while the latter is 1.1 - 2 times faster.
+
+Both KASAN modes work with both SLUB and SLAB memory allocators.
+For better bug detection and nicer reporting, enable CONFIG_STACKTRACE.
+
+To augment reports with last allocation and freeing stack of the physical page,
+it is recommended to enable also CONFIG_PAGE_OWNER and boot with page_owner=on.
+
+To disable instrumentation for specific files or directories, add a line
+similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)::
+
+ KASAN_SANITIZE_main.o := n
+
+- For all files in one directory::
+
+ KASAN_SANITIZE := n
+
+Error reports
+~~~~~~~~~~~~~
+
+A typical out-of-bounds access generic KASAN report looks like this::
+
+ ==================================================================
+ BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa8/0xbc [test_kasan]
+ Write of size 1 at addr ffff8801f44ec37b by task insmod/2760
+
+ CPU: 1 PID: 2760 Comm: insmod Not tainted 4.19.0-rc3+ #698
+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
+ Call Trace:
+ dump_stack+0x94/0xd8
+ print_address_description+0x73/0x280
+ kasan_report+0x144/0x187
+ __asan_report_store1_noabort+0x17/0x20
+ kmalloc_oob_right+0xa8/0xbc [test_kasan]
+ kmalloc_tests_init+0x16/0x700 [test_kasan]
+ do_one_initcall+0xa5/0x3ae
+ do_init_module+0x1b6/0x547
+ load_module+0x75df/0x8070
+ __do_sys_init_module+0x1c6/0x200
+ __x64_sys_init_module+0x6e/0xb0
+ do_syscall_64+0x9f/0x2c0
+ entry_SYSCALL_64_after_hwframe+0x44/0xa9
+ RIP: 0033:0x7f96443109da
+ RSP: 002b:00007ffcf0b51b08 EFLAGS: 00000202 ORIG_RAX: 00000000000000af
+ RAX: ffffffffffffffda RBX: 000055dc3ee521a0 RCX: 00007f96443109da
+ RDX: 00007f96445cff88 RSI: 0000000000057a50 RDI: 00007f9644992000
+ RBP: 000055dc3ee510b0 R08: 0000000000000003 R09: 0000000000000000
+ R10: 00007f964430cd0a R11: 0000000000000202 R12: 00007f96445cff88
+ R13: 000055dc3ee51090 R14: 0000000000000000 R15: 0000000000000000
+
+ Allocated by task 2760:
+ save_stack+0x43/0xd0
+ kasan_kmalloc+0xa7/0xd0
+ kmem_cache_alloc_trace+0xe1/0x1b0
+ kmalloc_oob_right+0x56/0xbc [test_kasan]
+ kmalloc_tests_init+0x16/0x700 [test_kasan]
+ do_one_initcall+0xa5/0x3ae
+ do_init_module+0x1b6/0x547
+ load_module+0x75df/0x8070
+ __do_sys_init_module+0x1c6/0x200
+ __x64_sys_init_module+0x6e/0xb0
+ do_syscall_64+0x9f/0x2c0
+ entry_SYSCALL_64_after_hwframe+0x44/0xa9
+
+ Freed by task 815:
+ save_stack+0x43/0xd0
+ __kasan_slab_free+0x135/0x190
+ kasan_slab_free+0xe/0x10
+ kfree+0x93/0x1a0
+ umh_complete+0x6a/0xa0
+ call_usermodehelper_exec_async+0x4c3/0x640
+ ret_from_fork+0x35/0x40
+
+ The buggy address belongs to the object at ffff8801f44ec300
+ which belongs to the cache kmalloc-128 of size 128
+ The buggy address is located 123 bytes inside of
+ 128-byte region [ffff8801f44ec300, ffff8801f44ec380)
+ The buggy address belongs to the page:
+ page:ffffea0007d13b00 count:1 mapcount:0 mapping:ffff8801f7001640 index:0x0
+ flags: 0x200000000000100(slab)
+ raw: 0200000000000100 ffffea0007d11dc0 0000001a0000001a ffff8801f7001640
+ raw: 0000000000000000 0000000080150015 00000001ffffffff 0000000000000000
+ page dumped because: kasan: bad access detected
+
+ Memory state around the buggy address:
+ ffff8801f44ec200: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
+ ffff8801f44ec280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
+ >ffff8801f44ec300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03
+ ^
+ ffff8801f44ec380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
+ ffff8801f44ec400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
+ ==================================================================
+
+The header of the report provides a short summary of what kind of bug happened
+and what kind of access caused it. It's followed by a stack trace of the bad
+access, a stack trace of where the accessed memory was allocated (in case bad
+access happens on a slab object), and a stack trace of where the object was
+freed (in case of a use-after-free bug report). Next comes a description of
+the accessed slab object and information about the accessed memory page.
+
+In the last section the report shows memory state around the accessed address.
+Reading this part requires some understanding of how KASAN works.
+
+The state of each 8 aligned bytes of memory is encoded in one shadow byte.
+Those 8 bytes can be accessible, partially accessible, freed or be a redzone.
+We use the following encoding for each shadow byte: 0 means that all 8 bytes
+of the corresponding memory region are accessible; number N (1 <= N <= 7) means
+that the first N bytes are accessible, and other (8 - N) bytes are not;
+any negative value indicates that the entire 8-byte word is inaccessible.
+We use different negative values to distinguish between different kinds of
+inaccessible memory like redzones or freed memory (see mm/kasan/kasan.h).
+
+In the report above the arrows point to the shadow byte 03, which means that
+the accessed address is partially accessible.
+
+For tag-based KASAN this last report section shows the memory tags around the
+accessed address (see Implementation details section).
+
+
+Implementation details
+----------------------
+
+Generic KASAN
+~~~~~~~~~~~~~
+
+From a high level, our approach to memory error detection is similar to that
+of kmemcheck: use shadow memory to record whether each byte of memory is safe
+to access, and use compile-time instrumentation to insert checks of shadow
+memory on each memory access.
+
+Generic KASAN dedicates 1/8th of kernel memory to its shadow memory (e.g. 16TB
+to cover 128TB on x86_64) and uses direct mapping with a scale and offset to
+translate a memory address to its corresponding shadow address.
+
+Here is the function which translates an address to its corresponding shadow
+address::
+
+ static inline void *kasan_mem_to_shadow(const void *addr)
+ {
+ return ((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT)
+ + KASAN_SHADOW_OFFSET;
+ }
+
+where ``KASAN_SHADOW_SCALE_SHIFT = 3``.
+
+Compile-time instrumentation is used to insert memory access checks. Compiler
+inserts function calls (__asan_load*(addr), __asan_store*(addr)) before each
+memory access of size 1, 2, 4, 8 or 16. These functions check whether memory
+access is valid or not by checking corresponding shadow memory.
+
+GCC 5.0 has possibility to perform inline instrumentation. Instead of making
+function calls GCC directly inserts the code to check the shadow memory.
+This option significantly enlarges kernel but it gives x1.1-x2 performance
+boost over outline instrumented kernel.
+
+Generic KASAN prints up to 2 call_rcu() call stacks in reports, the last one
+and the second to last.
+
+Software tag-based KASAN
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Tag-based KASAN uses the Top Byte Ignore (TBI) feature of modern arm64 CPUs to
+store a pointer tag in the top byte of kernel pointers. Like generic KASAN it
+uses shadow memory to store memory tags associated with each 16-byte memory
+cell (therefore it dedicates 1/16th of the kernel memory for shadow memory).
+
+On each memory allocation tag-based KASAN generates a random tag, tags the
+allocated memory with this tag, and embeds this tag into the returned pointer.
+Software tag-based KASAN uses compile-time instrumentation to insert checks
+before each memory access. These checks make sure that tag of the memory that
+is being accessed is equal to tag of the pointer that is used to access this
+memory. In case of a tag mismatch tag-based KASAN prints a bug report.
+
+Software tag-based KASAN also has two instrumentation modes (outline, that
+emits callbacks to check memory accesses; and inline, that performs the shadow
+memory checks inline). With outline instrumentation mode, a bug report is
+simply printed from the function that performs the access check. With inline
+instrumentation a brk instruction is emitted by the compiler, and a dedicated
+brk handler is used to print bug reports.
+
+A potential expansion of this mode is a hardware tag-based mode, which would
+use hardware memory tagging support instead of compiler instrumentation and
+manual shadow memory manipulation.
+
+What memory accesses are sanitised by KASAN?
+--------------------------------------------
+
+The kernel maps memory in a number of different parts of the address
+space. This poses something of a problem for KASAN, which requires
+that all addresses accessed by instrumented code have a valid shadow
+region.
+
+The range of kernel virtual addresses is large: there is not enough
+real memory to support a real shadow region for every address that
+could be accessed by the kernel.
+
+By default
+~~~~~~~~~~
+
+By default, architectures only map real memory over the shadow region
+for the linear mapping (and potentially other small areas). For all
+other areas - such as vmalloc and vmemmap space - a single read-only
+page is mapped over the shadow area. This read-only shadow page
+declares all memory accesses as permitted.
+
+This presents a problem for modules: they do not live in the linear
+mapping, but in a dedicated module space. By hooking in to the module
+allocator, KASAN can temporarily map real shadow memory to cover
+them. This allows detection of invalid accesses to module globals, for
+example.
+
+This also creates an incompatibility with ``VMAP_STACK``: if the stack
+lives in vmalloc space, it will be shadowed by the read-only page, and
+the kernel will fault when trying to set up the shadow data for stack
+variables.
+
+CONFIG_KASAN_VMALLOC
+~~~~~~~~~~~~~~~~~~~~
+
+With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the
+cost of greater memory usage. Currently this is only supported on x86.
+
+This works by hooking into vmalloc and vmap, and dynamically
+allocating real shadow memory to back the mappings.
+
+Most mappings in vmalloc space are small, requiring less than a full
+page of shadow space. Allocating a full shadow page per mapping would
+therefore be wasteful. Furthermore, to ensure that different mappings
+use different shadow pages, mappings would have to be aligned to
+``KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE``.
+
+Instead, we share backing space across multiple mappings. We allocate
+a backing page when a mapping in vmalloc space uses a particular page
+of the shadow region. This page can be shared by other vmalloc
+mappings later on.
+
+We hook in to the vmap infrastructure to lazily clean up unused shadow
+memory.
+
+To avoid the difficulties around swapping mappings around, we expect
+that the part of the shadow region that covers the vmalloc space will
+not be covered by the early shadow page, but will be left
+unmapped. This will require changes in arch-specific code.
+
+This allows ``VMAP_STACK`` support on x86, and can simplify support of
+architectures that do not have a fixed module region.
+
+CONFIG_KASAN_KUNIT_TEST & CONFIG_TEST_KASAN_MODULE
+--------------------------------------------------
+
+``CONFIG_KASAN_KUNIT_TEST`` utilizes the KUnit Test Framework for testing.
+This means each test focuses on a small unit of functionality and
+there are a few ways these tests can be run.
+
+Each test will print the KASAN report if an error is detected and then
+print the number of the test and the status of the test:
+
+pass::
+
+ ok 28 - kmalloc_double_kzfree
+
+or, if kmalloc failed::
+
+ # kmalloc_large_oob_right: ASSERTION FAILED at lib/test_kasan.c:163
+ Expected ptr is not null, but is
+ not ok 4 - kmalloc_large_oob_right
+
+or, if a KASAN report was expected, but not found::
+
+ # kmalloc_double_kzfree: EXPECTATION FAILED at lib/test_kasan.c:629
+ Expected kasan_data->report_expected == kasan_data->report_found, but
+ kasan_data->report_expected == 1
+ kasan_data->report_found == 0
+ not ok 28 - kmalloc_double_kzfree
+
+All test statuses are tracked as they run and an overall status will
+be printed at the end::
+
+ ok 1 - kasan
+
+or::
+
+ not ok 1 - kasan
+
+(1) Loadable Module
+~~~~~~~~~~~~~~~~~~~~
+
+With ``CONFIG_KUNIT`` enabled, ``CONFIG_KASAN_KUNIT_TEST`` can be built as
+a loadable module and run on any architecture that supports KASAN
+using something like insmod or modprobe. The module is called ``test_kasan``.
+
+(2) Built-In
+~~~~~~~~~~~~~
+
+With ``CONFIG_KUNIT`` built-in, ``CONFIG_KASAN_KUNIT_TEST`` can be built-in
+on any architecure that supports KASAN. These and any other KUnit
+tests enabled will run and print the results at boot as a late-init
+call.
+
+(3) Using kunit_tool
+~~~~~~~~~~~~~~~~~~~~~
+
+With ``CONFIG_KUNIT`` and ``CONFIG_KASAN_KUNIT_TEST`` built-in, we can also
+use kunit_tool to see the results of these along with other KUnit
+tests in a more readable way. This will not print the KASAN reports
+of tests that passed. Use `KUnit documentation <https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html>`_ for more up-to-date
+information on kunit_tool.
+
+.. _KUnit: https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html
+
+``CONFIG_TEST_KASAN_MODULE`` is a set of KASAN tests that could not be
+converted to KUnit. These tests can be run only as a module with
+``CONFIG_TEST_KASAN_MODULE`` built as a loadable module and
+``CONFIG_KASAN`` built-in. The type of error expected and the
+function being run is printed before the expression expected to give
+an error. Then the error is printed, if found, and that test
+should be interpretted to pass only if the error was the one expected
+by the test.
diff --git a/Documentation/dev-tools/kcov.rst b/Documentation/dev-tools/kcov.rst
new file mode 100644
index 000000000..8548b0b04
--- /dev/null
+++ b/Documentation/dev-tools/kcov.rst
@@ -0,0 +1,334 @@
+kcov: code coverage for fuzzing
+===============================
+
+kcov exposes kernel code coverage information in a form suitable for coverage-
+guided fuzzing (randomized testing). Coverage data of a running kernel is
+exported via the "kcov" debugfs file. Coverage collection is enabled on a task
+basis, and thus it can capture precise coverage of a single system call.
+
+Note that kcov does not aim to collect as much coverage as possible. It aims
+to collect more or less stable coverage that is function of syscall inputs.
+To achieve this goal it does not collect coverage in soft/hard interrupts
+and instrumentation of some inherently non-deterministic parts of kernel is
+disabled (e.g. scheduler, locking).
+
+kcov is also able to collect comparison operands from the instrumented code
+(this feature currently requires that the kernel is compiled with clang).
+
+Prerequisites
+-------------
+
+Configure the kernel with::
+
+ CONFIG_KCOV=y
+
+CONFIG_KCOV requires gcc 6.1.0 or later.
+
+If the comparison operands need to be collected, set::
+
+ CONFIG_KCOV_ENABLE_COMPARISONS=y
+
+Profiling data will only become accessible once debugfs has been mounted::
+
+ mount -t debugfs none /sys/kernel/debug
+
+Coverage collection
+-------------------
+
+The following program demonstrates coverage collection from within a test
+program using kcov:
+
+.. code-block:: c
+
+ #include <stdio.h>
+ #include <stddef.h>
+ #include <stdint.h>
+ #include <stdlib.h>
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <sys/ioctl.h>
+ #include <sys/mman.h>
+ #include <unistd.h>
+ #include <fcntl.h>
+
+ #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
+ #define KCOV_ENABLE _IO('c', 100)
+ #define KCOV_DISABLE _IO('c', 101)
+ #define COVER_SIZE (64<<10)
+
+ #define KCOV_TRACE_PC 0
+ #define KCOV_TRACE_CMP 1
+
+ int main(int argc, char **argv)
+ {
+ int fd;
+ unsigned long *cover, n, i;
+
+ /* A single fd descriptor allows coverage collection on a single
+ * thread.
+ */
+ fd = open("/sys/kernel/debug/kcov", O_RDWR);
+ if (fd == -1)
+ perror("open"), exit(1);
+ /* Setup trace mode and trace size. */
+ if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
+ perror("ioctl"), exit(1);
+ /* Mmap buffer shared between kernel- and user-space. */
+ cover = (unsigned long*)mmap(NULL, COVER_SIZE * sizeof(unsigned long),
+ PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ if ((void*)cover == MAP_FAILED)
+ perror("mmap"), exit(1);
+ /* Enable coverage collection on the current thread. */
+ if (ioctl(fd, KCOV_ENABLE, KCOV_TRACE_PC))
+ perror("ioctl"), exit(1);
+ /* Reset coverage from the tail of the ioctl() call. */
+ __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);
+ /* That's the target syscal call. */
+ read(-1, NULL, 0);
+ /* Read number of PCs collected. */
+ n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
+ for (i = 0; i < n; i++)
+ printf("0x%lx\n", cover[i + 1]);
+ /* Disable coverage collection for the current thread. After this call
+ * coverage can be enabled for a different thread.
+ */
+ if (ioctl(fd, KCOV_DISABLE, 0))
+ perror("ioctl"), exit(1);
+ /* Free resources. */
+ if (munmap(cover, COVER_SIZE * sizeof(unsigned long)))
+ perror("munmap"), exit(1);
+ if (close(fd))
+ perror("close"), exit(1);
+ return 0;
+ }
+
+After piping through addr2line output of the program looks as follows::
+
+ SyS_read
+ fs/read_write.c:562
+ __fdget_pos
+ fs/file.c:774
+ __fget_light
+ fs/file.c:746
+ __fget_light
+ fs/file.c:750
+ __fget_light
+ fs/file.c:760
+ __fdget_pos
+ fs/file.c:784
+ SyS_read
+ fs/read_write.c:562
+
+If a program needs to collect coverage from several threads (independently),
+it needs to open /sys/kernel/debug/kcov in each thread separately.
+
+The interface is fine-grained to allow efficient forking of test processes.
+That is, a parent process opens /sys/kernel/debug/kcov, enables trace mode,
+mmaps coverage buffer and then forks child processes in a loop. Child processes
+only need to enable coverage (disable happens automatically on thread end).
+
+Comparison operands collection
+------------------------------
+
+Comparison operands collection is similar to coverage collection:
+
+.. code-block:: c
+
+ /* Same includes and defines as above. */
+
+ /* Number of 64-bit words per record. */
+ #define KCOV_WORDS_PER_CMP 4
+
+ /*
+ * The format for the types of collected comparisons.
+ *
+ * Bit 0 shows whether one of the arguments is a compile-time constant.
+ * Bits 1 & 2 contain log2 of the argument size, up to 8 bytes.
+ */
+
+ #define KCOV_CMP_CONST (1 << 0)
+ #define KCOV_CMP_SIZE(n) ((n) << 1)
+ #define KCOV_CMP_MASK KCOV_CMP_SIZE(3)
+
+ int main(int argc, char **argv)
+ {
+ int fd;
+ uint64_t *cover, type, arg1, arg2, is_const, size;
+ unsigned long n, i;
+
+ fd = open("/sys/kernel/debug/kcov", O_RDWR);
+ if (fd == -1)
+ perror("open"), exit(1);
+ if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
+ perror("ioctl"), exit(1);
+ /*
+ * Note that the buffer pointer is of type uint64_t*, because all
+ * the comparison operands are promoted to uint64_t.
+ */
+ cover = (uint64_t *)mmap(NULL, COVER_SIZE * sizeof(unsigned long),
+ PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ if ((void*)cover == MAP_FAILED)
+ perror("mmap"), exit(1);
+ /* Note KCOV_TRACE_CMP instead of KCOV_TRACE_PC. */
+ if (ioctl(fd, KCOV_ENABLE, KCOV_TRACE_CMP))
+ perror("ioctl"), exit(1);
+ __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);
+ read(-1, NULL, 0);
+ /* Read number of comparisons collected. */
+ n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
+ for (i = 0; i < n; i++) {
+ type = cover[i * KCOV_WORDS_PER_CMP + 1];
+ /* arg1 and arg2 - operands of the comparison. */
+ arg1 = cover[i * KCOV_WORDS_PER_CMP + 2];
+ arg2 = cover[i * KCOV_WORDS_PER_CMP + 3];
+ /* ip - caller address. */
+ ip = cover[i * KCOV_WORDS_PER_CMP + 4];
+ /* size of the operands. */
+ size = 1 << ((type & KCOV_CMP_MASK) >> 1);
+ /* is_const - true if either operand is a compile-time constant.*/
+ is_const = type & KCOV_CMP_CONST;
+ printf("ip: 0x%lx type: 0x%lx, arg1: 0x%lx, arg2: 0x%lx, "
+ "size: %lu, %s\n",
+ ip, type, arg1, arg2, size,
+ is_const ? "const" : "non-const");
+ }
+ if (ioctl(fd, KCOV_DISABLE, 0))
+ perror("ioctl"), exit(1);
+ /* Free resources. */
+ if (munmap(cover, COVER_SIZE * sizeof(unsigned long)))
+ perror("munmap"), exit(1);
+ if (close(fd))
+ perror("close"), exit(1);
+ return 0;
+ }
+
+Note that the kcov modes (coverage collection or comparison operands) are
+mutually exclusive.
+
+Remote coverage collection
+--------------------------
+
+With KCOV_ENABLE coverage is collected only for syscalls that are issued
+from the current process. With KCOV_REMOTE_ENABLE it's possible to collect
+coverage for arbitrary parts of the kernel code, provided that those parts
+are annotated with kcov_remote_start()/kcov_remote_stop().
+
+This allows to collect coverage from two types of kernel background
+threads: the global ones, that are spawned during kernel boot in a limited
+number of instances (e.g. one USB hub_event() worker thread is spawned per
+USB HCD); and the local ones, that are spawned when a user interacts with
+some kernel interface (e.g. vhost workers); as well as from soft
+interrupts.
+
+To enable collecting coverage from a global background thread or from a
+softirq, a unique global handle must be assigned and passed to the
+corresponding kcov_remote_start() call. Then a userspace process can pass
+a list of such handles to the KCOV_REMOTE_ENABLE ioctl in the handles
+array field of the kcov_remote_arg struct. This will attach the used kcov
+device to the code sections, that are referenced by those handles.
+
+Since there might be many local background threads spawned from different
+userspace processes, we can't use a single global handle per annotation.
+Instead, the userspace process passes a non-zero handle through the
+common_handle field of the kcov_remote_arg struct. This common handle gets
+saved to the kcov_handle field in the current task_struct and needs to be
+passed to the newly spawned threads via custom annotations. Those threads
+should in turn be annotated with kcov_remote_start()/kcov_remote_stop().
+
+Internally kcov stores handles as u64 integers. The top byte of a handle
+is used to denote the id of a subsystem that this handle belongs to, and
+the lower 4 bytes are used to denote the id of a thread instance within
+that subsystem. A reserved value 0 is used as a subsystem id for common
+handles as they don't belong to a particular subsystem. The bytes 4-7 are
+currently reserved and must be zero. In the future the number of bytes
+used for the subsystem or handle ids might be increased.
+
+When a particular userspace proccess collects coverage via a common
+handle, kcov will collect coverage for each code section that is annotated
+to use the common handle obtained as kcov_handle from the current
+task_struct. However non common handles allow to collect coverage
+selectively from different subsystems.
+
+.. code-block:: c
+
+ struct kcov_remote_arg {
+ __u32 trace_mode;
+ __u32 area_size;
+ __u32 num_handles;
+ __aligned_u64 common_handle;
+ __aligned_u64 handles[0];
+ };
+
+ #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
+ #define KCOV_DISABLE _IO('c', 101)
+ #define KCOV_REMOTE_ENABLE _IOW('c', 102, struct kcov_remote_arg)
+
+ #define COVER_SIZE (64 << 10)
+
+ #define KCOV_TRACE_PC 0
+
+ #define KCOV_SUBSYSTEM_COMMON (0x00ull << 56)
+ #define KCOV_SUBSYSTEM_USB (0x01ull << 56)
+
+ #define KCOV_SUBSYSTEM_MASK (0xffull << 56)
+ #define KCOV_INSTANCE_MASK (0xffffffffull)
+
+ static inline __u64 kcov_remote_handle(__u64 subsys, __u64 inst)
+ {
+ if (subsys & ~KCOV_SUBSYSTEM_MASK || inst & ~KCOV_INSTANCE_MASK)
+ return 0;
+ return subsys | inst;
+ }
+
+ #define KCOV_COMMON_ID 0x42
+ #define KCOV_USB_BUS_NUM 1
+
+ int main(int argc, char **argv)
+ {
+ int fd;
+ unsigned long *cover, n, i;
+ struct kcov_remote_arg *arg;
+
+ fd = open("/sys/kernel/debug/kcov", O_RDWR);
+ if (fd == -1)
+ perror("open"), exit(1);
+ if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
+ perror("ioctl"), exit(1);
+ cover = (unsigned long*)mmap(NULL, COVER_SIZE * sizeof(unsigned long),
+ PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+ if ((void*)cover == MAP_FAILED)
+ perror("mmap"), exit(1);
+
+ /* Enable coverage collection via common handle and from USB bus #1. */
+ arg = calloc(1, sizeof(*arg) + sizeof(uint64_t));
+ if (!arg)
+ perror("calloc"), exit(1);
+ arg->trace_mode = KCOV_TRACE_PC;
+ arg->area_size = COVER_SIZE;
+ arg->num_handles = 1;
+ arg->common_handle = kcov_remote_handle(KCOV_SUBSYSTEM_COMMON,
+ KCOV_COMMON_ID);
+ arg->handles[0] = kcov_remote_handle(KCOV_SUBSYSTEM_USB,
+ KCOV_USB_BUS_NUM);
+ if (ioctl(fd, KCOV_REMOTE_ENABLE, arg))
+ perror("ioctl"), free(arg), exit(1);
+ free(arg);
+
+ /*
+ * Here the user needs to trigger execution of a kernel code section
+ * that is either annotated with the common handle, or to trigger some
+ * activity on USB bus #1.
+ */
+ sleep(2);
+
+ n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
+ for (i = 0; i < n; i++)
+ printf("0x%lx\n", cover[i + 1]);
+ if (ioctl(fd, KCOV_DISABLE, 0))
+ perror("ioctl"), exit(1);
+ if (munmap(cover, COVER_SIZE * sizeof(unsigned long)))
+ perror("munmap"), exit(1);
+ if (close(fd))
+ perror("close"), exit(1);
+ return 0;
+ }
diff --git a/Documentation/dev-tools/kcsan.rst b/Documentation/dev-tools/kcsan.rst
new file mode 100644
index 000000000..be7a0b0e1
--- /dev/null
+++ b/Documentation/dev-tools/kcsan.rst
@@ -0,0 +1,316 @@
+The Kernel Concurrency Sanitizer (KCSAN)
+========================================
+
+The Kernel Concurrency Sanitizer (KCSAN) is a dynamic race detector, which
+relies on compile-time instrumentation, and uses a watchpoint-based sampling
+approach to detect races. KCSAN's primary purpose is to detect `data races`_.
+
+Usage
+-----
+
+KCSAN is supported by both GCC and Clang. With GCC we require version 11 or
+later, and with Clang also require version 11 or later.
+
+To enable KCSAN configure the kernel with::
+
+ CONFIG_KCSAN = y
+
+KCSAN provides several other configuration options to customize behaviour (see
+the respective help text in ``lib/Kconfig.kcsan`` for more info).
+
+Error reports
+~~~~~~~~~~~~~
+
+A typical data race report looks like this::
+
+ ==================================================================
+ BUG: KCSAN: data-race in generic_permission / kernfs_refresh_inode
+
+ write to 0xffff8fee4c40700c of 4 bytes by task 175 on cpu 4:
+ kernfs_refresh_inode+0x70/0x170
+ kernfs_iop_permission+0x4f/0x90
+ inode_permission+0x190/0x200
+ link_path_walk.part.0+0x503/0x8e0
+ path_lookupat.isra.0+0x69/0x4d0
+ filename_lookup+0x136/0x280
+ user_path_at_empty+0x47/0x60
+ vfs_statx+0x9b/0x130
+ __do_sys_newlstat+0x50/0xb0
+ __x64_sys_newlstat+0x37/0x50
+ do_syscall_64+0x85/0x260
+ entry_SYSCALL_64_after_hwframe+0x44/0xa9
+
+ read to 0xffff8fee4c40700c of 4 bytes by task 166 on cpu 6:
+ generic_permission+0x5b/0x2a0
+ kernfs_iop_permission+0x66/0x90
+ inode_permission+0x190/0x200
+ link_path_walk.part.0+0x503/0x8e0
+ path_lookupat.isra.0+0x69/0x4d0
+ filename_lookup+0x136/0x280
+ user_path_at_empty+0x47/0x60
+ do_faccessat+0x11a/0x390
+ __x64_sys_access+0x3c/0x50
+ do_syscall_64+0x85/0x260
+ entry_SYSCALL_64_after_hwframe+0x44/0xa9
+
+ Reported by Kernel Concurrency Sanitizer on:
+ CPU: 6 PID: 166 Comm: systemd-journal Not tainted 5.3.0-rc7+ #1
+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
+ ==================================================================
+
+The header of the report provides a short summary of the functions involved in
+the race. It is followed by the access types and stack traces of the 2 threads
+involved in the data race.
+
+The other less common type of data race report looks like this::
+
+ ==================================================================
+ BUG: KCSAN: data-race in e1000_clean_rx_irq+0x551/0xb10
+
+ race at unknown origin, with read to 0xffff933db8a2ae6c of 1 bytes by interrupt on cpu 0:
+ e1000_clean_rx_irq+0x551/0xb10
+ e1000_clean+0x533/0xda0
+ net_rx_action+0x329/0x900
+ __do_softirq+0xdb/0x2db
+ irq_exit+0x9b/0xa0
+ do_IRQ+0x9c/0xf0
+ ret_from_intr+0x0/0x18
+ default_idle+0x3f/0x220
+ arch_cpu_idle+0x21/0x30
+ do_idle+0x1df/0x230
+ cpu_startup_entry+0x14/0x20
+ rest_init+0xc5/0xcb
+ arch_call_rest_init+0x13/0x2b
+ start_kernel+0x6db/0x700
+
+ Reported by Kernel Concurrency Sanitizer on:
+ CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc7+ #2
+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
+ ==================================================================
+
+This report is generated where it was not possible to determine the other
+racing thread, but a race was inferred due to the data value of the watched
+memory location having changed. These can occur either due to missing
+instrumentation or e.g. DMA accesses. These reports will only be generated if
+``CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=y`` (selected by default).
+
+Selective analysis
+~~~~~~~~~~~~~~~~~~
+
+It may be desirable to disable data race detection for specific accesses,
+functions, compilation units, or entire subsystems. For static blacklisting,
+the below options are available:
+
+* KCSAN understands the ``data_race(expr)`` annotation, which tells KCSAN that
+ any data races due to accesses in ``expr`` should be ignored and resulting
+ behaviour when encountering a data race is deemed safe.
+
+* Disabling data race detection for entire functions can be accomplished by
+ using the function attribute ``__no_kcsan``::
+
+ __no_kcsan
+ void foo(void) {
+ ...
+
+ To dynamically limit for which functions to generate reports, see the
+ `DebugFS interface`_ blacklist/whitelist feature.
+
+* To disable data race detection for a particular compilation unit, add to the
+ ``Makefile``::
+
+ KCSAN_SANITIZE_file.o := n
+
+* To disable data race detection for all compilation units listed in a
+ ``Makefile``, add to the respective ``Makefile``::
+
+ KCSAN_SANITIZE := n
+
+Furthermore, it is possible to tell KCSAN to show or hide entire classes of
+data races, depending on preferences. These can be changed via the following
+Kconfig options:
+
+* ``CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY``: If enabled and a conflicting write
+ is observed via a watchpoint, but the data value of the memory location was
+ observed to remain unchanged, do not report the data race.
+
+* ``CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC``: Assume that plain aligned writes
+ up to word size are atomic by default. Assumes that such writes are not
+ subject to unsafe compiler optimizations resulting in data races. The option
+ causes KCSAN to not report data races due to conflicts where the only plain
+ accesses are aligned writes up to word size.
+
+DebugFS interface
+~~~~~~~~~~~~~~~~~
+
+The file ``/sys/kernel/debug/kcsan`` provides the following interface:
+
+* Reading ``/sys/kernel/debug/kcsan`` returns various runtime statistics.
+
+* Writing ``on`` or ``off`` to ``/sys/kernel/debug/kcsan`` allows turning KCSAN
+ on or off, respectively.
+
+* Writing ``!some_func_name`` to ``/sys/kernel/debug/kcsan`` adds
+ ``some_func_name`` to the report filter list, which (by default) blacklists
+ reporting data races where either one of the top stackframes are a function
+ in the list.
+
+* Writing either ``blacklist`` or ``whitelist`` to ``/sys/kernel/debug/kcsan``
+ changes the report filtering behaviour. For example, the blacklist feature
+ can be used to silence frequently occurring data races; the whitelist feature
+ can help with reproduction and testing of fixes.
+
+Tuning performance
+~~~~~~~~~~~~~~~~~~
+
+Core parameters that affect KCSAN's overall performance and bug detection
+ability are exposed as kernel command-line arguments whose defaults can also be
+changed via the corresponding Kconfig options.
+
+* ``kcsan.skip_watch`` (``CONFIG_KCSAN_SKIP_WATCH``): Number of per-CPU memory
+ operations to skip, before another watchpoint is set up. Setting up
+ watchpoints more frequently will result in the likelihood of races to be
+ observed to increase. This parameter has the most significant impact on
+ overall system performance and race detection ability.
+
+* ``kcsan.udelay_task`` (``CONFIG_KCSAN_UDELAY_TASK``): For tasks, the
+ microsecond delay to stall execution after a watchpoint has been set up.
+ Larger values result in the window in which we may observe a race to
+ increase.
+
+* ``kcsan.udelay_interrupt`` (``CONFIG_KCSAN_UDELAY_INTERRUPT``): For
+ interrupts, the microsecond delay to stall execution after a watchpoint has
+ been set up. Interrupts have tighter latency requirements, and their delay
+ should generally be smaller than the one chosen for tasks.
+
+They may be tweaked at runtime via ``/sys/module/kcsan/parameters/``.
+
+Data Races
+----------
+
+In an execution, two memory accesses form a *data race* if they *conflict*,
+they happen concurrently in different threads, and at least one of them is a
+*plain access*; they *conflict* if both access the same memory location, and at
+least one is a write. For a more thorough discussion and definition, see `"Plain
+Accesses and Data Races" in the LKMM`_.
+
+.. _"Plain Accesses and Data Races" in the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n1922
+
+Relationship with the Linux-Kernel Memory Consistency Model (LKMM)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The LKMM defines the propagation and ordering rules of various memory
+operations, which gives developers the ability to reason about concurrent code.
+Ultimately this allows to determine the possible executions of concurrent code,
+and if that code is free from data races.
+
+KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,
+``atomic_*``, etc.), but is oblivious of any ordering guarantees and simply
+assumes that memory barriers are placed correctly. In other words, KCSAN
+assumes that as long as a plain access is not observed to race with another
+conflicting access, memory operations are correctly ordered.
+
+This means that KCSAN will not report *potential* data races due to missing
+memory ordering. Developers should therefore carefully consider the required
+memory ordering requirements that remain unchecked. If, however, missing
+memory ordering (that is observable with a particular compiler and
+architecture) leads to an observable data race (e.g. entering a critical
+section erroneously), KCSAN would report the resulting data race.
+
+Race Detection Beyond Data Races
+--------------------------------
+
+For code with complex concurrency design, race-condition bugs may not always
+manifest as data races. Race conditions occur if concurrently executing
+operations result in unexpected system behaviour. On the other hand, data races
+are defined at the C-language level. The following macros can be used to check
+properties of concurrent code where bugs would not manifest as data races.
+
+.. kernel-doc:: include/linux/kcsan-checks.h
+ :functions: ASSERT_EXCLUSIVE_WRITER ASSERT_EXCLUSIVE_WRITER_SCOPED
+ ASSERT_EXCLUSIVE_ACCESS ASSERT_EXCLUSIVE_ACCESS_SCOPED
+ ASSERT_EXCLUSIVE_BITS
+
+Implementation Details
+----------------------
+
+KCSAN relies on observing that two accesses happen concurrently. Crucially, we
+want to (a) increase the chances of observing races (especially for races that
+manifest rarely), and (b) be able to actually observe them. We can accomplish
+(a) by injecting various delays, and (b) by using address watchpoints (or
+breakpoints).
+
+If we deliberately stall a memory access, while we have a watchpoint for its
+address set up, and then observe the watchpoint to fire, two accesses to the
+same address just raced. Using hardware watchpoints, this is the approach taken
+in `DataCollider
+<http://usenix.org/legacy/events/osdi10/tech/full_papers/Erickson.pdf>`_.
+Unlike DataCollider, KCSAN does not use hardware watchpoints, but instead
+relies on compiler instrumentation and "soft watchpoints".
+
+In KCSAN, watchpoints are implemented using an efficient encoding that stores
+access type, size, and address in a long; the benefits of using "soft
+watchpoints" are portability and greater flexibility. KCSAN then relies on the
+compiler instrumenting plain accesses. For each instrumented plain access:
+
+1. Check if a matching watchpoint exists; if yes, and at least one access is a
+ write, then we encountered a racing access.
+
+2. Periodically, if no matching watchpoint exists, set up a watchpoint and
+ stall for a small randomized delay.
+
+3. Also check the data value before the delay, and re-check the data value
+ after delay; if the values mismatch, we infer a race of unknown origin.
+
+To detect data races between plain and marked accesses, KCSAN also annotates
+marked accesses, but only to check if a watchpoint exists; i.e. KCSAN never
+sets up a watchpoint on marked accesses. By never setting up watchpoints for
+marked operations, if all accesses to a variable that is accessed concurrently
+are properly marked, KCSAN will never trigger a watchpoint and therefore never
+report the accesses.
+
+Key Properties
+~~~~~~~~~~~~~~
+
+1. **Memory Overhead:** The overall memory overhead is only a few MiB
+ depending on configuration. The current implementation uses a small array of
+ longs to encode watchpoint information, which is negligible.
+
+2. **Performance Overhead:** KCSAN's runtime aims to be minimal, using an
+ efficient watchpoint encoding that does not require acquiring any shared
+ locks in the fast-path. For kernel boot on a system with 8 CPUs:
+
+ - 5.0x slow-down with the default KCSAN config;
+ - 2.8x slow-down from runtime fast-path overhead only (set very large
+ ``KCSAN_SKIP_WATCH`` and unset ``KCSAN_SKIP_WATCH_RANDOMIZE``).
+
+3. **Annotation Overheads:** Minimal annotations are required outside the KCSAN
+ runtime. As a result, maintenance overheads are minimal as the kernel
+ evolves.
+
+4. **Detects Racy Writes from Devices:** Due to checking data values upon
+ setting up watchpoints, racy writes from devices can also be detected.
+
+5. **Memory Ordering:** KCSAN is *not* explicitly aware of the LKMM's ordering
+ rules; this may result in missed data races (false negatives).
+
+6. **Analysis Accuracy:** For observed executions, due to using a sampling
+ strategy, the analysis is *unsound* (false negatives possible), but aims to
+ be complete (no false positives).
+
+Alternatives Considered
+-----------------------
+
+An alternative data race detection approach for the kernel can be found in the
+`Kernel Thread Sanitizer (KTSAN) <https://github.com/google/ktsan/wiki>`_.
+KTSAN is a happens-before data race detector, which explicitly establishes the
+happens-before order between memory operations, which can then be used to
+determine data races as defined in `Data Races`_.
+
+To build a correct happens-before relation, KTSAN must be aware of all ordering
+rules of the LKMM and synchronization primitives. Unfortunately, any omission
+leads to large numbers of false positives, which is especially detrimental in
+the context of the kernel which includes numerous custom synchronization
+mechanisms. To track the happens-before relation, KTSAN's implementation
+requires metadata for each memory location (shadow memory), which for each page
+corresponds to 4 pages of shadow memory, and can translate into overhead of
+tens of GiB on a large system.
diff --git a/Documentation/dev-tools/kgdb.rst b/Documentation/dev-tools/kgdb.rst
new file mode 100644
index 000000000..77b688e6a
--- /dev/null
+++ b/Documentation/dev-tools/kgdb.rst
@@ -0,0 +1,940 @@
+=================================================
+Using kgdb, kdb and the kernel debugger internals
+=================================================
+
+:Author: Jason Wessel
+
+Introduction
+============
+
+The kernel has two different debugger front ends (kdb and kgdb) which
+interface to the debug core. It is possible to use either of the
+debugger front ends and dynamically transition between them if you
+configure the kernel properly at compile and runtime.
+
+Kdb is simplistic shell-style interface which you can use on a system
+console with a keyboard or serial console. You can use it to inspect
+memory, registers, process lists, dmesg, and even set breakpoints to
+stop in a certain location. Kdb is not a source level debugger, although
+you can set breakpoints and execute some basic kernel run control. Kdb
+is mainly aimed at doing some analysis to aid in development or
+diagnosing kernel problems. You can access some symbols by name in
+kernel built-ins or in kernel modules if the code was built with
+``CONFIG_KALLSYMS``.
+
+Kgdb is intended to be used as a source level debugger for the Linux
+kernel. It is used along with gdb to debug a Linux kernel. The
+expectation is that gdb can be used to "break in" to the kernel to
+inspect memory, variables and look through call stack information
+similar to the way an application developer would use gdb to debug an
+application. It is possible to place breakpoints in kernel code and
+perform some limited execution stepping.
+
+Two machines are required for using kgdb. One of these machines is a
+development machine and the other is the target machine. The kernel to
+be debugged runs on the target machine. The development machine runs an
+instance of gdb against the vmlinux file which contains the symbols (not
+a boot image such as bzImage, zImage, uImage...). In gdb the developer
+specifies the connection parameters and connects to kgdb. The type of
+connection a developer makes with gdb depends on the availability of
+kgdb I/O modules compiled as built-ins or loadable kernel modules in the
+test machine's kernel.
+
+Compiling a kernel
+==================
+
+- In order to enable compilation of kdb, you must first enable kgdb.
+
+- The kgdb test compile options are described in the kgdb test suite
+ chapter.
+
+Kernel config options for kgdb
+------------------------------
+
+To enable ``CONFIG_KGDB`` you should look under
+:menuselection:`Kernel hacking --> Kernel debugging` and select
+:menuselection:`KGDB: kernel debugger`.
+
+While it is not a hard requirement that you have symbols in your vmlinux
+file, gdb tends not to be very useful without the symbolic data, so you
+will want to turn on ``CONFIG_DEBUG_INFO`` which is called
+:menuselection:`Compile the kernel with debug info` in the config menu.
+
+It is advised, but not required, that you turn on the
+``CONFIG_FRAME_POINTER`` kernel option which is called :menuselection:`Compile
+the kernel with frame pointers` in the config menu. This option inserts code
+to into the compiled executable which saves the frame information in
+registers or on the stack at different points which allows a debugger
+such as gdb to more accurately construct stack back traces while
+debugging the kernel.
+
+If the architecture that you are using supports the kernel option
+``CONFIG_STRICT_KERNEL_RWX``, you should consider turning it off. This
+option will prevent the use of software breakpoints because it marks
+certain regions of the kernel's memory space as read-only. If kgdb
+supports it for the architecture you are using, you can use hardware
+breakpoints if you desire to run with the ``CONFIG_STRICT_KERNEL_RWX``
+option turned on, else you need to turn off this option.
+
+Next you should choose one of more I/O drivers to interconnect debugging
+host and debugged target. Early boot debugging requires a KGDB I/O
+driver that supports early debugging and the driver must be built into
+the kernel directly. Kgdb I/O driver configuration takes place via
+kernel or module parameters which you can learn more about in the in the
+section that describes the parameter kgdboc.
+
+Here is an example set of ``.config`` symbols to enable or disable for kgdb::
+
+ # CONFIG_STRICT_KERNEL_RWX is not set
+ CONFIG_FRAME_POINTER=y
+ CONFIG_KGDB=y
+ CONFIG_KGDB_SERIAL_CONSOLE=y
+
+Kernel config options for kdb
+-----------------------------
+
+Kdb is quite a bit more complex than the simple gdbstub sitting on top
+of the kernel's debug core. Kdb must implement a shell, and also adds
+some helper functions in other parts of the kernel, responsible for
+printing out interesting data such as what you would see if you ran
+``lsmod``, or ``ps``. In order to build kdb into the kernel you follow the
+same steps as you would for kgdb.
+
+The main config option for kdb is ``CONFIG_KGDB_KDB`` which is called
+:menuselection:`KGDB_KDB: include kdb frontend for kgdb` in the config menu.
+In theory you would have already also selected an I/O driver such as the
+``CONFIG_KGDB_SERIAL_CONSOLE`` interface if you plan on using kdb on a
+serial port, when you were configuring kgdb.
+
+If you want to use a PS/2-style keyboard with kdb, you would select
+``CONFIG_KDB_KEYBOARD`` which is called :menuselection:`KGDB_KDB: keyboard as
+input device` in the config menu. The ``CONFIG_KDB_KEYBOARD`` option is not
+used for anything in the gdb interface to kgdb. The ``CONFIG_KDB_KEYBOARD``
+option only works with kdb.
+
+Here is an example set of ``.config`` symbols to enable/disable kdb::
+
+ # CONFIG_STRICT_KERNEL_RWX is not set
+ CONFIG_FRAME_POINTER=y
+ CONFIG_KGDB=y
+ CONFIG_KGDB_SERIAL_CONSOLE=y
+ CONFIG_KGDB_KDB=y
+ CONFIG_KDB_KEYBOARD=y
+
+Kernel Debugger Boot Arguments
+==============================
+
+This section describes the various runtime kernel parameters that affect
+the configuration of the kernel debugger. The following chapter covers
+using kdb and kgdb as well as providing some examples of the
+configuration parameters.
+
+Kernel parameter: kgdboc
+------------------------
+
+The kgdboc driver was originally an abbreviation meant to stand for
+"kgdb over console". Today it is the primary mechanism to configure how
+to communicate from gdb to kgdb as well as the devices you want to use
+to interact with the kdb shell.
+
+For kgdb/gdb, kgdboc is designed to work with a single serial port. It
+is intended to cover the circumstance where you want to use a serial
+console as your primary console as well as using it to perform kernel
+debugging. It is also possible to use kgdb on a serial port which is not
+designated as a system console. Kgdboc may be configured as a kernel
+built-in or a kernel loadable module. You can only make use of
+``kgdbwait`` and early debugging if you build kgdboc into the kernel as
+a built-in.
+
+Optionally you can elect to activate kms (Kernel Mode Setting)
+integration. When you use kms with kgdboc and you have a video driver
+that has atomic mode setting hooks, it is possible to enter the debugger
+on the graphics console. When the kernel execution is resumed, the
+previous graphics mode will be restored. This integration can serve as a
+useful tool to aid in diagnosing crashes or doing analysis of memory
+with kdb while allowing the full graphics console applications to run.
+
+kgdboc arguments
+~~~~~~~~~~~~~~~~
+
+Usage::
+
+ kgdboc=[kms][[,]kbd][[,]serial_device][,baud]
+
+The order listed above must be observed if you use any of the optional
+configurations together.
+
+Abbreviations:
+
+- kms = Kernel Mode Setting
+
+- kbd = Keyboard
+
+You can configure kgdboc to use the keyboard, and/or a serial device
+depending on if you are using kdb and/or kgdb, in one of the following
+scenarios. The order listed above must be observed if you use any of the
+optional configurations together. Using kms + only gdb is generally not
+a useful combination.
+
+Using loadable module or built-in
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+1. As a kernel built-in:
+
+ Use the kernel boot argument::
+
+ kgdboc=<tty-device>,[baud]
+
+2. As a kernel loadable module:
+
+ Use the command::
+
+ modprobe kgdboc kgdboc=<tty-device>,[baud]
+
+ Here are two examples of how you might format the kgdboc string. The
+ first is for an x86 target using the first serial port. The second
+ example is for the ARM Versatile AB using the second serial port.
+
+ 1. ``kgdboc=ttyS0,115200``
+
+ 2. ``kgdboc=ttyAMA1,115200``
+
+Configure kgdboc at runtime with sysfs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+At run time you can enable or disable kgdboc by echoing a parameters
+into the sysfs. Here are two examples:
+
+1. Enable kgdboc on ttyS0::
+
+ echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
+
+2. Disable kgdboc::
+
+ echo "" > /sys/module/kgdboc/parameters/kgdboc
+
+.. note::
+
+ You do not need to specify the baud if you are configuring the
+ console on tty which is already configured or open.
+
+More examples
+^^^^^^^^^^^^^
+
+You can configure kgdboc to use the keyboard, and/or a serial device
+depending on if you are using kdb and/or kgdb, in one of the following
+scenarios.
+
+1. kdb and kgdb over only a serial port::
+
+ kgdboc=<serial_device>[,baud]
+
+ Example::
+
+ kgdboc=ttyS0,115200
+
+2. kdb and kgdb with keyboard and a serial port::
+
+ kgdboc=kbd,<serial_device>[,baud]
+
+ Example::
+
+ kgdboc=kbd,ttyS0,115200
+
+3. kdb with a keyboard::
+
+ kgdboc=kbd
+
+4. kdb with kernel mode setting::
+
+ kgdboc=kms,kbd
+
+5. kdb with kernel mode setting and kgdb over a serial port::
+
+ kgdboc=kms,kbd,ttyS0,115200
+
+.. note::
+
+ Kgdboc does not support interrupting the target via the gdb remote
+ protocol. You must manually send a :kbd:`SysRq-G` unless you have a proxy
+ that splits console output to a terminal program. A console proxy has a
+ separate TCP port for the debugger and a separate TCP port for the
+ "human" console. The proxy can take care of sending the :kbd:`SysRq-G`
+ for you.
+
+When using kgdboc with no debugger proxy, you can end up connecting the
+debugger at one of two entry points. If an exception occurs after you
+have loaded kgdboc, a message should print on the console stating it is
+waiting for the debugger. In this case you disconnect your terminal
+program and then connect the debugger in its place. If you want to
+interrupt the target system and forcibly enter a debug session you have
+to issue a :kbd:`Sysrq` sequence and then type the letter :kbd:`g`. Then you
+disconnect the terminal session and connect gdb. Your options if you
+don't like this are to hack gdb to send the :kbd:`SysRq-G` for you as well as
+on the initial connect, or to use a debugger proxy that allows an
+unmodified gdb to do the debugging.
+
+Kernel parameter: ``kgdboc_earlycon``
+-------------------------------------
+
+If you specify the kernel parameter ``kgdboc_earlycon`` and your serial
+driver registers a boot console that supports polling (doesn't need
+interrupts and implements a nonblocking read() function) kgdb will attempt
+to work using the boot console until it can transition to the regular
+tty driver specified by the ``kgdboc`` parameter.
+
+Normally there is only one boot console (especially that implements the
+read() function) so just adding ``kgdboc_earlycon`` on its own is
+sufficient to make this work. If you have more than one boot console you
+can add the boot console's name to differentiate. Note that names that
+are registered through the boot console layer and the tty layer are not
+the same for the same port.
+
+For instance, on one board to be explicit you might do::
+
+ kgdboc_earlycon=qcom_geni kgdboc=ttyMSM0
+
+If the only boot console on the device was "qcom_geni", you could simplify::
+
+ kgdboc_earlycon kgdboc=ttyMSM0
+
+Kernel parameter: ``kgdbwait``
+------------------------------
+
+The Kernel command line option ``kgdbwait`` makes kgdb wait for a
+debugger connection during booting of a kernel. You can only use this
+option if you compiled a kgdb I/O driver into the kernel and you
+specified the I/O driver configuration as a kernel command line option.
+The kgdbwait parameter should always follow the configuration parameter
+for the kgdb I/O driver in the kernel command line else the I/O driver
+will not be configured prior to asking the kernel to use it to wait.
+
+The kernel will stop and wait as early as the I/O driver and
+architecture allows when you use this option. If you build the kgdb I/O
+driver as a loadable kernel module kgdbwait will not do anything.
+
+Kernel parameter: ``kgdbcon``
+-----------------------------
+
+The ``kgdbcon`` feature allows you to see printk() messages inside gdb
+while gdb is connected to the kernel. Kdb does not make use of the kgdbcon
+feature.
+
+Kgdb supports using the gdb serial protocol to send console messages to
+the debugger when the debugger is connected and running. There are two
+ways to activate this feature.
+
+1. Activate with the kernel command line option::
+
+ kgdbcon
+
+2. Use sysfs before configuring an I/O driver::
+
+ echo 1 > /sys/module/kgdb/parameters/kgdb_use_con
+
+.. note::
+
+ If you do this after you configure the kgdb I/O driver, the
+ setting will not take effect until the next point the I/O is
+ reconfigured.
+
+.. important::
+
+ You cannot use kgdboc + kgdbcon on a tty that is an
+ active system console. An example of incorrect usage is::
+
+ console=ttyS0,115200 kgdboc=ttyS0 kgdbcon
+
+It is possible to use this option with kgdboc on a tty that is not a
+system console.
+
+Run time parameter: ``kgdbreboot``
+----------------------------------
+
+The kgdbreboot feature allows you to change how the debugger deals with
+the reboot notification. You have 3 choices for the behavior. The
+default behavior is always set to 0.
+
+.. tabularcolumns:: |p{0.4cm}|p{11.5cm}|p{5.6cm}|
+
+.. flat-table::
+ :widths: 1 10 8
+
+ * - 1
+ - ``echo -1 > /sys/module/debug_core/parameters/kgdbreboot``
+ - Ignore the reboot notification entirely.
+
+ * - 2
+ - ``echo 0 > /sys/module/debug_core/parameters/kgdbreboot``
+ - Send the detach message to any attached debugger client.
+
+ * - 3
+ - ``echo 1 > /sys/module/debug_core/parameters/kgdbreboot``
+ - Enter the debugger on reboot notify.
+
+Kernel parameter: ``nokaslr``
+-----------------------------
+
+If the architecture that you are using enable KASLR by default,
+you should consider turning it off. KASLR randomizes the
+virtual address where the kernel image is mapped and confuse
+gdb which resolve kernel symbol address from symbol table
+of vmlinux.
+
+Using kdb
+=========
+
+Quick start for kdb on a serial port
+------------------------------------
+
+This is a quick example of how to use kdb.
+
+1. Configure kgdboc at boot using kernel parameters::
+
+ console=ttyS0,115200 kgdboc=ttyS0,115200 nokaslr
+
+ OR
+
+ Configure kgdboc after the kernel has booted; assuming you are using
+ a serial port console::
+
+ echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
+
+2. Enter the kernel debugger manually or by waiting for an oops or
+ fault. There are several ways you can enter the kernel debugger
+ manually; all involve using the :kbd:`SysRq-G`, which means you must have
+ enabled ``CONFIG_MAGIC_SysRq=y`` in your kernel config.
+
+ - When logged in as root or with a super user session you can run::
+
+ echo g > /proc/sysrq-trigger
+
+ - Example using minicom 2.2
+
+ Press: :kbd:`CTRL-A` :kbd:`f` :kbd:`g`
+
+ - When you have telneted to a terminal server that supports sending
+ a remote break
+
+ Press: :kbd:`CTRL-]`
+
+ Type in: ``send break``
+
+ Press: :kbd:`Enter` :kbd:`g`
+
+3. From the kdb prompt you can run the ``help`` command to see a complete
+ list of the commands that are available.
+
+ Some useful commands in kdb include:
+
+ =========== =================================================================
+ ``lsmod`` Shows where kernel modules are loaded
+ ``ps`` Displays only the active processes
+ ``ps A`` Shows all the processes
+ ``summary`` Shows kernel version info and memory usage
+ ``bt`` Get a backtrace of the current process using dump_stack()
+ ``dmesg`` View the kernel syslog buffer
+ ``go`` Continue the system
+ =========== =================================================================
+
+4. When you are done using kdb you need to consider rebooting the system
+ or using the ``go`` command to resuming normal kernel execution. If you
+ have paused the kernel for a lengthy period of time, applications
+ that rely on timely networking or anything to do with real wall clock
+ time could be adversely affected, so you should take this into
+ consideration when using the kernel debugger.
+
+Quick start for kdb using a keyboard connected console
+------------------------------------------------------
+
+This is a quick example of how to use kdb with a keyboard.
+
+1. Configure kgdboc at boot using kernel parameters::
+
+ kgdboc=kbd
+
+ OR
+
+ Configure kgdboc after the kernel has booted::
+
+ echo kbd > /sys/module/kgdboc/parameters/kgdboc
+
+2. Enter the kernel debugger manually or by waiting for an oops or
+ fault. There are several ways you can enter the kernel debugger
+ manually; all involve using the :kbd:`SysRq-G`, which means you must have
+ enabled ``CONFIG_MAGIC_SysRq=y`` in your kernel config.
+
+ - When logged in as root or with a super user session you can run::
+
+ echo g > /proc/sysrq-trigger
+
+ - Example using a laptop keyboard:
+
+ Press and hold down: :kbd:`Alt`
+
+ Press and hold down: :kbd:`Fn`
+
+ Press and release the key with the label: :kbd:`SysRq`
+
+ Release: :kbd:`Fn`
+
+ Press and release: :kbd:`g`
+
+ Release: :kbd:`Alt`
+
+ - Example using a PS/2 101-key keyboard
+
+ Press and hold down: :kbd:`Alt`
+
+ Press and release the key with the label: :kbd:`SysRq`
+
+ Press and release: :kbd:`g`
+
+ Release: :kbd:`Alt`
+
+3. Now type in a kdb command such as ``help``, ``dmesg``, ``bt`` or ``go`` to
+ continue kernel execution.
+
+Using kgdb / gdb
+================
+
+In order to use kgdb you must activate it by passing configuration
+information to one of the kgdb I/O drivers. If you do not pass any
+configuration information kgdb will not do anything at all. Kgdb will
+only actively hook up to the kernel trap hooks if a kgdb I/O driver is
+loaded and configured. If you unconfigure a kgdb I/O driver, kgdb will
+unregister all the kernel hook points.
+
+All kgdb I/O drivers can be reconfigured at run time, if
+``CONFIG_SYSFS`` and ``CONFIG_MODULES`` are enabled, by echo'ing a new
+config string to ``/sys/module/<driver>/parameter/<option>``. The driver
+can be unconfigured by passing an empty string. You cannot change the
+configuration while the debugger is attached. Make sure to detach the
+debugger with the ``detach`` command prior to trying to unconfigure a
+kgdb I/O driver.
+
+Connecting with gdb to a serial port
+------------------------------------
+
+1. Configure kgdboc
+
+ Configure kgdboc at boot using kernel parameters::
+
+ kgdboc=ttyS0,115200
+
+ OR
+
+ Configure kgdboc after the kernel has booted::
+
+ echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc
+
+2. Stop kernel execution (break into the debugger)
+
+ In order to connect to gdb via kgdboc, the kernel must first be
+ stopped. There are several ways to stop the kernel which include
+ using kgdbwait as a boot argument, via a :kbd:`SysRq-G`, or running the
+ kernel until it takes an exception where it waits for the debugger to
+ attach.
+
+ - When logged in as root or with a super user session you can run::
+
+ echo g > /proc/sysrq-trigger
+
+ - Example using minicom 2.2
+
+ Press: :kbd:`CTRL-A` :kbd:`f` :kbd:`g`
+
+ - When you have telneted to a terminal server that supports sending
+ a remote break
+
+ Press: :kbd:`CTRL-]`
+
+ Type in: ``send break``
+
+ Press: :kbd:`Enter` :kbd:`g`
+
+3. Connect from gdb
+
+ Example (using a directly connected port)::
+
+ % gdb ./vmlinux
+ (gdb) set remotebaud 115200
+ (gdb) target remote /dev/ttyS0
+
+
+ Example (kgdb to a terminal server on TCP port 2012)::
+
+ % gdb ./vmlinux
+ (gdb) target remote 192.168.2.2:2012
+
+
+ Once connected, you can debug a kernel the way you would debug an
+ application program.
+
+ If you are having problems connecting or something is going seriously
+ wrong while debugging, it will most often be the case that you want
+ to enable gdb to be verbose about its target communications. You do
+ this prior to issuing the ``target remote`` command by typing in::
+
+ set debug remote 1
+
+Remember if you continue in gdb, and need to "break in" again, you need
+to issue an other :kbd:`SysRq-G`. It is easy to create a simple entry point by
+putting a breakpoint at ``sys_sync`` and then you can run ``sync`` from a
+shell or script to break into the debugger.
+
+kgdb and kdb interoperability
+=============================
+
+It is possible to transition between kdb and kgdb dynamically. The debug
+core will remember which you used the last time and automatically start
+in the same mode.
+
+Switching between kdb and kgdb
+------------------------------
+
+Switching from kgdb to kdb
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There are two ways to switch from kgdb to kdb: you can use gdb to issue
+a maintenance packet, or you can blindly type the command ``$3#33``.
+Whenever the kernel debugger stops in kgdb mode it will print the
+message ``KGDB or $3#33 for KDB``. It is important to note that you have
+to type the sequence correctly in one pass. You cannot type a backspace
+or delete because kgdb will interpret that as part of the debug stream.
+
+1. Change from kgdb to kdb by blindly typing::
+
+ $3#33
+
+2. Change from kgdb to kdb with gdb::
+
+ maintenance packet 3
+
+ .. note::
+
+ Now you must kill gdb. Typically you press :kbd:`CTRL-Z` and issue
+ the command::
+
+ kill -9 %
+
+Change from kdb to kgdb
+~~~~~~~~~~~~~~~~~~~~~~~
+
+There are two ways you can change from kdb to kgdb. You can manually
+enter kgdb mode by issuing the kgdb command from the kdb shell prompt,
+or you can connect gdb while the kdb shell prompt is active. The kdb
+shell looks for the typical first commands that gdb would issue with the
+gdb remote protocol and if it sees one of those commands it
+automatically changes into kgdb mode.
+
+1. From kdb issue the command::
+
+ kgdb
+
+ Now disconnect your terminal program and connect gdb in its place
+
+2. At the kdb prompt, disconnect the terminal program and connect gdb in
+ its place.
+
+Running kdb commands from gdb
+-----------------------------
+
+It is possible to run a limited set of kdb commands from gdb, using the
+gdb monitor command. You don't want to execute any of the run control or
+breakpoint operations, because it can disrupt the state of the kernel
+debugger. You should be using gdb for breakpoints and run control
+operations if you have gdb connected. The more useful commands to run
+are things like lsmod, dmesg, ps or possibly some of the memory
+information commands. To see all the kdb commands you can run
+``monitor help``.
+
+Example::
+
+ (gdb) monitor ps
+ 1 idle process (state I) and
+ 27 sleeping system daemon (state M) processes suppressed,
+ use 'ps A' to see all.
+ Task Addr Pid Parent [*] cpu State Thread Command
+
+ 0xc78291d0 1 0 0 0 S 0xc7829404 init
+ 0xc7954150 942 1 0 0 S 0xc7954384 dropbear
+ 0xc78789c0 944 1 0 0 S 0xc7878bf4 sh
+ (gdb)
+
+kgdb Test Suite
+===============
+
+When kgdb is enabled in the kernel config you can also elect to enable
+the config parameter ``KGDB_TESTS``. Turning this on will enable a special
+kgdb I/O module which is designed to test the kgdb internal functions.
+
+The kgdb tests are mainly intended for developers to test the kgdb
+internals as well as a tool for developing a new kgdb architecture
+specific implementation. These tests are not really for end users of the
+Linux kernel. The primary source of documentation would be to look in
+the ``drivers/misc/kgdbts.c`` file.
+
+The kgdb test suite can also be configured at compile time to run the
+core set of tests by setting the kernel config parameter
+``KGDB_TESTS_ON_BOOT``. This particular option is aimed at automated
+regression testing and does not require modifying the kernel boot config
+arguments. If this is turned on, the kgdb test suite can be disabled by
+specifying ``kgdbts=`` as a kernel boot argument.
+
+Kernel Debugger Internals
+=========================
+
+Architecture Specifics
+----------------------
+
+The kernel debugger is organized into a number of components:
+
+1. The debug core
+
+ The debug core is found in ``kernel/debugger/debug_core.c``. It
+ contains:
+
+ - A generic OS exception handler which includes sync'ing the
+ processors into a stopped state on an multi-CPU system.
+
+ - The API to talk to the kgdb I/O drivers
+
+ - The API to make calls to the arch-specific kgdb implementation
+
+ - The logic to perform safe memory reads and writes to memory while
+ using the debugger
+
+ - A full implementation for software breakpoints unless overridden
+ by the arch
+
+ - The API to invoke either the kdb or kgdb frontend to the debug
+ core.
+
+ - The structures and callback API for atomic kernel mode setting.
+
+ .. note:: kgdboc is where the kms callbacks are invoked.
+
+2. kgdb arch-specific implementation
+
+ This implementation is generally found in ``arch/*/kernel/kgdb.c``. As
+ an example, ``arch/x86/kernel/kgdb.c`` contains the specifics to
+ implement HW breakpoint as well as the initialization to dynamically
+ register and unregister for the trap handlers on this architecture.
+ The arch-specific portion implements:
+
+ - contains an arch-specific trap catcher which invokes
+ kgdb_handle_exception() to start kgdb about doing its work
+
+ - translation to and from gdb specific packet format to struct pt_regs
+
+ - Registration and unregistration of architecture specific trap
+ hooks
+
+ - Any special exception handling and cleanup
+
+ - NMI exception handling and cleanup
+
+ - (optional) HW breakpoints
+
+3. gdbstub frontend (aka kgdb)
+
+ The gdbstub is located in ``kernel/debug/gdbstub.c``. It contains:
+
+ - All the logic to implement the gdb serial protocol
+
+4. kdb frontend
+
+ The kdb debugger shell is broken down into a number of components.
+ The kdb core is located in kernel/debug/kdb. There are a number of
+ helper functions in some of the other kernel components to make it
+ possible for kdb to examine and report information about the kernel
+ without taking locks that could cause a kernel deadlock. The kdb core
+ contains implements the following functionality.
+
+ - A simple shell
+
+ - The kdb core command set
+
+ - A registration API to register additional kdb shell commands.
+
+ - A good example of a self-contained kdb module is the ``ftdump``
+ command for dumping the ftrace buffer. See:
+ ``kernel/trace/trace_kdb.c``
+
+ - For an example of how to dynamically register a new kdb command
+ you can build the kdb_hello.ko kernel module from
+ ``samples/kdb/kdb_hello.c``. To build this example you can set
+ ``CONFIG_SAMPLES=y`` and ``CONFIG_SAMPLE_KDB=m`` in your kernel
+ config. Later run ``modprobe kdb_hello`` and the next time you
+ enter the kdb shell, you can run the ``hello`` command.
+
+ - The implementation for kdb_printf() which emits messages directly
+ to I/O drivers, bypassing the kernel log.
+
+ - SW / HW breakpoint management for the kdb shell
+
+5. kgdb I/O driver
+
+ Each kgdb I/O driver has to provide an implementation for the
+ following:
+
+ - configuration via built-in or module
+
+ - dynamic configuration and kgdb hook registration calls
+
+ - read and write character interface
+
+ - A cleanup handler for unconfiguring from the kgdb core
+
+ - (optional) Early debug methodology
+
+ Any given kgdb I/O driver has to operate very closely with the
+ hardware and must do it in such a way that does not enable interrupts
+ or change other parts of the system context without completely
+ restoring them. The kgdb core will repeatedly "poll" a kgdb I/O
+ driver for characters when it needs input. The I/O driver is expected
+ to return immediately if there is no data available. Doing so allows
+ for the future possibility to touch watchdog hardware in such a way
+ as to have a target system not reset when these are enabled.
+
+If you are intent on adding kgdb architecture specific support for a new
+architecture, the architecture should define ``HAVE_ARCH_KGDB`` in the
+architecture specific Kconfig file. This will enable kgdb for the
+architecture, and at that point you must create an architecture specific
+kgdb implementation.
+
+There are a few flags which must be set on every architecture in their
+``asm/kgdb.h`` file. These are:
+
+- ``NUMREGBYTES``:
+ The size in bytes of all of the registers, so that we
+ can ensure they will all fit into a packet.
+
+- ``BUFMAX``:
+ The size in bytes of the buffer GDB will read into. This must
+ be larger than NUMREGBYTES.
+
+- ``CACHE_FLUSH_IS_SAFE``:
+ Set to 1 if it is always safe to call
+ flush_cache_range or flush_icache_range. On some architectures,
+ these functions may not be safe to call on SMP since we keep other
+ CPUs in a holding pattern.
+
+There are also the following functions for the common backend, found in
+``kernel/kgdb.c``, that must be supplied by the architecture-specific
+backend unless marked as (optional), in which case a default function
+maybe used if the architecture does not need to provide a specific
+implementation.
+
+.. kernel-doc:: include/linux/kgdb.h
+ :internal:
+
+kgdboc internals
+----------------
+
+kgdboc and uarts
+~~~~~~~~~~~~~~~~
+
+The kgdboc driver is actually a very thin driver that relies on the
+underlying low level to the hardware driver having "polling hooks" to
+which the tty driver is attached. In the initial implementation of
+kgdboc the serial_core was changed to expose a low level UART hook for
+doing polled mode reading and writing of a single character while in an
+atomic context. When kgdb makes an I/O request to the debugger, kgdboc
+invokes a callback in the serial core which in turn uses the callback in
+the UART driver.
+
+When using kgdboc with a UART, the UART driver must implement two
+callbacks in the struct uart_ops.
+Example from ``drivers/8250.c``::
+
+
+ #ifdef CONFIG_CONSOLE_POLL
+ .poll_get_char = serial8250_get_poll_char,
+ .poll_put_char = serial8250_put_poll_char,
+ #endif
+
+
+Any implementation specifics around creating a polling driver use the
+``#ifdef CONFIG_CONSOLE_POLL``, as shown above. Keep in mind that
+polling hooks have to be implemented in such a way that they can be
+called from an atomic context and have to restore the state of the UART
+chip on return such that the system can return to normal when the
+debugger detaches. You need to be very careful with any kind of lock you
+consider, because failing here is most likely going to mean pressing the
+reset button.
+
+kgdboc and keyboards
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The kgdboc driver contains logic to configure communications with an
+attached keyboard. The keyboard infrastructure is only compiled into the
+kernel when ``CONFIG_KDB_KEYBOARD=y`` is set in the kernel configuration.
+
+The core polled keyboard driver for PS/2 type keyboards is in
+``drivers/char/kdb_keyboard.c``. This driver is hooked into the debug core
+when kgdboc populates the callback in the array called
+:c:expr:`kdb_poll_funcs[]`. The kdb_get_kbd_char() is the top-level
+function which polls hardware for single character input.
+
+kgdboc and kms
+~~~~~~~~~~~~~~~~~~
+
+The kgdboc driver contains logic to request the graphics display to
+switch to a text context when you are using ``kgdboc=kms,kbd``, provided
+that you have a video driver which has a frame buffer console and atomic
+kernel mode setting support.
+
+Every time the kernel debugger is entered it calls
+kgdboc_pre_exp_handler() which in turn calls con_debug_enter()
+in the virtual console layer. On resuming kernel execution, the kernel
+debugger calls kgdboc_post_exp_handler() which in turn calls
+con_debug_leave().
+
+Any video driver that wants to be compatible with the kernel debugger
+and the atomic kms callbacks must implement the ``mode_set_base_atomic``,
+``fb_debug_enter`` and ``fb_debug_leave operations``. For the
+``fb_debug_enter`` and ``fb_debug_leave`` the option exists to use the
+generic drm fb helper functions or implement something custom for the
+hardware. The following example shows the initialization of the
+.mode_set_base_atomic operation in
+drivers/gpu/drm/i915/intel_display.c::
+
+
+ static const struct drm_crtc_helper_funcs intel_helper_funcs = {
+ [...]
+ .mode_set_base_atomic = intel_pipe_set_base_atomic,
+ [...]
+ };
+
+
+Here is an example of how the i915 driver initializes the
+fb_debug_enter and fb_debug_leave functions to use the generic drm
+helpers in ``drivers/gpu/drm/i915/intel_fb.c``::
+
+
+ static struct fb_ops intelfb_ops = {
+ [...]
+ .fb_debug_enter = drm_fb_helper_debug_enter,
+ .fb_debug_leave = drm_fb_helper_debug_leave,
+ [...]
+ };
+
+
+Credits
+=======
+
+The following people have contributed to this document:
+
+1. Amit Kale <amitkale@linsyssoft.com>
+
+2. Tom Rini <trini@kernel.crashing.org>
+
+In March 2008 this document was completely rewritten by:
+
+- Jason Wessel <jason.wessel@windriver.com>
+
+In Jan 2010 this document was updated to include kdb.
+
+- Jason Wessel <jason.wessel@windriver.com>
diff --git a/Documentation/dev-tools/kmemleak.rst b/Documentation/dev-tools/kmemleak.rst
new file mode 100644
index 000000000..1c935f41c
--- /dev/null
+++ b/Documentation/dev-tools/kmemleak.rst
@@ -0,0 +1,259 @@
+Kernel Memory Leak Detector
+===========================
+
+Kmemleak provides a way of detecting possible kernel memory leaks in a
+way similar to a `tracing garbage collector
+<https://en.wikipedia.org/wiki/Tracing_garbage_collection>`_,
+with the difference that the orphan objects are not freed but only
+reported via /sys/kernel/debug/kmemleak. A similar method is used by the
+Valgrind tool (``memcheck --leak-check``) to detect the memory leaks in
+user-space applications.
+
+Usage
+-----
+
+CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel
+thread scans the memory every 10 minutes (by default) and prints the
+number of new unreferenced objects found. If the ``debugfs`` isn't already
+mounted, mount with::
+
+ # mount -t debugfs nodev /sys/kernel/debug/
+
+To display the details of all the possible scanned memory leaks::
+
+ # cat /sys/kernel/debug/kmemleak
+
+To trigger an intermediate memory scan::
+
+ # echo scan > /sys/kernel/debug/kmemleak
+
+To clear the list of all current possible memory leaks::
+
+ # echo clear > /sys/kernel/debug/kmemleak
+
+New leaks will then come up upon reading ``/sys/kernel/debug/kmemleak``
+again.
+
+Note that the orphan objects are listed in the order they were allocated
+and one object at the beginning of the list may cause other subsequent
+objects to be reported as orphan.
+
+Memory scanning parameters can be modified at run-time by writing to the
+``/sys/kernel/debug/kmemleak`` file. The following parameters are supported:
+
+- off
+ disable kmemleak (irreversible)
+- stack=on
+ enable the task stacks scanning (default)
+- stack=off
+ disable the tasks stacks scanning
+- scan=on
+ start the automatic memory scanning thread (default)
+- scan=off
+ stop the automatic memory scanning thread
+- scan=<secs>
+ set the automatic memory scanning period in seconds
+ (default 600, 0 to stop the automatic scanning)
+- scan
+ trigger a memory scan
+- clear
+ clear list of current memory leak suspects, done by
+ marking all current reported unreferenced objects grey,
+ or free all kmemleak objects if kmemleak has been disabled.
+- dump=<addr>
+ dump information about the object found at <addr>
+
+Kmemleak can also be disabled at boot-time by passing ``kmemleak=off`` on
+the kernel command line.
+
+Memory may be allocated or freed before kmemleak is initialised and
+these actions are stored in an early log buffer. The size of this buffer
+is configured via the CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE option.
+
+If CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF are enabled, the kmemleak is
+disabled by default. Passing ``kmemleak=on`` on the kernel command
+line enables the function.
+
+If you are getting errors like "Error while writing to stdout" or "write_loop:
+Invalid argument", make sure kmemleak is properly enabled.
+
+Basic Algorithm
+---------------
+
+The memory allocations via :c:func:`kmalloc`, :c:func:`vmalloc`,
+:c:func:`kmem_cache_alloc` and
+friends are traced and the pointers, together with additional
+information like size and stack trace, are stored in a rbtree.
+The corresponding freeing function calls are tracked and the pointers
+removed from the kmemleak data structures.
+
+An allocated block of memory is considered orphan if no pointer to its
+start address or to any location inside the block can be found by
+scanning the memory (including saved registers). This means that there
+might be no way for the kernel to pass the address of the allocated
+block to a freeing function and therefore the block is considered a
+memory leak.
+
+The scanning algorithm steps:
+
+ 1. mark all objects as white (remaining white objects will later be
+ considered orphan)
+ 2. scan the memory starting with the data section and stacks, checking
+ the values against the addresses stored in the rbtree. If
+ a pointer to a white object is found, the object is added to the
+ gray list
+ 3. scan the gray objects for matching addresses (some white objects
+ can become gray and added at the end of the gray list) until the
+ gray set is finished
+ 4. the remaining white objects are considered orphan and reported via
+ /sys/kernel/debug/kmemleak
+
+Some allocated memory blocks have pointers stored in the kernel's
+internal data structures and they cannot be detected as orphans. To
+avoid this, kmemleak can also store the number of values pointing to an
+address inside the block address range that need to be found so that the
+block is not considered a leak. One example is __vmalloc().
+
+Testing specific sections with kmemleak
+---------------------------------------
+
+Upon initial bootup your /sys/kernel/debug/kmemleak output page may be
+quite extensive. This can also be the case if you have very buggy code
+when doing development. To work around these situations you can use the
+'clear' command to clear all reported unreferenced objects from the
+/sys/kernel/debug/kmemleak output. By issuing a 'scan' after a 'clear'
+you can find new unreferenced objects; this should help with testing
+specific sections of code.
+
+To test a critical section on demand with a clean kmemleak do::
+
+ # echo clear > /sys/kernel/debug/kmemleak
+ ... test your kernel or modules ...
+ # echo scan > /sys/kernel/debug/kmemleak
+
+Then as usual to get your report with::
+
+ # cat /sys/kernel/debug/kmemleak
+
+Freeing kmemleak internal objects
+---------------------------------
+
+To allow access to previously found memory leaks after kmemleak has been
+disabled by the user or due to an fatal error, internal kmemleak objects
+won't be freed when kmemleak is disabled, and those objects may occupy
+a large part of physical memory.
+
+In this situation, you may reclaim memory with::
+
+ # echo clear > /sys/kernel/debug/kmemleak
+
+Kmemleak API
+------------
+
+See the include/linux/kmemleak.h header for the functions prototype.
+
+- ``kmemleak_init`` - initialize kmemleak
+- ``kmemleak_alloc`` - notify of a memory block allocation
+- ``kmemleak_alloc_percpu`` - notify of a percpu memory block allocation
+- ``kmemleak_vmalloc`` - notify of a vmalloc() memory allocation
+- ``kmemleak_free`` - notify of a memory block freeing
+- ``kmemleak_free_part`` - notify of a partial memory block freeing
+- ``kmemleak_free_percpu`` - notify of a percpu memory block freeing
+- ``kmemleak_update_trace`` - update object allocation stack trace
+- ``kmemleak_not_leak`` - mark an object as not a leak
+- ``kmemleak_ignore`` - do not scan or report an object as leak
+- ``kmemleak_scan_area`` - add scan areas inside a memory block
+- ``kmemleak_no_scan`` - do not scan a memory block
+- ``kmemleak_erase`` - erase an old value in a pointer variable
+- ``kmemleak_alloc_recursive`` - as kmemleak_alloc but checks the recursiveness
+- ``kmemleak_free_recursive`` - as kmemleak_free but checks the recursiveness
+
+The following functions take a physical address as the object pointer
+and only perform the corresponding action if the address has a lowmem
+mapping:
+
+- ``kmemleak_alloc_phys``
+- ``kmemleak_free_part_phys``
+- ``kmemleak_not_leak_phys``
+- ``kmemleak_ignore_phys``
+
+Dealing with false positives/negatives
+--------------------------------------
+
+The false negatives are real memory leaks (orphan objects) but not
+reported by kmemleak because values found during the memory scanning
+point to such objects. To reduce the number of false negatives, kmemleak
+provides the kmemleak_ignore, kmemleak_scan_area, kmemleak_no_scan and
+kmemleak_erase functions (see above). The task stacks also increase the
+amount of false negatives and their scanning is not enabled by default.
+
+The false positives are objects wrongly reported as being memory leaks
+(orphan). For objects known not to be leaks, kmemleak provides the
+kmemleak_not_leak function. The kmemleak_ignore could also be used if
+the memory block is known not to contain other pointers and it will no
+longer be scanned.
+
+Some of the reported leaks are only transient, especially on SMP
+systems, because of pointers temporarily stored in CPU registers or
+stacks. Kmemleak defines MSECS_MIN_AGE (defaulting to 1000) representing
+the minimum age of an object to be reported as a memory leak.
+
+Limitations and Drawbacks
+-------------------------
+
+The main drawback is the reduced performance of memory allocation and
+freeing. To avoid other penalties, the memory scanning is only performed
+when the /sys/kernel/debug/kmemleak file is read. Anyway, this tool is
+intended for debugging purposes where the performance might not be the
+most important requirement.
+
+To keep the algorithm simple, kmemleak scans for values pointing to any
+address inside a block's address range. This may lead to an increased
+number of false negatives. However, it is likely that a real memory leak
+will eventually become visible.
+
+Another source of false negatives is the data stored in non-pointer
+values. In a future version, kmemleak could only scan the pointer
+members in the allocated structures. This feature would solve many of
+the false negative cases described above.
+
+The tool can report false positives. These are cases where an allocated
+block doesn't need to be freed (some cases in the init_call functions),
+the pointer is calculated by other methods than the usual container_of
+macro or the pointer is stored in a location not scanned by kmemleak.
+
+Page allocations and ioremap are not tracked.
+
+Testing with kmemleak-test
+--------------------------
+
+To check if you have all set up to use kmemleak, you can use the kmemleak-test
+module, a module that deliberately leaks memory. Set CONFIG_DEBUG_KMEMLEAK_TEST
+as module (it can't be used as built-in) and boot the kernel with kmemleak
+enabled. Load the module and perform a scan with::
+
+ # modprobe kmemleak-test
+ # echo scan > /sys/kernel/debug/kmemleak
+
+Note that the you may not get results instantly or on the first scanning. When
+kmemleak gets results, it'll log ``kmemleak: <count of leaks> new suspected
+memory leaks``. Then read the file to see then::
+
+ # cat /sys/kernel/debug/kmemleak
+ unreferenced object 0xffff89862ca702e8 (size 32):
+ comm "modprobe", pid 2088, jiffies 4294680594 (age 375.486s)
+ hex dump (first 32 bytes):
+ 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
+ 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
+ backtrace:
+ [<00000000e0a73ec7>] 0xffffffffc01d2036
+ [<000000000c5d2a46>] do_one_initcall+0x41/0x1df
+ [<0000000046db7e0a>] do_init_module+0x55/0x200
+ [<00000000542b9814>] load_module+0x203c/0x2480
+ [<00000000c2850256>] __do_sys_finit_module+0xba/0xe0
+ [<000000006564e7ef>] do_syscall_64+0x43/0x110
+ [<000000007c873fa6>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
+ ...
+
+Removing the module with ``rmmod kmemleak_test`` should also trigger some
+kmemleak results.
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
new file mode 100644
index 000000000..a901def73
--- /dev/null
+++ b/Documentation/dev-tools/kselftest.rst
@@ -0,0 +1,350 @@
+======================
+Linux Kernel Selftests
+======================
+
+The kernel contains a set of "self tests" under the tools/testing/selftests/
+directory. These are intended to be small tests to exercise individual code
+paths in the kernel. Tests are intended to be run after building, installing
+and booting a kernel.
+
+You can find additional information on Kselftest framework, how to
+write new tests using the framework on Kselftest wiki:
+
+https://kselftest.wiki.kernel.org/
+
+On some systems, hot-plug tests could hang forever waiting for cpu and
+memory to be ready to be offlined. A special hot-plug target is created
+to run the full range of hot-plug tests. In default mode, hot-plug tests run
+in safe mode with a limited scope. In limited mode, cpu-hotplug test is
+run on a single cpu as opposed to all hotplug capable cpus, and memory
+hotplug test is run on 2% of hotplug capable memory instead of 10%.
+
+kselftest runs as a userspace process. Tests that can be written/run in
+userspace may wish to use the `Test Harness`_. Tests that need to be
+run in kernel space may wish to use a `Test Module`_.
+
+Running the selftests (hotplug tests are run in limited mode)
+=============================================================
+
+To build the tests::
+
+ $ make -C tools/testing/selftests
+
+To run the tests::
+
+ $ make -C tools/testing/selftests run_tests
+
+To build and run the tests with a single command, use::
+
+ $ make kselftest
+
+Note that some tests will require root privileges.
+
+Kselftest supports saving output files in a separate directory and then
+running tests. To locate output files in a separate directory two syntaxes
+are supported. In both cases the working directory must be the root of the
+kernel src. This is applicable to "Running a subset of selftests" section
+below.
+
+To build, save output files in a separate directory with O= ::
+
+ $ make O=/tmp/kselftest kselftest
+
+To build, save output files in a separate directory with KBUILD_OUTPUT ::
+
+ $ export KBUILD_OUTPUT=/tmp/kselftest; make kselftest
+
+The O= assignment takes precedence over the KBUILD_OUTPUT environment
+variable.
+
+The above commands by default run the tests and print full pass/fail report.
+Kselftest supports "summary" option to make it easier to understand the test
+results. Please find the detailed individual test results for each test in
+/tmp/testname file(s) when summary option is specified. This is applicable
+to "Running a subset of selftests" section below.
+
+To run kselftest with summary option enabled ::
+
+ $ make summary=1 kselftest
+
+Running a subset of selftests
+=============================
+
+You can use the "TARGETS" variable on the make command line to specify
+single test to run, or a list of tests to run.
+
+To run only tests targeted for a single subsystem::
+
+ $ make -C tools/testing/selftests TARGETS=ptrace run_tests
+
+You can specify multiple tests to build and run::
+
+ $ make TARGETS="size timers" kselftest
+
+To build, save output files in a separate directory with O= ::
+
+ $ make O=/tmp/kselftest TARGETS="size timers" kselftest
+
+To build, save output files in a separate directory with KBUILD_OUTPUT ::
+
+ $ export KBUILD_OUTPUT=/tmp/kselftest; make TARGETS="size timers" kselftest
+
+Additionally you can use the "SKIP_TARGETS" variable on the make command
+line to specify one or more targets to exclude from the TARGETS list.
+
+To run all tests but a single subsystem::
+
+ $ make -C tools/testing/selftests SKIP_TARGETS=ptrace run_tests
+
+You can specify multiple tests to skip::
+
+ $ make SKIP_TARGETS="size timers" kselftest
+
+You can also specify a restricted list of tests to run together with a
+dedicated skiplist::
+
+ $ make TARGETS="bpf breakpoints size timers" SKIP_TARGETS=bpf kselftest
+
+See the top-level tools/testing/selftests/Makefile for the list of all
+possible targets.
+
+Running the full range hotplug selftests
+========================================
+
+To build the hotplug tests::
+
+ $ make -C tools/testing/selftests hotplug
+
+To run the hotplug tests::
+
+ $ make -C tools/testing/selftests run_hotplug
+
+Note that some tests will require root privileges.
+
+
+Install selftests
+=================
+
+You can use the "install" target of "make" (which calls the `kselftest_install.sh`
+tool) to install selftests in the default location (`tools/testing/selftests/kselftest_install`),
+or in a user specified location via the `INSTALL_PATH` "make" variable.
+
+To install selftests in default location::
+
+ $ make -C tools/testing/selftests install
+
+To install selftests in a user specified location::
+
+ $ make -C tools/testing/selftests install INSTALL_PATH=/some/other/path
+
+Running installed selftests
+===========================
+
+Found in the install directory, as well as in the Kselftest tarball,
+is a script named `run_kselftest.sh` to run the tests.
+
+You can simply do the following to run the installed Kselftests. Please
+note some tests will require root privileges::
+
+ $ cd kselftest_install
+ $ ./run_kselftest.sh
+
+To see the list of available tests, the `-l` option can be used::
+
+ $ ./run_kselftest.sh -l
+
+The `-c` option can be used to run all the tests from a test collection, or
+the `-t` option for specific single tests. Either can be used multiple times::
+
+ $ ./run_kselftest.sh -c bpf -c seccomp -t timers:posix_timers -t timer:nanosleep
+
+For other features see the script usage output, seen with the `-h` option.
+
+Packaging selftests
+===================
+
+In some cases packaging is desired, such as when tests need to run on a
+different system. To package selftests, run::
+
+ $ make -C tools/testing/selftests gen_tar
+
+This generates a tarball in the `INSTALL_PATH/kselftest-packages` directory. By
+default, `.gz` format is used. The tar compression format can be overridden by
+specifying a `FORMAT` make variable. Any value recognized by `tar's auto-compress`_
+option is supported, such as::
+
+ $ make -C tools/testing/selftests gen_tar FORMAT=.xz
+
+`make gen_tar` invokes `make install` so you can use it to package a subset of
+tests by using variables specified in `Running a subset of selftests`_
+section::
+
+ $ make -C tools/testing/selftests gen_tar TARGETS="bpf" FORMAT=.xz
+
+.. _tar's auto-compress: https://www.gnu.org/software/tar/manual/html_node/gzip.html#auto_002dcompress
+
+Contributing new tests
+======================
+
+In general, the rules for selftests are
+
+ * Do as much as you can if you're not root;
+
+ * Don't take too long;
+
+ * Don't break the build on any architecture, and
+
+ * Don't cause the top-level "make run_tests" to fail if your feature is
+ unconfigured.
+
+Contributing new tests (details)
+================================
+
+ * Use TEST_GEN_XXX if such binaries or files are generated during
+ compiling.
+
+ TEST_PROGS, TEST_GEN_PROGS mean it is the executable tested by
+ default.
+
+ TEST_CUSTOM_PROGS should be used by tests that require custom build
+ rules and prevent common build rule use.
+
+ TEST_PROGS are for test shell scripts. Please ensure shell script has
+ its exec bit set. Otherwise, lib.mk run_tests will generate a warning.
+
+ TEST_CUSTOM_PROGS and TEST_PROGS will be run by common run_tests.
+
+ TEST_PROGS_EXTENDED, TEST_GEN_PROGS_EXTENDED mean it is the
+ executable which is not tested by default.
+ TEST_FILES, TEST_GEN_FILES mean it is the file which is used by
+ test.
+
+ * First use the headers inside the kernel source and/or git repo, and then the
+ system headers. Headers for the kernel release as opposed to headers
+ installed by the distro on the system should be the primary focus to be able
+ to find regressions.
+
+ * If a test needs specific kernel config options enabled, add a config file in
+ the test directory to enable them.
+
+ e.g: tools/testing/selftests/android/config
+
+Test Module
+===========
+
+Kselftest tests the kernel from userspace. Sometimes things need
+testing from within the kernel, one method of doing this is to create a
+test module. We can tie the module into the kselftest framework by
+using a shell script test runner. ``kselftest/module.sh`` is designed
+to facilitate this process. There is also a header file provided to
+assist writing kernel modules that are for use with kselftest:
+
+- ``tools/testing/kselftest/kselftest_module.h``
+- ``tools/testing/kselftest/kselftest/module.sh``
+
+How to use
+----------
+
+Here we show the typical steps to create a test module and tie it into
+kselftest. We use kselftests for lib/ as an example.
+
+1. Create the test module
+
+2. Create the test script that will run (load/unload) the module
+ e.g. ``tools/testing/selftests/lib/printf.sh``
+
+3. Add line to config file e.g. ``tools/testing/selftests/lib/config``
+
+4. Add test script to makefile e.g. ``tools/testing/selftests/lib/Makefile``
+
+5. Verify it works:
+
+.. code-block:: sh
+
+ # Assumes you have booted a fresh build of this kernel tree
+ cd /path/to/linux/tree
+ make kselftest-merge
+ make modules
+ sudo make modules_install
+ make TARGETS=lib kselftest
+
+Example Module
+--------------
+
+A bare bones test module might look like this:
+
+.. code-block:: c
+
+ // SPDX-License-Identifier: GPL-2.0+
+
+ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+ #include "../tools/testing/selftests/kselftest/module.h"
+
+ KSTM_MODULE_GLOBALS();
+
+ /*
+ * Kernel module for testing the foobinator
+ */
+
+ static int __init test_function()
+ {
+ ...
+ }
+
+ static void __init selftest(void)
+ {
+ KSTM_CHECK_ZERO(do_test_case("", 0));
+ }
+
+ KSTM_MODULE_LOADERS(test_foo);
+ MODULE_AUTHOR("John Developer <jd@fooman.org>");
+ MODULE_LICENSE("GPL");
+
+Example test script
+-------------------
+
+.. code-block:: sh
+
+ #!/bin/bash
+ # SPDX-License-Identifier: GPL-2.0+
+ $(dirname $0)/../kselftest/module.sh "foo" test_foo
+
+
+Test Harness
+============
+
+The kselftest_harness.h file contains useful helpers to build tests. The
+test harness is for userspace testing, for kernel space testing see `Test
+Module`_ above.
+
+The tests from tools/testing/selftests/seccomp/seccomp_bpf.c can be used as
+example.
+
+Example
+-------
+
+.. kernel-doc:: tools/testing/selftests/kselftest_harness.h
+ :doc: example
+
+
+Helpers
+-------
+
+.. kernel-doc:: tools/testing/selftests/kselftest_harness.h
+ :functions: TH_LOG TEST TEST_SIGNAL FIXTURE FIXTURE_DATA FIXTURE_SETUP
+ FIXTURE_TEARDOWN TEST_F TEST_HARNESS_MAIN FIXTURE_VARIANT
+ FIXTURE_VARIANT_ADD
+
+Operators
+---------
+
+.. kernel-doc:: tools/testing/selftests/kselftest_harness.h
+ :doc: operators
+
+.. kernel-doc:: tools/testing/selftests/kselftest_harness.h
+ :functions: ASSERT_EQ ASSERT_NE ASSERT_LT ASSERT_LE ASSERT_GT ASSERT_GE
+ ASSERT_NULL ASSERT_TRUE ASSERT_NULL ASSERT_TRUE ASSERT_FALSE
+ ASSERT_STREQ ASSERT_STRNE EXPECT_EQ EXPECT_NE EXPECT_LT
+ EXPECT_LE EXPECT_GT EXPECT_GE EXPECT_NULL EXPECT_TRUE
+ EXPECT_FALSE EXPECT_STREQ EXPECT_STRNE
diff --git a/Documentation/dev-tools/kunit/api/index.rst b/Documentation/dev-tools/kunit/api/index.rst
new file mode 100644
index 000000000..9b9bffe5d
--- /dev/null
+++ b/Documentation/dev-tools/kunit/api/index.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============
+API Reference
+=============
+.. toctree::
+
+ test
+
+This section documents the KUnit kernel testing API. It is divided into the
+following sections:
+
+================================= ==============================================
+:doc:`test` documents all of the standard testing API
+ excluding mocking or mocking related features.
+================================= ==============================================
diff --git a/Documentation/dev-tools/kunit/api/test.rst b/Documentation/dev-tools/kunit/api/test.rst
new file mode 100644
index 000000000..aaa97f17e
--- /dev/null
+++ b/Documentation/dev-tools/kunit/api/test.rst
@@ -0,0 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========
+Test API
+========
+
+This file documents all of the standard testing API excluding mocking or mocking
+related features.
+
+.. kernel-doc:: include/kunit/test.h
+ :internal:
diff --git a/Documentation/dev-tools/kunit/faq.rst b/Documentation/dev-tools/kunit/faq.rst
new file mode 100644
index 000000000..8d5029ad2
--- /dev/null
+++ b/Documentation/dev-tools/kunit/faq.rst
@@ -0,0 +1,103 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+Frequently Asked Questions
+==========================
+
+How is this different from Autotest, kselftest, etc?
+====================================================
+KUnit is a unit testing framework. Autotest, kselftest (and some others) are
+not.
+
+A `unit test <https://martinfowler.com/bliki/UnitTest.html>`_ is supposed to
+test a single unit of code in isolation, hence the name. A unit test should be
+the finest granularity of testing and as such should allow all possible code
+paths to be tested in the code under test; this is only possible if the code
+under test is very small and does not have any external dependencies outside of
+the test's control like hardware.
+
+There are no testing frameworks currently available for the kernel that do not
+require installing the kernel on a test machine or in a VM and all require
+tests to be written in userspace and run on the kernel under test; this is true
+for Autotest, kselftest, and some others, disqualifying any of them from being
+considered unit testing frameworks.
+
+Does KUnit support running on architectures other than UML?
+===========================================================
+
+Yes, well, mostly.
+
+For the most part, the KUnit core framework (what you use to write the tests)
+can compile to any architecture; it compiles like just another part of the
+kernel and runs when the kernel boots, or when built as a module, when the
+module is loaded. However, there is some infrastructure,
+like the KUnit Wrapper (``tools/testing/kunit/kunit.py``) that does not support
+other architectures.
+
+In short, this means that, yes, you can run KUnit on other architectures, but
+it might require more work than using KUnit on UML.
+
+For more information, see :ref:`kunit-on-non-uml`.
+
+What is the difference between a unit test and these other kinds of tests?
+==========================================================================
+Most existing tests for the Linux kernel would be categorized as an integration
+test, or an end-to-end test.
+
+- A unit test is supposed to test a single unit of code in isolation, hence the
+ name. A unit test should be the finest granularity of testing and as such
+ should allow all possible code paths to be tested in the code under test; this
+ is only possible if the code under test is very small and does not have any
+ external dependencies outside of the test's control like hardware.
+- An integration test tests the interaction between a minimal set of components,
+ usually just two or three. For example, someone might write an integration
+ test to test the interaction between a driver and a piece of hardware, or to
+ test the interaction between the userspace libraries the kernel provides and
+ the kernel itself; however, one of these tests would probably not test the
+ entire kernel along with hardware interactions and interactions with the
+ userspace.
+- An end-to-end test usually tests the entire system from the perspective of the
+ code under test. For example, someone might write an end-to-end test for the
+ kernel by installing a production configuration of the kernel on production
+ hardware with a production userspace and then trying to exercise some behavior
+ that depends on interactions between the hardware, the kernel, and userspace.
+
+KUnit isn't working, what should I do?
+======================================
+
+Unfortunately, there are a number of things which can break, but here are some
+things to try.
+
+1. Try running ``./tools/testing/kunit/kunit.py run`` with the ``--raw_output``
+ parameter. This might show details or error messages hidden by the kunit_tool
+ parser.
+2. Instead of running ``kunit.py run``, try running ``kunit.py config``,
+ ``kunit.py build``, and ``kunit.py exec`` independently. This can help track
+ down where an issue is occurring. (If you think the parser is at fault, you
+ can run it manually against stdin or a file with ``kunit.py parse``.)
+3. Running the UML kernel directly can often reveal issues or error messages
+ kunit_tool ignores. This should be as simple as running ``./vmlinux`` after
+ building the UML kernel (e.g., by using ``kunit.py build``). Note that UML
+ has some unusual requirements (such as the host having a tmpfs filesystem
+ mounted), and has had issues in the past when built statically and the host
+ has KASLR enabled. (On older host kernels, you may need to run ``setarch
+ `uname -m` -R ./vmlinux`` to disable KASLR.)
+4. Make sure the kernel .config has ``CONFIG_KUNIT=y`` and at least one test
+ (e.g. ``CONFIG_KUNIT_EXAMPLE_TEST=y``). kunit_tool will keep its .config
+ around, so you can see what config was used after running ``kunit.py run``.
+ It also preserves any config changes you might make, so you can
+ enable/disable things with ``make ARCH=um menuconfig`` or similar, and then
+ re-run kunit_tool.
+5. Try to run ``make ARCH=um defconfig`` before running ``kunit.py run``. This
+ may help clean up any residual config items which could be causing problems.
+6. Finally, try running KUnit outside UML. KUnit and KUnit tests can be
+ built into any kernel, or can be built as a module and loaded at runtime.
+ Doing so should allow you to determine if UML is causing the issue you're
+ seeing. When tests are built-in, they will execute when the kernel boots, and
+ modules will automatically execute associated tests when loaded. Test results
+ can be collected from ``/sys/kernel/debug/kunit/<test suite>/results``, and
+ can be parsed with ``kunit.py parse``. For more details, see "KUnit on
+ non-UML architectures" in :doc:`usage`.
+
+If none of the above tricks help, you are always welcome to email any issues to
+kunit-dev@googlegroups.com.
diff --git a/Documentation/dev-tools/kunit/index.rst b/Documentation/dev-tools/kunit/index.rst
new file mode 100644
index 000000000..c234a3ab3
--- /dev/null
+++ b/Documentation/dev-tools/kunit/index.rst
@@ -0,0 +1,94 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+KUnit - Unit Testing for the Linux Kernel
+=========================================
+
+.. toctree::
+ :maxdepth: 2
+
+ start
+ usage
+ kunit-tool
+ api/index
+ style
+ faq
+
+What is KUnit?
+==============
+
+KUnit is a lightweight unit testing and mocking framework for the Linux kernel.
+
+KUnit is heavily inspired by JUnit, Python's unittest.mock, and
+Googletest/Googlemock for C++. KUnit provides facilities for defining unit test
+cases, grouping related test cases into test suites, providing common
+infrastructure for running tests, and much more.
+
+KUnit consists of a kernel component, which provides a set of macros for easily
+writing unit tests. Tests written against KUnit will run on kernel boot if
+built-in, or when loaded if built as a module. These tests write out results to
+the kernel log in `TAP <https://testanything.org/>`_ format.
+
+To make running these tests (and reading the results) easier, KUnit offers
+:doc:`kunit_tool <kunit-tool>`, which builds a `User Mode Linux
+<http://user-mode-linux.sourceforge.net>`_ kernel, runs it, and parses the test
+results. This provides a quick way of running KUnit tests during development,
+without requiring a virtual machine or separate hardware.
+
+Get started now: :doc:`start`
+
+Why KUnit?
+==========
+
+A unit test is supposed to test a single unit of code in isolation, hence the
+name. A unit test should be the finest granularity of testing and as such should
+allow all possible code paths to be tested in the code under test; this is only
+possible if the code under test is very small and does not have any external
+dependencies outside of the test's control like hardware.
+
+KUnit provides a common framework for unit tests within the kernel.
+
+KUnit tests can be run on most architectures, and most tests are architecture
+independent. All built-in KUnit tests run on kernel startup. Alternatively,
+KUnit and KUnit tests can be built as modules and tests will run when the test
+module is loaded.
+
+.. note::
+
+ KUnit can also run tests without needing a virtual machine or actual
+ hardware under User Mode Linux. User Mode Linux is a Linux architecture,
+ like ARM or x86, which compiles the kernel as a Linux executable. KUnit
+ can be used with UML either by building with ``ARCH=um`` (like any other
+ architecture), or by using :doc:`kunit_tool <kunit-tool>`.
+
+KUnit is fast. Excluding build time, from invocation to completion KUnit can run
+several dozen tests in only 10 to 20 seconds; this might not sound like a big
+deal to some people, but having such fast and easy to run tests fundamentally
+changes the way you go about testing and even writing code in the first place.
+Linus himself said in his `git talk at Google
+<https://gist.github.com/lorn/1272686/revisions#diff-53c65572127855f1b003db4064a94573R874>`_:
+
+ "... a lot of people seem to think that performance is about doing the
+ same thing, just doing it faster, and that is not true. That is not what
+ performance is all about. If you can do something really fast, really
+ well, people will start using it differently."
+
+In this context Linus was talking about branching and merging,
+but this point also applies to testing. If your tests are slow, unreliable, are
+difficult to write, and require a special setup or special hardware to run,
+then you wait a lot longer to write tests, and you wait a lot longer to run
+tests; this means that tests are likely to break, unlikely to test a lot of
+things, and are unlikely to be rerun once they pass. If your tests are really
+fast, you run them all the time, every time you make a change, and every time
+someone sends you some code. Why trust that someone ran all their tests
+correctly on every change when you can just run them yourself in less time than
+it takes to read their test log?
+
+How do I use it?
+================
+
+* :doc:`start` - for new users of KUnit
+* :doc:`usage` - for a more detailed explanation of KUnit features
+* :doc:`api/index` - for the list of KUnit APIs used for testing
+* :doc:`kunit-tool` - for more information on the kunit_tool helper script
+* :doc:`faq` - for answers to some common questions about KUnit
diff --git a/Documentation/dev-tools/kunit/kunit-tool.rst b/Documentation/dev-tools/kunit/kunit-tool.rst
new file mode 100644
index 000000000..29ae2fee8
--- /dev/null
+++ b/Documentation/dev-tools/kunit/kunit-tool.rst
@@ -0,0 +1,57 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+kunit_tool How-To
+=================
+
+What is kunit_tool?
+===================
+
+kunit_tool is a script (``tools/testing/kunit/kunit.py``) that aids in building
+the Linux kernel as UML (`User Mode Linux
+<http://user-mode-linux.sourceforge.net/>`_), running KUnit tests, parsing
+the test results and displaying them in a user friendly manner.
+
+kunit_tool addresses the problem of being able to run tests without needing a
+virtual machine or actual hardware with User Mode Linux. User Mode Linux is a
+Linux architecture, like ARM or x86; however, unlike other architectures it
+compiles the kernel as a standalone Linux executable that can be run like any
+other program directly inside of a host operating system. To be clear, it does
+not require any virtualization support: it is just a regular program.
+
+What is a .kunitconfig?
+=======================
+
+It's just a defconfig that kunit_tool looks for in the base directory.
+kunit_tool uses it to generate a .config as you might expect. In addition, it
+verifies that the generated .config contains the CONFIG options in the
+.kunitconfig; the reason it does this is so that it is easy to be sure that a
+CONFIG that enables a test actually ends up in the .config.
+
+How do I use kunit_tool?
+========================
+
+If a kunitconfig is present at the root directory, all you have to do is:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run
+
+However, you most likely want to use it with the following options:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run --timeout=30 --jobs=`nproc --all`
+
+- ``--timeout`` sets a maximum amount of time to allow tests to run.
+- ``--jobs`` sets the number of threads to use to build the kernel.
+
+.. note::
+ This command will work even without a .kunitconfig file: if no
+ .kunitconfig is present, a default one will be used instead.
+
+For a list of all the flags supported by kunit_tool, you can run:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run --help
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
new file mode 100644
index 000000000..454f30781
--- /dev/null
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -0,0 +1,237 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Getting Started
+===============
+
+Installing dependencies
+=======================
+KUnit has the same dependencies as the Linux kernel. As long as you can build
+the kernel, you can run KUnit.
+
+Running tests with the KUnit Wrapper
+====================================
+Included with KUnit is a simple Python wrapper which runs tests under User Mode
+Linux, and formats the test results.
+
+The wrapper can be run with:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run
+
+For more information on this wrapper (also called kunit_tool) check out the
+:doc:`kunit-tool` page.
+
+Creating a .kunitconfig
+-----------------------
+If you want to run a specific set of tests (rather than those listed in the
+KUnit defconfig), you can provide Kconfig options in the ``.kunitconfig`` file.
+This file essentially contains the regular Kernel config, with the specific
+test targets as well. The ``.kunitconfig`` should also contain any other config
+options required by the tests.
+
+A good starting point for a ``.kunitconfig`` is the KUnit defconfig:
+
+.. code-block:: bash
+
+ cd $PATH_TO_LINUX_REPO
+ cp arch/um/configs/kunit_defconfig .kunitconfig
+
+You can then add any other Kconfig options you wish, e.g.:
+
+.. code-block:: none
+
+ CONFIG_LIST_KUNIT_TEST=y
+
+:doc:`kunit_tool <kunit-tool>` will ensure that all config options set in
+``.kunitconfig`` are set in the kernel ``.config`` before running the tests.
+It'll warn you if you haven't included the dependencies of the options you're
+using.
+
+.. note::
+ Note that removing something from the ``.kunitconfig`` will not trigger a
+ rebuild of the ``.config`` file: the configuration is only updated if the
+ ``.kunitconfig`` is not a subset of ``.config``. This means that you can use
+ other tools (such as make menuconfig) to adjust other config options.
+
+
+Running the tests (KUnit Wrapper)
+---------------------------------
+
+To make sure that everything is set up correctly, simply invoke the Python
+wrapper from your kernel repo:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run
+
+.. note::
+ You may want to run ``make mrproper`` first.
+
+If everything worked correctly, you should see the following:
+
+.. code-block:: bash
+
+ Generating .config ...
+ Building KUnit Kernel ...
+ Starting KUnit Kernel ...
+
+followed by a list of tests that are run. All of them should be passing.
+
+.. note::
+ Because it is building a lot of sources for the first time, the
+ ``Building KUnit kernel`` step may take a while.
+
+Running tests without the KUnit Wrapper
+=======================================
+
+If you'd rather not use the KUnit Wrapper (if, for example, you need to
+integrate with other systems, or use an architecture other than UML), KUnit can
+be included in any kernel, and the results read out and parsed manually.
+
+.. note::
+ KUnit is not designed for use in a production system, and it's possible that
+ tests may reduce the stability or security of the system.
+
+
+
+Configuring the kernel
+----------------------
+
+In order to enable KUnit itself, you simply need to enable the ``CONFIG_KUNIT``
+Kconfig option (it's under Kernel Hacking/Kernel Testing and Coverage in
+menuconfig). From there, you can enable any KUnit tests you want: they usually
+have config options ending in ``_KUNIT_TEST``.
+
+KUnit and KUnit tests can be compiled as modules: in this case the tests in a
+module will be run when the module is loaded.
+
+
+Running the tests (w/o KUnit Wrapper)
+-------------------------------------
+
+Build and run your kernel as usual. Test output will be written to the kernel
+log in `TAP <https://testanything.org/>`_ format.
+
+.. note::
+ It's possible that there will be other lines and/or data interspersed in the
+ TAP output.
+
+
+Writing your first test
+=======================
+
+In your kernel repo let's add some code that we can test. Create a file
+``drivers/misc/example.h`` with the contents:
+
+.. code-block:: c
+
+ int misc_example_add(int left, int right);
+
+create a file ``drivers/misc/example.c``:
+
+.. code-block:: c
+
+ #include <linux/errno.h>
+
+ #include "example.h"
+
+ int misc_example_add(int left, int right)
+ {
+ return left + right;
+ }
+
+Now add the following lines to ``drivers/misc/Kconfig``:
+
+.. code-block:: kconfig
+
+ config MISC_EXAMPLE
+ bool "My example"
+
+and the following lines to ``drivers/misc/Makefile``:
+
+.. code-block:: make
+
+ obj-$(CONFIG_MISC_EXAMPLE) += example.o
+
+Now we are ready to write the test. The test will be in
+``drivers/misc/example-test.c``:
+
+.. code-block:: c
+
+ #include <kunit/test.h>
+ #include "example.h"
+
+ /* Define the test cases. */
+
+ static void misc_example_add_test_basic(struct kunit *test)
+ {
+ KUNIT_EXPECT_EQ(test, 1, misc_example_add(1, 0));
+ KUNIT_EXPECT_EQ(test, 2, misc_example_add(1, 1));
+ KUNIT_EXPECT_EQ(test, 0, misc_example_add(-1, 1));
+ KUNIT_EXPECT_EQ(test, INT_MAX, misc_example_add(0, INT_MAX));
+ KUNIT_EXPECT_EQ(test, -1, misc_example_add(INT_MAX, INT_MIN));
+ }
+
+ static void misc_example_test_failure(struct kunit *test)
+ {
+ KUNIT_FAIL(test, "This test never passes.");
+ }
+
+ static struct kunit_case misc_example_test_cases[] = {
+ KUNIT_CASE(misc_example_add_test_basic),
+ KUNIT_CASE(misc_example_test_failure),
+ {}
+ };
+
+ static struct kunit_suite misc_example_test_suite = {
+ .name = "misc-example",
+ .test_cases = misc_example_test_cases,
+ };
+ kunit_test_suite(misc_example_test_suite);
+
+Now add the following to ``drivers/misc/Kconfig``:
+
+.. code-block:: kconfig
+
+ config MISC_EXAMPLE_TEST
+ bool "Test for my example"
+ depends on MISC_EXAMPLE && KUNIT=y
+
+and the following to ``drivers/misc/Makefile``:
+
+.. code-block:: make
+
+ obj-$(CONFIG_MISC_EXAMPLE_TEST) += example-test.o
+
+Now add it to your ``.kunitconfig``:
+
+.. code-block:: none
+
+ CONFIG_MISC_EXAMPLE=y
+ CONFIG_MISC_EXAMPLE_TEST=y
+
+Now you can run the test:
+
+.. code-block:: bash
+
+ ./tools/testing/kunit/kunit.py run
+
+You should see the following failure:
+
+.. code-block:: none
+
+ ...
+ [16:08:57] [PASSED] misc-example:misc_example_add_test_basic
+ [16:08:57] [FAILED] misc-example:misc_example_test_failure
+ [16:08:57] EXPECTATION FAILED at drivers/misc/example-test.c:17
+ [16:08:57] This test never passes.
+ ...
+
+Congrats! You just wrote your first KUnit test!
+
+Next Steps
+==========
+* Check out the :doc:`usage` page for a more
+ in-depth explanation of KUnit.
diff --git a/Documentation/dev-tools/kunit/style.rst b/Documentation/dev-tools/kunit/style.rst
new file mode 100644
index 000000000..8dbcdc552
--- /dev/null
+++ b/Documentation/dev-tools/kunit/style.rst
@@ -0,0 +1,205 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+Test Style and Nomenclature
+===========================
+
+To make finding, writing, and using KUnit tests as simple as possible, it's
+strongly encouraged that they are named and written according to the guidelines
+below. While it's possible to write KUnit tests which do not follow these rules,
+they may break some tooling, may conflict with other tests, and may not be run
+automatically by testing systems.
+
+It's recommended that you only deviate from these guidelines when:
+
+1. Porting tests to KUnit which are already known with an existing name, or
+2. Writing tests which would cause serious problems if automatically run (e.g.,
+ non-deterministically producing false positives or negatives, or taking an
+ extremely long time to run).
+
+Subsystems, Suites, and Tests
+=============================
+
+In order to make tests as easy to find as possible, they're grouped into suites
+and subsystems. A test suite is a group of tests which test a related area of
+the kernel, and a subsystem is a set of test suites which test different parts
+of the same kernel subsystem or driver.
+
+Subsystems
+----------
+
+Every test suite must belong to a subsystem. A subsystem is a collection of one
+or more KUnit test suites which test the same driver or part of the kernel. A
+rule of thumb is that a test subsystem should match a single kernel module. If
+the code being tested can't be compiled as a module, in many cases the subsystem
+should correspond to a directory in the source tree or an entry in the
+MAINTAINERS file. If unsure, follow the conventions set by tests in similar
+areas.
+
+Test subsystems should be named after the code being tested, either after the
+module (wherever possible), or after the directory or files being tested. Test
+subsystems should be named to avoid ambiguity where necessary.
+
+If a test subsystem name has multiple components, they should be separated by
+underscores. *Do not* include "test" or "kunit" directly in the subsystem name
+unless you are actually testing other tests or the kunit framework itself.
+
+Example subsystems could be:
+
+``ext4``
+ Matches the module and filesystem name.
+``apparmor``
+ Matches the module name and LSM name.
+``kasan``
+ Common name for the tool, prominent part of the path ``mm/kasan``
+``snd_hda_codec_hdmi``
+ Has several components (``snd``, ``hda``, ``codec``, ``hdmi``) separated by
+ underscores. Matches the module name.
+
+Avoid names like these:
+
+``linear-ranges``
+ Names should use underscores, not dashes, to separate words. Prefer
+ ``linear_ranges``.
+``qos-kunit-test``
+ As well as using underscores, this name should not have "kunit-test" as a
+ suffix, and ``qos`` is ambiguous as a subsystem name. ``power_qos`` would be a
+ better name.
+``pc_parallel_port``
+ The corresponding module name is ``parport_pc``, so this subsystem should also
+ be named ``parport_pc``.
+
+.. note::
+ The KUnit API and tools do not explicitly know about subsystems. They're
+ simply a way of categorising test suites and naming modules which
+ provides a simple, consistent way for humans to find and run tests. This
+ may change in the future, though.
+
+Suites
+------
+
+KUnit tests are grouped into test suites, which cover a specific area of
+functionality being tested. Test suites can have shared initialisation and
+shutdown code which is run for all tests in the suite.
+Not all subsystems will need to be split into multiple test suites (e.g. simple drivers).
+
+Test suites are named after the subsystem they are part of. If a subsystem
+contains several suites, the specific area under test should be appended to the
+subsystem name, separated by an underscore.
+
+In the event that there are multiple types of test using KUnit within a
+subsystem (e.g., both unit tests and integration tests), they should be put into
+separate suites, with the type of test as the last element in the suite name.
+Unless these tests are actually present, avoid using ``_test``, ``_unittest`` or
+similar in the suite name.
+
+The full test suite name (including the subsystem name) should be specified as
+the ``.name`` member of the ``kunit_suite`` struct, and forms the base for the
+module name (see below).
+
+Example test suites could include:
+
+``ext4_inode``
+ Part of the ``ext4`` subsystem, testing the ``inode`` area.
+``kunit_try_catch``
+ Part of the ``kunit`` implementation itself, testing the ``try_catch`` area.
+``apparmor_property_entry``
+ Part of the ``apparmor`` subsystem, testing the ``property_entry`` area.
+``kasan``
+ The ``kasan`` subsystem has only one suite, so the suite name is the same as
+ the subsystem name.
+
+Avoid names like:
+
+``ext4_ext4_inode``
+ There's no reason to state the subsystem twice.
+``property_entry``
+ The suite name is ambiguous without the subsystem name.
+``kasan_integration_test``
+ Because there is only one suite in the ``kasan`` subsystem, the suite should
+ just be called ``kasan``. There's no need to redundantly add
+ ``integration_test``. Should a separate test suite with, for example, unit
+ tests be added, then that suite could be named ``kasan_unittest`` or similar.
+
+Test Cases
+----------
+
+Individual tests consist of a single function which tests a constrained
+codepath, property, or function. In the test output, individual tests' results
+will show up as subtests of the suite's results.
+
+Tests should be named after what they're testing. This is often the name of the
+function being tested, with a description of the input or codepath being tested.
+As tests are C functions, they should be named and written in accordance with
+the kernel coding style.
+
+.. note::
+ As tests are themselves functions, their names cannot conflict with
+ other C identifiers in the kernel. This may require some creative
+ naming. It's a good idea to make your test functions `static` to avoid
+ polluting the global namespace.
+
+Example test names include:
+
+``unpack_u32_with_null_name``
+ Tests the ``unpack_u32`` function when a NULL name is passed in.
+``test_list_splice``
+ Tests the ``list_splice`` macro. It has the prefix ``test_`` to avoid a
+ name conflict with the macro itself.
+
+
+Should it be necessary to refer to a test outside the context of its test suite,
+the *fully-qualified* name of a test should be the suite name followed by the
+test name, separated by a colon (i.e. ``suite:test``).
+
+Test Kconfig Entries
+====================
+
+Every test suite should be tied to a Kconfig entry.
+
+This Kconfig entry must:
+
+* be named ``CONFIG_<name>_KUNIT_TEST``: where <name> is the name of the test
+ suite.
+* be listed either alongside the config entries for the driver/subsystem being
+ tested, or be under [Kernel Hacking]→[Kernel Testing and Coverage]
+* depend on ``CONFIG_KUNIT``
+* be visible only if ``CONFIG_KUNIT_ALL_TESTS`` is not enabled.
+* have a default value of ``CONFIG_KUNIT_ALL_TESTS``.
+* have a brief description of KUnit in the help text
+
+Unless there's a specific reason not to (e.g. the test is unable to be built as
+a module), Kconfig entries for tests should be tristate.
+
+An example Kconfig entry:
+
+.. code-block:: none
+
+ config FOO_KUNIT_TEST
+ tristate "KUnit test for foo" if !KUNIT_ALL_TESTS
+ depends on KUNIT
+ default KUNIT_ALL_TESTS
+ help
+ This builds unit tests for foo.
+
+ For more information on KUnit and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
+
+
+Test File and Module Names
+==========================
+
+KUnit tests can often be compiled as a module. These modules should be named
+after the test suite, followed by ``_test``. If this is likely to conflict with
+non-KUnit tests, the suffix ``_kunit`` can also be used.
+
+The easiest way of achieving this is to name the file containing the test suite
+``<suite>_test.c`` (or, as above, ``<suite>_kunit.c``). This file should be
+placed next to the code under test.
+
+If the suite name contains some or all of the name of the test's parent
+directory, it may make sense to modify the source filename to reduce redundancy.
+For example, a ``foo_firmware`` suite could be in the ``foo/firmware_test.c``
+file.
diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
new file mode 100644
index 000000000..9c28c518e
--- /dev/null
+++ b/Documentation/dev-tools/kunit/usage.rst
@@ -0,0 +1,617 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========
+Using KUnit
+===========
+
+The purpose of this document is to describe what KUnit is, how it works, how it
+is intended to be used, and all the concepts and terminology that are needed to
+understand it. This guide assumes a working knowledge of the Linux kernel and
+some basic knowledge of testing.
+
+For a high level introduction to KUnit, including setting up KUnit for your
+project, see :doc:`start`.
+
+Organization of this document
+=============================
+
+This document is organized into two main sections: Testing and Isolating
+Behavior. The first covers what unit tests are and how to use KUnit to write
+them. The second covers how to use KUnit to isolate code and make it possible
+to unit test code that was otherwise un-unit-testable.
+
+Testing
+=======
+
+What is KUnit?
+--------------
+
+"K" is short for "kernel" so "KUnit" is the "(Linux) Kernel Unit Testing
+Framework." KUnit is intended first and foremost for writing unit tests; it is
+general enough that it can be used to write integration tests; however, this is
+a secondary goal. KUnit has no ambition of being the only testing framework for
+the kernel; for example, it does not intend to be an end-to-end testing
+framework.
+
+What is Unit Testing?
+---------------------
+
+A `unit test <https://martinfowler.com/bliki/UnitTest.html>`_ is a test that
+tests code at the smallest possible scope, a *unit* of code. In the C
+programming language that's a function.
+
+Unit tests should be written for all the publicly exposed functions in a
+compilation unit; so that is all the functions that are exported in either a
+*class* (defined below) or all functions which are **not** static.
+
+Writing Tests
+-------------
+
+Test Cases
+~~~~~~~~~~
+
+The fundamental unit in KUnit is the test case. A test case is a function with
+the signature ``void (*)(struct kunit *test)``. It calls a function to be tested
+and then sets *expectations* for what should happen. For example:
+
+.. code-block:: c
+
+ void example_test_success(struct kunit *test)
+ {
+ }
+
+ void example_test_failure(struct kunit *test)
+ {
+ KUNIT_FAIL(test, "This test never passes.");
+ }
+
+In the above example ``example_test_success`` always passes because it does
+nothing; no expectations are set, so all expectations pass. On the other hand
+``example_test_failure`` always fails because it calls ``KUNIT_FAIL``, which is
+a special expectation that logs a message and causes the test case to fail.
+
+Expectations
+~~~~~~~~~~~~
+An *expectation* is a way to specify that you expect a piece of code to do
+something in a test. An expectation is called like a function. A test is made
+by setting expectations about the behavior of a piece of code under test; when
+one or more of the expectations fail, the test case fails and information about
+the failure is logged. For example:
+
+.. code-block:: c
+
+ void add_test_basic(struct kunit *test)
+ {
+ KUNIT_EXPECT_EQ(test, 1, add(1, 0));
+ KUNIT_EXPECT_EQ(test, 2, add(1, 1));
+ }
+
+In the above example ``add_test_basic`` makes a number of assertions about the
+behavior of a function called ``add``; the first parameter is always of type
+``struct kunit *``, which contains information about the current test context;
+the second parameter, in this case, is what the value is expected to be; the
+last value is what the value actually is. If ``add`` passes all of these
+expectations, the test case, ``add_test_basic`` will pass; if any one of these
+expectations fails, the test case will fail.
+
+It is important to understand that a test case *fails* when any expectation is
+violated; however, the test will continue running, potentially trying other
+expectations until the test case ends or is otherwise terminated. This is as
+opposed to *assertions* which are discussed later.
+
+To learn about more expectations supported by KUnit, see :doc:`api/test`.
+
+.. note::
+ A single test case should be pretty short, pretty easy to understand,
+ focused on a single behavior.
+
+For example, if we wanted to properly test the add function above, we would
+create additional tests cases which would each test a different property that an
+add function should have like this:
+
+.. code-block:: c
+
+ void add_test_basic(struct kunit *test)
+ {
+ KUNIT_EXPECT_EQ(test, 1, add(1, 0));
+ KUNIT_EXPECT_EQ(test, 2, add(1, 1));
+ }
+
+ void add_test_negative(struct kunit *test)
+ {
+ KUNIT_EXPECT_EQ(test, 0, add(-1, 1));
+ }
+
+ void add_test_max(struct kunit *test)
+ {
+ KUNIT_EXPECT_EQ(test, INT_MAX, add(0, INT_MAX));
+ KUNIT_EXPECT_EQ(test, -1, add(INT_MAX, INT_MIN));
+ }
+
+ void add_test_overflow(struct kunit *test)
+ {
+ KUNIT_EXPECT_EQ(test, INT_MIN, add(INT_MAX, 1));
+ }
+
+Notice how it is immediately obvious what all the properties that we are testing
+for are.
+
+Assertions
+~~~~~~~~~~
+
+KUnit also has the concept of an *assertion*. An assertion is just like an
+expectation except the assertion immediately terminates the test case if it is
+not satisfied.
+
+For example:
+
+.. code-block:: c
+
+ static void mock_test_do_expect_default_return(struct kunit *test)
+ {
+ struct mock_test_context *ctx = test->priv;
+ struct mock *mock = ctx->mock;
+ int param0 = 5, param1 = -5;
+ const char *two_param_types[] = {"int", "int"};
+ const void *two_params[] = {&param0, &param1};
+ const void *ret;
+
+ ret = mock->do_expect(mock,
+ "test_printk", test_printk,
+ two_param_types, two_params,
+ ARRAY_SIZE(two_params));
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ret);
+ KUNIT_EXPECT_EQ(test, -4, *((int *) ret));
+ }
+
+In this example, the method under test should return a pointer to a value, so
+if the pointer returned by the method is null or an errno, we don't want to
+bother continuing the test since the following expectation could crash the test
+case. `ASSERT_NOT_ERR_OR_NULL(...)` allows us to bail out of the test case if
+the appropriate conditions have not been satisfied to complete the test.
+
+Test Suites
+~~~~~~~~~~~
+
+Now obviously one unit test isn't very helpful; the power comes from having
+many test cases covering all of a unit's behaviors. Consequently it is common
+to have many *similar* tests; in order to reduce duplication in these closely
+related tests most unit testing frameworks - including KUnit - provide the
+concept of a *test suite*. A *test suite* is just a collection of test cases
+for a unit of code with a set up function that gets invoked before every test
+case and then a tear down function that gets invoked after every test case
+completes.
+
+Example:
+
+.. code-block:: c
+
+ static struct kunit_case example_test_cases[] = {
+ KUNIT_CASE(example_test_foo),
+ KUNIT_CASE(example_test_bar),
+ KUNIT_CASE(example_test_baz),
+ {}
+ };
+
+ static struct kunit_suite example_test_suite = {
+ .name = "example",
+ .init = example_test_init,
+ .exit = example_test_exit,
+ .test_cases = example_test_cases,
+ };
+ kunit_test_suite(example_test_suite);
+
+In the above example the test suite, ``example_test_suite``, would run the test
+cases ``example_test_foo``, ``example_test_bar``, and ``example_test_baz``;
+each would have ``example_test_init`` called immediately before it and would
+have ``example_test_exit`` called immediately after it.
+``kunit_test_suite(example_test_suite)`` registers the test suite with the
+KUnit test framework.
+
+.. note::
+ A test case will only be run if it is associated with a test suite.
+
+``kunit_test_suite(...)`` is a macro which tells the linker to put the specified
+test suite in a special linker section so that it can be run by KUnit either
+after late_init, or when the test module is loaded (depending on whether the
+test was built in or not).
+
+For more information on these types of things see the :doc:`api/test`.
+
+Isolating Behavior
+==================
+
+The most important aspect of unit testing that other forms of testing do not
+provide is the ability to limit the amount of code under test to a single unit.
+In practice, this is only possible by being able to control what code gets run
+when the unit under test calls a function and this is usually accomplished
+through some sort of indirection where a function is exposed as part of an API
+such that the definition of that function can be changed without affecting the
+rest of the code base. In the kernel this primarily comes from two constructs,
+classes, structs that contain function pointers that are provided by the
+implementer, and architecture-specific functions which have definitions selected
+at compile time.
+
+Classes
+-------
+
+Classes are not a construct that is built into the C programming language;
+however, it is an easily derived concept. Accordingly, pretty much every project
+that does not use a standardized object oriented library (like GNOME's GObject)
+has their own slightly different way of doing object oriented programming; the
+Linux kernel is no exception.
+
+The central concept in kernel object oriented programming is the class. In the
+kernel, a *class* is a struct that contains function pointers. This creates a
+contract between *implementers* and *users* since it forces them to use the
+same function signature without having to call the function directly. In order
+for it to truly be a class, the function pointers must specify that a pointer
+to the class, known as a *class handle*, be one of the parameters; this makes
+it possible for the member functions (also known as *methods*) to have access
+to member variables (more commonly known as *fields*) allowing the same
+implementation to have multiple *instances*.
+
+Typically a class can be *overridden* by *child classes* by embedding the
+*parent class* in the child class. Then when a method provided by the child
+class is called, the child implementation knows that the pointer passed to it is
+of a parent contained within the child; because of this, the child can compute
+the pointer to itself because the pointer to the parent is always a fixed offset
+from the pointer to the child; this offset is the offset of the parent contained
+in the child struct. For example:
+
+.. code-block:: c
+
+ struct shape {
+ int (*area)(struct shape *this);
+ };
+
+ struct rectangle {
+ struct shape parent;
+ int length;
+ int width;
+ };
+
+ int rectangle_area(struct shape *this)
+ {
+ struct rectangle *self = container_of(this, struct shape, parent);
+
+ return self->length * self->width;
+ };
+
+ void rectangle_new(struct rectangle *self, int length, int width)
+ {
+ self->parent.area = rectangle_area;
+ self->length = length;
+ self->width = width;
+ }
+
+In this example (as in most kernel code) the operation of computing the pointer
+to the child from the pointer to the parent is done by ``container_of``.
+
+Faking Classes
+~~~~~~~~~~~~~~
+
+In order to unit test a piece of code that calls a method in a class, the
+behavior of the method must be controllable, otherwise the test ceases to be a
+unit test and becomes an integration test.
+
+A fake just provides an implementation of a piece of code that is different than
+what runs in a production instance, but behaves identically from the standpoint
+of the callers; this is usually done to replace a dependency that is hard to
+deal with, or is slow.
+
+A good example for this might be implementing a fake EEPROM that just stores the
+"contents" in an internal buffer. For example, let's assume we have a class that
+represents an EEPROM:
+
+.. code-block:: c
+
+ struct eeprom {
+ ssize_t (*read)(struct eeprom *this, size_t offset, char *buffer, size_t count);
+ ssize_t (*write)(struct eeprom *this, size_t offset, const char *buffer, size_t count);
+ };
+
+And we want to test some code that buffers writes to the EEPROM:
+
+.. code-block:: c
+
+ struct eeprom_buffer {
+ ssize_t (*write)(struct eeprom_buffer *this, const char *buffer, size_t count);
+ int flush(struct eeprom_buffer *this);
+ size_t flush_count; /* Flushes when buffer exceeds flush_count. */
+ };
+
+ struct eeprom_buffer *new_eeprom_buffer(struct eeprom *eeprom);
+ void destroy_eeprom_buffer(struct eeprom *eeprom);
+
+We can easily test this code by *faking out* the underlying EEPROM:
+
+.. code-block:: c
+
+ struct fake_eeprom {
+ struct eeprom parent;
+ char contents[FAKE_EEPROM_CONTENTS_SIZE];
+ };
+
+ ssize_t fake_eeprom_read(struct eeprom *parent, size_t offset, char *buffer, size_t count)
+ {
+ struct fake_eeprom *this = container_of(parent, struct fake_eeprom, parent);
+
+ count = min(count, FAKE_EEPROM_CONTENTS_SIZE - offset);
+ memcpy(buffer, this->contents + offset, count);
+
+ return count;
+ }
+
+ ssize_t fake_eeprom_write(struct eeprom *parent, size_t offset, const char *buffer, size_t count)
+ {
+ struct fake_eeprom *this = container_of(parent, struct fake_eeprom, parent);
+
+ count = min(count, FAKE_EEPROM_CONTENTS_SIZE - offset);
+ memcpy(this->contents + offset, buffer, count);
+
+ return count;
+ }
+
+ void fake_eeprom_init(struct fake_eeprom *this)
+ {
+ this->parent.read = fake_eeprom_read;
+ this->parent.write = fake_eeprom_write;
+ memset(this->contents, 0, FAKE_EEPROM_CONTENTS_SIZE);
+ }
+
+We can now use it to test ``struct eeprom_buffer``:
+
+.. code-block:: c
+
+ struct eeprom_buffer_test {
+ struct fake_eeprom *fake_eeprom;
+ struct eeprom_buffer *eeprom_buffer;
+ };
+
+ static void eeprom_buffer_test_does_not_write_until_flush(struct kunit *test)
+ {
+ struct eeprom_buffer_test *ctx = test->priv;
+ struct eeprom_buffer *eeprom_buffer = ctx->eeprom_buffer;
+ struct fake_eeprom *fake_eeprom = ctx->fake_eeprom;
+ char buffer[] = {0xff};
+
+ eeprom_buffer->flush_count = SIZE_MAX;
+
+ eeprom_buffer->write(eeprom_buffer, buffer, 1);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0);
+
+ eeprom_buffer->write(eeprom_buffer, buffer, 1);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[1], 0);
+
+ eeprom_buffer->flush(eeprom_buffer);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0xff);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[1], 0xff);
+ }
+
+ static void eeprom_buffer_test_flushes_after_flush_count_met(struct kunit *test)
+ {
+ struct eeprom_buffer_test *ctx = test->priv;
+ struct eeprom_buffer *eeprom_buffer = ctx->eeprom_buffer;
+ struct fake_eeprom *fake_eeprom = ctx->fake_eeprom;
+ char buffer[] = {0xff};
+
+ eeprom_buffer->flush_count = 2;
+
+ eeprom_buffer->write(eeprom_buffer, buffer, 1);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0);
+
+ eeprom_buffer->write(eeprom_buffer, buffer, 1);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0xff);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[1], 0xff);
+ }
+
+ static void eeprom_buffer_test_flushes_increments_of_flush_count(struct kunit *test)
+ {
+ struct eeprom_buffer_test *ctx = test->priv;
+ struct eeprom_buffer *eeprom_buffer = ctx->eeprom_buffer;
+ struct fake_eeprom *fake_eeprom = ctx->fake_eeprom;
+ char buffer[] = {0xff, 0xff};
+
+ eeprom_buffer->flush_count = 2;
+
+ eeprom_buffer->write(eeprom_buffer, buffer, 1);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0);
+
+ eeprom_buffer->write(eeprom_buffer, buffer, 2);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0xff);
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[1], 0xff);
+ /* Should have only flushed the first two bytes. */
+ KUNIT_EXPECT_EQ(test, fake_eeprom->contents[2], 0);
+ }
+
+ static int eeprom_buffer_test_init(struct kunit *test)
+ {
+ struct eeprom_buffer_test *ctx;
+
+ ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ctx);
+
+ ctx->fake_eeprom = kunit_kzalloc(test, sizeof(*ctx->fake_eeprom), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ctx->fake_eeprom);
+ fake_eeprom_init(ctx->fake_eeprom);
+
+ ctx->eeprom_buffer = new_eeprom_buffer(&ctx->fake_eeprom->parent);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ctx->eeprom_buffer);
+
+ test->priv = ctx;
+
+ return 0;
+ }
+
+ static void eeprom_buffer_test_exit(struct kunit *test)
+ {
+ struct eeprom_buffer_test *ctx = test->priv;
+
+ destroy_eeprom_buffer(ctx->eeprom_buffer);
+ }
+
+.. _kunit-on-non-uml:
+
+KUnit on non-UML architectures
+==============================
+
+By default KUnit uses UML as a way to provide dependencies for code under test.
+Under most circumstances KUnit's usage of UML should be treated as an
+implementation detail of how KUnit works under the hood. Nevertheless, there
+are instances where being able to run architecture-specific code or test
+against real hardware is desirable. For these reasons KUnit supports running on
+other architectures.
+
+Running existing KUnit tests on non-UML architectures
+-----------------------------------------------------
+
+There are some special considerations when running existing KUnit tests on
+non-UML architectures:
+
+* Hardware may not be deterministic, so a test that always passes or fails
+ when run under UML may not always do so on real hardware.
+* Hardware and VM environments may not be hermetic. KUnit tries its best to
+ provide a hermetic environment to run tests; however, it cannot manage state
+ that it doesn't know about outside of the kernel. Consequently, tests that
+ may be hermetic on UML may not be hermetic on other architectures.
+* Some features and tooling may not be supported outside of UML.
+* Hardware and VMs are slower than UML.
+
+None of these are reasons not to run your KUnit tests on real hardware; they are
+only things to be aware of when doing so.
+
+The biggest impediment will likely be that certain KUnit features and
+infrastructure may not support your target environment. For example, at this
+time the KUnit Wrapper (``tools/testing/kunit/kunit.py``) does not work outside
+of UML. Unfortunately, there is no way around this. Using UML (or even just a
+particular architecture) allows us to make a lot of assumptions that make it
+possible to do things which might otherwise be impossible.
+
+Nevertheless, all core KUnit framework features are fully supported on all
+architectures, and using them is straightforward: all you need to do is to take
+your kunitconfig, your Kconfig options for the tests you would like to run, and
+merge them into whatever config your are using for your platform. That's it!
+
+For example, let's say you have the following kunitconfig:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=y
+
+If you wanted to run this test on an x86 VM, you might add the following config
+options to your ``.config``:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=y
+ CONFIG_KUNIT_EXAMPLE_TEST=y
+ CONFIG_SERIAL_8250=y
+ CONFIG_SERIAL_8250_CONSOLE=y
+
+All these new options do is enable support for a common serial console needed
+for logging.
+
+Next, you could build a kernel with these tests as follows:
+
+
+.. code-block:: bash
+
+ make ARCH=x86 olddefconfig
+ make ARCH=x86
+
+Once you have built a kernel, you could run it on QEMU as follows:
+
+.. code-block:: bash
+
+ qemu-system-x86_64 -enable-kvm \
+ -m 1024 \
+ -kernel arch/x86_64/boot/bzImage \
+ -append 'console=ttyS0' \
+ --nographic
+
+Interspersed in the kernel logs you might see the following:
+
+.. code-block:: none
+
+ TAP version 14
+ # Subtest: example
+ 1..1
+ # example_simple_test: initializing
+ ok 1 - example_simple_test
+ ok 1 - example
+
+Congratulations, you just ran a KUnit test on the x86 architecture!
+
+In a similar manner, kunit and kunit tests can also be built as modules,
+so if you wanted to run tests in this way you might add the following config
+options to your ``.config``:
+
+.. code-block:: none
+
+ CONFIG_KUNIT=m
+ CONFIG_KUNIT_EXAMPLE_TEST=m
+
+Once the kernel is built and installed, a simple
+
+.. code-block:: bash
+
+ modprobe example-test
+
+...will run the tests.
+
+.. note::
+ Note that you should make sure your test depends on ``KUNIT=y`` in Kconfig
+ if the test does not support module build. Otherwise, it will trigger
+ compile errors if ``CONFIG_KUNIT`` is ``m``.
+
+Writing new tests for other architectures
+-----------------------------------------
+
+The first thing you must do is ask yourself whether it is necessary to write a
+KUnit test for a specific architecture, and then whether it is necessary to
+write that test for a particular piece of hardware. In general, writing a test
+that depends on having access to a particular piece of hardware or software (not
+included in the Linux source repo) should be avoided at all costs.
+
+Even if you only ever plan on running your KUnit test on your hardware
+configuration, other people may want to run your tests and may not have access
+to your hardware. If you write your test to run on UML, then anyone can run your
+tests without knowing anything about your particular setup, and you can still
+run your tests on your hardware setup just by compiling for your architecture.
+
+.. important::
+ Always prefer tests that run on UML to tests that only run under a particular
+ architecture, and always prefer tests that run under QEMU or another easy
+ (and monetarily free) to obtain software environment to a specific piece of
+ hardware.
+
+Nevertheless, there are still valid reasons to write an architecture or hardware
+specific test: for example, you might want to test some code that really belongs
+in ``arch/some-arch/*``. Even so, try your best to write the test so that it
+does not depend on physical hardware: if some of your test cases don't need the
+hardware, only require the hardware for tests that actually need it.
+
+Now that you have narrowed down exactly what bits are hardware specific, the
+actual procedure for writing and running the tests is pretty much the same as
+writing normal KUnit tests. One special caveat is that you have to reset
+hardware state in between test cases; if this is not possible, you may only be
+able to run one test case per invocation.
+
+.. TODO(brendanhiggins@google.com): Add an actual example of an architecture-
+ dependent KUnit test.
+
+KUnit debugfs representation
+============================
+When kunit test suites are initialized, they create an associated directory
+in ``/sys/kernel/debug/kunit/<test-suite>``. The directory contains one file
+
+- results: "cat results" displays results of each test case and the results
+ of the entire suite for the last test run.
+
+The debugfs representation is primarily of use when kunit test suites are
+run in a native environment, either as modules or builtin. Having a way
+to display results like this is valuable as otherwise results can be
+intermixed with other events in dmesg output. The maximum size of each
+results file is KUNIT_LOG_SIZE bytes (defined in ``include/kunit/test.h``).
diff --git a/Documentation/dev-tools/sparse.rst b/Documentation/dev-tools/sparse.rst
new file mode 100644
index 000000000..02102be7f
--- /dev/null
+++ b/Documentation/dev-tools/sparse.rst
@@ -0,0 +1,102 @@
+.. Copyright 2004 Linus Torvalds
+.. Copyright 2004 Pavel Machek <pavel@ucw.cz>
+.. Copyright 2006 Bob Copeland <me@bobcopeland.com>
+
+Sparse
+======
+
+Sparse is a semantic checker for C programs; it can be used to find a
+number of potential problems with kernel code. See
+https://lwn.net/Articles/689907/ for an overview of sparse; this document
+contains some kernel-specific sparse information.
+More information on sparse, mainly about its internals, can be found in
+its official pages at https://sparse.docs.kernel.org.
+
+
+Using sparse for typechecking
+-----------------------------
+
+"__bitwise" is a type attribute, so you have to do something like this::
+
+ typedef int __bitwise pm_request_t;
+
+ enum pm_request {
+ PM_SUSPEND = (__force pm_request_t) 1,
+ PM_RESUME = (__force pm_request_t) 2
+ };
+
+which makes PM_SUSPEND and PM_RESUME "bitwise" integers (the "__force" is
+there because sparse will complain about casting to/from a bitwise type,
+but in this case we really _do_ want to force the conversion). And because
+the enum values are all the same type, now "enum pm_request" will be that
+type too.
+
+And with gcc, all the "__bitwise"/"__force stuff" goes away, and it all
+ends up looking just like integers to gcc.
+
+Quite frankly, you don't need the enum there. The above all really just
+boils down to one special "int __bitwise" type.
+
+So the simpler way is to just do::
+
+ typedef int __bitwise pm_request_t;
+
+ #define PM_SUSPEND ((__force pm_request_t) 1)
+ #define PM_RESUME ((__force pm_request_t) 2)
+
+and you now have all the infrastructure needed for strict typechecking.
+
+One small note: the constant integer "0" is special. You can use a
+constant zero as a bitwise integer type without sparse ever complaining.
+This is because "bitwise" (as the name implies) was designed for making
+sure that bitwise types don't get mixed up (little-endian vs big-endian
+vs cpu-endian vs whatever), and there the constant "0" really _is_
+special.
+
+Using sparse for lock checking
+------------------------------
+
+The following macros are undefined for gcc and defined during a sparse
+run to use the "context" tracking feature of sparse, applied to
+locking. These annotations tell sparse when a lock is held, with
+regard to the annotated function's entry and exit.
+
+__must_hold - The specified lock is held on function entry and exit.
+
+__acquires - The specified lock is held on function exit, but not entry.
+
+__releases - The specified lock is held on function entry, but not exit.
+
+If the function enters and exits without the lock held, acquiring and
+releasing the lock inside the function in a balanced way, no
+annotation is needed. The three annotations above are for cases where
+sparse would otherwise report a context imbalance.
+
+Getting sparse
+--------------
+
+You can get tarballs of the latest released versions from:
+https://www.kernel.org/pub/software/devel/sparse/dist/
+
+Alternatively, you can get snapshots of the latest development version
+of sparse using git to clone::
+
+ git://git.kernel.org/pub/scm/devel/sparse/sparse.git
+
+Once you have it, just do::
+
+ make
+ make install
+
+as a regular user, and it will install sparse in your ~/bin directory.
+
+Using sparse
+------------
+
+Do a kernel make with "make C=1" to run sparse on all the C files that get
+recompiled, or use "make C=2" to run sparse on the files whether they need to
+be recompiled or not. The latter is a fast way to check the whole tree if you
+have already built it.
+
+The optional make variable CF can be used to pass arguments to sparse. The
+build system passes -Wbitwise to sparse automatically.
diff --git a/Documentation/dev-tools/ubsan.rst b/Documentation/dev-tools/ubsan.rst
new file mode 100644
index 000000000..655e6b63c
--- /dev/null
+++ b/Documentation/dev-tools/ubsan.rst
@@ -0,0 +1,88 @@
+The Undefined Behavior Sanitizer - UBSAN
+========================================
+
+UBSAN is a runtime undefined behaviour checker.
+
+UBSAN uses compile-time instrumentation to catch undefined behavior (UB).
+Compiler inserts code that perform certain kinds of checks before operations
+that may cause UB. If check fails (i.e. UB detected) __ubsan_handle_*
+function called to print error message.
+
+GCC has that feature since 4.9.x [1_] (see ``-fsanitize=undefined`` option and
+its suboptions). GCC 5.x has more checkers implemented [2_].
+
+Report example
+--------------
+
+::
+
+ ================================================================================
+ UBSAN: Undefined behaviour in ../include/linux/bitops.h:110:33
+ shift exponent 32 is to large for 32-bit type 'unsigned int'
+ CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.0-rc1+ #26
+ 0000000000000000 ffffffff82403cc8 ffffffff815e6cd6 0000000000000001
+ ffffffff82403cf8 ffffffff82403ce0 ffffffff8163a5ed 0000000000000020
+ ffffffff82403d78 ffffffff8163ac2b ffffffff815f0001 0000000000000002
+ Call Trace:
+ [<ffffffff815e6cd6>] dump_stack+0x45/0x5f
+ [<ffffffff8163a5ed>] ubsan_epilogue+0xd/0x40
+ [<ffffffff8163ac2b>] __ubsan_handle_shift_out_of_bounds+0xeb/0x130
+ [<ffffffff815f0001>] ? radix_tree_gang_lookup_slot+0x51/0x150
+ [<ffffffff8173c586>] _mix_pool_bytes+0x1e6/0x480
+ [<ffffffff83105653>] ? dmi_walk_early+0x48/0x5c
+ [<ffffffff8173c881>] add_device_randomness+0x61/0x130
+ [<ffffffff83105b35>] ? dmi_save_one_device+0xaa/0xaa
+ [<ffffffff83105653>] dmi_walk_early+0x48/0x5c
+ [<ffffffff831066ae>] dmi_scan_machine+0x278/0x4b4
+ [<ffffffff8111d58a>] ? vprintk_default+0x1a/0x20
+ [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120
+ [<ffffffff830b2240>] setup_arch+0x405/0xc2c
+ [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120
+ [<ffffffff830ae053>] start_kernel+0x83/0x49a
+ [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120
+ [<ffffffff830ad386>] x86_64_start_reservations+0x2a/0x2c
+ [<ffffffff830ad4f3>] x86_64_start_kernel+0x16b/0x17a
+ ================================================================================
+
+Usage
+-----
+
+To enable UBSAN configure kernel with::
+
+ CONFIG_UBSAN=y
+
+and to check the entire kernel::
+
+ CONFIG_UBSAN_SANITIZE_ALL=y
+
+To enable instrumentation for specific files or directories, add a line
+similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)::
+
+ UBSAN_SANITIZE_main.o := y
+
+- For all files in one directory::
+
+ UBSAN_SANITIZE := y
+
+To exclude files from being instrumented even if
+``CONFIG_UBSAN_SANITIZE_ALL=y``, use::
+
+ UBSAN_SANITIZE_main.o := n
+
+and::
+
+ UBSAN_SANITIZE := n
+
+Detection of unaligned accesses controlled through the separate option -
+CONFIG_UBSAN_ALIGNMENT. It's off by default on architectures that support
+unaligned accesses (CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y). One could
+still enable it in config, just note that it will produce a lot of UBSAN
+reports.
+
+References
+----------
+
+.. _1: https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html
+.. _2: https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html