diff options
Diffstat (limited to '')
31 files changed, 9448 insertions, 0 deletions
diff --git a/Documentation/dev-tools/checkpatch.rst b/Documentation/dev-tools/checkpatch.rst new file mode 100644 index 0000000000..c3389c6f38 --- /dev/null +++ b/Documentation/dev-tools/checkpatch.rst @@ -0,0 +1,1256 @@ +.. SPDX-License-Identifier: GPL-2.0-only + +========== +Checkpatch +========== + +Checkpatch (scripts/checkpatch.pl) is a perl script which checks for trivial +style violations in patches and optionally corrects them. Checkpatch can +also be run on file contexts and without the kernel tree. + +Checkpatch is not always right. Your judgement takes precedence over checkpatch +messages. If your code looks better with the violations, then its probably +best left alone. + + +Options +======= + +This section will describe the options checkpatch can be run with. + +Usage:: + + ./scripts/checkpatch.pl [OPTION]... [FILE]... + +Available options: + + - -q, --quiet + + Enable quiet mode. + + - -v, --verbose + Enable verbose mode. Additional verbose test descriptions are output + so as to provide information on why that particular message is shown. + + - --no-tree + + Run checkpatch without the kernel tree. + + - --no-signoff + + Disable the 'Signed-off-by' line check. The sign-off is a simple line at + the end of the explanation for the patch, which certifies that you wrote it + or otherwise have the right to pass it on as an open-source patch. + + Example:: + + Signed-off-by: Random J Developer <random@developer.example.org> + + Setting this flag effectively stops a message for a missing signed-off-by + line in a patch context. + + - --patch + + Treat FILE as a patch. This is the default option and need not be + explicitly specified. + + - --emacs + + Set output to emacs compile window format. This allows emacs users to jump + from the error in the compile window directly to the offending line in the + patch. + + - --terse + + Output only one line per report. + + - --showfile + + Show the diffed file position instead of the input file position. + + - -g, --git + + Treat FILE as a single commit or a git revision range. + + Single commit with: + + - <rev> + - <rev>^ + - <rev>~n + + Multiple commits with: + + - <rev1>..<rev2> + - <rev1>...<rev2> + - <rev>-<count> + + - -f, --file + + Treat FILE as a regular source file. This option must be used when running + checkpatch on source files in the kernel. + + - --subjective, --strict + + Enable stricter tests in checkpatch. By default the tests emitted as CHECK + do not activate by default. Use this flag to activate the CHECK tests. + + - --list-types + + Every message emitted by checkpatch has an associated TYPE. Add this flag + to display all the types in checkpatch. + + Note that when this flag is active, checkpatch does not read the input FILE, + and no message is emitted. Only a list of types in checkpatch is output. + + - --types TYPE(,TYPE2...) + + Only display messages with the given types. + + Example:: + + ./scripts/checkpatch.pl mypatch.patch --types EMAIL_SUBJECT,BRACES + + - --ignore TYPE(,TYPE2...) + + Checkpatch will not emit messages for the specified types. + + Example:: + + ./scripts/checkpatch.pl mypatch.patch --ignore EMAIL_SUBJECT,BRACES + + - --show-types + + By default checkpatch doesn't display the type associated with the messages. + Set this flag to show the message type in the output. + + - --max-line-length=n + + Set the max line length (default 100). If a line exceeds the specified + length, a LONG_LINE message is emitted. + + + The message level is different for patch and file contexts. For patches, + a WARNING is emitted. While a milder CHECK is emitted for files. So for + file contexts, the --strict flag must also be enabled. + + - --min-conf-desc-length=n + + Set the Kconfig entry minimum description length, if shorter, warn. + + - --tab-size=n + + Set the number of spaces for tab (default 8). + + - --root=PATH + + PATH to the kernel tree root. + + This option must be specified when invoking checkpatch from outside + the kernel root. + + - --no-summary + + Suppress the per file summary. + + - --mailback + + Only produce a report in case of Warnings or Errors. Milder Checks are + excluded from this. + + - --summary-file + + Include the filename in summary. + + - --debug KEY=[0|1] + + Turn on/off debugging of KEY, where KEY is one of 'values', 'possible', + 'type', and 'attr' (default is all off). + + - --fix + + This is an EXPERIMENTAL feature. If correctable errors exists, a file + <inputfile>.EXPERIMENTAL-checkpatch-fixes is created which has the + automatically fixable errors corrected. + + - --fix-inplace + + EXPERIMENTAL - Similar to --fix but input file is overwritten with fixes. + + DO NOT USE this flag unless you are absolutely sure and you have a backup + in place. + + - --ignore-perl-version + + Override checking of perl version. Runtime errors maybe encountered after + enabling this flag if the perl version does not meet the minimum specified. + + - --codespell + + Use the codespell dictionary for checking spelling errors. + + - --codespellfile + + Use the specified codespell file. + Default is '/usr/share/codespell/dictionary.txt'. + + - --typedefsfile + + Read additional types from this file. + + - --color[=WHEN] + + Use colors 'always', 'never', or only when output is a terminal ('auto'). + Default is 'auto'. + + - --kconfig-prefix=WORD + + Use WORD as a prefix for Kconfig symbols (default is `CONFIG_`). + + - -h, --help, --version + + Display the help text. + +Message Levels +============== + +Messages in checkpatch are divided into three levels. The levels of messages +in checkpatch denote the severity of the error. They are: + + - ERROR + + This is the most strict level. Messages of type ERROR must be taken + seriously as they denote things that are very likely to be wrong. + + - WARNING + + This is the next stricter level. Messages of type WARNING requires a + more careful review. But it is milder than an ERROR. + + - CHECK + + This is the mildest level. These are things which may require some thought. + +Type Descriptions +================= + +This section contains a description of all the message types in checkpatch. + +.. Types in this section are also parsed by checkpatch. +.. The types are grouped into subsections based on use. + + +Allocation style +---------------- + + **ALLOC_ARRAY_ARGS** + The first argument for kcalloc or kmalloc_array should be the + number of elements. sizeof() as the first argument is generally + wrong. + + See: https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html + + **ALLOC_SIZEOF_STRUCT** + The allocation style is bad. In general for family of + allocation functions using sizeof() to get memory size, + constructs like:: + + p = alloc(sizeof(struct foo), ...) + + should be:: + + p = alloc(sizeof(*p), ...) + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#allocating-memory + + **ALLOC_WITH_MULTIPLY** + Prefer kmalloc_array/kcalloc over kmalloc/kzalloc with a + sizeof multiply. + + See: https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html + + +API usage +--------- + + **ARCH_DEFINES** + Architecture specific defines should be avoided wherever + possible. + + **ARCH_INCLUDE_LINUX** + Whenever asm/file.h is included and linux/file.h exists, a + conversion can be made when linux/file.h includes asm/file.h. + However this is not always the case (See signal.h). + This message type is emitted only for includes from arch/. + + **AVOID_BUG** + BUG() or BUG_ON() should be avoided totally. + Use WARN() and WARN_ON() instead, and handle the "impossible" + error condition as gracefully as possible. + + See: https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on + + **CONSIDER_KSTRTO** + The simple_strtol(), simple_strtoll(), simple_strtoul(), and + simple_strtoull() functions explicitly ignore overflows, which + may lead to unexpected results in callers. The respective kstrtol(), + kstrtoll(), kstrtoul(), and kstrtoull() functions tend to be the + correct replacements. + + See: https://www.kernel.org/doc/html/latest/process/deprecated.html#simple-strtol-simple-strtoll-simple-strtoul-simple-strtoull + + **CONSTANT_CONVERSION** + Use of __constant_<foo> form is discouraged for the following functions:: + + __constant_cpu_to_be[x] + __constant_cpu_to_le[x] + __constant_be[x]_to_cpu + __constant_le[x]_to_cpu + __constant_htons + __constant_ntohs + + Using any of these outside of include/uapi/ is not preferred as using the + function without __constant_ is identical when the argument is a + constant. + + In big endian systems, the macros like __constant_cpu_to_be32(x) and + cpu_to_be32(x) expand to the same expression:: + + #define __constant_cpu_to_be32(x) ((__force __be32)(__u32)(x)) + #define __cpu_to_be32(x) ((__force __be32)(__u32)(x)) + + In little endian systems, the macros __constant_cpu_to_be32(x) and + cpu_to_be32(x) expand to __constant_swab32 and __swab32. __swab32 + has a __builtin_constant_p check:: + + #define __swab32(x) \ + (__builtin_constant_p((__u32)(x)) ? \ + ___constant_swab32(x) : \ + __fswab32(x)) + + So ultimately they have a special case for constants. + Similar is the case with all of the macros in the list. Thus + using the __constant_... forms are unnecessarily verbose and + not preferred outside of include/uapi. + + See: https://lore.kernel.org/lkml/1400106425.12666.6.camel@joe-AO725/ + + **DEPRECATED_API** + Usage of a deprecated RCU API is detected. It is recommended to replace + old flavourful RCU APIs by their new vanilla-RCU counterparts. + + The full list of available RCU APIs can be viewed from the kernel docs. + + See: https://www.kernel.org/doc/html/latest/RCU/whatisRCU.html#full-list-of-rcu-apis + + **DEPRECATED_VARIABLE** + EXTRA_{A,C,CPP,LD}FLAGS are deprecated and should be replaced by the new + flags added via commit f77bf01425b1 ("kbuild: introduce ccflags-y, + asflags-y and ldflags-y"). + + The following conversion scheme maybe used:: + + EXTRA_AFLAGS -> asflags-y + EXTRA_CFLAGS -> ccflags-y + EXTRA_CPPFLAGS -> cppflags-y + EXTRA_LDFLAGS -> ldflags-y + + See: + + 1. https://lore.kernel.org/lkml/20070930191054.GA15876@uranus.ravnborg.org/ + 2. https://lore.kernel.org/lkml/1313384834-24433-12-git-send-email-lacombar@gmail.com/ + 3. https://www.kernel.org/doc/html/latest/kbuild/makefiles.html#compilation-flags + + **DEVICE_ATTR_FUNCTIONS** + The function names used in DEVICE_ATTR is unusual. + Typically, the store and show functions are used with <attr>_store and + <attr>_show, where <attr> is a named attribute variable of the device. + + Consider the following examples:: + + static DEVICE_ATTR(type, 0444, type_show, NULL); + static DEVICE_ATTR(power, 0644, power_show, power_store); + + The function names should preferably follow the above pattern. + + See: https://www.kernel.org/doc/html/latest/driver-api/driver-model/device.html#attributes + + **DEVICE_ATTR_RO** + The DEVICE_ATTR_RO(name) helper macro can be used instead of + DEVICE_ATTR(name, 0444, name_show, NULL); + + Note that the macro automatically appends _show to the named + attribute variable of the device for the show method. + + See: https://www.kernel.org/doc/html/latest/driver-api/driver-model/device.html#attributes + + **DEVICE_ATTR_RW** + The DEVICE_ATTR_RW(name) helper macro can be used instead of + DEVICE_ATTR(name, 0644, name_show, name_store); + + Note that the macro automatically appends _show and _store to the + named attribute variable of the device for the show and store methods. + + See: https://www.kernel.org/doc/html/latest/driver-api/driver-model/device.html#attributes + + **DEVICE_ATTR_WO** + The DEVICE_AATR_WO(name) helper macro can be used instead of + DEVICE_ATTR(name, 0200, NULL, name_store); + + Note that the macro automatically appends _store to the + named attribute variable of the device for the store method. + + See: https://www.kernel.org/doc/html/latest/driver-api/driver-model/device.html#attributes + + **DUPLICATED_SYSCTL_CONST** + Commit d91bff3011cf ("proc/sysctl: add shared variables for range + check") added some shared const variables to be used instead of a local + copy in each source file. + + Consider replacing the sysctl range checking value with the shared + one in include/linux/sysctl.h. The following conversion scheme may + be used:: + + &zero -> SYSCTL_ZERO + &one -> SYSCTL_ONE + &int_max -> SYSCTL_INT_MAX + + See: + + 1. https://lore.kernel.org/lkml/20190430180111.10688-1-mcroce@redhat.com/ + 2. https://lore.kernel.org/lkml/20190531131422.14970-1-mcroce@redhat.com/ + + **ENOSYS** + ENOSYS means that a nonexistent system call was called. + Earlier, it was wrongly used for things like invalid operations on + otherwise valid syscalls. This should be avoided in new code. + + See: https://lore.kernel.org/lkml/5eb299021dec23c1a48fa7d9f2c8b794e967766d.1408730669.git.luto@amacapital.net/ + + **ENOTSUPP** + ENOTSUPP is not a standard error code and should be avoided in new patches. + EOPNOTSUPP should be used instead. + + See: https://lore.kernel.org/netdev/20200510182252.GA411829@lunn.ch/ + + **EXPORT_SYMBOL** + EXPORT_SYMBOL should immediately follow the symbol to be exported. + + **IN_ATOMIC** + in_atomic() is not for driver use so any such use is reported as an ERROR. + Also in_atomic() is often used to determine if sleeping is permitted, + but it is not reliable in this use model. Therefore its use is + strongly discouraged. + + However, in_atomic() is ok for core kernel use. + + See: https://lore.kernel.org/lkml/20080320201723.b87b3732.akpm@linux-foundation.org/ + + **LOCKDEP** + The lockdep_no_validate class was added as a temporary measure to + prevent warnings on conversion of device->sem to device->mutex. + It should not be used for any other purpose. + + See: https://lore.kernel.org/lkml/1268959062.9440.467.camel@laptop/ + + **MALFORMED_INCLUDE** + The #include statement has a malformed path. This has happened + because the author has included a double slash "//" in the pathname + accidentally. + + **USE_LOCKDEP** + lockdep_assert_held() annotations should be preferred over + assertions based on spin_is_locked() + + See: https://www.kernel.org/doc/html/latest/locking/lockdep-design.html#annotations + + **UAPI_INCLUDE** + No #include statements in include/uapi should use a uapi/ path. + + **USLEEP_RANGE** + usleep_range() should be preferred over udelay(). The proper way of + using usleep_range() is mentioned in the kernel docs. + + See: https://www.kernel.org/doc/html/latest/timers/timers-howto.html#delays-information-on-the-various-kernel-delay-sleep-mechanisms + + +Comments +-------- + + **BLOCK_COMMENT_STYLE** + The comment style is incorrect. The preferred style for multi- + line comments is:: + + /* + * This is the preferred style + * for multi line comments. + */ + + The networking comment style is a bit different, with the first line + not empty like the former:: + + /* This is the preferred comment style + * for files in net/ and drivers/net/ + */ + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#commenting + + **C99_COMMENTS** + C99 style single line comments (//) should not be used. + Prefer the block comment style instead. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#commenting + + **DATA_RACE** + Applications of data_race() should have a comment so as to document the + reasoning behind why it was deemed safe. + + See: https://lore.kernel.org/lkml/20200401101714.44781-1-elver@google.com/ + + **FSF_MAILING_ADDRESS** + Kernel maintainers reject new instances of the GPL boilerplate paragraph + directing people to write to the FSF for a copy of the GPL, since the + FSF has moved in the past and may do so again. + So do not write paragraphs about writing to the Free Software Foundation's + mailing address. + + See: https://lore.kernel.org/lkml/20131006222342.GT19510@leaf/ + + +Commit message +-------------- + + **BAD_SIGN_OFF** + The signed-off-by line does not fall in line with the standards + specified by the community. + + See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#developer-s-certificate-of-origin-1-1 + + **BAD_STABLE_ADDRESS_STYLE** + The email format for stable is incorrect. + Some valid options for stable address are:: + + 1. stable@vger.kernel.org + 2. stable@kernel.org + + For adding version info, the following comment style should be used:: + + stable@vger.kernel.org # version info + + **COMMIT_COMMENT_SYMBOL** + Commit log lines starting with a '#' are ignored by git as + comments. To solve this problem addition of a single space + infront of the log line is enough. + + **COMMIT_MESSAGE** + The patch is missing a commit description. A brief + description of the changes made by the patch should be added. + + See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes + + **EMAIL_SUBJECT** + Naming the tool that found the issue is not very useful in the + subject line. A good subject line summarizes the change that + the patch brings. + + See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes + + **FROM_SIGN_OFF_MISMATCH** + The author's email does not match with that in the Signed-off-by: + line(s). This can be sometimes caused due to an improperly configured + email client. + + This message is emitted due to any of the following reasons:: + + - The email names do not match. + - The email addresses do not match. + - The email subaddresses do not match. + - The email comments do not match. + + **MISSING_SIGN_OFF** + The patch is missing a Signed-off-by line. A signed-off-by + line should be added according to Developer's certificate of + Origin. + + See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin + + **NO_AUTHOR_SIGN_OFF** + The author of the patch has not signed off the patch. It is + required that a simple sign off line should be present at the + end of explanation of the patch to denote that the author has + written it or otherwise has the rights to pass it on as an open + source patch. + + See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin + + **DIFF_IN_COMMIT_MSG** + Avoid having diff content in commit message. + This causes problems when one tries to apply a file containing both + the changelog and the diff because patch(1) tries to apply the diff + which it found in the changelog. + + See: https://lore.kernel.org/lkml/20150611134006.9df79a893e3636019ad2759e@linux-foundation.org/ + + **GERRIT_CHANGE_ID** + To be picked up by gerrit, the footer of the commit message might + have a Change-Id like:: + + Change-Id: Ic8aaa0728a43936cd4c6e1ed590e01ba8f0fbf5b + Signed-off-by: A. U. Thor <author@example.com> + + The Change-Id line must be removed before submitting. + + **GIT_COMMIT_ID** + The proper way to reference a commit id is: + commit <12+ chars of sha1> ("<title line>") + + An example may be:: + + Commit e21d2170f36602ae2708 ("video: remove unnecessary + platform_set_drvdata()") removed the unnecessary + platform_set_drvdata(), but left the variable "dev" unused, + delete it. + + See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes + + **BAD_FIXES_TAG** + The Fixes: tag is malformed or does not follow the community conventions. + This can occur if the tag have been split into multiple lines (e.g., when + pasted in an email program with word wrapping enabled). + + See: https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes + + +Comparison style +---------------- + + **ASSIGN_IN_IF** + Do not use assignments in if condition. + Example:: + + if ((foo = bar(...)) < BAZ) { + + should be written as:: + + foo = bar(...); + if (foo < BAZ) { + + **BOOL_COMPARISON** + Comparisons of A to true and false are better written + as A and !A. + + See: https://lore.kernel.org/lkml/1365563834.27174.12.camel@joe-AO722/ + + **COMPARISON_TO_NULL** + Comparisons to NULL in the form (foo == NULL) or (foo != NULL) + are better written as (!foo) and (foo). + + **CONSTANT_COMPARISON** + Comparisons with a constant or upper case identifier on the left + side of the test should be avoided. + + +Indentation and Line Breaks +--------------------------- + + **CODE_INDENT** + Code indent should use tabs instead of spaces. + Outside of comments, documentation and Kconfig, + spaces are never used for indentation. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#indentation + + **DEEP_INDENTATION** + Indentation with 6 or more tabs usually indicate overly indented + code. + + It is suggested to refactor excessive indentation of + if/else/for/do/while/switch statements. + + See: https://lore.kernel.org/lkml/1328311239.21255.24.camel@joe2Laptop/ + + **SWITCH_CASE_INDENT_LEVEL** + switch should be at the same indent as case. + Example:: + + switch (suffix) { + case 'G': + case 'g': + mem <<= 30; + break; + case 'M': + case 'm': + mem <<= 20; + break; + case 'K': + case 'k': + mem <<= 10; + fallthrough; + default: + break; + } + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#indentation + + **LONG_LINE** + The line has exceeded the specified maximum length. + To use a different maximum line length, the --max-line-length=n option + may be added while invoking checkpatch. + + Earlier, the default line length was 80 columns. Commit bdc48fa11e46 + ("checkpatch/coding-style: deprecate 80-column warning") increased the + limit to 100 columns. This is not a hard limit either and it's + preferable to stay within 80 columns whenever possible. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#breaking-long-lines-and-strings + + **LONG_LINE_STRING** + A string starts before but extends beyond the maximum line length. + To use a different maximum line length, the --max-line-length=n option + may be added while invoking checkpatch. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#breaking-long-lines-and-strings + + **LONG_LINE_COMMENT** + A comment starts before but extends beyond the maximum line length. + To use a different maximum line length, the --max-line-length=n option + may be added while invoking checkpatch. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#breaking-long-lines-and-strings + + **SPLIT_STRING** + Quoted strings that appear as messages in userspace and can be + grepped, should not be split across multiple lines. + + See: https://lore.kernel.org/lkml/20120203052727.GA15035@leaf/ + + **MULTILINE_DEREFERENCE** + A single dereferencing identifier spanned on multiple lines like:: + + struct_identifier->member[index]. + member = <foo>; + + is generally hard to follow. It can easily lead to typos and so makes + the code vulnerable to bugs. + + If fixing the multiple line dereferencing leads to an 80 column + violation, then either rewrite the code in a more simple way or if the + starting part of the dereferencing identifier is the same and used at + multiple places then store it in a temporary variable, and use that + temporary variable only at all the places. For example, if there are + two dereferencing identifiers:: + + member1->member2->member3.foo1; + member1->member2->member3.foo2; + + then store the member1->member2->member3 part in a temporary variable. + It not only helps to avoid the 80 column violation but also reduces + the program size by removing the unnecessary dereferences. + + But if none of the above methods work then ignore the 80 column + violation because it is much easier to read a dereferencing identifier + on a single line. + + **TRAILING_STATEMENTS** + Trailing statements (for example after any conditional) should be + on the next line. + Statements, such as:: + + if (x == y) break; + + should be:: + + if (x == y) + break; + + +Macros, Attributes and Symbols +------------------------------ + + **ARRAY_SIZE** + The ARRAY_SIZE(foo) macro should be preferred over + sizeof(foo)/sizeof(foo[0]) for finding number of elements in an + array. + + The macro is defined in include/linux/kernel.h:: + + #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + + **AVOID_EXTERNS** + Function prototypes don't need to be declared extern in .h + files. It's assumed by the compiler and is unnecessary. + + **AVOID_L_PREFIX** + Local symbol names that are prefixed with `.L` should be avoided, + as this has special meaning for the assembler; a symbol entry will + not be emitted into the symbol table. This can prevent `objtool` + from generating correct unwind info. + + Symbols with STB_LOCAL binding may still be used, and `.L` prefixed + local symbol names are still generally usable within a function, + but `.L` prefixed local symbol names should not be used to denote + the beginning or end of code regions via + `SYM_CODE_START_LOCAL`/`SYM_CODE_END` + + **BIT_MACRO** + Defines like: 1 << <digit> could be BIT(digit). + The BIT() macro is defined via include/linux/bits.h:: + + #define BIT(nr) (1UL << (nr)) + + **CONST_READ_MOSTLY** + When a variable is tagged with the __read_mostly annotation, it is a + signal to the compiler that accesses to the variable will be mostly + reads and rarely(but NOT never) a write. + + const __read_mostly does not make any sense as const data is already + read-only. The __read_mostly annotation thus should be removed. + + **DATE_TIME** + It is generally desirable that building the same source code with + the same set of tools is reproducible, i.e. the output is always + exactly the same. + + The kernel does *not* use the ``__DATE__`` and ``__TIME__`` macros, + and enables warnings if they are used as they can lead to + non-deterministic builds. + + See: https://www.kernel.org/doc/html/latest/kbuild/reproducible-builds.html#timestamps + + **DEFINE_ARCH_HAS** + The ARCH_HAS_xyz and ARCH_HAVE_xyz patterns are wrong. + + For big conceptual features use Kconfig symbols instead. And for + smaller things where we have compatibility fallback functions but + want architectures able to override them with optimized ones, we + should either use weak functions (appropriate for some cases), or + the symbol that protects them should be the same symbol we use. + + See: https://lore.kernel.org/lkml/CA+55aFycQ9XJvEOsiM3txHL5bjUc8CeKWJNR_H+MiicaddB42Q@mail.gmail.com/ + + **DO_WHILE_MACRO_WITH_TRAILING_SEMICOLON** + do {} while(0) macros should not have a trailing semicolon. + + **INIT_ATTRIBUTE** + Const init definitions should use __initconst instead of + __initdata. + + Similarly init definitions without const require a separate + use of const. + + **INLINE_LOCATION** + The inline keyword should sit between storage class and type. + + For example, the following segment:: + + inline static int example_function(void) + { + ... + } + + should be:: + + static inline int example_function(void) + { + ... + } + + **MISPLACED_INIT** + It is possible to use section markers on variables in a way + which gcc doesn't understand (or at least not the way the + developer intended):: + + static struct __initdata samsung_pll_clock exynos4_plls[nr_plls] = { + + does not put exynos4_plls in the .initdata section. The __initdata + marker can be virtually anywhere on the line, except right after + "struct". The preferred location is before the "=" sign if there is + one, or before the trailing ";" otherwise. + + See: https://lore.kernel.org/lkml/1377655732.3619.19.camel@joe-AO722/ + + **MULTISTATEMENT_MACRO_USE_DO_WHILE** + Macros with multiple statements should be enclosed in a + do - while block. Same should also be the case for macros + starting with `if` to avoid logic defects:: + + #define macrofun(a, b, c) \ + do { \ + if (a == 5) \ + do_this(b, c); \ + } while (0) + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#macros-enums-and-rtl + + **PREFER_FALLTHROUGH** + Use the `fallthrough;` pseudo keyword instead of + `/* fallthrough */` like comments. + + **TRAILING_SEMICOLON** + Macro definition should not end with a semicolon. The macro + invocation style should be consistent with function calls. + This can prevent any unexpected code paths:: + + #define MAC do_something; + + If this macro is used within a if else statement, like:: + + if (some_condition) + MAC; + + else + do_something; + + Then there would be a compilation error, because when the macro is + expanded there are two trailing semicolons, so the else branch gets + orphaned. + + See: https://lore.kernel.org/lkml/1399671106.2912.21.camel@joe-AO725/ + + **SINGLE_STATEMENT_DO_WHILE_MACRO** + For the multi-statement macros, it is necessary to use the do-while + loop to avoid unpredictable code paths. The do-while loop helps to + group the multiple statements into a single one so that a + function-like macro can be used as a function only. + + But for the single statement macros, it is unnecessary to use the + do-while loop. Although the code is syntactically correct but using + the do-while loop is redundant. So remove the do-while loop for single + statement macros. + + **WEAK_DECLARATION** + Using weak declarations like __attribute__((weak)) or __weak + can have unintended link defects. Avoid using them. + + +Functions and Variables +----------------------- + + **CAMELCASE** + Avoid CamelCase Identifiers. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#naming + + **CONST_CONST** + Using `const <type> const *` is generally meant to be + written `const <type> * const`. + + **CONST_STRUCT** + Using const is generally a good idea. Checkpatch reads + a list of frequently used structs that are always or + almost always constant. + + The existing structs list can be viewed from + `scripts/const_structs.checkpatch`. + + See: https://lore.kernel.org/lkml/alpine.DEB.2.10.1608281509480.3321@hadrien/ + + **EMBEDDED_FUNCTION_NAME** + Embedded function names are less appropriate to use as + refactoring can cause function renaming. Prefer the use of + "%s", __func__ to embedded function names. + + Note that this does not work with -f (--file) checkpatch option + as it depends on patch context providing the function name. + + **FUNCTION_ARGUMENTS** + This warning is emitted due to any of the following reasons: + + 1. Arguments for the function declaration do not follow + the identifier name. Example:: + + void foo + (int bar, int baz) + + This should be corrected to:: + + void foo(int bar, int baz) + + 2. Some arguments for the function definition do not + have an identifier name. Example:: + + void foo(int) + + All arguments should have identifier names. + + **FUNCTION_WITHOUT_ARGS** + Function declarations without arguments like:: + + int foo() + + should be:: + + int foo(void) + + **GLOBAL_INITIALISERS** + Global variables should not be initialized explicitly to + 0 (or NULL, false, etc.). Your compiler (or rather your + loader, which is responsible for zeroing out the relevant + sections) automatically does it for you. + + **INITIALISED_STATIC** + Static variables should not be initialized explicitly to zero. + Your compiler (or rather your loader) automatically does + it for you. + + **MULTIPLE_ASSIGNMENTS** + Multiple assignments on a single line makes the code unnecessarily + complicated. So on a single line assign value to a single variable + only, this makes the code more readable and helps avoid typos. + + **RETURN_PARENTHESES** + return is not a function and as such doesn't need parentheses:: + + return (bar); + + can simply be:: + + return bar; + + +Permissions +----------- + + **DEVICE_ATTR_PERMS** + The permissions used in DEVICE_ATTR are unusual. + Typically only three permissions are used - 0644 (RW), 0444 (RO) + and 0200 (WO). + + See: https://www.kernel.org/doc/html/latest/filesystems/sysfs.html#attributes + + **EXECUTE_PERMISSIONS** + There is no reason for source files to be executable. The executable + bit can be removed safely. + + **EXPORTED_WORLD_WRITABLE** + Exporting world writable sysfs/debugfs files is usually a bad thing. + When done arbitrarily they can introduce serious security bugs. + In the past, some of the debugfs vulnerabilities would seemingly allow + any local user to write arbitrary values into device registers - a + situation from which little good can be expected to emerge. + + See: https://lore.kernel.org/linux-arm-kernel/cover.1296818921.git.segoon@openwall.com/ + + **NON_OCTAL_PERMISSIONS** + Permission bits should use 4 digit octal permissions (like 0700 or 0444). + Avoid using any other base like decimal. + + **SYMBOLIC_PERMS** + Permission bits in the octal form are more readable and easier to + understand than their symbolic counterparts because many command-line + tools use this notation. Experienced kernel developers have been using + these traditional Unix permission bits for decades and so they find it + easier to understand the octal notation than the symbolic macros. + For example, it is harder to read S_IWUSR|S_IRUGO than 0644, which + obscures the developer's intent rather than clarifying it. + + See: https://lore.kernel.org/lkml/CA+55aFw5v23T-zvDZp-MmD_EYxF8WbafwwB59934FV7g21uMGQ@mail.gmail.com/ + + +Spacing and Brackets +-------------------- + + **ASSIGNMENT_CONTINUATIONS** + Assignment operators should not be written at the start of a + line but should follow the operand at the previous line. + + **BRACES** + The placement of braces is stylistically incorrect. + The preferred way is to put the opening brace last on the line, + and put the closing brace first:: + + if (x is true) { + we do y + } + + This applies for all non-functional blocks. + However, there is one special case, namely functions: they have the + opening brace at the beginning of the next line, thus:: + + int function(int x) + { + body of function + } + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#placing-braces-and-spaces + + **BRACKET_SPACE** + Whitespace before opening bracket '[' is prohibited. + There are some exceptions: + + 1. With a type on the left:: + + int [] a; + + 2. At the beginning of a line for slice initialisers:: + + [0...10] = 5, + + 3. Inside a curly brace:: + + = { [0...10] = 5 } + + **CONCATENATED_STRING** + Concatenated elements should have a space in between. + Example:: + + printk(KERN_INFO"bar"); + + should be:: + + printk(KERN_INFO "bar"); + + **ELSE_AFTER_BRACE** + `else {` should follow the closing block `}` on the same line. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#placing-braces-and-spaces + + **LINE_SPACING** + Vertical space is wasted given the limited number of lines an + editor window can display when multiple blank lines are used. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#spaces + + **OPEN_BRACE** + The opening brace should be following the function definitions on the + next line. For any non-functional block it should be on the same line + as the last construct. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#placing-braces-and-spaces + + **POINTER_LOCATION** + When using pointer data or a function that returns a pointer type, + the preferred use of * is adjacent to the data name or function name + and not adjacent to the type name. + Examples:: + + char *linux_banner; + unsigned long long memparse(char *ptr, char **retptr); + char *match_strdup(substring_t *s); + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#spaces + + **SPACING** + Whitespace style used in the kernel sources is described in kernel docs. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#spaces + + **TRAILING_WHITESPACE** + Trailing whitespace should always be removed. + Some editors highlight the trailing whitespace and cause visual + distractions when editing files. + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#spaces + + **UNNECESSARY_PARENTHESES** + Parentheses are not required in the following cases: + + 1. Function pointer uses:: + + (foo->bar)(); + + could be:: + + foo->bar(); + + 2. Comparisons in if:: + + if ((foo->bar) && (foo->baz)) + if ((foo == bar)) + + could be:: + + if (foo->bar && foo->baz) + if (foo == bar) + + 3. addressof/dereference single Lvalues:: + + &(foo->bar) + *(foo->bar) + + could be:: + + &foo->bar + *foo->bar + + **WHILE_AFTER_BRACE** + while should follow the closing bracket on the same line:: + + do { + ... + } while(something); + + See: https://www.kernel.org/doc/html/latest/process/coding-style.html#placing-braces-and-spaces + + +Others +------ + + **CONFIG_DESCRIPTION** + Kconfig symbols should have a help text which fully describes + it. + + **CORRUPTED_PATCH** + The patch seems to be corrupted or lines are wrapped. + Please regenerate the patch file before sending it to the maintainer. + + **CVS_KEYWORD** + Since linux moved to git, the CVS markers are no longer used. + So, CVS style keywords ($Id$, $Revision$, $Log$) should not be + added. + + **DEFAULT_NO_BREAK** + switch default case is sometimes written as "default:;". This can + cause new cases added below default to be defective. + + A "break;" should be added after empty default statement to avoid + unwanted fallthrough. + + **DOS_LINE_ENDINGS** + For DOS-formatted patches, there are extra ^M symbols at the end of + the line. These should be removed. + + **DT_SCHEMA_BINDING_PATCH** + DT bindings moved to a json-schema based format instead of + freeform text. + + See: https://www.kernel.org/doc/html/latest/devicetree/bindings/writing-schema.html + + **DT_SPLIT_BINDING_PATCH** + Devicetree bindings should be their own patch. This is because + bindings are logically independent from a driver implementation, + they have a different maintainer (even though they often + are applied via the same tree), and it makes for a cleaner history in the + DT only tree created with git-filter-branch. + + See: https://www.kernel.org/doc/html/latest/devicetree/bindings/submitting-patches.html#i-for-patch-submitters + + **EMBEDDED_FILENAME** + Embedding the complete filename path inside the file isn't particularly + useful as often the path is moved around and becomes incorrect. + + **FILE_PATH_CHANGES** + Whenever files are added, moved, or deleted, the MAINTAINERS file + patterns can be out of sync or outdated. + + So MAINTAINERS might need updating in these cases. + + **MEMSET** + The memset use appears to be incorrect. This may be caused due to + badly ordered parameters. Please recheck the usage. + + **NOT_UNIFIED_DIFF** + The patch file does not appear to be in unified-diff format. Please + regenerate the patch file before sending it to the maintainer. + + **PRINTF_0XDECIMAL** + Prefixing 0x with decimal output is defective and should be corrected. + + **SPDX_LICENSE_TAG** + The source file is missing or has an improper SPDX identifier tag. + The Linux kernel requires the precise SPDX identifier in all source files, + and it is thoroughly documented in the kernel docs. + + See: https://www.kernel.org/doc/html/latest/process/license-rules.html + + **TYPO_SPELLING** + Some words may have been misspelled. Consider reviewing them. diff --git a/Documentation/dev-tools/coccinelle.rst b/Documentation/dev-tools/coccinelle.rst new file mode 100644 index 0000000000..535ce126fb --- /dev/null +++ b/Documentation/dev-tools/coccinelle.rst @@ -0,0 +1,519 @@ +.. Copyright 2010 Nicolas Palix <npalix@diku.dk> +.. Copyright 2010 Julia Lawall <julia@diku.dk> +.. Copyright 2010 Gilles Muller <Gilles.Muller@lip6.fr> + +.. highlight:: none + +.. _devtools_coccinelle: + +Coccinelle +========== + +Coccinelle is a tool for pattern matching and text transformation that has +many uses in kernel development, including the application of complex, +tree-wide patches and detection of problematic programming patterns. + +Getting Coccinelle +------------------ + +The semantic patches included in the kernel use features and options +which are provided by Coccinelle version 1.0.0-rc11 and above. +Using earlier versions will fail as the option names used by +the Coccinelle files and coccicheck have been updated. + +Coccinelle is available through the package manager +of many distributions, e.g. : + + - Debian + - Fedora + - Ubuntu + - OpenSUSE + - Arch Linux + - NetBSD + - FreeBSD + +Some distribution packages are obsolete and it is recommended +to use the latest version released from the Coccinelle homepage at +http://coccinelle.lip6.fr/ + +Or from Github at: + +https://github.com/coccinelle/coccinelle + +Once you have it, run the following commands:: + + ./autogen + ./configure + make + +as a regular user, and install it with:: + + sudo make install + +More detailed installation instructions to build from source can be +found at: + +https://github.com/coccinelle/coccinelle/blob/master/install.txt + +Supplemental documentation +-------------------------- + +For supplemental documentation refer to the wiki: + +https://bottest.wiki.kernel.org/coccicheck + +The wiki documentation always refers to the linux-next version of the script. + +For Semantic Patch Language(SmPL) grammar documentation refer to: + +https://coccinelle.gitlabpages.inria.fr/website/docs/main_grammar.html + +Using Coccinelle on the Linux kernel +------------------------------------ + +A Coccinelle-specific target is defined in the top level +Makefile. This target is named ``coccicheck`` and calls the ``coccicheck`` +front-end in the ``scripts`` directory. + +Four basic modes are defined: ``patch``, ``report``, ``context``, and +``org``. The mode to use is specified by setting the MODE variable with +``MODE=<mode>``. + +- ``patch`` proposes a fix, when possible. + +- ``report`` generates a list in the following format: + file:line:column-column: message + +- ``context`` highlights lines of interest and their context in a + diff-like style. Lines of interest are indicated with ``-``. + +- ``org`` generates a report in the Org mode format of Emacs. + +Note that not all semantic patches implement all modes. For easy use +of Coccinelle, the default mode is "report". + +Two other modes provide some common combinations of these modes. + +- ``chain`` tries the previous modes in the order above until one succeeds. + +- ``rep+ctxt`` runs successively the report mode and the context mode. + It should be used with the C option (described later) + which checks the code on a file basis. + +Examples +~~~~~~~~ + +To make a report for every semantic patch, run the following command:: + + make coccicheck MODE=report + +To produce patches, run:: + + make coccicheck MODE=patch + + +The coccicheck target applies every semantic patch available in the +sub-directories of ``scripts/coccinelle`` to the entire Linux kernel. + +For each semantic patch, a commit message is proposed. It gives a +description of the problem being checked by the semantic patch, and +includes a reference to Coccinelle. + +As with any static code analyzer, Coccinelle produces false +positives. Thus, reports must be carefully checked, and patches +reviewed. + +To enable verbose messages set the V= variable, for example:: + + make coccicheck MODE=report V=1 + +Coccinelle parallelization +-------------------------- + +By default, coccicheck tries to run as parallel as possible. To change +the parallelism, set the J= variable. For example, to run across 4 CPUs:: + + make coccicheck MODE=report J=4 + +As of Coccinelle 1.0.2 Coccinelle uses Ocaml parmap for parallelization; +if support for this is detected you will benefit from parmap parallelization. + +When parmap is enabled coccicheck will enable dynamic load balancing by using +``--chunksize 1`` argument. This ensures we keep feeding threads with work +one by one, so that we avoid the situation where most work gets done by only +a few threads. With dynamic load balancing, if a thread finishes early we keep +feeding it more work. + +When parmap is enabled, if an error occurs in Coccinelle, this error +value is propagated back, and the return value of the ``make coccicheck`` +command captures this return value. + +Using Coccinelle with a single semantic patch +--------------------------------------------- + +The optional make variable COCCI can be used to check a single +semantic patch. In that case, the variable must be initialized with +the name of the semantic patch to apply. + +For instance:: + + make coccicheck COCCI=<my_SP.cocci> MODE=patch + +or:: + + make coccicheck COCCI=<my_SP.cocci> MODE=report + + +Controlling Which Files are Processed by Coccinelle +--------------------------------------------------- + +By default the entire kernel source tree is checked. + +To apply Coccinelle to a specific directory, ``M=`` can be used. +For example, to check drivers/net/wireless/ one may write:: + + make coccicheck M=drivers/net/wireless/ + +To apply Coccinelle on a file basis, instead of a directory basis, the +C variable is used by the makefile to select which files to work with. +This variable can be used to run scripts for the entire kernel, a +specific directory, or for a single file. + +For example, to check drivers/bluetooth/bfusb.c, the value 1 is +passed to the C variable to check files that make considers +need to be compiled.:: + + make C=1 CHECK=scripts/coccicheck drivers/bluetooth/bfusb.o + +The value 2 is passed to the C variable to check files regardless of +whether they need to be compiled or not.:: + + make C=2 CHECK=scripts/coccicheck drivers/bluetooth/bfusb.o + +In these modes, which work on a file basis, there is no information +about semantic patches displayed, and no commit message proposed. + +This runs every semantic patch in scripts/coccinelle by default. The +COCCI variable may additionally be used to only apply a single +semantic patch as shown in the previous section. + +The "report" mode is the default. You can select another one with the +MODE variable explained above. + +Debugging Coccinelle SmPL patches +--------------------------------- + +Using coccicheck is best as it provides in the spatch command line +include options matching the options used when we compile the kernel. +You can learn what these options are by using V=1; you could then +manually run Coccinelle with debug options added. + +Alternatively you can debug running Coccinelle against SmPL patches +by asking for stderr to be redirected to stderr. By default stderr +is redirected to /dev/null; if you'd like to capture stderr you +can specify the ``DEBUG_FILE="file.txt"`` option to coccicheck. For +instance:: + + rm -f cocci.err + make coccicheck COCCI=scripts/coccinelle/free/kfree.cocci MODE=report DEBUG_FILE=cocci.err + cat cocci.err + +You can use SPFLAGS to add debugging flags; for instance you may want to +add both ``--profile --show-trying`` to SPFLAGS when debugging. For example +you may want to use:: + + rm -f err.log + export COCCI=scripts/coccinelle/misc/irqf_oneshot.cocci + make coccicheck DEBUG_FILE="err.log" MODE=report SPFLAGS="--profile --show-trying" M=./drivers/mfd + +err.log will now have the profiling information, while stdout will +provide some progress information as Coccinelle moves forward with +work. + +NOTE: + +DEBUG_FILE support is only supported when using coccinelle >= 1.0.2. + +Currently, DEBUG_FILE support is only available to check folders, and +not single files. This is because checking a single file requires spatch +to be called twice leading to DEBUG_FILE being set both times to the same value, +giving rise to an error. + +.cocciconfig support +-------------------- + +Coccinelle supports reading .cocciconfig for default Coccinelle options that +should be used every time spatch is spawned. The order of precedence for +variables for .cocciconfig is as follows: + +- Your current user's home directory is processed first +- Your directory from which spatch is called is processed next +- The directory provided with the ``--dir`` option is processed last, if used + +Since coccicheck runs through make, it naturally runs from the kernel +proper dir; as such the second rule above would be implied for picking up a +.cocciconfig when using ``make coccicheck``. + +``make coccicheck`` also supports using M= targets. If you do not supply +any M= target, it is assumed you want to target the entire kernel. +The kernel coccicheck script has:: + + if [ "$KBUILD_EXTMOD" = "" ] ; then + OPTIONS="--dir $srctree $COCCIINCLUDE" + else + OPTIONS="--dir $KBUILD_EXTMOD $COCCIINCLUDE" + fi + +KBUILD_EXTMOD is set when an explicit target with M= is used. For both cases +the spatch ``--dir`` argument is used, as such third rule applies when whether +M= is used or not, and when M= is used the target directory can have its own +.cocciconfig file. When M= is not passed as an argument to coccicheck the +target directory is the same as the directory from where spatch was called. + +If not using the kernel's coccicheck target, keep the above precedence +order logic of .cocciconfig reading. If using the kernel's coccicheck target, +override any of the kernel's .coccicheck's settings using SPFLAGS. + +We help Coccinelle when used against Linux with a set of sensible default +options for Linux with our own Linux .cocciconfig. This hints to coccinelle +that git can be used for ``git grep`` queries over coccigrep. A timeout of 200 +seconds should suffice for now. + +The options picked up by coccinelle when reading a .cocciconfig do not appear +as arguments to spatch processes running on your system. To confirm what +options will be used by Coccinelle run:: + + spatch --print-options-only + +You can override with your own preferred index option by using SPFLAGS. Take +note that when there are conflicting options Coccinelle takes precedence for +the last options passed. Using .cocciconfig is possible to use idutils, however +given the order of precedence followed by Coccinelle, since the kernel now +carries its own .cocciconfig, you will need to use SPFLAGS to use idutils if +desired. See below section "Additional flags" for more details on how to use +idutils. + +Additional flags +---------------- + +Additional flags can be passed to spatch through the SPFLAGS +variable. This works as Coccinelle respects the last flags +given to it when options are in conflict. :: + + make SPFLAGS=--use-glimpse coccicheck + +Coccinelle supports idutils as well but requires coccinelle >= 1.0.6. +When no ID file is specified coccinelle assumes your ID database file +is in the file .id-utils.index on the top level of the kernel. Coccinelle +carries a script scripts/idutils_index.sh which creates the database with:: + + mkid -i C --output .id-utils.index + +If you have another database filename you can also just symlink with this +name. :: + + make SPFLAGS=--use-idutils coccicheck + +Alternatively you can specify the database filename explicitly, for +instance:: + + make SPFLAGS="--use-idutils /full-path/to/ID" coccicheck + +See ``spatch --help`` to learn more about spatch options. + +Note that the ``--use-glimpse`` and ``--use-idutils`` options +require external tools for indexing the code. None of them is +thus active by default. However, by indexing the code with +one of these tools, and according to the cocci file used, +spatch could proceed the entire code base more quickly. + +SmPL patch specific options +--------------------------- + +SmPL patches can have their own requirements for options passed +to Coccinelle. SmPL patch-specific options can be provided by +providing them at the top of the SmPL patch, for instance:: + + // Options: --no-includes --include-headers + +SmPL patch Coccinelle requirements +---------------------------------- + +As Coccinelle features get added some more advanced SmPL patches +may require newer versions of Coccinelle. If an SmPL patch requires +a minimum version of Coccinelle, this can be specified as follows, +as an example if requiring at least Coccinelle >= 1.0.5:: + + // Requires: 1.0.5 + +Proposing new semantic patches +------------------------------ + +New semantic patches can be proposed and submitted by kernel +developers. For sake of clarity, they should be organized in the +sub-directories of ``scripts/coccinelle/``. + + +Detailed description of the ``report`` mode +------------------------------------------- + +``report`` generates a list in the following format:: + + file:line:column-column: message + +Example +~~~~~~~ + +Running:: + + make coccicheck MODE=report COCCI=scripts/coccinelle/api/err_cast.cocci + +will execute the following part of the SmPL script:: + + <smpl> + @r depends on !context && !patch && (org || report)@ + expression x; + position p; + @@ + + ERR_PTR@p(PTR_ERR(x)) + + @script:python depends on report@ + p << r.p; + x << r.x; + @@ + + msg="ERR_CAST can be used with %s" % (x) + coccilib.report.print_report(p[0], msg) + </smpl> + +This SmPL excerpt generates entries on the standard output, as +illustrated below:: + + /home/user/linux/crypto/ctr.c:188:9-16: ERR_CAST can be used with alg + /home/user/linux/crypto/authenc.c:619:9-16: ERR_CAST can be used with auth + /home/user/linux/crypto/xts.c:227:9-16: ERR_CAST can be used with alg + + +Detailed description of the ``patch`` mode +------------------------------------------ + +When the ``patch`` mode is available, it proposes a fix for each problem +identified. + +Example +~~~~~~~ + +Running:: + + make coccicheck MODE=patch COCCI=scripts/coccinelle/api/err_cast.cocci + +will execute the following part of the SmPL script:: + + <smpl> + @ depends on !context && patch && !org && !report @ + expression x; + @@ + + - ERR_PTR(PTR_ERR(x)) + + ERR_CAST(x) + </smpl> + +This SmPL excerpt generates patch hunks on the standard output, as +illustrated below:: + + diff -u -p a/crypto/ctr.c b/crypto/ctr.c + --- a/crypto/ctr.c 2010-05-26 10:49:38.000000000 +0200 + +++ b/crypto/ctr.c 2010-06-03 23:44:49.000000000 +0200 + @@ -185,7 +185,7 @@ static struct crypto_instance *crypto_ct + alg = crypto_attr_alg(tb[1], CRYPTO_ALG_TYPE_CIPHER, + CRYPTO_ALG_TYPE_MASK); + if (IS_ERR(alg)) + - return ERR_PTR(PTR_ERR(alg)); + + return ERR_CAST(alg); + + /* Block size must be >= 4 bytes. */ + err = -EINVAL; + +Detailed description of the ``context`` mode +-------------------------------------------- + +``context`` highlights lines of interest and their context +in a diff-like style. + + **NOTE**: The diff-like output generated is NOT an applicable patch. The + intent of the ``context`` mode is to highlight the important lines + (annotated with minus, ``-``) and gives some surrounding context + lines around. This output can be used with the diff mode of + Emacs to review the code. + +Example +~~~~~~~ + +Running:: + + make coccicheck MODE=context COCCI=scripts/coccinelle/api/err_cast.cocci + +will execute the following part of the SmPL script:: + + <smpl> + @ depends on context && !patch && !org && !report@ + expression x; + @@ + + * ERR_PTR(PTR_ERR(x)) + </smpl> + +This SmPL excerpt generates diff hunks on the standard output, as +illustrated below:: + + diff -u -p /home/user/linux/crypto/ctr.c /tmp/nothing + --- /home/user/linux/crypto/ctr.c 2010-05-26 10:49:38.000000000 +0200 + +++ /tmp/nothing + @@ -185,7 +185,6 @@ static struct crypto_instance *crypto_ct + alg = crypto_attr_alg(tb[1], CRYPTO_ALG_TYPE_CIPHER, + CRYPTO_ALG_TYPE_MASK); + if (IS_ERR(alg)) + - return ERR_PTR(PTR_ERR(alg)); + + /* Block size must be >= 4 bytes. */ + err = -EINVAL; + +Detailed description of the ``org`` mode +---------------------------------------- + +``org`` generates a report in the Org mode format of Emacs. + +Example +~~~~~~~ + +Running:: + + make coccicheck MODE=org COCCI=scripts/coccinelle/api/err_cast.cocci + +will execute the following part of the SmPL script:: + + <smpl> + @r depends on !context && !patch && (org || report)@ + expression x; + position p; + @@ + + ERR_PTR@p(PTR_ERR(x)) + + @script:python depends on org@ + p << r.p; + x << r.x; + @@ + + msg="ERR_CAST can be used with %s" % (x) + msg_safe=msg.replace("[","@(").replace("]",")") + coccilib.org.print_todo(p[0], msg_safe) + </smpl> + +This SmPL excerpt generates Org entries on the standard output, as +illustrated below:: + + * TODO [[view:/home/user/linux/crypto/ctr.c::face=ovl-face1::linb=188::colb=9::cole=16][ERR_CAST can be used with alg]] + * TODO [[view:/home/user/linux/crypto/authenc.c::face=ovl-face1::linb=619::colb=9::cole=16][ERR_CAST can be used with auth]] + * TODO [[view:/home/user/linux/crypto/xts.c::face=ovl-face1::linb=227::colb=9::cole=16][ERR_CAST can be used with alg]] diff --git a/Documentation/dev-tools/gcov.rst b/Documentation/dev-tools/gcov.rst new file mode 100644 index 0000000000..5fce2b06f2 --- /dev/null +++ b/Documentation/dev-tools/gcov.rst @@ -0,0 +1,274 @@ +Using gcov with the Linux kernel +================================ + +gcov profiling kernel support enables the use of GCC's coverage testing +tool gcov_ with the Linux kernel. Coverage data of a running kernel +is exported in gcov-compatible format via the "gcov" debugfs directory. +To get coverage data for a specific file, change to the kernel build +directory and use gcov with the ``-o`` option as follows (requires root):: + + # cd /tmp/linux-out + # gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c + +This will create source code files annotated with execution counts +in the current directory. In addition, graphical gcov front-ends such +as lcov_ can be used to automate the process of collecting data +for the entire kernel and provide coverage overviews in HTML format. + +Possible uses: + +* debugging (has this line been reached at all?) +* test improvement (how do I change my test to cover these lines?) +* minimizing kernel configurations (do I need this option if the + associated code is never run?) + +.. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html +.. _lcov: http://ltp.sourceforge.net/coverage/lcov.php + + +Preparation +----------- + +Configure the kernel with:: + + CONFIG_DEBUG_FS=y + CONFIG_GCOV_KERNEL=y + +and to get coverage data for the entire kernel:: + + CONFIG_GCOV_PROFILE_ALL=y + +Note that kernels compiled with profiling flags will be significantly +larger and run slower. Also CONFIG_GCOV_PROFILE_ALL may not be supported +on all architectures. + +Profiling data will only become accessible once debugfs has been +mounted:: + + mount -t debugfs none /sys/kernel/debug + + +Customization +------------- + +To enable profiling for specific files or directories, add a line +similar to the following to the respective kernel Makefile: + +- For a single file (e.g. main.o):: + + GCOV_PROFILE_main.o := y + +- For all files in one directory:: + + GCOV_PROFILE := y + +To exclude files from being profiled even when CONFIG_GCOV_PROFILE_ALL +is specified, use:: + + GCOV_PROFILE_main.o := n + +and:: + + GCOV_PROFILE := n + +Only files which are linked to the main kernel image or are compiled as +kernel modules are supported by this mechanism. + + +Files +----- + +The gcov kernel support creates the following files in debugfs: + +``/sys/kernel/debug/gcov`` + Parent directory for all gcov-related files. + +``/sys/kernel/debug/gcov/reset`` + Global reset file: resets all coverage data to zero when + written to. + +``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda`` + The actual gcov data file as understood by the gcov + tool. Resets file coverage data to zero when written to. + +``/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno`` + Symbolic link to a static data file required by the gcov + tool. This file is generated by gcc when compiling with + option ``-ftest-coverage``. + + +Modules +------- + +Kernel modules may contain cleanup code which is only run during +module unload time. The gcov mechanism provides a means to collect +coverage data for such code by keeping a copy of the data associated +with the unloaded module. This data remains available through debugfs. +Once the module is loaded again, the associated coverage counters are +initialized with the data from its previous instantiation. + +This behavior can be deactivated by specifying the gcov_persist kernel +parameter:: + + gcov_persist=0 + +At run-time, a user can also choose to discard data for an unloaded +module by writing to its data file or the global reset file. + + +Separated build and test machines +--------------------------------- + +The gcov kernel profiling infrastructure is designed to work out-of-the +box for setups where kernels are built and run on the same machine. In +cases where the kernel runs on a separate machine, special preparations +must be made, depending on where the gcov tool is used: + +.. _gcov-test: + +a) gcov is run on the TEST machine + + The gcov tool version on the test machine must be compatible with the + gcc version used for kernel build. Also the following files need to be + copied from build to test machine: + + from the source tree: + - all C source files + headers + + from the build tree: + - all C source files + headers + - all .gcda and .gcno files + - all links to directories + + It is important to note that these files need to be placed into the + exact same file system location on the test machine as on the build + machine. If any of the path components is symbolic link, the actual + directory needs to be used instead (due to make's CURDIR handling). + +.. _gcov-build: + +b) gcov is run on the BUILD machine + + The following files need to be copied after each test case from test + to build machine: + + from the gcov directory in sysfs: + - all .gcda files + - all links to .gcno files + + These files can be copied to any location on the build machine. gcov + must then be called with the -o option pointing to that directory. + + Example directory setup on the build machine:: + + /tmp/linux: kernel source tree + /tmp/out: kernel build directory as specified by make O= + /tmp/coverage: location of the files copied from the test machine + + [user@build] cd /tmp/out + [user@build] gcov -o /tmp/coverage/tmp/out/init main.c + + +Note on compilers +----------------- + +GCC and LLVM gcov tools are not necessarily compatible. Use gcov_ to work with +GCC-generated .gcno and .gcda files, and use llvm-cov_ for Clang. + +.. _gcov: https://gcc.gnu.org/onlinedocs/gcc/Gcov.html +.. _llvm-cov: https://llvm.org/docs/CommandGuide/llvm-cov.html + +Build differences between GCC and Clang gcov are handled by Kconfig. It +automatically selects the appropriate gcov format depending on the detected +toolchain. + + +Troubleshooting +--------------- + +Problem + Compilation aborts during linker step. + +Cause + Profiling flags are specified for source files which are not + linked to the main kernel or which are linked by a custom + linker procedure. + +Solution + Exclude affected source files from profiling by specifying + ``GCOV_PROFILE := n`` or ``GCOV_PROFILE_basename.o := n`` in the + corresponding Makefile. + +Problem + Files copied from sysfs appear empty or incomplete. + +Cause + Due to the way seq_file works, some tools such as cp or tar + may not correctly copy files from sysfs. + +Solution + Use ``cat`` to read ``.gcda`` files and ``cp -d`` to copy links. + Alternatively use the mechanism shown in Appendix B. + + +Appendix A: gather_on_build.sh +------------------------------ + +Sample script to gather coverage meta files on the build machine +(see :ref:`Separated build and test machines a. <gcov-test>`): + +.. code-block:: sh + + #!/bin/bash + + KSRC=$1 + KOBJ=$2 + DEST=$3 + + if [ -z "$KSRC" ] || [ -z "$KOBJ" ] || [ -z "$DEST" ]; then + echo "Usage: $0 <ksrc directory> <kobj directory> <output.tar.gz>" >&2 + exit 1 + fi + + KSRC=$(cd $KSRC; printf "all:\n\t@echo \${CURDIR}\n" | make -f -) + KOBJ=$(cd $KOBJ; printf "all:\n\t@echo \${CURDIR}\n" | make -f -) + + find $KSRC $KOBJ \( -name '*.gcno' -o -name '*.[ch]' -o -type l \) -a \ + -perm /u+r,g+r | tar cfz $DEST -P -T - + + if [ $? -eq 0 ] ; then + echo "$DEST successfully created, copy to test system and unpack with:" + echo " tar xfz $DEST -P" + else + echo "Could not create file $DEST" + fi + + +Appendix B: gather_on_test.sh +----------------------------- + +Sample script to gather coverage data files on the test machine +(see :ref:`Separated build and test machines b. <gcov-build>`): + +.. code-block:: sh + + #!/bin/bash -e + + DEST=$1 + GCDA=/sys/kernel/debug/gcov + + if [ -z "$DEST" ] ; then + echo "Usage: $0 <output.tar.gz>" >&2 + exit 1 + fi + + TEMPDIR=$(mktemp -d) + echo Collecting data.. + find $GCDA -type d -exec mkdir -p $TEMPDIR/\{\} \; + find $GCDA -name '*.gcda' -exec sh -c 'cat < $0 > '$TEMPDIR'/$0' {} \; + find $GCDA -name '*.gcno' -exec sh -c 'cp -d $0 '$TEMPDIR'/$0' {} \; + tar czf $DEST -C $TEMPDIR sys + rm -rf $TEMPDIR + + echo "$DEST successfully created, copy to build system and unpack with:" + echo " tar xfz $DEST" diff --git a/Documentation/dev-tools/gdb-kernel-debugging.rst b/Documentation/dev-tools/gdb-kernel-debugging.rst new file mode 100644 index 0000000000..895285c037 --- /dev/null +++ b/Documentation/dev-tools/gdb-kernel-debugging.rst @@ -0,0 +1,179 @@ +.. highlight:: none + +Debugging kernel and modules via gdb +==================================== + +The kernel debugger kgdb, hypervisors like QEMU or JTAG-based hardware +interfaces allow to debug the Linux kernel and its modules during runtime +using gdb. Gdb comes with a powerful scripting interface for python. The +kernel provides a collection of helper scripts that can simplify typical +kernel debugging steps. This is a short tutorial about how to enable and use +them. It focuses on QEMU/KVM virtual machines as target, but the examples can +be transferred to the other gdb stubs as well. + + +Requirements +------------ + +- gdb 7.2+ (recommended: 7.4+) with python support enabled (typically true + for distributions) + + +Setup +----- + +- Create a virtual Linux machine for QEMU/KVM (see www.linux-kvm.org and + www.qemu.org for more details). For cross-development, + https://landley.net/aboriginal/bin keeps a pool of machine images and + toolchains that can be helpful to start from. + +- Build the kernel with CONFIG_GDB_SCRIPTS enabled, but leave + CONFIG_DEBUG_INFO_REDUCED off. If your architecture supports + CONFIG_FRAME_POINTER, keep it enabled. + +- Install that kernel on the guest, turn off KASLR if necessary by adding + "nokaslr" to the kernel command line. + Alternatively, QEMU allows to boot the kernel directly using -kernel, + -append, -initrd command line switches. This is generally only useful if + you do not depend on modules. See QEMU documentation for more details on + this mode. In this case, you should build the kernel with + CONFIG_RANDOMIZE_BASE disabled if the architecture supports KASLR. + +- Build the gdb scripts (required on kernels v5.1 and above):: + + make scripts_gdb + +- Enable the gdb stub of QEMU/KVM, either + + - at VM startup time by appending "-s" to the QEMU command line + + or + + - during runtime by issuing "gdbserver" from the QEMU monitor + console + +- cd /path/to/linux-build + +- Start gdb: gdb vmlinux + + Note: Some distros may restrict auto-loading of gdb scripts to known safe + directories. In case gdb reports to refuse loading vmlinux-gdb.py, add:: + + add-auto-load-safe-path /path/to/linux-build + + to ~/.gdbinit. See gdb help for more details. + +- Attach to the booted guest:: + + (gdb) target remote :1234 + + +Examples of using the Linux-provided gdb helpers +------------------------------------------------ + +- Load module (and main kernel) symbols:: + + (gdb) lx-symbols + loading vmlinux + scanning for modules in /home/user/linux/build + loading @0xffffffffa0020000: /home/user/linux/build/net/netfilter/xt_tcpudp.ko + loading @0xffffffffa0016000: /home/user/linux/build/net/netfilter/xt_pkttype.ko + loading @0xffffffffa0002000: /home/user/linux/build/net/netfilter/xt_limit.ko + loading @0xffffffffa00ca000: /home/user/linux/build/net/packet/af_packet.ko + loading @0xffffffffa003c000: /home/user/linux/build/fs/fuse/fuse.ko + ... + loading @0xffffffffa0000000: /home/user/linux/build/drivers/ata/ata_generic.ko + +- Set a breakpoint on some not yet loaded module function, e.g.:: + + (gdb) b btrfs_init_sysfs + Function "btrfs_init_sysfs" not defined. + Make breakpoint pending on future shared library load? (y or [n]) y + Breakpoint 1 (btrfs_init_sysfs) pending. + +- Continue the target:: + + (gdb) c + +- Load the module on the target and watch the symbols being loaded as well as + the breakpoint hit:: + + loading @0xffffffffa0034000: /home/user/linux/build/lib/libcrc32c.ko + loading @0xffffffffa0050000: /home/user/linux/build/lib/lzo/lzo_compress.ko + loading @0xffffffffa006e000: /home/user/linux/build/lib/zlib_deflate/zlib_deflate.ko + loading @0xffffffffa01b1000: /home/user/linux/build/fs/btrfs/btrfs.ko + + Breakpoint 1, btrfs_init_sysfs () at /home/user/linux/fs/btrfs/sysfs.c:36 + 36 btrfs_kset = kset_create_and_add("btrfs", NULL, fs_kobj); + +- Dump the log buffer of the target kernel:: + + (gdb) lx-dmesg + [ 0.000000] Initializing cgroup subsys cpuset + [ 0.000000] Initializing cgroup subsys cpu + [ 0.000000] Linux version 3.8.0-rc4-dbg+ (... + [ 0.000000] Command line: root=/dev/sda2 resume=/dev/sda1 vga=0x314 + [ 0.000000] e820: BIOS-provided physical RAM map: + [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable + [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved + .... + +- Examine fields of the current task struct(supported by x86 and arm64 only):: + + (gdb) p $lx_current().pid + $1 = 4998 + (gdb) p $lx_current().comm + $2 = "modprobe\000\000\000\000\000\000\000" + +- Make use of the per-cpu function for the current or a specified CPU:: + + (gdb) p $lx_per_cpu("runqueues").nr_running + $3 = 1 + (gdb) p $lx_per_cpu("runqueues", 2).nr_running + $4 = 0 + +- Dig into hrtimers using the container_of helper:: + + (gdb) set $next = $lx_per_cpu("hrtimer_bases").clock_base[0].active.next + (gdb) p *$container_of($next, "struct hrtimer", "node") + $5 = { + node = { + node = { + __rb_parent_color = 18446612133355256072, + rb_right = 0x0 <irq_stack_union>, + rb_left = 0x0 <irq_stack_union> + }, + expires = { + tv64 = 1835268000000 + } + }, + _softexpires = { + tv64 = 1835268000000 + }, + function = 0xffffffff81078232 <tick_sched_timer>, + base = 0xffff88003fd0d6f0, + state = 1, + start_pid = 0, + start_site = 0xffffffff81055c1f <hrtimer_start_range_ns+20>, + start_comm = "swapper/2\000\000\000\000\000\000" + } + + +List of commands and functions +------------------------------ + +The number of commands and convenience functions may evolve over the time, +this is just a snapshot of the initial version:: + + (gdb) apropos lx + function lx_current -- Return current task + function lx_module -- Find module by name and return the module variable + function lx_per_cpu -- Return per-cpu variable + function lx_task_by_pid -- Find Linux task by PID and return the task_struct variable + function lx_thread_info -- Calculate Linux thread_info from task variable + lx-dmesg -- Print Linux kernel log buffer + lx-lsmod -- List currently loaded modules + lx-symbols -- (Re-)load symbols of Linux kernel and currently loaded modules + +Detailed help can be obtained via "help <command-name>" for commands and "help +function <function-name>" for convenience functions. diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst new file mode 100644 index 0000000000..6b0663075d --- /dev/null +++ b/Documentation/dev-tools/index.rst @@ -0,0 +1,44 @@ +================================ +Development tools for the kernel +================================ + +This document is a collection of documents about development tools that can +be used to work on the kernel. For now, the documents have been pulled +together without any significant effort to integrate them into a coherent +whole; patches welcome! + +A brief overview of testing-specific tools can be found in +Documentation/dev-tools/testing-overview.rst + +.. class:: toc-title + + Table of contents + +.. toctree:: + :maxdepth: 2 + + testing-overview + checkpatch + coccinelle + sparse + kcov + gcov + kasan + kmsan + ubsan + kmemleak + kcsan + kfence + gdb-kernel-debugging + kgdb + kselftest + kunit/index + ktap + + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst new file mode 100644 index 0000000000..382818a719 --- /dev/null +++ b/Documentation/dev-tools/kasan.rst @@ -0,0 +1,549 @@ +The Kernel Address Sanitizer (KASAN) +==================================== + +Overview +-------- + +Kernel Address Sanitizer (KASAN) is a dynamic memory safety error detector +designed to find out-of-bounds and use-after-free bugs. + +KASAN has three modes: + +1. Generic KASAN +2. Software Tag-Based KASAN +3. Hardware Tag-Based KASAN + +Generic KASAN, enabled with CONFIG_KASAN_GENERIC, is the mode intended for +debugging, similar to userspace ASan. This mode is supported on many CPU +architectures, but it has significant performance and memory overheads. + +Software Tag-Based KASAN or SW_TAGS KASAN, enabled with CONFIG_KASAN_SW_TAGS, +can be used for both debugging and dogfood testing, similar to userspace HWASan. +This mode is only supported for arm64, but its moderate memory overhead allows +using it for testing on memory-restricted devices with real workloads. + +Hardware Tag-Based KASAN or HW_TAGS KASAN, enabled with CONFIG_KASAN_HW_TAGS, +is the mode intended to be used as an in-field memory bug detector or as a +security mitigation. This mode only works on arm64 CPUs that support MTE +(Memory Tagging Extension), but it has low memory and performance overheads and +thus can be used in production. + +For details about the memory and performance impact of each KASAN mode, see the +descriptions of the corresponding Kconfig options. + +The Generic and the Software Tag-Based modes are commonly referred to as the +software modes. The Software Tag-Based and the Hardware Tag-Based modes are +referred to as the tag-based modes. + +Support +------- + +Architectures +~~~~~~~~~~~~~ + +Generic KASAN is supported on x86_64, arm, arm64, powerpc, riscv, s390, xtensa, +and loongarch, and the tag-based KASAN modes are supported only on arm64. + +Compilers +~~~~~~~~~ + +Software KASAN modes use compile-time instrumentation to insert validity checks +before every memory access and thus require a compiler version that provides +support for that. The Hardware Tag-Based mode relies on hardware to perform +these checks but still requires a compiler version that supports the memory +tagging instructions. + +Generic KASAN requires GCC version 8.3.0 or later +or any Clang version supported by the kernel. + +Software Tag-Based KASAN requires GCC 11+ +or any Clang version supported by the kernel. + +Hardware Tag-Based KASAN requires GCC 10+ or Clang 12+. + +Memory types +~~~~~~~~~~~~ + +Generic KASAN supports finding bugs in all of slab, page_alloc, vmap, vmalloc, +stack, and global memory. + +Software Tag-Based KASAN supports slab, page_alloc, vmalloc, and stack memory. + +Hardware Tag-Based KASAN supports slab, page_alloc, and non-executable vmalloc +memory. + +For slab, both software KASAN modes support SLUB and SLAB allocators, while +Hardware Tag-Based KASAN only supports SLUB. + +Usage +----- + +To enable KASAN, configure the kernel with:: + + CONFIG_KASAN=y + +and choose between ``CONFIG_KASAN_GENERIC`` (to enable Generic KASAN), +``CONFIG_KASAN_SW_TAGS`` (to enable Software Tag-Based KASAN), and +``CONFIG_KASAN_HW_TAGS`` (to enable Hardware Tag-Based KASAN). + +For the software modes, also choose between ``CONFIG_KASAN_OUTLINE`` and +``CONFIG_KASAN_INLINE``. Outline and inline are compiler instrumentation types. +The former produces a smaller binary while the latter is up to 2 times faster. + +To include alloc and free stack traces of affected slab objects into reports, +enable ``CONFIG_STACKTRACE``. To include alloc and free stack traces of affected +physical pages, enable ``CONFIG_PAGE_OWNER`` and boot with ``page_owner=on``. + +Boot parameters +~~~~~~~~~~~~~~~ + +KASAN is affected by the generic ``panic_on_warn`` command line parameter. +When it is enabled, KASAN panics the kernel after printing a bug report. + +By default, KASAN prints a bug report only for the first invalid memory access. +With ``kasan_multi_shot``, KASAN prints a report on every invalid access. This +effectively disables ``panic_on_warn`` for KASAN reports. + +Alternatively, independent of ``panic_on_warn``, the ``kasan.fault=`` boot +parameter can be used to control panic and reporting behaviour: + +- ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether + to only print a KASAN report, panic the kernel, or panic the kernel on + invalid writes only (default: ``report``). The panic happens even if + ``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of + Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on + asynchronously checked accesses (including reads). + +Software and Hardware Tag-Based KASAN modes (see the section about various +modes below) support altering stack trace collection behavior: + +- ``kasan.stacktrace=off`` or ``=on`` disables or enables alloc and free stack + traces collection (default: ``on``). +- ``kasan.stack_ring_size=<number of entries>`` specifies the number of entries + in the stack ring (default: ``32768``). + +Hardware Tag-Based KASAN mode is intended for use in production as a security +mitigation. Therefore, it supports additional boot parameters that allow +disabling KASAN altogether or controlling its features: + +- ``kasan=off`` or ``=on`` controls whether KASAN is enabled (default: ``on``). + +- ``kasan.mode=sync``, ``=async`` or ``=asymm`` controls whether KASAN + is configured in synchronous, asynchronous or asymmetric mode of + execution (default: ``sync``). + Synchronous mode: a bad access is detected immediately when a tag + check fault occurs. + Asynchronous mode: a bad access detection is delayed. When a tag check + fault occurs, the information is stored in hardware (in the TFSR_EL1 + register for arm64). The kernel periodically checks the hardware and + only reports tag faults during these checks. + Asymmetric mode: a bad access is detected synchronously on reads and + asynchronously on writes. + +- ``kasan.vmalloc=off`` or ``=on`` disables or enables tagging of vmalloc + allocations (default: ``on``). + +- ``kasan.page_alloc.sample=<sampling interval>`` makes KASAN tag only every + Nth page_alloc allocation with the order equal or greater than + ``kasan.page_alloc.sample.order``, where N is the value of the ``sample`` + parameter (default: ``1``, or tag every such allocation). + This parameter is intended to mitigate the performance overhead introduced + by KASAN. + Note that enabling this parameter makes Hardware Tag-Based KASAN skip checks + of allocations chosen by sampling and thus miss bad accesses to these + allocations. Use the default value for accurate bug detection. + +- ``kasan.page_alloc.sample.order=<minimum page order>`` specifies the minimum + order of allocations that are affected by sampling (default: ``3``). + Only applies when ``kasan.page_alloc.sample`` is set to a value greater + than ``1``. + This parameter is intended to allow sampling only large page_alloc + allocations, which is the biggest source of the performance overhead. + +Error reports +~~~~~~~~~~~~~ + +A typical KASAN report looks like this:: + + ================================================================== + BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa8/0xbc [test_kasan] + Write of size 1 at addr ffff8801f44ec37b by task insmod/2760 + + CPU: 1 PID: 2760 Comm: insmod Not tainted 4.19.0-rc3+ #698 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 + Call Trace: + dump_stack+0x94/0xd8 + print_address_description+0x73/0x280 + kasan_report+0x144/0x187 + __asan_report_store1_noabort+0x17/0x20 + kmalloc_oob_right+0xa8/0xbc [test_kasan] + kmalloc_tests_init+0x16/0x700 [test_kasan] + do_one_initcall+0xa5/0x3ae + do_init_module+0x1b6/0x547 + load_module+0x75df/0x8070 + __do_sys_init_module+0x1c6/0x200 + __x64_sys_init_module+0x6e/0xb0 + do_syscall_64+0x9f/0x2c0 + entry_SYSCALL_64_after_hwframe+0x44/0xa9 + RIP: 0033:0x7f96443109da + RSP: 002b:00007ffcf0b51b08 EFLAGS: 00000202 ORIG_RAX: 00000000000000af + RAX: ffffffffffffffda RBX: 000055dc3ee521a0 RCX: 00007f96443109da + RDX: 00007f96445cff88 RSI: 0000000000057a50 RDI: 00007f9644992000 + RBP: 000055dc3ee510b0 R08: 0000000000000003 R09: 0000000000000000 + R10: 00007f964430cd0a R11: 0000000000000202 R12: 00007f96445cff88 + R13: 000055dc3ee51090 R14: 0000000000000000 R15: 0000000000000000 + + Allocated by task 2760: + save_stack+0x43/0xd0 + kasan_kmalloc+0xa7/0xd0 + kmem_cache_alloc_trace+0xe1/0x1b0 + kmalloc_oob_right+0x56/0xbc [test_kasan] + kmalloc_tests_init+0x16/0x700 [test_kasan] + do_one_initcall+0xa5/0x3ae + do_init_module+0x1b6/0x547 + load_module+0x75df/0x8070 + __do_sys_init_module+0x1c6/0x200 + __x64_sys_init_module+0x6e/0xb0 + do_syscall_64+0x9f/0x2c0 + entry_SYSCALL_64_after_hwframe+0x44/0xa9 + + Freed by task 815: + save_stack+0x43/0xd0 + __kasan_slab_free+0x135/0x190 + kasan_slab_free+0xe/0x10 + kfree+0x93/0x1a0 + umh_complete+0x6a/0xa0 + call_usermodehelper_exec_async+0x4c3/0x640 + ret_from_fork+0x35/0x40 + + The buggy address belongs to the object at ffff8801f44ec300 + which belongs to the cache kmalloc-128 of size 128 + The buggy address is located 123 bytes inside of + 128-byte region [ffff8801f44ec300, ffff8801f44ec380) + The buggy address belongs to the page: + page:ffffea0007d13b00 count:1 mapcount:0 mapping:ffff8801f7001640 index:0x0 + flags: 0x200000000000100(slab) + raw: 0200000000000100 ffffea0007d11dc0 0000001a0000001a ffff8801f7001640 + raw: 0000000000000000 0000000080150015 00000001ffffffff 0000000000000000 + page dumped because: kasan: bad access detected + + Memory state around the buggy address: + ffff8801f44ec200: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb + ffff8801f44ec280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc + >ffff8801f44ec300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 + ^ + ffff8801f44ec380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb + ffff8801f44ec400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc + ================================================================== + +The report header summarizes what kind of bug happened and what kind of access +caused it. It is followed by a stack trace of the bad access, a stack trace of +where the accessed memory was allocated (in case a slab object was accessed), +and a stack trace of where the object was freed (in case of a use-after-free +bug report). Next comes a description of the accessed slab object and the +information about the accessed memory page. + +In the end, the report shows the memory state around the accessed address. +Internally, KASAN tracks memory state separately for each memory granule, which +is either 8 or 16 aligned bytes depending on KASAN mode. Each number in the +memory state section of the report shows the state of one of the memory +granules that surround the accessed address. + +For Generic KASAN, the size of each memory granule is 8. The state of each +granule is encoded in one shadow byte. Those 8 bytes can be accessible, +partially accessible, freed, or be a part of a redzone. KASAN uses the following +encoding for each shadow byte: 00 means that all 8 bytes of the corresponding +memory region are accessible; number N (1 <= N <= 7) means that the first N +bytes are accessible, and other (8 - N) bytes are not; any negative value +indicates that the entire 8-byte word is inaccessible. KASAN uses different +negative values to distinguish between different kinds of inaccessible memory +like redzones or freed memory (see mm/kasan/kasan.h). + +In the report above, the arrow points to the shadow byte ``03``, which means +that the accessed address is partially accessible. + +For tag-based KASAN modes, this last report section shows the memory tags around +the accessed address (see the `Implementation details`_ section). + +Note that KASAN bug titles (like ``slab-out-of-bounds`` or ``use-after-free``) +are best-effort: KASAN prints the most probable bug type based on the limited +information it has. The actual type of the bug might be different. + +Generic KASAN also reports up to two auxiliary call stack traces. These stack +traces point to places in code that interacted with the object but that are not +directly present in the bad access stack trace. Currently, this includes +call_rcu() and workqueue queuing. + +Implementation details +---------------------- + +Generic KASAN +~~~~~~~~~~~~~ + +Software KASAN modes use shadow memory to record whether each byte of memory is +safe to access and use compile-time instrumentation to insert shadow memory +checks before each memory access. + +Generic KASAN dedicates 1/8th of kernel memory to its shadow memory (16TB +to cover 128TB on x86_64) and uses direct mapping with a scale and offset to +translate a memory address to its corresponding shadow address. + +Here is the function which translates an address to its corresponding shadow +address:: + + static inline void *kasan_mem_to_shadow(const void *addr) + { + return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT) + + KASAN_SHADOW_OFFSET; + } + +where ``KASAN_SHADOW_SCALE_SHIFT = 3``. + +Compile-time instrumentation is used to insert memory access checks. Compiler +inserts function calls (``__asan_load*(addr)``, ``__asan_store*(addr)``) before +each memory access of size 1, 2, 4, 8, or 16. These functions check whether +memory accesses are valid or not by checking corresponding shadow memory. + +With inline instrumentation, instead of making function calls, the compiler +directly inserts the code to check shadow memory. This option significantly +enlarges the kernel, but it gives an x1.1-x2 performance boost over the +outline-instrumented kernel. + +Generic KASAN is the only mode that delays the reuse of freed objects via +quarantine (see mm/kasan/quarantine.c for implementation). + +Software Tag-Based KASAN +~~~~~~~~~~~~~~~~~~~~~~~~ + +Software Tag-Based KASAN uses a software memory tagging approach to checking +access validity. It is currently only implemented for the arm64 architecture. + +Software Tag-Based KASAN uses the Top Byte Ignore (TBI) feature of arm64 CPUs +to store a pointer tag in the top byte of kernel pointers. It uses shadow memory +to store memory tags associated with each 16-byte memory cell (therefore, it +dedicates 1/16th of the kernel memory for shadow memory). + +On each memory allocation, Software Tag-Based KASAN generates a random tag, tags +the allocated memory with this tag, and embeds the same tag into the returned +pointer. + +Software Tag-Based KASAN uses compile-time instrumentation to insert checks +before each memory access. These checks make sure that the tag of the memory +that is being accessed is equal to the tag of the pointer that is used to access +this memory. In case of a tag mismatch, Software Tag-Based KASAN prints a bug +report. + +Software Tag-Based KASAN also has two instrumentation modes (outline, which +emits callbacks to check memory accesses; and inline, which performs the shadow +memory checks inline). With outline instrumentation mode, a bug report is +printed from the function that performs the access check. With inline +instrumentation, a ``brk`` instruction is emitted by the compiler, and a +dedicated ``brk`` handler is used to print bug reports. + +Software Tag-Based KASAN uses 0xFF as a match-all pointer tag (accesses through +pointers with the 0xFF pointer tag are not checked). The value 0xFE is currently +reserved to tag freed memory regions. + +Hardware Tag-Based KASAN +~~~~~~~~~~~~~~~~~~~~~~~~ + +Hardware Tag-Based KASAN is similar to the software mode in concept but uses +hardware memory tagging support instead of compiler instrumentation and +shadow memory. + +Hardware Tag-Based KASAN is currently only implemented for arm64 architecture +and based on both arm64 Memory Tagging Extension (MTE) introduced in ARMv8.5 +Instruction Set Architecture and Top Byte Ignore (TBI). + +Special arm64 instructions are used to assign memory tags for each allocation. +Same tags are assigned to pointers to those allocations. On every memory +access, hardware makes sure that the tag of the memory that is being accessed is +equal to the tag of the pointer that is used to access this memory. In case of a +tag mismatch, a fault is generated, and a report is printed. + +Hardware Tag-Based KASAN uses 0xFF as a match-all pointer tag (accesses through +pointers with the 0xFF pointer tag are not checked). The value 0xFE is currently +reserved to tag freed memory regions. + +If the hardware does not support MTE (pre ARMv8.5), Hardware Tag-Based KASAN +will not be enabled. In this case, all KASAN boot parameters are ignored. + +Note that enabling CONFIG_KASAN_HW_TAGS always results in in-kernel TBI being +enabled. Even when ``kasan.mode=off`` is provided or when the hardware does not +support MTE (but supports TBI). + +Hardware Tag-Based KASAN only reports the first found bug. After that, MTE tag +checking gets disabled. + +Shadow memory +------------- + +The contents of this section are only applicable to software KASAN modes. + +The kernel maps memory in several different parts of the address space. +The range of kernel virtual addresses is large: there is not enough real +memory to support a real shadow region for every address that could be +accessed by the kernel. Therefore, KASAN only maps real shadow for certain +parts of the address space. + +Default behaviour +~~~~~~~~~~~~~~~~~ + +By default, architectures only map real memory over the shadow region +for the linear mapping (and potentially other small areas). For all +other areas - such as vmalloc and vmemmap space - a single read-only +page is mapped over the shadow area. This read-only shadow page +declares all memory accesses as permitted. + +This presents a problem for modules: they do not live in the linear +mapping but in a dedicated module space. By hooking into the module +allocator, KASAN temporarily maps real shadow memory to cover them. +This allows detection of invalid accesses to module globals, for example. + +This also creates an incompatibility with ``VMAP_STACK``: if the stack +lives in vmalloc space, it will be shadowed by the read-only page, and +the kernel will fault when trying to set up the shadow data for stack +variables. + +CONFIG_KASAN_VMALLOC +~~~~~~~~~~~~~~~~~~~~ + +With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the +cost of greater memory usage. Currently, this is supported on x86, +arm64, riscv, s390, and powerpc. + +This works by hooking into vmalloc and vmap and dynamically +allocating real shadow memory to back the mappings. + +Most mappings in vmalloc space are small, requiring less than a full +page of shadow space. Allocating a full shadow page per mapping would +therefore be wasteful. Furthermore, to ensure that different mappings +use different shadow pages, mappings would have to be aligned to +``KASAN_GRANULE_SIZE * PAGE_SIZE``. + +Instead, KASAN shares backing space across multiple mappings. It allocates +a backing page when a mapping in vmalloc space uses a particular page +of the shadow region. This page can be shared by other vmalloc +mappings later on. + +KASAN hooks into the vmap infrastructure to lazily clean up unused shadow +memory. + +To avoid the difficulties around swapping mappings around, KASAN expects +that the part of the shadow region that covers the vmalloc space will +not be covered by the early shadow page but will be left unmapped. +This will require changes in arch-specific code. + +This allows ``VMAP_STACK`` support on x86 and can simplify support of +architectures that do not have a fixed module region. + +For developers +-------------- + +Ignoring accesses +~~~~~~~~~~~~~~~~~ + +Software KASAN modes use compiler instrumentation to insert validity checks. +Such instrumentation might be incompatible with some parts of the kernel, and +therefore needs to be disabled. + +Other parts of the kernel might access metadata for allocated objects. +Normally, KASAN detects and reports such accesses, but in some cases (e.g., +in memory allocators), these accesses are valid. + +For software KASAN modes, to disable instrumentation for a specific file or +directory, add a ``KASAN_SANITIZE`` annotation to the respective kernel +Makefile: + +- For a single file (e.g., main.o):: + + KASAN_SANITIZE_main.o := n + +- For all files in one directory:: + + KASAN_SANITIZE := n + +For software KASAN modes, to disable instrumentation on a per-function basis, +use the KASAN-specific ``__no_sanitize_address`` function attribute or the +generic ``noinstr`` one. + +Note that disabling compiler instrumentation (either on a per-file or a +per-function basis) makes KASAN ignore the accesses that happen directly in +that code for software KASAN modes. It does not help when the accesses happen +indirectly (through calls to instrumented functions) or with Hardware +Tag-Based KASAN, which does not use compiler instrumentation. + +For software KASAN modes, to disable KASAN reports in a part of the kernel code +for the current task, annotate this part of the code with a +``kasan_disable_current()``/``kasan_enable_current()`` section. This also +disables the reports for indirect accesses that happen through function calls. + +For tag-based KASAN modes, to disable access checking, use +``kasan_reset_tag()`` or ``page_kasan_tag_reset()``. Note that temporarily +disabling access checking via ``page_kasan_tag_reset()`` requires saving and +restoring the per-page KASAN tag via ``page_kasan_tag``/``page_kasan_tag_set``. + +Tests +~~~~~ + +There are KASAN tests that allow verifying that KASAN works and can detect +certain types of memory corruptions. The tests consist of two parts: + +1. Tests that are integrated with the KUnit Test Framework. Enabled with +``CONFIG_KASAN_KUNIT_TEST``. These tests can be run and partially verified +automatically in a few different ways; see the instructions below. + +2. Tests that are currently incompatible with KUnit. Enabled with +``CONFIG_KASAN_MODULE_TEST`` and can only be run as a module. These tests can +only be verified manually by loading the kernel module and inspecting the +kernel log for KASAN reports. + +Each KUnit-compatible KASAN test prints one of multiple KASAN reports if an +error is detected. Then the test prints its number and status. + +When a test passes:: + + ok 28 - kmalloc_double_kzfree + +When a test fails due to a failed ``kmalloc``:: + + # kmalloc_large_oob_right: ASSERTION FAILED at lib/test_kasan.c:163 + Expected ptr is not null, but is + not ok 4 - kmalloc_large_oob_right + +When a test fails due to a missing KASAN report:: + + # kmalloc_double_kzfree: EXPECTATION FAILED at lib/test_kasan.c:974 + KASAN failure expected in "kfree_sensitive(ptr)", but none occurred + not ok 44 - kmalloc_double_kzfree + + +At the end the cumulative status of all KASAN tests is printed. On success:: + + ok 1 - kasan + +Or, if one of the tests failed:: + + not ok 1 - kasan + +There are a few ways to run KUnit-compatible KASAN tests. + +1. Loadable module + + With ``CONFIG_KUNIT`` enabled, KASAN-KUnit tests can be built as a loadable + module and run by loading ``test_kasan.ko`` with ``insmod`` or ``modprobe``. + +2. Built-In + + With ``CONFIG_KUNIT`` built-in, KASAN-KUnit tests can be built-in as well. + In this case, the tests will run at boot as a late-init call. + +3. Using kunit_tool + + With ``CONFIG_KUNIT`` and ``CONFIG_KASAN_KUNIT_TEST`` built-in, it is also + possible to use ``kunit_tool`` to see the results of KUnit tests in a more + readable way. This will not print the KASAN reports of the tests that passed. + See `KUnit documentation <https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html>`_ + for more up-to-date information on ``kunit_tool``. + +.. _KUnit: https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html diff --git a/Documentation/dev-tools/kcov.rst b/Documentation/dev-tools/kcov.rst new file mode 100644 index 0000000000..6611434e2d --- /dev/null +++ b/Documentation/dev-tools/kcov.rst @@ -0,0 +1,374 @@ +KCOV: code coverage for fuzzing +=============================== + +KCOV collects and exposes kernel code coverage information in a form suitable +for coverage-guided fuzzing. Coverage data of a running kernel is exported via +the ``kcov`` debugfs file. Coverage collection is enabled on a task basis, and +thus KCOV can capture precise coverage of a single system call. + +Note that KCOV does not aim to collect as much coverage as possible. It aims +to collect more or less stable coverage that is a function of syscall inputs. +To achieve this goal, it does not collect coverage in soft/hard interrupts +(unless remove coverage collection is enabled, see below) and from some +inherently non-deterministic parts of the kernel (e.g. scheduler, locking). + +Besides collecting code coverage, KCOV can also collect comparison operands. +See the "Comparison operands collection" section for details. + +Besides collecting coverage data from syscall handlers, KCOV can also collect +coverage for annotated parts of the kernel executing in background kernel +tasks or soft interrupts. See the "Remote coverage collection" section for +details. + +Prerequisites +------------- + +KCOV relies on compiler instrumentation and requires GCC 6.1.0 or later +or any Clang version supported by the kernel. + +Collecting comparison operands is supported with GCC 8+ or with Clang. + +To enable KCOV, configure the kernel with:: + + CONFIG_KCOV=y + +To enable comparison operands collection, set:: + + CONFIG_KCOV_ENABLE_COMPARISONS=y + +Coverage data only becomes accessible once debugfs has been mounted:: + + mount -t debugfs none /sys/kernel/debug + +Coverage collection +------------------- + +The following program demonstrates how to use KCOV to collect coverage for a +single syscall from within a test program: + +.. code-block:: c + + #include <stdio.h> + #include <stddef.h> + #include <stdint.h> + #include <stdlib.h> + #include <sys/types.h> + #include <sys/stat.h> + #include <sys/ioctl.h> + #include <sys/mman.h> + #include <unistd.h> + #include <fcntl.h> + #include <linux/types.h> + + #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) + #define KCOV_ENABLE _IO('c', 100) + #define KCOV_DISABLE _IO('c', 101) + #define COVER_SIZE (64<<10) + + #define KCOV_TRACE_PC 0 + #define KCOV_TRACE_CMP 1 + + int main(int argc, char **argv) + { + int fd; + unsigned long *cover, n, i; + + /* A single fd descriptor allows coverage collection on a single + * thread. + */ + fd = open("/sys/kernel/debug/kcov", O_RDWR); + if (fd == -1) + perror("open"), exit(1); + /* Setup trace mode and trace size. */ + if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE)) + perror("ioctl"), exit(1); + /* Mmap buffer shared between kernel- and user-space. */ + cover = (unsigned long*)mmap(NULL, COVER_SIZE * sizeof(unsigned long), + PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if ((void*)cover == MAP_FAILED) + perror("mmap"), exit(1); + /* Enable coverage collection on the current thread. */ + if (ioctl(fd, KCOV_ENABLE, KCOV_TRACE_PC)) + perror("ioctl"), exit(1); + /* Reset coverage from the tail of the ioctl() call. */ + __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED); + /* Call the target syscall call. */ + read(-1, NULL, 0); + /* Read number of PCs collected. */ + n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED); + for (i = 0; i < n; i++) + printf("0x%lx\n", cover[i + 1]); + /* Disable coverage collection for the current thread. After this call + * coverage can be enabled for a different thread. + */ + if (ioctl(fd, KCOV_DISABLE, 0)) + perror("ioctl"), exit(1); + /* Free resources. */ + if (munmap(cover, COVER_SIZE * sizeof(unsigned long))) + perror("munmap"), exit(1); + if (close(fd)) + perror("close"), exit(1); + return 0; + } + +After piping through ``addr2line`` the output of the program looks as follows:: + + SyS_read + fs/read_write.c:562 + __fdget_pos + fs/file.c:774 + __fget_light + fs/file.c:746 + __fget_light + fs/file.c:750 + __fget_light + fs/file.c:760 + __fdget_pos + fs/file.c:784 + SyS_read + fs/read_write.c:562 + +If a program needs to collect coverage from several threads (independently), +it needs to open ``/sys/kernel/debug/kcov`` in each thread separately. + +The interface is fine-grained to allow efficient forking of test processes. +That is, a parent process opens ``/sys/kernel/debug/kcov``, enables trace mode, +mmaps coverage buffer, and then forks child processes in a loop. The child +processes only need to enable coverage (it gets disabled automatically when +a thread exits). + +Comparison operands collection +------------------------------ + +Comparison operands collection is similar to coverage collection: + +.. code-block:: c + + /* Same includes and defines as above. */ + + /* Number of 64-bit words per record. */ + #define KCOV_WORDS_PER_CMP 4 + + /* + * The format for the types of collected comparisons. + * + * Bit 0 shows whether one of the arguments is a compile-time constant. + * Bits 1 & 2 contain log2 of the argument size, up to 8 bytes. + */ + + #define KCOV_CMP_CONST (1 << 0) + #define KCOV_CMP_SIZE(n) ((n) << 1) + #define KCOV_CMP_MASK KCOV_CMP_SIZE(3) + + int main(int argc, char **argv) + { + int fd; + uint64_t *cover, type, arg1, arg2, is_const, size; + unsigned long n, i; + + fd = open("/sys/kernel/debug/kcov", O_RDWR); + if (fd == -1) + perror("open"), exit(1); + if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE)) + perror("ioctl"), exit(1); + /* + * Note that the buffer pointer is of type uint64_t*, because all + * the comparison operands are promoted to uint64_t. + */ + cover = (uint64_t *)mmap(NULL, COVER_SIZE * sizeof(unsigned long), + PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if ((void*)cover == MAP_FAILED) + perror("mmap"), exit(1); + /* Note KCOV_TRACE_CMP instead of KCOV_TRACE_PC. */ + if (ioctl(fd, KCOV_ENABLE, KCOV_TRACE_CMP)) + perror("ioctl"), exit(1); + __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED); + read(-1, NULL, 0); + /* Read number of comparisons collected. */ + n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED); + for (i = 0; i < n; i++) { + uint64_t ip; + + type = cover[i * KCOV_WORDS_PER_CMP + 1]; + /* arg1 and arg2 - operands of the comparison. */ + arg1 = cover[i * KCOV_WORDS_PER_CMP + 2]; + arg2 = cover[i * KCOV_WORDS_PER_CMP + 3]; + /* ip - caller address. */ + ip = cover[i * KCOV_WORDS_PER_CMP + 4]; + /* size of the operands. */ + size = 1 << ((type & KCOV_CMP_MASK) >> 1); + /* is_const - true if either operand is a compile-time constant.*/ + is_const = type & KCOV_CMP_CONST; + printf("ip: 0x%lx type: 0x%lx, arg1: 0x%lx, arg2: 0x%lx, " + "size: %lu, %s\n", + ip, type, arg1, arg2, size, + is_const ? "const" : "non-const"); + } + if (ioctl(fd, KCOV_DISABLE, 0)) + perror("ioctl"), exit(1); + /* Free resources. */ + if (munmap(cover, COVER_SIZE * sizeof(unsigned long))) + perror("munmap"), exit(1); + if (close(fd)) + perror("close"), exit(1); + return 0; + } + +Note that the KCOV modes (collection of code coverage or comparison operands) +are mutually exclusive. + +Remote coverage collection +-------------------------- + +Besides collecting coverage data from handlers of syscalls issued from a +userspace process, KCOV can also collect coverage for parts of the kernel +executing in other contexts - so-called "remote" coverage. + +Using KCOV to collect remote coverage requires: + +1. Modifying kernel code to annotate the code section from where coverage + should be collected with ``kcov_remote_start`` and ``kcov_remote_stop``. + +2. Using ``KCOV_REMOTE_ENABLE`` instead of ``KCOV_ENABLE`` in the userspace + process that collects coverage. + +Both ``kcov_remote_start`` and ``kcov_remote_stop`` annotations and the +``KCOV_REMOTE_ENABLE`` ioctl accept handles that identify particular coverage +collection sections. The way a handle is used depends on the context where the +matching code section executes. + +KCOV supports collecting remote coverage from the following contexts: + +1. Global kernel background tasks. These are the tasks that are spawned during + kernel boot in a limited number of instances (e.g. one USB ``hub_event`` + worker is spawned per one USB HCD). + +2. Local kernel background tasks. These are spawned when a userspace process + interacts with some kernel interface and are usually killed when the process + exits (e.g. vhost workers). + +3. Soft interrupts. + +For #1 and #3, a unique global handle must be chosen and passed to the +corresponding ``kcov_remote_start`` call. Then a userspace process must pass +this handle to ``KCOV_REMOTE_ENABLE`` in the ``handles`` array field of the +``kcov_remote_arg`` struct. This will attach the used KCOV device to the code +section referenced by this handle. Multiple global handles identifying +different code sections can be passed at once. + +For #2, the userspace process instead must pass a non-zero handle through the +``common_handle`` field of the ``kcov_remote_arg`` struct. This common handle +gets saved to the ``kcov_handle`` field in the current ``task_struct`` and +needs to be passed to the newly spawned local tasks via custom kernel code +modifications. Those tasks should in turn use the passed handle in their +``kcov_remote_start`` and ``kcov_remote_stop`` annotations. + +KCOV follows a predefined format for both global and common handles. Each +handle is a ``u64`` integer. Currently, only the one top and the lower 4 bytes +are used. Bytes 4-7 are reserved and must be zero. + +For global handles, the top byte of the handle denotes the id of a subsystem +this handle belongs to. For example, KCOV uses ``1`` as the USB subsystem id. +The lower 4 bytes of a global handle denote the id of a task instance within +that subsystem. For example, each ``hub_event`` worker uses the USB bus number +as the task instance id. + +For common handles, a reserved value ``0`` is used as a subsystem id, as such +handles don't belong to a particular subsystem. The lower 4 bytes of a common +handle identify a collective instance of all local tasks spawned by the +userspace process that passed a common handle to ``KCOV_REMOTE_ENABLE``. + +In practice, any value can be used for common handle instance id if coverage +is only collected from a single userspace process on the system. However, if +common handles are used by multiple processes, unique instance ids must be +used for each process. One option is to use the process id as the common +handle instance id. + +The following program demonstrates using KCOV to collect coverage from both +local tasks spawned by the process and the global task that handles USB bus #1: + +.. code-block:: c + + /* Same includes and defines as above. */ + + struct kcov_remote_arg { + __u32 trace_mode; + __u32 area_size; + __u32 num_handles; + __aligned_u64 common_handle; + __aligned_u64 handles[0]; + }; + + #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long) + #define KCOV_DISABLE _IO('c', 101) + #define KCOV_REMOTE_ENABLE _IOW('c', 102, struct kcov_remote_arg) + + #define COVER_SIZE (64 << 10) + + #define KCOV_TRACE_PC 0 + + #define KCOV_SUBSYSTEM_COMMON (0x00ull << 56) + #define KCOV_SUBSYSTEM_USB (0x01ull << 56) + + #define KCOV_SUBSYSTEM_MASK (0xffull << 56) + #define KCOV_INSTANCE_MASK (0xffffffffull) + + static inline __u64 kcov_remote_handle(__u64 subsys, __u64 inst) + { + if (subsys & ~KCOV_SUBSYSTEM_MASK || inst & ~KCOV_INSTANCE_MASK) + return 0; + return subsys | inst; + } + + #define KCOV_COMMON_ID 0x42 + #define KCOV_USB_BUS_NUM 1 + + int main(int argc, char **argv) + { + int fd; + unsigned long *cover, n, i; + struct kcov_remote_arg *arg; + + fd = open("/sys/kernel/debug/kcov", O_RDWR); + if (fd == -1) + perror("open"), exit(1); + if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE)) + perror("ioctl"), exit(1); + cover = (unsigned long*)mmap(NULL, COVER_SIZE * sizeof(unsigned long), + PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if ((void*)cover == MAP_FAILED) + perror("mmap"), exit(1); + + /* Enable coverage collection via common handle and from USB bus #1. */ + arg = calloc(1, sizeof(*arg) + sizeof(uint64_t)); + if (!arg) + perror("calloc"), exit(1); + arg->trace_mode = KCOV_TRACE_PC; + arg->area_size = COVER_SIZE; + arg->num_handles = 1; + arg->common_handle = kcov_remote_handle(KCOV_SUBSYSTEM_COMMON, + KCOV_COMMON_ID); + arg->handles[0] = kcov_remote_handle(KCOV_SUBSYSTEM_USB, + KCOV_USB_BUS_NUM); + if (ioctl(fd, KCOV_REMOTE_ENABLE, arg)) + perror("ioctl"), free(arg), exit(1); + free(arg); + + /* + * Here the user needs to trigger execution of a kernel code section + * that is either annotated with the common handle, or to trigger some + * activity on USB bus #1. + */ + sleep(2); + + n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED); + for (i = 0; i < n; i++) + printf("0x%lx\n", cover[i + 1]); + if (ioctl(fd, KCOV_DISABLE, 0)) + perror("ioctl"), exit(1); + if (munmap(cover, COVER_SIZE * sizeof(unsigned long))) + perror("munmap"), exit(1); + if (close(fd)) + perror("close"), exit(1); + return 0; + } diff --git a/Documentation/dev-tools/kcsan.rst b/Documentation/dev-tools/kcsan.rst new file mode 100644 index 0000000000..3ae866dcc9 --- /dev/null +++ b/Documentation/dev-tools/kcsan.rst @@ -0,0 +1,366 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. Copyright (C) 2019, Google LLC. + +The Kernel Concurrency Sanitizer (KCSAN) +======================================== + +The Kernel Concurrency Sanitizer (KCSAN) is a dynamic race detector, which +relies on compile-time instrumentation, and uses a watchpoint-based sampling +approach to detect races. KCSAN's primary purpose is to detect `data races`_. + +Usage +----- + +KCSAN is supported by both GCC and Clang. With GCC we require version 11 or +later, and with Clang also require version 11 or later. + +To enable KCSAN configure the kernel with:: + + CONFIG_KCSAN = y + +KCSAN provides several other configuration options to customize behaviour (see +the respective help text in ``lib/Kconfig.kcsan`` for more info). + +Error reports +~~~~~~~~~~~~~ + +A typical data race report looks like this:: + + ================================================================== + BUG: KCSAN: data-race in test_kernel_read / test_kernel_write + + write to 0xffffffffc009a628 of 8 bytes by task 487 on cpu 0: + test_kernel_write+0x1d/0x30 + access_thread+0x89/0xd0 + kthread+0x23e/0x260 + ret_from_fork+0x22/0x30 + + read to 0xffffffffc009a628 of 8 bytes by task 488 on cpu 6: + test_kernel_read+0x10/0x20 + access_thread+0x89/0xd0 + kthread+0x23e/0x260 + ret_from_fork+0x22/0x30 + + value changed: 0x00000000000009a6 -> 0x00000000000009b2 + + Reported by Kernel Concurrency Sanitizer on: + CPU: 6 PID: 488 Comm: access_thread Not tainted 5.12.0-rc2+ #1 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 + ================================================================== + +The header of the report provides a short summary of the functions involved in +the race. It is followed by the access types and stack traces of the 2 threads +involved in the data race. If KCSAN also observed a value change, the observed +old value and new value are shown on the "value changed" line respectively. + +The other less common type of data race report looks like this:: + + ================================================================== + BUG: KCSAN: data-race in test_kernel_rmw_array+0x71/0xd0 + + race at unknown origin, with read to 0xffffffffc009bdb0 of 8 bytes by task 515 on cpu 2: + test_kernel_rmw_array+0x71/0xd0 + access_thread+0x89/0xd0 + kthread+0x23e/0x260 + ret_from_fork+0x22/0x30 + + value changed: 0x0000000000002328 -> 0x0000000000002329 + + Reported by Kernel Concurrency Sanitizer on: + CPU: 2 PID: 515 Comm: access_thread Not tainted 5.12.0-rc2+ #1 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 + ================================================================== + +This report is generated where it was not possible to determine the other +racing thread, but a race was inferred due to the data value of the watched +memory location having changed. These reports always show a "value changed" +line. A common reason for reports of this type are missing instrumentation in +the racing thread, but could also occur due to e.g. DMA accesses. Such reports +are shown only if ``CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=y``, which is +enabled by default. + +Selective analysis +~~~~~~~~~~~~~~~~~~ + +It may be desirable to disable data race detection for specific accesses, +functions, compilation units, or entire subsystems. For static blacklisting, +the below options are available: + +* KCSAN understands the ``data_race(expr)`` annotation, which tells KCSAN that + any data races due to accesses in ``expr`` should be ignored and resulting + behaviour when encountering a data race is deemed safe. Please see + `"Marking Shared-Memory Accesses" in the LKMM`_ for more information. + +* Disabling data race detection for entire functions can be accomplished by + using the function attribute ``__no_kcsan``:: + + __no_kcsan + void foo(void) { + ... + + To dynamically limit for which functions to generate reports, see the + `DebugFS interface`_ blacklist/whitelist feature. + +* To disable data race detection for a particular compilation unit, add to the + ``Makefile``:: + + KCSAN_SANITIZE_file.o := n + +* To disable data race detection for all compilation units listed in a + ``Makefile``, add to the respective ``Makefile``:: + + KCSAN_SANITIZE := n + +.. _"Marking Shared-Memory Accesses" in the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt + +Furthermore, it is possible to tell KCSAN to show or hide entire classes of +data races, depending on preferences. These can be changed via the following +Kconfig options: + +* ``CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY``: If enabled and a conflicting write + is observed via a watchpoint, but the data value of the memory location was + observed to remain unchanged, do not report the data race. + +* ``CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC``: Assume that plain aligned writes + up to word size are atomic by default. Assumes that such writes are not + subject to unsafe compiler optimizations resulting in data races. The option + causes KCSAN to not report data races due to conflicts where the only plain + accesses are aligned writes up to word size. + +* ``CONFIG_KCSAN_PERMISSIVE``: Enable additional permissive rules to ignore + certain classes of common data races. Unlike the above, the rules are more + complex involving value-change patterns, access type, and address. This + option depends on ``CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=y``. For details + please see the ``kernel/kcsan/permissive.h``. Testers and maintainers that + only focus on reports from specific subsystems and not the whole kernel are + recommended to disable this option. + +To use the strictest possible rules, select ``CONFIG_KCSAN_STRICT=y``, which +configures KCSAN to follow the Linux-kernel memory consistency model (LKMM) as +closely as possible. + +DebugFS interface +~~~~~~~~~~~~~~~~~ + +The file ``/sys/kernel/debug/kcsan`` provides the following interface: + +* Reading ``/sys/kernel/debug/kcsan`` returns various runtime statistics. + +* Writing ``on`` or ``off`` to ``/sys/kernel/debug/kcsan`` allows turning KCSAN + on or off, respectively. + +* Writing ``!some_func_name`` to ``/sys/kernel/debug/kcsan`` adds + ``some_func_name`` to the report filter list, which (by default) blacklists + reporting data races where either one of the top stackframes are a function + in the list. + +* Writing either ``blacklist`` or ``whitelist`` to ``/sys/kernel/debug/kcsan`` + changes the report filtering behaviour. For example, the blacklist feature + can be used to silence frequently occurring data races; the whitelist feature + can help with reproduction and testing of fixes. + +Tuning performance +~~~~~~~~~~~~~~~~~~ + +Core parameters that affect KCSAN's overall performance and bug detection +ability are exposed as kernel command-line arguments whose defaults can also be +changed via the corresponding Kconfig options. + +* ``kcsan.skip_watch`` (``CONFIG_KCSAN_SKIP_WATCH``): Number of per-CPU memory + operations to skip, before another watchpoint is set up. Setting up + watchpoints more frequently will result in the likelihood of races to be + observed to increase. This parameter has the most significant impact on + overall system performance and race detection ability. + +* ``kcsan.udelay_task`` (``CONFIG_KCSAN_UDELAY_TASK``): For tasks, the + microsecond delay to stall execution after a watchpoint has been set up. + Larger values result in the window in which we may observe a race to + increase. + +* ``kcsan.udelay_interrupt`` (``CONFIG_KCSAN_UDELAY_INTERRUPT``): For + interrupts, the microsecond delay to stall execution after a watchpoint has + been set up. Interrupts have tighter latency requirements, and their delay + should generally be smaller than the one chosen for tasks. + +They may be tweaked at runtime via ``/sys/module/kcsan/parameters/``. + +Data Races +---------- + +In an execution, two memory accesses form a *data race* if they *conflict*, +they happen concurrently in different threads, and at least one of them is a +*plain access*; they *conflict* if both access the same memory location, and at +least one is a write. For a more thorough discussion and definition, see `"Plain +Accesses and Data Races" in the LKMM`_. + +.. _"Plain Accesses and Data Races" in the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n1922 + +Relationship with the Linux-Kernel Memory Consistency Model (LKMM) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The LKMM defines the propagation and ordering rules of various memory +operations, which gives developers the ability to reason about concurrent code. +Ultimately this allows to determine the possible executions of concurrent code, +and if that code is free from data races. + +KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``, +``atomic_*``, etc.), and a subset of ordering guarantees implied by memory +barriers. With ``CONFIG_KCSAN_WEAK_MEMORY=y``, KCSAN models load or store +buffering, and can detect missing ``smp_mb()``, ``smp_wmb()``, ``smp_rmb()``, +``smp_store_release()``, and all ``atomic_*`` operations with equivalent +implied barriers. + +Note, KCSAN will not report all data races due to missing memory ordering, +specifically where a memory barrier would be required to prohibit subsequent +memory operation from reordering before the barrier. Developers should +therefore carefully consider the required memory ordering requirements that +remain unchecked. + +Race Detection Beyond Data Races +-------------------------------- + +For code with complex concurrency design, race-condition bugs may not always +manifest as data races. Race conditions occur if concurrently executing +operations result in unexpected system behaviour. On the other hand, data races +are defined at the C-language level. The following macros can be used to check +properties of concurrent code where bugs would not manifest as data races. + +.. kernel-doc:: include/linux/kcsan-checks.h + :functions: ASSERT_EXCLUSIVE_WRITER ASSERT_EXCLUSIVE_WRITER_SCOPED + ASSERT_EXCLUSIVE_ACCESS ASSERT_EXCLUSIVE_ACCESS_SCOPED + ASSERT_EXCLUSIVE_BITS + +Implementation Details +---------------------- + +KCSAN relies on observing that two accesses happen concurrently. Crucially, we +want to (a) increase the chances of observing races (especially for races that +manifest rarely), and (b) be able to actually observe them. We can accomplish +(a) by injecting various delays, and (b) by using address watchpoints (or +breakpoints). + +If we deliberately stall a memory access, while we have a watchpoint for its +address set up, and then observe the watchpoint to fire, two accesses to the +same address just raced. Using hardware watchpoints, this is the approach taken +in `DataCollider +<http://usenix.org/legacy/events/osdi10/tech/full_papers/Erickson.pdf>`_. +Unlike DataCollider, KCSAN does not use hardware watchpoints, but instead +relies on compiler instrumentation and "soft watchpoints". + +In KCSAN, watchpoints are implemented using an efficient encoding that stores +access type, size, and address in a long; the benefits of using "soft +watchpoints" are portability and greater flexibility. KCSAN then relies on the +compiler instrumenting plain accesses. For each instrumented plain access: + +1. Check if a matching watchpoint exists; if yes, and at least one access is a + write, then we encountered a racing access. + +2. Periodically, if no matching watchpoint exists, set up a watchpoint and + stall for a small randomized delay. + +3. Also check the data value before the delay, and re-check the data value + after delay; if the values mismatch, we infer a race of unknown origin. + +To detect data races between plain and marked accesses, KCSAN also annotates +marked accesses, but only to check if a watchpoint exists; i.e. KCSAN never +sets up a watchpoint on marked accesses. By never setting up watchpoints for +marked operations, if all accesses to a variable that is accessed concurrently +are properly marked, KCSAN will never trigger a watchpoint and therefore never +report the accesses. + +Modeling Weak Memory +~~~~~~~~~~~~~~~~~~~~ + +KCSAN's approach to detecting data races due to missing memory barriers is +based on modeling access reordering (with ``CONFIG_KCSAN_WEAK_MEMORY=y``). +Each plain memory access for which a watchpoint is set up, is also selected for +simulated reordering within the scope of its function (at most 1 in-flight +access). + +Once an access has been selected for reordering, it is checked along every +other access until the end of the function scope. If an appropriate memory +barrier is encountered, the access will no longer be considered for simulated +reordering. + +When the result of a memory operation should be ordered by a barrier, KCSAN can +then detect data races where the conflict only occurs as a result of a missing +barrier. Consider the example:: + + int x, flag; + void T1(void) + { + x = 1; // data race! + WRITE_ONCE(flag, 1); // correct: smp_store_release(&flag, 1) + } + void T2(void) + { + while (!READ_ONCE(flag)); // correct: smp_load_acquire(&flag) + ... = x; // data race! + } + +When weak memory modeling is enabled, KCSAN can consider ``x`` in ``T1`` for +simulated reordering. After the write of ``flag``, ``x`` is again checked for +concurrent accesses: because ``T2`` is able to proceed after the write of +``flag``, a data race is detected. With the correct barriers in place, ``x`` +would not be considered for reordering after the proper release of ``flag``, +and no data race would be detected. + +Deliberate trade-offs in complexity but also practical limitations mean only a +subset of data races due to missing memory barriers can be detected. With +currently available compiler support, the implementation is limited to modeling +the effects of "buffering" (delaying accesses), since the runtime cannot +"prefetch" accesses. Also recall that watchpoints are only set up for plain +accesses, and the only access type for which KCSAN simulates reordering. This +means reordering of marked accesses is not modeled. + +A consequence of the above is that acquire operations do not require barrier +instrumentation (no prefetching). Furthermore, marked accesses introducing +address or control dependencies do not require special handling (the marked +access cannot be reordered, later dependent accesses cannot be prefetched). + +Key Properties +~~~~~~~~~~~~~~ + +1. **Memory Overhead:** The overall memory overhead is only a few MiB + depending on configuration. The current implementation uses a small array of + longs to encode watchpoint information, which is negligible. + +2. **Performance Overhead:** KCSAN's runtime aims to be minimal, using an + efficient watchpoint encoding that does not require acquiring any shared + locks in the fast-path. For kernel boot on a system with 8 CPUs: + + - 5.0x slow-down with the default KCSAN config; + - 2.8x slow-down from runtime fast-path overhead only (set very large + ``KCSAN_SKIP_WATCH`` and unset ``KCSAN_SKIP_WATCH_RANDOMIZE``). + +3. **Annotation Overheads:** Minimal annotations are required outside the KCSAN + runtime. As a result, maintenance overheads are minimal as the kernel + evolves. + +4. **Detects Racy Writes from Devices:** Due to checking data values upon + setting up watchpoints, racy writes from devices can also be detected. + +5. **Memory Ordering:** KCSAN is aware of only a subset of LKMM ordering rules; + this may result in missed data races (false negatives). + +6. **Analysis Accuracy:** For observed executions, due to using a sampling + strategy, the analysis is *unsound* (false negatives possible), but aims to + be complete (no false positives). + +Alternatives Considered +----------------------- + +An alternative data race detection approach for the kernel can be found in the +`Kernel Thread Sanitizer (KTSAN) <https://github.com/google/ktsan/wiki>`_. +KTSAN is a happens-before data race detector, which explicitly establishes the +happens-before order between memory operations, which can then be used to +determine data races as defined in `Data Races`_. + +To build a correct happens-before relation, KTSAN must be aware of all ordering +rules of the LKMM and synchronization primitives. Unfortunately, any omission +leads to large numbers of false positives, which is especially detrimental in +the context of the kernel which includes numerous custom synchronization +mechanisms. To track the happens-before relation, KTSAN's implementation +requires metadata for each memory location (shadow memory), which for each page +corresponds to 4 pages of shadow memory, and can translate into overhead of +tens of GiB on a large system. diff --git a/Documentation/dev-tools/kfence.rst b/Documentation/dev-tools/kfence.rst new file mode 100644 index 0000000000..936f6aaa75 --- /dev/null +++ b/Documentation/dev-tools/kfence.rst @@ -0,0 +1,333 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. Copyright (C) 2020, Google LLC. + +Kernel Electric-Fence (KFENCE) +============================== + +Kernel Electric-Fence (KFENCE) is a low-overhead sampling-based memory safety +error detector. KFENCE detects heap out-of-bounds access, use-after-free, and +invalid-free errors. + +KFENCE is designed to be enabled in production kernels, and has near zero +performance overhead. Compared to KASAN, KFENCE trades performance for +precision. The main motivation behind KFENCE's design, is that with enough +total uptime KFENCE will detect bugs in code paths not typically exercised by +non-production test workloads. One way to quickly achieve a large enough total +uptime is when the tool is deployed across a large fleet of machines. + +Usage +----- + +To enable KFENCE, configure the kernel with:: + + CONFIG_KFENCE=y + +To build a kernel with KFENCE support, but disabled by default (to enable, set +``kfence.sample_interval`` to non-zero value), configure the kernel with:: + + CONFIG_KFENCE=y + CONFIG_KFENCE_SAMPLE_INTERVAL=0 + +KFENCE provides several other configuration options to customize behaviour (see +the respective help text in ``lib/Kconfig.kfence`` for more info). + +Tuning performance +~~~~~~~~~~~~~~~~~~ + +The most important parameter is KFENCE's sample interval, which can be set via +the kernel boot parameter ``kfence.sample_interval`` in milliseconds. The +sample interval determines the frequency with which heap allocations will be +guarded by KFENCE. The default is configurable via the Kconfig option +``CONFIG_KFENCE_SAMPLE_INTERVAL``. Setting ``kfence.sample_interval=0`` +disables KFENCE. + +The sample interval controls a timer that sets up KFENCE allocations. By +default, to keep the real sample interval predictable, the normal timer also +causes CPU wake-ups when the system is completely idle. This may be undesirable +on power-constrained systems. The boot parameter ``kfence.deferrable=1`` +instead switches to a "deferrable" timer which does not force CPU wake-ups on +idle systems, at the risk of unpredictable sample intervals. The default is +configurable via the Kconfig option ``CONFIG_KFENCE_DEFERRABLE``. + +.. warning:: + The KUnit test suite is very likely to fail when using a deferrable timer + since it currently causes very unpredictable sample intervals. + +The KFENCE memory pool is of fixed size, and if the pool is exhausted, no +further KFENCE allocations occur. With ``CONFIG_KFENCE_NUM_OBJECTS`` (default +255), the number of available guarded objects can be controlled. Each object +requires 2 pages, one for the object itself and the other one used as a guard +page; object pages are interleaved with guard pages, and every object page is +therefore surrounded by two guard pages. + +The total memory dedicated to the KFENCE memory pool can be computed as:: + + ( #objects + 1 ) * 2 * PAGE_SIZE + +Using the default config, and assuming a page size of 4 KiB, results in +dedicating 2 MiB to the KFENCE memory pool. + +Note: On architectures that support huge pages, KFENCE will ensure that the +pool is using pages of size ``PAGE_SIZE``. This will result in additional page +tables being allocated. + +Error reports +~~~~~~~~~~~~~ + +A typical out-of-bounds access looks like this:: + + ================================================================== + BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa6/0x234 + + Out-of-bounds read at 0xffff8c3f2e291fff (1B left of kfence-#72): + test_out_of_bounds_read+0xa6/0x234 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + kfence-#72: 0xffff8c3f2e292000-0xffff8c3f2e29201f, size=32, cache=kmalloc-32 + + allocated by task 484 on cpu 0 at 32.919330s: + test_alloc+0xfe/0x738 + test_out_of_bounds_read+0x9b/0x234 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + CPU: 0 PID: 484 Comm: kunit_try_catch Not tainted 5.13.0-rc3+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 + ================================================================== + +The header of the report provides a short summary of the function involved in +the access. It is followed by more detailed information about the access and +its origin. Note that, real kernel addresses are only shown when using the +kernel command line option ``no_hash_pointers``. + +Use-after-free accesses are reported as:: + + ================================================================== + BUG: KFENCE: use-after-free read in test_use_after_free_read+0xb3/0x143 + + Use-after-free read at 0xffff8c3f2e2a0000 (in kfence-#79): + test_use_after_free_read+0xb3/0x143 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + kfence-#79: 0xffff8c3f2e2a0000-0xffff8c3f2e2a001f, size=32, cache=kmalloc-32 + + allocated by task 488 on cpu 2 at 33.871326s: + test_alloc+0xfe/0x738 + test_use_after_free_read+0x76/0x143 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + freed by task 488 on cpu 2 at 33.871358s: + test_use_after_free_read+0xa8/0x143 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + CPU: 2 PID: 488 Comm: kunit_try_catch Tainted: G B 5.13.0-rc3+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 + ================================================================== + +KFENCE also reports on invalid frees, such as double-frees:: + + ================================================================== + BUG: KFENCE: invalid free in test_double_free+0xdc/0x171 + + Invalid free of 0xffff8c3f2e2a4000 (in kfence-#81): + test_double_free+0xdc/0x171 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + kfence-#81: 0xffff8c3f2e2a4000-0xffff8c3f2e2a401f, size=32, cache=kmalloc-32 + + allocated by task 490 on cpu 1 at 34.175321s: + test_alloc+0xfe/0x738 + test_double_free+0x76/0x171 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + freed by task 490 on cpu 1 at 34.175348s: + test_double_free+0xa8/0x171 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + CPU: 1 PID: 490 Comm: kunit_try_catch Tainted: G B 5.13.0-rc3+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 + ================================================================== + +KFENCE also uses pattern-based redzones on the other side of an object's guard +page, to detect out-of-bounds writes on the unprotected side of the object. +These are reported on frees:: + + ================================================================== + BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184 + + Corrupted memory at 0xffff8c3f2e33aff9 [ 0xac . . . . . . ] (in kfence-#156): + test_kmalloc_aligned_oob_write+0xef/0x184 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + kfence-#156: 0xffff8c3f2e33afb0-0xffff8c3f2e33aff8, size=73, cache=kmalloc-96 + + allocated by task 502 on cpu 7 at 42.159302s: + test_alloc+0xfe/0x738 + test_kmalloc_aligned_oob_write+0x57/0x184 + kunit_try_run_case+0x61/0xa0 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x176/0x1b0 + ret_from_fork+0x22/0x30 + + CPU: 7 PID: 502 Comm: kunit_try_catch Tainted: G B 5.13.0-rc3+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 + ================================================================== + +For such errors, the address where the corruption occurred as well as the +invalidly written bytes (offset from the address) are shown; in this +representation, '.' denote untouched bytes. In the example above ``0xac`` is +the value written to the invalid address at offset 0, and the remaining '.' +denote that no following bytes have been touched. Note that, real values are +only shown if the kernel was booted with ``no_hash_pointers``; to avoid +information disclosure otherwise, '!' is used instead to denote invalidly +written bytes. + +And finally, KFENCE may also report on invalid accesses to any protected page +where it was not possible to determine an associated object, e.g. if adjacent +object pages had not yet been allocated:: + + ================================================================== + BUG: KFENCE: invalid read in test_invalid_access+0x26/0xe0 + + Invalid read at 0xffffffffb670b00a: + test_invalid_access+0x26/0xe0 + kunit_try_run_case+0x51/0x85 + kunit_generic_run_threadfn_adapter+0x16/0x30 + kthread+0x137/0x160 + ret_from_fork+0x22/0x30 + + CPU: 4 PID: 124 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 + ================================================================== + +DebugFS interface +~~~~~~~~~~~~~~~~~ + +Some debugging information is exposed via debugfs: + +* The file ``/sys/kernel/debug/kfence/stats`` provides runtime statistics. + +* The file ``/sys/kernel/debug/kfence/objects`` provides a list of objects + allocated via KFENCE, including those already freed but protected. + +Implementation Details +---------------------- + +Guarded allocations are set up based on the sample interval. After expiration +of the sample interval, the next allocation through the main allocator (SLAB or +SLUB) returns a guarded allocation from the KFENCE object pool (allocation +sizes up to PAGE_SIZE are supported). At this point, the timer is reset, and +the next allocation is set up after the expiration of the interval. + +When using ``CONFIG_KFENCE_STATIC_KEYS=y``, KFENCE allocations are "gated" +through the main allocator's fast-path by relying on static branches via the +static keys infrastructure. The static branch is toggled to redirect the +allocation to KFENCE. Depending on sample interval, target workloads, and +system architecture, this may perform better than the simple dynamic branch. +Careful benchmarking is recommended. + +KFENCE objects each reside on a dedicated page, at either the left or right +page boundaries selected at random. The pages to the left and right of the +object page are "guard pages", whose attributes are changed to a protected +state, and cause page faults on any attempted access. Such page faults are then +intercepted by KFENCE, which handles the fault gracefully by reporting an +out-of-bounds access, and marking the page as accessible so that the faulting +code can (wrongly) continue executing (set ``panic_on_warn`` to panic instead). + +To detect out-of-bounds writes to memory within the object's page itself, +KFENCE also uses pattern-based redzones. For each object page, a redzone is set +up for all non-object memory. For typical alignments, the redzone is only +required on the unguarded side of an object. Because KFENCE must honor the +cache's requested alignment, special alignments may result in unprotected gaps +on either side of an object, all of which are redzoned. + +The following figure illustrates the page layout:: + + ---+-----------+-----------+-----------+-----------+-----------+--- + | xxxxxxxxx | O : | xxxxxxxxx | : O | xxxxxxxxx | + | xxxxxxxxx | B : | xxxxxxxxx | : B | xxxxxxxxx | + | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | + | xxxxxxxxx | E : ZONE | xxxxxxxxx | ZONE : E | xxxxxxxxx | + | xxxxxxxxx | C : | xxxxxxxxx | : C | xxxxxxxxx | + | xxxxxxxxx | T : | xxxxxxxxx | : T | xxxxxxxxx | + ---+-----------+-----------+-----------+-----------+-----------+--- + +Upon deallocation of a KFENCE object, the object's page is again protected and +the object is marked as freed. Any further access to the object causes a fault +and KFENCE reports a use-after-free access. Freed objects are inserted at the +tail of KFENCE's freelist, so that the least recently freed objects are reused +first, and the chances of detecting use-after-frees of recently freed objects +is increased. + +If pool utilization reaches 75% (default) or above, to reduce the risk of the +pool eventually being fully occupied by allocated objects yet ensure diverse +coverage of allocations, KFENCE limits currently covered allocations of the +same source from further filling up the pool. The "source" of an allocation is +based on its partial allocation stack trace. A side-effect is that this also +limits frequent long-lived allocations (e.g. pagecache) of the same source +filling up the pool permanently, which is the most common risk for the pool +becoming full and the sampled allocation rate dropping to zero. The threshold +at which to start limiting currently covered allocations can be configured via +the boot parameter ``kfence.skip_covered_thresh`` (pool usage%). + +Interface +--------- + +The following describes the functions which are used by allocators as well as +page handling code to set up and deal with KFENCE allocations. + +.. kernel-doc:: include/linux/kfence.h + :functions: is_kfence_address + kfence_shutdown_cache + kfence_alloc kfence_free __kfence_free + kfence_ksize kfence_object_start + kfence_handle_page_fault + +Related Tools +------------- + +In userspace, a similar approach is taken by `GWP-ASan +<http://llvm.org/docs/GwpAsan.html>`_. GWP-ASan also relies on guard pages and +a sampling strategy to detect memory unsafety bugs at scale. KFENCE's design is +directly influenced by GWP-ASan, and can be seen as its kernel sibling. Another +similar but non-sampling approach, that also inspired the name "KFENCE", can be +found in the userspace `Electric Fence Malloc Debugger +<https://linux.die.net/man/3/efence>`_. + +In the kernel, several tools exist to debug memory access errors, and in +particular KASAN can detect all bug classes that KFENCE can detect. While KASAN +is more precise, relying on compiler instrumentation, this comes at a +performance cost. + +It is worth highlighting that KASAN and KFENCE are complementary, with +different target environments. For instance, KASAN is the better debugging-aid, +where test cases or reproducers exists: due to the lower chance to detect the +error, it would require more effort using KFENCE to debug. Deployments at scale +that cannot afford to enable KASAN, however, would benefit from using KFENCE to +discover bugs due to code paths not exercised by test cases or fuzzers. diff --git a/Documentation/dev-tools/kgdb.rst b/Documentation/dev-tools/kgdb.rst new file mode 100644 index 0000000000..f83ba2601e --- /dev/null +++ b/Documentation/dev-tools/kgdb.rst @@ -0,0 +1,939 @@ +================================================= +Using kgdb, kdb and the kernel debugger internals +================================================= + +:Author: Jason Wessel + +Introduction +============ + +The kernel has two different debugger front ends (kdb and kgdb) which +interface to the debug core. It is possible to use either of the +debugger front ends and dynamically transition between them if you +configure the kernel properly at compile and runtime. + +Kdb is simplistic shell-style interface which you can use on a system +console with a keyboard or serial console. You can use it to inspect +memory, registers, process lists, dmesg, and even set breakpoints to +stop in a certain location. Kdb is not a source level debugger, although +you can set breakpoints and execute some basic kernel run control. Kdb +is mainly aimed at doing some analysis to aid in development or +diagnosing kernel problems. You can access some symbols by name in +kernel built-ins or in kernel modules if the code was built with +``CONFIG_KALLSYMS``. + +Kgdb is intended to be used as a source level debugger for the Linux +kernel. It is used along with gdb to debug a Linux kernel. The +expectation is that gdb can be used to "break in" to the kernel to +inspect memory, variables and look through call stack information +similar to the way an application developer would use gdb to debug an +application. It is possible to place breakpoints in kernel code and +perform some limited execution stepping. + +Two machines are required for using kgdb. One of these machines is a +development machine and the other is the target machine. The kernel to +be debugged runs on the target machine. The development machine runs an +instance of gdb against the vmlinux file which contains the symbols (not +a boot image such as bzImage, zImage, uImage...). In gdb the developer +specifies the connection parameters and connects to kgdb. The type of +connection a developer makes with gdb depends on the availability of +kgdb I/O modules compiled as built-ins or loadable kernel modules in the +test machine's kernel. + +Compiling a kernel +================== + +- In order to enable compilation of kdb, you must first enable kgdb. + +- The kgdb test compile options are described in the kgdb test suite + chapter. + +Kernel config options for kgdb +------------------------------ + +To enable ``CONFIG_KGDB`` you should look under +:menuselection:`Kernel hacking --> Kernel debugging` and select +:menuselection:`KGDB: kernel debugger`. + +While it is not a hard requirement that you have symbols in your vmlinux +file, gdb tends not to be very useful without the symbolic data, so you +will want to turn on ``CONFIG_DEBUG_INFO`` which is called +:menuselection:`Compile the kernel with debug info` in the config menu. + +It is advised, but not required, that you turn on the +``CONFIG_FRAME_POINTER`` kernel option which is called :menuselection:`Compile +the kernel with frame pointers` in the config menu. This option inserts code +into the compiled executable which saves the frame information in registers +or on the stack at different points which allows a debugger such as gdb to +more accurately construct stack back traces while debugging the kernel. + +If the architecture that you are using supports the kernel option +``CONFIG_STRICT_KERNEL_RWX``, you should consider turning it off. This +option will prevent the use of software breakpoints because it marks +certain regions of the kernel's memory space as read-only. If kgdb +supports it for the architecture you are using, you can use hardware +breakpoints if you desire to run with the ``CONFIG_STRICT_KERNEL_RWX`` +option turned on, else you need to turn off this option. + +Next you should choose one of more I/O drivers to interconnect debugging +host and debugged target. Early boot debugging requires a KGDB I/O +driver that supports early debugging and the driver must be built into +the kernel directly. Kgdb I/O driver configuration takes place via +kernel or module parameters which you can learn more about in the in the +section that describes the parameter kgdboc. + +Here is an example set of ``.config`` symbols to enable or disable for kgdb:: + + # CONFIG_STRICT_KERNEL_RWX is not set + CONFIG_FRAME_POINTER=y + CONFIG_KGDB=y + CONFIG_KGDB_SERIAL_CONSOLE=y + +Kernel config options for kdb +----------------------------- + +Kdb is quite a bit more complex than the simple gdbstub sitting on top +of the kernel's debug core. Kdb must implement a shell, and also adds +some helper functions in other parts of the kernel, responsible for +printing out interesting data such as what you would see if you ran +``lsmod``, or ``ps``. In order to build kdb into the kernel you follow the +same steps as you would for kgdb. + +The main config option for kdb is ``CONFIG_KGDB_KDB`` which is called +:menuselection:`KGDB_KDB: include kdb frontend for kgdb` in the config menu. +In theory you would have already also selected an I/O driver such as the +``CONFIG_KGDB_SERIAL_CONSOLE`` interface if you plan on using kdb on a +serial port, when you were configuring kgdb. + +If you want to use a PS/2-style keyboard with kdb, you would select +``CONFIG_KDB_KEYBOARD`` which is called :menuselection:`KGDB_KDB: keyboard as +input device` in the config menu. The ``CONFIG_KDB_KEYBOARD`` option is not +used for anything in the gdb interface to kgdb. The ``CONFIG_KDB_KEYBOARD`` +option only works with kdb. + +Here is an example set of ``.config`` symbols to enable/disable kdb:: + + # CONFIG_STRICT_KERNEL_RWX is not set + CONFIG_FRAME_POINTER=y + CONFIG_KGDB=y + CONFIG_KGDB_SERIAL_CONSOLE=y + CONFIG_KGDB_KDB=y + CONFIG_KDB_KEYBOARD=y + +Kernel Debugger Boot Arguments +============================== + +This section describes the various runtime kernel parameters that affect +the configuration of the kernel debugger. The following chapter covers +using kdb and kgdb as well as providing some examples of the +configuration parameters. + +Kernel parameter: kgdboc +------------------------ + +The kgdboc driver was originally an abbreviation meant to stand for +"kgdb over console". Today it is the primary mechanism to configure how +to communicate from gdb to kgdb as well as the devices you want to use +to interact with the kdb shell. + +For kgdb/gdb, kgdboc is designed to work with a single serial port. It +is intended to cover the circumstance where you want to use a serial +console as your primary console as well as using it to perform kernel +debugging. It is also possible to use kgdb on a serial port which is not +designated as a system console. Kgdboc may be configured as a kernel +built-in or a kernel loadable module. You can only make use of +``kgdbwait`` and early debugging if you build kgdboc into the kernel as +a built-in. + +Optionally you can elect to activate kms (Kernel Mode Setting) +integration. When you use kms with kgdboc and you have a video driver +that has atomic mode setting hooks, it is possible to enter the debugger +on the graphics console. When the kernel execution is resumed, the +previous graphics mode will be restored. This integration can serve as a +useful tool to aid in diagnosing crashes or doing analysis of memory +with kdb while allowing the full graphics console applications to run. + +kgdboc arguments +~~~~~~~~~~~~~~~~ + +Usage:: + + kgdboc=[kms][[,]kbd][[,]serial_device][,baud] + +The order listed above must be observed if you use any of the optional +configurations together. + +Abbreviations: + +- kms = Kernel Mode Setting + +- kbd = Keyboard + +You can configure kgdboc to use the keyboard, and/or a serial device +depending on if you are using kdb and/or kgdb, in one of the following +scenarios. The order listed above must be observed if you use any of the +optional configurations together. Using kms + only gdb is generally not +a useful combination. + +Using loadable module or built-in +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +1. As a kernel built-in: + + Use the kernel boot argument:: + + kgdboc=<tty-device>,[baud] + +2. As a kernel loadable module: + + Use the command:: + + modprobe kgdboc kgdboc=<tty-device>,[baud] + + Here are two examples of how you might format the kgdboc string. The + first is for an x86 target using the first serial port. The second + example is for the ARM Versatile AB using the second serial port. + + 1. ``kgdboc=ttyS0,115200`` + + 2. ``kgdboc=ttyAMA1,115200`` + +Configure kgdboc at runtime with sysfs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +At run time you can enable or disable kgdboc by echoing a parameters +into the sysfs. Here are two examples: + +1. Enable kgdboc on ttyS0:: + + echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc + +2. Disable kgdboc:: + + echo "" > /sys/module/kgdboc/parameters/kgdboc + +.. note:: + + You do not need to specify the baud if you are configuring the + console on tty which is already configured or open. + +More examples +^^^^^^^^^^^^^ + +You can configure kgdboc to use the keyboard, and/or a serial device +depending on if you are using kdb and/or kgdb, in one of the following +scenarios. + +1. kdb and kgdb over only a serial port:: + + kgdboc=<serial_device>[,baud] + + Example:: + + kgdboc=ttyS0,115200 + +2. kdb and kgdb with keyboard and a serial port:: + + kgdboc=kbd,<serial_device>[,baud] + + Example:: + + kgdboc=kbd,ttyS0,115200 + +3. kdb with a keyboard:: + + kgdboc=kbd + +4. kdb with kernel mode setting:: + + kgdboc=kms,kbd + +5. kdb with kernel mode setting and kgdb over a serial port:: + + kgdboc=kms,kbd,ttyS0,115200 + +.. note:: + + Kgdboc does not support interrupting the target via the gdb remote + protocol. You must manually send a :kbd:`SysRq-G` unless you have a proxy + that splits console output to a terminal program. A console proxy has a + separate TCP port for the debugger and a separate TCP port for the + "human" console. The proxy can take care of sending the :kbd:`SysRq-G` + for you. + +When using kgdboc with no debugger proxy, you can end up connecting the +debugger at one of two entry points. If an exception occurs after you +have loaded kgdboc, a message should print on the console stating it is +waiting for the debugger. In this case you disconnect your terminal +program and then connect the debugger in its place. If you want to +interrupt the target system and forcibly enter a debug session you have +to issue a :kbd:`Sysrq` sequence and then type the letter :kbd:`g`. Then you +disconnect the terminal session and connect gdb. Your options if you +don't like this are to hack gdb to send the :kbd:`SysRq-G` for you as well as +on the initial connect, or to use a debugger proxy that allows an +unmodified gdb to do the debugging. + +Kernel parameter: ``kgdboc_earlycon`` +------------------------------------- + +If you specify the kernel parameter ``kgdboc_earlycon`` and your serial +driver registers a boot console that supports polling (doesn't need +interrupts and implements a nonblocking read() function) kgdb will attempt +to work using the boot console until it can transition to the regular +tty driver specified by the ``kgdboc`` parameter. + +Normally there is only one boot console (especially that implements the +read() function) so just adding ``kgdboc_earlycon`` on its own is +sufficient to make this work. If you have more than one boot console you +can add the boot console's name to differentiate. Note that names that +are registered through the boot console layer and the tty layer are not +the same for the same port. + +For instance, on one board to be explicit you might do:: + + kgdboc_earlycon=qcom_geni kgdboc=ttyMSM0 + +If the only boot console on the device was "qcom_geni", you could simplify:: + + kgdboc_earlycon kgdboc=ttyMSM0 + +Kernel parameter: ``kgdbwait`` +------------------------------ + +The Kernel command line option ``kgdbwait`` makes kgdb wait for a +debugger connection during booting of a kernel. You can only use this +option if you compiled a kgdb I/O driver into the kernel and you +specified the I/O driver configuration as a kernel command line option. +The kgdbwait parameter should always follow the configuration parameter +for the kgdb I/O driver in the kernel command line else the I/O driver +will not be configured prior to asking the kernel to use it to wait. + +The kernel will stop and wait as early as the I/O driver and +architecture allows when you use this option. If you build the kgdb I/O +driver as a loadable kernel module kgdbwait will not do anything. + +Kernel parameter: ``kgdbcon`` +----------------------------- + +The ``kgdbcon`` feature allows you to see printk() messages inside gdb +while gdb is connected to the kernel. Kdb does not make use of the kgdbcon +feature. + +Kgdb supports using the gdb serial protocol to send console messages to +the debugger when the debugger is connected and running. There are two +ways to activate this feature. + +1. Activate with the kernel command line option:: + + kgdbcon + +2. Use sysfs before configuring an I/O driver:: + + echo 1 > /sys/module/kgdb/parameters/kgdb_use_con + +.. note:: + + If you do this after you configure the kgdb I/O driver, the + setting will not take effect until the next point the I/O is + reconfigured. + +.. important:: + + You cannot use kgdboc + kgdbcon on a tty that is an + active system console. An example of incorrect usage is:: + + console=ttyS0,115200 kgdboc=ttyS0 kgdbcon + +It is possible to use this option with kgdboc on a tty that is not a +system console. + +Run time parameter: ``kgdbreboot`` +---------------------------------- + +The kgdbreboot feature allows you to change how the debugger deals with +the reboot notification. You have 3 choices for the behavior. The +default behavior is always set to 0. + +.. tabularcolumns:: |p{0.4cm}|p{11.5cm}|p{5.6cm}| + +.. flat-table:: + :widths: 1 10 8 + + * - 1 + - ``echo -1 > /sys/module/debug_core/parameters/kgdbreboot`` + - Ignore the reboot notification entirely. + + * - 2 + - ``echo 0 > /sys/module/debug_core/parameters/kgdbreboot`` + - Send the detach message to any attached debugger client. + + * - 3 + - ``echo 1 > /sys/module/debug_core/parameters/kgdbreboot`` + - Enter the debugger on reboot notify. + +Kernel parameter: ``nokaslr`` +----------------------------- + +If the architecture that you are using enable KASLR by default, +you should consider turning it off. KASLR randomizes the +virtual address where the kernel image is mapped and confuse +gdb which resolve kernel symbol address from symbol table +of vmlinux. + +Using kdb +========= + +Quick start for kdb on a serial port +------------------------------------ + +This is a quick example of how to use kdb. + +1. Configure kgdboc at boot using kernel parameters:: + + console=ttyS0,115200 kgdboc=ttyS0,115200 nokaslr + + OR + + Configure kgdboc after the kernel has booted; assuming you are using + a serial port console:: + + echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc + +2. Enter the kernel debugger manually or by waiting for an oops or + fault. There are several ways you can enter the kernel debugger + manually; all involve using the :kbd:`SysRq-G`, which means you must have + enabled ``CONFIG_MAGIC_SYSRQ=y`` in your kernel config. + + - When logged in as root or with a super user session you can run:: + + echo g > /proc/sysrq-trigger + + - Example using minicom 2.2 + + Press: :kbd:`CTRL-A` :kbd:`f` :kbd:`g` + + - When you have telneted to a terminal server that supports sending + a remote break + + Press: :kbd:`CTRL-]` + + Type in: ``send break`` + + Press: :kbd:`Enter` :kbd:`g` + +3. From the kdb prompt you can run the ``help`` command to see a complete + list of the commands that are available. + + Some useful commands in kdb include: + + =========== ================================================================= + ``lsmod`` Shows where kernel modules are loaded + ``ps`` Displays only the active processes + ``ps A`` Shows all the processes + ``summary`` Shows kernel version info and memory usage + ``bt`` Get a backtrace of the current process using dump_stack() + ``dmesg`` View the kernel syslog buffer + ``go`` Continue the system + =========== ================================================================= + +4. When you are done using kdb you need to consider rebooting the system + or using the ``go`` command to resuming normal kernel execution. If you + have paused the kernel for a lengthy period of time, applications + that rely on timely networking or anything to do with real wall clock + time could be adversely affected, so you should take this into + consideration when using the kernel debugger. + +Quick start for kdb using a keyboard connected console +------------------------------------------------------ + +This is a quick example of how to use kdb with a keyboard. + +1. Configure kgdboc at boot using kernel parameters:: + + kgdboc=kbd + + OR + + Configure kgdboc after the kernel has booted:: + + echo kbd > /sys/module/kgdboc/parameters/kgdboc + +2. Enter the kernel debugger manually or by waiting for an oops or + fault. There are several ways you can enter the kernel debugger + manually; all involve using the :kbd:`SysRq-G`, which means you must have + enabled ``CONFIG_MAGIC_SYSRQ=y`` in your kernel config. + + - When logged in as root or with a super user session you can run:: + + echo g > /proc/sysrq-trigger + + - Example using a laptop keyboard: + + Press and hold down: :kbd:`Alt` + + Press and hold down: :kbd:`Fn` + + Press and release the key with the label: :kbd:`SysRq` + + Release: :kbd:`Fn` + + Press and release: :kbd:`g` + + Release: :kbd:`Alt` + + - Example using a PS/2 101-key keyboard + + Press and hold down: :kbd:`Alt` + + Press and release the key with the label: :kbd:`SysRq` + + Press and release: :kbd:`g` + + Release: :kbd:`Alt` + +3. Now type in a kdb command such as ``help``, ``dmesg``, ``bt`` or ``go`` to + continue kernel execution. + +Using kgdb / gdb +================ + +In order to use kgdb you must activate it by passing configuration +information to one of the kgdb I/O drivers. If you do not pass any +configuration information kgdb will not do anything at all. Kgdb will +only actively hook up to the kernel trap hooks if a kgdb I/O driver is +loaded and configured. If you unconfigure a kgdb I/O driver, kgdb will +unregister all the kernel hook points. + +All kgdb I/O drivers can be reconfigured at run time, if +``CONFIG_SYSFS`` and ``CONFIG_MODULES`` are enabled, by echo'ing a new +config string to ``/sys/module/<driver>/parameter/<option>``. The driver +can be unconfigured by passing an empty string. You cannot change the +configuration while the debugger is attached. Make sure to detach the +debugger with the ``detach`` command prior to trying to unconfigure a +kgdb I/O driver. + +Connecting with gdb to a serial port +------------------------------------ + +1. Configure kgdboc + + Configure kgdboc at boot using kernel parameters:: + + kgdboc=ttyS0,115200 + + OR + + Configure kgdboc after the kernel has booted:: + + echo ttyS0 > /sys/module/kgdboc/parameters/kgdboc + +2. Stop kernel execution (break into the debugger) + + In order to connect to gdb via kgdboc, the kernel must first be + stopped. There are several ways to stop the kernel which include + using kgdbwait as a boot argument, via a :kbd:`SysRq-G`, or running the + kernel until it takes an exception where it waits for the debugger to + attach. + + - When logged in as root or with a super user session you can run:: + + echo g > /proc/sysrq-trigger + + - Example using minicom 2.2 + + Press: :kbd:`CTRL-A` :kbd:`f` :kbd:`g` + + - When you have telneted to a terminal server that supports sending + a remote break + + Press: :kbd:`CTRL-]` + + Type in: ``send break`` + + Press: :kbd:`Enter` :kbd:`g` + +3. Connect from gdb + + Example (using a directly connected port):: + + % gdb ./vmlinux + (gdb) set serial baud 115200 + (gdb) target remote /dev/ttyS0 + + + Example (kgdb to a terminal server on TCP port 2012):: + + % gdb ./vmlinux + (gdb) target remote 192.168.2.2:2012 + + + Once connected, you can debug a kernel the way you would debug an + application program. + + If you are having problems connecting or something is going seriously + wrong while debugging, it will most often be the case that you want + to enable gdb to be verbose about its target communications. You do + this prior to issuing the ``target remote`` command by typing in:: + + set debug remote 1 + +Remember if you continue in gdb, and need to "break in" again, you need +to issue an other :kbd:`SysRq-G`. It is easy to create a simple entry point by +putting a breakpoint at ``sys_sync`` and then you can run ``sync`` from a +shell or script to break into the debugger. + +kgdb and kdb interoperability +============================= + +It is possible to transition between kdb and kgdb dynamically. The debug +core will remember which you used the last time and automatically start +in the same mode. + +Switching between kdb and kgdb +------------------------------ + +Switching from kgdb to kdb +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +There are two ways to switch from kgdb to kdb: you can use gdb to issue +a maintenance packet, or you can blindly type the command ``$3#33``. +Whenever the kernel debugger stops in kgdb mode it will print the +message ``KGDB or $3#33 for KDB``. It is important to note that you have +to type the sequence correctly in one pass. You cannot type a backspace +or delete because kgdb will interpret that as part of the debug stream. + +1. Change from kgdb to kdb by blindly typing:: + + $3#33 + +2. Change from kgdb to kdb with gdb:: + + maintenance packet 3 + + .. note:: + + Now you must kill gdb. Typically you press :kbd:`CTRL-Z` and issue + the command:: + + kill -9 % + +Change from kdb to kgdb +~~~~~~~~~~~~~~~~~~~~~~~ + +There are two ways you can change from kdb to kgdb. You can manually +enter kgdb mode by issuing the kgdb command from the kdb shell prompt, +or you can connect gdb while the kdb shell prompt is active. The kdb +shell looks for the typical first commands that gdb would issue with the +gdb remote protocol and if it sees one of those commands it +automatically changes into kgdb mode. + +1. From kdb issue the command:: + + kgdb + + Now disconnect your terminal program and connect gdb in its place + +2. At the kdb prompt, disconnect the terminal program and connect gdb in + its place. + +Running kdb commands from gdb +----------------------------- + +It is possible to run a limited set of kdb commands from gdb, using the +gdb monitor command. You don't want to execute any of the run control or +breakpoint operations, because it can disrupt the state of the kernel +debugger. You should be using gdb for breakpoints and run control +operations if you have gdb connected. The more useful commands to run +are things like lsmod, dmesg, ps or possibly some of the memory +information commands. To see all the kdb commands you can run +``monitor help``. + +Example:: + + (gdb) monitor ps + 1 idle process (state I) and + 27 sleeping system daemon (state M) processes suppressed, + use 'ps A' to see all. + Task Addr Pid Parent [*] cpu State Thread Command + + 0xc78291d0 1 0 0 0 S 0xc7829404 init + 0xc7954150 942 1 0 0 S 0xc7954384 dropbear + 0xc78789c0 944 1 0 0 S 0xc7878bf4 sh + (gdb) + +kgdb Test Suite +=============== + +When kgdb is enabled in the kernel config you can also elect to enable +the config parameter ``KGDB_TESTS``. Turning this on will enable a special +kgdb I/O module which is designed to test the kgdb internal functions. + +The kgdb tests are mainly intended for developers to test the kgdb +internals as well as a tool for developing a new kgdb architecture +specific implementation. These tests are not really for end users of the +Linux kernel. The primary source of documentation would be to look in +the ``drivers/misc/kgdbts.c`` file. + +The kgdb test suite can also be configured at compile time to run the +core set of tests by setting the kernel config parameter +``KGDB_TESTS_ON_BOOT``. This particular option is aimed at automated +regression testing and does not require modifying the kernel boot config +arguments. If this is turned on, the kgdb test suite can be disabled by +specifying ``kgdbts=`` as a kernel boot argument. + +Kernel Debugger Internals +========================= + +Architecture Specifics +---------------------- + +The kernel debugger is organized into a number of components: + +1. The debug core + + The debug core is found in ``kernel/debugger/debug_core.c``. It + contains: + + - A generic OS exception handler which includes sync'ing the + processors into a stopped state on an multi-CPU system. + + - The API to talk to the kgdb I/O drivers + + - The API to make calls to the arch-specific kgdb implementation + + - The logic to perform safe memory reads and writes to memory while + using the debugger + + - A full implementation for software breakpoints unless overridden + by the arch + + - The API to invoke either the kdb or kgdb frontend to the debug + core. + + - The structures and callback API for atomic kernel mode setting. + + .. note:: kgdboc is where the kms callbacks are invoked. + +2. kgdb arch-specific implementation + + This implementation is generally found in ``arch/*/kernel/kgdb.c``. As + an example, ``arch/x86/kernel/kgdb.c`` contains the specifics to + implement HW breakpoint as well as the initialization to dynamically + register and unregister for the trap handlers on this architecture. + The arch-specific portion implements: + + - contains an arch-specific trap catcher which invokes + kgdb_handle_exception() to start kgdb about doing its work + + - translation to and from gdb specific packet format to struct pt_regs + + - Registration and unregistration of architecture specific trap + hooks + + - Any special exception handling and cleanup + + - NMI exception handling and cleanup + + - (optional) HW breakpoints + +3. gdbstub frontend (aka kgdb) + + The gdbstub is located in ``kernel/debug/gdbstub.c``. It contains: + + - All the logic to implement the gdb serial protocol + +4. kdb frontend + + The kdb debugger shell is broken down into a number of components. + The kdb core is located in kernel/debug/kdb. There are a number of + helper functions in some of the other kernel components to make it + possible for kdb to examine and report information about the kernel + without taking locks that could cause a kernel deadlock. The kdb core + contains implements the following functionality. + + - A simple shell + + - The kdb core command set + + - A registration API to register additional kdb shell commands. + + - A good example of a self-contained kdb module is the ``ftdump`` + command for dumping the ftrace buffer. See: + ``kernel/trace/trace_kdb.c`` + + - For an example of how to dynamically register a new kdb command + you can build the kdb_hello.ko kernel module from + ``samples/kdb/kdb_hello.c``. To build this example you can set + ``CONFIG_SAMPLES=y`` and ``CONFIG_SAMPLE_KDB=m`` in your kernel + config. Later run ``modprobe kdb_hello`` and the next time you + enter the kdb shell, you can run the ``hello`` command. + + - The implementation for kdb_printf() which emits messages directly + to I/O drivers, bypassing the kernel log. + + - SW / HW breakpoint management for the kdb shell + +5. kgdb I/O driver + + Each kgdb I/O driver has to provide an implementation for the + following: + + - configuration via built-in or module + + - dynamic configuration and kgdb hook registration calls + + - read and write character interface + + - A cleanup handler for unconfiguring from the kgdb core + + - (optional) Early debug methodology + + Any given kgdb I/O driver has to operate very closely with the + hardware and must do it in such a way that does not enable interrupts + or change other parts of the system context without completely + restoring them. The kgdb core will repeatedly "poll" a kgdb I/O + driver for characters when it needs input. The I/O driver is expected + to return immediately if there is no data available. Doing so allows + for the future possibility to touch watchdog hardware in such a way + as to have a target system not reset when these are enabled. + +If you are intent on adding kgdb architecture specific support for a new +architecture, the architecture should define ``HAVE_ARCH_KGDB`` in the +architecture specific Kconfig file. This will enable kgdb for the +architecture, and at that point you must create an architecture specific +kgdb implementation. + +There are a few flags which must be set on every architecture in their +``asm/kgdb.h`` file. These are: + +- ``NUMREGBYTES``: + The size in bytes of all of the registers, so that we + can ensure they will all fit into a packet. + +- ``BUFMAX``: + The size in bytes of the buffer GDB will read into. This must + be larger than NUMREGBYTES. + +- ``CACHE_FLUSH_IS_SAFE``: + Set to 1 if it is always safe to call + flush_cache_range or flush_icache_range. On some architectures, + these functions may not be safe to call on SMP since we keep other + CPUs in a holding pattern. + +There are also the following functions for the common backend, found in +``kernel/kgdb.c``, that must be supplied by the architecture-specific +backend unless marked as (optional), in which case a default function +maybe used if the architecture does not need to provide a specific +implementation. + +.. kernel-doc:: include/linux/kgdb.h + :internal: + +kgdboc internals +---------------- + +kgdboc and uarts +~~~~~~~~~~~~~~~~ + +The kgdboc driver is actually a very thin driver that relies on the +underlying low level to the hardware driver having "polling hooks" to +which the tty driver is attached. In the initial implementation of +kgdboc the serial_core was changed to expose a low level UART hook for +doing polled mode reading and writing of a single character while in an +atomic context. When kgdb makes an I/O request to the debugger, kgdboc +invokes a callback in the serial core which in turn uses the callback in +the UART driver. + +When using kgdboc with a UART, the UART driver must implement two +callbacks in the struct uart_ops. +Example from ``drivers/8250.c``:: + + + #ifdef CONFIG_CONSOLE_POLL + .poll_get_char = serial8250_get_poll_char, + .poll_put_char = serial8250_put_poll_char, + #endif + + +Any implementation specifics around creating a polling driver use the +``#ifdef CONFIG_CONSOLE_POLL``, as shown above. Keep in mind that +polling hooks have to be implemented in such a way that they can be +called from an atomic context and have to restore the state of the UART +chip on return such that the system can return to normal when the +debugger detaches. You need to be very careful with any kind of lock you +consider, because failing here is most likely going to mean pressing the +reset button. + +kgdboc and keyboards +~~~~~~~~~~~~~~~~~~~~~~~~ + +The kgdboc driver contains logic to configure communications with an +attached keyboard. The keyboard infrastructure is only compiled into the +kernel when ``CONFIG_KDB_KEYBOARD=y`` is set in the kernel configuration. + +The core polled keyboard driver for PS/2 type keyboards is in +``drivers/char/kdb_keyboard.c``. This driver is hooked into the debug core +when kgdboc populates the callback in the array called +:c:expr:`kdb_poll_funcs[]`. The kdb_get_kbd_char() is the top-level +function which polls hardware for single character input. + +kgdboc and kms +~~~~~~~~~~~~~~~~~~ + +The kgdboc driver contains logic to request the graphics display to +switch to a text context when you are using ``kgdboc=kms,kbd``, provided +that you have a video driver which has a frame buffer console and atomic +kernel mode setting support. + +Every time the kernel debugger is entered it calls +kgdboc_pre_exp_handler() which in turn calls con_debug_enter() +in the virtual console layer. On resuming kernel execution, the kernel +debugger calls kgdboc_post_exp_handler() which in turn calls +con_debug_leave(). + +Any video driver that wants to be compatible with the kernel debugger +and the atomic kms callbacks must implement the ``mode_set_base_atomic``, +``fb_debug_enter`` and ``fb_debug_leave operations``. For the +``fb_debug_enter`` and ``fb_debug_leave`` the option exists to use the +generic drm fb helper functions or implement something custom for the +hardware. The following example shows the initialization of the +.mode_set_base_atomic operation in +drivers/gpu/drm/i915/intel_display.c:: + + + static const struct drm_crtc_helper_funcs intel_helper_funcs = { + [...] + .mode_set_base_atomic = intel_pipe_set_base_atomic, + [...] + }; + + +Here is an example of how the i915 driver initializes the +fb_debug_enter and fb_debug_leave functions to use the generic drm +helpers in ``drivers/gpu/drm/i915/intel_fb.c``:: + + + static struct fb_ops intelfb_ops = { + [...] + .fb_debug_enter = drm_fb_helper_debug_enter, + .fb_debug_leave = drm_fb_helper_debug_leave, + [...] + }; + + +Credits +======= + +The following people have contributed to this document: + +1. Amit Kale <amitkale@linsyssoft.com> + +2. Tom Rini <trini@kernel.crashing.org> + +In March 2008 this document was completely rewritten by: + +- Jason Wessel <jason.wessel@windriver.com> + +In Jan 2010 this document was updated to include kdb. + +- Jason Wessel <jason.wessel@windriver.com> diff --git a/Documentation/dev-tools/kmemleak.rst b/Documentation/dev-tools/kmemleak.rst new file mode 100644 index 0000000000..2cb00b5333 --- /dev/null +++ b/Documentation/dev-tools/kmemleak.rst @@ -0,0 +1,258 @@ +Kernel Memory Leak Detector +=========================== + +Kmemleak provides a way of detecting possible kernel memory leaks in a +way similar to a `tracing garbage collector +<https://en.wikipedia.org/wiki/Tracing_garbage_collection>`_, +with the difference that the orphan objects are not freed but only +reported via /sys/kernel/debug/kmemleak. A similar method is used by the +Valgrind tool (``memcheck --leak-check``) to detect the memory leaks in +user-space applications. + +Usage +----- + +CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel +thread scans the memory every 10 minutes (by default) and prints the +number of new unreferenced objects found. If the ``debugfs`` isn't already +mounted, mount with:: + + # mount -t debugfs nodev /sys/kernel/debug/ + +To display the details of all the possible scanned memory leaks:: + + # cat /sys/kernel/debug/kmemleak + +To trigger an intermediate memory scan:: + + # echo scan > /sys/kernel/debug/kmemleak + +To clear the list of all current possible memory leaks:: + + # echo clear > /sys/kernel/debug/kmemleak + +New leaks will then come up upon reading ``/sys/kernel/debug/kmemleak`` +again. + +Note that the orphan objects are listed in the order they were allocated +and one object at the beginning of the list may cause other subsequent +objects to be reported as orphan. + +Memory scanning parameters can be modified at run-time by writing to the +``/sys/kernel/debug/kmemleak`` file. The following parameters are supported: + +- off + disable kmemleak (irreversible) +- stack=on + enable the task stacks scanning (default) +- stack=off + disable the tasks stacks scanning +- scan=on + start the automatic memory scanning thread (default) +- scan=off + stop the automatic memory scanning thread +- scan=<secs> + set the automatic memory scanning period in seconds + (default 600, 0 to stop the automatic scanning) +- scan + trigger a memory scan +- clear + clear list of current memory leak suspects, done by + marking all current reported unreferenced objects grey, + or free all kmemleak objects if kmemleak has been disabled. +- dump=<addr> + dump information about the object found at <addr> + +Kmemleak can also be disabled at boot-time by passing ``kmemleak=off`` on +the kernel command line. + +Memory may be allocated or freed before kmemleak is initialised and +these actions are stored in an early log buffer. The size of this buffer +is configured via the CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE option. + +If CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF are enabled, the kmemleak is +disabled by default. Passing ``kmemleak=on`` on the kernel command +line enables the function. + +If you are getting errors like "Error while writing to stdout" or "write_loop: +Invalid argument", make sure kmemleak is properly enabled. + +Basic Algorithm +--------------- + +The memory allocations via :c:func:`kmalloc`, :c:func:`vmalloc`, +:c:func:`kmem_cache_alloc` and +friends are traced and the pointers, together with additional +information like size and stack trace, are stored in a rbtree. +The corresponding freeing function calls are tracked and the pointers +removed from the kmemleak data structures. + +An allocated block of memory is considered orphan if no pointer to its +start address or to any location inside the block can be found by +scanning the memory (including saved registers). This means that there +might be no way for the kernel to pass the address of the allocated +block to a freeing function and therefore the block is considered a +memory leak. + +The scanning algorithm steps: + + 1. mark all objects as white (remaining white objects will later be + considered orphan) + 2. scan the memory starting with the data section and stacks, checking + the values against the addresses stored in the rbtree. If + a pointer to a white object is found, the object is added to the + gray list + 3. scan the gray objects for matching addresses (some white objects + can become gray and added at the end of the gray list) until the + gray set is finished + 4. the remaining white objects are considered orphan and reported via + /sys/kernel/debug/kmemleak + +Some allocated memory blocks have pointers stored in the kernel's +internal data structures and they cannot be detected as orphans. To +avoid this, kmemleak can also store the number of values pointing to an +address inside the block address range that need to be found so that the +block is not considered a leak. One example is __vmalloc(). + +Testing specific sections with kmemleak +--------------------------------------- + +Upon initial bootup your /sys/kernel/debug/kmemleak output page may be +quite extensive. This can also be the case if you have very buggy code +when doing development. To work around these situations you can use the +'clear' command to clear all reported unreferenced objects from the +/sys/kernel/debug/kmemleak output. By issuing a 'scan' after a 'clear' +you can find new unreferenced objects; this should help with testing +specific sections of code. + +To test a critical section on demand with a clean kmemleak do:: + + # echo clear > /sys/kernel/debug/kmemleak + ... test your kernel or modules ... + # echo scan > /sys/kernel/debug/kmemleak + +Then as usual to get your report with:: + + # cat /sys/kernel/debug/kmemleak + +Freeing kmemleak internal objects +--------------------------------- + +To allow access to previously found memory leaks after kmemleak has been +disabled by the user or due to an fatal error, internal kmemleak objects +won't be freed when kmemleak is disabled, and those objects may occupy +a large part of physical memory. + +In this situation, you may reclaim memory with:: + + # echo clear > /sys/kernel/debug/kmemleak + +Kmemleak API +------------ + +See the include/linux/kmemleak.h header for the functions prototype. + +- ``kmemleak_init`` - initialize kmemleak +- ``kmemleak_alloc`` - notify of a memory block allocation +- ``kmemleak_alloc_percpu`` - notify of a percpu memory block allocation +- ``kmemleak_vmalloc`` - notify of a vmalloc() memory allocation +- ``kmemleak_free`` - notify of a memory block freeing +- ``kmemleak_free_part`` - notify of a partial memory block freeing +- ``kmemleak_free_percpu`` - notify of a percpu memory block freeing +- ``kmemleak_update_trace`` - update object allocation stack trace +- ``kmemleak_not_leak`` - mark an object as not a leak +- ``kmemleak_ignore`` - do not scan or report an object as leak +- ``kmemleak_scan_area`` - add scan areas inside a memory block +- ``kmemleak_no_scan`` - do not scan a memory block +- ``kmemleak_erase`` - erase an old value in a pointer variable +- ``kmemleak_alloc_recursive`` - as kmemleak_alloc but checks the recursiveness +- ``kmemleak_free_recursive`` - as kmemleak_free but checks the recursiveness + +The following functions take a physical address as the object pointer +and only perform the corresponding action if the address has a lowmem +mapping: + +- ``kmemleak_alloc_phys`` +- ``kmemleak_free_part_phys`` +- ``kmemleak_ignore_phys`` + +Dealing with false positives/negatives +-------------------------------------- + +The false negatives are real memory leaks (orphan objects) but not +reported by kmemleak because values found during the memory scanning +point to such objects. To reduce the number of false negatives, kmemleak +provides the kmemleak_ignore, kmemleak_scan_area, kmemleak_no_scan and +kmemleak_erase functions (see above). The task stacks also increase the +amount of false negatives and their scanning is not enabled by default. + +The false positives are objects wrongly reported as being memory leaks +(orphan). For objects known not to be leaks, kmemleak provides the +kmemleak_not_leak function. The kmemleak_ignore could also be used if +the memory block is known not to contain other pointers and it will no +longer be scanned. + +Some of the reported leaks are only transient, especially on SMP +systems, because of pointers temporarily stored in CPU registers or +stacks. Kmemleak defines MSECS_MIN_AGE (defaulting to 1000) representing +the minimum age of an object to be reported as a memory leak. + +Limitations and Drawbacks +------------------------- + +The main drawback is the reduced performance of memory allocation and +freeing. To avoid other penalties, the memory scanning is only performed +when the /sys/kernel/debug/kmemleak file is read. Anyway, this tool is +intended for debugging purposes where the performance might not be the +most important requirement. + +To keep the algorithm simple, kmemleak scans for values pointing to any +address inside a block's address range. This may lead to an increased +number of false negatives. However, it is likely that a real memory leak +will eventually become visible. + +Another source of false negatives is the data stored in non-pointer +values. In a future version, kmemleak could only scan the pointer +members in the allocated structures. This feature would solve many of +the false negative cases described above. + +The tool can report false positives. These are cases where an allocated +block doesn't need to be freed (some cases in the init_call functions), +the pointer is calculated by other methods than the usual container_of +macro or the pointer is stored in a location not scanned by kmemleak. + +Page allocations and ioremap are not tracked. + +Testing with kmemleak-test +-------------------------- + +To check if you have all set up to use kmemleak, you can use the kmemleak-test +module, a module that deliberately leaks memory. Set CONFIG_SAMPLE_KMEMLEAK +as module (it can't be used as built-in) and boot the kernel with kmemleak +enabled. Load the module and perform a scan with:: + + # modprobe kmemleak-test + # echo scan > /sys/kernel/debug/kmemleak + +Note that the you may not get results instantly or on the first scanning. When +kmemleak gets results, it'll log ``kmemleak: <count of leaks> new suspected +memory leaks``. Then read the file to see then:: + + # cat /sys/kernel/debug/kmemleak + unreferenced object 0xffff89862ca702e8 (size 32): + comm "modprobe", pid 2088, jiffies 4294680594 (age 375.486s) + hex dump (first 32 bytes): + 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk + 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk. + backtrace: + [<00000000e0a73ec7>] 0xffffffffc01d2036 + [<000000000c5d2a46>] do_one_initcall+0x41/0x1df + [<0000000046db7e0a>] do_init_module+0x55/0x200 + [<00000000542b9814>] load_module+0x203c/0x2480 + [<00000000c2850256>] __do_sys_finit_module+0xba/0xe0 + [<000000006564e7ef>] do_syscall_64+0x43/0x110 + [<000000007c873fa6>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 + ... + +Removing the module with ``rmmod kmemleak_test`` should also trigger some +kmemleak results. diff --git a/Documentation/dev-tools/kmsan.rst b/Documentation/dev-tools/kmsan.rst new file mode 100644 index 0000000000..55fa82212e --- /dev/null +++ b/Documentation/dev-tools/kmsan.rst @@ -0,0 +1,428 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. Copyright (C) 2022, Google LLC. + +=================================== +The Kernel Memory Sanitizer (KMSAN) +=================================== + +KMSAN is a dynamic error detector aimed at finding uses of uninitialized +values. It is based on compiler instrumentation, and is quite similar to the +userspace `MemorySanitizer tool`_. + +An important note is that KMSAN is not intended for production use, because it +drastically increases kernel memory footprint and slows the whole system down. + +Usage +===== + +Building the kernel +------------------- + +In order to build a kernel with KMSAN you will need a fresh Clang (14.0.6+). +Please refer to `LLVM documentation`_ for the instructions on how to build Clang. + +Now configure and build the kernel with CONFIG_KMSAN enabled. + +Example report +-------------- + +Here is an example of a KMSAN report:: + + ===================================================== + BUG: KMSAN: uninit-value in test_uninit_kmsan_check_memory+0x1be/0x380 [kmsan_test] + test_uninit_kmsan_check_memory+0x1be/0x380 mm/kmsan/kmsan_test.c:273 + kunit_run_case_internal lib/kunit/test.c:333 + kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374 + kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28 + kthread+0x721/0x850 kernel/kthread.c:327 + ret_from_fork+0x1f/0x30 ??:? + + Uninit was stored to memory at: + do_uninit_local_array+0xfa/0x110 mm/kmsan/kmsan_test.c:260 + test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271 + kunit_run_case_internal lib/kunit/test.c:333 + kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374 + kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28 + kthread+0x721/0x850 kernel/kthread.c:327 + ret_from_fork+0x1f/0x30 ??:? + + Local variable uninit created at: + do_uninit_local_array+0x4a/0x110 mm/kmsan/kmsan_test.c:256 + test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271 + + Bytes 4-7 of 8 are uninitialized + Memory access of size 8 starts at ffff888083fe3da0 + + CPU: 0 PID: 6731 Comm: kunit_try_catch Tainted: G B E 5.16.0-rc3+ #104 + Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 + ===================================================== + +The report says that the local variable ``uninit`` was created uninitialized in +``do_uninit_local_array()``. The third stack trace corresponds to the place +where this variable was created. + +The first stack trace shows where the uninit value was used (in +``test_uninit_kmsan_check_memory()``). The tool shows the bytes which were left +uninitialized in the local variable, as well as the stack where the value was +copied to another memory location before use. + +A use of uninitialized value ``v`` is reported by KMSAN in the following cases: + + - in a condition, e.g. ``if (v) { ... }``; + - in an indexing or pointer dereferencing, e.g. ``array[v]`` or ``*v``; + - when it is copied to userspace or hardware, e.g. ``copy_to_user(..., &v, ...)``; + - when it is passed as an argument to a function, and + ``CONFIG_KMSAN_CHECK_PARAM_RETVAL`` is enabled (see below). + +The mentioned cases (apart from copying data to userspace or hardware, which is +a security issue) are considered undefined behavior from the C11 Standard point +of view. + +Disabling the instrumentation +----------------------------- + +A function can be marked with ``__no_kmsan_checks``. Doing so makes KMSAN +ignore uninitialized values in that function and mark its output as initialized. +As a result, the user will not get KMSAN reports related to that function. + +Another function attribute supported by KMSAN is ``__no_sanitize_memory``. +Applying this attribute to a function will result in KMSAN not instrumenting +it, which can be helpful if we do not want the compiler to interfere with some +low-level code (e.g. that marked with ``noinstr`` which implicitly adds +``__no_sanitize_memory``). + +This however comes at a cost: stack allocations from such functions will have +incorrect shadow/origin values, likely leading to false positives. Functions +called from non-instrumented code may also receive incorrect metadata for their +parameters. + +As a rule of thumb, avoid using ``__no_sanitize_memory`` explicitly. + +It is also possible to disable KMSAN for a single file (e.g. main.o):: + + KMSAN_SANITIZE_main.o := n + +or for the whole directory:: + + KMSAN_SANITIZE := n + +in the Makefile. Think of this as applying ``__no_sanitize_memory`` to every +function in the file or directory. Most users won't need KMSAN_SANITIZE, unless +their code gets broken by KMSAN (e.g. runs at early boot time). + +Support +======= + +In order for KMSAN to work the kernel must be built with Clang, which so far is +the only compiler that has KMSAN support. The kernel instrumentation pass is +based on the userspace `MemorySanitizer tool`_. + +The runtime library only supports x86_64 at the moment. + +How KMSAN works +=============== + +KMSAN shadow memory +------------------- + +KMSAN associates a metadata byte (also called shadow byte) with every byte of +kernel memory. A bit in the shadow byte is set iff the corresponding bit of the +kernel memory byte is uninitialized. Marking the memory uninitialized (i.e. +setting its shadow bytes to ``0xff``) is called poisoning, marking it +initialized (setting the shadow bytes to ``0x00``) is called unpoisoning. + +When a new variable is allocated on the stack, it is poisoned by default by +instrumentation code inserted by the compiler (unless it is a stack variable +that is immediately initialized). Any new heap allocation done without +``__GFP_ZERO`` is also poisoned. + +Compiler instrumentation also tracks the shadow values as they are used along +the code. When needed, instrumentation code invokes the runtime library in +``mm/kmsan/`` to persist shadow values. + +The shadow value of a basic or compound type is an array of bytes of the same +length. When a constant value is written into memory, that memory is unpoisoned. +When a value is read from memory, its shadow memory is also obtained and +propagated into all the operations which use that value. For every instruction +that takes one or more values the compiler generates code that calculates the +shadow of the result depending on those values and their shadows. + +Example:: + + int a = 0xff; // i.e. 0x000000ff + int b; + int c = a | b; + +In this case the shadow of ``a`` is ``0``, shadow of ``b`` is ``0xffffffff``, +shadow of ``c`` is ``0xffffff00``. This means that the upper three bytes of +``c`` are uninitialized, while the lower byte is initialized. + +Origin tracking +--------------- + +Every four bytes of kernel memory also have a so-called origin mapped to them. +This origin describes the point in program execution at which the uninitialized +value was created. Every origin is associated with either the full allocation +stack (for heap-allocated memory), or the function containing the uninitialized +variable (for locals). + +When an uninitialized variable is allocated on stack or heap, a new origin +value is created, and that variable's origin is filled with that value. When a +value is read from memory, its origin is also read and kept together with the +shadow. For every instruction that takes one or more values, the origin of the +result is one of the origins corresponding to any of the uninitialized inputs. +If a poisoned value is written into memory, its origin is written to the +corresponding storage as well. + +Example 1:: + + int a = 42; + int b; + int c = a + b; + +In this case the origin of ``b`` is generated upon function entry, and is +stored to the origin of ``c`` right before the addition result is written into +memory. + +Several variables may share the same origin address, if they are stored in the +same four-byte chunk. In this case every write to either variable updates the +origin for all of them. We have to sacrifice precision in this case, because +storing origins for individual bits (and even bytes) would be too costly. + +Example 2:: + + int combine(short a, short b) { + union ret_t { + int i; + short s[2]; + } ret; + ret.s[0] = a; + ret.s[1] = b; + return ret.i; + } + +If ``a`` is initialized and ``b`` is not, the shadow of the result would be +0xffff0000, and the origin of the result would be the origin of ``b``. +``ret.s[0]`` would have the same origin, but it will never be used, because +that variable is initialized. + +If both function arguments are uninitialized, only the origin of the second +argument is preserved. + +Origin chaining +~~~~~~~~~~~~~~~ + +To ease debugging, KMSAN creates a new origin for every store of an +uninitialized value to memory. The new origin references both its creation stack +and the previous origin the value had. This may cause increased memory +consumption, so we limit the length of origin chains in the runtime. + +Clang instrumentation API +------------------------- + +Clang instrumentation pass inserts calls to functions defined in +``mm/kmsan/nstrumentation.c`` into the kernel code. + +Shadow manipulation +~~~~~~~~~~~~~~~~~~~ + +For every memory access the compiler emits a call to a function that returns a +pair of pointers to the shadow and origin addresses of the given memory:: + + typedef struct { + void *shadow, *origin; + } shadow_origin_ptr_t + + shadow_origin_ptr_t __msan_metadata_ptr_for_load_{1,2,4,8}(void *addr) + shadow_origin_ptr_t __msan_metadata_ptr_for_store_{1,2,4,8}(void *addr) + shadow_origin_ptr_t __msan_metadata_ptr_for_load_n(void *addr, uintptr_t size) + shadow_origin_ptr_t __msan_metadata_ptr_for_store_n(void *addr, uintptr_t size) + +The function name depends on the memory access size. + +The compiler makes sure that for every loaded value its shadow and origin +values are read from memory. When a value is stored to memory, its shadow and +origin are also stored using the metadata pointers. + +Handling locals +~~~~~~~~~~~~~~~ + +A special function is used to create a new origin value for a local variable and +set the origin of that variable to that value:: + + void __msan_poison_alloca(void *addr, uintptr_t size, char *descr) + +Access to per-task data +~~~~~~~~~~~~~~~~~~~~~~~ + +At the beginning of every instrumented function KMSAN inserts a call to +``__msan_get_context_state()``:: + + kmsan_context_state *__msan_get_context_state(void) + +``kmsan_context_state`` is declared in ``include/linux/kmsan.h``:: + + struct kmsan_context_state { + char param_tls[KMSAN_PARAM_SIZE]; + char retval_tls[KMSAN_RETVAL_SIZE]; + char va_arg_tls[KMSAN_PARAM_SIZE]; + char va_arg_origin_tls[KMSAN_PARAM_SIZE]; + u64 va_arg_overflow_size_tls; + char param_origin_tls[KMSAN_PARAM_SIZE]; + depot_stack_handle_t retval_origin_tls; + }; + +This structure is used by KMSAN to pass parameter shadows and origins between +instrumented functions (unless the parameters are checked immediately by +``CONFIG_KMSAN_CHECK_PARAM_RETVAL``). + +Passing uninitialized values to functions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Clang's MemorySanitizer instrumentation has an option, +``-fsanitize-memory-param-retval``, which makes the compiler check function +parameters passed by value, as well as function return values. + +The option is controlled by ``CONFIG_KMSAN_CHECK_PARAM_RETVAL``, which is +enabled by default to let KMSAN report uninitialized values earlier. +Please refer to the `LKML discussion`_ for more details. + +Because of the way the checks are implemented in LLVM (they are only applied to +parameters marked as ``noundef``), not all parameters are guaranteed to be +checked, so we cannot give up the metadata storage in ``kmsan_context_state``. + +String functions +~~~~~~~~~~~~~~~~ + +The compiler replaces calls to ``memcpy()``/``memmove()``/``memset()`` with the +following functions. These functions are also called when data structures are +initialized or copied, making sure shadow and origin values are copied alongside +with the data:: + + void *__msan_memcpy(void *dst, void *src, uintptr_t n) + void *__msan_memmove(void *dst, void *src, uintptr_t n) + void *__msan_memset(void *dst, int c, uintptr_t n) + +Error reporting +~~~~~~~~~~~~~~~ + +For each use of a value the compiler emits a shadow check that calls +``__msan_warning()`` in the case that value is poisoned:: + + void __msan_warning(u32 origin) + +``__msan_warning()`` causes KMSAN runtime to print an error report. + +Inline assembly instrumentation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +KMSAN instruments every inline assembly output with a call to:: + + void __msan_instrument_asm_store(void *addr, uintptr_t size) + +, which unpoisons the memory region. + +This approach may mask certain errors, but it also helps to avoid a lot of +false positives in bitwise operations, atomics etc. + +Sometimes the pointers passed into inline assembly do not point to valid memory. +In such cases they are ignored at runtime. + + +Runtime library +--------------- + +The code is located in ``mm/kmsan/``. + +Per-task KMSAN state +~~~~~~~~~~~~~~~~~~~~ + +Every task_struct has an associated KMSAN task state that holds the KMSAN +context (see above) and a per-task flag disallowing KMSAN reports:: + + struct kmsan_context { + ... + bool allow_reporting; + struct kmsan_context_state cstate; + ... + } + + struct task_struct { + ... + struct kmsan_context kmsan; + ... + } + +KMSAN contexts +~~~~~~~~~~~~~~ + +When running in a kernel task context, KMSAN uses ``current->kmsan.cstate`` to +hold the metadata for function parameters and return values. + +But in the case the kernel is running in the interrupt, softirq or NMI context, +where ``current`` is unavailable, KMSAN switches to per-cpu interrupt state:: + + DEFINE_PER_CPU(struct kmsan_ctx, kmsan_percpu_ctx); + +Metadata allocation +~~~~~~~~~~~~~~~~~~~ + +There are several places in the kernel for which the metadata is stored. + +1. Each ``struct page`` instance contains two pointers to its shadow and +origin pages:: + + struct page { + ... + struct page *shadow, *origin; + ... + }; + +At boot-time, the kernel allocates shadow and origin pages for every available +kernel page. This is done quite late, when the kernel address space is already +fragmented, so normal data pages may arbitrarily interleave with the metadata +pages. + +This means that in general for two contiguous memory pages their shadow/origin +pages may not be contiguous. Consequently, if a memory access crosses the +boundary of a memory block, accesses to shadow/origin memory may potentially +corrupt other pages or read incorrect values from them. + +In practice, contiguous memory pages returned by the same ``alloc_pages()`` +call will have contiguous metadata, whereas if these pages belong to two +different allocations their metadata pages can be fragmented. + +For the kernel data (``.data``, ``.bss`` etc.) and percpu memory regions +there also are no guarantees on metadata contiguity. + +In the case ``__msan_metadata_ptr_for_XXX_YYY()`` hits the border between two +pages with non-contiguous metadata, it returns pointers to fake shadow/origin regions:: + + char dummy_load_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE))); + char dummy_store_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE))); + +``dummy_load_page`` is zero-initialized, so reads from it always yield zeroes. +All stores to ``dummy_store_page`` are ignored. + +2. For vmalloc memory and modules, there is a direct mapping between the memory +range, its shadow and origin. KMSAN reduces the vmalloc area by 3/4, making only +the first quarter available to ``vmalloc()``. The second quarter of the vmalloc +area contains shadow memory for the first quarter, the third one holds the +origins. A small part of the fourth quarter contains shadow and origins for the +kernel modules. Please refer to ``arch/x86/include/asm/pgtable_64_types.h`` for +more details. + +When an array of pages is mapped into a contiguous virtual memory space, their +shadow and origin pages are similarly mapped into contiguous regions. + +References +========== + +E. Stepanov, K. Serebryany. `MemorySanitizer: fast detector of uninitialized +memory use in C++ +<https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43308.pdf>`_. +In Proceedings of CGO 2015. + +.. _MemorySanitizer tool: https://clang.llvm.org/docs/MemorySanitizer.html +.. _LLVM documentation: https://llvm.org/docs/GettingStarted.html +.. _LKML discussion: https://lore.kernel.org/all/20220614144853.3693273-1-glider@google.com/ diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst new file mode 100644 index 0000000000..deede972f2 --- /dev/null +++ b/Documentation/dev-tools/kselftest.rst @@ -0,0 +1,415 @@ +====================== +Linux Kernel Selftests +====================== + +The kernel contains a set of "self tests" under the tools/testing/selftests/ +directory. These are intended to be small tests to exercise individual code +paths in the kernel. Tests are intended to be run after building, installing +and booting a kernel. + +Kselftest from mainline can be run on older stable kernels. Running tests +from mainline offers the best coverage. Several test rings run mainline +kselftest suite on stable releases. The reason is that when a new test +gets added to test existing code to regression test a bug, we should be +able to run that test on an older kernel. Hence, it is important to keep +code that can still test an older kernel and make sure it skips the test +gracefully on newer releases. + +You can find additional information on Kselftest framework, how to +write new tests using the framework on Kselftest wiki: + +https://kselftest.wiki.kernel.org/ + +On some systems, hot-plug tests could hang forever waiting for cpu and +memory to be ready to be offlined. A special hot-plug target is created +to run the full range of hot-plug tests. In default mode, hot-plug tests run +in safe mode with a limited scope. In limited mode, cpu-hotplug test is +run on a single cpu as opposed to all hotplug capable cpus, and memory +hotplug test is run on 2% of hotplug capable memory instead of 10%. + +kselftest runs as a userspace process. Tests that can be written/run in +userspace may wish to use the `Test Harness`_. Tests that need to be +run in kernel space may wish to use a `Test Module`_. + +Running the selftests (hotplug tests are run in limited mode) +============================================================= + +To build the tests:: + + $ make headers + $ make -C tools/testing/selftests + +To run the tests:: + + $ make -C tools/testing/selftests run_tests + +To build and run the tests with a single command, use:: + + $ make kselftest + +Note that some tests will require root privileges. + +Kselftest supports saving output files in a separate directory and then +running tests. To locate output files in a separate directory two syntaxes +are supported. In both cases the working directory must be the root of the +kernel src. This is applicable to "Running a subset of selftests" section +below. + +To build, save output files in a separate directory with O= :: + + $ make O=/tmp/kselftest kselftest + +To build, save output files in a separate directory with KBUILD_OUTPUT :: + + $ export KBUILD_OUTPUT=/tmp/kselftest; make kselftest + +The O= assignment takes precedence over the KBUILD_OUTPUT environment +variable. + +The above commands by default run the tests and print full pass/fail report. +Kselftest supports "summary" option to make it easier to understand the test +results. Please find the detailed individual test results for each test in +/tmp/testname file(s) when summary option is specified. This is applicable +to "Running a subset of selftests" section below. + +To run kselftest with summary option enabled :: + + $ make summary=1 kselftest + +Running a subset of selftests +============================= + +You can use the "TARGETS" variable on the make command line to specify +single test to run, or a list of tests to run. + +To run only tests targeted for a single subsystem:: + + $ make -C tools/testing/selftests TARGETS=ptrace run_tests + +You can specify multiple tests to build and run:: + + $ make TARGETS="size timers" kselftest + +To build, save output files in a separate directory with O= :: + + $ make O=/tmp/kselftest TARGETS="size timers" kselftest + +To build, save output files in a separate directory with KBUILD_OUTPUT :: + + $ export KBUILD_OUTPUT=/tmp/kselftest; make TARGETS="size timers" kselftest + +Additionally you can use the "SKIP_TARGETS" variable on the make command +line to specify one or more targets to exclude from the TARGETS list. + +To run all tests but a single subsystem:: + + $ make -C tools/testing/selftests SKIP_TARGETS=ptrace run_tests + +You can specify multiple tests to skip:: + + $ make SKIP_TARGETS="size timers" kselftest + +You can also specify a restricted list of tests to run together with a +dedicated skiplist:: + + $ make TARGETS="bpf breakpoints size timers" SKIP_TARGETS=bpf kselftest + +See the top-level tools/testing/selftests/Makefile for the list of all +possible targets. + +Running the full range hotplug selftests +======================================== + +To build the hotplug tests:: + + $ make -C tools/testing/selftests hotplug + +To run the hotplug tests:: + + $ make -C tools/testing/selftests run_hotplug + +Note that some tests will require root privileges. + + +Install selftests +================= + +You can use the "install" target of "make" (which calls the `kselftest_install.sh` +tool) to install selftests in the default location (`tools/testing/selftests/kselftest_install`), +or in a user specified location via the `INSTALL_PATH` "make" variable. + +To install selftests in default location:: + + $ make -C tools/testing/selftests install + +To install selftests in a user specified location:: + + $ make -C tools/testing/selftests install INSTALL_PATH=/some/other/path + +Running installed selftests +=========================== + +Found in the install directory, as well as in the Kselftest tarball, +is a script named `run_kselftest.sh` to run the tests. + +You can simply do the following to run the installed Kselftests. Please +note some tests will require root privileges:: + + $ cd kselftest_install + $ ./run_kselftest.sh + +To see the list of available tests, the `-l` option can be used:: + + $ ./run_kselftest.sh -l + +The `-c` option can be used to run all the tests from a test collection, or +the `-t` option for specific single tests. Either can be used multiple times:: + + $ ./run_kselftest.sh -c bpf -c seccomp -t timers:posix_timers -t timer:nanosleep + +For other features see the script usage output, seen with the `-h` option. + +Timeout for selftests +===================== + +Selftests are designed to be quick and so a default timeout is used of 45 +seconds for each test. Tests can override the default timeout by adding +a settings file in their directory and set a timeout variable there to the +configured a desired upper timeout for the test. Only a few tests override +the timeout with a value higher than 45 seconds, selftests strives to keep +it that way. Timeouts in selftests are not considered fatal because the +system under which a test runs may change and this can also modify the +expected time it takes to run a test. If you have control over the systems +which will run the tests you can configure a test runner on those systems to +use a greater or lower timeout on the command line as with the `-o` or +the `--override-timeout` argument. For example to use 165 seconds instead +one would use: + + $ ./run_kselftest.sh --override-timeout 165 + +You can look at the TAP output to see if you ran into the timeout. Test +runners which know a test must run under a specific time can then optionally +treat these timeouts then as fatal. + +Packaging selftests +=================== + +In some cases packaging is desired, such as when tests need to run on a +different system. To package selftests, run:: + + $ make -C tools/testing/selftests gen_tar + +This generates a tarball in the `INSTALL_PATH/kselftest-packages` directory. By +default, `.gz` format is used. The tar compression format can be overridden by +specifying a `FORMAT` make variable. Any value recognized by `tar's auto-compress`_ +option is supported, such as:: + + $ make -C tools/testing/selftests gen_tar FORMAT=.xz + +`make gen_tar` invokes `make install` so you can use it to package a subset of +tests by using variables specified in `Running a subset of selftests`_ +section:: + + $ make -C tools/testing/selftests gen_tar TARGETS="bpf" FORMAT=.xz + +.. _tar's auto-compress: https://www.gnu.org/software/tar/manual/html_node/gzip.html#auto_002dcompress + +Contributing new tests +====================== + +In general, the rules for selftests are + + * Do as much as you can if you're not root; + + * Don't take too long; + + * Don't break the build on any architecture, and + + * Don't cause the top-level "make run_tests" to fail if your feature is + unconfigured. + +Contributing new tests (details) +================================ + + * In your Makefile, use facilities from lib.mk by including it instead of + reinventing the wheel. Specify flags and binaries generation flags on + need basis before including lib.mk. :: + + CFLAGS = $(KHDR_INCLUDES) + TEST_GEN_PROGS := close_range_test + include ../lib.mk + + * Use TEST_GEN_XXX if such binaries or files are generated during + compiling. + + TEST_PROGS, TEST_GEN_PROGS mean it is the executable tested by + default. + + TEST_CUSTOM_PROGS should be used by tests that require custom build + rules and prevent common build rule use. + + TEST_PROGS are for test shell scripts. Please ensure shell script has + its exec bit set. Otherwise, lib.mk run_tests will generate a warning. + + TEST_CUSTOM_PROGS and TEST_PROGS will be run by common run_tests. + + TEST_PROGS_EXTENDED, TEST_GEN_PROGS_EXTENDED mean it is the + executable which is not tested by default. + TEST_FILES, TEST_GEN_FILES mean it is the file which is used by + test. + + * First use the headers inside the kernel source and/or git repo, and then the + system headers. Headers for the kernel release as opposed to headers + installed by the distro on the system should be the primary focus to be able + to find regressions. Use KHDR_INCLUDES in Makefile to include headers from + the kernel source. + + * If a test needs specific kernel config options enabled, add a config file in + the test directory to enable them. + + e.g: tools/testing/selftests/android/config + + * Create a .gitignore file inside test directory and add all generated objects + in it. + + * Add new test name in TARGETS in selftests/Makefile:: + + TARGETS += android + + * All changes should pass:: + + kselftest-{all,install,clean,gen_tar} + kselftest-{all,install,clean,gen_tar} O=abo_path + kselftest-{all,install,clean,gen_tar} O=rel_path + make -C tools/testing/selftests {all,install,clean,gen_tar} + make -C tools/testing/selftests {all,install,clean,gen_tar} O=abs_path + make -C tools/testing/selftests {all,install,clean,gen_tar} O=rel_path + +Test Module +=========== + +Kselftest tests the kernel from userspace. Sometimes things need +testing from within the kernel, one method of doing this is to create a +test module. We can tie the module into the kselftest framework by +using a shell script test runner. ``kselftest/module.sh`` is designed +to facilitate this process. There is also a header file provided to +assist writing kernel modules that are for use with kselftest: + +- ``tools/testing/selftests/kselftest_module.h`` +- ``tools/testing/selftests/kselftest/module.sh`` + +Note that test modules should taint the kernel with TAINT_TEST. This will +happen automatically for modules which are in the ``tools/testing/`` +directory, or for modules which use the ``kselftest_module.h`` header above. +Otherwise, you'll need to add ``MODULE_INFO(test, "Y")`` to your module +source. selftests which do not load modules typically should not taint the +kernel, but in cases where a non-test module is loaded, TEST_TAINT can be +applied from userspace by writing to ``/proc/sys/kernel/tainted``. + +How to use +---------- + +Here we show the typical steps to create a test module and tie it into +kselftest. We use kselftests for lib/ as an example. + +1. Create the test module + +2. Create the test script that will run (load/unload) the module + e.g. ``tools/testing/selftests/lib/printf.sh`` + +3. Add line to config file e.g. ``tools/testing/selftests/lib/config`` + +4. Add test script to makefile e.g. ``tools/testing/selftests/lib/Makefile`` + +5. Verify it works: + +.. code-block:: sh + + # Assumes you have booted a fresh build of this kernel tree + cd /path/to/linux/tree + make kselftest-merge + make modules + sudo make modules_install + make TARGETS=lib kselftest + +Example Module +-------------- + +A bare bones test module might look like this: + +.. code-block:: c + + // SPDX-License-Identifier: GPL-2.0+ + + #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + + #include "../tools/testing/selftests/kselftest_module.h" + + KSTM_MODULE_GLOBALS(); + + /* + * Kernel module for testing the foobinator + */ + + static int __init test_function() + { + ... + } + + static void __init selftest(void) + { + KSTM_CHECK_ZERO(do_test_case("", 0)); + } + + KSTM_MODULE_LOADERS(test_foo); + MODULE_AUTHOR("John Developer <jd@fooman.org>"); + MODULE_LICENSE("GPL"); + MODULE_INFO(test, "Y"); + +Example test script +------------------- + +.. code-block:: sh + + #!/bin/bash + # SPDX-License-Identifier: GPL-2.0+ + $(dirname $0)/../kselftest/module.sh "foo" test_foo + + +Test Harness +============ + +The kselftest_harness.h file contains useful helpers to build tests. The +test harness is for userspace testing, for kernel space testing see `Test +Module`_ above. + +The tests from tools/testing/selftests/seccomp/seccomp_bpf.c can be used as +example. + +Example +------- + +.. kernel-doc:: tools/testing/selftests/kselftest_harness.h + :doc: example + + +Helpers +------- + +.. kernel-doc:: tools/testing/selftests/kselftest_harness.h + :functions: TH_LOG TEST TEST_SIGNAL FIXTURE FIXTURE_DATA FIXTURE_SETUP + FIXTURE_TEARDOWN TEST_F TEST_HARNESS_MAIN FIXTURE_VARIANT + FIXTURE_VARIANT_ADD + +Operators +--------- + +.. kernel-doc:: tools/testing/selftests/kselftest_harness.h + :doc: operators + +.. kernel-doc:: tools/testing/selftests/kselftest_harness.h + :functions: ASSERT_EQ ASSERT_NE ASSERT_LT ASSERT_LE ASSERT_GT ASSERT_GE + ASSERT_NULL ASSERT_TRUE ASSERT_NULL ASSERT_TRUE ASSERT_FALSE + ASSERT_STREQ ASSERT_STRNE EXPECT_EQ EXPECT_NE EXPECT_LT + EXPECT_LE EXPECT_GT EXPECT_GE EXPECT_NULL EXPECT_TRUE + EXPECT_FALSE EXPECT_STREQ EXPECT_STRNE diff --git a/Documentation/dev-tools/ktap.rst b/Documentation/dev-tools/ktap.rst new file mode 100644 index 0000000000..414c105b10 --- /dev/null +++ b/Documentation/dev-tools/ktap.rst @@ -0,0 +1,311 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=================================================== +The Kernel Test Anything Protocol (KTAP), version 1 +=================================================== + +TAP, or the Test Anything Protocol is a format for specifying test results used +by a number of projects. It's website and specification are found at this `link +<https://testanything.org/>`_. The Linux Kernel largely uses TAP output for test +results. However, Kernel testing frameworks have special needs for test results +which don't align with the original TAP specification. Thus, a "Kernel TAP" +(KTAP) format is specified to extend and alter TAP to support these use-cases. +This specification describes the generally accepted format of KTAP as it is +currently used in the kernel. + +KTAP test results describe a series of tests (which may be nested: i.e., test +can have subtests), each of which can contain both diagnostic data -- e.g., log +lines -- and a final result. The test structure and results are +machine-readable, whereas the diagnostic data is unstructured and is there to +aid human debugging. + +KTAP output is built from four different types of lines: +- Version lines +- Plan lines +- Test case result lines +- Diagnostic lines + +In general, valid KTAP output should also form valid TAP output, but some +information, in particular nested test results, may be lost. Also note that +there is a stagnant draft specification for TAP14, KTAP diverges from this in +a couple of places (notably the "Subtest" header), which are described where +relevant later in this document. + +Version lines +------------- + +All KTAP-formatted results begin with a "version line" which specifies which +version of the (K)TAP standard the result is compliant with. + +For example: +- "KTAP version 1" +- "TAP version 13" +- "TAP version 14" + +Note that, in KTAP, subtests also begin with a version line, which denotes the +start of the nested test results. This differs from TAP14, which uses a +separate "Subtest" line. + +While, going forward, "KTAP version 1" should be used by compliant tests, it +is expected that most parsers and other tooling will accept the other versions +listed here for compatibility with existing tests and frameworks. + +Plan lines +---------- + +A test plan provides the number of tests (or subtests) in the KTAP output. + +Plan lines must follow the format of "1..N" where N is the number of tests or subtests. +Plan lines follow version lines to indicate the number of nested tests. + +While there are cases where the number of tests is not known in advance -- in +which case the test plan may be omitted -- it is strongly recommended one is +present where possible. + +Test case result lines +---------------------- + +Test case result lines indicate the final status of a test. +They are required and must have the format: + +.. code-block:: none + + <result> <number> [<description>][ # [<directive>] [<diagnostic data>]] + +The result can be either "ok", which indicates the test case passed, +or "not ok", which indicates that the test case failed. + +<number> represents the number of the test being performed. The first test must +have the number 1 and the number then must increase by 1 for each additional +subtest within the same test at the same nesting level. + +The description is a description of the test, generally the name of +the test, and can be any string of characters other than # or a +newline. The description is optional, but recommended. + +The directive and any diagnostic data is optional. If either are present, they +must follow a hash sign, "#". + +A directive is a keyword that indicates a different outcome for a test other +than passed and failed. The directive is optional, and consists of a single +keyword preceding the diagnostic data. In the event that a parser encounters +a directive it doesn't support, it should fall back to the "ok" / "not ok" +result. + +Currently accepted directives are: + +- "SKIP", which indicates a test was skipped (note the result of the test case + result line can be either "ok" or "not ok" if the SKIP directive is used) +- "TODO", which indicates that a test is not expected to pass at the moment, + e.g. because the feature it is testing is known to be broken. While this + directive is inherited from TAP, its use in the kernel is discouraged. +- "XFAIL", which indicates that a test is expected to fail. This is similar + to "TODO", above, and is used by some kselftest tests. +- “TIMEOUT”, which indicates a test has timed out (note the result of the test + case result line should be “not ok” if the TIMEOUT directive is used) +- “ERROR”, which indicates that the execution of a test has failed due to a + specific error that is included in the diagnostic data. (note the result of + the test case result line should be “not ok” if the ERROR directive is used) + +The diagnostic data is a plain-text field which contains any additional details +about why this result was produced. This is typically an error message for ERROR +or failed tests, or a description of missing dependencies for a SKIP result. + +The diagnostic data field is optional, and results which have neither a +directive nor any diagnostic data do not need to include the "#" field +separator. + +Example result lines include:: + + ok 1 test_case_name + +The test "test_case_name" passed. + +:: + + not ok 1 test_case_name + +The test "test_case_name" failed. + +:: + + ok 1 test # SKIP necessary dependency unavailable + +The test "test" was SKIPPED with the diagnostic message "necessary dependency +unavailable". + +:: + + not ok 1 test # TIMEOUT 30 seconds + +The test "test" timed out, with diagnostic data "30 seconds". + +:: + + ok 5 check return code # rcode=0 + +The test "check return code" passed, with additional diagnostic data “rcode=0” + + +Diagnostic lines +---------------- + +If tests wish to output any further information, they should do so using +"diagnostic lines". Diagnostic lines are optional, freeform text, and are +often used to describe what is being tested and any intermediate results in +more detail than the final result and diagnostic data line provides. + +Diagnostic lines are formatted as "# <diagnostic_description>", where the +description can be any string. Diagnostic lines can be anywhere in the test +output. As a rule, diagnostic lines regarding a test are directly before the +test result line for that test. + +Note that most tools will treat unknown lines (see below) as diagnostic lines, +even if they do not start with a "#": this is to capture any other useful +kernel output which may help debug the test. It is nevertheless recommended +that tests always prefix any diagnostic output they have with a "#" character. + +Unknown lines +------------- + +There may be lines within KTAP output that do not follow the format of one of +the four formats for lines described above. This is allowed, however, they will +not influence the status of the tests. + +This is an important difference from TAP. Kernel tests may print messages +to the system console or a log file. Both of these destinations may contain +messages either from unrelated kernel or userspace activity, or kernel +messages from non-test code that is invoked by the test. The kernel code +invoked by the test likely is not aware that a test is in progress and +thus can not print the message as a diagnostic message. + +Nested tests +------------ + +In KTAP, tests can be nested. This is done by having a test include within its +output an entire set of KTAP-formatted results. This can be used to categorize +and group related tests, or to split out different results from the same test. + +The "parent" test's result should consist of all of its subtests' results, +starting with another KTAP version line and test plan, and end with the overall +result. If one of the subtests fail, for example, the parent test should also +fail. + +Additionally, all lines in a subtest should be indented. One level of +indentation is two spaces: " ". The indentation should begin at the version +line and should end before the parent test's result line. + +"Unknown lines" are not considered to be lines in a subtest and thus are +allowed to be either indented or not indented. + +An example of a test with two nested subtests: + +:: + + KTAP version 1 + 1..1 + KTAP version 1 + 1..2 + ok 1 test_1 + not ok 2 test_2 + # example failed + not ok 1 example + +An example format with multiple levels of nested testing: + +:: + + KTAP version 1 + 1..2 + KTAP version 1 + 1..2 + KTAP version 1 + 1..2 + not ok 1 test_1 + ok 2 test_2 + not ok 1 test_3 + ok 2 test_4 # SKIP + not ok 1 example_test_1 + ok 2 example_test_2 + + +Major differences between TAP and KTAP +-------------------------------------- + +================================================== ========= =============== +Feature TAP KTAP +================================================== ========= =============== +yaml and json in diagnosic message ok not recommended +TODO directive ok not recognized +allows an arbitrary number of tests to be nested no yes +"Unknown lines" are in category of "Anything else" yes no +"Unknown lines" are incorrect allowed +================================================== ========= =============== + +The TAP14 specification does permit nested tests, but instead of using another +nested version line, uses a line of the form +"Subtest: <name>" where <name> is the name of the parent test. + +Example KTAP output +-------------------- +:: + + KTAP version 1 + 1..1 + KTAP version 1 + 1..3 + KTAP version 1 + 1..1 + # test_1: initializing test_1 + ok 1 test_1 + ok 1 example_test_1 + KTAP version 1 + 1..2 + ok 1 test_1 # SKIP test_1 skipped + ok 2 test_2 + ok 2 example_test_2 + KTAP version 1 + 1..3 + ok 1 test_1 + # test_2: FAIL + not ok 2 test_2 + ok 3 test_3 # SKIP test_3 skipped + not ok 3 example_test_3 + not ok 1 main_test + +This output defines the following hierarchy: + +A single test called "main_test", which fails, and has three subtests: +- "example_test_1", which passes, and has one subtest: + + - "test_1", which passes, and outputs the diagnostic message "test_1: initializing test_1" + +- "example_test_2", which passes, and has two subtests: + + - "test_1", which is skipped, with the explanation "test_1 skipped" + - "test_2", which passes + +- "example_test_3", which fails, and has three subtests + + - "test_1", which passes + - "test_2", which outputs the diagnostic line "test_2: FAIL", and fails. + - "test_3", which is skipped with the explanation "test_3 skipped" + +Note that the individual subtests with the same names do not conflict, as they +are found in different parent tests. This output also exhibits some sensible +rules for "bubbling up" test results: a test fails if any of its subtests fail. +Skipped tests do not affect the result of the parent test (though it often +makes sense for a test to be marked skipped if _all_ of its subtests have been +skipped). + +See also: +--------- + +- The TAP specification: + https://testanything.org/tap-version-13-specification.html +- The (stagnant) TAP version 14 specification: + https://github.com/TestAnything/Specification/blob/tap-14-specification/specification.md +- The kselftest documentation: + Documentation/dev-tools/kselftest.rst +- The KUnit documentation: + Documentation/dev-tools/kunit/index.rst diff --git a/Documentation/dev-tools/kunit/api/functionredirection.rst b/Documentation/dev-tools/kunit/api/functionredirection.rst new file mode 100644 index 0000000000..3791efc2fc --- /dev/null +++ b/Documentation/dev-tools/kunit/api/functionredirection.rst @@ -0,0 +1,162 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================== +Function Redirection API +======================== + +Overview +======== + +When writing unit tests, it's important to be able to isolate the code being +tested from other parts of the kernel. This ensures the reliability of the test +(it won't be affected by external factors), reduces dependencies on specific +hardware or config options (making the test easier to run), and protects the +stability of the rest of the system (making it less likely for test-specific +state to interfere with the rest of the system). + +While for some code (typically generic data structures, helpers, and other +"pure functions") this is trivial, for others (like device drivers, +filesystems, core subsystems) the code is heavily coupled with other parts of +the kernel. + +This coupling is often due to global state in some way: be it a global list of +devices, the filesystem, or some hardware state. Tests need to either carefully +manage, isolate, and restore state, or they can avoid it altogether by +replacing access to and mutation of this state with a "fake" or "mock" variant. + +By refactoring access to such state, such as by introducing a layer of +indirection which can use or emulate a separate set of test state. However, +such refactoring comes with its own costs (and undertaking significant +refactoring before being able to write tests is suboptimal). + +A simpler way to intercept and replace some of the function calls is to use +function redirection via static stubs. + + +Static Stubs +============ + +Static stubs are a way of redirecting calls to one function (the "real" +function) to another function (the "replacement" function). + +It works by adding a macro to the "real" function which checks to see if a test +is running, and if a replacement function is available. If so, that function is +called in place of the original. + +Using static stubs is pretty straightforward: + +1. Add the KUNIT_STATIC_STUB_REDIRECT() macro to the start of the "real" + function. + + This should be the first statement in the function, after any variable + declarations. KUNIT_STATIC_STUB_REDIRECT() takes the name of the + function, followed by all of the arguments passed to the real function. + + For example: + + .. code-block:: c + + void send_data_to_hardware(const char *str) + { + KUNIT_STATIC_STUB_REDIRECT(send_data_to_hardware, str); + /* real implementation */ + } + +2. Write one or more replacement functions. + + These functions should have the same function signature as the real function. + In the event they need to access or modify test-specific state, they can use + kunit_get_current_test() to get a struct kunit pointer. This can then + be passed to the expectation/assertion macros, or used to look up KUnit + resources. + + For example: + + .. code-block:: c + + void fake_send_data_to_hardware(const char *str) + { + struct kunit *test = kunit_get_current_test(); + KUNIT_EXPECT_STREQ(test, str, "Hello World!"); + } + +3. Activate the static stub from your test. + + From within a test, the redirection can be enabled with + kunit_activate_static_stub(), which accepts a struct kunit pointer, + the real function, and the replacement function. You can call this several + times with different replacement functions to swap out implementations of the + function. + + In our example, this would be + + .. code-block:: c + + kunit_activate_static_stub(test, + send_data_to_hardware, + fake_send_data_to_hardware); + +4. Call (perhaps indirectly) the real function. + + Once the redirection is activated, any call to the real function will call + the replacement function instead. Such calls may be buried deep in the + implementation of another function, but must occur from the test's kthread. + + For example: + + .. code-block:: c + + send_data_to_hardware("Hello World!"); /* Succeeds */ + send_data_to_hardware("Something else"); /* Fails the test. */ + +5. (Optionally) disable the stub. + + When you no longer need it, disable the redirection (and hence resume the + original behaviour of the 'real' function) using + kunit_deactivate_static_stub(). Otherwise, it will be automatically disabled + when the test exits. + + For example: + + .. code-block:: c + + kunit_deactivate_static_stub(test, send_data_to_hardware); + + +It's also possible to use these replacement functions to test to see if a +function is called at all, for example: + +.. code-block:: c + + void send_data_to_hardware(const char *str) + { + KUNIT_STATIC_STUB_REDIRECT(send_data_to_hardware, str); + /* real implementation */ + } + + /* In test file */ + int times_called = 0; + void fake_send_data_to_hardware(const char *str) + { + times_called++; + } + ... + /* In the test case, redirect calls for the duration of the test */ + kunit_activate_static_stub(test, send_data_to_hardware, fake_send_data_to_hardware); + + send_data_to_hardware("hello"); + KUNIT_EXPECT_EQ(test, times_called, 1); + + /* Can also deactivate the stub early, if wanted */ + kunit_deactivate_static_stub(test, send_data_to_hardware); + + send_data_to_hardware("hello again"); + KUNIT_EXPECT_EQ(test, times_called, 1); + + + +API Reference +============= + +.. kernel-doc:: include/kunit/static_stub.h + :internal: diff --git a/Documentation/dev-tools/kunit/api/index.rst b/Documentation/dev-tools/kunit/api/index.rst new file mode 100644 index 0000000000..2d8f756aab --- /dev/null +++ b/Documentation/dev-tools/kunit/api/index.rst @@ -0,0 +1,27 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============= +API Reference +============= +.. toctree:: + :hidden: + + test + resource + functionredirection + + +This page documents the KUnit kernel testing API. It is divided into the +following sections: + +Documentation/dev-tools/kunit/api/test.rst + + - Documents all of the standard testing API + +Documentation/dev-tools/kunit/api/resource.rst + + - Documents the KUnit resource API + +Documentation/dev-tools/kunit/api/functionredirection.rst + + - Documents the KUnit Function Redirection API diff --git a/Documentation/dev-tools/kunit/api/resource.rst b/Documentation/dev-tools/kunit/api/resource.rst new file mode 100644 index 0000000000..0a94f83125 --- /dev/null +++ b/Documentation/dev-tools/kunit/api/resource.rst @@ -0,0 +1,13 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============ +Resource API +============ + +This file documents the KUnit resource API. + +Most users won't need to use this API directly, power users can use it to store +state on a per-test basis, register custom cleanup actions, and more. + +.. kernel-doc:: include/kunit/resource.h + :internal: diff --git a/Documentation/dev-tools/kunit/api/test.rst b/Documentation/dev-tools/kunit/api/test.rst new file mode 100644 index 0000000000..c5eca423e8 --- /dev/null +++ b/Documentation/dev-tools/kunit/api/test.rst @@ -0,0 +1,10 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======== +Test API +======== + +This file documents all of the standard testing API. + +.. kernel-doc:: include/kunit/test.h + :internal: diff --git a/Documentation/dev-tools/kunit/architecture.rst b/Documentation/dev-tools/kunit/architecture.rst new file mode 100644 index 0000000000..f335f883f8 --- /dev/null +++ b/Documentation/dev-tools/kunit/architecture.rst @@ -0,0 +1,196 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================== +KUnit Architecture +================== + +The KUnit architecture is divided into two parts: + +- `In-Kernel Testing Framework`_ +- `kunit_tool (Command-line Test Harness)`_ + +In-Kernel Testing Framework +=========================== + +The kernel testing library supports KUnit tests written in C using +KUnit. These KUnit tests are kernel code. KUnit performs the following +tasks: + +- Organizes tests +- Reports test results +- Provides test utilities + +Test Cases +---------- + +The test case is the fundamental unit in KUnit. KUnit test cases are organised +into suites. A KUnit test case is a function with type signature +``void (*)(struct kunit *test)``. These test case functions are wrapped in a +struct called struct kunit_case. + +.. note: + ``generate_params`` is optional for non-parameterized tests. + +Each KUnit test case receives a ``struct kunit`` context object that tracks a +running test. The KUnit assertion macros and other KUnit utilities use the +``struct kunit`` context object. As an exception, there are two fields: + +- ``->priv``: The setup functions can use it to store arbitrary test + user data. + +- ``->param_value``: It contains the parameter value which can be + retrieved in the parameterized tests. + +Test Suites +----------- + +A KUnit suite includes a collection of test cases. The KUnit suites +are represented by the ``struct kunit_suite``. For example: + +.. code-block:: c + + static struct kunit_case example_test_cases[] = { + KUNIT_CASE(example_test_foo), + KUNIT_CASE(example_test_bar), + KUNIT_CASE(example_test_baz), + {} + }; + + static struct kunit_suite example_test_suite = { + .name = "example", + .init = example_test_init, + .exit = example_test_exit, + .test_cases = example_test_cases, + }; + kunit_test_suite(example_test_suite); + +In the above example, the test suite ``example_test_suite``, runs the +test cases ``example_test_foo``, ``example_test_bar``, and +``example_test_baz``. Before running the test, the ``example_test_init`` +is called and after running the test, ``example_test_exit`` is called. +The ``kunit_test_suite(example_test_suite)`` registers the test suite +with the KUnit test framework. + +Executor +-------- + +The KUnit executor can list and run built-in KUnit tests on boot. +The Test suites are stored in a linker section +called ``.kunit_test_suites``. For the code, see ``KUNIT_TABLE()`` macro +definition in +`include/asm-generic/vmlinux.lds.h <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/asm-generic/vmlinux.lds.h?h=v6.0#n950>`_. +The linker section consists of an array of pointers to +``struct kunit_suite``, and is populated by the ``kunit_test_suites()`` +macro. The KUnit executor iterates over the linker section array in order to +run all the tests that are compiled into the kernel. + +.. kernel-figure:: kunit_suitememorydiagram.svg + :alt: KUnit Suite Memory + + KUnit Suite Memory Diagram + +On the kernel boot, the KUnit executor uses the start and end addresses +of this section to iterate over and run all tests. For the implementation of the +executor, see +`lib/kunit/executor.c <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/kunit/executor.c>`_. +When built as a module, the ``kunit_test_suites()`` macro defines a +``module_init()`` function, which runs all the tests in the compilation +unit instead of utilizing the executor. + +In KUnit tests, some error classes do not affect other tests +or parts of the kernel, each KUnit case executes in a separate thread +context. See the ``kunit_try_catch_run()`` function in +`lib/kunit/try-catch.c <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/kunit/try-catch.c?h=v5.15#n58>`_. + +Assertion Macros +---------------- + +KUnit tests verify state using expectations/assertions. +All expectations/assertions are formatted as: +``KUNIT_{EXPECT|ASSERT}_<op>[_MSG](kunit, property[, message])`` + +- ``{EXPECT|ASSERT}`` determines whether the check is an assertion or an + expectation. + In the event of a failure, the testing flow differs as follows: + + - For expectations, the test is marked as failed and the failure is logged. + + - Failing assertions, on the other hand, result in the test case being + terminated immediately. + + - Assertions call the function: + ``void __noreturn __kunit_abort(struct kunit *)``. + + - ``__kunit_abort`` calls the function: + ``void __noreturn kunit_try_catch_throw(struct kunit_try_catch *try_catch)``. + + - ``kunit_try_catch_throw`` calls the function: + ``void kthread_complete_and_exit(struct completion *, long) __noreturn;`` + and terminates the special thread context. + +- ``<op>`` denotes a check with options: ``TRUE`` (supplied property + has the boolean value "true"), ``EQ`` (two supplied properties are + equal), ``NOT_ERR_OR_NULL`` (supplied pointer is not null and does not + contain an "err" value). + +- ``[_MSG]`` prints a custom message on failure. + +Test Result Reporting +--------------------- +KUnit prints the test results in KTAP format. KTAP is based on TAP14, see +Documentation/dev-tools/ktap.rst. +KTAP works with KUnit and Kselftest. The KUnit executor prints KTAP results to +dmesg, and debugfs (if configured). + +Parameterized Tests +------------------- + +Each KUnit parameterized test is associated with a collection of +parameters. The test is invoked multiple times, once for each parameter +value and the parameter is stored in the ``param_value`` field. +The test case includes a KUNIT_CASE_PARAM() macro that accepts a +generator function. The generator function is passed the previous parameter +and returns the next parameter. It also includes a macro for generating +array-based common-case generators. + +kunit_tool (Command-line Test Harness) +====================================== + +``kunit_tool`` is a Python script, found in ``tools/testing/kunit/kunit.py``. It +is used to configure, build, execute, parse test results and run all of the +previous commands in correct order (i.e., configure, build, execute and parse). +You have two options for running KUnit tests: either build the kernel with KUnit +enabled and manually parse the results (see +Documentation/dev-tools/kunit/run_manual.rst) or use ``kunit_tool`` +(see Documentation/dev-tools/kunit/run_wrapper.rst). + +- ``configure`` command generates the kernel ``.config`` from a + ``.kunitconfig`` file (and any architecture-specific options). + The Python scripts available in ``qemu_configs`` folder + (for example, ``tools/testing/kunit/qemu configs/powerpc.py``) contains + additional configuration options for specific architectures. + It parses both the existing ``.config`` and the ``.kunitconfig`` files + to ensure that ``.config`` is a superset of ``.kunitconfig``. + If not, it will combine the two and run ``make olddefconfig`` to regenerate + the ``.config`` file. It then checks to see if ``.config`` has become a superset. + This verifies that all the Kconfig dependencies are correctly specified in the + file ``.kunitconfig``. The ``kunit_config.py`` script contains the code for parsing + Kconfigs. The code which runs ``make olddefconfig`` is part of the + ``kunit_kernel.py`` script. You can invoke this command through: + ``./tools/testing/kunit/kunit.py config`` and + generate a ``.config`` file. +- ``build`` runs ``make`` on the kernel tree with required options + (depends on the architecture and some options, for example: build_dir) + and reports any errors. + To build a KUnit kernel from the current ``.config``, you can use the + ``build`` argument: ``./tools/testing/kunit/kunit.py build``. +- ``exec`` command executes kernel results either directly (using + User-mode Linux configuration), or through an emulator such + as QEMU. It reads results from the log using standard + output (stdout), and passes them to ``parse`` to be parsed. + If you already have built a kernel with built-in KUnit tests, + you can run the kernel and display the test results with the ``exec`` + argument: ``./tools/testing/kunit/kunit.py exec``. +- ``parse`` extracts the KTAP output from a kernel log, parses + the test results, and prints a summary. For failed tests, any + diagnostic output will be included. diff --git a/Documentation/dev-tools/kunit/faq.rst b/Documentation/dev-tools/kunit/faq.rst new file mode 100644 index 0000000000..fae426f263 --- /dev/null +++ b/Documentation/dev-tools/kunit/faq.rst @@ -0,0 +1,104 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================== +Frequently Asked Questions +========================== + +How is this different from Autotest, kselftest, and so on? +========================================================== +KUnit is a unit testing framework. Autotest, kselftest (and some others) are +not. + +A `unit test <https://martinfowler.com/bliki/UnitTest.html>`_ is supposed to +test a single unit of code in isolation and hence the name *unit test*. A unit +test should be the finest granularity of testing and should allow all possible +code paths to be tested in the code under test. This is only possible if the +code under test is small and does not have any external dependencies outside of +the test's control like hardware. + +There are no testing frameworks currently available for the kernel that do not +require installing the kernel on a test machine or in a virtual machine. All +testing frameworks require tests to be written in userspace and run on the +kernel under test. This is true for Autotest, kselftest, and some others, +disqualifying any of them from being considered unit testing frameworks. + +Does KUnit support running on architectures other than UML? +=========================================================== + +Yes, mostly. + +For the most part, the KUnit core framework (what we use to write the tests) +can compile to any architecture. It compiles like just another part of the +kernel and runs when the kernel boots, or when built as a module, when the +module is loaded. However, there is infrastructure, like the KUnit Wrapper +(``tools/testing/kunit/kunit.py``) that might not support some architectures +(see :ref:`kunit-on-qemu`). + +In short, yes, you can run KUnit on other architectures, but it might require +more work than using KUnit on UML. + +For more information, see :ref:`kunit-on-non-uml`. + +.. _kinds-of-tests: + +What is the difference between a unit test and other kinds of tests? +==================================================================== +Most existing tests for the Linux kernel would be categorized as an integration +test, or an end-to-end test. + +- A unit test is supposed to test a single unit of code in isolation. A unit + test should be the finest granularity of testing and, as such, allows all + possible code paths to be tested in the code under test. This is only possible + if the code under test is small and does not have any external dependencies + outside of the test's control like hardware. +- An integration test tests the interaction between a minimal set of components, + usually just two or three. For example, someone might write an integration + test to test the interaction between a driver and a piece of hardware, or to + test the interaction between the userspace libraries the kernel provides and + the kernel itself. However, one of these tests would probably not test the + entire kernel along with hardware interactions and interactions with the + userspace. +- An end-to-end test usually tests the entire system from the perspective of the + code under test. For example, someone might write an end-to-end test for the + kernel by installing a production configuration of the kernel on production + hardware with a production userspace and then trying to exercise some behavior + that depends on interactions between the hardware, the kernel, and userspace. + +KUnit is not working, what should I do? +======================================= + +Unfortunately, there are a number of things which can break, but here are some +things to try. + +1. Run ``./tools/testing/kunit/kunit.py run`` with the ``--raw_output`` + parameter. This might show details or error messages hidden by the kunit_tool + parser. +2. Instead of running ``kunit.py run``, try running ``kunit.py config``, + ``kunit.py build``, and ``kunit.py exec`` independently. This can help track + down where an issue is occurring. (If you think the parser is at fault, you + can run it manually against ``stdin`` or a file with ``kunit.py parse``.) +3. Running the UML kernel directly can often reveal issues or error messages, + ``kunit_tool`` ignores. This should be as simple as running ``./vmlinux`` + after building the UML kernel (for example, by using ``kunit.py build``). + Note that UML has some unusual requirements (such as the host having a tmpfs + filesystem mounted), and has had issues in the past when built statically and + the host has KASLR enabled. (On older host kernels, you may need to run + ``setarch `uname -m` -R ./vmlinux`` to disable KASLR.) +4. Make sure the kernel .config has ``CONFIG_KUNIT=y`` and at least one test + (e.g. ``CONFIG_KUNIT_EXAMPLE_TEST=y``). kunit_tool will keep its .config + around, so you can see what config was used after running ``kunit.py run``. + It also preserves any config changes you might make, so you can + enable/disable things with ``make ARCH=um menuconfig`` or similar, and then + re-run kunit_tool. +5. Try to run ``make ARCH=um defconfig`` before running ``kunit.py run``. This + may help clean up any residual config items which could be causing problems. +6. Finally, try running KUnit outside UML. KUnit and KUnit tests can be + built into any kernel, or can be built as a module and loaded at runtime. + Doing so should allow you to determine if UML is causing the issue you're + seeing. When tests are built-in, they will execute when the kernel boots, and + modules will automatically execute associated tests when loaded. Test results + can be collected from ``/sys/kernel/debug/kunit/<test suite>/results``, and + can be parsed with ``kunit.py parse``. For more details, see :ref:`kunit-on-qemu`. + +If none of the above tricks help, you are always welcome to email any issues to +kunit-dev@googlegroups.com. diff --git a/Documentation/dev-tools/kunit/index.rst b/Documentation/dev-tools/kunit/index.rst new file mode 100644 index 0000000000..b3593ae29a --- /dev/null +++ b/Documentation/dev-tools/kunit/index.rst @@ -0,0 +1,109 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================================= +KUnit - Linux Kernel Unit Testing +================================= + +.. toctree:: + :maxdepth: 2 + :caption: Contents: + + start + architecture + run_wrapper + run_manual + usage + api/index + style + faq + running_tips + +This section details the kernel unit testing framework. + +Introduction +============ + +KUnit (Kernel unit testing framework) provides a common framework for +unit tests within the Linux kernel. Using KUnit, you can define groups +of test cases called test suites. The tests either run on kernel boot +if built-in, or load as a module. KUnit automatically flags and reports +failed test cases in the kernel log. The test results appear in +:doc:`KTAP (Kernel - Test Anything Protocol) format</dev-tools/ktap>`. +It is inspired by JUnit, Python’s unittest.mock, and GoogleTest/GoogleMock +(C++ unit testing framework). + +KUnit tests are part of the kernel, written in the C (programming) +language, and test parts of the Kernel implementation (example: a C +language function). Excluding build time, from invocation to +completion, KUnit can run around 100 tests in less than 10 seconds. +KUnit can test any kernel component, for example: file system, system +calls, memory management, device drivers and so on. + +KUnit follows the white-box testing approach. The test has access to +internal system functionality. KUnit runs in kernel space and is not +restricted to things exposed to user-space. + +In addition, KUnit has kunit_tool, a script (``tools/testing/kunit/kunit.py``) +that configures the Linux kernel, runs KUnit tests under QEMU or UML +(:doc:`User Mode Linux </virt/uml/user_mode_linux_howto_v2>`), +parses the test results and +displays them in a user friendly manner. + +Features +-------- + +- Provides a framework for writing unit tests. +- Runs tests on any kernel architecture. +- Runs a test in milliseconds. + +Prerequisites +------------- + +- Any Linux kernel compatible hardware. +- For Kernel under test, Linux kernel version 5.5 or greater. + +Unit Testing +============ + +A unit test tests a single unit of code in isolation. A unit test is the finest +granularity of testing and allows all possible code paths to be tested in the +code under test. This is possible if the code under test is small and does not +have any external dependencies outside of the test's control like hardware. + + +Write Unit Tests +---------------- + +To write good unit tests, there is a simple but powerful pattern: +Arrange-Act-Assert. This is a great way to structure test cases and +defines an order of operations. + +- Arrange inputs and targets: At the start of the test, arrange the data + that allows a function to work. Example: initialize a statement or + object. +- Act on the target behavior: Call your function/code under test. +- Assert expected outcome: Verify that the result (or resulting state) is as + expected. + +Unit Testing Advantages +----------------------- + +- Increases testing speed and development in the long run. +- Detects bugs at initial stage and therefore decreases bug fix cost + compared to acceptance testing. +- Improves code quality. +- Encourages writing testable code. + +Read also :ref:`kinds-of-tests`. + +How do I use it? +================ + +You can find a step-by-step guide to writing and running KUnit tests in +Documentation/dev-tools/kunit/start.rst + +Alternatively, feel free to look through the rest of the KUnit documentation, +or to experiment with tools/testing/kunit/kunit.py and the example test under +lib/kunit/kunit-example-test.c + +Happy testing! diff --git a/Documentation/dev-tools/kunit/kunit_suitememorydiagram.svg b/Documentation/dev-tools/kunit/kunit_suitememorydiagram.svg new file mode 100644 index 0000000000..cf8fddc275 --- /dev/null +++ b/Documentation/dev-tools/kunit/kunit_suitememorydiagram.svg @@ -0,0 +1,81 @@ +<?xml version="1.0" encoding="UTF-8"?> +<svg width="796.93" height="555.73" version="1.1" viewBox="0 0 796.93 555.73" xmlns="http://www.w3.org/2000/svg"> + <g transform="translate(-13.724 -17.943)"> + <g fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"> + <rect x="323.56" y="18.443" width="115.75" height="41.331"/> + <rect x="323.56" y="463.09" width="115.75" height="41.331"/> + <rect x="323.56" y="531.84" width="115.75" height="41.331"/> + <rect x="323.56" y="88.931" width="115.75" height="74.231"/> + </g> + <g> + <rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/> + <text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text> + </g> + <g transform="translate(0 -258.6)"> + <rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/> + <text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text> + </g> + <g transform="translate(0 -217.27)"> + <rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/> + <text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text> + </g> + <g transform="translate(0 -175.94)"> + <rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/> + <text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text> + </g> + <g transform="translate(0 -134.61)"> + <rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/> + <text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text> + </g> + <g transform="translate(0 -41.331)"> + <rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/> + <text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text> + </g> + <g transform="translate(3.4459e-5 -.71088)"> + <rect x="502.19" y="143.16" width="201.13" height="41.331" fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"/> + <text x="512.02319" y="168.02026" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="512.02319" y="168.02026" font-family="monospace">_kunit_suites_start</tspan></text> + </g> + <g transform="translate(3.0518e-5 -3.1753)"> + <rect x="502.19" y="445.69" width="201.13" height="41.331" fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"/> + <text x="521.61694" y="470.54846" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="521.61694" y="470.54846" font-family="monospace">_kunit_suites_end</tspan></text> + </g> + <rect x="14.224" y="277.78" width="134.47" height="41.331" fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"/> + <text x="32.062176" y="304.41287" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="32.062176" y="304.41287" font-family="monospace">.init.data</tspan></text> + <g transform="translate(217.98 145.12)" stroke="#1a1a1a"> + <circle cx="149.97" cy="373.01" r="3.4012"/> + <circle cx="163.46" cy="373.01" r="3.4012"/> + <circle cx="176.95" cy="373.01" r="3.4012"/> + </g> + <g transform="translate(217.98 -298.66)" stroke="#1a1a1a"> + <circle cx="149.97" cy="373.01" r="3.4012"/> + <circle cx="163.46" cy="373.01" r="3.4012"/> + <circle cx="176.95" cy="373.01" r="3.4012"/> + </g> + <g stroke="#1a1a1a"> + <rect x="323.56" y="328.49" width="115.75" height="51.549" fill="#b9dbc6"/> + <g transform="translate(217.98 -18.75)"> + <circle cx="149.97" cy="373.01" r="3.4012"/> + <circle cx="163.46" cy="373.01" r="3.4012"/> + <circle cx="176.95" cy="373.01" r="3.4012"/> + </g> + </g> + <g transform="scale(1.0933 .9147)" stroke-width="32.937" aria-label="{"> + <path d="m275.49 545.57c-35.836-8.432-47.43-24.769-47.957-64.821v-88.536c-0.527-44.795-10.54-57.97-49.538-67.456 38.998-10.013 49.011-23.715 49.538-67.983v-88.536c0.527-40.052 12.121-56.389 47.957-64.821v-5.797c-65.348 0-85.901 17.391-86.955 73.253v93.806c-0.527 36.89-10.013 50.065-44.795 59.551 34.782 10.013 44.268 23.188 44.795 60.078v93.279c1.581 56.389 21.607 73.78 86.955 73.78z"/> + </g> + <g transform="scale(1.1071 .90325)" stroke-width="14.44" aria-label="{"> + <path d="m461.46 443.55c-15.711-3.6967-20.794-10.859-21.025-28.418v-38.815c-0.23104-19.639-4.6209-25.415-21.718-29.574 17.097-4.3898 21.487-10.397 21.718-29.805v-38.815c0.23105-17.559 5.314-24.722 21.025-28.418v-2.5415c-28.649 0-37.66 7.6244-38.122 32.115v41.126c-0.23105 16.173-4.3898 21.949-19.639 26.108 15.249 4.3898 19.408 10.166 19.639 26.339v40.895c0.69313 24.722 9.4728 32.346 38.122 32.346z"/> + </g> + <path d="m449.55 161.84v2.5h49.504v-2.5z" color="#000000" style="-inkscape-stroke:none"/> + <g fill-rule="evenodd"> + <path d="m443.78 163.09 8.65-5v10z" color="#000000" stroke-width="1pt" style="-inkscape-stroke:none"/> + <path d="m453.1 156.94-10.648 6.1543 0.99804 0.57812 9.6504 5.5781zm-1.334 2.3125v7.6856l-6.6504-3.8438z" color="#000000" style="-inkscape-stroke:none"/> + </g> + <path d="m449.55 461.91v2.5h49.504v-2.5z" color="#000000" style="-inkscape-stroke:none"/> + <g fill-rule="evenodd"> + <path d="m443.78 463.16 8.65-5v10z" color="#000000" stroke-width="1pt" style="-inkscape-stroke:none"/> + <path d="m453.1 457-10.648 6.1562 0.99804 0.57617 9.6504 5.5781zm-1.334 2.3125v7.6856l-6.6504-3.8438z" color="#000000" style="-inkscape-stroke:none"/> + </g> + <rect x="515.64" y="223.9" width="294.52" height="178.49" fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"/> + <text x="523.33319" y="262.52542" font-family="monospace" font-size="14.667px" style="line-height:1.25" xml:space="preserve"><tspan x="523.33319" y="262.52542"><tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">struct</tspan> kunit_suite {</tspan><tspan x="523.33319" y="280.8588"><tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold"> const char</tspan> name[<tspan fill="#ff00ff" font-size="14.667px">256</tspan>];</tspan><tspan x="523.33319" y="299.19217"> <tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">int</tspan> (*init)(<tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">struct</tspan> kunit *);</tspan><tspan x="523.33319" y="317.52554"> <tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">void</tspan> (*exit)(<tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">struct</tspan> kunit *);</tspan><tspan x="523.33319" y="335.85892"> <tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">struct</tspan> kunit_case *test_cases;</tspan><tspan x="523.33319" y="354.19229"> ...</tspan><tspan x="523.33319" y="372.52567">};</tspan></text> + </g> +</svg> diff --git a/Documentation/dev-tools/kunit/run_manual.rst b/Documentation/dev-tools/kunit/run_manual.rst new file mode 100644 index 0000000000..e7b46421f2 --- /dev/null +++ b/Documentation/dev-tools/kunit/run_manual.rst @@ -0,0 +1,57 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================ +Run Tests without kunit_tool +============================ + +If we do not want to use kunit_tool (For example: we want to integrate +with other systems, or run tests on real hardware), we can +include KUnit in any kernel, read out results, and parse manually. + +.. note:: KUnit is not designed for use in a production system. It is + possible that tests may reduce the stability or security of + the system. + +Configure the Kernel +==================== + +KUnit tests can run without kunit_tool. This can be useful, if: + +- We have an existing kernel configuration to test. +- Need to run on real hardware (or using an emulator/VM kunit_tool + does not support). +- Wish to integrate with some existing testing systems. + +KUnit is configured with the ``CONFIG_KUNIT`` option, and individual +tests can also be built by enabling their config options in our +``.config``. KUnit tests usually (but don't always) have config options +ending in ``_KUNIT_TEST``. Most tests can either be built as a module, +or be built into the kernel. + +.. note :: + + We can enable the ``KUNIT_ALL_TESTS`` config option to + automatically enable all tests with satisfied dependencies. This is + a good way of quickly testing everything applicable to the current + config. + +Once we have built our kernel (and/or modules), it is simple to run +the tests. If the tests are built-in, they will run automatically on the +kernel boot. The results will be written to the kernel log (``dmesg``) +in TAP format. + +If the tests are built as modules, they will run when the module is +loaded. + +.. code-block :: bash + + # modprobe example-test + +The results will appear in TAP format in ``dmesg``. + +.. note :: + + If ``CONFIG_KUNIT_DEBUGFS`` is enabled, KUnit test results will + be accessible from the ``debugfs`` filesystem (if mounted). + They will be in ``/sys/kernel/debug/kunit/<test_suite>/results``, in + TAP format. diff --git a/Documentation/dev-tools/kunit/run_wrapper.rst b/Documentation/dev-tools/kunit/run_wrapper.rst new file mode 100644 index 0000000000..19ddf5e070 --- /dev/null +++ b/Documentation/dev-tools/kunit/run_wrapper.rst @@ -0,0 +1,335 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================= +Running tests with kunit_tool +============================= + +We can either run KUnit tests using kunit_tool or can run tests +manually, and then use kunit_tool to parse the results. To run tests +manually, see: Documentation/dev-tools/kunit/run_manual.rst. +As long as we can build the kernel, we can run KUnit. + +kunit_tool is a Python script which configures and builds a kernel, runs +tests, and formats the test results. + +Run command: + +.. code-block:: + + ./tools/testing/kunit/kunit.py run + +We should see the following: + +.. code-block:: + + Configuring KUnit Kernel ... + Building KUnit kernel... + Starting KUnit kernel... + +We may want to use the following options: + +.. code-block:: + + ./tools/testing/kunit/kunit.py run --timeout=30 --jobs=`nproc --all` + +- ``--timeout`` sets a maximum amount of time for tests to run. +- ``--jobs`` sets the number of threads to build the kernel. + +kunit_tool will generate a ``.kunitconfig`` with a default +configuration, if no other ``.kunitconfig`` file exists +(in the build directory). In addition, it verifies that the +generated ``.config`` file contains the ``CONFIG`` options in the +``.kunitconfig``. +It is also possible to pass a separate ``.kunitconfig`` fragment to +kunit_tool. This is useful if we have several different groups of +tests we want to run independently, or if we want to use pre-defined +test configs for certain subsystems. + +To use a different ``.kunitconfig`` file (such as one +provided to test a particular subsystem), pass it as an option: + +.. code-block:: + + ./tools/testing/kunit/kunit.py run --kunitconfig=fs/ext4/.kunitconfig + +To view kunit_tool flags (optional command-line arguments), run: + +.. code-block:: + + ./tools/testing/kunit/kunit.py run --help + +Creating a ``.kunitconfig`` file +================================ + +If we want to run a specific set of tests (rather than those listed +in the KUnit ``defconfig``), we can provide Kconfig options in the +``.kunitconfig`` file. For default .kunitconfig, see: +https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/kunit/configs/default.config. +A ``.kunitconfig`` is a ``minconfig`` (a .config +generated by running ``make savedefconfig``), used for running a +specific set of tests. This file contains the regular Kernel configs +with specific test targets. The ``.kunitconfig`` also +contains any other config options required by the tests (For example: +dependencies for features under tests, configs that enable/disable +certain code blocks, arch configs and so on). + +To create a ``.kunitconfig``, using the KUnit ``defconfig``: + +.. code-block:: + + cd $PATH_TO_LINUX_REPO + cp tools/testing/kunit/configs/default.config .kunit/.kunitconfig + +We can then add any other Kconfig options. For example: + +.. code-block:: + + CONFIG_LIST_KUNIT_TEST=y + +kunit_tool ensures that all config options in ``.kunitconfig`` are +set in the kernel ``.config`` before running the tests. It warns if we +have not included the options dependencies. + +.. note:: Removing something from the ``.kunitconfig`` will + not rebuild the ``.config file``. The configuration is only + updated if the ``.kunitconfig`` is not a subset of ``.config``. + This means that we can use other tools + (For example: ``make menuconfig``) to adjust other config options. + The build dir needs to be set for ``make menuconfig`` to + work, therefore by default use ``make O=.kunit menuconfig``. + +Configuring, building, and running tests +======================================== + +If we want to make manual changes to the KUnit build process, we +can run part of the KUnit build process independently. +When running kunit_tool, from a ``.kunitconfig``, we can generate a +``.config`` by using the ``config`` argument: + +.. code-block:: + + ./tools/testing/kunit/kunit.py config + +To build a KUnit kernel from the current ``.config``, we can use the +``build`` argument: + +.. code-block:: + + ./tools/testing/kunit/kunit.py build + +If we already have built UML kernel with built-in KUnit tests, we +can run the kernel, and display the test results with the ``exec`` +argument: + +.. code-block:: + + ./tools/testing/kunit/kunit.py exec + +The ``run`` command discussed in section: **Running tests with kunit_tool**, +is equivalent to running the above three commands in sequence. + +Parsing test results +==================== + +KUnit tests output displays results in TAP (Test Anything Protocol) +format. When running tests, kunit_tool parses this output and prints +a summary. To see the raw test results in TAP format, we can pass the +``--raw_output`` argument: + +.. code-block:: + + ./tools/testing/kunit/kunit.py run --raw_output + +If we have KUnit results in the raw TAP format, we can parse them and +print the human-readable summary with the ``parse`` command for +kunit_tool. This accepts a filename for an argument, or will read from +standard input. + +.. code-block:: bash + + # Reading from a file + ./tools/testing/kunit/kunit.py parse /var/log/dmesg + # Reading from stdin + dmesg | ./tools/testing/kunit/kunit.py parse + +Filtering tests +=============== + +By passing a bash style glob filter to the ``exec`` or ``run`` +commands, we can run a subset of the tests built into a kernel . For +example: if we only want to run KUnit resource tests, use: + +.. code-block:: + + ./tools/testing/kunit/kunit.py run 'kunit-resource*' + +This uses the standard glob format with wildcard characters. + +.. _kunit-on-qemu: + +Running tests on QEMU +===================== + +kunit_tool supports running tests on qemu as well as +via UML. To run tests on qemu, by default it requires two flags: + +- ``--arch``: Selects a configs collection (Kconfig, qemu config options + and so on), that allow KUnit tests to be run on the specified + architecture in a minimal way. The architecture argument is same as + the option name passed to the ``ARCH`` variable used by Kbuild. + Not all architectures currently support this flag, but we can use + ``--qemu_config`` to handle it. If ``um`` is passed (or this flag + is ignored), the tests will run via UML. Non-UML architectures, + for example: i386, x86_64, arm and so on; run on qemu. + +- ``--cross_compile``: Specifies the Kbuild toolchain. It passes the + same argument as passed to the ``CROSS_COMPILE`` variable used by + Kbuild. As a reminder, this will be the prefix for the toolchain + binaries such as GCC. For example: + + - ``sparc64-linux-gnu`` if we have the sparc toolchain installed on + our system. + + - ``$HOME/toolchains/microblaze/gcc-9.2.0-nolibc/microblaze-linux/bin/microblaze-linux`` + if we have downloaded the microblaze toolchain from the 0-day + website to a directory in our home directory called toolchains. + +This means that for most architectures, running under qemu is as simple as: + +.. code-block:: bash + + ./tools/testing/kunit/kunit.py run --arch=x86_64 + +When cross-compiling, we'll likely need to specify a different toolchain, for +example: + +.. code-block:: bash + + ./tools/testing/kunit/kunit.py run \ + --arch=s390 \ + --cross_compile=s390x-linux-gnu- + +If we want to run KUnit tests on an architecture not supported by +the ``--arch`` flag, or want to run KUnit tests on qemu using a +non-default configuration; then we can write our own``QemuConfig``. +These ``QemuConfigs`` are written in Python. They have an import line +``from..qemu_config import QemuArchParams`` at the top of the file. +The file must contain a variable called ``QEMU_ARCH`` that has an +instance of ``QemuArchParams`` assigned to it. See example in: +``tools/testing/kunit/qemu_configs/x86_64.py``. + +Once we have a ``QemuConfig``, we can pass it into kunit_tool, +using the ``--qemu_config`` flag. When used, this flag replaces the +``--arch`` flag. For example: using +``tools/testing/kunit/qemu_configs/x86_64.py``, the invocation appear +as + +.. code-block:: bash + + ./tools/testing/kunit/kunit.py run \ + --timeout=60 \ + --jobs=12 \ + --qemu_config=./tools/testing/kunit/qemu_configs/x86_64.py + +Running command-line arguments +============================== + +kunit_tool has a number of other command-line arguments which can +be useful for our test environment. Below are the most commonly used +command line arguments: + +- ``--help``: Lists all available options. To list common options, + place ``--help`` before the command. To list options specific to that + command, place ``--help`` after the command. + + .. note:: Different commands (``config``, ``build``, ``run``, etc) + have different supported options. +- ``--build_dir``: Specifies kunit_tool build directory. It includes + the ``.kunitconfig``, ``.config`` files and compiled kernel. + +- ``--make_options``: Specifies additional options to pass to make, when + compiling a kernel (using ``build`` or ``run`` commands). For example: + to enable compiler warnings, we can pass ``--make_options W=1``. + +- ``--alltests``: Enable a predefined set of options in order to build + as many tests as possible. + + .. note:: The list of enabled options can be found in + ``tools/testing/kunit/configs/all_tests.config``. + + If you only want to enable all tests with otherwise satisfied + dependencies, instead add ``CONFIG_KUNIT_ALL_TESTS=y`` to your + ``.kunitconfig``. + +- ``--kunitconfig``: Specifies the path or the directory of the ``.kunitconfig`` + file. For example: + + - ``lib/kunit/.kunitconfig`` can be the path of the file. + + - ``lib/kunit`` can be the directory in which the file is located. + + This file is used to build and run with a predefined set of tests + and their dependencies. For example, to run tests for a given subsystem. + +- ``--kconfig_add``: Specifies additional configuration options to be + appended to the ``.kunitconfig`` file. For example: + + .. code-block:: + + ./tools/testing/kunit/kunit.py run --kconfig_add CONFIG_KASAN=y + +- ``--arch``: Runs tests on the specified architecture. The architecture + argument is same as the Kbuild ARCH environment variable. + For example, i386, x86_64, arm, um, etc. Non-UML architectures run on qemu. + Default is `um`. + +- ``--cross_compile``: Specifies the Kbuild toolchain. It passes the + same argument as passed to the ``CROSS_COMPILE`` variable used by + Kbuild. This will be the prefix for the toolchain + binaries such as GCC. For example: + + - ``sparc64-linux-gnu-`` if we have the sparc toolchain installed on + our system. + + - ``$HOME/toolchains/microblaze/gcc-9.2.0-nolibc/microblaze-linux/bin/microblaze-linux`` + if we have downloaded the microblaze toolchain from the 0-day + website to a specified path in our home directory called toolchains. + +- ``--qemu_config``: Specifies the path to a file containing a + custom qemu architecture definition. This should be a python file + containing a `QemuArchParams` object. + +- ``--qemu_args``: Specifies additional qemu arguments, for example, ``-smp 8``. + +- ``--jobs``: Specifies the number of jobs (commands) to run simultaneously. + By default, this is set to the number of cores on your system. + +- ``--timeout``: Specifies the maximum number of seconds allowed for all tests to run. + This does not include the time taken to build the tests. + +- ``--kernel_args``: Specifies additional kernel command-line arguments. May be repeated. + +- ``--run_isolated``: If set, boots the kernel for each individual suite/test. + This is useful for debugging a non-hermetic test, one that + might pass/fail based on what ran before it. + +- ``--raw_output``: If set, generates unformatted output from kernel. Possible options are: + + - ``all``: To view the full kernel output, use ``--raw_output=all``. + + - ``kunit``: This is the default option and filters to KUnit output. Use ``--raw_output`` or ``--raw_output=kunit``. + +- ``--json``: If set, stores the test results in a JSON format and prints to `stdout` or + saves to a file if a filename is specified. + +- ``--filter``: Specifies filters on test attributes, for example, ``speed!=slow``. + Multiple filters can be used by wrapping input in quotes and separating filters + by commas. Example: ``--filter "speed>slow, module=example"``. + +- ``--filter_action``: If set to ``skip``, filtered tests will be shown as skipped + in the output rather than showing no output. + +- ``--list_tests``: If set, lists all tests that will be run. + +- ``--list_tests_attr``: If set, lists all tests that will be run and all of their + attributes. diff --git a/Documentation/dev-tools/kunit/running_tips.rst b/Documentation/dev-tools/kunit/running_tips.rst new file mode 100644 index 0000000000..766f9cdea0 --- /dev/null +++ b/Documentation/dev-tools/kunit/running_tips.rst @@ -0,0 +1,430 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================ +Tips For Running KUnit Tests +============================ + +Using ``kunit.py run`` ("kunit tool") +===================================== + +Running from any directory +-------------------------- + +It can be handy to create a bash function like: + +.. code-block:: bash + + function run_kunit() { + ( cd "$(git rev-parse --show-toplevel)" && ./tools/testing/kunit/kunit.py run "$@" ) + } + +.. note:: + Early versions of ``kunit.py`` (before 5.6) didn't work unless run from + the kernel root, hence the use of a subshell and ``cd``. + +Running a subset of tests +------------------------- + +``kunit.py run`` accepts an optional glob argument to filter tests. The format +is ``"<suite_glob>[.test_glob]"``. + +Say that we wanted to run the sysctl tests, we could do so via: + +.. code-block:: bash + + $ echo -e 'CONFIG_KUNIT=y\nCONFIG_KUNIT_ALL_TESTS=y' > .kunit/.kunitconfig + $ ./tools/testing/kunit/kunit.py run 'sysctl*' + +We can filter down to just the "write" tests via: + +.. code-block:: bash + + $ echo -e 'CONFIG_KUNIT=y\nCONFIG_KUNIT_ALL_TESTS=y' > .kunit/.kunitconfig + $ ./tools/testing/kunit/kunit.py run 'sysctl*.*write*' + +We're paying the cost of building more tests than we need this way, but it's +easier than fiddling with ``.kunitconfig`` files or commenting out +``kunit_suite``'s. + +However, if we wanted to define a set of tests in a less ad hoc way, the next +tip is useful. + +Defining a set of tests +----------------------- + +``kunit.py run`` (along with ``build``, and ``config``) supports a +``--kunitconfig`` flag. So if you have a set of tests that you want to run on a +regular basis (especially if they have other dependencies), you can create a +specific ``.kunitconfig`` for them. + +E.g. kunit has one for its tests: + +.. code-block:: bash + + $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit/.kunitconfig + +Alternatively, if you're following the convention of naming your +file ``.kunitconfig``, you can just pass in the dir, e.g. + +.. code-block:: bash + + $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit + +.. note:: + This is a relatively new feature (5.12+) so we don't have any + conventions yet about on what files should be checked in versus just + kept around locally. It's up to you and your maintainer to decide if a + config is useful enough to submit (and therefore have to maintain). + +.. note:: + Having ``.kunitconfig`` fragments in a parent and child directory is + iffy. There's discussion about adding an "import" statement in these + files to make it possible to have a top-level config run tests from all + child directories. But that would mean ``.kunitconfig`` files are no + longer just simple .config fragments. + + One alternative would be to have kunit tool recursively combine configs + automagically, but tests could theoretically depend on incompatible + options, so handling that would be tricky. + +Setting kernel commandline parameters +------------------------------------- + +You can use ``--kernel_args`` to pass arbitrary kernel arguments, e.g. + +.. code-block:: bash + + $ ./tools/testing/kunit/kunit.py run --kernel_args=param=42 --kernel_args=param2=false + + +Generating code coverage reports under UML +------------------------------------------ + +.. note:: + TODO(brendanhiggins@google.com): There are various issues with UML and + versions of gcc 7 and up. You're likely to run into missing ``.gcda`` + files or compile errors. + +This is different from the "normal" way of getting coverage information that is +documented in Documentation/dev-tools/gcov.rst. + +Instead of enabling ``CONFIG_GCOV_KERNEL=y``, we can set these options: + +.. code-block:: none + + CONFIG_DEBUG_KERNEL=y + CONFIG_DEBUG_INFO=y + CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y + CONFIG_GCOV=y + + +Putting it together into a copy-pastable sequence of commands: + +.. code-block:: bash + + # Append coverage options to the current config + $ ./tools/testing/kunit/kunit.py run --kunitconfig=.kunit/ --kunitconfig=tools/testing/kunit/configs/coverage_uml.config + # Extract the coverage information from the build dir (.kunit/) + $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/ + + # From here on, it's the same process as with CONFIG_GCOV_KERNEL=y + # E.g. can generate an HTML report in a tmp dir like so: + $ genhtml -o /tmp/coverage_html coverage.info + + +If your installed version of gcc doesn't work, you can tweak the steps: + +.. code-block:: bash + + $ ./tools/testing/kunit/kunit.py run --make_options=CC=/usr/bin/gcc-6 + $ lcov -t "my_kunit_tests" -o coverage.info -c -d .kunit/ --gcov-tool=/usr/bin/gcov-6 + + +Running tests manually +====================== + +Running tests without using ``kunit.py run`` is also an important use case. +Currently it's your only option if you want to test on architectures other than +UML. + +As running the tests under UML is fairly straightforward (configure and compile +the kernel, run the ``./linux`` binary), this section will focus on testing +non-UML architectures. + + +Running built-in tests +---------------------- + +When setting tests to ``=y``, the tests will run as part of boot and print +results to dmesg in TAP format. So you just need to add your tests to your +``.config``, build and boot your kernel as normal. + +So if we compiled our kernel with: + +.. code-block:: none + + CONFIG_KUNIT=y + CONFIG_KUNIT_EXAMPLE_TEST=y + +Then we'd see output like this in dmesg signaling the test ran and passed: + +.. code-block:: none + + TAP version 14 + 1..1 + # Subtest: example + 1..1 + # example_simple_test: initializing + ok 1 - example_simple_test + ok 1 - example + +Running tests as modules +------------------------ + +Depending on the tests, you can build them as loadable modules. + +For example, we'd change the config options from before to + +.. code-block:: none + + CONFIG_KUNIT=y + CONFIG_KUNIT_EXAMPLE_TEST=m + +Then after booting into our kernel, we can run the test via + +.. code-block:: none + + $ modprobe kunit-example-test + +This will then cause it to print TAP output to stdout. + +.. note:: + The ``modprobe`` will *not* have a non-zero exit code if any test + failed (as of 5.13). But ``kunit.py parse`` would, see below. + +.. note:: + You can set ``CONFIG_KUNIT=m`` as well, however, some features will not + work and thus some tests might break. Ideally tests would specify they + depend on ``KUNIT=y`` in their ``Kconfig``'s, but this is an edge case + most test authors won't think about. + As of 5.13, the only difference is that ``current->kunit_test`` will + not exist. + +Pretty-printing results +----------------------- + +You can use ``kunit.py parse`` to parse dmesg for test output and print out +results in the same familiar format that ``kunit.py run`` does. + +.. code-block:: bash + + $ ./tools/testing/kunit/kunit.py parse /var/log/dmesg + + +Retrieving per suite results +---------------------------- + +Regardless of how you're running your tests, you can enable +``CONFIG_KUNIT_DEBUGFS`` to expose per-suite TAP-formatted results: + +.. code-block:: none + + CONFIG_KUNIT=y + CONFIG_KUNIT_EXAMPLE_TEST=m + CONFIG_KUNIT_DEBUGFS=y + +The results for each suite will be exposed under +``/sys/kernel/debug/kunit/<suite>/results``. +So using our example config: + +.. code-block:: bash + + $ modprobe kunit-example-test > /dev/null + $ cat /sys/kernel/debug/kunit/example/results + ... <TAP output> ... + + # After removing the module, the corresponding files will go away + $ modprobe -r kunit-example-test + $ cat /sys/kernel/debug/kunit/example/results + /sys/kernel/debug/kunit/example/results: No such file or directory + +Generating code coverage reports +-------------------------------- + +See Documentation/dev-tools/gcov.rst for details on how to do this. + +The only vaguely KUnit-specific advice here is that you probably want to build +your tests as modules. That way you can isolate the coverage from tests from +other code executed during boot, e.g. + +.. code-block:: bash + + # Reset coverage counters before running the test. + $ echo 0 > /sys/kernel/debug/gcov/reset + $ modprobe kunit-example-test + + +Test Attributes and Filtering +============================= + +Test suites and cases can be marked with test attributes, such as speed of +test. These attributes will later be printed in test output and can be used to +filter test execution. + +Marking Test Attributes +----------------------- + +Tests are marked with an attribute by including a ``kunit_attributes`` object +in the test definition. + +Test cases can be marked using the ``KUNIT_CASE_ATTR(test_name, attributes)`` +macro to define the test case instead of ``KUNIT_CASE(test_name)``. + +.. code-block:: c + + static const struct kunit_attributes example_attr = { + .speed = KUNIT_VERY_SLOW, + }; + + static struct kunit_case example_test_cases[] = { + KUNIT_CASE_ATTR(example_test, example_attr), + }; + +.. note:: + To mark a test case as slow, you can also use ``KUNIT_CASE_SLOW(test_name)``. + This is a helpful macro as the slow attribute is the most commonly used. + +Test suites can be marked with an attribute by setting the "attr" field in the +suite definition. + +.. code-block:: c + + static const struct kunit_attributes example_attr = { + .speed = KUNIT_VERY_SLOW, + }; + + static struct kunit_suite example_test_suite = { + ..., + .attr = example_attr, + }; + +.. note:: + Not all attributes need to be set in a ``kunit_attributes`` object. Unset + attributes will remain uninitialized and act as though the attribute is set + to 0 or NULL. Thus, if an attribute is set to 0, it is treated as unset. + These unset attributes will not be reported and may act as a default value + for filtering purposes. + +Reporting Attributes +-------------------- + +When a user runs tests, attributes will be present in the raw kernel output (in +KTAP format). Note that attributes will be hidden by default in kunit.py output +for all passing tests but the raw kernel output can be accessed using the +``--raw_output`` flag. This is an example of how test attributes for test cases +will be formatted in kernel output: + +.. code-block:: none + + # example_test.speed: slow + ok 1 example_test + +This is an example of how test attributes for test suites will be formatted in +kernel output: + +.. code-block:: none + + KTAP version 2 + # Subtest: example_suite + # module: kunit_example_test + 1..3 + ... + ok 1 example_suite + +Additionally, users can output a full attribute report of tests with their +attributes, using the command line flag ``--list_tests_attr``: + +.. code-block:: bash + + kunit.py run "example" --list_tests_attr + +.. note:: + This report can be accessed when running KUnit manually by passing in the + module_param ``kunit.action=list_attr``. + +Filtering +--------- + +Users can filter tests using the ``--filter`` command line flag when running +tests. As an example: + +.. code-block:: bash + + kunit.py run --filter speed=slow + + +You can also use the following operations on filters: "<", ">", "<=", ">=", +"!=", and "=". Example: + +.. code-block:: bash + + kunit.py run --filter "speed>slow" + +This example will run all tests with speeds faster than slow. Note that the +characters < and > are often interpreted by the shell, so they may need to be +quoted or escaped, as above. + +Additionally, you can use multiple filters at once. Simply separate filters +using commas. Example: + +.. code-block:: bash + + kunit.py run --filter "speed>slow, module=kunit_example_test" + +.. note:: + You can use this filtering feature when running KUnit manually by passing + the filter as a module param: ``kunit.filter="speed>slow, speed<=normal"``. + +Filtered tests will not run or show up in the test output. You can use the +``--filter_action=skip`` flag to skip filtered tests instead. These tests will be +shown in the test output in the test but will not run. To use this feature when +running KUnit manually, use the module param ``kunit.filter_action=skip``. + +Rules of Filtering Procedure +---------------------------- + +Since both suites and test cases can have attributes, there may be conflicts +between attributes during filtering. The process of filtering follows these +rules: + +- Filtering always operates at a per-test level. + +- If a test has an attribute set, then the test's value is filtered on. + +- Otherwise, the value falls back to the suite's value. + +- If neither are set, the attribute has a global "default" value, which is used. + +List of Current Attributes +-------------------------- + +``speed`` + +This attribute indicates the speed of a test's execution (how slow or fast the +test is). + +This attribute is saved as an enum with the following categories: "normal", +"slow", or "very_slow". The assumed default speed for tests is "normal". This +indicates that the test takes a relatively trivial amount of time (less than +1 second), regardless of the machine it is running on. Any test slower than +this could be marked as "slow" or "very_slow". + +The macro ``KUNIT_CASE_SLOW(test_name)`` can be easily used to set the speed +of a test case to "slow". + +``module`` + +This attribute indicates the name of the module associated with the test. + +This attribute is automatically saved as a string and is printed for each suite. +Tests can also be filtered using this attribute. diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst new file mode 100644 index 0000000000..a98235326b --- /dev/null +++ b/Documentation/dev-tools/kunit/start.rst @@ -0,0 +1,309 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=============== +Getting Started +=============== + +This page contains an overview of the kunit_tool and KUnit framework, +teaching how to run existing tests and then how to write a simple test case, +and covers common problems users face when using KUnit for the first time. + +Installing Dependencies +======================= +KUnit has the same dependencies as the Linux kernel. As long as you can +build the kernel, you can run KUnit. + +Running tests with kunit_tool +============================= +kunit_tool is a Python script, which configures and builds a kernel, runs +tests, and formats the test results. From the kernel repository, you +can run kunit_tool: + +.. code-block:: bash + + ./tools/testing/kunit/kunit.py run + +.. note :: + You may see the following error: + "The source tree is not clean, please run 'make ARCH=um mrproper'" + + This happens because internally kunit.py specifies ``.kunit`` + (default option) as the build directory in the command ``make O=output/dir`` + through the argument ``--build_dir``. Hence, before starting an + out-of-tree build, the source tree must be clean. + + There is also the same caveat mentioned in the "Build directory for + the kernel" section of the :doc:`admin-guide </admin-guide/README>`, + that is, its use, it must be used for all invocations of ``make``. + The good news is that it can indeed be solved by running + ``make ARCH=um mrproper``, just be aware that this will delete the + current configuration and all generated files. + +If everything worked correctly, you should see the following: + +.. code-block:: + + Configuring KUnit Kernel ... + Building KUnit Kernel ... + Starting KUnit Kernel ... + +The tests will pass or fail. + +.. note :: + Because it is building a lot of sources for the first time, + the ``Building KUnit Kernel`` step may take a while. + +For detailed information on this wrapper, see: +Documentation/dev-tools/kunit/run_wrapper.rst. + +Selecting which tests to run +---------------------------- + +By default, kunit_tool runs all tests reachable with minimal configuration, +that is, using default values for most of the kconfig options. However, +you can select which tests to run by: + +- `Customizing Kconfig`_ used to compile the kernel, or +- `Filtering tests by name`_ to select specifically which compiled tests to run. + +Customizing Kconfig +~~~~~~~~~~~~~~~~~~~ +A good starting point for the ``.kunitconfig`` is the KUnit default config. +If you didn't run ``kunit.py run`` yet, you can generate it by running: + +.. code-block:: bash + + cd $PATH_TO_LINUX_REPO + tools/testing/kunit/kunit.py config + cat .kunit/.kunitconfig + +.. note :: + ``.kunitconfig`` lives in the ``--build_dir`` used by kunit.py, which is + ``.kunit`` by default. + +Before running the tests, kunit_tool ensures that all config options +set in ``.kunitconfig`` are set in the kernel ``.config``. It will warn +you if you have not included dependencies for the options used. + +There are many ways to customize the configurations: + +a. Edit ``.kunit/.kunitconfig``. The file should contain the list of kconfig + options required to run the desired tests, including their dependencies. + You may want to remove CONFIG_KUNIT_ALL_TESTS from the ``.kunitconfig`` as + it will enable a number of additional tests that you may not want. + If you need to run on an architecture other than UML see :ref:`kunit-on-qemu`. + +b. Enable additional kconfig options on top of ``.kunit/.kunitconfig``. + For example, to include the kernel's linked-list test you can run:: + + ./tools/testing/kunit/kunit.py run \ + --kconfig_add CONFIG_LIST_KUNIT_TEST=y + +c. Provide the path of one or more .kunitconfig files from the tree. + For example, to run only ``FAT_FS`` and ``EXT4`` tests you can run:: + + ./tools/testing/kunit/kunit.py run \ + --kunitconfig ./fs/fat/.kunitconfig \ + --kunitconfig ./fs/ext4/.kunitconfig + +d. If you change the ``.kunitconfig``, kunit.py will trigger a rebuild of the + ``.config`` file. But you can edit the ``.config`` file directly or with + tools like ``make menuconfig O=.kunit``. As long as its a superset of + ``.kunitconfig``, kunit.py won't overwrite your changes. + + +.. note :: + + To save a .kunitconfig after finding a satisfactory configuration:: + + make savedefconfig O=.kunit + cp .kunit/defconfig .kunit/.kunitconfig + +Filtering tests by name +~~~~~~~~~~~~~~~~~~~~~~~ +If you want to be more specific than Kconfig can provide, it is also possible +to select which tests to execute at boot-time by passing a glob filter +(read instructions regarding the pattern in the manpage :manpage:`glob(7)`). +If there is a ``"."`` (period) in the filter, it will be interpreted as a +separator between the name of the test suite and the test case, +otherwise, it will be interpreted as the name of the test suite. +For example, let's assume we are using the default config: + +a. inform the name of a test suite, like ``"kunit_executor_test"``, + to run every test case it contains:: + + ./tools/testing/kunit/kunit.py run "kunit_executor_test" + +b. inform the name of a test case prefixed by its test suite, + like ``"example.example_simple_test"``, to run specifically that test case:: + + ./tools/testing/kunit/kunit.py run "example.example_simple_test" + +c. use wildcard characters (``*?[``) to run any test case that matches the pattern, + like ``"*.*64*"`` to run test cases containing ``"64"`` in the name inside + any test suite:: + + ./tools/testing/kunit/kunit.py run "*.*64*" + +Running Tests without the KUnit Wrapper +======================================= +If you do not want to use the KUnit Wrapper (for example: you want code +under test to integrate with other systems, or use a different/ +unsupported architecture or configuration), KUnit can be included in +any kernel, and the results are read out and parsed manually. + +.. note :: + ``CONFIG_KUNIT`` should not be enabled in a production environment. + Enabling KUnit disables Kernel Address-Space Layout Randomization + (KASLR), and tests may affect the state of the kernel in ways not + suitable for production. + +Configuring the Kernel +---------------------- +To enable KUnit itself, you need to enable the ``CONFIG_KUNIT`` Kconfig +option (under Kernel Hacking/Kernel Testing and Coverage in +``menuconfig``). From there, you can enable any KUnit tests. They +usually have config options ending in ``_KUNIT_TEST``. + +KUnit and KUnit tests can be compiled as modules. The tests in a module +will run when the module is loaded. + +Running Tests (without KUnit Wrapper) +------------------------------------- +Build and run your kernel. In the kernel log, the test output is printed +out in the TAP format. This will only happen by default if KUnit/tests +are built-in. Otherwise the module will need to be loaded. + +.. note :: + Some lines and/or data may get interspersed in the TAP output. + +Writing Your First Test +======================= +In your kernel repository, let's add some code that we can test. + +1. Create a file ``drivers/misc/example.h``, which includes: + +.. code-block:: c + + int misc_example_add(int left, int right); + +2. Create a file ``drivers/misc/example.c``, which includes: + +.. code-block:: c + + #include <linux/errno.h> + + #include "example.h" + + int misc_example_add(int left, int right) + { + return left + right; + } + +3. Add the following lines to ``drivers/misc/Kconfig``: + +.. code-block:: kconfig + + config MISC_EXAMPLE + bool "My example" + +4. Add the following lines to ``drivers/misc/Makefile``: + +.. code-block:: make + + obj-$(CONFIG_MISC_EXAMPLE) += example.o + +Now we are ready to write the test cases. + +1. Add the below test case in ``drivers/misc/example_test.c``: + +.. code-block:: c + + #include <kunit/test.h> + #include "example.h" + + /* Define the test cases. */ + + static void misc_example_add_test_basic(struct kunit *test) + { + KUNIT_EXPECT_EQ(test, 1, misc_example_add(1, 0)); + KUNIT_EXPECT_EQ(test, 2, misc_example_add(1, 1)); + KUNIT_EXPECT_EQ(test, 0, misc_example_add(-1, 1)); + KUNIT_EXPECT_EQ(test, INT_MAX, misc_example_add(0, INT_MAX)); + KUNIT_EXPECT_EQ(test, -1, misc_example_add(INT_MAX, INT_MIN)); + } + + static void misc_example_test_failure(struct kunit *test) + { + KUNIT_FAIL(test, "This test never passes."); + } + + static struct kunit_case misc_example_test_cases[] = { + KUNIT_CASE(misc_example_add_test_basic), + KUNIT_CASE(misc_example_test_failure), + {} + }; + + static struct kunit_suite misc_example_test_suite = { + .name = "misc-example", + .test_cases = misc_example_test_cases, + }; + kunit_test_suite(misc_example_test_suite); + + MODULE_LICENSE("GPL"); + +2. Add the following lines to ``drivers/misc/Kconfig``: + +.. code-block:: kconfig + + config MISC_EXAMPLE_TEST + tristate "Test for my example" if !KUNIT_ALL_TESTS + depends on MISC_EXAMPLE && KUNIT + default KUNIT_ALL_TESTS + +Note: If your test does not support being built as a loadable module (which is +discouraged), replace tristate by bool, and depend on KUNIT=y instead of KUNIT. + +3. Add the following lines to ``drivers/misc/Makefile``: + +.. code-block:: make + + obj-$(CONFIG_MISC_EXAMPLE_TEST) += example_test.o + +4. Add the following lines to ``.kunit/.kunitconfig``: + +.. code-block:: none + + CONFIG_MISC_EXAMPLE=y + CONFIG_MISC_EXAMPLE_TEST=y + +5. Run the test: + +.. code-block:: bash + + ./tools/testing/kunit/kunit.py run + +You should see the following failure: + +.. code-block:: none + + ... + [16:08:57] [PASSED] misc-example:misc_example_add_test_basic + [16:08:57] [FAILED] misc-example:misc_example_test_failure + [16:08:57] EXPECTATION FAILED at drivers/misc/example-test.c:17 + [16:08:57] This test never passes. + ... + +Congrats! You just wrote your first KUnit test. + +Next Steps +========== + +If you're interested in using some of the more advanced features of kunit.py, +take a look at Documentation/dev-tools/kunit/run_wrapper.rst + +If you'd like to run tests without using kunit.py, check out +Documentation/dev-tools/kunit/run_manual.rst + +For more information on writing KUnit tests (including some common techniques +for testing different things), see Documentation/dev-tools/kunit/usage.rst diff --git a/Documentation/dev-tools/kunit/style.rst b/Documentation/dev-tools/kunit/style.rst new file mode 100644 index 0000000000..b6d0d7359f --- /dev/null +++ b/Documentation/dev-tools/kunit/style.rst @@ -0,0 +1,202 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=========================== +Test Style and Nomenclature +=========================== + +To make finding, writing, and using KUnit tests as simple as possible, it is +strongly encouraged that they are named and written according to the guidelines +below. While it is possible to write KUnit tests which do not follow these rules, +they may break some tooling, may conflict with other tests, and may not be run +automatically by testing systems. + +It is recommended that you only deviate from these guidelines when: + +1. Porting tests to KUnit which are already known with an existing name. +2. Writing tests which would cause serious problems if automatically run. For + example, non-deterministically producing false positives or negatives, or + taking a long time to run. + +Subsystems, Suites, and Tests +============================= + +To make tests easy to find, they are grouped into suites and subsystems. A test +suite is a group of tests which test a related area of the kernel. A subsystem +is a set of test suites which test different parts of a kernel subsystem +or a driver. + +Subsystems +---------- + +Every test suite must belong to a subsystem. A subsystem is a collection of one +or more KUnit test suites which test the same driver or part of the kernel. A +test subsystem should match a single kernel module. If the code being tested +cannot be compiled as a module, in many cases the subsystem should correspond to +a directory in the source tree or an entry in the ``MAINTAINERS`` file. If +unsure, follow the conventions set by tests in similar areas. + +Test subsystems should be named after the code being tested, either after the +module (wherever possible), or after the directory or files being tested. Test +subsystems should be named to avoid ambiguity where necessary. + +If a test subsystem name has multiple components, they should be separated by +underscores. *Do not* include "test" or "kunit" directly in the subsystem name +unless we are actually testing other tests or the kunit framework itself. For +example, subsystems could be called: + +``ext4`` + Matches the module and filesystem name. +``apparmor`` + Matches the module name and LSM name. +``kasan`` + Common name for the tool, prominent part of the path ``mm/kasan`` +``snd_hda_codec_hdmi`` + Has several components (``snd``, ``hda``, ``codec``, ``hdmi``) separated by + underscores. Matches the module name. + +Avoid names as shown in examples below: + +``linear-ranges`` + Names should use underscores, not dashes, to separate words. Prefer + ``linear_ranges``. +``qos-kunit-test`` + This name should use underscores, and not have "kunit-test" as a + suffix. ``qos`` is also ambiguous as a subsystem name, because several parts + of the kernel have a ``qos`` subsystem. ``power_qos`` would be a better name. +``pc_parallel_port`` + The corresponding module name is ``parport_pc``, so this subsystem should also + be named ``parport_pc``. + +.. note:: + The KUnit API and tools do not explicitly know about subsystems. They are + a way of categorizing test suites and naming modules which provides a + simple, consistent way for humans to find and run tests. This may change + in the future. + +Suites +------ + +KUnit tests are grouped into test suites, which cover a specific area of +functionality being tested. Test suites can have shared initialization and +shutdown code which is run for all tests in the suite. Not all subsystems need +to be split into multiple test suites (for example, simple drivers). + +Test suites are named after the subsystem they are part of. If a subsystem +contains several suites, the specific area under test should be appended to the +subsystem name, separated by an underscore. + +In the event that there are multiple types of test using KUnit within a +subsystem (for example, both unit tests and integration tests), they should be +put into separate suites, with the type of test as the last element in the suite +name. Unless these tests are actually present, avoid using ``_test``, ``_unittest`` +or similar in the suite name. + +The full test suite name (including the subsystem name) should be specified as +the ``.name`` member of the ``kunit_suite`` struct, and forms the base for the +module name. For example, test suites could include: + +``ext4_inode`` + Part of the ``ext4`` subsystem, testing the ``inode`` area. +``kunit_try_catch`` + Part of the ``kunit`` implementation itself, testing the ``try_catch`` area. +``apparmor_property_entry`` + Part of the ``apparmor`` subsystem, testing the ``property_entry`` area. +``kasan`` + The ``kasan`` subsystem has only one suite, so the suite name is the same as + the subsystem name. + +Avoid names, for example: + +``ext4_ext4_inode`` + There is no reason to state the subsystem twice. +``property_entry`` + The suite name is ambiguous without the subsystem name. +``kasan_integration_test`` + Because there is only one suite in the ``kasan`` subsystem, the suite should + just be called as ``kasan``. Do not redundantly add + ``integration_test``. It should be a separate test suite. For example, if the + unit tests are added, then that suite could be named as ``kasan_unittest`` or + similar. + +Test Cases +---------- + +Individual tests consist of a single function which tests a constrained +codepath, property, or function. In the test output, an individual test's +results will show up as subtests of the suite's results. + +Tests should be named after what they are testing. This is often the name of the +function being tested, with a description of the input or codepath being tested. +As tests are C functions, they should be named and written in accordance with +the kernel coding style. + +.. note:: + As tests are themselves functions, their names cannot conflict with + other C identifiers in the kernel. This may require some creative + naming. It is a good idea to make your test functions `static` to avoid + polluting the global namespace. + +Example test names include: + +``unpack_u32_with_null_name`` + Tests the ``unpack_u32`` function when a NULL name is passed in. +``test_list_splice`` + Tests the ``list_splice`` macro. It has the prefix ``test_`` to avoid a + name conflict with the macro itself. + + +Should it be necessary to refer to a test outside the context of its test suite, +the *fully-qualified* name of a test should be the suite name followed by the +test name, separated by a colon (i.e. ``suite:test``). + +Test Kconfig Entries +==================== + +Every test suite should be tied to a Kconfig entry. + +This Kconfig entry must: + +* be named ``CONFIG_<name>_KUNIT_TEST``: where <name> is the name of the test + suite. +* be listed either alongside the config entries for the driver/subsystem being + tested, or be under [Kernel Hacking]->[Kernel Testing and Coverage] +* depend on ``CONFIG_KUNIT``. +* be visible only if ``CONFIG_KUNIT_ALL_TESTS`` is not enabled. +* have a default value of ``CONFIG_KUNIT_ALL_TESTS``. +* have a brief description of KUnit in the help text. + +If we are not able to meet above conditions (for example, the test is unable to +be built as a module), Kconfig entries for tests should be tristate. + +For example, a Kconfig entry might look like: + +.. code-block:: none + + config FOO_KUNIT_TEST + tristate "KUnit test for foo" if !KUNIT_ALL_TESTS + depends on KUNIT + default KUNIT_ALL_TESTS + help + This builds unit tests for foo. + + For more information on KUnit and unit tests in general, + please refer to the KUnit documentation in Documentation/dev-tools/kunit/. + + If unsure, say N. + + +Test File and Module Names +========================== + +KUnit tests can often be compiled as a module. These modules should be named +after the test suite, followed by ``_test``. If this is likely to conflict with +non-KUnit tests, the suffix ``_kunit`` can also be used. + +The easiest way of achieving this is to name the file containing the test suite +``<suite>_test.c`` (or, as above, ``<suite>_kunit.c``). This file should be +placed next to the code under test. + +If the suite name contains some or all of the name of the test's parent +directory, it may make sense to modify the source filename to reduce redundancy. +For example, a ``foo_firmware`` suite could be in the ``foo/firmware_test.c`` +file. diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst new file mode 100644 index 0000000000..c27e1646ec --- /dev/null +++ b/Documentation/dev-tools/kunit/usage.rst @@ -0,0 +1,795 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Writing Tests +============= + +Test Cases +---------- + +The fundamental unit in KUnit is the test case. A test case is a function with +the signature ``void (*)(struct kunit *test)``. It calls the function under test +and then sets *expectations* for what should happen. For example: + +.. code-block:: c + + void example_test_success(struct kunit *test) + { + } + + void example_test_failure(struct kunit *test) + { + KUNIT_FAIL(test, "This test never passes."); + } + +In the above example, ``example_test_success`` always passes because it does +nothing; no expectations are set, and therefore all expectations pass. On the +other hand ``example_test_failure`` always fails because it calls ``KUNIT_FAIL``, +which is a special expectation that logs a message and causes the test case to +fail. + +Expectations +~~~~~~~~~~~~ +An *expectation* specifies that we expect a piece of code to do something in a +test. An expectation is called like a function. A test is made by setting +expectations about the behavior of a piece of code under test. When one or more +expectations fail, the test case fails and information about the failure is +logged. For example: + +.. code-block:: c + + void add_test_basic(struct kunit *test) + { + KUNIT_EXPECT_EQ(test, 1, add(1, 0)); + KUNIT_EXPECT_EQ(test, 2, add(1, 1)); + } + +In the above example, ``add_test_basic`` makes a number of assertions about the +behavior of a function called ``add``. The first parameter is always of type +``struct kunit *``, which contains information about the current test context. +The second parameter, in this case, is what the value is expected to be. The +last value is what the value actually is. If ``add`` passes all of these +expectations, the test case, ``add_test_basic`` will pass; if any one of these +expectations fails, the test case will fail. + +A test case *fails* when any expectation is violated; however, the test will +continue to run, and try other expectations until the test case ends or is +otherwise terminated. This is as opposed to *assertions* which are discussed +later. + +To learn about more KUnit expectations, see Documentation/dev-tools/kunit/api/test.rst. + +.. note:: + A single test case should be short, easy to understand, and focused on a + single behavior. + +For example, if we want to rigorously test the ``add`` function above, create +additional tests cases which would test each property that an ``add`` function +should have as shown below: + +.. code-block:: c + + void add_test_basic(struct kunit *test) + { + KUNIT_EXPECT_EQ(test, 1, add(1, 0)); + KUNIT_EXPECT_EQ(test, 2, add(1, 1)); + } + + void add_test_negative(struct kunit *test) + { + KUNIT_EXPECT_EQ(test, 0, add(-1, 1)); + } + + void add_test_max(struct kunit *test) + { + KUNIT_EXPECT_EQ(test, INT_MAX, add(0, INT_MAX)); + KUNIT_EXPECT_EQ(test, -1, add(INT_MAX, INT_MIN)); + } + + void add_test_overflow(struct kunit *test) + { + KUNIT_EXPECT_EQ(test, INT_MIN, add(INT_MAX, 1)); + } + +Assertions +~~~~~~~~~~ + +An assertion is like an expectation, except that the assertion immediately +terminates the test case if the condition is not satisfied. For example: + +.. code-block:: c + + static void test_sort(struct kunit *test) + { + int *a, i, r = 1; + a = kunit_kmalloc_array(test, TEST_LEN, sizeof(*a), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, a); + for (i = 0; i < TEST_LEN; i++) { + r = (r * 725861) % 6599; + a[i] = r; + } + sort(a, TEST_LEN, sizeof(*a), cmpint, NULL); + for (i = 0; i < TEST_LEN-1; i++) + KUNIT_EXPECT_LE(test, a[i], a[i + 1]); + } + +In this example, we need to be able to allocate an array to test the ``sort()`` +function. So we use ``KUNIT_ASSERT_NOT_ERR_OR_NULL()`` to abort the test if +there's an allocation error. + +.. note:: + In other test frameworks, ``ASSERT`` macros are often implemented by calling + ``return`` so they only work from the test function. In KUnit, we stop the + current kthread on failure, so you can call them from anywhere. + +.. note:: + Warning: There is an exception to the above rule. You shouldn't use assertions + in the suite's exit() function, or in the free function for a resource. These + run when a test is shutting down, and an assertion here prevents further + cleanup code from running, potentially leading to a memory leak. + +Customizing error messages +-------------------------- + +Each of the ``KUNIT_EXPECT`` and ``KUNIT_ASSERT`` macros have a ``_MSG`` +variant. These take a format string and arguments to provide additional +context to the automatically generated error messages. + +.. code-block:: c + + char some_str[41]; + generate_sha1_hex_string(some_str); + + /* Before. Not easy to tell why the test failed. */ + KUNIT_EXPECT_EQ(test, strlen(some_str), 40); + + /* After. Now we see the offending string. */ + KUNIT_EXPECT_EQ_MSG(test, strlen(some_str), 40, "some_str='%s'", some_str); + +Alternatively, one can take full control over the error message by using +``KUNIT_FAIL()``, e.g. + +.. code-block:: c + + /* Before */ + KUNIT_EXPECT_EQ(test, some_setup_function(), 0); + + /* After: full control over the failure message. */ + if (some_setup_function()) + KUNIT_FAIL(test, "Failed to setup thing for testing"); + + +Test Suites +~~~~~~~~~~~ + +We need many test cases covering all the unit's behaviors. It is common to have +many similar tests. In order to reduce duplication in these closely related +tests, most unit testing frameworks (including KUnit) provide the concept of a +*test suite*. A test suite is a collection of test cases for a unit of code +with optional setup and teardown functions that run before/after the whole +suite and/or every test case. + +.. note:: + A test case will only run if it is associated with a test suite. + +For example: + +.. code-block:: c + + static struct kunit_case example_test_cases[] = { + KUNIT_CASE(example_test_foo), + KUNIT_CASE(example_test_bar), + KUNIT_CASE(example_test_baz), + {} + }; + + static struct kunit_suite example_test_suite = { + .name = "example", + .init = example_test_init, + .exit = example_test_exit, + .suite_init = example_suite_init, + .suite_exit = example_suite_exit, + .test_cases = example_test_cases, + }; + kunit_test_suite(example_test_suite); + +In the above example, the test suite ``example_test_suite`` would first run +``example_suite_init``, then run the test cases ``example_test_foo``, +``example_test_bar``, and ``example_test_baz``. Each would have +``example_test_init`` called immediately before it and ``example_test_exit`` +called immediately after it. Finally, ``example_suite_exit`` would be called +after everything else. ``kunit_test_suite(example_test_suite)`` registers the +test suite with the KUnit test framework. + +.. note:: + The ``exit`` and ``suite_exit`` functions will run even if ``init`` or + ``suite_init`` fail. Make sure that they can handle any inconsistent + state which may result from ``init`` or ``suite_init`` encountering errors + or exiting early. + +``kunit_test_suite(...)`` is a macro which tells the linker to put the +specified test suite in a special linker section so that it can be run by KUnit +either after ``late_init``, or when the test module is loaded (if the test was +built as a module). + +For more information, see Documentation/dev-tools/kunit/api/test.rst. + +.. _kunit-on-non-uml: + +Writing Tests For Other Architectures +------------------------------------- + +It is better to write tests that run on UML to tests that only run under a +particular architecture. It is better to write tests that run under QEMU or +another easy to obtain (and monetarily free) software environment to a specific +piece of hardware. + +Nevertheless, there are still valid reasons to write a test that is architecture +or hardware specific. For example, we might want to test code that really +belongs in ``arch/some-arch/*``. Even so, try to write the test so that it does +not depend on physical hardware. Some of our test cases may not need hardware, +only few tests actually require the hardware to test it. When hardware is not +available, instead of disabling tests, we can skip them. + +Now that we have narrowed down exactly what bits are hardware specific, the +actual procedure for writing and running the tests is same as writing normal +KUnit tests. + +.. important:: + We may have to reset hardware state. If this is not possible, we may only + be able to run one test case per invocation. + +.. TODO(brendanhiggins@google.com): Add an actual example of an architecture- + dependent KUnit test. + +Common Patterns +=============== + +Isolating Behavior +------------------ + +Unit testing limits the amount of code under test to a single unit. It controls +what code gets run when the unit under test calls a function. Where a function +is exposed as part of an API such that the definition of that function can be +changed without affecting the rest of the code base. In the kernel, this comes +from two constructs: classes, which are structs that contain function pointers +provided by the implementer, and architecture-specific functions, which have +definitions selected at compile time. + +Classes +~~~~~~~ + +Classes are not a construct that is built into the C programming language; +however, it is an easily derived concept. Accordingly, in most cases, every +project that does not use a standardized object oriented library (like GNOME's +GObject) has their own slightly different way of doing object oriented +programming; the Linux kernel is no exception. + +The central concept in kernel object oriented programming is the class. In the +kernel, a *class* is a struct that contains function pointers. This creates a +contract between *implementers* and *users* since it forces them to use the +same function signature without having to call the function directly. To be a +class, the function pointers must specify that a pointer to the class, known as +a *class handle*, be one of the parameters. Thus the member functions (also +known as *methods*) have access to member variables (also known as *fields*) +allowing the same implementation to have multiple *instances*. + +A class can be *overridden* by *child classes* by embedding the *parent class* +in the child class. Then when the child class *method* is called, the child +implementation knows that the pointer passed to it is of a parent contained +within the child. Thus, the child can compute the pointer to itself because the +pointer to the parent is always a fixed offset from the pointer to the child. +This offset is the offset of the parent contained in the child struct. For +example: + +.. code-block:: c + + struct shape { + int (*area)(struct shape *this); + }; + + struct rectangle { + struct shape parent; + int length; + int width; + }; + + int rectangle_area(struct shape *this) + { + struct rectangle *self = container_of(this, struct rectangle, parent); + + return self->length * self->width; + }; + + void rectangle_new(struct rectangle *self, int length, int width) + { + self->parent.area = rectangle_area; + self->length = length; + self->width = width; + } + +In this example, computing the pointer to the child from the pointer to the +parent is done by ``container_of``. + +Faking Classes +~~~~~~~~~~~~~~ + +In order to unit test a piece of code that calls a method in a class, the +behavior of the method must be controllable, otherwise the test ceases to be a +unit test and becomes an integration test. + +A fake class implements a piece of code that is different than what runs in a +production instance, but behaves identical from the standpoint of the callers. +This is done to replace a dependency that is hard to deal with, or is slow. For +example, implementing a fake EEPROM that stores the "contents" in an +internal buffer. Assume we have a class that represents an EEPROM: + +.. code-block:: c + + struct eeprom { + ssize_t (*read)(struct eeprom *this, size_t offset, char *buffer, size_t count); + ssize_t (*write)(struct eeprom *this, size_t offset, const char *buffer, size_t count); + }; + +And we want to test code that buffers writes to the EEPROM: + +.. code-block:: c + + struct eeprom_buffer { + ssize_t (*write)(struct eeprom_buffer *this, const char *buffer, size_t count); + int flush(struct eeprom_buffer *this); + size_t flush_count; /* Flushes when buffer exceeds flush_count. */ + }; + + struct eeprom_buffer *new_eeprom_buffer(struct eeprom *eeprom); + void destroy_eeprom_buffer(struct eeprom *eeprom); + +We can test this code by *faking out* the underlying EEPROM: + +.. code-block:: c + + struct fake_eeprom { + struct eeprom parent; + char contents[FAKE_EEPROM_CONTENTS_SIZE]; + }; + + ssize_t fake_eeprom_read(struct eeprom *parent, size_t offset, char *buffer, size_t count) + { + struct fake_eeprom *this = container_of(parent, struct fake_eeprom, parent); + + count = min(count, FAKE_EEPROM_CONTENTS_SIZE - offset); + memcpy(buffer, this->contents + offset, count); + + return count; + } + + ssize_t fake_eeprom_write(struct eeprom *parent, size_t offset, const char *buffer, size_t count) + { + struct fake_eeprom *this = container_of(parent, struct fake_eeprom, parent); + + count = min(count, FAKE_EEPROM_CONTENTS_SIZE - offset); + memcpy(this->contents + offset, buffer, count); + + return count; + } + + void fake_eeprom_init(struct fake_eeprom *this) + { + this->parent.read = fake_eeprom_read; + this->parent.write = fake_eeprom_write; + memset(this->contents, 0, FAKE_EEPROM_CONTENTS_SIZE); + } + +We can now use it to test ``struct eeprom_buffer``: + +.. code-block:: c + + struct eeprom_buffer_test { + struct fake_eeprom *fake_eeprom; + struct eeprom_buffer *eeprom_buffer; + }; + + static void eeprom_buffer_test_does_not_write_until_flush(struct kunit *test) + { + struct eeprom_buffer_test *ctx = test->priv; + struct eeprom_buffer *eeprom_buffer = ctx->eeprom_buffer; + struct fake_eeprom *fake_eeprom = ctx->fake_eeprom; + char buffer[] = {0xff}; + + eeprom_buffer->flush_count = SIZE_MAX; + + eeprom_buffer->write(eeprom_buffer, buffer, 1); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0); + + eeprom_buffer->write(eeprom_buffer, buffer, 1); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[1], 0); + + eeprom_buffer->flush(eeprom_buffer); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0xff); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[1], 0xff); + } + + static void eeprom_buffer_test_flushes_after_flush_count_met(struct kunit *test) + { + struct eeprom_buffer_test *ctx = test->priv; + struct eeprom_buffer *eeprom_buffer = ctx->eeprom_buffer; + struct fake_eeprom *fake_eeprom = ctx->fake_eeprom; + char buffer[] = {0xff}; + + eeprom_buffer->flush_count = 2; + + eeprom_buffer->write(eeprom_buffer, buffer, 1); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0); + + eeprom_buffer->write(eeprom_buffer, buffer, 1); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0xff); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[1], 0xff); + } + + static void eeprom_buffer_test_flushes_increments_of_flush_count(struct kunit *test) + { + struct eeprom_buffer_test *ctx = test->priv; + struct eeprom_buffer *eeprom_buffer = ctx->eeprom_buffer; + struct fake_eeprom *fake_eeprom = ctx->fake_eeprom; + char buffer[] = {0xff, 0xff}; + + eeprom_buffer->flush_count = 2; + + eeprom_buffer->write(eeprom_buffer, buffer, 1); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0); + + eeprom_buffer->write(eeprom_buffer, buffer, 2); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[0], 0xff); + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[1], 0xff); + /* Should have only flushed the first two bytes. */ + KUNIT_EXPECT_EQ(test, fake_eeprom->contents[2], 0); + } + + static int eeprom_buffer_test_init(struct kunit *test) + { + struct eeprom_buffer_test *ctx; + + ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ctx); + + ctx->fake_eeprom = kunit_kzalloc(test, sizeof(*ctx->fake_eeprom), GFP_KERNEL); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ctx->fake_eeprom); + fake_eeprom_init(ctx->fake_eeprom); + + ctx->eeprom_buffer = new_eeprom_buffer(&ctx->fake_eeprom->parent); + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ctx->eeprom_buffer); + + test->priv = ctx; + + return 0; + } + + static void eeprom_buffer_test_exit(struct kunit *test) + { + struct eeprom_buffer_test *ctx = test->priv; + + destroy_eeprom_buffer(ctx->eeprom_buffer); + } + +Testing Against Multiple Inputs +------------------------------- + +Testing just a few inputs is not enough to ensure that the code works correctly, +for example: testing a hash function. + +We can write a helper macro or function. The function is called for each input. +For example, to test ``sha1sum(1)``, we can write: + +.. code-block:: c + + #define TEST_SHA1(in, want) \ + sha1sum(in, out); \ + KUNIT_EXPECT_STREQ_MSG(test, out, want, "sha1sum(%s)", in); + + char out[40]; + TEST_SHA1("hello world", "2aae6c35c94fcfb415dbe95f408b9ce91ee846ed"); + TEST_SHA1("hello world!", "430ce34d020724ed75a196dfc2ad67c77772d169"); + +Note the use of the ``_MSG`` version of ``KUNIT_EXPECT_STREQ`` to print a more +detailed error and make the assertions clearer within the helper macros. + +The ``_MSG`` variants are useful when the same expectation is called multiple +times (in a loop or helper function) and thus the line number is not enough to +identify what failed, as shown below. + +In complicated cases, we recommend using a *table-driven test* compared to the +helper macro variation, for example: + +.. code-block:: c + + int i; + char out[40]; + + struct sha1_test_case { + const char *str; + const char *sha1; + }; + + struct sha1_test_case cases[] = { + { + .str = "hello world", + .sha1 = "2aae6c35c94fcfb415dbe95f408b9ce91ee846ed", + }, + { + .str = "hello world!", + .sha1 = "430ce34d020724ed75a196dfc2ad67c77772d169", + }, + }; + for (i = 0; i < ARRAY_SIZE(cases); ++i) { + sha1sum(cases[i].str, out); + KUNIT_EXPECT_STREQ_MSG(test, out, cases[i].sha1, + "sha1sum(%s)", cases[i].str); + } + + +There is more boilerplate code involved, but it can: + +* be more readable when there are multiple inputs/outputs (due to field names). + + * For example, see ``fs/ext4/inode-test.c``. + +* reduce duplication if test cases are shared across multiple tests. + + * For example: if we want to test ``sha256sum``, we could add a ``sha256`` + field and reuse ``cases``. + +* be converted to a "parameterized test". + +Parameterized Testing +~~~~~~~~~~~~~~~~~~~~~ + +The table-driven testing pattern is common enough that KUnit has special +support for it. + +By reusing the same ``cases`` array from above, we can write the test as a +"parameterized test" with the following. + +.. code-block:: c + + // This is copy-pasted from above. + struct sha1_test_case { + const char *str; + const char *sha1; + }; + const struct sha1_test_case cases[] = { + { + .str = "hello world", + .sha1 = "2aae6c35c94fcfb415dbe95f408b9ce91ee846ed", + }, + { + .str = "hello world!", + .sha1 = "430ce34d020724ed75a196dfc2ad67c77772d169", + }, + }; + + // Need a helper function to generate a name for each test case. + static void case_to_desc(const struct sha1_test_case *t, char *desc) + { + strcpy(desc, t->str); + } + // Creates `sha1_gen_params()` to iterate over `cases`. + KUNIT_ARRAY_PARAM(sha1, cases, case_to_desc); + + // Looks no different from a normal test. + static void sha1_test(struct kunit *test) + { + // This function can just contain the body of the for-loop. + // The former `cases[i]` is accessible under test->param_value. + char out[40]; + struct sha1_test_case *test_param = (struct sha1_test_case *)(test->param_value); + + sha1sum(test_param->str, out); + KUNIT_EXPECT_STREQ_MSG(test, out, test_param->sha1, + "sha1sum(%s)", test_param->str); + } + + // Instead of KUNIT_CASE, we use KUNIT_CASE_PARAM and pass in the + // function declared by KUNIT_ARRAY_PARAM. + static struct kunit_case sha1_test_cases[] = { + KUNIT_CASE_PARAM(sha1_test, sha1_gen_params), + {} + }; + +Allocating Memory +----------------- + +Where you might use ``kzalloc``, you can instead use ``kunit_kzalloc`` as KUnit +will then ensure that the memory is freed once the test completes. + +This is useful because it lets us use the ``KUNIT_ASSERT_EQ`` macros to exit +early from a test without having to worry about remembering to call ``kfree``. +For example: + +.. code-block:: c + + void example_test_allocation(struct kunit *test) + { + char *buffer = kunit_kzalloc(test, 16, GFP_KERNEL); + /* Ensure allocation succeeded. */ + KUNIT_ASSERT_NOT_ERR_OR_NULL(test, buffer); + + KUNIT_ASSERT_STREQ(test, buffer, ""); + } + +Registering Cleanup Actions +--------------------------- + +If you need to perform some cleanup beyond simple use of ``kunit_kzalloc``, +you can register a custom "deferred action", which is a cleanup function +run when the test exits (whether cleanly, or via a failed assertion). + +Actions are simple functions with no return value, and a single ``void*`` +context argument, and fulfill the same role as "cleanup" functions in Python +and Go tests, "defer" statements in languages which support them, and +(in some cases) destructors in RAII languages. + +These are very useful for unregistering things from global lists, closing +files or other resources, or freeing resources. + +For example: + +.. code-block:: C + + static void cleanup_device(void *ctx) + { + struct device *dev = (struct device *)ctx; + + device_unregister(dev); + } + + void example_device_test(struct kunit *test) + { + struct my_device dev; + + device_register(&dev); + + kunit_add_action(test, &cleanup_device, &dev); + } + +Note that, for functions like device_unregister which only accept a single +pointer-sized argument, it's possible to directly cast that function to +a ``kunit_action_t`` rather than writing a wrapper function, for example: + +.. code-block:: C + + kunit_add_action(test, (kunit_action_t *)&device_unregister, &dev); + +``kunit_add_action`` can fail if, for example, the system is out of memory. +You can use ``kunit_add_action_or_reset`` instead which runs the action +immediately if it cannot be deferred. + +If you need more control over when the cleanup function is called, you +can trigger it early using ``kunit_release_action``, or cancel it entirely +with ``kunit_remove_action``. + + +Testing Static Functions +------------------------ + +If we do not want to expose functions or variables for testing, one option is to +conditionally ``#include`` the test file at the end of your .c file. For +example: + +.. code-block:: c + + /* In my_file.c */ + + static int do_interesting_thing(); + + #ifdef CONFIG_MY_KUNIT_TEST + #include "my_kunit_test.c" + #endif + +Injecting Test-Only Code +------------------------ + +Similar to as shown above, we can add test-specific logic. For example: + +.. code-block:: c + + /* In my_file.h */ + + #ifdef CONFIG_MY_KUNIT_TEST + /* Defined in my_kunit_test.c */ + void test_only_hook(void); + #else + void test_only_hook(void) { } + #endif + +This test-only code can be made more useful by accessing the current ``kunit_test`` +as shown in next section: *Accessing The Current Test*. + +Accessing The Current Test +-------------------------- + +In some cases, we need to call test-only code from outside the test file. This +is helpful, for example, when providing a fake implementation of a function, or +to fail any current test from within an error handler. +We can do this via the ``kunit_test`` field in ``task_struct``, which we can +access using the ``kunit_get_current_test()`` function in ``kunit/test-bug.h``. + +``kunit_get_current_test()`` is safe to call even if KUnit is not enabled. If +KUnit is not enabled, or if no test is running in the current task, it will +return ``NULL``. This compiles down to either a no-op or a static key check, +so will have a negligible performance impact when no test is running. + +The example below uses this to implement a "mock" implementation of a function, ``foo``: + +.. code-block:: c + + #include <kunit/test-bug.h> /* for kunit_get_current_test */ + + struct test_data { + int foo_result; + int want_foo_called_with; + }; + + static int fake_foo(int arg) + { + struct kunit *test = kunit_get_current_test(); + struct test_data *test_data = test->priv; + + KUNIT_EXPECT_EQ(test, test_data->want_foo_called_with, arg); + return test_data->foo_result; + } + + static void example_simple_test(struct kunit *test) + { + /* Assume priv (private, a member used to pass test data from + * the init function) is allocated in the suite's .init */ + struct test_data *test_data = test->priv; + + test_data->foo_result = 42; + test_data->want_foo_called_with = 1; + + /* In a real test, we'd probably pass a pointer to fake_foo somewhere + * like an ops struct, etc. instead of calling it directly. */ + KUNIT_EXPECT_EQ(test, fake_foo(1), 42); + } + +In this example, we are using the ``priv`` member of ``struct kunit`` as a way +of passing data to the test from the init function. In general ``priv`` is +pointer that can be used for any user data. This is preferred over static +variables, as it avoids concurrency issues. + +Had we wanted something more flexible, we could have used a named ``kunit_resource``. +Each test can have multiple resources which have string names providing the same +flexibility as a ``priv`` member, but also, for example, allowing helper +functions to create resources without conflicting with each other. It is also +possible to define a clean up function for each resource, making it easy to +avoid resource leaks. For more information, see Documentation/dev-tools/kunit/api/resource.rst. + +Failing The Current Test +------------------------ + +If we want to fail the current test, we can use ``kunit_fail_current_test(fmt, args...)`` +which is defined in ``<kunit/test-bug.h>`` and does not require pulling in ``<kunit/test.h>``. +For example, we have an option to enable some extra debug checks on some data +structures as shown below: + +.. code-block:: c + + #include <kunit/test-bug.h> + + #ifdef CONFIG_EXTRA_DEBUG_CHECKS + static void validate_my_data(struct data *data) + { + if (is_valid(data)) + return; + + kunit_fail_current_test("data %p is invalid", data); + + /* Normal, non-KUnit, error reporting code here. */ + } + #else + static void my_debug_function(void) { } + #endif + +``kunit_fail_current_test()`` is safe to call even if KUnit is not enabled. If +KUnit is not enabled, or if no test is running in the current task, it will do +nothing. This compiles down to either a no-op or a static key check, so will +have a negligible performance impact when no test is running. diff --git a/Documentation/dev-tools/sparse.rst b/Documentation/dev-tools/sparse.rst new file mode 100644 index 0000000000..dc791c8d84 --- /dev/null +++ b/Documentation/dev-tools/sparse.rst @@ -0,0 +1,104 @@ +.. Copyright 2004 Linus Torvalds +.. Copyright 2004 Pavel Machek <pavel@ucw.cz> +.. Copyright 2006 Bob Copeland <me@bobcopeland.com> + +Sparse +====== + +Sparse is a semantic checker for C programs; it can be used to find a +number of potential problems with kernel code. See +https://lwn.net/Articles/689907/ for an overview of sparse; this document +contains some kernel-specific sparse information. +More information on sparse, mainly about its internals, can be found in +its official pages at https://sparse.docs.kernel.org. + + +Using sparse for typechecking +----------------------------- + +"__bitwise" is a type attribute, so you have to do something like this:: + + typedef int __bitwise pm_request_t; + + enum pm_request { + PM_SUSPEND = (__force pm_request_t) 1, + PM_RESUME = (__force pm_request_t) 2 + }; + +which makes PM_SUSPEND and PM_RESUME "bitwise" integers (the "__force" is +there because sparse will complain about casting to/from a bitwise type, +but in this case we really _do_ want to force the conversion). And because +the enum values are all the same type, now "enum pm_request" will be that +type too. + +And with gcc, all the "__bitwise"/"__force stuff" goes away, and it all +ends up looking just like integers to gcc. + +Quite frankly, you don't need the enum there. The above all really just +boils down to one special "int __bitwise" type. + +So the simpler way is to just do:: + + typedef int __bitwise pm_request_t; + + #define PM_SUSPEND ((__force pm_request_t) 1) + #define PM_RESUME ((__force pm_request_t) 2) + +and you now have all the infrastructure needed for strict typechecking. + +One small note: the constant integer "0" is special. You can use a +constant zero as a bitwise integer type without sparse ever complaining. +This is because "bitwise" (as the name implies) was designed for making +sure that bitwise types don't get mixed up (little-endian vs big-endian +vs cpu-endian vs whatever), and there the constant "0" really _is_ +special. + +Using sparse for lock checking +------------------------------ + +The following macros are undefined for gcc and defined during a sparse +run to use the "context" tracking feature of sparse, applied to +locking. These annotations tell sparse when a lock is held, with +regard to the annotated function's entry and exit. + +__must_hold - The specified lock is held on function entry and exit. + +__acquires - The specified lock is held on function exit, but not entry. + +__releases - The specified lock is held on function entry, but not exit. + +If the function enters and exits without the lock held, acquiring and +releasing the lock inside the function in a balanced way, no +annotation is needed. The three annotations above are for cases where +sparse would otherwise report a context imbalance. + +Getting sparse +-------------- + +You can get tarballs of the latest released versions from: +https://www.kernel.org/pub/software/devel/sparse/dist/ + +Alternatively, you can get snapshots of the latest development version +of sparse using git to clone:: + + git://git.kernel.org/pub/scm/devel/sparse/sparse.git + +Once you have it, just do:: + + make + make install + +as a regular user, and it will install sparse in your ~/bin directory. + +Using sparse +------------ + +Do a kernel make with "make C=1" to run sparse on all the C files that get +recompiled, or use "make C=2" to run sparse on the files whether they need to +be recompiled or not. The latter is a fast way to check the whole tree if you +have already built it. + +The optional make variable CF can be used to pass arguments to sparse. The +build system passes -Wbitwise to sparse automatically. + +Note that sparse defines the __CHECKER__ preprocessor symbol. diff --git a/Documentation/dev-tools/testing-overview.rst b/Documentation/dev-tools/testing-overview.rst new file mode 100644 index 0000000000..0aaf6ea536 --- /dev/null +++ b/Documentation/dev-tools/testing-overview.rst @@ -0,0 +1,180 @@ +.. SPDX-License-Identifier: GPL-2.0 + +==================== +Kernel Testing Guide +==================== + + +There are a number of different tools for testing the Linux kernel, so knowing +when to use each of them can be a challenge. This document provides a rough +overview of their differences, and how they fit together. + + +Writing and Running Tests +========================= + +The bulk of kernel tests are written using either the kselftest or KUnit +frameworks. These both provide infrastructure to help make running tests and +groups of tests easier, as well as providing helpers to aid in writing new +tests. + +If you're looking to verify the behaviour of the Kernel — particularly specific +parts of the kernel — then you'll want to use KUnit or kselftest. + + +The Difference Between KUnit and kselftest +------------------------------------------ + +KUnit (Documentation/dev-tools/kunit/index.rst) is an entirely in-kernel system +for "white box" testing: because test code is part of the kernel, it can access +internal structures and functions which aren't exposed to userspace. + +KUnit tests therefore are best written against small, self-contained parts +of the kernel, which can be tested in isolation. This aligns well with the +concept of 'unit' testing. + +For example, a KUnit test might test an individual kernel function (or even a +single codepath through a function, such as an error handling case), rather +than a feature as a whole. + +This also makes KUnit tests very fast to build and run, allowing them to be +run frequently as part of the development process. + +There is a KUnit test style guide which may give further pointers in +Documentation/dev-tools/kunit/style.rst + + +kselftest (Documentation/dev-tools/kselftest.rst), on the other hand, is +largely implemented in userspace, and tests are normal userspace scripts or +programs. + +This makes it easier to write more complicated tests, or tests which need to +manipulate the overall system state more (e.g., spawning processes, etc.). +However, it's not possible to call kernel functions directly from kselftest. +This means that only kernel functionality which is exposed to userspace somehow +(e.g. by a syscall, device, filesystem, etc.) can be tested with kselftest. To +work around this, some tests include a companion kernel module which exposes +more information or functionality. If a test runs mostly or entirely within the +kernel, however, KUnit may be the more appropriate tool. + +kselftest is therefore suited well to tests of whole features, as these will +expose an interface to userspace, which can be tested, but not implementation +details. This aligns well with 'system' or 'end-to-end' testing. + +For example, all new system calls should be accompanied by kselftest tests. + +Code Coverage Tools +=================== + +The Linux Kernel supports two different code coverage measurement tools. These +can be used to verify that a test is executing particular functions or lines +of code. This is useful for determining how much of the kernel is being tested, +and for finding corner-cases which are not covered by the appropriate test. + +Documentation/dev-tools/gcov.rst is GCC's coverage testing tool, which can be +used with the kernel to get global or per-module coverage. Unlike KCOV, it +does not record per-task coverage. Coverage data can be read from debugfs, +and interpreted using the usual gcov tooling. + +Documentation/dev-tools/kcov.rst is a feature which can be built in to the +kernel to allow capturing coverage on a per-task level. It's therefore useful +for fuzzing and other situations where information about code executed during, +for example, a single syscall is useful. + + +Dynamic Analysis Tools +====================== + +The kernel also supports a number of dynamic analysis tools, which attempt to +detect classes of issues when they occur in a running kernel. These typically +each look for a different class of bugs, such as invalid memory accesses, +concurrency issues such as data races, or other undefined behaviour like +integer overflows. + +Some of these tools are listed below: + +* kmemleak detects possible memory leaks. See + Documentation/dev-tools/kmemleak.rst +* KASAN detects invalid memory accesses such as out-of-bounds and + use-after-free errors. See Documentation/dev-tools/kasan.rst +* UBSAN detects behaviour that is undefined by the C standard, like integer + overflows. See Documentation/dev-tools/ubsan.rst +* KCSAN detects data races. See Documentation/dev-tools/kcsan.rst +* KFENCE is a low-overhead detector of memory issues, which is much faster than + KASAN and can be used in production. See Documentation/dev-tools/kfence.rst +* lockdep is a locking correctness validator. See + Documentation/locking/lockdep-design.rst +* There are several other pieces of debug instrumentation in the kernel, many + of which can be found in lib/Kconfig.debug + +These tools tend to test the kernel as a whole, and do not "pass" like +kselftest or KUnit tests. They can be combined with KUnit or kselftest by +running tests on a kernel with these tools enabled: you can then be sure +that none of these errors are occurring during the test. + +Some of these tools integrate with KUnit or kselftest and will +automatically fail tests if an issue is detected. + +Static Analysis Tools +===================== + +In addition to testing a running kernel, one can also analyze kernel source code +directly (**at compile time**) using **static analysis** tools. The tools +commonly used in the kernel allow one to inspect the whole source tree or just +specific files within it. They make it easier to detect and fix problems during +the development process. + +Sparse can help test the kernel by performing type-checking, lock checking, +value range checking, in addition to reporting various errors and warnings while +examining the code. See the Documentation/dev-tools/sparse.rst documentation +page for details on how to use it. + +Smatch extends Sparse and provides additional checks for programming logic +mistakes such as missing breaks in switch statements, unused return values on +error checking, forgetting to set an error code in the return of an error path, +etc. Smatch also has tests against more serious issues such as integer +overflows, null pointer dereferences, and memory leaks. See the project page at +http://smatch.sourceforge.net/. + +Coccinelle is another static analyzer at our disposal. Coccinelle is often used +to aid refactoring and collateral evolution of source code, but it can also help +to avoid certain bugs that occur in common code patterns. The types of tests +available include API tests, tests for correct usage of kernel iterators, checks +for the soundness of free operations, analysis of locking behavior, and further +tests known to help keep consistent kernel usage. See the +Documentation/dev-tools/coccinelle.rst documentation page for details. + +Beware, though, that static analysis tools suffer from **false positives**. +Errors and warns need to be evaluated carefully before attempting to fix them. + +When to use Sparse and Smatch +----------------------------- + +Sparse does type checking, such as verifying that annotated variables do not +cause endianness bugs, detecting places that use ``__user`` pointers improperly, +and analyzing the compatibility of symbol initializers. + +Smatch does flow analysis and, if allowed to build the function database, it +also does cross function analysis. Smatch tries to answer questions like where +is this buffer allocated? How big is it? Can this index be controlled by the +user? Is this variable larger than that variable? + +It's generally easier to write checks in Smatch than it is to write checks in +Sparse. Nevertheless, there are some overlaps between Sparse and Smatch checks. + +Strong points of Smatch and Coccinelle +-------------------------------------- + +Coccinelle is probably the easiest for writing checks. It works before the +pre-processor so it's easier to check for bugs in macros using Coccinelle. +Coccinelle also creates patches for you, which no other tool does. + +For example, with Coccinelle you can do a mass conversion from +``kmalloc(x * size, GFP_KERNEL)`` to ``kmalloc_array(x, size, GFP_KERNEL)``, and +that's really useful. If you just created a Smatch warning and try to push the +work of converting on to the maintainers they would be annoyed. You'd have to +argue about each warning if can really overflow or not. + +Coccinelle does no analysis of variable values, which is the strong point of +Smatch. On the other hand, Coccinelle allows you to do simple things in a simple +way. diff --git a/Documentation/dev-tools/ubsan.rst b/Documentation/dev-tools/ubsan.rst new file mode 100644 index 0000000000..1be6618e23 --- /dev/null +++ b/Documentation/dev-tools/ubsan.rst @@ -0,0 +1,89 @@ +The Undefined Behavior Sanitizer - UBSAN +======================================== + +UBSAN is a runtime undefined behaviour checker. + +UBSAN uses compile-time instrumentation to catch undefined behavior (UB). +Compiler inserts code that perform certain kinds of checks before operations +that may cause UB. If check fails (i.e. UB detected) __ubsan_handle_* +function called to print error message. + +GCC has that feature since 4.9.x [1_] (see ``-fsanitize=undefined`` option and +its suboptions). GCC 5.x has more checkers implemented [2_]. + +Report example +-------------- + +:: + + ================================================================================ + UBSAN: Undefined behaviour in ../include/linux/bitops.h:110:33 + shift exponent 32 is to large for 32-bit type 'unsigned int' + CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.0-rc1+ #26 + 0000000000000000 ffffffff82403cc8 ffffffff815e6cd6 0000000000000001 + ffffffff82403cf8 ffffffff82403ce0 ffffffff8163a5ed 0000000000000020 + ffffffff82403d78 ffffffff8163ac2b ffffffff815f0001 0000000000000002 + Call Trace: + [<ffffffff815e6cd6>] dump_stack+0x45/0x5f + [<ffffffff8163a5ed>] ubsan_epilogue+0xd/0x40 + [<ffffffff8163ac2b>] __ubsan_handle_shift_out_of_bounds+0xeb/0x130 + [<ffffffff815f0001>] ? radix_tree_gang_lookup_slot+0x51/0x150 + [<ffffffff8173c586>] _mix_pool_bytes+0x1e6/0x480 + [<ffffffff83105653>] ? dmi_walk_early+0x48/0x5c + [<ffffffff8173c881>] add_device_randomness+0x61/0x130 + [<ffffffff83105b35>] ? dmi_save_one_device+0xaa/0xaa + [<ffffffff83105653>] dmi_walk_early+0x48/0x5c + [<ffffffff831066ae>] dmi_scan_machine+0x278/0x4b4 + [<ffffffff8111d58a>] ? vprintk_default+0x1a/0x20 + [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120 + [<ffffffff830b2240>] setup_arch+0x405/0xc2c + [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120 + [<ffffffff830ae053>] start_kernel+0x83/0x49a + [<ffffffff830ad120>] ? early_idt_handler_array+0x120/0x120 + [<ffffffff830ad386>] x86_64_start_reservations+0x2a/0x2c + [<ffffffff830ad4f3>] x86_64_start_kernel+0x16b/0x17a + ================================================================================ + +Usage +----- + +To enable UBSAN configure kernel with:: + + CONFIG_UBSAN=y + +and to check the entire kernel:: + + CONFIG_UBSAN_SANITIZE_ALL=y + +To enable instrumentation for specific files or directories, add a line +similar to the following to the respective kernel Makefile: + +- For a single file (e.g. main.o):: + + UBSAN_SANITIZE_main.o := y + +- For all files in one directory:: + + UBSAN_SANITIZE := y + +To exclude files from being instrumented even if +``CONFIG_UBSAN_SANITIZE_ALL=y``, use:: + + UBSAN_SANITIZE_main.o := n + +and:: + + UBSAN_SANITIZE := n + +Detection of unaligned accesses controlled through the separate option - +CONFIG_UBSAN_ALIGNMENT. It's off by default on architectures that support +unaligned accesses (CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y). One could +still enable it in config, just note that it will produce a lot of UBSAN +reports. + +References +---------- + +.. _1: https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html +.. _2: https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html +.. _3: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html |