From 36d22d82aa202bb199967e9512281e9a53db42c9 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sun, 7 Apr 2024 21:33:14 +0200 Subject: Adding upstream version 115.7.0esr. Signed-off-by: Daniel Baumann --- docs/code-quality/static-analysis/existing.rst | 245 +++++++++++++++++++++ docs/code-quality/static-analysis/index.rst | 30 +++ .../static-analysis/writing-new/adding-a-check.rst | 107 +++++++++ .../writing-new/advanced-check-features.rst | 148 +++++++++++++ .../static-analysis/writing-new/clang-query.rst | 167 ++++++++++++++ .../writing-new/documentation-expanded.png | Bin 0 -> 41142 bytes .../static-analysis/writing-new/index.rst | 14 ++ .../writing-new/matcher-cookbook.rst | 23 ++ .../writing-new/narrowing-matcher.png | Bin 0 -> 47851 bytes .../writing-new/narrowing-matcher.xcf | Bin 0 -> 79407 bytes .../writing-new/writing-matchers.rst | 199 +++++++++++++++++ 11 files changed, 933 insertions(+) create mode 100644 docs/code-quality/static-analysis/existing.rst create mode 100644 docs/code-quality/static-analysis/index.rst create mode 100644 docs/code-quality/static-analysis/writing-new/adding-a-check.rst create mode 100644 docs/code-quality/static-analysis/writing-new/advanced-check-features.rst create mode 100644 docs/code-quality/static-analysis/writing-new/clang-query.rst create mode 100644 docs/code-quality/static-analysis/writing-new/documentation-expanded.png create mode 100644 docs/code-quality/static-analysis/writing-new/index.rst create mode 100644 docs/code-quality/static-analysis/writing-new/matcher-cookbook.rst create mode 100644 docs/code-quality/static-analysis/writing-new/narrowing-matcher.png create mode 100644 docs/code-quality/static-analysis/writing-new/narrowing-matcher.xcf create mode 100644 docs/code-quality/static-analysis/writing-new/writing-matchers.rst (limited to 'docs/code-quality/static-analysis') diff --git a/docs/code-quality/static-analysis/existing.rst b/docs/code-quality/static-analysis/existing.rst new file mode 100644 index 0000000000..81ed79a54f --- /dev/null +++ b/docs/code-quality/static-analysis/existing.rst @@ -0,0 +1,245 @@ +Existing Infrastructure and Analysis +==================================== + +This document is about how Static Analysis occurs at Mozilla: the Firefox-specific and general llvm clang-tidy checks that are run on submissions in Phabricator and how to run them locally. For information about how to develop your own static analysis checks, please see `Writing New Firefox-Specific Checks `_. + +For linting, please see the `linting documentation `_. + +For reviews, use the `#static-analysis-reviewers review group `__. +Ask questions on `#static-analysis:mozilla.org `__. + + +Clang-Tidy static analysis +-------------------------- + +As explained earlier, our current static-analysis infrastructure is based on +`clang-tidy `__. The checkers that +we use are split into 3 categories: + +#. :searchfox:`Firefox specific checkers `. They detect incorrect Gecko programming + patterns which could lead to bugs or security issues. +#. `Clang-tidy checkers `_. They aim to suggest better programming practices + and to improve memory efficiency and performance. +#. `Clang-analyzer checkers `_. These checks are more advanced, for example + some of them can detect dead code or memory leaks, but as a typical + side effect they have false positives. Because of that, we have + disabled them for now, but will enable some of them in the near + future. + +In order to simplify the process of static-analysis we have focused on +integrating this process with Phabricator and mach. A list of some +checkers that are used during automated scan can be found +:searchfox:`here `. + +Static analysis at review phase +------------------------------- + +We created a TaskCluster bot that runs clang static analysis on every +patch submitted to Phabricator. It then quickly reports any code defects +directly on the review platform, thus preventing bad patches from +landing until all their defects are fixed. Currently, its feedback is +posted in about 10 minutes after a patch series is published on the +review platform. + +As part of the process, the various linting jobs are also executed +using try. This can be also used to add new jobs, see: :ref:`attach-job-review`. +An example of automated review can be found `on +phabricator `__. + + +./mach static-analysis +---------------------- + +The ``./mach static-analysis`` command is supported on all Firefox built platforms. During the first run it +automatically installs all of its dependencies, such as the clang-tidy +executable, in the .mozbuild folder thus making it very easy to use. The +resources that are used are provided by toolchain artifacts clang-tidy +target. + +This is used through ``mach static-analysis`` command that has the +following parameters: + +- ``check`` - Runs the checks using the installed helper tool from + ~/.mozbuild. +- ``--checks, -c`` - Checks to enabled during the scan. The checks + enabled + :searchfox:`in the yaml file ` + are used by default. +- ``--fix, -f`` - Try to autofix errors detected by the checkers. + Depending on the checker, this option might not do anything. + The list of checkers with autofix can be found on the `clang-tidy website `__. +- ``--header-filter, -h-f`` - Regular expression matching the names of + the headers to output diagnostic from.Diagnostic from the main file + of each translation unit are always displayed. + +As an example we run static-analysis through mach on +``dom/presentation/Presentation.cpp`` with +``google-readability-braces-around-statements`` check and autofix we +would have: + +.. code-block:: shell + + ./mach static-analysis check --checks="-*, google-readability-braces-around-statements" --fix dom/presentation/Presentation.cpp + +If you want to use a custom clang-tidy binary this can be done by using +the ``install`` subcommand of ``mach static-analysis``, but please note +that the archive that is going to be used must be compatible with the +directory structure clang-tidy from toolchain artifacts. + +.. code-block:: shell + + ./mach static-analysis install clang.tar.gz + + +Regression Testing +------------------ + +In order to prevent regressions in our clang-tidy based static analysis, +we have created a +:searchfox:`task ` +on automation. This task runs on each commit and launches a test suite +that is integrated into mach. + +The test suite implements the following: + +- Downloads the necessary clang-tidy artifacts. +- Reads the + :searchfox:`configuration ` + file. +- For each checker reads the test file plus the expected result. A + sample of test and expected result can be found + :searchfox:`in the test file ` + and + :searchfox:`the json file `. + +This testing suit can be run locally by doing the following: + +.. code-block:: shell + + ./mach static-analysis autotest + +If we want to test only a specific checker, let's say +modernize-raw-string-literal, we can run: + +.. code-block:: shell + + ./mach static-analysis autotest modernize-raw-string-literal + +If we want to add a new checker we need to generated the expected result +file, by doing: + +.. code-block:: shell + + ./mach static-analysis autotest modernize-raw-string-literal -d + + +Build-time static-analysis +-------------------------- + +If you want to build with the Firefox Clang plug-in +(located in ``/build/clang-plugin`` and associated with +``MOZ_CLANG_PLUGIN`` and the attributes in ``/mfbt/Attributes.h``) +just add ``--enable-clang-plugin`` to your mozconfig! +If you want to also have our experimental checkers that will produce ``warnings`` as +diagnostic messages also add ``--enable-clang-plugin-alpha``. +This requires to build Firefox using Clang. + +Configuring the build environment +--------------------------------- + +Once you have your Clang build in place, you will need to set up tools +to use it. +A full working .mozconfig for the desktop browser is: + +.. code-block:: shell + + . $topsrcdir/browser/config/mozconfig + mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-ff-dbg + + ac_add_options --enable-debug + +Attempts to use ``ccache`` will likely result in failure to compile. It +is also necessary to avoid optimized builds, as these will modify macros +which will result in many false positives. + +At this point, your Firefox build environment should be configured to +compile via the Clang static analyzer! + + +Performing scanning builds +-------------------------- + +It is not enough to simply start the build like normal. Instead, you +need to run the build through a Clang utility script which will keep +track of all produced analysis and consolidate it automatically. + +Reports are published daily on +`https://sylvestre.ledru.info/reports/fx-scan-build/ `__ +Many of the defects reported as sources for Good First Bug. + +That script is scan-build. You can find it in +``$clang_source/tools/scan-build/scan-build``. + +Try running your build through ``scan-build``: + +.. code-block:: shell + + $ cd /path/to/mozilla/source + + # Blow away your object directory because incremental builds don't make sense + $ rm -rf obj-dir + + # To start the build: + scan-build --show-description ./mach build -v + + # The above should execute without any errors. However, it should take longer than + # normal because all compilation will be executing through Clang's static analyzer, + # which adds overhead. + +If things are working properly, you should see a bunch of console spew, +just like any build. + +The first time you run scan-build, CTRL+C after a few files are +compiled. You should see output like: + +.. code-block:: shell + + scan-build: 3 bugs found. + scan-build: Run 'scan-view /Users/gps/tmp/mcsb/2011-12-15-3' to examine bug reports. + +If you see a message like: + +.. code-block:: shell + + scan-build: Removing directory '/var/folders/s2/zc78dpsx2rz6cpc_21r9g5hr0000gn/T/scan-build-2011-12-15-1' because it contains no reports. + +either no static analysis results were available yet or your environment +is not configured properly. + +By default, ``scan-build`` writes results to a folder in a +pseudo-temporary location. You can control where results go by passing +the ``-o /path/to/output`` arguments to ``scan-build``. + +You may also want to run ``scan-build --help`` to see all the options +available. For example, it is possible to selectively enable and disable +individual analyzers. + + +Analyzing the output +-------------------- + +Once the build has completed, ``scan-build`` will produce a report +summarizing all the findings. This is called ``index.html`` in the +output directory. You can run ``scan-view`` (from +``$clang_source/tools/scan-view/scan-view``) as ``scan-build's`` output +suggests; this merely fires up a local HTTP server. Or you should be +able to open the ``index.html`` directly with your browser. + + +False positives +--------------- + +By definition, there are currently false positives in the static +analyzer. A lot of these are due to the analyzer having difficulties +following the relatively complicated error handling in various +preprocessor macros. diff --git a/docs/code-quality/static-analysis/index.rst b/docs/code-quality/static-analysis/index.rst new file mode 100644 index 0000000000..595eab363d --- /dev/null +++ b/docs/code-quality/static-analysis/index.rst @@ -0,0 +1,30 @@ +Static Analysis +=============== + +Static Analysis is running an analysis of the source code without actually executing the code. For the most part, at Mozilla static analysis refers to the stuff we do with `clang-tidy `__. It uses +checkers in order to prevent different programming errors present in the +code. The checkers that we use are split into 3 categories: + +#. :searchfox:`Firefox specific checkers `. They detect incorrect Gecko programming + patterns which could lead to bugs or security issues. +#. `Clang-tidy checkers `_. They aim to suggest better programming practices + and to improve memory efficiency and performance. +#. `Clang-analyzer checkers `_. These checks are more advanced, for example + some of them can detect dead code or memory leaks, but as a typical + side effect they have false positives. Because of that, we have + disabled them for now, but will enable some of them in the near + future. + +In order to simplify the process of static-analysis we have focused on +integrating this process with Phabricator and mach. A list of some +checkers that are used during automated scan can be found +:searchfox:`here `. + +This documentation is split into two parts: + +.. toctree:: + :maxdepth: 1 + :glob: + + existing.rst + writing-new/index.rst diff --git a/docs/code-quality/static-analysis/writing-new/adding-a-check.rst b/docs/code-quality/static-analysis/writing-new/adding-a-check.rst new file mode 100644 index 0000000000..ec3bc030af --- /dev/null +++ b/docs/code-quality/static-analysis/writing-new/adding-a-check.rst @@ -0,0 +1,107 @@ +.. _add_a_check: + +Adding a check +============== + +After you've completed a matcher using clang-query, it's time to take it to the next step and turn it into C++ and run it on the whole m-c codebase and see what happens. + +Clang plugins live in `build/clang-plugin `_ and here we'll cover what is needed to add one. To see how the most recent check was added, you can look at the log for `Checks.inc `_ which is one of the necessary files to edit. That's also what we'll be covering next. + +Boilerplate Steps to Add a New Check +------------------------------------ + +First pick a name. Pick something that makes sense without punctuation, in no more than 8 words or so. For this example we'll call it "Missing Else In Enum Comparisons". + +#. Add it alphabetically in build/clang-plugin/Checks.inc, ChecksIncludes.inc, and moz.build +#. ``cd build/clang-plugin && touch MissingElseInEnumComparisons.h MissingElseInEnumComparisons.cpp`` +#. Copy the contents of an existing, simple .h file (e.g. `build/clang-plugin/ScopeChecker.h `_) and edit the class name and header guards. +#. Create the following boilerplate for your implementation: + +:: + + /* This Source Code Form is subject to the terms of the Mozilla Public + * License, v. 2.0. If a copy of the MPL was not distributed with this + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ + + #include "MissingElseInEnumComparisons.h" + #include "CustomMatchers.h" + + void MissingElseInEnumComparisons::registerMatchers(MatchFinder *AstMatcher) { + + } + + void MissingElseInEnumComparisons::check(const MatchFinder::MatchResult &Result) { + + } + + +Converting your matcher to C++ +------------------------------ +With the boilerplate out of the way, now we can focus on converting the matcher over to C++. Once it's in C++ you'll also be able to take advantage of techniques that will make your matcher easier to read and understand. + +The gist of converting your matcher is to take the following pseudo-code and paste your entire matcher in where 'foo' is; keeping the `.bind("node")` there: + +:: + + AstMatcher->addMatcher( + traverse(TK_IgnoreUnlessSpelledInSource, + foo + .bind("node")), + this); + + +It honest to god is usually that easy. Here's a working example where I pasted in straight from Compiler Explorer: + +:: + + AstMatcher->addMatcher( + traverse(TK_IgnoreUnlessSpelledInSource, + ifStmt(allOf( + has( + binaryOperator( + has( + declRefExpr(hasType(enumDecl().bind("enum"))) + ) + ) + ), + hasElse( + ifStmt(allOf( + unless(hasElse(anything())), + has( + binaryOperator( + has( + declRefExpr(hasType(enumDecl())) + ) + ) + ) + )) + ) + )) + .bind("node")), + this); + + + +If for some reason you're not using the ``IgnoreUnlessSpelledInSource`` Traversal Mode, remove the call to traverse and the corresponding closing paren. (Also, if you're comparing this code to existing source code, know that because this traversal mode is a new clang feature, most historical clang checks do not use it.) + +Wiring up Warnings and Errors +----------------------------- +To get started with a some simple output, just take the boilerplate warning here and stick it in: + +:: + + const auto *MatchedDecl = Result.Nodes.getNodeAs("node"); + diag(MatchedDecl->getIfLoc(), + "Enum comparisons in an if/else if block without a trailing else.", + DiagnosticIDs::Warning); + + +You'll need to edit two things: + +#. Make sure "node" matches whatever you put in `.bind()` up above. +#. ``getNodeAs`` needs to be changed to whatever type of element "node" is. Above, we bind "node" to an ifStmt so that's what we need to cast it to. Doing this step incorrectly will cause clang to crash during compilation as if there was some internal compiler error. + + +Running it on Central +---------------------- +After this, the next thing to do is to add ``ac_add_options --enable-clang-plugin`` to your .mozconfig and do a build. Your plugin will be automatically compiled and used across the entire codebase. I suggest using ``./mach build | tee output.txt`` and then ``grep "Enum comparisons" output.txt | cut -d " " -f 3- | sort | uniq``. (The ``cut`` is there to get rid of the timestamp in the line.) diff --git a/docs/code-quality/static-analysis/writing-new/advanced-check-features.rst b/docs/code-quality/static-analysis/writing-new/advanced-check-features.rst new file mode 100644 index 0000000000..e8adcf664f --- /dev/null +++ b/docs/code-quality/static-analysis/writing-new/advanced-check-features.rst @@ -0,0 +1,148 @@ +.. _advanced_check_features: + +Advanced Check Features +======================= + +This page covers additional ways to improve and extend the check you've added to build/clang-plugin. + +Adding Tests +------------ + +No doubt you've seen the tests for existing checks in `build/clang-plugin/tests `_. Adding tests is straightforward; and your reviewer should insist you do so. Simply copying the existing format of any test and how diagnostics are marked as expected. + +One wrinkle - all clang plugin checks are applied to all tests. We try to write tests so that only one check applies to it. If you write a check that triggers on an existing test, try to fix the existing test slightly so the new check does not trigger on it. + +Using Bind To Output More Useful Information +-------------------------------------------- + +You've probably been wondering what the heck ``.bind()`` is for. You've been seeing it all over the place but never has it actually been explained what it's for and when to use it. + +``.bind()`` is used to give a name to part of the AST discovered through your matcher, so you can use it later. Let's go back to our sample matcher: + +:: + + AstMatcher->addMatcher( + traverse(TK_IgnoreUnlessSpelledInSource, + ifStmt(allOf( + has( + binaryOperator( + has( + declRefExpr(hasType(enumDecl())) + ) + ) + ), + hasElse( + ifStmt(allOf( + unless(hasElse(anything())), + has( + binaryOperator( + has( + declRefExpr(hasType(enumDecl())) + ) + ) + ) + )) + ) + )) + .bind("node")), + this); + +Now the ``.bind("node")`` makes more sense. We're naming the If statement we matched, so we can refer to it later when we call ``Result.Nodes.getNodeAs("node")``. + +Let's say we want to provide the *type* of the enum in our warning message. There are two enums we end up seeing in our matcher - the enum in the first if statement, and the enum in the second. We're going to arbitrarily pick the first and give it the name ``enumType``: + +:: + + AstMatcher->addMatcher( + traverse(TK_IgnoreUnlessSpelledInSource, + ifStmt(allOf( + has( + binaryOperator( + has( + declRefExpr(hasType(enumDecl().bind("enumType"))) + ) + ) + ), + hasElse( + ifStmt(allOf( + unless(hasElse(anything())), + has( + binaryOperator( + has( + declRefExpr(hasType(enumDecl())) + ) + ) + ) + )) + ) + )) + .bind("node")), + this); + +And in our check() function, we can use it like so: + +:: + + void MissingElseInEnumComparisons::check( + const MatchFinder::MatchResult &Result) { + const auto *MatchedDecl = Result.Nodes.getNodeAs("node"); + const auto *EnumType = Result.Nodes.getNodeAs("enumType"); + + diag(MatchedDecl->getIfLoc(), + "Enum comparisons to %0 in an if/else if block without a trailing else.", + DiagnosticIDs::Warning) << EnumType->getName(); + } + +Repeated matcher calls +-------------------------- + +If you find yourself repeating the same several matchers in several spots, you can turn it into a variable to use. + +:: + + auto isTemporaryLifetimeBoundCall = + cxxMemberCallExpr( + onImplicitObjectArgument(anyOf(has(cxxTemporaryObjectExpr()), + has(materializeTemporaryExpr()))), + callee(functionDecl(isMozTemporaryLifetimeBound()))); + + auto hasTemporaryLifetimeBoundCall = + anyOf(isTemporaryLifetimeBoundCall, + conditionalOperator( + anyOf(hasFalseExpression(isTemporaryLifetimeBoundCall), + hasTrueExpression(isTemporaryLifetimeBoundCall)))); + +The above example is parameter-less, but if you need to supply a parameter that changes, you can turn it into a lambda: + +:: + + auto hasConstCharPtrParam = [](const unsigned int Position) { + return hasParameter( + Position, hasType(hasCanonicalType(pointsTo(asString("const char"))))); + }; + + auto hasParamOfType = [](const unsigned int Position, const char *Name) { + return hasParameter(Position, hasType(asString(Name))); + }; + + auto hasIntegerParam = [](const unsigned int Position) { + return hasParameter(Position, hasType(isInteger())); + }; + + AstMatcher->addMatcher( + callExpr( + hasName("fopen"), + hasConstCharPtrParam(0)) + .bind("funcCall"), + this); + + +Allow-listing existing callsites +-------------------------------- + +While it's not a great situation, you can set up an allow-list of existing callsites if you need to. A simple allow-list is demonstrated in `NoGetPrincipalURI `_. The `NoNewThreadsChecker `_ is an example of a more sophisticated way of setting up a larger allow-list. + + +Custom Annotations +------------------ +It's possible to create custom annotations that will be a no-op when compiled, but can be used by a static analysis check. These can be used to annotate special types of sources and sinks (for example). We have some examples of this in-tree presently (such as ``MOZ_CAN_RUN_SCRIPT``) but currently don't have a detailed walkthrough in this documentation of how to set these up and use them. (Patches welcome.) diff --git a/docs/code-quality/static-analysis/writing-new/clang-query.rst b/docs/code-quality/static-analysis/writing-new/clang-query.rst new file mode 100644 index 0000000000..1308a32821 --- /dev/null +++ b/docs/code-quality/static-analysis/writing-new/clang-query.rst @@ -0,0 +1,167 @@ +.. _using_clang_query: + +Using clang-query +================= + +clang-query is a tool that allows you to quickly iterate and develop the difficult part of a matcher. +Once the design of the matcher is completed, it can be transferred to a C++ clang-tidy plugin, `similar +to the ones in mozilla-central `_. + +Recommended Boilerplate +----------------------- + +:: + + set traversal IgnoreUnlessSpelledInSource + set bind-root true + # ^ true unless you use any .bind("foo") commands + set print-matcher true + enable output dump + + +clang-query Options +------------------- + +set traversal +~~~~~~~~~~~~~ + +`Traversal mode `_ specifies how the AST Matcher will traverse the nodes in the Abstract Syntax Tree. There are two values: + +AsIs + This mode notes all the nodes in the AST, even if they are not explicitly spelled out in the source. This will include nodes you have never seen and probably don't immediately understand, for example ``ExprWithCleanups`` and ``MaterializeTemporaryExpr``. In this mode, it is necessary to write matchers that expliticly match or otherwise traverse these potentially unexpected nodes. + +IgnoreUnlessSpelledInSource + This mode skips over 'implicit' nodes that are created as a result of implicit casts or other usually-low-level language details. This is typically much more user-friendly. **Typically you would want to use** ``set traversal IgnoreUnlessSpelledInSource``. + +More examples are available `in the documentation `_, but here is a simple example: + +:: + + B func1() { + return 42; + } + + /* + AST Dump in 'Asis' mode for C++17/C++20 dialect: + + FunctionDecl + `-CompoundStmt + `-ReturnStmt + `-ImplicitCastExpr + `-CXXConstructExpr + `-IntegerLiteral 'int' 42 + + AST Dump in 'IgnoreUnlessSpelledInSource' mode for all dialects: + + FunctionDecl + `-CompoundStmt + `-ReturnStmt + `-IntegerLiteral 'int' 42 + */ + + +set bind-root +~~~~~~~~~~~~~ + +If you are matching objects and assigning them names for later use, this option may be relevant. If you are debugging a single matcher and not using any ``.bind()``, it is irrelevant. + +Consider the output of ``match functionDecl().bind("x")``: + +:: + + clang-query> match functionDecl().bind("x") + + Match #1: + + testfile.cpp:1:1: note: "root" binds here + int addTwo(int num) + ^~~~~~~~~~~~~~~~~~~ + testfile.cpp:1:1: note: "x" binds here + int addTwo(int num) + ^~~~~~~~~~~~~~~~~~~ + + Match #2: + + testfile.cpp:6:1: note: "root" binds here + int main(int, char**) + ^~~~~~~~~~~~~~~~~~~~~ + testfile.cpp:6:1: note: "x" binds here + int main(int, char**) + ^~~~~~~~~~~~~~~~~~~~~ + 2 matches. + + +clang-query automatically binds ``root`` to the match, but we also bound the name ``x`` to that match. The ``root`` is redundant. If you ``set bind-root false`` then the output is less noisy: + +:: + + clang-query> set bind-root false + clang-query> m functionDecl().bind("x") + + Match #1: + + testfile.cpp:1:1: note: "x" binds here + int addtwo(int num) + ^~~~~~~~~~~~~~~~~~~ + + Match #2: + + testfile.cpp:6:1: note: "x" binds here + int main(int, char**) + ^~~~~~~~~~~~~~~~~~~~~ + 2 matches. + + +set print-matcher +~~~~~~~~~~~~~~~~~ + +``set print-matcher true`` will print a header line of the form 'Matcher: ' where foo is the matcher you have written. It is helpful when debugging multiple matchers at the same time, and no inconvience otherwise. + +enable/disable/set output +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +These commands will control the type of output you get from clang-query. The options are: + +``print`` + Shows you the C++ form of the node you are matching. This is typically not useful. + +``diag`` + Shows you the individual node you are matching. + +``dump`` (alias: ``detailed-ast``) + Shows you the node you are matching and the entire subtree for the node + +By default, you get ``diag`` output. You can change the output by choosing ``set output``. You can *add* output by using ``enable output``. You can *disable* output using ``disable output`` but this is typically not needed. + +So if you want to get all three output formats you can do: + +:: + + # diag output happens automatically because you did not override with 'set' + enable output print + enable output dump + + +Patches +------- + +This section tracks some patches; they are currently not used, but we may want them in the future. + +- Functionality: + + - `traverse() operator available to clang-query `_ + - `srclog output `_ + - `allow anyOf() to be empty `_ + - breakpoints + - debug + - profile + +- Matcher Changes: + + - `binaryOperation() matcher `_ + +- Plumbing: + + - `mapAnyOf() `_ (`Example of usage `_) + - `Make cxxOperatorCallExpr matchers API-compatible with n-ary operators `_ + - `CXXRewrittenBinaryOperator `_ diff --git a/docs/code-quality/static-analysis/writing-new/documentation-expanded.png b/docs/code-quality/static-analysis/writing-new/documentation-expanded.png new file mode 100644 index 0000000000..82f12a516d Binary files /dev/null and b/docs/code-quality/static-analysis/writing-new/documentation-expanded.png differ diff --git a/docs/code-quality/static-analysis/writing-new/index.rst b/docs/code-quality/static-analysis/writing-new/index.rst new file mode 100644 index 0000000000..9107fb1b59 --- /dev/null +++ b/docs/code-quality/static-analysis/writing-new/index.rst @@ -0,0 +1,14 @@ +Writing New Firefox-Specific Static Analysis Checks +=================================================== + +This section is intended to help Mozilla engineers either casually play with writing a static analysis check +or seriously develop one we can land and run internally. While being written for internal consumption, it's broadly applicable outside Mozilla. + +.. toctree:: + :maxdepth: 2 + + clang-query.rst + writing-matchers.rst + matcher-cookbook.rst + adding-a-check.rst + advanced-check-features.rst diff --git a/docs/code-quality/static-analysis/writing-new/matcher-cookbook.rst b/docs/code-quality/static-analysis/writing-new/matcher-cookbook.rst new file mode 100644 index 0000000000..9eb0d96c43 --- /dev/null +++ b/docs/code-quality/static-analysis/writing-new/matcher-cookbook.rst @@ -0,0 +1,23 @@ +.. _matcher_cookbook: + +Matcher Cookbook +================= + +This page is designed to be a selection of common ingredients to a more complicated matcher. + +.. list-table:: + :widths: 35 65 + :header-rows: 1 + :class: matcher-cookbook + + * - Desired Outcome + - Syntax + * - Ignore header files + + *If you have an #include in your example code, your matcher may match things in the header files.* + - Add **isExpansionInMainFile()** to the matcher. e.g. + + ``m functionDecl(isExpansionInMainFile())`` + + +*More coming* diff --git a/docs/code-quality/static-analysis/writing-new/narrowing-matcher.png b/docs/code-quality/static-analysis/writing-new/narrowing-matcher.png new file mode 100644 index 0000000000..52c82791d3 Binary files /dev/null and b/docs/code-quality/static-analysis/writing-new/narrowing-matcher.png differ diff --git a/docs/code-quality/static-analysis/writing-new/narrowing-matcher.xcf b/docs/code-quality/static-analysis/writing-new/narrowing-matcher.xcf new file mode 100644 index 0000000000..0102f79e76 Binary files /dev/null and b/docs/code-quality/static-analysis/writing-new/narrowing-matcher.xcf differ diff --git a/docs/code-quality/static-analysis/writing-new/writing-matchers.rst b/docs/code-quality/static-analysis/writing-new/writing-matchers.rst new file mode 100644 index 0000000000..5b693f5f27 --- /dev/null +++ b/docs/code-quality/static-analysis/writing-new/writing-matchers.rst @@ -0,0 +1,199 @@ +.. _writing_matchers: + +Writing Matchers +================ + +On this page we will give some information about what a matcher is, and then provide an example of developing a simple match iteratively. + +Types of Matchers +----------------- + +There are three types of matches: Node, Narrowing, and Traversal. There isn't always a clear separation or distinction between them, so treat this explanation as illustrative rather than definitive. Here is the documentation on matchers: `https://clang.llvm.org/docs/LibASTMatchersReference.html `_ + +On that page it is not obvious, so we want to note, **cicking on the name of a matcher expands help about that matcher.** Example: + +.. image:: documentation-expanded.png + +Node Matchers +~~~~~~~~~~~~~ + +Node matchers can be thought of as 'Nouns'. They specify a **type** of node you want to match, that is, a particular *thing*. A function, a binary operation, a variable, a type. + +A full list of `node matchers are listed in the documentation `_. Some common ones are ``functionDecl()``, ``binaryOperator()``, and ``stmt()``. + +Narrowing Matchers +~~~~~~~~~~~~~~~~~~ + +Narrowing matchers can be thought of as 'Adjectives'. They narrow, or describe, a node, and therefore must be applied to a Node Matcher. For instance a node matcher may be a ``functionDecl``, and the narrowing matcher applied to it may be ``parameterCountIs``. + +The `table in the documentation `_ lists all the narrowing matchers, which they apply to and how to use them. Here is how to read the table: + +.. image:: narrowing-matcher.png + +And some examples: + +:: + + m functionDecl(parameterCountIs(1)) + m functionDecl(anyOf(isDefinition(), isVariadic())) + + +As you can see **only one Narrowing Matcher is allowed** and it goes inside the parens of the Node Matcher. In the first example, the matcher is ``parameterCountIs``, in the second it is ``anyOf``. + +In the second, we use the singular ``anyOf`` matcher to match any of multiple other Narrowing Matchers: ``isDefinition`` or ``isVariadic``. The other two common combining narrowing matchers are ``allOf()`` and ``unless()``. + +If you *need* to specify a narrowing matcher (because it's a required argument to some other matcher), you can use the ``anything()`` narrowing matcher to have a no-op narrowing matcher. + +Traversal Matchers +~~~~~~~~~~~~~~~~~~ + +Traversal Matchers *also* can be thought of as adjectives - at least most of them. They also describe a specific node, but the difference from a narrowing matcher is that the scope of the description is broader than the individual node. A narrowing matcher says something about the node in isolation (e.g. the number of arguments it has) while a traversal matcher says something about the node's contents or place in the program. + +Again, the `the documentation `_ is the best place to explore and understand these, but here is a simple example for the traversal matcher ``hasArraySize()``: + +:: + + Given: + class MyClass { }; + MyClass *p1 = new MyClass[10]; + + + cxxNewExpr() + matches the expression 'new MyClass[10]'. + + cxxNewExpr(hasArraySize(integerLiteral(equals(9)))) + does not match anything + + cxxNewExpr(hasArraySize(integerLiteral(equals(10)))) + matches the expression 'new MyClass[10]'. + + + +Example of Iterative Matcher Development +---------------------------------------- + +When developing matchers, it will be much easier if you do the following: + +1. Write out the code you want to match. Write it out in as many different ways as you can. Examples: For some value in the code use a variable, a constant and a function that returns a value. Put the code you want to match inside of a function, inside of a conditional, inside of a function call, and inside of an inline function definition. +2. Write out the code you *don't* want to match, but looks like code you do. Write out benign function calls, benign assignments, etc. +3. Iterate on your matcher and treat it as _code_ you're writing. Indent it, copy it somewhere in case your browser crashes, even stick it in a tiny temporary version-controlled file. + +As an example of the above, below is a sample iterative development process of a more complicated matcher. + + **Goal**: Match function calls where one of the parameters is an assignment expression with an integer literal, but the function parameter has a default value in the function definition. + +:: + + int add1(int a, int b) { return a + b; } + int add2(int c, int d = 8) { return c + d; } + + int main() { + int x, y, z; + + add1(x, y); // <- No match, no assignment + add1(3 + 4, y); // <- No match, no assignment + add1(z = x, y); // <- No match, assignment, but not an integer literal + add1(z = 2, y); // <- No match, assignment, integer literal, but function parameter lacks default value + add2(3, z = 2); // <- Match + } + + +Here is the iterative development process: + +:: + + //------------------------------------- + // Step 1: Find all the function calls + m callExpr() + // Matches all calls, as expected. + + //------------------------------------- + // Step 2: Start refining based on the arguments to the call + m callExpr(forEachArgumentWithParam())) + // Error: forEachArgumentWithParam expects two parameters + + //------------------------------------- + // Step 3: Figure out the syntax to matching all the calls with this new operator + m callExpr( + forEachArgumentWithParam( + anything(), + anything() + ) + ) + // Matches all calls, as expected + + //------------------------------------- + // Step 4: Find the calls with a binary operator of any kind + m callExpr( + forEachArgumentWithParam( + binaryOperator(), + anything() + ) + ) + // Does not match the first call, but matches the others + + //------------------------------------- + // Step 5: Limit the binary operator to assignments + m callExpr( + forEachArgumentWithParam( + binaryOperator(isAssignmentOperator()), + anything() + ) + ) + // Now matches the final three calls + + //------------------------------------- + // Step 6: Starting to refine matching the right-hand of the assignment + m callExpr( + forEachArgumentWithParam( + binaryOperator( + allOf( + isAssignmentOperator(), + hasRHS() + )), + anything() + ) + ) + // Error, hasRHS expects a parameter + + //------------------------------------- + // Step 7: + m callExpr( + forEachArgumentWithParam( + binaryOperator( + allOf( + isAssignmentOperator(), + hasRHS(anything()) + )), + anything() + ) + ) + // Okay, back to matching the final three calls + + //------------------------------------- + // Step 8: Refine to just integer literals + m callExpr( + forEachArgumentWithParam( + binaryOperator( + allOf( + isAssignmentOperator(), + hasRHS(integerLiteral()) + )), + anything() + ) + ) + // Now we match the final two calls + + //------------------------------------- + // Step 9: Apply a restriction to the parameter definition + m callExpr( + forEachArgumentWithParam( + binaryOperator( + allOf( + isAssignmentOperator(), + hasRHS(integerLiteral()) + )), + hasDefaultArgument() + ) + ) + // Now we match the final call -- cgit v1.2.3