summaryrefslogtreecommitdiffstats
path: root/docs/code-quality/static-analysis/writing-new/writing-matchers.rst
diff options
context:
space:
mode:
Diffstat (limited to 'docs/code-quality/static-analysis/writing-new/writing-matchers.rst')
-rw-r--r--docs/code-quality/static-analysis/writing-new/writing-matchers.rst199
1 files changed, 199 insertions, 0 deletions
diff --git a/docs/code-quality/static-analysis/writing-new/writing-matchers.rst b/docs/code-quality/static-analysis/writing-new/writing-matchers.rst
new file mode 100644
index 0000000000..5b693f5f27
--- /dev/null
+++ b/docs/code-quality/static-analysis/writing-new/writing-matchers.rst
@@ -0,0 +1,199 @@
+.. _writing_matchers:
+
+Writing Matchers
+================
+
+On this page we will give some information about what a matcher is, and then provide an example of developing a simple match iteratively.
+
+Types of Matchers
+-----------------
+
+There are three types of matches: Node, Narrowing, and Traversal. There isn't always a clear separation or distinction between them, so treat this explanation as illustrative rather than definitive. Here is the documentation on matchers: `https://clang.llvm.org/docs/LibASTMatchersReference.html <https://clang.llvm.org/docs/LibASTMatchersReference.html>`_
+
+On that page it is not obvious, so we want to note, **cicking on the name of a matcher expands help about that matcher.** Example:
+
+.. image:: documentation-expanded.png
+
+Node Matchers
+~~~~~~~~~~~~~
+
+Node matchers can be thought of as 'Nouns'. They specify a **type** of node you want to match, that is, a particular *thing*. A function, a binary operation, a variable, a type.
+
+A full list of `node matchers are listed in the documentation <https://clang.llvm.org/docs/LibASTMatchersReference.html#node-matchers>`_. Some common ones are ``functionDecl()``, ``binaryOperator()``, and ``stmt()``.
+
+Narrowing Matchers
+~~~~~~~~~~~~~~~~~~
+
+Narrowing matchers can be thought of as 'Adjectives'. They narrow, or describe, a node, and therefore must be applied to a Node Matcher. For instance a node matcher may be a ``functionDecl``, and the narrowing matcher applied to it may be ``parameterCountIs``.
+
+The `table in the documentation <https://clang.llvm.org/docs/LibASTMatchersReference.html#narrowing-matchers>`_ lists all the narrowing matchers, which they apply to and how to use them. Here is how to read the table:
+
+.. image:: narrowing-matcher.png
+
+And some examples:
+
+::
+
+ m functionDecl(parameterCountIs(1))
+ m functionDecl(anyOf(isDefinition(), isVariadic()))
+
+
+As you can see **only one Narrowing Matcher is allowed** and it goes inside the parens of the Node Matcher. In the first example, the matcher is ``parameterCountIs``, in the second it is ``anyOf``.
+
+In the second, we use the singular ``anyOf`` matcher to match any of multiple other Narrowing Matchers: ``isDefinition`` or ``isVariadic``. The other two common combining narrowing matchers are ``allOf()`` and ``unless()``.
+
+If you *need* to specify a narrowing matcher (because it's a required argument to some other matcher), you can use the ``anything()`` narrowing matcher to have a no-op narrowing matcher.
+
+Traversal Matchers
+~~~~~~~~~~~~~~~~~~
+
+Traversal Matchers *also* can be thought of as adjectives - at least most of them. They also describe a specific node, but the difference from a narrowing matcher is that the scope of the description is broader than the individual node. A narrowing matcher says something about the node in isolation (e.g. the number of arguments it has) while a traversal matcher says something about the node's contents or place in the program.
+
+Again, the `the documentation <https://clang.llvm.org/docs/LibASTMatchersReference.html#traversal-matchers>`_ is the best place to explore and understand these, but here is a simple example for the traversal matcher ``hasArraySize()``:
+
+::
+
+ Given:
+ class MyClass { };
+ MyClass *p1 = new MyClass[10];
+
+
+ cxxNewExpr()
+ matches the expression 'new MyClass[10]'.
+
+ cxxNewExpr(hasArraySize(integerLiteral(equals(9))))
+ does not match anything
+
+ cxxNewExpr(hasArraySize(integerLiteral(equals(10))))
+ matches the expression 'new MyClass[10]'.
+
+
+
+Example of Iterative Matcher Development
+----------------------------------------
+
+When developing matchers, it will be much easier if you do the following:
+
+1. Write out the code you want to match. Write it out in as many different ways as you can. Examples: For some value in the code use a variable, a constant and a function that returns a value. Put the code you want to match inside of a function, inside of a conditional, inside of a function call, and inside of an inline function definition.
+2. Write out the code you *don't* want to match, but looks like code you do. Write out benign function calls, benign assignments, etc.
+3. Iterate on your matcher and treat it as _code_ you're writing. Indent it, copy it somewhere in case your browser crashes, even stick it in a tiny temporary version-controlled file.
+
+As an example of the above, below is a sample iterative development process of a more complicated matcher.
+
+ **Goal**: Match function calls where one of the parameters is an assignment expression with an integer literal, but the function parameter has a default value in the function definition.
+
+::
+
+ int add1(int a, int b) { return a + b; }
+ int add2(int c, int d = 8) { return c + d; }
+
+ int main() {
+ int x, y, z;
+
+ add1(x, y); // <- No match, no assignment
+ add1(3 + 4, y); // <- No match, no assignment
+ add1(z = x, y); // <- No match, assignment, but not an integer literal
+ add1(z = 2, y); // <- No match, assignment, integer literal, but function parameter lacks default value
+ add2(3, z = 2); // <- Match
+ }
+
+
+Here is the iterative development process:
+
+::
+
+ //-------------------------------------
+ // Step 1: Find all the function calls
+ m callExpr()
+ // Matches all calls, as expected.
+
+ //-------------------------------------
+ // Step 2: Start refining based on the arguments to the call
+ m callExpr(forEachArgumentWithParam()))
+ // Error: forEachArgumentWithParam expects two parameters
+
+ //-------------------------------------
+ // Step 3: Figure out the syntax to matching all the calls with this new operator
+ m callExpr(
+ forEachArgumentWithParam(
+ anything(),
+ anything()
+ )
+ )
+ // Matches all calls, as expected
+
+ //-------------------------------------
+ // Step 4: Find the calls with a binary operator of any kind
+ m callExpr(
+ forEachArgumentWithParam(
+ binaryOperator(),
+ anything()
+ )
+ )
+ // Does not match the first call, but matches the others
+
+ //-------------------------------------
+ // Step 5: Limit the binary operator to assignments
+ m callExpr(
+ forEachArgumentWithParam(
+ binaryOperator(isAssignmentOperator()),
+ anything()
+ )
+ )
+ // Now matches the final three calls
+
+ //-------------------------------------
+ // Step 6: Starting to refine matching the right-hand of the assignment
+ m callExpr(
+ forEachArgumentWithParam(
+ binaryOperator(
+ allOf(
+ isAssignmentOperator(),
+ hasRHS()
+ )),
+ anything()
+ )
+ )
+ // Error, hasRHS expects a parameter
+
+ //-------------------------------------
+ // Step 7:
+ m callExpr(
+ forEachArgumentWithParam(
+ binaryOperator(
+ allOf(
+ isAssignmentOperator(),
+ hasRHS(anything())
+ )),
+ anything()
+ )
+ )
+ // Okay, back to matching the final three calls
+
+ //-------------------------------------
+ // Step 8: Refine to just integer literals
+ m callExpr(
+ forEachArgumentWithParam(
+ binaryOperator(
+ allOf(
+ isAssignmentOperator(),
+ hasRHS(integerLiteral())
+ )),
+ anything()
+ )
+ )
+ // Now we match the final two calls
+
+ //-------------------------------------
+ // Step 9: Apply a restriction to the parameter definition
+ m callExpr(
+ forEachArgumentWithParam(
+ binaryOperator(
+ allOf(
+ isAssignmentOperator(),
+ hasRHS(integerLiteral())
+ )),
+ hasDefaultArgument()
+ )
+ )
+ // Now we match the final call