diff options
Diffstat (limited to 'docs/code-quality/static-analysis/writing-new/writing-matchers.rst')
-rw-r--r-- | docs/code-quality/static-analysis/writing-new/writing-matchers.rst | 199 |
1 files changed, 199 insertions, 0 deletions
diff --git a/docs/code-quality/static-analysis/writing-new/writing-matchers.rst b/docs/code-quality/static-analysis/writing-new/writing-matchers.rst new file mode 100644 index 0000000000..5b693f5f27 --- /dev/null +++ b/docs/code-quality/static-analysis/writing-new/writing-matchers.rst @@ -0,0 +1,199 @@ +.. _writing_matchers: + +Writing Matchers +================ + +On this page we will give some information about what a matcher is, and then provide an example of developing a simple match iteratively. + +Types of Matchers +----------------- + +There are three types of matches: Node, Narrowing, and Traversal. There isn't always a clear separation or distinction between them, so treat this explanation as illustrative rather than definitive. Here is the documentation on matchers: `https://clang.llvm.org/docs/LibASTMatchersReference.html <https://clang.llvm.org/docs/LibASTMatchersReference.html>`_ + +On that page it is not obvious, so we want to note, **cicking on the name of a matcher expands help about that matcher.** Example: + +.. image:: documentation-expanded.png + +Node Matchers +~~~~~~~~~~~~~ + +Node matchers can be thought of as 'Nouns'. They specify a **type** of node you want to match, that is, a particular *thing*. A function, a binary operation, a variable, a type. + +A full list of `node matchers are listed in the documentation <https://clang.llvm.org/docs/LibASTMatchersReference.html#node-matchers>`_. Some common ones are ``functionDecl()``, ``binaryOperator()``, and ``stmt()``. + +Narrowing Matchers +~~~~~~~~~~~~~~~~~~ + +Narrowing matchers can be thought of as 'Adjectives'. They narrow, or describe, a node, and therefore must be applied to a Node Matcher. For instance a node matcher may be a ``functionDecl``, and the narrowing matcher applied to it may be ``parameterCountIs``. + +The `table in the documentation <https://clang.llvm.org/docs/LibASTMatchersReference.html#narrowing-matchers>`_ lists all the narrowing matchers, which they apply to and how to use them. Here is how to read the table: + +.. image:: narrowing-matcher.png + +And some examples: + +:: + + m functionDecl(parameterCountIs(1)) + m functionDecl(anyOf(isDefinition(), isVariadic())) + + +As you can see **only one Narrowing Matcher is allowed** and it goes inside the parens of the Node Matcher. In the first example, the matcher is ``parameterCountIs``, in the second it is ``anyOf``. + +In the second, we use the singular ``anyOf`` matcher to match any of multiple other Narrowing Matchers: ``isDefinition`` or ``isVariadic``. The other two common combining narrowing matchers are ``allOf()`` and ``unless()``. + +If you *need* to specify a narrowing matcher (because it's a required argument to some other matcher), you can use the ``anything()`` narrowing matcher to have a no-op narrowing matcher. + +Traversal Matchers +~~~~~~~~~~~~~~~~~~ + +Traversal Matchers *also* can be thought of as adjectives - at least most of them. They also describe a specific node, but the difference from a narrowing matcher is that the scope of the description is broader than the individual node. A narrowing matcher says something about the node in isolation (e.g. the number of arguments it has) while a traversal matcher says something about the node's contents or place in the program. + +Again, the `the documentation <https://clang.llvm.org/docs/LibASTMatchersReference.html#traversal-matchers>`_ is the best place to explore and understand these, but here is a simple example for the traversal matcher ``hasArraySize()``: + +:: + + Given: + class MyClass { }; + MyClass *p1 = new MyClass[10]; + + + cxxNewExpr() + matches the expression 'new MyClass[10]'. + + cxxNewExpr(hasArraySize(integerLiteral(equals(9)))) + does not match anything + + cxxNewExpr(hasArraySize(integerLiteral(equals(10)))) + matches the expression 'new MyClass[10]'. + + + +Example of Iterative Matcher Development +---------------------------------------- + +When developing matchers, it will be much easier if you do the following: + +1. Write out the code you want to match. Write it out in as many different ways as you can. Examples: For some value in the code use a variable, a constant and a function that returns a value. Put the code you want to match inside of a function, inside of a conditional, inside of a function call, and inside of an inline function definition. +2. Write out the code you *don't* want to match, but looks like code you do. Write out benign function calls, benign assignments, etc. +3. Iterate on your matcher and treat it as _code_ you're writing. Indent it, copy it somewhere in case your browser crashes, even stick it in a tiny temporary version-controlled file. + +As an example of the above, below is a sample iterative development process of a more complicated matcher. + + **Goal**: Match function calls where one of the parameters is an assignment expression with an integer literal, but the function parameter has a default value in the function definition. + +:: + + int add1(int a, int b) { return a + b; } + int add2(int c, int d = 8) { return c + d; } + + int main() { + int x, y, z; + + add1(x, y); // <- No match, no assignment + add1(3 + 4, y); // <- No match, no assignment + add1(z = x, y); // <- No match, assignment, but not an integer literal + add1(z = 2, y); // <- No match, assignment, integer literal, but function parameter lacks default value + add2(3, z = 2); // <- Match + } + + +Here is the iterative development process: + +:: + + //------------------------------------- + // Step 1: Find all the function calls + m callExpr() + // Matches all calls, as expected. + + //------------------------------------- + // Step 2: Start refining based on the arguments to the call + m callExpr(forEachArgumentWithParam())) + // Error: forEachArgumentWithParam expects two parameters + + //------------------------------------- + // Step 3: Figure out the syntax to matching all the calls with this new operator + m callExpr( + forEachArgumentWithParam( + anything(), + anything() + ) + ) + // Matches all calls, as expected + + //------------------------------------- + // Step 4: Find the calls with a binary operator of any kind + m callExpr( + forEachArgumentWithParam( + binaryOperator(), + anything() + ) + ) + // Does not match the first call, but matches the others + + //------------------------------------- + // Step 5: Limit the binary operator to assignments + m callExpr( + forEachArgumentWithParam( + binaryOperator(isAssignmentOperator()), + anything() + ) + ) + // Now matches the final three calls + + //------------------------------------- + // Step 6: Starting to refine matching the right-hand of the assignment + m callExpr( + forEachArgumentWithParam( + binaryOperator( + allOf( + isAssignmentOperator(), + hasRHS() + )), + anything() + ) + ) + // Error, hasRHS expects a parameter + + //------------------------------------- + // Step 7: + m callExpr( + forEachArgumentWithParam( + binaryOperator( + allOf( + isAssignmentOperator(), + hasRHS(anything()) + )), + anything() + ) + ) + // Okay, back to matching the final three calls + + //------------------------------------- + // Step 8: Refine to just integer literals + m callExpr( + forEachArgumentWithParam( + binaryOperator( + allOf( + isAssignmentOperator(), + hasRHS(integerLiteral()) + )), + anything() + ) + ) + // Now we match the final two calls + + //------------------------------------- + // Step 9: Apply a restriction to the parameter definition + m callExpr( + forEachArgumentWithParam( + binaryOperator( + allOf( + isAssignmentOperator(), + hasRHS(integerLiteral()) + )), + hasDefaultArgument() + ) + ) + // Now we match the final call |