summaryrefslogtreecommitdiffstats
path: root/src/tools/clippy/book
diff options
context:
space:
mode:
Diffstat (limited to 'src/tools/clippy/book')
-rw-r--r--src/tools/clippy/book/src/SUMMARY.md1
-rw-r--r--src/tools/clippy/book/src/configuration.md6
-rw-r--r--src/tools/clippy/book/src/development/adding_lints.md18
-rw-r--r--src/tools/clippy/book/src/development/proposals/syntax-tree-patterns.md986
4 files changed, 997 insertions, 14 deletions
diff --git a/src/tools/clippy/book/src/SUMMARY.md b/src/tools/clippy/book/src/SUMMARY.md
index 0b945faf9..1f0b8db28 100644
--- a/src/tools/clippy/book/src/SUMMARY.md
+++ b/src/tools/clippy/book/src/SUMMARY.md
@@ -21,3 +21,4 @@
- [The Clippy Book](development/infrastructure/book.md)
- [Proposals](development/proposals/README.md)
- [Roadmap 2021](development/proposals/roadmap-2021.md)
+ - [Syntax Tree Patterns](development/proposals/syntax-tree-patterns.md)
diff --git a/src/tools/clippy/book/src/configuration.md b/src/tools/clippy/book/src/configuration.md
index 77f1d2e87..430ff8b73 100644
--- a/src/tools/clippy/book/src/configuration.md
+++ b/src/tools/clippy/book/src/configuration.md
@@ -11,8 +11,8 @@ disallowed-names = ["toto", "tata", "titi"]
cognitive-complexity-threshold = 30
```
-See the [list of lints](https://rust-lang.github.io/rust-clippy/master/index.html) for more information about which
-lints can be configured and the meaning of the variables.
+See the [list of configurable lints](https://rust-lang.github.io/rust-clippy/master/index.html#Configuration),
+the lint descriptions contain the names and meanings of these configuration variables.
To deactivate the "for further information visit *lint-link*" message you can define the `CLIPPY_DISABLE_DOCS_LINKS`
environment variable.
@@ -72,7 +72,7 @@ minimum supported Rust version (MSRV) in the clippy configuration file.
msrv = "1.30.0"
```
-The MSRV can also be specified as an inner attribute, like below.
+The MSRV can also be specified as an attribute, like below.
```rust
#![feature(custom_inner_attributes)]
diff --git a/src/tools/clippy/book/src/development/adding_lints.md b/src/tools/clippy/book/src/development/adding_lints.md
index 3c3f368a5..8b4eee8c9 100644
--- a/src/tools/clippy/book/src/development/adding_lints.md
+++ b/src/tools/clippy/book/src/development/adding_lints.md
@@ -443,27 +443,27 @@ value is passed to the constructor in `clippy_lints/lib.rs`.
```rust
pub struct ManualStrip {
- msrv: Option<RustcVersion>,
+ msrv: Msrv,
}
impl ManualStrip {
#[must_use]
- pub fn new(msrv: Option<RustcVersion>) -> Self {
+ pub fn new(msrv: Msrv) -> Self {
Self { msrv }
}
}
```
The project's MSRV can then be matched against the feature MSRV in the LintPass
-using the `meets_msrv` utility function.
+using the `Msrv::meets` method.
``` rust
-if !meets_msrv(self.msrv, msrvs::STR_STRIP_PREFIX) {
+if !self.msrv.meets(msrvs::STR_STRIP_PREFIX) {
return;
}
```
-The project's MSRV can also be specified as an inner attribute, which overrides
+The project's MSRV can also be specified as an attribute, which overrides
the value from `clippy.toml`. This can be accounted for using the
`extract_msrv_attr!(LintContext)` macro and passing
`LateContext`/`EarlyContext`.
@@ -483,19 +483,15 @@ have a case for the version below the MSRV and one with the same contents but
for the MSRV version itself.
```rust
-#![feature(custom_inner_attributes)]
-
...
+#[clippy::msrv = "1.44"]
fn msrv_1_44() {
- #![clippy::msrv = "1.44"]
-
/* something that would trigger the lint */
}
+#[clippy::msrv = "1.45"]
fn msrv_1_45() {
- #![clippy::msrv = "1.45"]
-
/* something that would trigger the lint */
}
```
diff --git a/src/tools/clippy/book/src/development/proposals/syntax-tree-patterns.md b/src/tools/clippy/book/src/development/proposals/syntax-tree-patterns.md
new file mode 100644
index 000000000..c5587c4bf
--- /dev/null
+++ b/src/tools/clippy/book/src/development/proposals/syntax-tree-patterns.md
@@ -0,0 +1,986 @@
+- Feature Name: syntax-tree-patterns
+- Start Date: 2019-03-12
+- RFC PR: [#3875](https://github.com/rust-lang/rust-clippy/pull/3875)
+
+# Summary
+
+Introduce a domain-specific language (similar to regular expressions) that
+allows to describe lints using *syntax tree patterns*.
+
+
+# Motivation
+
+Finding parts of a syntax tree (AST, HIR, ...) that have certain properties
+(e.g. "*an if that has a block as its condition*") is a major task when writing
+lints. For non-trivial lints, it often requires nested pattern matching of AST /
+HIR nodes. For example, testing that an expression is a boolean literal requires
+the following checks:
+
+```rust
+if let ast::ExprKind::Lit(lit) = &expr.node {
+ if let ast::LitKind::Bool(_) = &lit.node {
+ ...
+ }
+}
+```
+
+Writing this kind of matching code quickly becomes a complex task and the
+resulting code is often hard to comprehend. The code below shows a simplified
+version of the pattern matching required by the `collapsible_if` lint:
+
+```rust
+// simplified version of the collapsible_if lint
+if let ast::ExprKind::If(check, then, None) = &expr.node {
+ if then.stmts.len() == 1 {
+ if let ast::StmtKind::Expr(inner) | ast::StmtKind::Semi(inner) = &then.stmts[0].node {
+ if let ast::ExprKind::If(check_inner, content, None) = &inner.node {
+ ...
+ }
+ }
+ }
+}
+```
+
+The `if_chain` macro can improve readability by flattening the nested if
+statements, but the resulting code is still quite hard to read:
+
+```rust
+// simplified version of the collapsible_if lint
+if_chain! {
+ if let ast::ExprKind::If(check, then, None) = &expr.node;
+ if then.stmts.len() == 1;
+ if let ast::StmtKind::Expr(inner) | ast::StmtKind::Semi(inner) = &then.stmts[0].node;
+ if let ast::ExprKind::If(check_inner, content, None) = &inner.node;
+ then {
+ ...
+ }
+}
+```
+
+The code above matches if expressions that contain only another if expression
+(where both ifs don't have an else branch). While it's easy to explain what the
+lint does, it's hard to see that from looking at the code samples above.
+
+Following the motivation above, the first goal this RFC is to **simplify writing
+and reading lints**.
+
+The second part of the motivation is clippy's dependence on unstable
+compiler-internal data structures. Clippy lints are currently written against
+the compiler's AST / HIR which means that even small changes in these data
+structures might break a lot of lints. The second goal of this RFC is to **make
+lints independant of the compiler's AST / HIR data structures**.
+
+# Approach
+
+A lot of complexity in writing lints currently seems to come from having to
+manually implement the matching logic (see code samples above). It's an
+imparative style that describes *how* to match a syntax tree node instead of
+specifying *what* should be matched against declaratively. In other areas, it's
+common to use declarative patterns to describe desired information and let the
+implementation do the actual matching. A well-known example of this approach are
+[regular expressions](https://en.wikipedia.org/wiki/Regular_expression). Instead
+of writing code that detects certain character sequences, one can describe a
+search pattern using a domain-specific language and search for matches using
+that pattern. The advantage of using a declarative domain-specific language is
+that its limited domain (e.g. matching character sequences in the case of
+regular expressions) allows to express entities in that domain in a very natural
+and expressive way.
+
+While regular expressions are very useful when searching for patterns in flat
+character sequences, they cannot easily be applied to hierarchical data
+structures like syntax trees. This RFC therefore proposes a pattern matching
+system that is inspired by regular expressions and designed for hierarchical
+syntax trees.
+
+# Guide-level explanation
+
+This proposal adds a `pattern!` macro that can be used to specify a syntax tree
+pattern to search for. A simple pattern is shown below:
+
+```rust
+pattern!{
+ my_pattern: Expr =
+ Lit(Bool(false))
+}
+```
+
+This macro call defines a pattern named `my_pattern` that can be matched against
+an `Expr` syntax tree node. The actual pattern (`Lit(Bool(false))` in this case)
+defines which syntax trees should match the pattern. This pattern matches
+expressions that are boolean literals with value `false`.
+
+The pattern can then be used to implement lints in the following way:
+
+```rust
+...
+
+impl EarlyLintPass for MyAwesomeLint {
+ fn check_expr(&mut self, cx: &EarlyContext, expr: &syntax::ast::Expr) {
+
+ if my_pattern(expr).is_some() {
+ cx.span_lint(
+ MY_AWESOME_LINT,
+ expr.span,
+ "This is a match for a simple pattern. Well done!",
+ );
+ }
+
+ }
+}
+```
+
+The `pattern!` macro call expands to a function `my_pattern` that expects a
+syntax tree expression as its argument and returns an `Option` that indicates
+whether the pattern matched.
+
+> Note: The result type is explained in more detail in [a later
+> section](#the-result-type). For now, it's enough to know that the result is
+> `Some` if the pattern matched and `None` otherwise.
+
+## Pattern syntax
+
+The following examples demonstate the pattern syntax:
+
+
+#### Any (`_`)
+
+The simplest pattern is the any pattern. It matches anything and is therefore
+similar to regex's `*`.
+
+```rust
+pattern!{
+ // matches any expression
+ my_pattern: Expr =
+ _
+}
+```
+
+#### Node (`<node-name>(<args>)`)
+
+Nodes are used to match a specific variant of an AST node. A node has a name and
+a number of arguments that depends on the node type. For example, the `Lit` node
+has a single argument that describes the type of the literal. As another
+example, the `If` node has three arguments describing the if's condition, then
+block and else block.
+
+```rust
+pattern!{
+ // matches any expression that is a literal
+ my_pattern: Expr =
+ Lit(_)
+}
+
+pattern!{
+ // matches any expression that is a boolean literal
+ my_pattern: Expr =
+ Lit(Bool(_))
+}
+
+pattern!{
+ // matches if expressions that have a boolean literal in their condition
+ // Note: The `_?` syntax here means that the else branch is optional and can be anything.
+ // This is discussed in more detail in the section `Repetition`.
+ my_pattern: Expr =
+ If( Lit(Bool(_)) , _, _?)
+}
+```
+
+
+#### Literal (`<lit>`)
+
+A pattern can also contain Rust literals. These literals match themselves.
+
+```rust
+pattern!{
+ // matches the boolean literal false
+ my_pattern: Expr =
+ Lit(Bool(false))
+}
+
+pattern!{
+ // matches the character literal 'x'
+ my_pattern: Expr =
+ Lit(Char('x'))
+}
+```
+
+#### Alternations (`a | b`)
+
+```rust
+pattern!{
+ // matches if the literal is a boolean or integer literal
+ my_pattern: Lit =
+ Bool(_) | Int(_)
+}
+
+pattern!{
+ // matches if the expression is a char literal with value 'x' or 'y'
+ my_pattern: Expr =
+ Lit( Char('x' | 'y') )
+}
+```
+
+#### Empty (`()`)
+
+The empty pattern represents an empty sequence or the `None` variant of an
+optional.
+
+```rust
+pattern!{
+ // matches if the expression is an empty array
+ my_pattern: Expr =
+ Array( () )
+}
+
+pattern!{
+ // matches if expressions that don't have an else clause
+ my_pattern: Expr =
+ If(_, _, ())
+}
+```
+
+#### Sequence (`<a> <b>`)
+
+```rust
+pattern!{
+ // matches the array [true, false]
+ my_pattern: Expr =
+ Array( Lit(Bool(true)) Lit(Bool(false)) )
+}
+```
+
+#### Repetition (`<a>*`, `<a>+`, `<a>?`, `<a>{n}`, `<a>{n,m}`, `<a>{n,}`)
+
+Elements may be repeated. The syntax for specifying repetitions is identical to
+[regex's syntax](https://docs.rs/regex/1.1.2/regex/#repetitions).
+
+```rust
+pattern!{
+ // matches arrays that contain 2 'x's as their last or second-last elements
+ // Examples:
+ // ['x', 'x'] match
+ // ['x', 'x', 'y'] match
+ // ['a', 'b', 'c', 'x', 'x', 'y'] match
+ // ['x', 'x', 'y', 'z'] no match
+ my_pattern: Expr =
+ Array( _* Lit(Char('x')){2} _? )
+}
+
+pattern!{
+ // matches if expressions that **may or may not** have an else block
+ // Attn: `If(_, _, _)` matches only ifs that **have** an else block
+ //
+ // | if with else block | if witout else block
+ // If(_, _, _) | match | no match
+ // If(_, _, _?) | match | match
+ // If(_, _, ()) | no match | match
+ my_pattern: Expr =
+ If(_, _, _?)
+}
+```
+
+#### Named submatch (`<a>#<name>`)
+
+```rust
+pattern!{
+ // matches character literals and gives the literal the name foo
+ my_pattern: Expr =
+ Lit(Char(_)#foo)
+}
+
+pattern!{
+ // matches character literals and gives the char the name bar
+ my_pattern: Expr =
+ Lit(Char(_#bar))
+}
+
+pattern!{
+ // matches character literals and gives the expression the name baz
+ my_pattern: Expr =
+ Lit(Char(_))#baz
+}
+```
+
+The reason for using named submatches is described in the section [The result
+type](#the-result-type).
+
+### Summary
+
+The following table gives an summary of the pattern syntax:
+
+| Syntax | Concept | Examples |
+|-------------------------|------------------|--------------------------------------------|
+|`_` | Any | `_` |
+|`<node-name>(<args>)` | Node | `Lit(Bool(true))`, `If(_, _, _)` |
+|`<lit>` | Literal | `'x'`, `false`, `101` |
+|`<a> \| <b>` | Alternation | `Char(_) \| Bool(_)` |
+|`()` | Empty | `Array( () )` |
+|`<a> <b>` | Sequence | `Tuple( Lit(Bool(_)) Lit(Int(_)) Lit(_) )` |
+|`<a>*` <br> `<a>+` <br> `<a>?` <br> `<a>{n}` <br> `<a>{n,m}` <br> `<a>{n,}` | Repetition <br> <br> <br> <br> <br><br> | `Array( _* )`, <br> `Block( Semi(_)+ )`, <br> `If(_, _, Block(_)?)`, <br> `Array( Lit(_){10} )`, <br> `Lit(_){5,10}`, <br> `Lit(Bool(_)){10,}` |
+|`<a>#<name>` | Named submatch | `Lit(Int(_))#foo` `Lit(Int(_#bar))` |
+
+
+## The result type
+
+A lot of lints require checks that go beyond what the pattern syntax described
+above can express. For example, a lint might want to check whether a node was
+created as part of a macro expansion or whether there's no comment above a node.
+Another example would be a lint that wants to match two nodes that have the same
+value (as needed by lints like `almost_swapped`). Instead of allowing users to
+write these checks into the pattern directly (which might make patterns hard to
+read), the proposed solution allows users to assign names to parts of a pattern
+expression. When matching a pattern against a syntax tree node, the return value
+will contain references to all nodes that were matched by these named
+subpatterns. This is similar to capture groups in regular expressions.
+
+For example, given the following pattern
+
+```rust
+pattern!{
+ // matches character literals
+ my_pattern: Expr =
+ Lit(Char(_#val_inner)#val)#val_outer
+}
+```
+
+one could get references to the nodes that matched the subpatterns in the
+following way:
+
+```rust
+...
+fn check_expr(expr: &syntax::ast::Expr) {
+ if let Some(result) = my_pattern(expr) {
+ result.val_inner // type: &char
+ result.val // type: &syntax::ast::Lit
+ result.val_outer // type: &syntax::ast::Expr
+ }
+}
+```
+
+The types in the `result` struct depend on the pattern. For example, the
+following pattern
+
+```rust
+pattern!{
+ // matches arrays of character literals
+ my_pattern_seq: Expr =
+ Array( Lit(_)*#foo )
+}
+```
+
+matches arrays that consist of any number of literal expressions. Because those
+expressions are named `foo`, the result struct contains a `foo` attribute which
+is a vector of expressions:
+
+```rust
+...
+if let Some(result) = my_pattern_seq(expr) {
+ result.foo // type: Vec<&syntax::ast::Expr>
+}
+```
+
+Another result type occurs when a name is only defined in one branch of an
+alternation:
+
+```rust
+pattern!{
+ // matches if expression is a boolean or integer literal
+ my_pattern_alt: Expr =
+ Lit( Bool(_#bar) | Int(_) )
+}
+```
+
+In the pattern above, the `bar` name is only defined if the pattern matches a
+boolean literal. If it matches an integer literal, the name isn't set. To
+account for this, the result struct's `bar` attribute is an option type:
+
+```rust
+...
+if let Some(result) = my_pattern_alt(expr) {
+ result.bar // type: Option<&bool>
+}
+```
+
+It's also possible to use a name in multiple alternation branches if they have
+compatible types:
+
+```rust
+pattern!{
+ // matches if expression is a boolean or integer literal
+ my_pattern_mult: Expr =
+ Lit(_#baz) | Array( Lit(_#baz) )
+}
+...
+if let Some(result) = my_pattern_mult(expr) {
+ result.baz // type: &syntax::ast::Lit
+}
+```
+
+Named submatches are a **flat** namespace and this is intended. In the example
+above, two different sub-structures are assigned to a flat name. I expect that
+for most lints, a flat namespace is sufficient and easier to work with than a
+hierarchical one.
+
+#### Two stages
+
+Using named subpatterns, users can write lints in two stages. First, a coarse
+selection of possible matches is produced by the pattern syntax. In the second
+stage, the named subpattern references can be used to do additional tests like
+asserting that a node hasn't been created as part of a macro expansion.
+
+## Implementing clippy lints using patterns
+
+As a "real-world" example, I re-implemented the `collapsible_if` lint using
+patterns. The code can be found
+[here](https://github.com/fkohlgrueber/rust-clippy-pattern/blob/039b07ecccaf96d6aa7504f5126720d2c9cceddd/clippy_lints/src/collapsible_if.rs#L88-L163).
+The pattern-based version passes all test cases that were written for
+`collapsible_if`.
+
+
+# Reference-level explanation
+
+## Overview
+
+The following diagram shows the dependencies between the main parts of the
+proposed solution:
+
+```
+ Pattern syntax
+ |
+ | parsing / lowering
+ v
+ PatternTree
+ ^
+ |
+ |
+ IsMatch trait
+ |
+ |
+ +---------------+-----------+---------+
+ | | | |
+ v v v v
+ syntax::ast rustc::hir syn ...
+```
+
+The pattern syntax described in the previous section is parsed / lowered into
+the so-called *PatternTree* data structure that represents a valid syntax tree
+pattern. Matching a *PatternTree* against an actual syntax tree (e.g. rust ast /
+hir or the syn ast, ...) is done using the *IsMatch* trait.
+
+The *PatternTree* and the *IsMatch* trait are introduced in more detail in the
+following sections.
+
+## PatternTree
+
+The core data structure of this RFC is the **PatternTree**.
+
+It's a data structure similar to rust's AST / HIR, but with the following
+differences:
+
+- The PatternTree doesn't contain parsing information like `Span`s
+- The PatternTree can represent alternatives, sequences and optionals
+
+The code below shows a simplified version of the current PatternTree:
+
+> Note: The current implementation can be found
+> [here](https://github.com/fkohlgrueber/pattern-matching/blob/dfb3bc9fbab69cec7c91e72564a63ebaa2ede638/pattern-match/src/pattern_tree.rs#L50-L96).
+
+
+```rust
+pub enum Expr {
+ Lit(Alt<Lit>),
+ Array(Seq<Expr>),
+ Block_(Alt<BlockType>),
+ If(Alt<Expr>, Alt<BlockType>, Opt<Expr>),
+ IfLet(
+ Alt<BlockType>,
+ Opt<Expr>,
+ ),
+}
+
+pub enum Lit {
+ Char(Alt<char>),
+ Bool(Alt<bool>),
+ Int(Alt<u128>),
+}
+
+pub enum Stmt {
+ Expr(Alt<Expr>),
+ Semi(Alt<Expr>),
+}
+
+pub enum BlockType {
+ Block(Seq<Stmt>),
+}
+```
+
+The `Alt`, `Seq` and `Opt` structs look like these:
+
+> Note: The current implementation can be found
+> [here](https://github.com/fkohlgrueber/pattern-matching/blob/dfb3bc9fbab69cec7c91e72564a63ebaa2ede638/pattern-match/src/matchers.rs#L35-L60).
+
+```rust
+pub enum Alt<T> {
+ Any,
+ Elmt(Box<T>),
+ Alt(Box<Self>, Box<Self>),
+ Named(Box<Self>, ...)
+}
+
+pub enum Opt<T> {
+ Any, // anything, but not None
+ Elmt(Box<T>),
+ None,
+ Alt(Box<Self>, Box<Self>),
+ Named(Box<Self>, ...)
+}
+
+pub enum Seq<T> {
+ Any,
+ Empty,
+ Elmt(Box<T>),
+ Repeat(Box<Self>, RepeatRange),
+ Seq(Box<Self>, Box<Self>),
+ Alt(Box<Self>, Box<Self>),
+ Named(Box<Self>, ...)
+}
+
+pub struct RepeatRange {
+ pub start: usize,
+ pub end: Option<usize> // exclusive
+}
+```
+
+## Parsing / Lowering
+
+The input of a `pattern!` macro call is parsed into a `ParseTree` first and then
+lowered to a `PatternTree`.
+
+Valid patterns depend on the *PatternTree* definitions. For example, the pattern
+`Lit(Bool(_)*)` isn't valid because the parameter type of the `Lit` variant of
+the `Expr` enum is `Any<Lit>` and therefore doesn't support repetition (`*`). As
+another example, `Array( Lit(_)* )` is a valid pattern because the parameter of
+`Array` is of type `Seq<Expr>` which allows sequences and repetitions.
+
+> Note: names in the pattern syntax correspond to *PatternTree* enum
+> **variants**. For example, the `Lit` in the pattern above refers to the `Lit`
+> variant of the `Expr` enum (`Expr::Lit`), not the `Lit` enum.
+
+## The IsMatch Trait
+
+The pattern syntax and the *PatternTree* are independant of specific syntax tree
+implementations (rust ast / hir, syn, ...). When looking at the different
+pattern examples in the previous sections, it can be seen that the patterns
+don't contain any information specific to a certain syntax tree implementation.
+In contrast, clippy lints currently match against ast / hir syntax tree nodes
+and therefore directly depend on their implementation.
+
+The connection between the *PatternTree* and specific syntax tree
+implementations is the `IsMatch` trait. It defines how to match *PatternTree*
+nodes against specific syntax tree nodes. A simplified implementation of the
+`IsMatch` trait is shown below:
+
+```rust
+pub trait IsMatch<O> {
+ fn is_match(&self, other: &'o O) -> bool;
+}
+```
+
+This trait needs to be implemented on each enum of the *PatternTree* (for the
+corresponding syntax tree types). For example, the `IsMatch` implementation for
+matching `ast::LitKind` against the *PatternTree's* `Lit` enum might look like
+this:
+
+```rust
+impl IsMatch<ast::LitKind> for Lit {
+ fn is_match(&self, other: &ast::LitKind) -> bool {
+ match (self, other) {
+ (Lit::Char(i), ast::LitKind::Char(j)) => i.is_match(j),
+ (Lit::Bool(i), ast::LitKind::Bool(j)) => i.is_match(j),
+ (Lit::Int(i), ast::LitKind::Int(j, _)) => i.is_match(j),
+ _ => false,
+ }
+ }
+}
+```
+
+All `IsMatch` implementations for matching the current *PatternTree* against
+`syntax::ast` can be found
+[here](https://github.com/fkohlgrueber/pattern-matching/blob/dfb3bc9fbab69cec7c91e72564a63ebaa2ede638/pattern-match/src/ast_match.rs).
+
+
+# Drawbacks
+
+#### Performance
+
+The pattern matching code is currently not optimized for performance, so it
+might be slower than hand-written matching code. Additionally, the two-stage
+approach (matching against the coarse pattern first and checking for additional
+properties later) might be slower than the current practice of checking for
+structure and additional properties in one pass. For example, the following lint
+
+```rust
+pattern!{
+ pat_if_without_else: Expr =
+ If(
+ _,
+ Block(
+ Expr( If(_, _, ())#inner )
+ | Semi( If(_, _, ())#inner )
+ )#then,
+ ()
+ )
+}
+...
+fn check_expr(&mut self, cx: &EarlyContext<'_>, expr: &ast::Expr) {
+ if let Some(result) = pat_if_without_else(expr) {
+ if !block_starts_with_comment(cx, result.then) {
+ ...
+ }
+}
+```
+
+first matches against the pattern and then checks that the `then` block doesn't
+start with a comment. Using clippy's current approach, it's possible to check
+for these conditions earlier:
+
+```rust
+fn check_expr(&mut self, cx: &EarlyContext<'_>, expr: &ast::Expr) {
+ if_chain! {
+ if let ast::ExprKind::If(ref check, ref then, None) = expr.node;
+ if !block_starts_with_comment(cx, then);
+ if let Some(inner) = expr_block(then);
+ if let ast::ExprKind::If(ref check_inner, ref content, None) = inner.node;
+ then {
+ ...
+ }
+ }
+}
+```
+
+Whether or not this causes performance regressions depends on actual patterns.
+If it turns out to be a problem, the pattern matching algorithms could be
+extended to allow "early filtering" (see the [Early Filtering](#early-filtering)
+section in Future Possibilities).
+
+That being said, I don't see any conceptual limitations regarding pattern
+matching performance.
+
+#### Applicability
+
+Even though I'd expect that a lot of lints can be written using the proposed
+pattern syntax, it's unlikely that all lints can be expressed using patterns. I
+suspect that there will still be lints that need to be implemented by writing
+custom pattern matching code. This would lead to mix within clippy's codebase
+where some lints are implemented using patterns and others aren't. This
+inconsistency might be considered a drawback.
+
+
+# Rationale and alternatives
+
+Specifying lints using syntax tree patterns has a couple of advantages compared
+to the current approach of manually writing matching code. First, syntax tree
+patterns allow users to describe patterns in a simple and expressive way. This
+makes it easier to write new lints for both novices and experts and also makes
+reading / modifying existing lints simpler.
+
+Another advantage is that lints are independent of specific syntax tree
+implementations (e.g. AST / HIR, ...). When these syntax tree implementations
+change, only the `IsMatch` trait implementations need to be adapted and existing
+lints can remain unchanged. This also means that if the `IsMatch` trait
+implementations were integrated into the compiler, updating the `IsMatch`
+implementations would be required for the compiler to compile successfully. This
+could reduce the number of times clippy breaks because of changes in the
+compiler. Another advantage of the pattern's independence is that converting an
+`EarlyLintPass` lint into a `LatePassLint` wouldn't require rewriting the whole
+pattern matching code. In fact, the pattern might work just fine without any
+adaptions.
+
+
+## Alternatives
+
+### Rust-like pattern syntax
+
+The proposed pattern syntax requires users to know the structure of the
+`PatternTree` (which is very similar to the AST's / HIR's structure) and also
+the pattern syntax. An alternative would be to introduce a pattern syntax that
+is similar to actual Rust syntax (probably like the `quote!` macro). For
+example, a pattern that matches `if` expressions that have `false` in their
+condition could look like this:
+
+```rust
+if false {
+ #[*]
+}
+```
+
+#### Problems
+
+Extending Rust syntax (which is quite complex by itself) with additional syntax
+needed for specifying patterns (alternations, sequences, repetisions, named
+submatches, ...) might become difficult to read and really hard to parse
+properly.
+
+For example, a pattern that matches a binary operation that has `0` on both
+sides might look like this:
+
+```
+0 #[*:BinOpKind] 0
+```
+
+Now consider this slightly more complex example:
+
+```
+1 + 0 #[*:BinOpKind] 0
+```
+
+The parser would need to know the precedence of `#[*:BinOpKind]` because it
+affects the structure of the resulting AST. `1 + 0 + 0` is parsed as `(1 + 0) +
+0` while `1 + 0 * 0` is parsed as `1 + (0 * 0)`. Since the pattern could be any
+`BinOpKind`, the precedence cannot be known in advance.
+
+Another example of a problem would be named submatches. Take a look at this
+pattern:
+
+```rust
+fn test() {
+ 1 #foo
+}
+```
+
+Which node is `#foo` referring to? `int`, `ast::Lit`, `ast::Expr`, `ast::Stmt`?
+Naming subpatterns in a rust-like syntax is difficult because a lot of AST nodes
+don't have a syntactic element that can be used to put the name tag on. In these
+situations, the only sensible option would be to assign the name tag to the
+outermost node (`ast::Stmt` in the example above), because the information of
+all child nodes can be retrieved through the outermost node. The problem with
+this then would be that accessing inner nodes (like `ast::Lit`) would again
+require manual pattern matching.
+
+In general, Rust syntax contains a lot of code structure implicitly. This
+structure is reconstructed during parsing (e.g. binary operations are
+reconstructed using operator precedence and left-to-right) and is one of the
+reasons why parsing is a complex task. The advantage of this approach is that
+writing code is simpler for users.
+
+When writing *syntax tree patterns*, each element of the hierarchy might have
+alternatives, repetitions, etc.. Respecting that while still allowing
+human-friendly syntax that contains structure implicitly seems to be really
+complex, if not impossible.
+
+Developing such a syntax would also require to maintain a custom parser that is
+at least as complex as the Rust parser itself. Additionally, future changes in
+the Rust syntax might be incompatible with such a syntax.
+
+In summary, I think that developing such a syntax would introduce a lot of
+complexity to solve a relatively minor problem.
+
+The issue of users not knowing about the *PatternTree* structure could be solved
+by a tool that, given a rust program, generates a pattern that matches only this
+program (similar to the clippy author lint).
+
+For some simple cases (like the first example above), it might be possible to
+successfully mix Rust and pattern syntax. This space could be further explored
+in a future extension.
+
+# Prior art
+
+The pattern syntax is heavily inspired by regular expressions (repetitions,
+alternatives, sequences, ...).
+
+From what I've seen until now, other linters also implement lints that directly
+work on syntax tree data structures, just like clippy does currently. I would
+therefore consider the pattern syntax to be *new*, but please correct me if I'm
+wrong.
+
+# Unresolved questions
+
+#### How to handle multiple matches?
+
+When matching a syntax tree node against a pattern, there are possibly multiple
+ways in which the pattern can be matched. A simple example of this would be the
+following pattern:
+
+```rust
+pattern!{
+ my_pattern: Expr =
+ Array( _* Lit(_)+#literals)
+}
+```
+
+This pattern matches arrays that end with at least one literal. Now given the
+array `[x, 1, 2]`, should `1` be matched as part of the `_*` or the `Lit(_)+`
+part of the pattern? The difference is important because the named submatch
+`#literals` would contain 1 or 2 elements depending how the pattern is matched.
+In regular expressions, this problem is solved by matching "greedy" by default
+and "non-greedy" optionally.
+
+I haven't looked much into this yet because I don't know how relevant it is for
+most lints. The current implementation simply returns the first match it finds.
+
+# Future possibilities
+
+#### Implement rest of Rust Syntax
+
+The current project only implements a small part of the Rust syntax. In the
+future, this should incrementally be extended to more syntax to allow
+implementing more lints. Implementing more of the Rust syntax requires extending
+the `PatternTree` and `IsMatch` implementations, but should be relatively
+straight-forward.
+
+#### Early filtering
+
+As described in the *Drawbacks/Performance* section, allowing additional checks
+during the pattern matching might be beneficial.
+
+The pattern below shows how this could look like:
+
+```rust
+pattern!{
+ pat_if_without_else: Expr =
+ If(
+ _,
+ Block(
+ Expr( If(_, _, ())#inner )
+ | Semi( If(_, _, ())#inner )
+ )#then,
+ ()
+ )
+ where
+ !in_macro(#then.span);
+}
+```
+
+The difference compared to the currently proposed two-stage filtering is that
+using early filtering, the condition (`!in_macro(#then.span)` in this case)
+would be evaluated as soon as the `Block(_)#then` was matched.
+
+Another idea in this area would be to introduce a syntax for backreferences.
+They could be used to require that multiple parts of a pattern should match the
+same value. For example, the `assign_op_pattern` lint that searches for `a = a
+op b` and recommends changing it to `a op= b` requires that both occurrances of
+`a` are the same. Using `=#...` as syntax for backreferences, the lint could be
+implemented like this:
+
+```rust
+pattern!{
+ assign_op_pattern: Expr =
+ Assign(_#target, Binary(_, =#target, _)
+}
+```
+
+#### Match descendant
+
+A lot of lints currently implement custom visitors that check whether any
+subtree (which might not be a direct descendant) of the current node matches
+some properties. This cannot be expressed with the proposed pattern syntax.
+Extending the pattern syntax to allow patterns like "a function that contains at
+least two return statements" could be a practical addition.
+
+#### Negation operator for alternatives
+
+For patterns like "a literal that is not a boolean literal" one currently needs
+to list all alternatives except the boolean case. Introducing a negation
+operator that allows to write `Lit(!Bool(_))` might be a good idea. This pattern
+would be eqivalent to `Lit( Char(_) | Int(_) )` (given that currently only three
+literal types are implemented).
+
+#### Functional composition
+
+Patterns currently don't have any concept of composition. This leads to
+repetitions within patterns. For example, one of the collapsible-if patterns
+currently has to be written like this:
+
+```rust
+pattern!{
+ pat_if_else: Expr =
+ If(
+ _,
+ _,
+ Block_(
+ Block(
+ Expr((If(_, _, _?) | IfLet(_, _?))#else_) |
+ Semi((If(_, _, _?) | IfLet(_, _?))#else_)
+ )#block_inner
+ )#block
+ ) |
+ IfLet(
+ _,
+ Block_(
+ Block(
+ Expr((If(_, _, _?) | IfLet(_, _?))#else_) |
+ Semi((If(_, _, _?) | IfLet(_, _?))#else_)
+ )#block_inner
+ )#block
+ )
+}
+```
+
+If patterns supported defining functions of subpatterns, the code could be
+simplified as follows:
+
+```rust
+pattern!{
+ fn expr_or_semi(expr: Expr) -> Stmt {
+ Expr(expr) | Semi(expr)
+ }
+ fn if_or_if_let(then: Block, else: Opt<Expr>) -> Expr {
+ If(_, then, else) | IfLet(then, else)
+ }
+ pat_if_else: Expr =
+ if_or_if_let(
+ _,
+ Block_(
+ Block(
+ expr_or_semi( if_or_if_let(_, _?)#else_ )
+ )#block_inner
+ )#block
+ )
+}
+```
+
+Additionally, common patterns like `expr_or_semi` could be shared between
+different lints.
+
+#### Clippy Pattern Author
+
+Another improvement could be to create a tool that, given some valid Rust
+syntax, generates a pattern that matches this syntax exactly. This would make
+starting to write a pattern easier. A user could take a look at the patterns
+generated for a couple of Rust code examples and use that information to write a
+pattern that matches all of them.
+
+This is similar to clippy's author lint.
+
+#### Supporting other syntaxes
+
+Most of the proposed system is language-agnostic. For example, the pattern
+syntax could also be used to describe patterns for other programming languages.
+
+In order to support other languages' syntaxes, one would need to implement
+another `PatternTree` that sufficiently describes the languages' AST and
+implement `IsMatch` for this `PatternTree` and the languages' AST.
+
+One aspect of this is that it would even be possible to write lints that work on
+the pattern syntax itself. For example, when writing the following pattern
+
+
+```rust
+pattern!{
+ my_pattern: Expr =
+ Array( Lit(Bool(false)) Lit(Bool(false)) )
+}
+```
+
+a lint that works on the pattern syntax's AST could suggest using this pattern
+instead:
+
+```rust
+pattern!{
+ my_pattern: Expr =
+ Array( Lit(Bool(false)){2} )
+}
+```
+
+In the future, clippy could use this system to also provide lints for custom
+syntaxes like those found in macros.