diff options
Diffstat (limited to 'src/doc/rustc-dev-guide/src/solve')
-rw-r--r-- | src/doc/rustc-dev-guide/src/solve/canonicalization.md | 84 | ||||
-rw-r--r-- | src/doc/rustc-dev-guide/src/solve/coinduction.md | 250 | ||||
-rw-r--r-- | src/doc/rustc-dev-guide/src/solve/the-solver.md | 17 | ||||
-rw-r--r-- | src/doc/rustc-dev-guide/src/solve/trait-solving.md | 114 |
4 files changed, 465 insertions, 0 deletions
diff --git a/src/doc/rustc-dev-guide/src/solve/canonicalization.md b/src/doc/rustc-dev-guide/src/solve/canonicalization.md new file mode 100644 index 000000000..a14be5216 --- /dev/null +++ b/src/doc/rustc-dev-guide/src/solve/canonicalization.md @@ -0,0 +1,84 @@ +# Canonicalization + +Canonicalization is the process of *isolating* a value from its context and is necessary +for global caching of goals which include inference variables. + +The idea is that given the goals `u32: Trait<?x>` and `u32: Trait<?y>`, where `?x` and `?y` +are two different currently unconstrained inference variables, we should get the same result +for both goals. We can therefore prove *the canonical query* `exists<T> u32: Trait<T>` once +and reuse the result. + +Let's first go over the way canonical queries work and then dive into the specifics of +how canonicalization works. + +## A walkthrough of canonical queries + +To make this a bit easier, let's use the trait goal `u32: Trait<?x>` as an example with the +assumption that the only relevant impl is `impl<T> Trait<Vec<T>> for u32`. + +### Canonicalizing the input + +We start by *canonicalizing* the goal, replacing inference variables with existential and +placeholders with universal bound variables. This would result in the *canonical goal* +`exists<T> u32: Trait<T>`. + +We remember the original values of all bound variables in the original context. Here this would +map `T` back to `?x`. These original values are used later on when dealing with the query +response. + +We now call the canonical query with the canonical goal. + +### Instantiating the canonical goal inside of the query + +To actually try to prove the canonical goal we start by instantiating the bound variables with +inference variables and placeholders again. + +This happens inside of the query in a completely separate `InferCtxt`. Inside of the query we +now have a goal `u32: Trait<?0>`. We also remember which value we've used to instantiate the bound +variables in the canonical goal, which maps `T` to `?0`. + +We now compute the goal `u32: Trait<?0>` and figure out that this holds, but we've constrained +`?0` to `Vec<?1>`. We finally convert this result to something useful to the caller. + +### Canonicalizing the query response + +We have to return to the caller both whether the goal holds, and the inference constraints +from inside of the query. + +To return the inference results to the caller we canonicalize the mapping from bound variables +to the instantiated values in the query. This means that the query response is `Certainty::Yes` +and a mapping from `T` to `exists<U> Vec<U>`. + +### Instantiating the query response + +The caller now has to apply the constraints returned by the query. For this they first +instantiate the bound variables of the canonical response with inference variables and +placeholders again, so the mapping in the response is now from `T` to `Vec<?z>`. + +It now equates the original value of `T` (`?x`) with the value for `T` in the +response (`Vec<?z>`), which correctly constrains `?x` to `Vec<?z>`. + +## `ExternalConstraints` + +Computing a trait goal may not only constrain inference variables, it can also add region +obligations, e.g. given a goal `(): AOutlivesB<'a, 'b>` we would like to return the fact that +`'a: 'b` has to hold. + +This is done by not only returning the mapping from bound variables to the instantiated values +from the query but also extracting additional `ExternalConstraints` from the `InferCtxt` context +while building the response. + +## How exactly does canonicalization work + +TODO: link to code once the PR lands and elaborate + +- types and consts: infer to existentially bound var, placeholder to universally bound var, + considering universes +- generic parameters in the input get treated as placeholders in the root universe +- all regions in the input get all mapped to existentially bound vars and we "uniquify" them. + `&'a (): Trait<'a>` gets canonicalized to `exists<'0, '1> &'0 (): Trait<'1>`. We do not care + about their universes and simply put all regions into the highest universe of the input. +- once we collected all canonical vars we compress their universes, see comment in `finalize`. +- in the output everything in a universe of the caller gets put into the root universe and only + gets its correct universe when we unify the var values with the orig values of the caller +- we do not uniquify regions in the response and don't canonicalize `'static`
\ No newline at end of file diff --git a/src/doc/rustc-dev-guide/src/solve/coinduction.md b/src/doc/rustc-dev-guide/src/solve/coinduction.md new file mode 100644 index 000000000..c682e002d --- /dev/null +++ b/src/doc/rustc-dev-guide/src/solve/coinduction.md @@ -0,0 +1,250 @@ +# Coinduction + +The trait solver may use coinduction when proving goals. +Coinduction is fairly subtle so we're giving it its own chapter. + +## Coinduction and induction + +With induction, we recursively apply proofs until we end up with a finite proof tree. +Consider the example of `Vec<Vec<Vec<u32>>>: Debug` which results in the following tree. + +- `Vec<Vec<Vec<u32>>>: Debug` + - `Vec<Vec<u32>>: Debug` + - `Vec<u32>: Debug` + - `u32: Debug` + +This tree is finite. But not all goals we would want to hold have finite proof trees, +consider the following example: + +```rust +struct List<T> { + value: T, + next: Option<Box<List<T>>>, +} +``` + +For `List<T>: Send` to hold all its fields have to recursively implement `Send` as well. +This would result in the following proof tree: + +- `List<T>: Send` + - `T: Send` + - `Option<Box<List<T>>>: Send` + - `Box<List<T>>: Send` + - `List<T>: Send` + - `T: Send` + - `Option<Box<List<T>>>: Send` + - `Box<List<T>>: Send` + - ... + +This tree would be infinitely large which is exactly what coinduction is about. + +> To **inductively** prove a goal you need to provide a finite proof tree for it. +> To **coinductively** prove a goal the provided proof tree may be infinite. + +## Why is coinduction correct + +When checking whether some trait goals holds, we're asking "does there exist an `impl` +which satisfies this bound". Even if are infinite chains of nested goals, we still have a +unique `impl` which should be used. + +## How to implement coinduction + +While our implementation can not check for coinduction by trying to construct an infinite +tree as that would take infinite resources, it still makes sense to think of coinduction +from this perspective. + +As we cannot check for infinite trees, we instead search for patterns for which we know that +they would result in an infinite proof tree. The currently pattern we detect are (canonical) +cycles. If `T: Send` relies on `T: Send` then it's pretty clear that this will just go on forever. + +With cycles we have to be careful with caching. Because of canonicalization of regions and +inference variables encountering a cycle doesn't mean that we would get an infinite proof tree. +Looking at the following example: +```rust +trait Foo {} +struct Wrapper<T>(T); + +impl<T> Foo for Wrapper<Wrapper<T>> +where + Wrapper<T>: Foo +{} +``` +Proving `Wrapper<?0>: Foo` uses the impl `impl<T> Foo for Wrapper<Wrapper<T>>` which constrains +`?0` to `Wrapper<?1>` and then requires `Wrapper<?1>: Foo`. Due to canonicalization this would be +detected as a cycle. + +The idea to solve is to return a *provisional result* whenever we detect a cycle and repeatedly +retry goals until the *provisional result* is equal to the final result of that goal. We +start out by using `Yes` with no constraints as the result and then update it to the result of +the previous iteration whenever we have to rerun. + +TODO: elaborate here. We use the same approach as chalk for coinductive cycles. +Note that the treatment for inductive cycles currently differs by simply returning `Overflow`. +See [the relevant chapters][chalk] in the chalk book. + +[chalk]: https://rust-lang.github.io/chalk/book/recursive/inductive_cycles.html + + +## Future work + +We currently only consider auto-traits, `Sized`, and `WF`-goals to be coinductive. +In the future we pretty much intend for all goals to be coinductive. +Lets first elaborate on why allowing more coinductive proofs is even desirable. + +### Recursive data types already rely on coinduction... + +...they just tend to avoid them in the trait solver. + +```rust +enum List<T> { + Nil, + Succ(T, Box<List<T>>), +} + +impl<T: Clone> Clone for List<T> { + fn clone(&self) -> Self { + match self { + List::Nil => List::Nil, + List::Succ(head, tail) => List::Succ(head.clone(), tail.clone()), + } + } +} +``` + +We are using `tail.clone()` in this impl. For this we have to prove `Box<List<T>>: Clone` +which requires `List<T>: Clone` but that relies on the impl which we are currently checking. +By adding that requirement to the `where`-clauses of the impl, which is what we would +do with [perfect derive], we move that cycle into the trait solver and [get an error][ex1]. + +### Recursive data types + +We also need coinduction to reason about recursive types containing projections, +e.g. the following currently fails to compile even though it should be valid. +```rust +use std::borrow::Cow; +pub struct Foo<'a>(Cow<'a, [Foo<'a>]>); +``` +This issue has been known since at least 2015, see +[#23714](https://github.com/rust-lang/rust/issues/23714) if you want to know more. + +### Explicitly checked implied bounds + +When checking an impl, we assume that the types in the impl headers are well-formed. +This means that when using instantiating the impl we have to prove that's actually the case. +[#100051](https://github.com/rust-lang/rust/issues/100051) shows that this is not the case. +To fix this, we have to add `WF` predicates for the types in impl headers. +Without coinduction for all traits, this even breaks `core`. + +```rust +trait FromResidual<R> {} +trait Try: FromResidual<<Self as Try>::Residual> { + type Residual; +} + +struct Ready<T>(T); +impl<T> Try for Ready<T> { + type Residual = Ready<()>; +} +impl<T> FromResidual<<Ready<T> as Try>::Residual> for Ready<T> {} +``` + +When checking that the impl of `FromResidual` is well formed we get the following cycle: + +The impl is well formed if `<Ready<T> as Try>::Residual` and `Ready<T>` are well formed. +- `wf(<Ready<T> as Try>::Residual)` requires +- `Ready<T>: Try`, which requires because of the super trait +- `Ready<T>: FromResidual<Ready<T> as Try>::Residual>`, **because of implied bounds on impl** +- `wf(<Ready<T> as Try>::Residual)` :tada: **cycle** + +### Issues when extending coinduction to more goals + +There are some additional issues to keep in mind when extending coinduction. +The issues here are not relevant for the current solver. + +#### Implied super trait bounds + +Our trait system currently treats super traits, e.g. `trait Trait: SuperTrait`, +by 1) requiring that `SuperTrait` has to hold for all types which implement `Trait`, +and 2) assuming `SuperTrait` holds if `Trait` holds. + +Relying on 2) while proving 1) is unsound. This can only be observed in case of +coinductive cycles. Without cycles, whenever we rely on 2) we must have also +proven 1) without relying on 2) for the used impl of `Trait`. + +```rust +trait Trait: SuperTrait {} + +impl<T: Trait> Trait for T {} + +// Keeping the current setup for coinduction +// would allow this compile. Uff :< +fn sup<T: SuperTrait>() {} +fn requires_trait<T: Trait>() { sup::<T>() } +fn generic<T>() { requires_trait::<T>() } +``` +This is not really fundamental to coinduction but rather an existing property +which is made unsound because of it. + +##### Possible solutions + +The easiest way to solve this would be to completely remove 2) and always elaborate +`T: Trait` to `T: Trait` and `T: SuperTrait` outside of the trait solver. +This would allow us to also remove 1), but as we still have to prove ordinary +`where`-bounds on traits, that's just additional work. + +While one could imagine ways to disable cyclic uses of 2) when checking 1), +at least the ideas of myself - @lcnr - are all far to complex to be reasonable. + +#### `normalizes_to` goals and progress + +A `normalizes_to` goal represents the requirement that `<T as Trait>::Assoc` normalizes +to some `U`. This is achieved by defacto first normalizing `<T as Trait>::Assoc` and then +equating the resulting type with `U`. It should be a mapping as each projection should normalize +to exactly one type. By simply allowing infinite proof trees, we would get the following behavior: + +```rust +trait Trait { + type Assoc; +} + +impl Trait for () { + type Assoc = <() as Trait>::Assoc; +} +``` + +If we now compute `normalizes_to(<() as Trait>::Assoc, Vec<u32>)`, we would resolve the impl +and get the associated type `<() as Trait>::Assoc`. We then equate that with the expected type, +causing us to check `normalizes_to(<() as Trait>::Assoc, Vec<u32>)` again. +This just goes on forever, resulting in an infinite proof tree. + +This means that `<() as Trait>::Assoc` would be equal to any other type which is unsound. + +##### How to solve this + +**WARNING: THIS IS SUBTLE AND MIGHT BE WRONG** + +Unlike trait goals, `normalizes_to` has to be *productive*[^1]. A `normalizes_to` goal +is productive once the projection normalizes to a rigid type constructor, +so `<() as Trait>::Assoc` normalizing to `Vec<<() as Trait>::Assoc>` would be productive. + +A `normalizes_to` goal has two kinds of nested goals. Nested requirements needed to actually +normalize the projection, and the equality between the normalized projection and the +expected type. Only the equality has to be productive. A branch in the proof tree is productive +if it is either finite, or contains at least one `normalizes_to` where the alias is resolved +to a rigid type constructor. + +Alternatively, we could simply always treat the equate branch of `normalizes_to` as inductive. +Any cycles should result in infinite types, which aren't supported anyways and would only +result in overflow when deeply normalizing for codegen. + +experimentation and examples: https://hackmd.io/-8p0AHnzSq2VAE6HE_wX-w?view + +Another attempt at a summary. +- in projection eq, we must make progress with constraining the rhs +- a cycle is only ok if while equating we have a rigid ty on the lhs after norm at least once +- cycles outside of the recursive `eq` call of `normalizes_to` are always fine + +[^1]: related: https://coq.inria.fr/refman/language/core/coinductive.html#top-level-definitions-of-corecursive-functions + +[perfect derive]: https://smallcultfollowing.com/babysteps/blog/2022/04/12/implied-bounds-and-perfect-derive +[ex1]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0a9c3830b93a2380e6978d6328df8f72 diff --git a/src/doc/rustc-dev-guide/src/solve/the-solver.md b/src/doc/rustc-dev-guide/src/solve/the-solver.md new file mode 100644 index 000000000..61e6cad1c --- /dev/null +++ b/src/doc/rustc-dev-guide/src/solve/the-solver.md @@ -0,0 +1,17 @@ +# The solver + +Also consider reading the documentation for [the recursive solver in chalk][chalk] +as it is very similar to this implementation and also talks about limitations of this +approach. + +[chalk]: https://rust-lang.github.io/chalk/book/recursive.html + +The basic structure of the solver is a pure function +`fn evaluate_goal(goal: Goal<'tcx>) -> Response`. +While the actual solver is not fully pure to deal with overflow and cycles, we are +going to defer that for now. + +To deal with inference variables and to improve caching, we use +[canonicalization](./canonicalization.md). + +TODO: write the remaining code for this as well. diff --git a/src/doc/rustc-dev-guide/src/solve/trait-solving.md b/src/doc/rustc-dev-guide/src/solve/trait-solving.md new file mode 100644 index 000000000..71f6581c2 --- /dev/null +++ b/src/doc/rustc-dev-guide/src/solve/trait-solving.md @@ -0,0 +1,114 @@ +# Trait solving (new) + +This chapter describes how trait solving works with the new WIP solver located in +[`rustc_trait_selection/solve`][solve]. Feel free to also look at the docs for +[the current solver](../traits/resolution.md) and [the chalk solver](../traits/chalk.md) +can be found separately. + +## Core concepts + +The goal of the trait system is to check whether a given trait bound is satisfied. +Most notably when typechecking the body of - potentially generic - functions. +For example: + +```rust +fn uses_vec_clone<T: Clone>(x: Vec<T>) -> (Vec<T>, Vec<T>) { + (x.clone(), x) +} +``` +Here the call to `x.clone()` requires us to prove that `Vec<T>` implements `Clone` given +the assumption that `T: Clone` is true. We can assume `T: Clone` as that will be proven by +callers of this function. + +The concept of "prove the `Vec<T>: Clone` with the assumption `T: Clone`" is called a [`Goal`]. +Both `Vec<T>: Clone` and `T: Clone` are represented using [`Predicate`]. There are other +predicates, most notably equality bounds on associated items: `<Vec<T> as IntoIterator>::Item == T`. +See the `PredicateKind` enum for an exhaustive list. A `Goal` is represented as the `predicate` we +have to prove and the `param_env` in which this predicate has to hold. + +We prove goals by checking whether each possible [`Candidate`] applies for the given goal by +recursively proving its nested goals. For a list of possible candidates with examples, look at +[`CandidateSource`]. The most important candidates are `Impl` candidates, i.e. trait implementations +written by the user, and `ParamEnv` candidates, i.e. assumptions in our current environment. + +Looking at the above example, to prove `Vec<T>: Clone` we first use +`impl<T: Clone> Clone for Vec<T>`. To use this impl we have to prove the nested +goal that `T: Clone` holds. This can use the assumption `T: Clone` from the `ParamEnv` +which does not have any nested goals. Therefore `Vec<T>: Clone` holds. + +The trait solver can either return success, ambiguity or an error as a [`CanonicalResponse`]. +For success and ambiguity it also returns constraints inference and region constraints. + +## Requirements + +Before we dive into the new solver lets first take the time to go through all of our requirements +on the trait system. We can then use these to guide our design later on. + +TODO: elaborate on these rules and get more precise about their meaning. +Also add issues where each of these rules have been broken in the past +(or still are). + +### 1. The trait solver has to be *sound* + +This means that we must never return *success* for goals for which no `impl` exists. That would +simply be unsound by assuming a trait is implemented even though it is not. When using predicates +from the `where`-bounds, the `impl` will be proved by the user of the item. + +### 2. If type checker solves generic goal concrete instantiations of that goal have the same result + +Pretty much: If we successfully typecheck a generic function concrete instantiations +of that function should also typeck. We should not get errors post-monomorphization. +We can however get overflow as in the following snippet: + +```rust +fn foo<T: Trait>(x: ) +``` + +### 3. Trait goals in empty environments are proven by a unique impl + +If a trait goal holds with an empty environment, there is a unique `impl`, +either user-defined or builtin, which is used to prove that goal. + +This is necessary for codegen to select a unique method. +An exception here are *marker traits* which are allowed to overlap. + +### 4. Normalization in empty environments results in a unique type + +Normalization for alias types/consts has a unique result. Otherwise we can easily implement +transmute in safe code. Given the following function, we have to make sure that the input and +output types always get normalized to the same concrete type. +```rust +fn foo<T: Trait>( + x: <T as Trait>::Assoc +) -> <T as Trait>::Assoc { + x +} +``` + +### 5. During coherence trait solving has to be complete + +During coherence we never return *error* for goals which can be proven. This allows overlapping +impls which would break rule 3. + +### 6. Trait solving must be (free) lifetime agnostic + +Trait solving during codegen should have the same result as during typeck. As we erase +all free regions during codegen we must not rely on them during typeck. A noteworthy example +is special behavior for `'static`. + +### 7. Removing ambiguity makes strictly more things compile + +We *should* not rely on ambiguity for things to compile. +Not doing that will cause future improvements to be breaking changes. + +### 8. semantic equality implies structural equality + +Two types being equal in the type system must mean that they have the same `TypeId`. + + +[solve]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/solve/index.html +[`Goal`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/solve/struct.Goal.html +[`Predicate`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Predicate.html +[`Candidate`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/solve/assembly/struct.Candidate.html +[`CandidateSource`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/solve/assembly/enum.CandidateSource.html +[`CanonicalResponse`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_selection/solve/type.CanonicalResponse.html |