Save liveness results for DestinationPropagation
`DestinationPropagation` needs to verify that merge candidates do not conflict with each other. This is done by verifying that a local is not live when its counterpart is written to.
To get the liveness information, the pass runs `MaybeLiveLocals` dataflow analysis repeatedly, once for each propagation round. This is quite costly, and the main driver for the perf impact on `ucd` and `diesel`. (See https://github.com/rust-lang/rust/pull/115105#issuecomment-1689205908)
In order to mitigate this cost, this PR proposes to save the result of the analysis into a `SparseIntervalMatrix`, and mirror merges of locals into that matrix: `liveness(destination) := liveness(destination) union liveness(source)`.
<details>
<summary>Proof</summary>
We denote by `'` all the quantities of the transformed program. Let $\varphi$ be a mapping of locals, which maps `source` to `destination`, and is identity otherwise. The exact liveness set after a statement is $out'(statement)$, and the proposed liveness set is $\varphi(out(statement))$.
Consider a statement. Suppose that the output state verifies $out' \subset phi(out)$. We want to prove that $in' \subset \varphi(in)$ where $in = (out - kill) \cup gen$, and conclude by induction.
We have 2 cases: either that statement is kept with locals renumbered by $\varphi$, or it is a tautological assignment and it removed.
1. If the statement is kept: the gen-set and the kill-set of $statement' = \varphi(statement)$ are $gen' = \varphi(gen)$ and $kill' = \varphi(kill)$ exactly.
From soundness requirement 3, $\varphi(in)$ is disjoint from $\varphi(kill)$.
This implies that $\varphi(out - kill)$ is disjoint from $\varphi(kill)$, and so $\varphi(out - kill) = \varphi(out) - \varphi(kill)$. Then $\varphi(in) = (\varphi(out) - \varphi(kill)) \cup \varphi(gen) = (\varphi(out) - kill') \cup gen'$.
We can conclude that $out' \subset \varphi(out) \implies in' \subset \varphi(in)$.
2. If the statement is removed. As $\varphi(statement)$ is a tautological assignment, we know that $\varphi(gen) = \varphi(kill) = \\{ destination \\}$, while $gen' = kill' = \emptyset$. So $\varphi(in) = \varphi(out) \cup \\{ destination \\}$. Then $in' = out' \subset out \subset \varphi(in)$.
By recursion, we can conclude by that $in' \subset \varphi(in)$ everywhere.
</details>
This approximate liveness results is only suboptimal if there are locals that fully disappear from the CFG due to an assignment cycle. These cases are quite unlikely, so we do not bother with them.
This change allows to reduce the perf impact of DestinationPropagation by half on diesel and ucd (https://github.com/rust-lang/rust/pull/115105#issuecomment-1694701904).
cc ````@JakobDegen````
fix fn/const items implied bounds and wf check (rebase)
A rebase of #104098, see that PR for discussion. This is pretty much entirely the work of `@aliemjay.` I received his permission for this rebase.
---
These are two distinct changes (edit: actually three, see below):
1. Wf-check all fn item args. This is a soundness fix.
Fixes#104005
2. Use implied bounds from impl header in borrowck of associated functions/consts. This strictly accepts more code and helps to mitigate the impact of other breaking changes.
Fixes#98852Fixes#102611
The first is a breaking change and will likely have a big impact without the the second one. See the first commit for how it breaks libstd.
Landing the second one without the first will allow more incorrect code to pass. For example an exploit of #104005 would be as simple as:
```rust
use core::fmt::Display;
trait ExtendLt<Witness> {
fn extend(self) -> Box<dyn Display>;
}
impl<T: Display> ExtendLt<&'static T> for T {
fn extend(self) -> Box<dyn Display> {
Box::new(self)
}
}
fn main() {
let val = (&String::new()).extend();
println!("{val}");
}
```
The third change is to to check WF of user type annotations before normalizing them (fixes#104764, fixes#104763). It is mutually dependent on the second change above: an attempt to land it separately in #104746 caused several crater regressions that can all be mitigated by using the implied from the impl header. It is also necessary for the soundness of associated consts that use the implied bounds of impl header. See #104763 and how the third commit fixes the soundness issue in `tests/ui/wf/wf-associated-const.rs` that was introduces by the previous commit.
r? types
Simplify `closure_env_ty` and `closure_env_param`
Random cleanup that I found when working on async closures. This makes it easier to separate the latter into a new tykind.
Use `zip_eq` to enforce that things being zipped have equal sizes
Some `zip`s are best enforced to be equal, since size mismatches suggest deeper bugs in the compiler.
`OutputTypeParameterMismatch` -> `SignatureMismatch`
I'm probably missing something that made this rename more complicated. What did you end up getting stuck on when renaming this selection error, `@lcnr?`
**also** I renamed the `FulfillmentErrorCode` variants. This is just churn but I wanted to do it forever. I can move it out of this PR if desired.
r? lcnr
But we can't easily switch from `Vec<Diagnostic>` to
`Vec<DiagnosticBuilder<G>>` because there's a mix of errors and warnings
which result in different `G` types. So we must make
`DiagnosticBuilder::into_diagnostic` public, but that's ok, and it will
get more use in subsequent commits.
In #119606 I added them and used a `_mv` suffix, but that wasn't great.
A `with_` prefix has three different existing uses.
- Constructors, e.g. `Vec::with_capacity`.
- Wrappers that provide an environment to execute some code, e.g.
`with_session_globals`.
- Consuming chaining methods, e.g. `Span::with_{lo,hi,ctxt}`.
The third case is exactly what we want, so this commit changes
`DiagnosticBuilder::foo_mv` to `DiagnosticBuilder::with_foo`.
Thanks to @compiler-errors for the suggestion.
We have `span_delayed_bug` and often pass it a `DUMMY_SP`. This commit
adds `delayed_bug`, which matches pairs like `err`/`span_err` and
`warn`/`span_warn`.
Because it takes an error code after the span. This avoids the confusing
overlap with the `DiagCtxt::struct_span_err` method, which doesn't take
an error code.
This works for most of its call sites. This is nice, because `emit` very
much makes sense as a consuming operation -- indeed,
`DiagnosticBuilderState` exists to ensure no diagnostic is emitted
twice, but it uses runtime checks.
For the small number of call sites where a consuming emit doesn't work,
the commit adds `DiagnosticBuilder::emit_without_consuming`. (This will
be removed in subsequent commits.)
Likewise, `emit_unless` becomes consuming. And `delay_as_bug` becomes
consuming, while `delay_as_bug_without_consuming` is added (which will
also be removed in subsequent commits.)
All this requires significant changes to `DiagnosticBuilder`'s chaining
methods. Currently `DiagnosticBuilder` method chaining uses a
non-consuming `&mut self -> &mut Self` style, which allows chaining to
be used when the chain ends in `emit()`, like so:
```
struct_err(msg).span(span).emit();
```
But it doesn't work when producing a `DiagnosticBuilder` value,
requiring this:
```
let mut err = self.struct_err(msg);
err.span(span);
err
```
This style of chaining won't work with consuming `emit` though. For
that, we need to use to a `self -> Self` style. That also would allow
`DiagnosticBuilder` production to be chained, e.g.:
```
self.struct_err(msg).span(span)
```
However, removing the `&mut self -> &mut Self` style would require that
individual modifications of a `DiagnosticBuilder` go from this:
```
err.span(span);
```
to this:
```
err = err.span(span);
```
There are *many* such places. I have a high tolerance for tedious
refactorings, but even I gave up after a long time trying to convert
them all.
Instead, this commit has it both ways: the existing `&mut self -> Self`
chaining methods are kept, and new `self -> Self` chaining methods are
added, all of which have a `_mv` suffix (short for "move"). Changes to
the existing `forward!` macro lets this happen with very little
additional boilerplate code. I chose to add the suffix to the new
chaining methods rather than the existing ones, because the number of
changes required is much smaller that way.
This doubled chainging is a bit clumsy, but I think it is worthwhile
because it allows a *lot* of good things to subsequently happen. In this
commit, there are many `mut` qualifiers removed in places where
diagnostics are emitted without being modified. In subsequent commits:
- chaining can be used more, making the code more concise;
- more use of chaining also permits the removal of redundant diagnostic
APIs like `struct_err_with_code`, which can be replaced easily with
`struct_err` + `code_mv`;
- `emit_without_diagnostic` can be removed, which simplifies a lot of
machinery, removing the need for `DiagnosticBuilderState`.
Check yield terminator's resume type in borrowck
In borrowck, we didn't check that the lifetimes of the `TerminatorKind::Yield`'s `resume_place` were actually compatible with the coroutine's signature. That means that the lifetimes were totally going unchecked. Whoops!
This PR implements this checking.
Fixes#119564
r? types
Make closures carry their own ClosureKind
Right now, we use the "`movability`" field of `hir::Closure` to distinguish a closure and a coroutine. This is paired together with the `CoroutineKind`, which is located not in the `hir::Closure`, but the `hir::Body`. This is strange and redundant.
This PR introduces `ClosureKind` with two variants -- `Closure` and `Coroutine`, which is put into `hir::Closure`. The `CoroutineKind` is thus removed from `hir::Body`, and `Option<Movability>` no longer needs to be a stand-in for "is this a closure or a coroutine".
r? eholk
Split coroutine desugaring kind from source
What a coroutine is desugared from (gen/async gen/async) should be separate from where it comes (fn/block/closure).
`IntoDiagnostic` defaults to `ErrorGuaranteed`, because errors are the
most common diagnostic level. It makes sense to do likewise for the
closely-related (and much more widely used) `DiagnosticBuilder` type,
letting us write `DiagnosticBuilder<'a, ErrorGuaranteed>` as just
`DiagnosticBuilder<'a>`. This cuts over 200 lines of code due to many
multi-line things becoming single line things.
Currently, `emit_diagnostic` takes `&mut self`.
This commit changes it so `emit_diagnostic` takes `self` and the new
`emit_diagnostic_without_consuming` function takes `&mut self`.
I find the distinction useful. The former case is much more common, and
avoids a bunch of `mut` and `&mut` occurrences. We can also restrict the
latter with `pub(crate)` which is nice.
Renamings:
- find -> opt_hir_node
- get -> hir_node
- find_by_def_id -> opt_hir_node_by_def_id
- get_by_def_id -> hir_node_by_def_id
Fix rebase changes using removed methods
Use `tcx.hir_node_by_def_id()` whenever possible in compiler
Fix clippy errors
Fix compiler
Apply suggestions from code review
Co-authored-by: Vadim Petrochenkov <vadim.petrochenkov@gmail.com>
Add FIXME for `tcx.hir()` returned type about its removal
Simplify with with `tcx.hir_node_by_def_id`
detects redundant imports that can be eliminated.
for #117772 :
In order to facilitate review and modification, split the checking code and
removing redundant imports code into two PR.
`GenKillAnalysis` has five methods that take a transfer function arg:
- `statement_effect`
- `before_statement_effect`
- `terminator_effect`
- `before_terminator_effect`
- `call_return_effect`
All the transfer function args have type `&mut impl GenKill<Self::Idx>`,
except for `terminator_effect`, which takes the simpler `Self::Domain`.
But only the first two need to be `impl GenKill`. The other
three can all be `Self::Domain`, just like `Analysis`. So this commit
changes the last two to take `Self::Domain`, making `GenKillAnalysis`
and `Analysis` more similar.
(Another idea would be to make all these methods `impl GenKill`. But
that doesn't work: `MaybeInitializedPlaces::terminator_effect` requires
the arg be `Self::Domain` so that `self_is_unwind_dead(place, state)`
can be called on it.)
This results in two non-generic types being used: `BorrowckResults` and
`BorrowckFlowState`. It's a net reduction in lines of code, and a little
easier to read.
It is used just once. With it removed, the relevant code is a little
boilerplate-y but much easier to read, and is the same length. Overall I
think it's an improvement.
When encountering multiple mutable borrows, suggest cloning and adding
derive annotations as needed.
```
error[E0596]: cannot borrow `sm.x` as mutable, as it is behind a `&` reference
--> $DIR/accidentally-cloning-ref-borrow-error.rs:32:9
|
LL | foo(&mut sm.x);
| ^^^^^^^^^ `sm` is a `&` reference, so the data it refers to cannot be borrowed as mutable
|
help: `Str` doesn't implement `Clone`, so this call clones the reference `&Str`
--> $DIR/accidentally-cloning-ref-borrow-error.rs:31:21
|
LL | let mut sm = sr.clone();
| ^^^^^^^
help: consider annotating `Str` with `#[derive(Clone)]`
|
LL + #[derive(Clone)]
LL | struct Str {
|
help: consider specifying this binding's type
|
LL | let mut sm: &mut Str = sr.clone();
| ++++++++++
```
```
error[E0596]: cannot borrow `*inner` as mutable, as it is behind a `&` reference
--> $DIR/issue-91206.rs:14:5
|
LL | inner.clear();
| ^^^^^ `inner` is a `&` reference, so the data it refers to cannot be borrowed as mutable
|
help: you can `clone` the `Vec<usize>` value and consume it, but this might not be your desired behavior
--> $DIR/issue-91206.rs:11:17
|
LL | let inner = client.get_inner_ref();
| ^^^^^^^^^^^^^^^^^^^^^^
help: consider specifying this binding's type
|
LL | let inner: &mut Vec<usize> = client.get_inner_ref();
| +++++++++++++++++
```
When encountering a move error, look for implementations of `Clone` for
the moved type. If there is one, check if all its obligations are met.
If they are, we suggest cloning without caveats. If they aren't, we
suggest cloning while mentioning the unmet obligations, potentially
suggesting `#[derive(Clone)]` when appropriate.
```
error[E0507]: cannot move out of a shared reference
--> $DIR/suggest-clone-when-some-obligation-is-unmet.rs:20:28
|
LL | let mut copy: Vec<U> = map.clone().into_values().collect();
| ^^^^^^^^^^^ ------------- value moved due to this method call
| |
| move occurs because value has type `HashMap<T, U, Hash128_1>`, which does not implement the `Copy` trait
|
note: `HashMap::<K, V, S>::into_values` takes ownership of the receiver `self`, which moves value
--> $SRC_DIR/std/src/collections/hash/map.rs:LL:COL
help: you could `clone` the value and consume it, if the `Hash128_1: Clone` trait bound could be satisfied
|
LL | let mut copy: Vec<U> = <HashMap<T, U, Hash128_1> as Clone>::clone(&map.clone()).into_values().collect();
| ++++++++++++++++++++++++++++++++++++++++++++ +
help: consider annotating `Hash128_1` with `#[derive(Clone)]`
|
LL + #[derive(Clone)]
LL | pub struct Hash128_1;
|
```
Fix#109429.
When going through auto-deref, the `<T as Clone>` impl sometimes needs
to be specified for rustc to actually clone the value and not the
reference.
```
error[E0507]: cannot move out of dereference of `S`
--> $DIR/needs-clone-through-deref.rs:15:18
|
LL | for _ in self.clone().into_iter() {}
| ^^^^^^^^^^^^ ----------- value moved due to this method call
| |
| move occurs because value has type `Vec<usize>`, which does not implement the `Copy` trait
|
note: `into_iter` takes ownership of the receiver `self`, which moves value
--> $SRC_DIR/core/src/iter/traits/collect.rs:LL:COL
help: you can `clone` the value and consume it, but this might not be your desired behavior
|
LL | for _ in <Vec<usize> as Clone>::clone(&self.clone()).into_iter() {}
| ++++++++++++++++++++++++++++++ +
```
CC #109429.
Liveness data is pushed from multiple parts of NLL. Instead of changing
the call sites to maintain live loans, move the latter to `LivenessValues` where
this liveness data is pushed to, and maintain live loans there.
This fixes the differences in polonius scopes on some CFGs where a
variable was dead in tracing but as a MIR terminator its regions were marked
live from "constraint generation"
Refactor NLL constraint generation and most of polonius fact generation
As discussed in #118175, NLL "constraint generation" is only about liveness, but currently also contains legacy polonius fact generation. The latter is quite messy, and this PR cleans this up to prepare for its future removal:
- splits polonius fact generation out of NLL constraint generation
- merges NLL constraint generation to its more natural place, liveness
- extracts all of the polonius fact generation from NLLs apart from MIR typeck (as fact generation is somewhat in a single place there already, but should be cleaned up) into its own explicit module, with a single entry point instead of many.
There should be no behavior changes, and tests seem to behave the same as master: without polonius, with legacy polonius, with the in-tree polonius.
I've split everything into smaller logical commits for easier review, as it required quite a bit of code to be split and moved around, but it should all be trivial changes.
r? `@matthewjasper`
to help review, this duplicates the existing NLL + polonius constraint
generation component, before splitting them up to only do what they
individually need.
Refactor borrowck liveness values
This PR starts cleaning up `rustc_borrowck`, in particular around liveness values:
- refactors simple names that make no sense anymore: either referring to older structures using region elements, or to bitset containers and values.
- improves comments and fixes others
- removes unused return values and unneeded generic arguments
r? `@matthewjasper`
Currently we always do this:
```
use rustc_fluent_macro::fluent_messages;
...
fluent_messages! { "./example.ftl" }
```
But there is no need, we can just do this everywhere:
```
rustc_fluent_macro::fluent_messages! { "./example.ftl" }
```
which is shorter.
The `fluent_messages!` macro produces uses of
`crate::{D,Subd}iagnosticMessage`, which means that every crate using
the macro must have this import:
```
use rustc_errors::{DiagnosticMessage, SubdiagnosticMessage};
```
This commit changes the macro to instead use
`rustc_errors::{D,Subd}iagnosticMessage`, which avoids the need for the
imports.