Dejargonize `subst`
In favor of #110793, replace almost every occurence of `subst` and `substitution` from rustc codes, but they still remains in subtrees under `src/tools/` like clippy and test codes (I'd like to replace them after this)
Fix async closures in CTFE
First commit renames `is_coroutine_or_closure` into `is_closure_like`, because `is_coroutine_or_closure_or_coroutine_closure` seems confusing and long.
Second commit fixes some forgotten cases where we want to handle `TyKind::CoroutineClosure` the same as closures and coroutines.
The test exercises the change to `ValidityVisitor::aggregate_field_path_elem` which is the source of #120946, but not the change to `UsedParamsNeedSubstVisitor`, though I feel like it's not that big of a deal. Let me know if you'd like for me to look into constructing a test for the latter, though I have no idea what it'd look like (we can't assert against `TooGeneric` anywhere?).
Fixes#120946
r? oli-obk cc ``@RalfJung``
Check that the ABI of the instance we are inlining is correct
When computing the `CallSite` in the mir inliner, double check that the instance of the function that we are inlining is compatible with the signature from the trait definition that we acquire from the MIR.
Fixes#120940
r? ``@oli-obk`` or ``@cjgillot``
Remove a bunch of dead parameters in functions
Found this kind of issue when working on https://github.com/rust-lang/rust/pull/119650
I wrote a trivial toy lint and manual review to find these.
Assert that params with the same *index* have the same *name*
Found this bug when trying to build libcore with the new solver, since it will canonicalize two params with the same index into *different* placeholders if those params differ by name.
Fold pointer operations in GVN
This PR proposes 2 combinations of cast operations in MIR GVN:
- a chain of `PtrToPtr` or `MutToConstPointer` casts can be folded together into a single `PtrToPtr` cast;
- we attempt to evaluate more ptr ops when there is no provenance.
In particular, this allows to read from static slices.
This is not yet sufficient to see through slice operations that use `PtrComponents` (because that's a union), but still a step forward.
r? `@ghost`
These crates all needed specialization for `newtype_index!`, which will no
longer be necessary when the current nightly eventually becomes the next
bootstrap compiler.
Fix more `ty::Error` ICEs in MIR passes
Fixes#120791 - Add a check for `ty::Error` in the `ByMove` coroutine pass
Fixes#120816 - Add a check for `ty::Error` in the MIR validator
Also a drive-by fix for a FIXME I had asked oli to add
r? oli-obk
Invert diagnostic lints.
That is, change `diagnostic_outside_of_impl` and `untranslatable_diagnostic` from `allow` to `deny`, because more than half of the compiler has been converted to use translated diagnostics.
This commit removes more `deny` attributes than it adds `allow` attributes, which proves that this change is warranted.
r? ````@davidtwco````
Toggle assert_unsafe_precondition in codegen instead of expansion
The goal of this PR is to make some of the unsafe precondition checks in the standard library available in debug builds. Some UI tests are included to verify that it does that.
The diff is large, but most of it is blessing mir-opt tests and I've also split up this PR so it can be reviewed commit-by-commit.
This PR:
1. Adds a new intrinsic, `debug_assertions` which is lowered to a new MIR NullOp, and only to a constant after monomorphization
2. Rewrites `assume_unsafe_precondition` to check the new intrinsic, and be monomorphic.
3. Skips codegen of the `assume` intrinsic in unoptimized builds, because that was silly before but with these checks it's *very* silly
4. The checks with the most overhead are `ptr::read`/`ptr::write` and `NonNull::new_unchecked`. I've simply added `#[cfg(debug_assertions)]` to the checks for `ptr::read`/`ptr::write` because I was unable to come up with any (good) ideas for decreasing their impact. But for `NonNull::new_unchecked` I found that the majority of callers can use a different function, often a safe one.
Yes, this PR slows down the compile time of some programs. But in our benchmark suite it's never more than 1% icount, and the average icount change in debug-full programs is 0.22%. I think that is acceptable for such an improvement in developer experience.
https://github.com/rust-lang/rust/issues/120539#issuecomment-1922687101
Fix mir pass ICE in the presence of other errors
fixes#120779
it is impossible to add a ui test for this, because it only reproduces in build-fail, but a test that also has errors in check-fail mode can't be made build-fail 🙃
I would have to add a run-make test or sth, which is overkill for such a tiny thing imo.
Remove unused args from functions
`#[instrument]` suppresses the unused arguments from a function, *and* suppresses unused methods too! This PR removes things which are only used via `#[instrument]` calls, and fixes some other errors (privacy?) that I will comment inline.
It's possible that some of these arguments were being passed in for the purposes of being instrumented, but I am unconvinced by most of them.
Rollup of 9 pull requests
Successful merges:
- #119592 (resolve: Unload speculatively resolved crates before freezing cstore)
- #120103 (Make it so that async-fn-in-trait is compatible with a concrete future in implementation)
- #120206 (hir: Make sure all `HirId`s have corresponding HIR `Node`s)
- #120214 (match lowering: consistently lower bindings deepest-first)
- #120688 (GVN: also turn moves into copies with projections)
- #120702 (docs: also check the inline stmt during redundant link check)
- #120727 (exhaustiveness: Prefer "`0..MAX` not covered" to "`_` not covered")
- #120734 (Add `SubdiagnosticMessageOp` as a trait alias.)
- #120739 (improve pretty printing for associated items in trait objects)
r? `@ghost`
`@rustbot` modify labels: rollup
MirPass: make name more const
Continues #120161, this time applied to `MirPass` instead of `MirLint`, locally shaves few (very few) instructions off.
r? ``@cjgillot``
coverage: Split out counter increment sites from BCB node/edge counters
This makes it possible for two nodes/edges in the coverage graph to share the same counter, without causing the instrumentor to inject unwanted duplicate counter-increment statements.
---
````@rustbot```` label +A-code-coverage
That is, change `diagnostic_outside_of_impl` and
`untranslatable_diagnostic` from `allow` to `deny`, because more than
half of the compiler has be converted to use translated diagnostics.
This commit removes more `deny` attributes than it adds `allow`
attributes, which proves that this change is warranted.
This sidesteps the normal span refinement code in cases where we know that we
are only dealing with the special signature span that represents having called
an async function.
This makes it possible for two nodes/edges in the coverage graph to share the
same counter, without causing the instrumentor to inject unwanted duplicate
counter-increment statements.
The query accept arbitrary DefIds, not just owner DefIds.
The return can be an `Option` because if there are no nodes, then it doesn't matter whether it's due to NonOwner or Phantom.
Also rename the query to `opt_hir_owner_nodes`.
Because it's almost always static.
This makes `impl IntoDiagnosticArg for DiagnosticArgValue` trivial,
which is nice.
There are a few diagnostics constructed in
`compiler/rustc_mir_build/src/check_unsafety.rs` and
`compiler/rustc_mir_transform/src/errors.rs` that now need symbols
converted to `String` with `to_string` instead of `&str` with `as_str`,
but that' no big deal, and worth it for the simplifications elsewhere.
Error codes are integers, but `String` is used everywhere to represent
them. Gross!
This commit introduces `ErrCode`, an integral newtype for error codes,
replacing `String`. It also introduces a constant for every error code,
e.g. `E0123`, and removes the `error_code!` macro. The constants are
imported wherever used with `use rustc_errors::codes::*`.
With the old code, we have three different ways to specify an error code
at a use point:
```
error_code!(E0123) // macro call
struct_span_code_err!(dcx, span, E0123, "msg"); // bare ident arg to macro call
\#[diag(name, code = "E0123")] // string
struct Diag;
```
With the new code, they all use the `E0123` constant.
```
E0123 // constant
struct_span_code_err!(dcx, span, E0123, "msg"); // constant
\#[diag(name, code = E0123)] // constant
struct Diag;
```
The commit also changes the structure of the error code definitions:
- `rustc_error_codes` now just defines a higher-order macro listing the
used error codes and nothing else.
- Because that's now the only thing in the `rustc_error_codes` crate, I
moved it into the `lib.rs` file and removed the `error_codes.rs` file.
- `rustc_errors` uses that macro to define everything, e.g. the error
code constants and the `DIAGNOSTIC_TABLES`. This is in its new
`codes.rs` file.
Remove unused/unnecessary features
~~The bulk of the actual code changes here is replacing try blocks with equivalent closures. I'm not entirely sure that's a good idea since it may have perf impact, happy to revert if that's the case/the change is unwanted.~~
I also removed a lot of `recursion_limit = "256"` since everything seems to build fine without that and most don't have any comment justifying it.
Remove coroutine info when building coroutine drop body
Coroutine drop shims are not themselves coroutines, so erase the "`coroutine`" field from the body so that helper fns like `yield_ty` and `coroutine_kind` properly return `None` for the drop shim.
coverage: Dismantle `Instrumentor` and flatten span refinement
This is a combination of two refactorings that are unrelated, but would otherwise have a merge conflict.
No functional changes, other than a small tweak to debug logging as part of rearranging some functions.
Ignoring whitespace is highly recommended, since most of the modified lines have just been reindented.
---
The first change is to dismantle `Instrumentor` into ordinary functions.
This is one of those cases where encapsulating several values into a struct ultimately hurts more than it helps. With everything stored as local variables in one main function, and passed explicitly into helper functions, it's easier to see what is used where, and make changes as necessary.
---
The second change is to flatten the functions for extracting/refining coverage spans.
Consolidating this code into flatter functions reduces the amount of pointer-chasing required to read and modify it.
Remove all ConstPropNonsense
We track all locals and projections on them ourselves within the const propagator and only use the InterpCx to actually do some low level operations or read from constants (via `OpTy` we get for said constants).
This helps moving the const prop lint out from the normal pipeline and running it just based on borrowck information. This in turn allows us to make progress on https://github.com/rust-lang/rust/pull/108730#issuecomment-1875557745
there are various follow up cleanups that can be done after this PR (e.g. not matching on Rvalue twice and doing binop checks twice), but lets try landing this one first.
r? `@RalfJung`
coverage: Don't instrument `#[automatically_derived]` functions
This PR makes the coverage instrumentor detect and skip functions that have [`#[automatically_derived]`](https://doc.rust-lang.org/reference/attributes/derive.html#the-automatically_derived-attribute) on their enclosing impl block.
Most notably, this means that methods generated by built-in derives (e.g. `Clone`, `Debug`, `PartialEq`) are now ignored by coverage instrumentation, and won't appear as executed or not-executed in coverage reports.
This is a noticeable change in user-visible behaviour, but overall I think it's a net improvement. For example, we've had a few user requests for this sort of change (e.g. #105055, https://github.com/rust-lang/rust/issues/84605#issuecomment-1902069040), and I believe it's the behaviour that most users will expect/prefer by default.
It's possible to imagine situations where users would want to instrument these derived implementations, but I think it's OK to treat that as an opportunity to consider adding more fine-grained option flags to control the details of coverage instrumentation, while leaving this new behaviour as the default.
(Also note that while `-Cinstrument-coverage` is a stable feature, the exact details of coverage instrumentation are allowed to change. So we *can* make this change; the main question is whether we *should*.)
Fixes#105055.
coverage: Never emit improperly-ordered coverage regions
If we emit a coverage region that is improperly ordered (end < start), `llvm-cov` will fail with `coveragemap_error::malformed`, which is inconvenient for users and also very hard to debug.
Ideally we would fix the root causes of these situations, but they tend to occur in very obscure edge-case scenarios (often involving nested macros), and we don't always have a good MCVE to work from. So it makes sense to also have a catch-all check that will prevent improperly-ordered regions from ever being emitted.
---
This is mainly aimed at resolving #119453. We don't have a specific way to reproduce it, which is why I haven't been able to add a test case in this PR. But based on the information provided in that issue, this change seems likely to avoid the error in `llvm-cov`.
`````@rustbot````` label +A-code-coverage
Rollup of 9 pull requests
Successful merges:
- #112806 (Small code improvements in `collect_intra_doc_links.rs`)
- #119766 (Split tait and impl trait in assoc items logic)
- #120139 (Do not normalize closure signature when building `FnOnce` shim)
- #120160 (Manually implement derived `NonZero` traits.)
- #120171 (Fix assume and assert in jump threading)
- #120183 (Add `#[coverage(off)]` to closures introduced by `#[test]` and `#[bench]`)
- #120195 (add several resolution test cases)
- #120259 (Split Diagnostics for Uncommon Codepoints: Add List to Display Characters Involved)
- #120261 (Provide structured suggestion to use trait objects in some cases of `if` arm type divergence)
r? `@ghost`
`@rustbot` modify labels: rollup
Only use dense bitsets in dataflow analyses
When a dataflow state has the size close to the number of locals, we should prefer a dense bitset, like we already store locals in a dense vector.
Other occurrences of `ChunkedBitSet` need to be justified by the size of the dataflow state.
Save liveness results for DestinationPropagation
`DestinationPropagation` needs to verify that merge candidates do not conflict with each other. This is done by verifying that a local is not live when its counterpart is written to.
To get the liveness information, the pass runs `MaybeLiveLocals` dataflow analysis repeatedly, once for each propagation round. This is quite costly, and the main driver for the perf impact on `ucd` and `diesel`. (See https://github.com/rust-lang/rust/pull/115105#issuecomment-1689205908)
In order to mitigate this cost, this PR proposes to save the result of the analysis into a `SparseIntervalMatrix`, and mirror merges of locals into that matrix: `liveness(destination) := liveness(destination) union liveness(source)`.
<details>
<summary>Proof</summary>
We denote by `'` all the quantities of the transformed program. Let $\varphi$ be a mapping of locals, which maps `source` to `destination`, and is identity otherwise. The exact liveness set after a statement is $out'(statement)$, and the proposed liveness set is $\varphi(out(statement))$.
Consider a statement. Suppose that the output state verifies $out' \subset phi(out)$. We want to prove that $in' \subset \varphi(in)$ where $in = (out - kill) \cup gen$, and conclude by induction.
We have 2 cases: either that statement is kept with locals renumbered by $\varphi$, or it is a tautological assignment and it removed.
1. If the statement is kept: the gen-set and the kill-set of $statement' = \varphi(statement)$ are $gen' = \varphi(gen)$ and $kill' = \varphi(kill)$ exactly.
From soundness requirement 3, $\varphi(in)$ is disjoint from $\varphi(kill)$.
This implies that $\varphi(out - kill)$ is disjoint from $\varphi(kill)$, and so $\varphi(out - kill) = \varphi(out) - \varphi(kill)$. Then $\varphi(in) = (\varphi(out) - \varphi(kill)) \cup \varphi(gen) = (\varphi(out) - kill') \cup gen'$.
We can conclude that $out' \subset \varphi(out) \implies in' \subset \varphi(in)$.
2. If the statement is removed. As $\varphi(statement)$ is a tautological assignment, we know that $\varphi(gen) = \varphi(kill) = \\{ destination \\}$, while $gen' = kill' = \emptyset$. So $\varphi(in) = \varphi(out) \cup \\{ destination \\}$. Then $in' = out' \subset out \subset \varphi(in)$.
By recursion, we can conclude by that $in' \subset \varphi(in)$ everywhere.
</details>
This approximate liveness results is only suboptimal if there are locals that fully disappear from the CFG due to an assignment cycle. These cases are quite unlikely, so we do not bother with them.
This change allows to reduce the perf impact of DestinationPropagation by half on diesel and ucd (https://github.com/rust-lang/rust/pull/115105#issuecomment-1694701904).
cc ````@JakobDegen````
Rework how diagnostic lints are stored.
`Diagnostic::code` has the type `DiagnosticId`, which has `Error` and
`Lint` variants. Plus `Diagnostic::is_lint` is a bool, which should be
redundant w.r.t. `Diagnostic::code`.
Seems simple. Except it's possible for a lint to have an error code, in
which case its `code` field is recorded as `Error`, and `is_lint` is
required to indicate that it's a lint. This is what happens with
`derive(LintDiagnostic)` lints. Which means those lints don't have a
lint name or a `has_future_breakage` field because those are stored in
the `DiagnosticId::Lint`.
It's all a bit messy and confused and seems unintentional.
This commit:
- removes `DiagnosticId`;
- changes `Diagnostic::code` to `Option<String>`, which means both
errors and lints can straightforwardly have an error code;
- changes `Diagnostic::is_lint` to `Option<IsLint>`, where `IsLint` is a
new type containing a lint name and a `has_future_breakage` bool, so
all lints can have those, error code or not.
r? `@oli-obk`
Sandwich MIR optimizations between DSE.
This PR reorders MIR optimization passes in an attempt to increase their efficiency.
- Stop running CopyProp before GVN, it's useless as GVN will do the same thing anyway. Instead, we perform CopyProp at the end of the pipeline, to ensure we do not emit copy/move chains.
- Run DSE before GVN, as it increases the probability to have single-assignment locals.
- Run DSE after the final CopyProp to turn copies into moves.
r? `@ghost`
Avoid some redundant work in GVN
The first 2 commits are about reducing the perf effect.
Third commit avoids doing redundant work: is a local is SSA, it already has been simplified, and the resulting value is in `self.locals`. No need to call any code on it.
The last commit avoids removing some storage statements.
r? wg-mir-opt
`Diagnostic::code` has the type `DiagnosticId`, which has `Error` and
`Lint` variants. Plus `Diagnostic::is_lint` is a bool, which should be
redundant w.r.t. `Diagnostic::code`.
Seems simple. Except it's possible for a lint to have an error code, in
which case its `code` field is recorded as `Error`, and `is_lint` is
required to indicate that it's a lint. This is what happens with
`derive(LintDiagnostic)` lints. Which means those lints don't have a
lint name or a `has_future_breakage` field because those are stored in
the `DiagnosticId::Lint`.
It's all a bit messy and confused and seems unintentional.
This commit:
- removes `DiagnosticId`;
- changes `Diagnostic::code` to `Option<String>`, which means both
errors and lints can straightforwardly have an error code;
- changes `Diagnostic::is_lint` to `Option<IsLint>`, where `IsLint` is a
new type containing a lint name and a `has_future_breakage` bool, so
all lints can have those, error code or not.
This also switches from `split_off(0)` to `std::mem::take` when emptying the
accumulated list of blocks, because `split_off(0)` handles capacity in a way
that is unintuitive when used in a loop.
The old loop had two separate places where it would flush the acumulated list
of straight-line blocks into a new BCB. One occurred at the start of the loop
body when the current block couldn't be chained into, and the other occurred at
the end of the loop body when the current block couldn't be chained from.
The latter check can be hoisted to the start of the loop body by making it
examine the previous block (which has added itself to the list) instead of the
current block. With that done, we can combine the two separate flushes into one
flush with two possible trigger conditions.
Filtering out unreachable successors is only needed by the main graph traversal
loop, so we can move the filtering step into that loop instead, eliminating the
need to pass the MIR body into `bcb_filtered_successors`.
coverage: Add enums to accommodate other kinds of coverage mappings
Extracted from #118305.
LLVM supports several different kinds of coverage mapping regions, but currently we only ever emit ordinary “code” regions. This PR performs the plumbing required to add other kinds of regions as enum variants, but does not add any specific variants other than `Code`.
The main motivation for this change is branch coverage, but it will also allow separate experimentation with gap regions and skipped regions, which might help in producing more accurate and useful coverage reports.
---
``@rustbot`` label +A-code-coverage
This is less elegant in some ways, since we no longer visit a BCB's spans as a
batch, but will make it much easier to add support for other kinds of coverage
mapping regions (e.g. branch regions or gap regions).
Reorder early post-inlining passes.
`RemoveZsts`, `RemoveUnneededDrops` and `UninhabitedEnumBranching` only depend on types, so they should be executed together early after MIR inlining introduces those types.
This does not change the end-result, but this makes the pipeline a bit more consistent.
Merge dead bb pruning and unreachable bb deduplication.
Both routines share the same basic structure: iterate on all bbs to identify work, and then renumber bbs.
We can do both at once.
coverage: `llvm-cov` expects column numbers to be bytes, not code points
Normally the compiler emits column numbers as a 1-based number of Unicode code points.
But when we embed coverage mappings for `-Cinstrument-coverage`, those mappings will ultimately be read by the `llvm-cov` tool. That tool assumes that column numbers are 1-based numbers of *bytes*, and relies on that assumption when slicing up source code to apply highlighting (in HTML reports, and in text-based reports with colour).
For the very common case of all-ASCII source code, bytes and code points are the same, so the difference isn't noticeable. But for code that contains non-ASCII characters, emitting column numbers as code points will result in `llvm-cov` slicing strings in the wrong places, producing mangled output or fatal errors.
(See https://github.com/taiki-e/cargo-llvm-cov/issues/275 as an example of what can go wrong.)
Currently it's used for two dynamic checks:
- When a diagnostic is emitted, has it been emitted before?
- When a diagnostic is dropped, has it been emitted/cancelled?
The first check is no longer need, because `emit` is consuming, so it's
impossible to emit a `DiagnosticBuilder` twice. The second check is
still needed.
This commit replaces `DiagnosticBuilderState` with a simpler
`Option<Box<Diagnostic>>`, which is enough for the second check:
functions like `emit` and `cancel` can take the `Diagnostic` and then
`drop` can check that the `Diagnostic` was taken.
The `DiagCtxt` reference from `DiagnosticBuilderState` is now stored as
its own field, removing the need for the `dcx` method.
As well as making the code shorter and simpler, the commit removes:
- One (deprecated) `ErrorGuaranteed::unchecked_claim_error_was_emitted`
call.
- Two `FIXME(eddyb)` comments that are no longer relevant.
- The use of a dummy `Diagnostic` in `into_diagnostic`.
Nice!
Stop allowing `rustc::potential_query_instability` on all of
rustc_mir_transform and instead allow it on a case-by-case basis if it
is safe to do so. In this particular crate, all instances were safe to
allow.
rustc_mir_transform: Make DestinationPropagation stable for queries
By using `FxIndexMap` instead of `FxHashMap`, so that the order of visiting of locals is deterministic.
We also need to bless
`copy_propagation_arg.foo.DestinationPropagation.panic*.diff`. Do not review the diff of the diff. Instead look at the diff files before and after this commit. Both before and after this commit, 3 statements are replaced with nop. It's just that due to change in ordering, different statements are replaced. But the net result is the same. In other words, compare this diff (before fix):
* 090d5eac72/tests/mir-opt/dest-prop/copy_propagation_arg.foo.DestinationPropagation.panic-unwind.diff
With this diff (after fix):
* f603babd63/tests/mir-opt/dest-prop/copy_propagation_arg.foo.DestinationPropagation.panic-unwind.diff
and you can see that both before and after the fix, we replace 3 statements with `nop`s.
I find it _slightly_ surprising that the test this PR affects did not previously fail spuriously due to the indeterminism of `FxHashMap`, but I guess in can be explained with the predictability of small `FxHashMap`s with `usize` (`Local`) keys, or something along those lines.
This should fix [this](https://github.com/rust-lang/rust/pull/119252#discussion_r1436101791) comment, but I wanted to make a separate PR for this fix for a simpler development and review process.
Part of https://github.com/rust-lang/rust/issues/84447 which is E-help-wanted.
r? `@cjgillot` who is reviewer for the highly related PR https://github.com/rust-lang/rust/pull/119252.
Rollup of 9 pull requests
Successful merges:
- #119208 (coverage: Hoist some complex code out of the main span refinement loop)
- #119216 (Use diagnostic namespace in stdlib)
- #119414 (bootstrap: Move -Clto= setting from Rustc::run to rustc_cargo)
- #119420 (Handle ForeignItem as TAIT scope.)
- #119468 (rustdoc-search: tighter encoding for f index)
- #119628 (remove duplicate test)
- #119638 (fix cyle error when suggesting to use associated function instead of constructor)
- #119640 (library: Fix warnings in rtstartup)
- #119642 (library: Fix a symlink test failing on Windows)
r? `@ghost`
`@rustbot` modify labels: rollup
coverage: Hoist some complex code out of the main span refinement loop
The span refinement loop in `spans.rs` takes the spans that have been extracted from MIR, and modifies them to produce more helpful output in coverage reports.
It is also one of the most complicated pieces of code in the coverage instrumentor. It has an abundance of moving pieces that make it difficult to understand, and most attempts to modify it end up accidentally changing its behaviour in unacceptable ways.
This PR nevertheless tries to make a dent in it by hoisting two pieces of special-case logic out of the main loop, and into separate preprocessing passes. Coverage tests show that the resulting mappings are *almost* identical, with all known differences being unimportant.
This should hopefully unlock further simplifications to the refinement loop, since it now has fewer edge cases to worry about.
By using FxIndexMap instead of FxHashMap, so that the order of visiting
of locals is deterministic.
We also need to bless
copy_propagation_arg.foo.DestinationPropagation.panic*.diff. Do not
review the diff of the diff. Instead look at the diff file before and
after this commit. Both before and after this commit, 3 statements are
replaced with nop. It's just that due to change in ordering, different
statements are replaced. But the net result is the same.
Check yield terminator's resume type in borrowck
In borrowck, we didn't check that the lifetimes of the `TerminatorKind::Yield`'s `resume_place` were actually compatible with the coroutine's signature. That means that the lifetimes were totally going unchecked. Whoops!
This PR implements this checking.
Fixes#119564
r? types
Migrate memory overlap check from validator to lint
The check attempts to identify potential undefined behaviour, rather
than whether MIR is well-formed. It belongs in the lint not validator.
Follow up to changes from #119077.
This draws a clear distinction between the fields/methods that are needed by
initial span extraction and preprocessing, and those that are needed by the
main "refinement" loop.
Reevaluate `body.should_skip()` after updating the MIR phase to ensure
that injected MIR is processed correctly.
Update a few custom MIR tests that were ill-formed for the injected
phase.
`Diagnostic` has 40 methods that return `&mut Self` and could be
considered setters. Four of them have a `set_` prefix. This doesn't seem
necessary for a type that implements the builder pattern. This commit
removes the `set_` prefixes on those four methods.
coverage: Prepare mappings separately from injecting statements
These two tasks historically needed to be interleaved, but after various recent changes (including #116046 and #116917) they can now be fully separated.
---
`@rustbot` label +A-code-coverage
These two tasks historically needed to be interleaved, but after various recent
changes (including #116046 and #116917) they can now be fully separated.
Implement constant propagation on top of MIR SSA analysis
This implements the idea I proposed in https://github.com/rust-lang/rust/pull/110719#issuecomment-1718324700
Based on https://github.com/rust-lang/rust/pull/109597
The value numbering "GVN" pass formulates each rvalue that appears in MIR with an abstract form (the `Value` enum), and assigns an integer `VnIndex` to each. This abstract form can be used to deduplicate values, reusing an earlier local that holds the same value instead of recomputing. This part is proposed in #109597.
From this abstract representation, we can perform more involved simplifications, for example in https://github.com/rust-lang/rust/pull/111344.
With the abstract representation `Value`, we can also attempt to evaluate each to a constant using the interpreter. This builds a `VnIndex -> OpTy` map. From this map, we can opportunistically replace an operand or a rvalue with a constant if their value has an associated `OpTy`.
The most relevant commit is [Evaluated computed values to constants.](2767c4912e)"
r? `@oli-obk`
Make closures carry their own ClosureKind
Right now, we use the "`movability`" field of `hir::Closure` to distinguish a closure and a coroutine. This is paired together with the `CoroutineKind`, which is located not in the `hir::Closure`, but the `hir::Body`. This is strange and redundant.
This PR introduces `ClosureKind` with two variants -- `Closure` and `Coroutine`, which is put into `hir::Closure`. The `CoroutineKind` is thus removed from `hir::Body`, and `Option<Movability>` no longer needs to be a stand-in for "is this a closure or a coroutine".
r? eholk
Split coroutine desugaring kind from source
What a coroutine is desugared from (gen/async gen/async) should be separate from where it comes (fn/block/closure).
Separate MIR lints from validation
Add a MIR lint pass, enabled with -Zlint-mir, which identifies undefined or
likely erroneous behaviour.
The initial implementation mostly migrates existing checks of this nature from
MIR validator, where they did not belong (those checks have false positives and
there is nothing inherently invalid about MIR with undefined behaviour).
Fixes#104736Fixes#104843Fixes#116079Fixes#116736Fixes#118990
The old code used a heuristic to detect async functions and adjust their
coverage spans to produce better output. But there's no need to resort to a
heuristic when we can just check whether the current function is actually an
`async fn`.
And make all hand-written `IntoDiagnostic` impls generic, by using
`DiagnosticBuilder::new(dcx, level, ...)` instead of e.g.
`dcx.struct_err(...)`.
This means the `create_*` functions are the source of the error level.
This change will let us remove `struct_diagnostic`.
Note: `#[rustc_lint_diagnostics]` is added to `DiagnosticBuilder::new`,
it's necessary to pass diagnostics tests now that it's used in
`into_diagnostic` functions.
coverage: Skip instrumenting a function if no spans were extracted from MIR
The immediate symptoms of #118643 were fixed by #118666, but some users reported that their builds now encounter another coverage-related ICE:
```
error: internal compiler error: compiler/rustc_codegen_llvm/src/coverageinfo/mapgen.rs:98:17: A used function should have had coverage mapping data but did not: (...)
```
I was able to reproduce at least one cause of this error: if no relevant spans could be extracted from a function, but the function contains `CoverageKind::SpanMarker` statements, then codegen still thinks the function is instrumented and complains about the fact that it has no coverage spans.
This PR prevents that from happening in two ways:
- If we didn't extract any relevant spans from MIR, skip instrumenting the entire function and don't create a `FunctionCoverateInfo` for it.
- If coverage codegen sees a `CoverageKind::SpanMarker` statement, skip it early and avoid creating `func_coverage`.
---
Fixes#118850.
Rollup of 3 pull requests
Successful merges:
- #116888 (Add discussion that concurrent access to the environment is unsafe)
- #118888 (Uplift `TypeAndMut` and `ClosureKind` to `rustc_type_ir`)
- #118929 (coverage: Tidy up early parts of the instrumentor pass)
r? `@ghost`
`@rustbot` modify labels: rollup
coverage: Tidy up early parts of the instrumentor pass
This is extracted from #118237, which needed to be manually rebased anyway.
Unlike that PR, this one only affects the coverage instrumentor, and doesn't attempt to move any code into the MIR builder. That can be left to a future version of #118305, which can still benefit from these improvements.
So this is now mostly a refactoring of some internal parts of the instrumentor.
Fix cases where std accidentally relied on inline(never)
This PR increases the power of `-Zcross-crate-inline-threshold=always` so that it applies through `#[inline(never)]`. Note that though this is called "cross-crate-inlining" in this case especially it is _just_ lazy per-CGU codegen. The MIR inliner and LLVM still respect the attribute as much as they ever have.
Trying to bootstrap with the new `-Zcross-crate-inline-threshold=always` change revealed two bugs:
We have special intrinsics `assert_inhabited`, `assert_zero_valid`, and `assert_mem_uniniitalized_valid` which codegen backends will lower to nothing or a call to `panic_nounwind`. Since we may not have any call to `panic_nounwind` in MIR but emit one anyway, we need to specially tell `MirUsedCollector` about this situation.
`#[lang = "start"]` is special-cased already so that `MirUsedCollector` will collect it, but then when we make it cross-crate-inlinable it is only assigned to a CGU based on whether `MirUsedCollector` saw a call to it, which of course we didn't.
---
I started looking into this because https://github.com/rust-lang/rust/pull/118683 revealed a case where we were accidentally relying on a function being `#[inline(never)]`, and cranking up cross-crate-inlinability seems like a way to find other situations like that.
r? `@nnethercote` because I don't like what I'm doing to the CGU partitioning code here but I can't come up with something much better
If we want to know whether two byte positions are in the same file, we don't
need to clone and compare `Lrc<SourceFile>`; we can just get their indices and
compare those instead.
Changes in this patch:
- Extract local variable `def_id`
- Check `is_fn_like` without retrieving HIR
- Inline some locals that are used once and aren't needed for clarity
It's necessary for `derive(Diagnostic)`, but is best avoided elsewhere
because there are clearer alternatives.
This required adding `Handler::struct_almost_fatal`.
Renamings:
- find -> opt_hir_node
- get -> hir_node
- find_by_def_id -> opt_hir_node_by_def_id
- get_by_def_id -> hir_node_by_def_id
Fix rebase changes using removed methods
Use `tcx.hir_node_by_def_id()` whenever possible in compiler
Fix clippy errors
Fix compiler
Apply suggestions from code review
Co-authored-by: Vadim Petrochenkov <vadim.petrochenkov@gmail.com>
Add FIXME for `tcx.hir()` returned type about its removal
Simplify with with `tcx.hir_node_by_def_id`
State transforms retains storage statements for locals that are not
stored inside a coroutine. It ensures those locals are live when
resuming by inserting StorageLive as appropriate. It forgot to end the
storage of those locals when suspending, which is fixed here.
While the end of live range is implicit when executing return, it is
nevertheless useful for inliner which would otherwise extend the live
range beyond return.
detects redundant imports that can be eliminated.
for #117772 :
In order to facilitate review and modification, split the checking code and
removing redundant imports code into two PR.
Make async generators fused by default
I actually changed my mind about this since the implementation PR landed. I think it's beneficial for `async gen` blocks to be "fused" by default -- i.e., for them to repeatedly return `Poll::Ready(None)` -- rather than panic.
We have [`FusedStream`](https://docs.rs/futures/latest/futures/stream/trait.FusedStream.html) in futures-rs to represent streams with this capability already anyways.
r? eholk
cc ```@rust-lang/wg-async,``` would like to know if anyone else has opinions about this.
coverage: Simplify the heuristic for ignoring `async fn` return spans
The code for extracting coverage spans from MIR has a special heuristic for dealing with `async fn`, so that the function's closing brace does not have a confusing double count.
The code implementing that heuristic is currently mixed in with the code for flushing remaining spans after the main refinement loop, making the refinement code harder to understand.
We can solve that by hoisting the heuristic to an earlier stage, after the spans have been extracted and sorted but before they have been processed by the refinement loop.
The coverage tests verify that the heuristic is still effective, so coverage mappings/reports for `async fn` have not changed.
---
This PR also has the side-effect of fixing the `None some_prev` panic that started appearing after #118525.
The old code assumed that `prev` would always be present after the refinement loop. That was only true if the list of collected spans was non-empty, but prior to #118525 that didn't seem to come up in practice. After that change, the list of collected spans could be empty in some specific circumstances, leading to panics.
The new code uses an `if let` to inspect `prev`, which correctly does nothing if there is no span present.
coverage: Use `SpanMarker` to improve coverage spans for `if !` expressions
Coverage instrumentation works by extracting source code spans from MIR. However, some kinds of syntax are effectively erased during MIR building, so their spans don't necessarily exist anywhere in MIR, making them invisible to the coverage instrumentor (unless we resort to various heuristics and hacks to recover them).
This PR introduces `CoverageKind::SpanMarker`, which is a new variant of `StatementKind::Coverage`. Its sole purpose is to represent spans that would otherwise not appear in MIR, so that the coverage instrumentor can extract them.
When coverage is enabled, the MIR builder can insert these dummy statements as needed, to improve the accuracy of spans used by coverage mappings.
Fixes#115468.
---
```@rustbot``` label +A-code-coverage
There are cases where coverage instrumentation wants to show a span for some
syntax element, but there is no MIR node that naturally carries that span, so
the instrumentor can't see it.
MIR building can now use this new kind of coverage statement to deliberately
include those spans in MIR, attached to a dummy statement that has no other
effect.
coverage: Merge refined spans in a separate final pass
Pulling this merge step out of `push_refined_span` and into a separate pass lets us push directly to `refined_spans` instead of calling a helper method.
Because the compiler can now see partial borrows of `refined_spans`, we can remove some extra code that was jumping through hoops to satisfy the borrow checker.
---
``@rustbot`` label +A-code-coverage
coverage: Avoid unnecessary macros in unit tests
These macros don't provide enough value to justify their complexity, when they can just as easily be functions instead.
---
`@rustbot` label +A-code-coverage
compile-time evaluation: detect writes through immutable pointers
This has two motivations:
- it unblocks https://github.com/rust-lang/rust/pull/116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in https://github.com/rust-lang/rust/pull/117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB
When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.
The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.
I just hope perf works out.