continue `TypingMode` refactor
There are still quite a few places which (indirectly) rely on the `Reveal` of a `ParamEnv`, but we're slowly getting there
r? `@compiler-errors`
Mark `simplify_aggregate_to_copy` mir-opt as unsound
Mark the `simplify_aggregate_to_copy` mir-opt added in #128299 as unsound as it seems to miscompile the MCVE reported in https://github.com/rust-lang/rust/issues/132353. The mir-opt can be re-enabled once this case is fixed.
```rs
fn pop_min(mut score2head: Vec<Option<usize>>) -> Option<usize> {
loop {
if let Some(col) = score2head[0] {
score2head[0] = None;
return Some(col);
}
}
}
fn main() {
let min = pop_min(vec![Some(1)]);
println!("min: {:?}", min);
// panic happens here on beta in release mode
// but not in debug mode
min.unwrap();
}
```
This MCVE is included as a `run-pass` ui regression test in the first commit. I built the ui test with a nightly manually, and can reproduce the behavioral difference with `-C opt-level=0` and `-C opt-level=1`. Locally, this ui test will fail unless it was run on a compiler built with the second commit marking the mir-opt as unsound thus disabling it by default.
This PR **partially reverts** commit e7386b3, reversing changes made to 02b1be1. The mir-opt implementation is just marked as unsound but **not** reverted to make reland reviews easier. Test changes are **reverted if they were not pure additions**. Tests added by the original PR received `-Z unsound-mir-opts` compile-flags.
cc `@DianQK` `@cjgillot` (PR author and reviewer of #128299)
They represent a lot of abstraction and indirection, but they're only
used for `ConstAnalysis`, and apparently won't be used for any other
analyses in the future. This commit inlines and removes them, which
makes `ConstAnalysis` easier to read and understand.
Rename `rustc_abi::Abi` to `BackendRepr`
Remove the confabulation of `rustc_abi::Abi` with what "ABI" actually means by renaming it to `BackendRepr`, and rename `Abi::Aggregate` to `BackendRepr::Memory`. The type never actually represented how things are passed, as that has to have `PassMode` considered, at minimum, but rather it just is how we represented some things to the backend. This conflation arose because LLVM, the primary backend at the time, would lower certain IR forms using certain ABIs. Even that only somewhat was true, as it broke down when one ventured significantly afield of what is described by the System V AMD64 ABI either by using different architectures, ABI-modifying IR annotations, the same architecture **with different ISA extensions enabled**, or other... unexpected delights.
Unfortunately both names are still somewhat of a misnomer right now, as people have written code for years based on this misunderstanding. Still, their original names are even moreso, and for better or worse, this backend code hasn't received as much maintenance as the rest of the compiler, lately. Actually arriving at a correct end-state will simply require us to disentangle a lot of code in order to fix, much of it pointlessly repeated in several places. Thus this is not an "actual fix", just a way to deflect further misunderstandings.
This is a standard pattern:
```
MyAnalysis.into_engine(tcx, body).iterate_to_fixpoint()
```
`into_engine` and `iterate_to_fixpoint` are always called in pairs, but
sometimes with a builder-style `pass_name` call between them. But a
builder-style interface is overkill here. This has been bugging me a for
a while.
This commit:
- Merges `Engine::new` and `Engine::iterate_to_fixpoint`. This removes
the need for `Engine` to have fields, leaving it as a trivial type
that the next commit will remove.
- Renames `Analysis::into_engine` as `Analysis::iterate_to_fixpoint`,
gives it an extra argument for the optional pass name, and makes it
call `Engine::iterate_to_fixpoint` instead of `Engine::new`.
This turns the pattern from above into this:
```
MyAnalysis.iterate_to_fixpoint(tcx, body, None)
```
which is shorter at every call site, and there's less plumbing required
to support it.
The initial naming of "Abi" was an awful mistake, conveying wrong ideas
about how psABIs worked and even more about what the enum meant.
It was only meant to represent the way the value would be described to
a codegen backend as it was lowered to that intermediate representation.
It was never meant to mean anything about the actual psABI handling!
The conflation is because LLVM typically will associate a certain form
with a certain ABI, but even that does not hold when the special cases
that actually exist arise, plus the IR annotations that modify the ABI.
Reframe `rustc_abi::Abi` as the `BackendRepr` of the type, and rename
`BackendRepr::Aggregate` as `BackendRepr::Memory`. Unfortunately, due to
the persistent misunderstandings, this too is now incorrect:
- Scattered ABI-relevant code is entangled with BackendRepr
- We do not always pre-compute a correct BackendRepr that reflects how
we "actually" want this value to be handled, so we leave the backend
interface to also inject various special-cases here
- In some cases `BackendRepr::Memory` is a "real" aggregate, but in
others it is in fact using memory, and in some cases it is a scalar!
Our rustc-to-backend lowering code handles this sort of thing right now.
That will eventually be addressed by lifting duplicated lowering code
to either rustc_codegen_ssa or rustc_target as appropriate.
coverage: Don't rely on the custom traversal to find enclosing loops
This opens up the possibility of modifying or removing the custom graph traversal used in coverage counter creation, without losing access to the heuristics that care about a node's enclosing loops.
Actually changing the traversal is left for future work, because this PR on its own doesn't change the emitted coverage mappings at all.
Effects cleanup
- removed extra bits from predicates queries that are no longer needed in the new system
- removed the need for `non_erasable_generics` to take in tcx and DefId, removed unused arguments in callers
r? compiler-errors
- removed extra bits from predicates queries that are no longer needed in the new system
- removed the need for `non_erasable_generics` to take in tcx and DefId, removed unused arguments in callers
Then we can rename the _raw functions to drop their suffix, and instead
explicitly use is_stable_const_fn for the few cases where that is really what
you want.
Move `cmp_in_dominator_order` out of graph dominator computation
Dominator-order information is only needed for coverage graphs, and is easy enough to collect by just traversing the graph again.
This avoids wasted work when computing graph dominators for any other purpose.
Dominator-order information is only needed for coverage graphs, and is easy
enough to collect by just traversing the graph again.
This avoids wasted work when computing graph dominators for any other purpose.
coverage: Make counter creation handle node/edge counters more uniformly
Similar to #130380, this is another round of small improvements informed by my ongoing attempts to overhaul coverage counter creation.
One of the big benefits is getting rid of the awkward special-case that would sometimes attach an edge counter to a node instead. That was needed by the code that chooses which out-edge should be given a counter expression, but we can avoid that by making the corresponding check a little smarter.
I've also renamed several things to be simpler and more consistent, which should help with future changes.
Continue to get rid of `ty::Const::{try_}eval*`
This PR mostly does:
* Removes all of the `try_eval_*` and `eval_*` helpers from `ty::Const`, and replace their usages with `try_to_*`.
* Remove `ty::Const::eval`.
* Rename `ty::Const::normalize` to `ty::Const::normalize_internal`. This function is still used in the normalization code itself.
* Fix some weirdness around the `TransmuteFrom` goal.
I'm happy to split it out further; for example, I could probably land the first part which removes the helpers, or the changes to codegen which are more obvious than the changes to tools.
r? BoxyUwU
Part of https://github.com/rust-lang/rust/issues/130704
Dont ICE when computing coverage of synthetic async closure body
I'm not totally certain if this is *right*, but at least it doesn't ICE.
The issue is that we end up generating two MIR bodies for each async closure, since the `FnOnce` and `Fn`/`FnMut` implementations have different borrowing behavior of their captured variables. They should ideally both contribute to the coverage, since those MIR bodies are (*to the user*) the same code and should have no behavioral differences.
This PR at least suppresses the ICEs, and then I guess worst case we can fix this the right way later.
r? Zalathar or re-roll
Fixes#131190
Make destructors on `extern "C"` frames to be executed
This would make the example in #123231 print "Noisy Drop". I didn't mark this as fixing the issue because the behaviour is yet to be spec'ed.
Tracking:
- https://github.com/rust-lang/rust/issues/74990
Don't check unsize goal in MIR validation when opaques remain
Similarly to `mir_assign_valid_types`, let's just skip when there are opaques. Fixes#130921.
coverage: Multiple small tweaks to counter creation
I've been experimenting with some larger changes to how coverage counters are assigned to parts of the control-flow graph, and while none of that is ready yet, along the way I've repeatedly found myself wanting these smaller tweaks as a base.
There are no changes to compiler output.
Don't use Immediate::offset to transmute pointers to integers
This applies the relatively new `assert_matches_abi` check in the `offset` operation on immediates, which makes sure that if offsets are used to alter the layout (which is possible because the field layout is arbitrarily picked by the caller), this is not done in a way that breaks the invariant of the `Immediate` type.
This leads to ICEs in a GVN mir-opt test, so the second commit fixes GVN.
Fixes https://github.com/rust-lang/rust/issues/131064.
This makes it possible for other parts of counter-assignment to check whether a
node is guaranteed to end up with some kind of counter.
Switching from `impl Fn` to a concrete `&BitSet` just avoids the hassle of
trying to store a closure in a struct field, and currently there's no
foreseeable need for this information to not be a bitset.
Disable jump threading `UnOp::Not` for non-bool
Fix#131195, where jumpthreading was optimizing `!a == b` into `a != b` for non-bool, where this is definitely not true.
This code can sometimes witness malformed coverage attributes in builds that
are going to fail, so use `span_delayed_bug` to avoid an inappropriate ICE in
that case.
Add `File` constructors that return files wrapped with a buffer
In addition to the light convenience, these are intended to raise visibility that buffering is something you should consider when opening a file, since unbuffered I/O is a common performance footgun to Rust newcomers.
ACP: https://github.com/rust-lang/libs-team/issues/446
Tracking Issue: #130804
Apply `EarlyOtherwiseBranch` to scalar value
In the future, I'm thinking of hoisting discriminant via GVN so that we only need to write very little code here.
r? `@cjgillot`
Encode `coroutine_by_move_body_def_id` in crate metadata
We synthesize the MIR for a by-move body for the `FnOnce` implementation of async closures. It can be accessed with the `coroutine_by_move_body_def_id` query. We weren't encoding this query in the metadata though, nor were we properly recording that synthetic MIR in `mir_keys`, so the `optimized_mir` wasn't getting encoded either!
Stacked on top is a fix to consider `DefKind::SyntheticCoroutineBody` to return true in several places I missed. Specifically, we should consider the def-kind in `fn DefKind::is_fn_like()`, since that's what we were using to make sure we ensure `query mir_inliner_callees` before the MIR gets stolen for the body. This led to some CI failures that were caught by miri but which I added a test for.
Remove semi-nondeterminism of `DefPathHash` ordering from inliner
Déjà vu or something because I kinda thought I had put this PR up before. I recall a discussion somewhere where I think it was `@saethlin` mentioning that this check was no longer needed since we have "proper" cycle detection. Putting that up as a PR now.
This may slighlty negatively affect inlining, since the cycle breaking here means that we still inlined some cycles when the def path hashes were ordered in certain ways, this leads to really bad nondeterminism that makes minimizing ICEs and putting up inliner bugfixes difficult.
r? `@cjgillot` or `@saethlin` or someone else idk
coverage: Clarify some parts of coverage counter creation
This is a series of semi-related changes that are trying to make the `counters` module easier to read, understand, and modify.
For example, the existing code happens to avoid ever using the count for a `TerminatorKind::Yield` node as the count for its sole out-edge (since doing so would be incorrect), but doesn't do so explicitly, so seemingly-innocent changes can result in baffling test failures.
This PR also takes the opportunity to simplify some debug-logging code that was making its surrounding code disproportionately hard to read.
There should be no changes to the resulting coverage instrumentation/mappings, as demonstrated by the absence of changes to the coverage test suite.
Given that we directly access the graph predecessors/successors in so many
other places, and sometimes must do so to satisfy the borrow checker, there is
little value in having this trivial helper method.
- Look up the node's predecessors only once
- Get rid of some overly verbose logging
- Explain why some nodes need physical counters
- Extract a helper method to create and set a physical node counter
Simplify the canonical clone method and the copy-like forms to copy
Fixes#128081.
The optimized clone method ends up as the following MIR:
```
_2 = copy ((*_1).0: i32);
_3 = copy ((*_1).1: u64);
_4 = copy ((*_1).2: [i8; 3]);
_0 = Foo { a: move _2, b: move _3, c: move _4 };
```
We can transform this to:
```
_0 = copy (*_1);
```
r? `@cjgillot`
Don't call closure_by_move_body_def_id on FnOnce async closures in MIR validation
Refactors the check in #129847 to not unncessarily call the `closure_by_move_body_def_id` query for async closures that don't *need* a by-move body.
Fixes#130167
- Replace non-standard names like 's, 'p, 'rg, 'ck, 'parent, 'this, and
'me with vanilla 'a. These are cases where the original name isn't
really any more informative than 'a.
- Replace names like 'cx, 'mir, and 'body with vanilla 'a when the lifetime
applies to multiple fields and so the original lifetime name isn't
really accurate.
- Put 'tcx last in lifetime lists, and 'a before 'b.
coverage: Simplify creation of sum counters
A small and self-contained improvement, extracted from some larger changes that I'm still working on.
Ultimately I want to avoid creating these sum counter-expressions in some cases (in favour of just adding physical counters directly to the nodes we care about), so a good incremental move towards that is splitting the “gather edge counters” step out from the ”build a sum of those counters” step.
Creating an extra intermediate vector should have negligible cost (and coverage isn't exercised by the benchmark suite anyway). The removed logging is redundant with the `#[instrument(..)]` logging we already have on the underlying method calls.
some const cleanup: remove unnecessary attributes, add const-hack indications
I learned that we use `FIXME(const-hack)` on top of the "const-hack" label. That seems much better since it marks the right place in the code and moves around with the code. So I went through the PRs with that label and added appropriate FIXMEs in the code. IMO this means we can then remove the label -- Cc ``@rust-lang/wg-const-eval.``
I also noticed some const stability attributes that don't do anything useful, and removed them.
r? ``@fee1-dead``
coverage: Clean up terminology in counter creation
Some of the terminology in this module is confusing, or has drifted out of sync with other parts of the coverage code.
This PR therefore renames some variables and methods, and adjusts comments and debug logging statements, to make things clearer and more consistent.
No functional changes, other than some small tweaks to debug logging.
In all cases the struct can own the relevant thing instead of having a
reference to it. This makes the code simpler, and in some cases removes
a struct lifetime.
These are all functions with a single callsite, where having a separate
function does nothing to help with readability. These changes make the
code a little shorter and easier to read.
There are four related dataflow structs: `MaybeInitializedPlaces`,
`MaybeUninitializedPlaces`, and `EverInitializedPlaces`,
`DefinitelyInitializedPlaces`. They all have a `&Body` and a
`&MoveData<'tcx>` field. The first three use different lifetimes for the
two fields, but the last one uses the same lifetime for both.
This commit changes the first three to use the same lifetime, removing
the need for one of the lifetimes. Other structs that also lose a
lifetime as a result of this are `LivenessContext`, `LivenessResults`,
`InitializationData`.
It then does similar things in various other structs.
Currently it constructs two vectors `calls_to_terminated` and
`cleanups_to_remove` in the main loop, and then processes them after the
main loop. But the processing can be done in the main loop, avoiding the
need for the vectors.
Do not call query to compute coroutine layout for synthetic body of async closure
There is code in the MIR validator that attempts to prevent query cycles when inlining a coroutine into itself, and will use the coroutine layout directly from the body when it detects that's the same coroutine as the one that's being validated. After #128506, this logic didn't take into account the fact that the coroutine def id will differ if it's the "by-move body" of an async closure. This PR implements that.
Fixes#129811
coverage: Count await when the Future is immediately ready
Currently `await` is only counted towards coverage if the containing
function is suspended and resumed at least once. This is because it
expands to code which contains a branch on the discriminant of `Poll`.
By treating it like a branching macro (e.g. `assert!`), these
implementation details will be hidden from the coverage results.
I added a test to ensure the fix works in simple cases, but the heuristic of picking only the first await-related covspan might be unreliable. I plan on testing more thoroughly with a real codebase over the next couple of weeks.
closes#98712
Make `Ty::boxed_ty` return an `Option`
Looks like a good place to use Rust's type system.
---
Most of 4ac7bcbaad/compiler/rustc_middle/src/ty/sty.rs (L971-L1963) looks like it could be moved to `TyKind` (then I guess `Ty` should be made to deref to `TyKind`).
Currently `await` is only counted towards coverage if the containing
function is suspended and resumed at least once. This is because it
expands to code which contains a branch on the discriminant of `Poll`.
By treating it like a branching macro (e.g. `assert!`), these
implementation details will be hidden from the coverage results.
Rename dump of coroutine by-move-body to be more consistent, fix ICE in dump_mir
First, we add a missing match for `DefKind::SyntheticCoroutineBody` in `dump_mir`. Fixes#129703. The second commit (directly below) serves as a test.
Second, we reorder the `dump_mir` in `coroutine_by_move_body_def_id` to be *after* we adjust the body source, and change the disambiguator so it reads more like any other MIR body. This also serves as a test for the ICE, since we're dumping the MIR of a body with `DefKind::SyntheticCoroutineBody`.
Third, we change the parenting of the synthetic MIR body to have the *coroutine-closure* (i.e. async closure) as its parent, so we don't have long strings of `{closure#0}-{closure#0}-{closure#0}`.
try-job: test-various
Move `SanityCheck` and `MirPass`
They are currently in `rustc_middle`. This PR moves them to `rustc_mir_transform`, which makes more sense.
r? ``@cjgillot``
Because that's now the only crate that uses it.
Moving stuff out of `rustc_middle` is always welcome.
I chose to use `impl crate::MirPass`/`impl crate::MirLint` (with
explicit `crate::`) everywhere because that's the only mention of
`MirPass`/`MirLint` used in all of these files. (Prior to this change,
`MirPass` was mostly imported via `use rustc_middle::mir::*` items.)
The actual implementation remains in `rustc_mir_dataflow`, but this
commit moves the `MirPass` impl to `rustc_mir_transform` and changes it
to a `MirLint` (fixing a `FIXME` comment).
(I originally tried moving the full implementation from
`rustc_mir_dataflow` but I had some trait problems with `HasMoveData`
and `RustcPeekAt` and `MaybeLiveLocals`. This commit was much smaller
and simpler, but still will allow some follow-up cleanups.)
Remove `#[macro_use] extern crate tracing`, round 4
Because explicit importing of macros via use items is nicer (more standard and readable) than implicit importing via #[macro_use]. Continuing the work from #124511, #124914, and #125434. After this PR no `rustc_*` crates use `#[macro_use] extern crate tracing` except for `rustc_codegen_gcc` which is a special case and I will do separately.
r? ```@jieyouxu```
Remove `Option<!>` return types.
Several compiler functions have `Option<!>` for their return type. That's odd. The only valid return value is `None`, so why is this type used?
Because it lets you write certain patterns slightly more concisely. E.g. if you have these common patterns:
```
let Some(a) = f() else { return };
let Ok(b) = g() else { return };
```
you can shorten them to these:
```
let a = f()?;
let b = g().ok()?;
```
Huh.
An `Option` return type typically designates success/failure. How should I interpret the type signature of a function that always returns (i.e. doesn't panic), does useful work (modifying `&mut` arguments), and yet only ever fails? This idiom subverts the type system for a cute syntactic trick.
Furthermore, returning `Option<!>` from a function F makes things syntactically more convenient within F, but makes things worse at F's callsites. The callsites can themselves use `?` with F but should not, because they will get an unconditional early return, which is almost certainly not desirable. Instead the return value should be ignored. (Note that some of callsites of `process_operand`, `process_immedate`, `process_assign` actually do use `?`, though the early return doesn't matter in these cases because nothing of significance comes after those calls. Ugh.)
When I first saw this pattern I had no idea how to interpret it, and it took me several minutes of close reading to understand everything I've written above. I even started a Zulip thread about it to make sure I understood it properly. "Save a few characters by introducing types so weird that compiler devs have to discuss it on Zulip" feels like a bad trade-off to me. This commit replaces all the `Option<!>` return values and uses `else`/`return` (or something similar) to replace the relevant `?` uses. The result is slightly more verbose but much easier to understand.
r? ``````@cjgillot``````
Several compiler functions have `Option<!>` for their return type.
That's odd. The only valid return value is `None`, so why is this type
used?
Because it lets you write certain patterns slightly more concisely. E.g.
if you have these common patterns:
```
let Some(a) = f() else { return };
let Ok(b) = g() else { return };
```
you can shorten them to these:
```
let a = f()?;
let b = g().ok()?;
```
Huh.
An `Option` return type typically designates success/failure. How should
I interpret the type signature of a function that always returns (i.e.
doesn't panic), does useful work (modifying `&mut` arguments), and yet
only ever fails? This idiom subverts the type system for a cute
syntactic trick.
Furthermore, returning `Option<!>` from a function F makes things
syntactically more convenient within F, but makes things worse at F's
callsites. The callsites can themselves use `?` with F but should not,
because they will get an unconditional early return, which is almost
certainly not desirable. Instead the return value should be ignored.
(Note that some of callsites of `process_operand`, `process_immedate`,
`process_assign` actually do use `?`, though the early return doesn't
matter in these cases because nothing of significance comes after those
calls. Ugh.)
When I first saw this pattern I had no idea how to interpret it, and it
took me several minutes of close reading to understand everything I've
written above. I even started a Zulip thread about it to make sure I
understood it properly. "Save a few characters by introducing types so
weird that compiler devs have to discuss it on Zulip" feels like a bad
trade-off to me. This commit replaces all the `Option<!>` return values
and uses `else`/`return` (or something similar) to replace the relevant
`?` uses. The result is slightly more verbose but much easier to
understand.
By making it own the index maps, instead of holding references to them.
This requires moving the free function `find_candidate` into
`Candidate::reset_and_find`. It lets the `'alloc` lifetime be removed
everywhere that still has it.
LLVM uses the word "code" to refer to a particular kind of coverage mapping.
This unrelated usage of the word is confusing, and makes it harder to introduce
types whose names correspond to the LLVM classification of coverage kinds.
When deduplicating unreachable blocks, erase the source information.
After deduplication the block conceptually belongs to multiple locations in the source. Although these blocks are unreachable, in #123341 we did come across a real side effect, an unreachable block that survives into the compiled code can cause a debugger to set a breakpoint on the wrong instruction. Erasing the source information ensures that a debugger will never be misled into thinking that the unreachable block is worth setting a breakpoint on, especially after #128627.
Technically we don't need to erase the source information if all the deduplicated blocks have identical source information, but tracking that seems like more effort than it's worth.
I'll let njn redirect this one too. r? `@nnethercote`
Fix projections when parent capture is by-ref but child capture is by-value in the `ByMoveBody` pass
This fixes a somewhat strange bug where we build the incorrect MIR in #129074. This one is weird, but I don't expect it to actually matter in practice since it almost certainly results in a move error in borrowck. However, let's not ICE.
Given the code:
```
#![feature(async_closure)]
// NOT copy.
struct Ty;
fn hello(x: &Ty) {
let c = async || {
*x;
//~^ ERROR cannot move out of `*x` which is behind a shared reference
};
}
fn main() {}
```
The parent coroutine-closure captures `x: &Ty` by-ref, resulting in an upvar of `&&Ty`. The child coroutine captures `x` by-value, resulting in an upvar of `&Ty`. When constructing the by-move body for the coroutine-closure, we weren't applying an additional deref projection to convert the parent capture into the child capture, resulting in an type error in assignment, which is a validation ICE.
As I said above, this only occurs (AFAICT) in code that eventually results in an error, because it is only triggered by HIR that attempts to move a non-copy value out of a ref. This doesn't occur if `Ty` is `Copy`, since we'd instead capture `x` by-ref in the child coroutine.
Fixes#129074
Use `append` instead of `extend(drain(..))`
The first commit adds `IndexVec::append` that forwards to `Vec::append`, and uses it in a couple places.
The second commit updates `indexmap` for its new `IndexMap::append`, and also uses that in a couple places.
These changes are similar to what [`clippy::extend_with_drain`](https://rust-lang.github.io/rust-clippy/master/index.html#/extend_with_drain) would suggest, just for other collection types.
Shrink `TyKind::FnPtr`.
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.
r? `@compiler-errors`
Use more slice patterns inside the compiler
Nothing super noteworthy. Just replacing the common 'fragile' pattern of "length check followed by indexing or unwrap" with slice patterns for legibility and 'robustness'.
r? ghost
Fix `ElaborateBoxDerefs` on debug varinfo
Slightly simplifies the `ElaborateBoxDerefs` pass to fix cases where it was applying the wrong projections to debug var infos containing places that deref boxes.
From what I can tell[^1], we don't actually have any tests (or code anywhere, really) that exercise `debug x => *(...: Box<T>)`, and it's very difficult to trigger this in surface Rust, so I wrote a custom MIR test.
What happens is that the pass was turning `*(SOME_PLACE: Box<T>)` into `*(*((((SOME_PLACE).0: Unique<T>).0: NonNull<T>).0: *const T))` in debug var infos. In particular, notice the *double deref*, which was wrong.
This is the root cause of #128554, so this PR fixes#128554 as well. The reason that async closures was affected is because of the way that we compute the [`ByMove` body](https://github.com/rust-lang/rust/blob/master/compiler/rustc_mir_transform/src/coroutine/by_move_body.rs), which resulted in `*(...: Box<T>)` in debug var info. But this really has nothing to do with async closures.
[^1]: Validated by literally replacing the `if elem == PlaceElem::Deref && base_ty.is_box() { ... }` innards with a `panic!()`, which compiled all of stage2 without panicking.
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and
`FnHeader`, which can be packed more efficiently. This reduces the size
of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms.
This reduces peak memory usage by a few percent on some benchmarks. It
also reduces cache misses and page faults similarly, though this doesn't
translate to clear cycles or wall-time improvements on CI.
After deduplication the block conceptually belongs to multiple locations
in the source. Although these blocks are unreachable, in #123341 we did
come across a real side effect, an unreachable block that survives into
the compiled code can cause a debugger to set a breakpoint on the wrong
instruction. Erasing the source information ensures that a debugger will
never be misled into thinking that the unreachable block is worth setting
a breakpoint on, especially after #128627.
Technically we don't need to erase the source information if all the
deduplicated blocks have identical source information, but tracking
that seems like more effort than it's worth.
Jump threading stores values as `u128` (`ScalarInt`) and does its
comparisons for equality as integer comparisons.
This works great for integers. Sadly, not everything is an integer.
Floats famously have wonky equality semantcs, with `NaN!=NaN` and
`0.0 == -0.0`. This does not match our beautiful integer bitpattern
equality and therefore causes things to go horribly wrong.
While jump threading could be extended to support floats by remembering
that they're floats in the value state and handling them properly,
it's signficantly easier to just disable it for now.
Let InstCombine remove Clone shims inside Clone shims
The Clone shims that we generate tend to recurse into other Clone shims, which gets very silly very quickly. Here's our current state: https://godbolt.org/z/E69YeY8eq
So I've added InstSimplify to the shims optimization passes, and improved `is_trivially_pure_clone_copy` so that it can delete those calls inside the shim. This makes the shim way smaller because most of its size is the required ceremony for unwinding.
This change also completely breaks the UI test added for https://github.com/rust-lang/rust/issues/104870. With this PR, that program ICEs in MIR type checking because `is_trivially_pure_clone_copy` and the trait solver disagree on whether `*mut u8` is `Copy`. And adding the requisite `Copy` impl to make them agree makes the test not generate any diagnostics. Considering that I spent most of my time on this PR fixing `#![no_core]` tests, I would prefer to just delete this one. The maintenance burden of `#![no_core]` is uniquely high because when they break they tend to break in very confusing ways.
try-job: x86_64-mingw
Make Clone::clone a lang item
I want to absorb all the logic for picking whether an Instance is LocalCopy or GloballyShared into one place. As part of this, I wanted to identify Clone shims inside `cross_crate_inlinable` and found that rather tricky. `@compiler-errors` suggested that I add a lang item for `Clone::clone` because that would produce other cleanups in the compiler.
That sounds good to me, but I have looked and I've only been able to find one.
r? compiler-errors
Clean up a few minor refs in `format!` macro, as it has a performance cost. Apparently the compiler is unable to inline `format!("{}", &variable)`, and does a run-time double-reference instead (format macro already does one level referencing). Inlining format args prevents accidental `&` misuse.
In the future, branch and MC/DC mappings might have expressions that don't
correspond to any single point in the control-flow graph. That makes it
trickier to keep track of which expressions should expect an `ExpressionUsed`
node.
We therefore sidestep that complexity by only performing `ExpressionUsed`
simplification for expressions associated directly with ordinary `Code`
mappings.
[Coverage][MCDC] Group mcdc tests and fix panic when generating mcdc code for inlined expressions.
### Changes
1. Group all mcdc tests to one directory.
2. Since mcdc instruments different mappings for boolean expressions with normal branch coverage as #125766 introduces, it would be better also trace branch coverage results in mcdc tests.
3. So far rustc does not call `CoverageInfoBuilderMethods::init_coverage` for inlined functions. As a result, it could panic if it tries to instrument mcdc statements for inlined functions due to uninitialized cond bitmaps. We can reproduce this issue by current nightly rustc and [the test](https://github.com/rust-lang/rust/pull/127234/files#diff-c81af6bf4869aa42f5c7334e3e86344475de362f673f54ce439ec75fcb5ac3e5) with flag `--release`. This patch fixes it.
Support tail calls in mir via `TerminatorKind::TailCall`
This is one of the interesting bits in tail call implementation — MIR support.
This adds a new `TerminatorKind` which represents a tail call:
```rust
TailCall {
func: Operand<'tcx>,
args: Vec<Operand<'tcx>>,
fn_span: Span,
},
```
*Structurally* this is very similar to a normal `Call` but is missing a few fields:
- `destination` — tail calls don't write to destination, instead they pass caller's destination to the callee (such that eventual `return` will write to the caller of the function that used tail call)
- `target` — similarly to `destination` tail calls pass the caller's return address to the callee, so there is nothing to do
- `unwind` — I _think_ this is applicable too, although it's a bit confusing
- `call_source` — `become` forbids operators and is not created as a lowering of something else; tail calls always come from HIR (at least for now)
It might be helpful to read the interpreter implementation to understand what `TailCall` means exactly, although I've tried documenting it too.
-----
There are a few `FIXME`-questions still left, ideally we'd be able to answer them during review ':)
-----
r? `@oli-obk`
cc `@scottmcm` `@DrMeepster` `@JakobDegen`
Make jump threading state sparse
Continuation of https://github.com/rust-lang/rust/pull/127024
Both dataflow const-prop and jump threading involve cloning the state vector a lot. This PR replaces the data structure by a sparse vector, considering:
- that jump threading state is typically very sparse (at most 1 or 2 set entries);
- that dataflow const-prop is disabled by default;
- that place/value map is very eager, and prone to creating an overly large state.
The first commit is shared with the previous PR to avoid needless conflicts.
r? `@oli-obk`
Re-implement a type-size based limit
r? lcnr
This PR reintroduces the type length limit added in #37789, which was accidentally made practically useless by the caching changes to `Ty::walk` in #72412, which caused the `walk` function to no longer walk over identical elements.
Hitting this length limit is not fatal unless we are in codegen -- so it shouldn't affect passes like the mir inliner which creates potentially very large types (which we observed, for example, when the new trait solver compiles `itertools` in `--release` mode).
This also increases the type length limit from `1048576 == 2 ** 20` to `2 ** 24`, which covers all of the code that can be reached with craterbot-check. Individual crates can increase the length limit further if desired.
Perf regression is mild and I think we should accept it -- reinstating this limit is important for the new trait solver and to make sure we don't accidentally hit more type-size related regressions in the future.
Fixes#125460
Fix `FnMut::call_mut`/`Fn::call` shim for async closures that capture references
I adjusted async closures to be able to implement `Fn` and `FnMut` *even if* they capture references, as long as those references did not need to borrow data from the closure captures themselves. See #125259.
However, when I did this, I didn't actually relax an assertion in the `build_construct_coroutine_by_move_shim` shim code, which builds the `Fn`/`FnMut`/`FnOnce` implementations for async closures. Therefore, if we actually tried to *call* `FnMut`/`Fn` on async closures, it would ICE.
This PR adjusts this assertion to ensure that we only capture immutable references in closures if they implement `Fn`/`FnMut`. It also adds a bunch of tests and makes more of the async-closure tests into `build-pass` since we often care about these tests actually generating the right closure shims and stuff. I think it might be excessive to *always* use build-pass here, but 🤷 it's not that big of a deal.
Fixes#127019Fixes#127012
r? oli-obk
In 126578 we ended up with more binary size increases than expected.
This change attempts to avoid inlining large things into small things, to avoid that kind of increase, in cases when top-down inlining will still be able to do that inlining later.
Rollup of 7 pull requests
Successful merges:
- #126923 (test: dont optimize to invalid bitcasts)
- #127090 (Reduce merge conflicts from rustfmt's wrapping)
- #127105 (Only update `Eq` operands in GVN if it can update both sides)
- #127150 (Fix x86_64 code being produced for bare-metal LoongArch targets' `compiler_builtins`)
- #127181 (Introduce a `rustc_` attribute to dump all the `DefId` parents of a `DefId`)
- #127182 (Fix error in documentation for IpAddr::to_canonical and Ipv6Addr::to_canonical)
- #127191 (Ensure `out_of_scope_macro_calls` lint is registered)
r? `@ghost`
`@rustbot` modify labels: rollup
Automatically taint InferCtxt when errors are emitted
r? `@nnethercote`
Basically `InferCtxt::dcx` now returns a `DiagCtxt` that refers back to the `Cell<Option<ErrorGuaranteed>>` of the `InferCtxt` and thus when invoking `Diag::emit`, and the diagnostic is an error, we taint the `InferCtxt` directly.
That change on its own has no effect at all, because `InferCtxt` already tracks whether errors have been emitted by recording the global error count when it gets opened, and checking at the end whether the count changed. So I removed that error count check, which had a bit of fallout that I immediately fixed by invoking `InferCtxt::dcx` instead of `TyCtxt::dcx` in a bunch of places.
The remaining new errors are because an error was reported in another query, and never bubbled up. I think they are minor enough for this to be ok, and sometimes it actually improves diagnostics, by not silencing useful diagnostics anymore.
fixes#126485 (cc `@olafes)`
There are more improvements we can do (like tainting in hir ty lowering), but I would rather do that in follow up PRs, because it requires some refactorings.
coverage: Avoid getting extra unexpansion info when we don't need it
Several callers of `unexpand_into_body_span_with_visible_macro` would immediately discard the additional macro-related information, which is wasteful. We can avoid this by having them instead call a simpler method that just returns the span they care about.
This PR also moves the relevant functions out of `coverage::spans::from_mir` and into a new submodule `coverage::unexpand`, so that calling them from `coverage::mappings` is less awkward.
There should be no actual changes to coverage-instrumentation output, as demonstrated by the absence of test updates.
Avoid cloning jump threading state when possible
The current implementation of jump threading passes most of its time cloning its state. This PR attempts to avoid such clones by special-casing the last predecessor when recursing through a terminator.
This is not optimal, but a first step while I refactor the state data structure to be sparse.
The two other commits are drive-by.
Fixes https://github.com/rust-lang/rust/issues/116721
r? `@oli-obk`
These particular callers don't actually use the returned macro information, so
they can use a simpler span-unexpansion function that doesn't return it.
coverage: Make `#[coverage(..)]` apply recursively to nested functions
This PR makes the (currently-unstable) `#[coverage(off)]` and `#[coverage(on)]` attributes apply recursively to all nested functions/closures, instead of just the function they are directly attached to.
Those attributes can now also be applied to modules and to impl/impl-trait blocks, where they have no direct effect, but will be inherited by all enclosed functions/closures/methods that don't override the inherited value.
---
Fixes#126625.
Remove more `PtrToPtr` casts in GVN
This addresses two things I noticed in MIR:
1. `NonNull::<T>::eq` does `(a as *mut T) == (b as *mut T)`, but it could just compare the `*const T`s, so this removes `PtrToPtr` casts that are on both sides of a pointer comparison, so long as they're not fat-to-thin casts.
2. `NonNull::<T>::addr` does `transmute::<_, usize>(p as *const ())`, but so long as `T: Thin` that cast doesn't do anything, and thus we can directly transmute the `*const T` instead.
r? mir-opt
Save 2 pointers in `TerminatorKind` (96 → 80 bytes)
These things don't need to be `Vec`s; boxed slices are enough.
The frequent one here is call arguments, but MIR building knows the number of arguments from the THIR, so the collect is always getting the allocation right in the first place, and thus this shouldn't ever add the shrink-in-place overhead.
These things don't need to be `Vec`s; boxed slices are enough.
The frequent one here is call arguments, but MIR building knows the number of arguments from the THIR, so the collect is always getting the allocation right in the first place, and thus this shouldn't ever add the shrink-in-place overhead.
`PtrMetadata` doesn't care about `*const`/`*mut`/`&`/`&mut`, so GVN away those casts in its argument.
This includes updating MIR to allow calling PtrMetadata on references too, not just raw pointers. That means that `[T]::len` can be just `_0 = PtrMetadata(_1)`, for example.
# Conflicts:
# tests/mir-opt/pre-codegen/slice_index.slice_get_unchecked_mut_range.PreCodegen.after.panic-abort.mir
# tests/mir-opt/pre-codegen/slice_index.slice_get_unchecked_mut_range.PreCodegen.after.panic-unwind.mir
Account for things that optimize out in inlining costs
This updates the MIR inlining `CostChecker` to have both bonuses and penalties, rather than just penalties.
That lets us add bonuses for some things where we want to encourage inlining without risking wrapping into a gigantic cost. For example, `switchInt(const …)` we give an inlining bonus because codegen will actually eliminate the branch (and associated dead blocks) once it's monomorphized, so measuring both sides of the branch gives an unrealistically-high cost to it. Similarly, an `unreachable` terminator gets a small bonus, because whatever branch leads there doesn't actually exist post-codegen.
Clean up some comments near `use` declarations
#125443 will reformat all `use` declarations in the repository. There are a few edge cases involving comments on `use` declarations that require care. This PR cleans up some clumsy comment cases, taking us a step closer to #125443 being able to merge.
r? ``@lqd``
Stabilise `c_unwind`
Fix#74990Fix#115285 (that's also where FCP is happening)
Marking as draft PR for now due to `compiler_builtins` issues
r? `@Amanieu`
Most modules have such a blank line, but some don't. Inserting the blank
line makes it clearer that the `//!` comments are describing the entire
module, rather than the `use` declaration(s) that immediately follows.
Apparently MIR borrowck cares about at least one of these for checking variance.
In runtime MIR, though, there's no need for them as `PtrToPtr` does the same thing.
(Banning them simplifies passes like GVN that no longer need to handle multiple cast possibilities.)
Replace all `&DiagCtxt` with a `DiagCtxtHandle<'_>` wrapper type
r? `@davidtwco`
This paves the way for tracking more state (e.g. error tainting) in the diagnostic context handle
Basically I will add a field to the `DiagCtxtHandle` that refers back to the `InferCtxt`'s (and others) `Option<ErrorHandled>`, allowing us to immediately taint these contexts when emitting an error and not needing manual tainting anymore (which is easy to forget and we don't do in general anyway)
coverage: Add debugging flag `-Zcoverage-options=no-mir-spans`
When set, this flag skips the code that normally extracts coverage spans from MIR statements and terminators. That sometimes makes it easier to debug branch coverage and MC/DC coverage instrumentation, because the coverage output is less noisy.
For internal debugging only. If future code changes would make it hard to keep supporting this flag, it should be removed at that time.
`@rustbot` label +A-code-coverage
Rename `InstanceDef` -> `InstanceKind`
Renames `InstanceDef` to `InstanceKind`. The `Def` here is confusing, and makes it hard to distinguish `Instance` and `InstanceDef`. `InstanceKind` makes this more obvious, since it's really just describing what *kind* of instance we have.
Not sure if this is large enough to warrant a types team MCP -- it's only 53 files. I don't personally think it does, but happy to write one if anyone disagrees. cc ``@rust-lang/types``
r? types
When set, this flag skips the code that normally extracts coverage spans from
MIR statements and terminators. That sometimes makes it easier to debug branch
coverage and MC/DC coverage, because the coverage output is less noisy.
For internal debugging only. If other code changes would make it hard to keep
supporting this flag, remove it.
coverage: Several small improvements to graph code
This PR combines a few small improvements to coverage graph handling code:
- Remove some low-value implementation tests that were getting in the way of other changes.
- Clean up `pub` visibility.
- Flatten some code using let-else.
- Prefer `.copied()` over `.cloned()`.
`@rustbot` label +A-code-coverage
These tests might have originally been useful as an implementation aid, but now
they don't provide enough value to justify the burden of updating them as the
underlying code changes.
The code they test is still exercised by the main end-to-end coverage tests.
smir: merge identical Constant and ConstOperand types
The first commit renames the const operand visitor functions on regular MIR to match the type name, that was forgotten in the original rename.
The second commit changes stable MIR, fixing https://github.com/rust-lang/project-stable-mir/issues/71. Previously there were two different smir types for the MIR type `ConstOperand`, one used in `Operand` and one in `VarDebugInfoContents`.
Maybe we should have done this with https://github.com/rust-lang/rust/pull/125967, so there's only a single breaking change... but I saw that PR too late.
Fixes https://github.com/rust-lang/project-stable-mir/issues/71
Use `Variance` glob imported variants everywhere
Fully commit to using the globbed variance. Could be convinced the other way, and change this PR to not use the globbed variants anywhere, but I'd rather we do one or the other.
r? lcnr