The issue here is that the logic used to determine which CGU to put the
dead function stubs in doesn't handle cases where a module is never
assigned to a CGU.
The partitioning logic also caused issues in #85461 where inline
functions were duplicated into multiple CGUs resulting in duplicate
symbols.
This commit fixes the issue by removing the complex logic used to assign
dead code stubs to CGUs and replaces it with a much simpler model: we
pick one CGU to hold all the dead code stubs. We pick a CGU that has
exported items, which increases the likelihood the linker won't throw
away our dead functions, and we pick the smallest one to minimize the
impact on compilation times for crates with very large CGUs.
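A minimal sketch of that selection strategy; the `Cgu` type and its fields here are hypothetical stand-ins, not rustc's actual partitioning API:
```rust
// Hypothetical stand-in for a codegen unit; not rustc's real type.
struct Cgu {
    has_exported_items: bool,
    estimated_size: usize,
}

// Pick the smallest CGU that exports items to host all dead-code stubs.
fn pick_dead_code_cgu(cgus: &[Cgu]) -> Option<&Cgu> {
    cgus.iter()
        .filter(|cgu| cgu.has_exported_items)
        .min_by_key(|cgu| cgu.estimated_size)
}
```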
Fixes #86177
Fixes #85718
Fixes #79622
Move generator check earlier in inlining.
Inlining into a generator may create references to other generators. For instance, inlining `Pin::<&mut from_generator::GenFuture<[generator1]>>::new_unchecked` into `generator2`. This cross-reference can then create cycles when computing inlining for `generator1`.
In order to avoid this kind of surprise, we forbid all inlining into generators and rely on LLVM to do the right thing. The existing `remove-zst-query-cycle` test already ICEs in inline-mir mode, so we use it as a test.
Split from #91743.
Previously some code paths would fail to evaluate the rvalue, while
incorrectly indicating success with `Ok`. As a result, the previous value
of the lhs could have been incorrectly const propagated.
This optimization pass previously made excessive assumptions as to the nature of
the blocks being optimized. We remove those assumptions and make sure to
rigorously justify all changes that are made to the MIR. Details can be found
in the file.
Remove `in_band_lifetimes` from `rustc_mir_transform`
Like #91580, this was inspired by the conversation in #44524 about possibly removing the feature from the compiler. This crate is a heavy `'tcx` user, so is a nice case study.
r? ``@petrochenkov``
Three interesting ones:
This one had the `'tcx` declared on the function, despite the trait taking a `'tcx`:
```diff
-impl Visitor<'_> for UsedLocals {
+impl<'tcx> Visitor<'tcx> for UsedLocals {
fn visit_statement(&mut self, statement: &Statement<'tcx>, location: Location) {
```
This one used in-band for one lifetime, and an underscore for the other:
```diff
-pub fn remove_dead_blocks(tcx: TyCtxt<'tcx>, body: &mut Body<'_>) {
+pub fn remove_dead_blocks<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
```
A spurious name, since there's no single-use-lifetime warning:
```diff
-pub fn run_passes(tcx: TyCtxt<'tcx>, body: &'mir mut Body<'tcx>, passes: &[&dyn MirPass<'tcx>]) {
+pub fn run_passes<'tcx>(tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>, passes: &[&dyn MirPass<'tcx>]) {
```
Address some FIXMEs left over from #91475
This shouldn't change behavior, only clarify what we're currently doing. I filed #91576 to see if the treatment of generator drop shims is intentional.
cc #91475
Add a MIR pass manager (Taylor's Version)
The final draft of #91386 and #77665.
While the compile-time constraints in #91386 are cool, I decided on a more minimal approach for now. I want to explore phase constraints and maybe relative-ordering constraints in the future, though. This should preserve existing behavior **exactly** (please let me know if it doesn't) while making the following changes to the way we organize things today:
- Each `MirPhase` now corresponds to a single MIR pass. `run_passes` is not responsible for listing the correct MIR phase.
- `run_passes` no longer silently skips passes if the declared MIR phase is greater than or equal to the body's. This has bitten me multiple times. If you want this behavior, you can always branch on `body.phase` yourself.
- If your pass is solely to emit errors, you can use the `MirLint` interface instead, which gets a shared reference to `Body` instead of a mutable one. By differentiating the two, I hope to make it clearer in the short term where lints belong in the pipeline. In the long term perhaps we could enforce this at compile-time? (A sketch of the two interfaces follows this list.)
- MIR is no longer dumped for passes that aren't enabled, or for lints.
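A rough sketch of the `MirPass`/`MirLint` split described above, with signatures assumed from this description (using rustc's `TyCtxt` and `Body` types) rather than copied verbatim from the PR:
```rust
use rustc_middle::mir::Body;
use rustc_middle::ty::TyCtxt;

// Passes get mutable access and may transform the MIR.
pub trait MirPass<'tcx> {
    fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>);
}

// Lints only observe the MIR: the shared reference makes it impossible
// for an error-emitting pass to accidentally change the program.
pub trait MirLint<'tcx> {
    fn run_lint(&self, tcx: TyCtxt<'tcx>, body: &Body<'tcx>);
}
```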
I tried to check that `-Zvalidate` still works correctly, since the MIR phase is now updated as soon as the associated pass is done, instead of at the end of all the passes in `run_passes`. However, it looks like `-Zvalidate` is broken with current nightlies anyways 😢 (it spits out a bunch of errors).
cc `@oli-obk` `@wesleywiser`
r? rust-lang/wg-mir-opt
Looks like Generator drop shims already have `post_borrowck_cleanup` run
on them. That's a bit surprising, since it means they're getting const-
and maybe borrow-checked? This merits further investigation, but for now
just preserve the status quo.
Move `#![feature(const_precise_live_drops)]` checks earlier in the pipeline
Should mitigate the issues found during MCP on #73255.
Once this is done, we should clean up the queries a bit, since I think `mir_drops_elaborated_and_const_checked` can be merged back into `mir_promoted`.
Fixes #90770.
cc ``@rust-lang/wg-const-eval``
r? ``@nikomatsakis`` (since they reviewed #71824)
Instead we run `RemoveFalseEdges` and `RemoveUninitDrops` at the
appropriate time. The extra `SimplifyCfg` avoids visiting unreachable
blocks during `RemoveUninitDrops`.
Otherwise dataflow state will propagate along false edges and cause
things to be marked as maybe init unnecessarily. These should be
separate, since `SimplifyBranches` also makes `if true {} else {}` into
a `goto`, which means we wouldn't lint anything in the `else` block.
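A small illustration (example mine) of that ordering concern:
```rust
fn example() {
    // `SimplifyBranches` turns this into a `goto` to the taken branch,
    // so a lint running after it would never visit the `else` block.
    if true {
        // still visited after simplification
    } else {
        // unreachable after simplification; would not be linted
    }
}
```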
Previously for enums using the `Variants::Single` layout, the variant
index was being confused with its discriminant. For example, in the case
of `enum E { A = 1 }`.
Use `discriminant_for_variant` to avoid the issue.
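A concrete illustration (example mine, matching the case above): the enum below uses the `Variants::Single` layout, and the variant index of `E::A` is 0 while its discriminant is 1.
```rust
enum E {
    A = 1, // variant index 0, discriminant 1
}

fn main() {
    // Casting yields the discriminant (1), not the variant index (0).
    assert_eq!(E::A as isize, 1);
}
```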
The change is limited to iterating over indices instead of using
`basic_blocks_mut()` directly, in case the previous implementation
intentionally avoided invalidating the caches stored in the MIR body.
Remove hir::map::blocks and use FnKind instead
The principal tool is `FnLikeNode`, which is not often used and can be easily implemented using `rustc_hir::intravisit::FnKind`.
Adopt let_else across the compiler
This performs a substitution of code following the pattern:
```
let <id> = if let <pat> = ... { identity } else { ... : ! };
```
To simplify it to:
```
let <pat> = ... else { ... : ! };
```
By adopting the `let_else` feature (cc #87335).
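As a concrete instance of the pattern (illustrative example, not taken from the PR):
```rust
#![feature(let_else)] // unstable at the time of this PR

fn first_char(s: &str) -> char {
    // Before: let c = if let Some(c) = s.chars().next() { c } else { return '?' };
    let Some(c) = s.chars().next() else { return '?' };
    c
}
```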
The PR also updates the syn crate because the currently used version of the crate doesn't support `let_else` syntax yet.
Note: Generally I'm the person who *removes* usages of unstable features from the compiler, not adds more usages of them, but in this instance I think it hopefully helps the feature get stabilized sooner and in a better state. I have written a [comment](https://github.com/rust-lang/rust/issues/87335#issuecomment-944846205) on the tracking issue about my experience and what I feel could be improved before stabilization of `let_else`.
Index and hash HIR as part of lowering
Part of https://github.com/rust-lang/rust/pull/88186
~Based on https://github.com/rust-lang/rust/pull/88880 (see merge commit).~
Once HIR is lowered, it is later indexed by the `index_hir` query and hashed for `crate_hash`. This PR moves those post-processing steps to lowering itself. As a side objective, the HIR crate data structure is refactored as an `IndexVec<LocalDefId, Option<OwnerInfo<'hir>>>` where `OwnerInfo` stores all the relevant information for an HIR owner.
r? `@michaelwoerister`
cc `@petrochenkov`
Stabilize `const_panic`
Closes #51999
FCP completed in #89006
```@rustbot``` label +A-const-eval +A-const-fn +T-lang
cc ```@oli-obk``` for review (not `r?`'ing as not on lang team)
Rework HIR API to make invocations of the hir_crate query harder.
`hir_crate` forces the recomputation of queries that depend on it.
This PR aims at avoiding useless invocations of `hir_crate` by making dependent code go through `tcx.hir()`.
Correct caller/callsite confusion in inliner message
`callee_body` is the MIR `Body` for the `callsite.callee` so this message basically says `"Inline {bar span} into bar"` when it should say `"Inline bar into foo"`.
Extracted out of #82280
When remapping a resume argument with projections, rebase them on top of
the new base.
The case where the resume argument has projections is unusual, but might
arise with box syntax where the assignment is performed directly into
the box without an intermediate temporary.
Introduce `Rvalue::ShallowInitBox`
Polished version of #88700.
Implements MCP rust-lang/compiler-team#460, and should allow #43596 to go forward.
In short, creating an empty box is split from a nullary-op `NullOp::Box` into two steps, first a call to `exchange_malloc`, then a `Rvalue::ShallowInitBox` which transmutes `*mut u8` to a shallow-initialized `Box<T>`. This allows the `exchange_malloc` call to unwind. Details can be found in the MCP.
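For illustration (example mine), the surface form affected; the allocation can now unwind before the payload is written:
```rust
#![feature(box_syntax)] // nightly-only box expressions

fn make() -> Box<[u8; 1024]> {
    // With this PR, MIR first calls `exchange_malloc` (a call that may
    // unwind), then applies `Rvalue::ShallowInitBox` to turn the returned
    // `*mut u8` into a shallow-initialized `Box<[u8; 1024]>`, and finally
    // writes the payload into the box.
    box [0u8; 1024]
}

fn main() {
    let b = make();
    assert_eq!(b.len(), 1024);
}
```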
`NullOp::Box` is not yet removed, purely to make reverting easier in case anything goes wrong as a result of this PR. If a revert is needed, reverting the "Use Rvalue::ShallowInitBox for box expression" commit followed by a test bless should be sufficient.
Experiments in #88700 showed a very slight compile-time perf regression due to (supposedly) slightly more time spent in LLVM. We could omit unwind edge generation (in the non-`oom=panic` case) in box expression MIR construction to restore perf; but I don't think it's necessary since runtime perf isn't affected and the perf difference is rather small.
This PR allows applying a `#[track_caller]` attribute to a
closure/generator expression. The attribute is interpreted as applying
to the compiler-generated implementation of the corresponding trait
method (`FnOnce::call_once`, `FnMut::call_mut`, `Fn::call`, or
`Generator::resume`).
This feature does not have its own feature gate - however, it requires
`#![feature(stmt_expr_attributes)]` in order to actually apply
an attribute to a closure or generator.
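A minimal usage sketch (example mine; nightly-only as of this PR):
```rust
#![feature(stmt_expr_attributes)]

use std::panic::Location;

fn main() {
    // The attribute applies to the compiler-generated `Fn*` impl, so
    // `Location::caller()` inside the body observes the closure's call
    // site rather than a line inside the closure itself.
    let f = #[track_caller] || {
        println!("called from {}", Location::caller());
    };
    f(); // the reported location points at this line
}
```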
This is implemented in the same way as for functions - an extra
location argument is appended to the end of the ABI. For closures,
this argument is *not* part of the 'tupled' argument storing the
parameters - the final closure argument for `#[track_caller]` closures
is no longer a tuple.
For direct (monomorphized) calls, the necessary support was already
implemented - we just needed to adjust some assertions around checking
the ABI and argument count to take closures into account.
For calls through a trait object, more work was needed.
When creating a `ReifyShim`, we need to create a shim
for the trait method (e.g. `FnMut::call_mut`) - unlike normal
functions, closures are never invoked directly, and always go through a
trait method.
Additional handling was needed for `InstanceDef::ClosureOnceShim`. In
order to pass location information through a direct (monomorphized) call
to `FnOnce::call_once` on an `FnMut` closure, we need to make
`ClosureOnceShim` aware of `#[track_caller]`. A new field
`track_caller` is added to `ClosureOnceShim` - this is used by
`InstanceDef::requires_caller_location`, allowing codegen to
pass through the extra location argument.
Since `ClosureOnceShim.track_caller` is only used by codegen,
we end up generating two identical MIR shims - one for
`track_caller == true`, and one for `track_caller == false`. However,
these two shims are used by the entire crate (i.e. it's two shims total,
not two shims per unique closure), so this shouldn't be a big deal.
If any block on a goto chain has more than one predecessor, then the new
start block would have basic block predecessors.
Skip the transformation for the start block altogether, to avoid
violating the new invariant that the start block does not have any basic
block predecessors.
Rollup of 10 pull requests
Successful merges:
- #88292 (Enable --generate-link-to-definition for rustc's docs)
- #88729 (Recover from `Foo(a: 1, b: 2)`)
- #88875 (cleanup(rustc_trait_selection): remove vestigial code from rustc_on_unimplemented)
- #88892 (Move object safety suggestions to the end of the error)
- #88928 (Document the closure arguments for `reduce`.)
- #88976 (Clean up and add doc comments for CStr)
- #88983 (Allow calling `get_body_with_borrowck_facts` without `-Z polonius`)
- #88985 (Update clobber_abi list to include k[1-7] regs)
- #88986 (Update the backtrace crate)
- #89009 (Fix typo in `break` docs)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Querying layout of a generator requires its optimized MIR. Thus
computing layout during MIR optimization of a generator might create a
query cycle. Disable RemoveZsts in generators to avoid the issue
(a similar approach is already used in the ConstProp transform).
Introduce NullOp::AlignOf
This PR introduces `Rvalue::NullaryOp(NullOp::AlignOf, ty)`, which will be lowered from `align_of`, similar to `size_of` lowering to `Rvalue::NullaryOp(NullOp::SizeOf, ty)`.
The changes are originally part of #88700 but since it's not dependent on other changes and could have performance impact on its own, it's separated into its own PR.
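For illustration (example mine), the kind of call that now lowers to the new nullary op:
```rust
fn main() {
    // After this PR, `align_of::<T>()` lowers to
    // `Rvalue::NullaryOp(NullOp::AlignOf, T)` in MIR, mirroring how
    // `size_of::<T>()` lowers to `NullOp::SizeOf`.
    println!("align of u64: {}", std::mem::align_of::<u64>());
}
```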
Add -Z panic-in-drop={unwind,abort} command-line option
This PR adds an option controlling whether `Drop` implementations can unwind; with `abort`, an unwinding panic that attempts to escape a `Drop` implementation aborts the process instead. This has several benefits:
- The current behavior when unwinding out of `Drop` is very unintuitive and easy to miss: unwinding continues, but the remaining drops in scope are simply leaked.
- A lot of unsafe code doesn't expect drops to unwind, which can lead to unsoundness:
- https://github.com/servo/rust-smallvec/issues/14
- https://github.com/bluss/arrayvec/issues/3
- There is a code size and compilation time cost to allowing unwinding: LLVM needs to generate extra landing pads for all calls in a drop implementation. This can compound when functions are inlined, since unwinding will then continue on to process drops in the callee, which can itself unwind, etc.
- Initial measurements show a 3% size reduction and up to 10% compilation time reduction on some crates (`syn`).
One thing to note about `-Z panic-in-drop=abort` is that *all* crates must be built with this option for it to be sound since it makes the compiler assume that dropping `Box<dyn Any>` will never unwind.
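For example (illustrative), with `-Z panic-in-drop=abort` the panic below aborts the process instead of unwinding out of `drop`:
```rust
struct Noisy;

impl Drop for Noisy {
    fn drop(&mut self) {
        // With -Z panic-in-drop=abort this panic aborts the process;
        // with the default (-Z panic-in-drop=unwind) it unwinds out of
        // `drop` and into the caller's cleanup.
        panic!("panic from drop");
    }
}

fn main() {
    let _n = Noisy;
}
```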
cc https://github.com/rust-lang/lang-team/issues/97
generic_const_exprs: use thir for abstract consts instead of mir
Changes `AbstractConst` building to use `thir` instead of `mir` so that there's less chance of consts unifying when they shouldn't because lowering to mir dropped information (see `abstract-consts-as-cast-5.rs` test)
r? `@lcnr`
Encode spans relative to the enclosing item
The aim of this PR is to avoid recomputing queries when code is moved without modification.
MCP at https://github.com/rust-lang/compiler-team/issues/443
This is achieved by:
1. storing the HIR owner LocalDefId information inside the span;
2. encoding and decoding spans relative to the enclosing item in the incremental on-disk cache;
3. marking a dependency to the `source_span(LocalDefId)` query when we translate a span from the short (`Span`) representation to its explicit (`SpanData`) representation.
Since all client code uses `Span`, step 3 ensures that all manipulations
of span byte positions actually create the dependency edge between
the caller and the `source_span(LocalDefId)`.
This query returns the actual absolute span of the parent item.
As a consequence, any source code motion that changes the absolute byte position of a node will either:
- modify the distance to the parent's beginning, so change the relative span's hash;
- dirty `source_span`, and trigger the incremental recomputation of all code that
depends on the span's absolute byte position.
With this scheme, I believe the dependency tracking to be accurate.
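A schematic sketch of the encoding (types simplified and hypothetical, not rustc's actual representation):
```rust
// Hypothetical stand-in for a HIR owner id.
struct LocalDefId(u32);

// A span stored relative to its enclosing item. Moving the item without
// modifying it leaves `lo` and `len` unchanged, so the relative span's
// hash is stable; only the parent's absolute position, tracked by the
// `source_span(LocalDefId)` query, changes.
struct RelativeSpan {
    parent: LocalDefId,
    lo: u32,  // byte offset from the parent item's start
    len: u32, // length in bytes
}
```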
For the moment, the spans are marked during lowering.
I'd rather do this during def-collection,
but the AST MutVisitor is not practical enough just yet.
The only difference is that we attach macro-expanded spans
to their expansion point instead of the macro itself.