[libs] Simplify `unchecked_{shl,shr}`
There's no need for the `const_eval_select` dance here. And while I originally wrote the `.try_into().unwrap_unchecked()` implementation here, it's kinda a mess in MIR -- this new one is substantially simpler, as shown by the old one being above the inlining threshold but the new one being below it in the `mir-opt/inline/unchecked_shifts` tests.
We don't need `u32::checked_shl` doing a dance through both `Result` *and* `Option` 🙃
Remove `box_free` lang item
This PR removes the `box_free` lang item, replacing it with `Box`'s `Drop` impl. Box dropping is still slightly magic because the contained value is still dropped by the compiler.
There's no need for the `const_eval_select` dance here. And while I originally wrote the `.try_into().unwrap_unchecked()` implementation here, it's kinda a mess in MIR -- this new one is substantially simpler, as shown by the old one being above the inlining threshold but the new one being below it.
To reproduce the changes in this commit locally:
- Run `./x test tidy` and remove all the output files not associated
with a test file anymore, as reported by tidy.
- Run `./x test tests/mir-opt --bless` to generate the new outputs.
Only check inlining counter after recursing.
This PR aims to reduce the strength of https://github.com/rust-lang/rust/pull/105119 even more.
In the current implementation, we check the inline count before recursing. This means that we never actually reach inlining depth 3.
This PR checks the counter after recursion, to give a chance to inline at depth >= 3.
r? `@scottmcm`
cc `@JakobDegen`
Enable ConstGoto and SeparateConstSwitch passes by default
These 2 passes implement a limited form of jump-threading.
Filing this PR to see if enabling them would be lighter than https://github.com/rust-lang/rust/pull/107009.
Enable ScalarReplacementOfAggregates in optimized builds
Like MatchBranchSimplification, this pass is known to produce significant runtime improvements in Cranelift artifacts, and I believe based on the perf runs here that the primary effect of this pass is to empower MatchBranchSimplification. ScalarReplacementOfAggregates on its own has little effect on anything, but when this was rebased up to include https://github.com/rust-lang/rust/pull/112001 we started seeing significant and majority-positive results.
Based on the fact that we see most of the regressions in debug builds (https://github.com/rust-lang/rust/pull/112002#issuecomment-1566270144) and some rather significant ones in cycles and wall time, I'm only enabling this in optimized builds at the moment.
All the implementations of the trait already are `Copy`, and this seems to be enough to simplify the implementations enough to make the MIR inliner willing to inline basics like `Range::next`.
MIR: opt-in normalization of `BasicBlock` and `Local` numbering
This doesn't matter at all for actual codegen, but after spending some time reading pre-codegen MIR, I was wishing I didn't have to jump around so much in reading post-inlining code.
So this add two passes that are off by default for every mir level, but can be enabled (`-Zmir-enable-passes=+ReorderBasicBlocks,+ReorderLocals`) for humans.
Since this only affects `PreCodegen MIR, and it would be nice for that to be resilient to permutations of things that don't affect the actual semantic behaviours.
Stop turning transmutes into discriminant reads in mir-opt
Partially reverts #109612, as after #109993 these aren't actually equivalent any more, and I'm no longer confident this was ever an improvement in the first place.
Having this "simplification" meant that similar-looking code actually did somewhat different things. For example,
```rust
pub unsafe fn demo1(x: std::cmp::Ordering) -> u8 {
std::mem::transmute(x)
}
pub unsafe fn demo2(x: std::cmp::Ordering) -> i8 {
std::mem::transmute(x)
}
```
in nightly today is generating <https://rust.godbolt.org/z/dPK58zW18>
```llvm
define noundef i8 `@_ZN7example5demo117h341ef313673d2ee6E(i8` noundef %x) unnamed_addr #0 {
%0 = icmp uge i8 %x, -1
%1 = icmp ule i8 %x, 1
%2 = or i1 %0, %1
call void `@llvm.assume(i1` %2)
ret i8 %x
}
define noundef i8 `@_ZN7example5demo217h5ad29f361a3f5700E(i8` noundef %0) unnamed_addr #0 {
%x = alloca i8, align 1
store i8 %0, ptr %x, align 1
%1 = load i8, ptr %x, align 1, !range !2, !noundef !3
ret i8 %1
}
```
Which feels too different when the original code is essentially identical.
---
Aside: that example is different *after* optimizations too:
```llvm
define noundef i8 `@_ZN7example5demo117h341ef313673d2ee6E(i8` noundef returned %x) unnamed_addr #0 {
%0 = add i8 %x, 1
%1 = icmp ult i8 %0, 3
tail call void `@llvm.assume(i1` %1)
ret i8 %x
}
define noundef i8 `@_ZN7example5demo217h5ad29f361a3f5700E(i8` noundef returned %0) unnamed_addr #1 {
ret i8 %0
}
```
so turning the `Transmute` into a `Discriminant` was arguably just making things worse, so leaving it alone instead -- and thus having less code in rustc -- seems clearly better.
debug format `Const`'s less verbosely
Not user visible change only visible to people debugging const generics.
Currently debug output for `ty::Const` is super verbose (even for `-Zverbose` lol), things like printing infer vars as `Infer(Var(?0c))` instead of just `?0c`, bound vars and placeholders not using `^0_1` or `!0_1` syntax respectively. With these changes its imo better but not perfect:
`Const { ty: usize, kind: ^0_1 }`
is still a lot for not much information. not entirely sure what to do about that so not dealing with it yet.
Need to do formatting for `ConstKind::Expr` at some point too since rn it sucks (doesn't even print anything with `Display`) not gonna do that in this PR either.
r? `@compiler-errors`
Partially reverts 109612, as after 109993 these aren't actually equivalent any more, and I'm no longer confident this was ever an improvement in the first place.