Don't use `zip` to compare iterators during pretty-print hack
If the right-hand iterator has exactly one more element than the
left-hand iterator, then both iterators will be fully consumed, but
the extra element will never be compared.
Split out from https://github.com/rust-lang/rust/pull/76130
If the right-hand iterator has exactly one more element than the
left-hand iterator, then both iterators will be fully consumed, but
the extra element will never be compared.
Make `ensure_sufficient_stack()` non-generic, using cargo-llvm-lines
Inspired by [this blog post](https://blog.mozilla.org/nnethercote/2020/08/05/how-to-speed-up-the-rust-compiler-some-more-in-2020/) from `@nnethercote,` I used [cargo-llvm-lines](https://github.com/dtolnay/cargo-llvm-lines/) on the rust compiler itself, to improve it's compile time. This PR contains only one low-hanging fruit, but I also want to share some measurements.
The function `ensure_sufficient_stack()` was monomorphized 1500 times, and with it the `stacker` and `psm` crates, for a total of 1.5% of all llvm IR lines. With some trickery I convert the generic closure into a dynamic one, and thus all that code is only monomorphized once.
# Measurements
Getting these numbers took some fiddling with CLI flags and I [modified](https://github.com/Julian-Wollersberger/cargo-llvm-lines/blob/master/src/main.rs#L115) cargo-llvm-lines to read from a folder instead of invoking cargo. Commands I used:
```
./x.py clean
RUSTFLAGS="--emit=llvm-ir -C link-args=-fuse-ld=lld -Z self-profile=profile" CARGOFLAGS_BOOTSTRAP="-Ztimings" RUSTC_BOOTSTRAP=1 ./x.py build -i --stage 1 library/std
# Then manually copy all .ll files into a folder I hardcoded in cargo-llvm-lines in main.rs#L115
cd ../cargo-llvm-lines
cargo run llvm-lines
```
The result is this list (see [first 500 lines](https://github.com/Julian-Wollersberger/cargo-llvm-lines/blob/master/llvm-lines-rustc-before.txt) ), before the change:
```
Lines Copies Function name
----- ------ -------------
16894211 (100%) 58417 (100%) (TOTAL)
2223855 (13.2%) 502 (0.9%) rustc_query_system::query::plumbing::get_query_impl::{{closure}}
1331918 (7.9%) 1287 (2.2%) hashbrown::raw::RawTable<T>::reserve_rehash
774434 (4.6%) 12043 (20.6%) core::ptr::drop_in_place
294170 (1.7%) 499 (0.9%) rustc_query_system::dep_graph::graph::DepGraph<K>::with_task_impl
245410 (1.5%) 1552 (2.7%) psm::on_stack::with_on_stack
210311 (1.2%) 1 (0.0%) rustc_target::spec::load_specific
200962 (1.2%) 513 (0.9%) rustc_query_system::query::plumbing::get_query_impl
190704 (1.1%) 1 (0.0%) rustc_middle::ty::query::<impl rustc_middle::ty::context::TyCtxt>::alloc_self_profile_query_strings
180272 (1.1%) 468 (0.8%) rustc_query_system::query::plumbing::load_from_disk_and_cache_in_memory
177396 (1.1%) 114 (0.2%) rustc_query_system::query::plumbing::force_query_impl
161134 (1.0%) 445 (0.8%) rustc_query_system::dep_graph::graph::DepGraph<K>::with_anon_task
141551 (0.8%) 186 (0.3%) rustc_query_system::query::plumbing::incremental_verify_ich
110191 (0.7%) 7 (0.0%) rustc_middle::ty::context::_DERIVE_rustc_serialize_Decodable_D_FOR_TypeckResults::<impl rustc_serialize::serialize::Decodable<__D> for rustc_middle::ty::context::TypeckResults>::decode::{{closure}}
108590 (0.6%) 420 (0.7%) core::ops::function::FnOnce::call_once
88488 (0.5%) 21 (0.0%) rustc_query_system::dep_graph::graph::DepGraph<K>::try_mark_previous_green
86368 (0.5%) 1 (0.0%) rustc_middle::ty::query::stats::query_stats
85654 (0.5%) 3973 (6.8%) <&T as core::fmt::Debug>::fmt
84475 (0.5%) 1 (0.0%) rustc_middle::ty::query::Queries::try_collect_active_jobs
81220 (0.5%) 862 (1.5%) <hashbrown::raw::RawIterHash<T> as core::iter::traits::iterator::Iterator>::next
77636 (0.5%) 54 (0.1%) core::slice::sort::recurse
66484 (0.4%) 461 (0.8%) <hashbrown::raw::RawIter<T> as core::iter::traits::iterator::Iterator>::next
```
All `.ll` files together had 4.4GB. After my change they had 4.2GB. So a few percent less code LLVM has to process. Hurray!
Sadly, I couldn't measure an actual wall-time improvement. Watching YouTube while compiling added to much noise...
Here is the top of the list after the change:
```
16460866 (100%) 58341 (100%) (TOTAL)
1903085 (11.6%) 504 (0.9%) rustc_query_system::query::plumbing::get_query_impl::{{closure}}
1331918 (8.1%) 1287 (2.2%) hashbrown::raw::RawTable<T>::reserve_rehash
777796 (4.7%) 12031 (20.6%) core::ptr::drop_in_place
551462 (3.4%) 1519 (2.6%) rustc_data_structures::stack::ensure_sufficient_stack::{{closure}}
```
Note that the total was reduced by 430 000 lines and `psm::on_stack::with_on_stack` has disappeared. Instead `rustc_data_structures::stack::ensure_sufficient_stack::{{closure}}` appeared. I'm confused about that one, but it seems to consist of inlined calls to `rustc_query_system::*` stuff.
Further note the other two big culprits in this list: `rustc_query_system` and `hashbrown`. These two are monomorphized many times, the query system summing to more than 20% of all lines, not even counting code that's probably inlined elsewhere.
Assuming compile times scale linearly with llvm-lines, that means a possible 20% compile time reduction.
Reducing eg. `get_query_impl` would probably need a major refactoring of the qery system though. _Everything_ in there is generic over multiple types, has associated types and passes generic Self arguments by value. Which means you can't simply make things `dyn`.
---------------------------------------
This PR is a small step to make rustc compile faster and thus make contributing to rustc less painful. Nonetheless I love Rust and I find the work around rustc fascinating :)
Remove DeclareMethods
Most of the `DeclareMethods` API was only used internally by rustc_codegen_llvm. As such, it makes no sense to require other backends to implement them.
(`get_declared_value` and `declare_cfn` were used, in one place, specific to the `main` symbol, which I've replaced with a more specialized function to allow more flexibility in implementation - the intent is that `declare_c_main` can go away once we do something more clever, e.g. @eddyb has ideas around having a MIR shim or somesuch we can explore in a follow-up PR)
Let user see the full type of type-length limit error
Seeing the full type of the error is sometimes essential to diagnosing the problem, but the type itself is too long to be displayed in the terminal in a useful fashion. This change solves this dilemma by writing the full offending type name to a file, and displays this filename as a note.
> note: the full type name been written to '$TEST_BUILD_DIR/issues/issue-22638/issue-22638.long-type.txt'
Closes#76777
Remove MMX from Rust
Follow-up to https://github.com/rust-lang/stdarch/pull/890
This removes most of MMX from Rust (tests pass with small changes), keeping stable `is_x86_feature_detected!("mmx")` working.
New MIR optimization pass to reduce branches on match of tuples of enums
Fixes#68867 by adding a new pass that turns something like
```rust
let x: Option<()>;
let y: Option<()>;
match (x,y) {
(Some(_), Some(_)) => {0},
_ => {1}
}
```
into something like
```rust
let x: Option<()>;
let y: Option<()>;
let discriminant_x = // get discriminant of x
let discriminant_y = // get discriminant of x
if discriminant_x != discriminant_y {1} else {0}
```
The opt-diffs still have the old basic blocks like
```
bb3: {
_8 = discriminant((*(_4.1: &ViewportPercentageLength))); // scope 0 at $DIR/early-otherwise-branch-68867.rs:21:21: 21:30
switchInt(move _8) -> [1_isize: bb7, otherwise: bb2]; // scope 0 at $DIR/early-otherwise-branch-68867.rs:21:21: 21:30
}
bb4: {
_9 = discriminant((*(_4.1: &ViewportPercentageLength))); // scope 0 at $DIR/early-otherwise-branch-68867.rs:22:23: 22:34
switchInt(move _9) -> [2_isize: bb8, otherwise: bb2]; // scope 0 at $DIR/early-otherwise-branch-68867.rs:22:23: 22:34
}
bb5: {
_10 = discriminant((*(_4.1: &ViewportPercentageLength))); // scope 0 at $DIR/early-otherwise-branch-68867.rs:23:23: 23:34
switchInt(move _10) -> [3_isize: bb9, otherwise: bb2]; // scope 0 at $DIR/early-otherwise-branch-68867.rs:23:23: 23:34
}
```
These do get removed on later passes. I'm not sure if I should include those passes in the test to make it clear?
Don't allow implementing trait directly on type-alias-impl-trait
This is specifically disallowed by the RFC, but we never added a check
for it.
Fixes#76202
transmute: use diagnostic item
closes#66075, we now have no remaining uses of `match_def_path` in the compiler while some uses still remain in `clippy`.
cc @RalfJung
Rollup of 15 pull requests
Successful merges:
- #76722 (Test and fix Send and Sync traits of BTreeMap artefacts)
- #76766 (Extract some intrinsics out of rustc_codegen_llvm)
- #76800 (Don't generate bootstrap usage unless it's needed)
- #76809 (simplfy condition in ItemLowerer::with_trait_impl_ref())
- #76815 (Fix wording in mir doc)
- #76818 (Don't compile regex at every function call.)
- #76821 (Remove redundant nightly features)
- #76823 (black_box: silence unused_mut warning when building with cfg(miri))
- #76825 (use `array_windows` instead of `windows` in the compiler)
- #76827 (fix array_windows docs)
- #76828 (use strip_prefix over starts_with and manual slicing based on pattern length (clippy::manual_strip))
- #76840 (Move to intra doc links in core/src/future)
- #76845 (Use intra docs links in core::{ascii, option, str, pattern, hash::map})
- #76853 (Use intra-doc links in library/core/src/task/wake.rs)
- #76871 (support panic=abort in Miri)
Failed merges:
r? `@ghost`
use `array_windows` instead of `windows` in the compiler
I do think these changes are beautiful, but do have to admit that using type inference for the window length
can easily be confusing. This seems like a general issue with const generics, where inferring constants adds an additional
complexity which users have to learn and keep in mind.
Remove redundant nightly features
Removes a bunch of redundant/outdated nightly features. The first commit removes a `core_intrinsics` use for which a stable wrapper has been provided since. The second commit replaces the `const_generics` feature with `min_const_generics` which might get stabilized this year. The third commit is the result of a trial/error run of removing every single feature and then adding it back if compile failed. A bunch of unused features are the result that the third commit removes.
Don't compile regex at every function call.
Use `SyncOnceCell` to only compile it once.
I believe this still adds some kind of locking mechanism?
Related issue: https://github.com/rust-lang/rust/issues/76817
Extract some intrinsics out of rustc_codegen_llvm
A significant amount of intrinsics do not actually need backend-specific behaviors to be implemented, instead relying on methods already in rustc_codegen_ssa. So, extract those methods out to rustc_codegen_ssa, so that each backend doesn't need to reimplement the same code.
Almost everything should be a pretty direct translation. A notable not-direct-translation is `add_with_overflow` and friends being changed to `bx.checked_binop`, but it's pretty simple.
I could have been a lot more aggressive here and pulled out way more methods, and add a few new methods in the rustc_codegen_ssa "API". However, because this is my second rustc PR, I thought that moving those to a follow-up PR and doing more incremental changes here would be better (and I guess ask if this work is even desired in the first place). I'm hoping to eventually remove the mess of intrinsic handling in the backend entirely, which would be hecking fantastic ✨
Validate constants during `const_eval_raw`
This PR implements the groundwork for https://github.com/rust-lang/rust/issues/72396
* constants are now validated during `const_eval_raw`
* to prevent cycle errors, we do not validate references to statics anymore beyond the fact that they are not dangling
* the `const_eval` query ICEs if used on `static` items
* as a side effect promoteds are now evaluated to `ConstValue::Scalar` again (since they are just a reference to the actual promoted allocation in most cases).
Some promotion cleanup
Based on top of both https://github.com/rust-lang/rust/pull/75502 and https://github.com/rust-lang/rust/pull/75585, this does some cleanup of the promotion code. The last 2 commits are new.
* Remove the remaining cases where `const fn` is treated different from `fn`. This means no longer promoting ptr-to-int casts, raw ptr operations, and union field accesses in `const fn` -- or anywhere, for that matter. These are all unstable in const-context so this should not break any stable code. Fixes https://github.com/rust-lang/rust/issues/75586.
* ~~Promote references to statics even outside statics (i.e., in functions) for consistency.~~
* Promote `&mut []` everywhere, not just in non-`const` functions, for consistency.
* Explain why we do not promote deref's of statics outside statics. ~~(This is the only remaining direct user of `const_kind`.)~~
This can only land once the other two PRs land; I am mostly putting this up already because I couldn't wait ;) and to get some feedback from `@rust-lang/wg-const-eval` .
shim: monomorphic `FnPtrShim`s during construction
Fixes#69925.
This PR adjusts MIR shim construction so that substitutions are applied to function pointer shims during construction, rather than during codegen (as determined by `substs_for_mir_body`).
r? `@eddyb`
Implement a generic Destination Propagation optimization on MIR
This takes the work that was originally started by `@eddyb` in https://github.com/rust-lang/rust/pull/47954, and then explored by me in https://github.com/rust-lang/rust/pull/71003, and implements it in a general (ie. not limited to acyclic CFGs) and dataflow-driven way (so that no additional infrastructure in rustc is needed).
The pass is configured to run at `mir-opt-level=2` and higher only. To enable it by default, some followup work on it is still needed:
* Performance needs to be evaluated. I did some light optimization work and tested against `tuple-stress`, which caused trouble in my last attempt, but didn't go much in depth here.
* We can also enable the pass only at `opt-level=2` and higher, if it is too slow to run in debug mode, but fine when optimizations run anyways.
* Debuginfo needs to be fixed after locals are merged. I did not look into what is required for this.
* Live ranges of locals (aka `StorageLive` and `StorageDead`) are currently deleted. We either need to decide that this is fine, or if not, merge the variable's live ranges (or remove these statements entirely – https://github.com/rust-lang/rust/issues/68622).
Some benchmarks of the pass were done in https://github.com/rust-lang/rust/pull/72635.
Rollup of 14 pull requests
Successful merges:
- #73963 (deny(unsafe_op_in_unsafe_fn) in libstd/path.rs)
- #75099 (lint/ty: move fns to avoid abstraction violation)
- #75502 (Use implicit (not explicit) rules for promotability by default in `const fn`)
- #75580 (Add test for checking duplicated branch or-patterns)
- #76310 (Add `[T; N]: TryFrom<Vec<T>>` (insta-stable))
- #76400 (Clean up vec benches bench_in_place style)
- #76434 (do not inline black_box when building for Miri)
- #76492 (Add associated constant `BITS` to all integer types)
- #76525 (Add as_str() to string::Drain.)
- #76636 (assert ScalarMaybeUninit size)
- #76749 (give *even better* suggestion when matching a const range)
- #76757 (don't convert types to the same type with try_into (clippy::useless_conversion))
- #76796 (Give a better error message when x.py uses the wrong stage for CI)
- #76798 (Build fixes for RISC-V 32-bit Linux support)
Failed merges:
r? `@ghost`
give *even better* suggestion when matching a const range
notice that the err already has "constant defined here"
so this is now *exceedingly clear*
extension to #76222
r? @estebank
assert ScalarMaybeUninit size
I noticed most low-level Miri types have such an assert but `ScalarMaybeUninit` does not, so let's add that. Good t see that the `Option`-like optimization kicks in and this is no bigger than `Scalar`. :)
r? @oli-obk
Add associated constant `BITS` to all integer types
Recently I've regularly come across this snippet (in a few different crates, including `core` and `std`):
```rust
std::mem::size_of<usize>() * 8
```
I think it's time for a `usize::BITS`.
lint/ty: move fns to avoid abstraction violation
This PR moves `transparent_newtype_field` and `is_zst` to `LateContext` where they are used, rather than being on the `VariantDef` and `TyS` types, hopefully addressing @eddyb's concern [from this comment](https://github.com/rust-lang/rust/pull/74340#discussion_r456534910).
Wrap recursive predicate evaluation with `ensure_sufficient_stack`
I haven't been able to come up with a minimized test case for #76770,
but this fixes a stack overflow in rustc as well.
As a side effect, we now represent most promoteds as `ConstValue::Scalar` again. This is useful because all implict promoteds are just references anyway and most explicit promoteds are numeric arguments to `asm!` or SIMD instructions.
Dogfood new_uninit and maybe_uninit_slice in rustc_arena
Dogfoods a few cool `MaybeUninit` related features in the compiler's rustc_arena crate.
Split off from #76821
r? `@oli-obk`
Some MIR statements and terminators have an (undocumented...) invariant
that some of their input and outputs must not overlap. This records
conflicts between locals used in these positions.
compare generic constants using `AbstractConst`s
This is a MVP of rust-lang/compiler-team#340. The changes in this PR should only be relevant if `feature(const_evaluatable_checked)` is enabled.
~~currently based on top of #76559, so blocked on that.~~
r? `@oli-obk` cc `@varkor` `@eddyb`
Issue 72408 nested closures exponential
This fixes#72408.
Nested closures were resulting in exponential compilation time.
This PR is enhancing asymptotic complexity, but also increasing the constant, so I would love to see perf run results.
[mir-opt] Disable the `ConsideredEqual` logic in SimplifyBranchSame opt
The logic is currently broken and we need to disable it to fix a beta
regression (see #76803)
r? `@oli-obk`
Mostly to fix ui/issues/issue-37311-type-length-limit/issue-37311.rs.
Most parts of the compiler can handle deeply nested types with a lot
of duplicates just fine, but some parts still attempt to naively
traverse type tree.
Before such problems were caught by type length limit check,
but now these places will have to be changed to handle
duplicated types gracefully.
This fixes#72408.
Nested closures were resulting in exponential compilation time.
As a performance optimization this change introduces MiniSet,
which is a simple small storage optimized set.
move guaranteed{ne,eq} implementation to compile-time machine
Currently, Miri needs a special hack to avoid using the core engine implementation of these intrinsics. That seems silly, so let's move them to the CTFE machine, which is the only machine that wants to use them.
I also added a reference to https://github.com/rust-lang/rust/issues/73722 as a warning to anyone who wants to adjust `guaranteed_eq`.
Improve E0118
- Changes the "base type" terminology to "nominal type" (according to the [reference](https://doc.rust-lang.org/stable/reference/items/implementations.html#inherent-implementations)).
- Suggests removing a reference, if one is present on the type.
- Clarify what is meant by a "nominal type".
closes#69392
This is my first not-entirely-trivial PR, so please let me know if I missed anything or if something could be improved. Though I probably won't be able to fix anything in the upcoming week.