Rollup of 15 pull requests
Successful merges:
- #76932 (Relax promises about condition variable.)
- #76973 (Unstably allow assume intrinsic in const contexts)
- #77005 (BtreeMap: refactoring around edges)
- #77066 (Fix dest prop miscompilation around references)
- #77073 (dead_code: look at trait impls even if they don't contain items)
- #77086 (Include libunwind in the rust-src component.)
- #77097 (Make [].as_[mut_]ptr_range() (unstably) const.)
- #77106 (clarify that `changelog-seen = 1` goes to the beginning of config.toml)
- #77120 (Add `--keep-stage-std` to `x.py` for keeping only standard library artifacts)
- #77126 (Invalidate local LLVM cache less often)
- #77146 (Install std for non-host targets)
- #77155 (remove enum name from ImplSource variants)
- #77176 (Removing erroneous semicolon in transmute documentation)
- #77183 (Allow multiple allow_internal_unstable attributes)
- #77189 (Remove extra space from vec drawing)
Failed merges:
r? `@ghost`
This refactors handling of `Rvalue::{Unary,Binary}Op` in the
const-checker. Now we `span_bug` if there's an unexpected type in a
primitive operation. This also allows unary negation on `char` values
through the const-checker, since it makes the code a bit cleaner: `char`
does not actually support these operations anyway, and if it did, we
could evaluate them at compile time.
Ignore ZST offsets when deciding whether to use Scalar/ScalarPair layout
This is important because the Scalar/ScalarPair layout previously would not be used if any ZST had a nonzero offset.
For example, before this change, only `((), u128)` would be laid out like `u128`, not `(u128, ())`.
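For illustration, here is a small runnable snippet with the two tuples mentioned above; note that the change only affects the internal Scalar/ScalarPair classification (the ABI), not the observable size, which is the same for both.
```rust
use std::mem::size_of;

fn main() {
    // Both tuples occupy 16 bytes either way; the change is about whether the
    // one with the trailing ZST also gets the efficient Scalar-style layout.
    assert_eq!(size_of::<((), u128)>(), 16);
    assert_eq!(size_of::<(u128, ())>(), 16);
}
```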
Fixes #63244
perf: move cold path of `process_obligations` into a separate function
cc #76575
This probably won't matter too much in the long run once #69218 is merged so we may not want to merge this.
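For context, this is the general shape of the "outline the cold path" pattern being applied here (a generic sketch with made-up function names, not the actual `process_obligations` code):
```rust
// The hot loop stays small; the rarely taken branch is hidden behind a
// #[cold], non-inlined function so it doesn't bloat the caller.
fn process(items: &[u32]) -> u32 {
    let mut sum = 0;
    for &x in items {
        if x != u32::MAX {
            sum += x;                    // hot path
        } else {
            sum = handle_rare_case(sum); // cold path, outlined below
        }
    }
    sum
}

#[cold]
#[inline(never)]
fn handle_rare_case(sum: u32) -> u32 {
    // Expensive or rarely executed work would go here.
    sum
}

fn main() {
    assert_eq!(process(&[1, 2, 3]), 6);
}
```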
r? `@ecstatic-morse`
Fix #76803 miscompilation
Fixes #76803
It seems to have been an oversight that the discriminant value being set was not compared to the target value from the `SwitchInt`; a comment states that this comparison is required for the optimization to be sound.
r? `@wesleywiser` since you are probably familiar with the optimization and made #76837 to work around the bug
Rollup of 15 pull requests
Successful merges:
- #75438 (Use adaptive SVG favicon for rustdoc like other rust sites)
- #76304 (Make delegation methods of `std::net::IpAddr` unstably const)
- #76724 (Allow a unique name to be assigned to dataflow graphviz output)
- #76978 (Documented From impls in std/sync/mpsc/mod.rs)
- #77044 (Liballoc bench vec use mem take not replace)
- #77050 (Typo fix: "satsify" -> "satisfy")
- #77074 (add array::from_ref)
- #77078 (Don't use an if guard to check equality with a constant)
- #77079 (Use `Self` in docs when possible)
- #77081 (Merge two almost identical match arms)
- #77121 (Updated html_root_url for compiler crates)
- #77136 (Suggest `const_mut_refs`, not `const_fn` for mutable references in `const fn`)
- #77160 (Suggest `const_fn_transmute`, not `const_fn`)
- #77164 (Remove workaround for deref issue that no longer exists.)
- #77165 (Followup to #76673)
Failed merges:
r? `@ghost`
Suggest `const_fn_transmute`, not `const_fn`
More fallout from #76850 in the vein of #77134. The fix is the same. I looked through the structured errors file and didn't see any more of this kind of diagnostics bug.
r? @oli-obk
Suggest `const_mut_refs`, not `const_fn` for mutable references in `const fn`
Resolves #77134.
Prior to #76850, most uses of `&mut` in `const fn` ~~required~~ involved two feature gates, `const_mut_refs` and `const_fn`. The first allowed all mutable borrows of locals. The second allowed only locals, arguments and return values whose types contained `&mut`. I switched the second check to the `const_mut_refs` gate. However, I forgot to update the error message with the new suggestion.
Alternatively, we could revert to having two different feature gates for this. OP's code never borrows anything mutably, so it didn't need `const_mut_refs` in the past, only `const_fn`. I'd prefer to keep everything under a single gate, however.
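For reference, a minimal example of the kind of code gated here (assuming a nightly toolchain from around this time; the function itself is made up for illustration):
```rust
#![feature(const_mut_refs)]

// Without the feature gate above, mentioning `&mut` in a `const fn` signature
// (or taking a mutable borrow inside one) is rejected, and the diagnostic
// should now suggest `const_mut_refs` rather than `const_fn`.
const fn set(x: &mut i32, v: i32) {
    *x = v;
}

fn main() {
    let mut n = 0;
    set(&mut n, 5);
    assert_eq!(n, 5);
}
```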
r? @oli-obk
Allow a unique name to be assigned to dataflow graphviz output
Previously, if the same analysis were invoked multiple times in a single compilation session, the graphviz output for later runs would overwrite that of previous runs. Allow callers to add a unique identifier to each run so this can be avoided.
DroplessArena: Allocate objects from the end of memory chunk
Allocating from the end of memory chunk simplifies the alignment code
and reduces the number of checked arithmetic operations.
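A self-contained sketch of the bump-down idea (using plain `usize` addresses rather than the real arena code), showing why allocating from the end is simpler: rounding an address down to a power-of-two alignment is a single mask, so only the size subtraction needs a checked operation.
```rust
struct Chunk {
    start: usize, // lowest address of the chunk
    end: usize,   // allocation cursor, moves downward
}

impl Chunk {
    /// Returns an address for `size` bytes aligned to `align` (a power of two),
    /// or `None` if the chunk is exhausted.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        let ptr = self.end.checked_sub(size)?; // the only checked arithmetic
        let ptr = ptr & !(align - 1);          // round *down*: just a mask
        if ptr < self.start {
            return None;
        }
        self.end = ptr;
        Some(ptr)
    }
}

fn main() {
    let mut chunk = Chunk { start: 0x1000, end: 0x2000 };
    let a = chunk.alloc(10, 8).unwrap();
    assert_eq!(a % 8, 0);
    assert!(a >= chunk.start && a + 10 <= 0x2000);
}
```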
Add fast path for match checking
This adds a fast path that reduces the complexity to linear for matches consisting of only variant patterns (i.e. enum matches). (Also see: #7462) Unfortunately, I was too lazy to add a similar fast path for constants (mostly for integer matches); ideally that could be added another day.
TBH, I'm not confident in the performance claims, given that enums tend to be small and FxHashMap could add a lot of overhead.
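As a heavily simplified illustration of the idea (not the real usefulness algorithm): when every pattern is a plain variant pattern, the covered variants can be collected in one linear pass instead of comparing rows against each other.
```rust
use std::collections::HashSet;

fn main() {
    // Pretend each number is the index of an enum variant appearing in a match arm.
    let arm_variants = [0u32, 2, 2, 1];
    let covered: HashSet<u32> = arm_variants.iter().copied().collect(); // O(n)

    // Exhaustiveness for a 3-variant enum then reduces to a membership check.
    let exhaustive = (0..3u32).all(|v| covered.contains(&v));
    assert!(exhaustive);
}
```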
r? `@Mark-Simulacrum`
needs perf
The main use case of TrustedLen is allowing APIs to specialize on it,
but no use of it uses that specialization. Instead, only the .len()
function provided by ExactSizeIterator is used, which is already
required to be accurate.
Thus, the TrustedLen requirement on BuilderMethods::switch is redundant.
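A minimal illustration of the point (a hypothetical helper, not the real `switch` signature): when only `.len()` is consulted, an `ExactSizeIterator` bound already provides everything that is used.
```rust
fn count_cases<I>(cases: I) -> usize
where
    I: ExactSizeIterator<Item = (u128, usize)>,
{
    // Only the length is consumed; the unsafe TrustedLen contract is never relied on.
    cases.len()
}

fn main() {
    let cases = vec![(0u128, 1usize), (1, 2)].into_iter();
    assert_eq!(count_cases(cases), 2);
}
```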
const_evaluatable_checked: extend predicate collection
We now walk the HIR instead of using `ty` so that we get better spans here. While I am still not completely sure that's
what we want in the end, it does seem a lot closer to the final goal than the previous version.
We also look into type aliases (and use a `TypeVisitor` here), about which I am not completely sure, but we will see how well this works.
We also look into fn decls, so the following should work now.
```rust
fn test<T>() -> [u8; std::mem::size_of::<T>()] {
    [0; std::mem::size_of::<T>()]
}
```
Additionally, we visit the optional trait and self type of impls.
r? `@oli-obk`
Fix underflow when calculating the number of no-op jumps folded
When removing unwinds to no-op blocks and folding jumps to no-op blocks,
remove the unwind target first. Otherwise we cannot determine if target
has been already folded or not.
Previous implementation incorrectly assumed that all resume targets had
been folded already, occasionally resulting in an underflow:
```
remove_noop_landing_pads: removed 18446744073709551613 jumps and 3 landing pads
```
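For the curious, that bogus count is just unsigned wraparound:
```rust
fn main() {
    let folded_jumps: u64 = 0;
    // Subtracting 3 from 0 wraps around; wrapping_sub makes the wrap explicit.
    assert_eq!(folded_jumps.wrapping_sub(3), 18446744073709551613);
}
```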
- Add `PrimTy::name` and `PrimTy::name_str`
- Use those new functions to distinguish between the name in scope and
the canonical name
- Fix diagnostics for primitive types
- Add tests for primitives
Rollup of 9 pull requests
Successful merges:
- #76898 (Record `tcx.def_span` instead of `item.span` in crate metadata)
- #76939 (emit errors during AbstractConst building)
- #76965 (Add cfg(target_has_atomic_equal_alignment) and use it for Atomic::from_mut.)
- #76993 (Changing the alloc() to accept &self instead of &mut self)
- #76994 (fix small typo in docs and comments)
- #77017 (Add missing examples on Vec iter types)
- #77042 (Improve documentation for ToSocketAddrs)
- #77047 (Miri: more informative deallocation error messages)
- #77055 (Add #[track_caller] to more panicking Cell functions)
Failed merges:
r? `@ghost`
MIR pass to remove unneeded drops on types not needing drop
This is heavily dependent on MIR inlining running to actually see the drop statement.
Do we want to special-case replacing a call to `std::mem::drop` with a goto as well?
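A tiny illustration of the property the pass relies on, using the public `std::mem::needs_drop` query: dropping a value whose type needs no drop glue is a no-op and can be removed.
```rust
fn main() {
    assert!(!std::mem::needs_drop::<i32>());   // dropping an i32 does nothing
    assert!(std::mem::needs_drop::<String>()); // dropping a String frees memory
}
```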
Add cfg(target_has_atomic_equal_alignment) and use it for Atomic::from_mut.
Fixes some platform-specific problems with #74532 by using the actual alignment of the types instead of hardcoding a few `target_arch`s.
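A small runnable check of the property the new cfg encodes (printed rather than asserted, since it is target-dependent): `from_mut` is only sound when the atomic wrapper has the same alignment as the underlying integer.
```rust
use std::mem::align_of;
use std::sync::atomic::{AtomicU32, AtomicU64};

fn main() {
    // Equal on e.g. x86_64; u64 vs AtomicU64 can differ on some 32-bit targets
    // (where AtomicU64 exists at all).
    println!("u32 {} / AtomicU32 {}", align_of::<u32>(), align_of::<AtomicU32>());
    println!("u64 {} / AtomicU64 {}", align_of::<u64>(), align_of::<AtomicU64>());
}
```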
r? @RalfJung
emit errors during AbstractConst building
These changes are currently still untested, so I don't expect this to pass CI 😆
It seems to me like this is the direction we want to go in, though we didn't have too much of a discussion about this.
r? @oli-obk
Record `tcx.def_span` instead of `item.span` in crate metadata
This was missed in PR #75465. As a result, a few places have been using
the full body span of functions, instead of just the header span.
SimplifyComparisonIntegral: fix miscompilation
Fixes #76432
Only insert StorageDeads if we actually removed one.
Fixes an issue where we added StorageDead to a place with no StorageLive.
r? `@oli-obk`
Remove `qualify_min_const_fn`
~~Blocked on #76807 (the first six commits).~~
With this PR, all checks in `qualify_min_const_fn` are replicated in `check_consts`, and the former is no longer invoked. My goal was to have as few changes to test output as possible, since making sweeping changes to the code *while* doing big batches of diagnostics updates turned out to be a headache. To this end, there's a few `HACK`s in `check_consts` to achieve parity with `qualify_min_const_fn`.
The new system that replaces `is_min_const_fn` is referred to as "const-stability". My end goal for the const-stability rules is this:
* Const-stability is only applicable to functions defined in `staged_api` crates.
* All functions not marked `rustc_const_unstable` are considered "const-stable".
  - NB. This is currently not implemented. `#[unstable]` functions are also const-unstable. This causes problems when searching for feature gates.
  - All "const-unstable" functions have an associated feature gate
* const-stable functions can only call other const-stable functions
  - `allow_internal_unstable` can be used to circumvent this.
* All const-stable functions are subject to some additional checks (the ones that were unique to `qualify_min_const_fn`)
The plan is to remove each `HACK` individually in subsequent PRs. That way, changes to error message output can be reviewed in isolation.
use if let instead of single match arm expressions
Use `if let` instead of single-arm `match` expressions to compact the code and reduce nesting (clippy::single_match).
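A small, generic before/after example of what the clippy::single_match suggestion looks like:
```rust
fn main() {
    let value = Some(3);

    // Before: a match with a single meaningful arm.
    match value {
        Some(n) => println!("got {}", n),
        _ => {}
    }

    // After: the equivalent, flatter `if let`.
    if let Some(n) = value {
        println!("got {}", n);
    }
}
```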
Use const-checking to forbid use of unstable features in const-stable functions
First step towards #76618.
Currently this code isn't ever hit because `qualify_min_const_fn` runs first and catches pretty much everything. One exception is `const_precise_live_drops`, which does not use the newly added code since it runs as part of a separate pass.
Also contains some unrelated refactoring, which is split into separate commits.
r? @oli-obk
Don't use `zip` to compare iterators during pretty-print hack
If the right-hand iterator has exactly one more element than the
left-hand iterator, then both iterators will be fully consumed, but
the extra element will never be compared.
Split out from https://github.com/rust-lang/rust/pull/76130
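A short demonstration of the pitfall described above: `zip` stops at the shorter iterator, so an equality check built on it silently ignores the trailing element.
```rust
fn main() {
    let left = [1, 2];
    let right = [1, 2, 3];

    // `zip` yields only two pairs, so this reports "equal"...
    let pairs_equal = left.iter().zip(right.iter()).all(|(a, b)| a == b);
    assert!(pairs_equal);

    // ...even though the iterators clearly differ in length.
    assert_ne!(left.len(), right.len());
}
```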
Make `ensure_sufficient_stack()` non-generic, using cargo-llvm-lines
Inspired by [this blog post](https://blog.mozilla.org/nnethercote/2020/08/05/how-to-speed-up-the-rust-compiler-some-more-in-2020/) from `@nnethercote`, I used [cargo-llvm-lines](https://github.com/dtolnay/cargo-llvm-lines/) on the Rust compiler itself to improve its compile time. This PR contains only one low-hanging fruit, but I also want to share some measurements.
The function `ensure_sufficient_stack()` was monomorphized 1500 times, and with it the `stacker` and `psm` crates, for a total of 1.5% of all LLVM IR lines. With some trickery I converted the generic closure into a dynamic one, so all that code is monomorphized only once.
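The trick looks roughly like this (an illustrative sketch, not the exact code that landed): the generic wrapper shrinks to a thin shim, and all the real work goes through a single non-generic function taking `&mut dyn FnMut()`.
```rust
pub fn ensure_sufficient_stack<R>(f: impl FnOnce() -> R) -> R {
    let mut f = Some(f);
    let mut result = None;
    // Still monomorphized per R, but now only a few lines of glue.
    grow_if_needed(&mut || result = Some(f.take().unwrap()()));
    result.unwrap()
}

// Monomorphized exactly once; in the real code this is where stacker/psm
// would check the remaining stack and grow it if necessary.
fn grow_if_needed(f: &mut dyn FnMut()) {
    f();
}

fn main() {
    assert_eq!(ensure_sufficient_stack(|| 2 + 2), 4);
}
```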
# Measurements
Getting these numbers took some fiddling with CLI flags and I [modified](https://github.com/Julian-Wollersberger/cargo-llvm-lines/blob/master/src/main.rs#L115) cargo-llvm-lines to read from a folder instead of invoking cargo. Commands I used:
```
./x.py clean
RUSTFLAGS="--emit=llvm-ir -C link-args=-fuse-ld=lld -Z self-profile=profile" CARGOFLAGS_BOOTSTRAP="-Ztimings" RUSTC_BOOTSTRAP=1 ./x.py build -i --stage 1 library/std
# Then manually copy all .ll files into a folder I hardcoded in cargo-llvm-lines in main.rs#L115
cd ../cargo-llvm-lines
cargo run llvm-lines
```
The result is this list (see [first 500 lines](https://github.com/Julian-Wollersberger/cargo-llvm-lines/blob/master/llvm-lines-rustc-before.txt)), before the change:
```
Lines Copies Function name
----- ------ -------------
16894211 (100%) 58417 (100%) (TOTAL)
2223855 (13.2%) 502 (0.9%) rustc_query_system::query::plumbing::get_query_impl::{{closure}}
1331918 (7.9%) 1287 (2.2%) hashbrown::raw::RawTable<T>::reserve_rehash
774434 (4.6%) 12043 (20.6%) core::ptr::drop_in_place
294170 (1.7%) 499 (0.9%) rustc_query_system::dep_graph::graph::DepGraph<K>::with_task_impl
245410 (1.5%) 1552 (2.7%) psm::on_stack::with_on_stack
210311 (1.2%) 1 (0.0%) rustc_target::spec::load_specific
200962 (1.2%) 513 (0.9%) rustc_query_system::query::plumbing::get_query_impl
190704 (1.1%) 1 (0.0%) rustc_middle::ty::query::<impl rustc_middle::ty::context::TyCtxt>::alloc_self_profile_query_strings
180272 (1.1%) 468 (0.8%) rustc_query_system::query::plumbing::load_from_disk_and_cache_in_memory
177396 (1.1%) 114 (0.2%) rustc_query_system::query::plumbing::force_query_impl
161134 (1.0%) 445 (0.8%) rustc_query_system::dep_graph::graph::DepGraph<K>::with_anon_task
141551 (0.8%) 186 (0.3%) rustc_query_system::query::plumbing::incremental_verify_ich
110191 (0.7%) 7 (0.0%) rustc_middle::ty::context::_DERIVE_rustc_serialize_Decodable_D_FOR_TypeckResults::<impl rustc_serialize::serialize::Decodable<__D> for rustc_middle::ty::context::TypeckResults>::decode::{{closure}}
108590 (0.6%) 420 (0.7%) core::ops::function::FnOnce::call_once
88488 (0.5%) 21 (0.0%) rustc_query_system::dep_graph::graph::DepGraph<K>::try_mark_previous_green
86368 (0.5%) 1 (0.0%) rustc_middle::ty::query::stats::query_stats
85654 (0.5%) 3973 (6.8%) <&T as core::fmt::Debug>::fmt
84475 (0.5%) 1 (0.0%) rustc_middle::ty::query::Queries::try_collect_active_jobs
81220 (0.5%) 862 (1.5%) <hashbrown::raw::RawIterHash<T> as core::iter::traits::iterator::Iterator>::next
77636 (0.5%) 54 (0.1%) core::slice::sort::recurse
66484 (0.4%) 461 (0.8%) <hashbrown::raw::RawIter<T> as core::iter::traits::iterator::Iterator>::next
```
All `.ll` files together came to 4.4 GB; after my change, 4.2 GB. So that's a few percent less code for LLVM to process. Hurray!
Sadly, I couldn't measure an actual wall-time improvement. Watching YouTube while compiling added too much noise...
Here is the top of the list after the change:
```
16460866 (100%) 58341 (100%) (TOTAL)
1903085 (11.6%) 504 (0.9%) rustc_query_system::query::plumbing::get_query_impl::{{closure}}
1331918 (8.1%) 1287 (2.2%) hashbrown::raw::RawTable<T>::reserve_rehash
777796 (4.7%) 12031 (20.6%) core::ptr::drop_in_place
551462 (3.4%) 1519 (2.6%) rustc_data_structures::stack::ensure_sufficient_stack::{{closure}}
```
Note that the total was reduced by 430 000 lines and `psm::on_stack::with_on_stack` has disappeared. Instead `rustc_data_structures::stack::ensure_sufficient_stack::{{closure}}` appeared. I'm confused about that one, but it seems to consist of inlined calls to `rustc_query_system::*` stuff.
Further note the other two big culprits in this list: `rustc_query_system` and `hashbrown`. These two are monomorphized many times, the query system summing to more than 20% of all lines, not even counting code that's probably inlined elsewhere.
Assuming compile times scale linearly with llvm-lines, that means a possible 20% compile time reduction.
Reducing e.g. `get_query_impl` would probably need a major refactoring of the query system though. _Everything_ in there is generic over multiple types, has associated types and passes generic `Self` arguments by value. Which means you can't simply make things `dyn`.
---------------------------------------
This PR is a small step to make rustc compile faster and thus make contributing to rustc less painful. Nonetheless I love Rust and I find the work around rustc fascinating :)