With https://reviews.llvm.org/D86310 LLVM now has i128 aligned to
16-bytes on x86 based platforms. This will be in LLVM-18. This patch
updates all our spec targets to be 16-byte aligned, and removes the
alignment when speaking to older LLVM.
This results in Rust overaligning things relative to LLVM on older LLVMs.
This alignment change was discussed in rust-lang/compiler-team#683
See #54341 for additional information about why this is happening and
where this will be useful in the future.
This *does not* stabilize `i128`/`u128` for FFI.
A-diagnostics is already labeled correctly thanks to the template and there usually isn't much to do on those issues, so it's fine to just add them to the pile.
Update documentation for Vec::into_boxed_slice to be more clear about excess capacity
Currently, the documentation for Vec::into_boxed_slice says that "if the vector has excess capacity, its items will be moved into a newly-allocated buffer with exactly the right capacity." This is misleading, as copies do not necessarily occur, depending on if the allocator supports in-place shrinking. I copied some of the wording from shrink_to_fit, though it could potentially still be worded better than this.
dead_code treats #[repr(transparent)] the same as #[repr(C)]
In #92972 we enabled linting on unused fields in tuple structs. In #118297 that lint was enabled by default. That exposed issues like #119659, where the fields of a struct marked `#[repr(transparent)]` were reported by the `dead_code` lint. The language team [decided](https://github.com/rust-lang/rust/issues/119659#issuecomment-1885172045) that the lint should treat `repr(transparent)` the same as `#[repr(C)]`.
Fixes#119659
Warn when not having a profiler runtime means that coverage tests won't be run/blessed
On a few occasions (e.g. #118036, #119984) people have been tripped up by the fact that half of the coverage test suite is skipped by default, because it `// needs-profiler-support` and the profiler runtime is not actually built in any of the default config profiles.
(This is made worse by the fact that it isn't enabled in any of the PR CI jobs either. So people think that they've successfully blessed the test suite, and then get a rude surprise when their merge only fails in the full CI job suite.)
This PR adds a simple warning to compiletest that should alert the user in some cases. It's not foolproof, but it should increase the chances of catching this problem earlier in the PR process.
Update `fn()` trait implementation docs
Fixes#119903
This was FCP'd and approved for the 1.70.0 release, this is just a docs update to match that change.
Docs: Use non-SeqCst in module example of atomics
I done this for this reasons:
1. The example now shows that there is more Orderings than just SeqCst.
2. People who would copy from example would now have more suitable orderings for the job.
3. SeqCst is both much harder to reason about and not needed in most situations.
IMHO, we should encourage people to think and use memory orderings that is suitable to task instead of blindly defaulting to SeqCst.
r? `@m-ou-se`
Consolidate all associated items on the NonZero integer types into a single impl block per type
**Before:**
```rust
#[repr(transparent)]
#[rustc_layout_scalar_valid_range_start(1)]
pub struct NonZeroI8(i8);
impl NonZeroI8 {
pub const fn new(n: i8) -> Option<Self> ...
pub const fn get(self) -> i8 ...
}
impl NonZeroI8 {
pub const fn leading_zeros(self) -> u32 ...
pub const fn trailing_zeros(self) -> u32 ...
}
impl NonZeroI8 {
pub const fn abs(self) -> NonZeroI8 ...
}
...
```
**After:**
```rust
#[repr(transparent)]
#[rustc_layout_scalar_valid_range_start(1)]
pub struct NonZeroI8(i8);
impl NonZeroI8 {
pub const fn new(n: i8) -> Option<Self> ...
pub const fn get(self) -> i8 ...
pub const fn leading_zeros(self) -> u32 ...
pub const fn trailing_zeros(self) -> u32 ...
pub const fn abs(self) -> NonZeroI8 ...
...
}
```
Having 6-7 different impl blocks per type is not such a problem in today's implementation, but becomes awful upon the switch to a generic `NonZero<T>` type (context: https://github.com/rust-lang/rust/issues/82363#issuecomment-921513910).
In the implementation from https://github.com/rust-lang/rust/pull/100428, there end up being **67** impl blocks on that type.
<img src="https://github.com/rust-lang/rust/assets/1940490/5b68bd6f-8a36-4922-baa3-348e30dbfcc1" width="200"><img src="https://github.com/rust-lang/rust/assets/1940490/2cfec71e-c2cd-4361-a542-487f13f435d9" width="200"><img src="https://github.com/rust-lang/rust/assets/1940490/2fe00337-7307-405d-9036-6fe1e58b2627" width="200">
Without the refactor to a single impl block first, introducing `NonZero<T>` would be a usability regression compared to today's separate pages per type. With all those blocks expanded, Ctrl+F is obnoxious because you need to skip 12× past every match you don't care about. With all the blocks collapsed, Ctrl+F is useless. Getting to a state in which exactly one type's (e.g. `NonZero<u32>`) impl blocks are expanded while the rest are collapsed is annoying.
After this refactor to a single impl block, we can move forward with making `NonZero<T>` a generic struct whose docs all go on the same rustdoc page. The rustdoc will have 12 impl blocks, one per choice of `T` supported by the standard library. The reader can expand a single one of those impl blocks e.g. `NonZero<u32>` to understand the entire API of that type.
Note that moving the API into a generic `impl<T> NonZero<T> { ... }` is not going to be an option until after `NonZero<T>` has been stabilized, which may be months or years after its introduction. During the period while generic `NonZero` is unstable, it will be extra important to offer good documentation on all methods demonstrating the API being used through the stable aliases such as `NonZeroI8`.
This PR follows a `key = $value` syntax for the macros which is similar to the macros we already use for producing a single large impl block on the integer primitives.
1dd4db5062/library/core/src/num/mod.rs (L288-L309)
Best reviewed one commit at a time.
Optimize large array creation in const-eval
This changes repeated memcpy's to a memset for the case that we're propagating a single byte into a region of memory. It also optimizes the element-by-element copies to have a tighter loop; I'm pretty sure the old code was actually doing a multiply within each loop iteration.
For an 8GB array (`static SLICE: [u8; SIZE] = [0u8; 1 << 33];`) this takes us from ~23 seconds to ~6 seconds locally, which is spent roughly 50/50 in (a) memset to zero and (b) memcpy of the original place into a new place, when popping stack frame. The latter seems hard to avoid but is a big memcpy (since we're copying the type rather than initializing a region, so it's pretty fast), and the first is as good as it's going to get without special casing constant-valued arrays.
Closes https://github.com/rust-lang/rust/issues/55795. (That issue's references to lint checking don't appear true anymore, but I think this closes that case as something that is slow due to *time* pretty fully. An 8GB array taking only 6 seconds feels reasonable enough to not merit further tracking).
Get rid of the hir_owner query.
This query was meant as a firewall between `hir_owner_nodes` which is supposed to change often, and the queries that only depend on the item signature. That firewall was inefficient, leaking the contents of the HIR body through `HirId`s.
`hir_owner` incurs a significant cost, as we need to hash HIR twice in multiple modes. This PR proposes to remove it, and simplify the hashing scheme.
For the future, `def_kind`, `def_span`... are much more efficient for incremental decoupling, and should be preferred.
change `.unwrap()` to `?` on write where `fmt::Result` is returned
Fixes#120090 which points out that some of the `.unwrap()`s in `rustc_middle/src/mir/pretty.rs` are likely meant to be `?`s
Remove `next_root_ty_var`
Uhh we seem to not have any test that relies on this anymore. Maybe due to the way we changed alias-relate or whatever.
Removing this hack helper fn because #119106 is the general solution.
r? lcnr
Improved collapse_debuginfo attribute, added command-line flag
Improved attribute collapse_debuginfo with variants: `#[collapse_debuginfo=(no|external|yes)]`.
Added command-line flag for default behaviour.
Work-in-progress: will add more tests.
cc https://github.com/rust-lang/rust/issues/100758
bootstrap: handle vendored sources when remapping crate paths
#115872 introduced a feature to add path remapping for crate dependencies, but only when they came from Cargo's registry cache, not a vendor directory.
This caused builds that used remapped debuginfo and vendor directories to fail with:
```
std::fs::read_dir(registry_src) failed with No such file or directory (os error 2)
```
or (if the `registry/src` directory exists but is empty)
```
error: --remap-path-prefix must contain '=' between FROM and TO
```
Fixes#117885 by explicitly supporting the `vendor` directory and adding it to `RUSTC_CARGO_REGISTRY_SRC_TO_REMAP`.
Note that `bootstrap.py` already assumes that `./vendor` within the rust repo is the only supported vendoring location.
r? `@pietroalbini`
[rustc_data_structures] Use partition_point to find slice range end.
This PR uses approach introduced in https://github.com/rust-lang/rust/pull/114152 to find
the end of the range. It's much easier to understand and reason about invariants of such
implementation.
Technically it's possible to make it even shorter by returning `&[start..end]` unconditionally
because even if searched item is not present in the slice, `start` and `end` would point at
the same index, so the range would be empty. The reason I decided not to use this shorter
implementation is because it would involve more comparisons in case there are no elements
in the slice with key equal to `key`.
Also, not that it matters much, but this implementation also improves perf according to the
benchmark below:
https://gist.github.com/ttsugriy/63c0ed39ae132b131931fa1f8a3dea55
The results on my M1 macbook air are:
```
Running benches/bin_search_slice_benchmark.rs (target/release/deps/bin_search_slice_benchmark-90fa6d68c3bd1298)
Benchmarking multiply add/binary_search_slice: Collecting 100 samples in estimated 5.0002 s (1
multiply add/binary_search_slice
time: [44.719 ns 44.918 ns 45.158 ns]
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
Benchmarking multiply add/binary_search_slice_new: Collecting 100 samples in estimated 5.0001
multiply add/binary_search_slice_new
time: [36.955 ns 37.060 ns 37.221 ns]
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
```
Rollup of 8 pull requests
Successful merges:
- #119172 (Detect `NulInCStr` error earlier.)
- #119833 (Make tcx optional from StableMIR run macro and extend it to accept closures)
- #119967 (Add `PatKind::Err` to AST/HIR)
- #119978 (Move async closure parameters into the resultant closure's future eagerly)
- #120021 (don't store const var origins for known vars)
- #120038 (Don't create a separate "basename" when naming and opening a MIR dump file)
- #120057 (Don't ICE when deducing future output if other errors already occurred)
- #120073 (Remove spastorino from users_on_vacation)
r? `@ghost`
`@rustbot` modify labels: rollup
error on incorrect implied bounds in wfcheck except for Bevy dependents
Rebase of #109763
Additionally, special cases Bevy `ParamSet` types to not trigger the lint. This is tracked in #119956.
Fixes#109628
Don't ICE when deducing future output if other errors already occurred
The situation can't really happen outside of erroneous code. What was interesting is that it ICEd before emitting any other diagnostics. This was because the other errors were silenced due to cycle_delay_bug cycle errors.
r? ```@compiler-errors```
fixes#119890
Don't create a separate "basename" when naming and opening a MIR dump file
These functions were split up by #77080, in order to support passing the dump file's “basename” (filename without extension) to the implementation of `-Zdump-mir-spanview`, so that it could be used as a page title.
That flag has since been removed (#119566), so now there's no particular reason for this code to handle the basename separately from the filename or full path.
This PR therefore restores things to (roughly) how they were before #77080.
Move async closure parameters into the resultant closure's future eagerly
Move async closure parameters into the closure's resultant future eagerly.
Before, we used to desugar `async |p1, p2, ..| { body }` as `|p1, p2, ..| { || async { body } }`. Now, we desugar the above like `|p1, p2, ..| { async move { let p1 = p1; let p2 = p2; ... body } }`. This mirrors the same desugaring that `async fn` does with its parameter types, and the compiler literally uses the same code via a shared helper function.
This removes the necessity for E0708, since now expressions like `async |x: i32| { x }` will not give you confusing borrow errors.
This does *not* fix the case where async closures have self-borrows. This will come with a general implementation of async closures, which is still in the works.
r? oli-obk