feat: specialize `SpecFromElem` for `()`
# Description
This PR adds a specialization `SpecFromElem for ()` which allows to significantly reduce `vec![(), N]` time in debug builds (specifically, tests) turning it from observable $O(n)$ to $O(1)$.
# Observing the change
The problem this PR aims to fix explicitly is slow `vec![(), N]` on big `N`s which may appear in tests (see [Background section](#Background) for more details).
The minimal example to see the problem:
```rust
#![feature(test)]
extern crate test;
#[cfg(test)]
mod tests {
const HUGE_SIZE: usize = i32::MAX as usize + 1;
#[bench]
fn bench_vec_literal(b: &mut test::Bencher) {
b.iter(|| vec![(); HUGE_SIZE]);
}
#[bench]
fn bench_vec_repeat(b: &mut test::Bencher) {
b.iter(|| [(); 1].repeat(HUGE_SIZE));
}
}
```
<details><summary>Output</summary>
<p>
```bash
cargo +nightly test -- --report-time -Zunstable-options
Compiling huge-zst-vec-literal-bench v0.1.0 (/home/progrm_jarvis/RustroverProjects/huge-zst-vec-literal-bench)
Finished test [unoptimized + debuginfo] target(s) in 0.31s
Running unittests src/lib.rs (target/debug/deps/huge_zst_vec_literal_bench-e43b1ef287ba8b36)
running 2 tests
test tests::bench_vec_repeat ... ok <0.000s>
test tests::bench_vec_literal ... ok <14.382s>
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 14.38s
Doc-tests huge-zst-vec-literal-bench
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```
</p>
</details>
> [!IMPORTANT]
> This problem is only observable in Debug (unoptimized) builds, while Release (optimized) ones do not observe this problem. It still is worth fixing it, IMO, since the original scenario observes the problem in tests for which optimizations are disabled by default and it seems unreasonable to override this for the whole project while the problem is very local.
# Background
While working on a crate for a custom data format which has an `i32::MAX` limit on its list's sizes, I wrote the following test to ensure that this invariant is upheld:
```rust
#[test]
fn lists_should_have_i32_size() {
assert!(
RawNbtList::try_from(vec![(); i32::MAX as usize]).is_ok(),
"lists should permit the size up to {}",
i32::MAX
);
assert!(
RawNbtList::try_from(vec![(); i32::MAX as usize + 1]).is_err(),
"lists should have the size of at most {}",
i32::MAX
);
}
```
Soon I discovered that this takes $\approx 3--4s$ per assertion on my machine, almost all of which is spent on `vec![..]`.
While this would be logical for a non-ZST vector (which would require actual $O(n)$ allocation), here `()` was used intentionally considering that for ZSTs size-changing operations should anyway be $O(1)$ (at least from allocator perspective). Yet, this "overhead" is logical if we consider that in general case `clone()` (which is used by `Vec` literal) may have a non-trivial implementation and thus each element has to actually be visited (even if they occupy no space).
In my specific case, the following (contextual) equivalent solved the issue:
```rust
#[test]
fn lists_should_have_i32_size() {
assert!(
RawNbtList::try_from([(); 1].repeat(i32::MAX as usize)).is_ok(),
"lists should permit the size up to {}",
i32::MAX
);
assert!(
RawNbtList::try_from([(); 1].repeat(i32::MAX as usize + 1)).is_err(),
"lists should have the size of at most {}",
i32::MAX
);
}
```
which works since `repeat` explicitly uses `T: Copy` and so does not have to think about non-trivial `Clone`.
But it still may be counter-intuitive for users to observe such long time on the "canonical" vec literal thus the PR.
# Generic solution
This change is explicitly non-generic. Initially I considered it possible to implement in generically, but this would require the specialization to have the following type requirements:
- ✅ the type must be a ZST: easily done via
```rust
if core::mem::size_of::<T>() == 0 {
todo!("specialization")
}
```
or
```rust
use core::mem::SizedTypeProperties;
if T::IS_ZST {
todo!("specialization")
}
```
- :white_check_mark`: the type must implement `Copy`: implementable non-conflictable via a separate specialization:
```rust
trait IsCopyZst: Sized {
fn is_copy_zst() -> bool;
}
impl<T> IsCopyZst for T {
fn is_copy_zst() -> bool {
false
}
}
impl<T: Copy> IsCopyZst for T {
fn is_copy_zst() -> bool {
Self::IS_ZST
}
}
```
- ❌ the type should have a trivial `Clone` implementation: since `vec![t; n]` is specified to use `clone()`, we can only use this "performance optimization" when we are guaranteed that `clone()` does nothing except for copying.
The latter is the real blocker for a generic fix since I am unaware of any way to get this information in a compiler-guaranteed way.
While there may be a fix for this (my friend `@CertainLach` has suggested a potential solution by an perma-unstable fn in `Clone` like `is_trivially_cloneable()` defaulting to `false` and only overridable by `rustc` on derive), this is surely out of this PRs scope.
While a better approach would be to implement it for all ZSTs
which are `Copy` and have trivial `Clone`,
the last property cannot be detected for now.
Signed-off-by: Petr Portnov <me@progrm-jarvis.ru>
While a bare "NUL" *should* be redirected to the NUL device, especially in this simple case, let's be explicit that we aren't opening a file called "NUL" and instead open it directly.
This will also set a good example for people copying std code.
Adjust frame IP in backtraces relative to image base for SGX target
This is followup to https://github.com/rust-lang/backtrace-rs/pull/566.
The backtraces printed by `panic!` or generated by `std::backtrace::Backtrace` in SGX target are not usable. The frame addresses need to be relative to image base address so they can be used for symbol resolution. Here's an example panic backtrace generated before this change:
```
$ cargo r --target x86_64-fortanix-unknown-sgx
...
stack backtrace:
0: 0x7f8fe401d3a5 - <unknown>
1: 0x7f8fe4034780 - <unknown>
2: 0x7f8fe401c5a3 - <unknown>
3: 0x7f8fe401d1f5 - <unknown>
4: 0x7f8fe401e6f6 - <unknown>
```
Here's the same panic after this change:
```
$ cargo +stage1 r --target x86_64-fortanix-unknown-sgx
stack backtrace:
0: 0x198bf - <unknown>
1: 0x3d181 - <unknown>
2: 0x26164 - <unknown>
3: 0x19705 - <unknown>
4: 0x1ef36 - <unknown>
```
cc `@jethrogb` and `@workingjubilee`
Add Seek::seek_relative
The `BufReader` struct has a `seek_relative` method because its `Seek::seek` implementation involved dumping the internal buffer (https://github.com/rust-lang/rust/issues/31100).
Unfortunately, there isn't really a good way to take advantage of that method in generic code. This PR adds the same method to the main `Seek` trait with the straightforward default method, and an override for `BufReader` that calls its implementation.
_Also discussed in [this](https://internals.rust-lang.org/t/add-seek-seek-relative/19546) internals.rust-lang.org thread._
Remove option_payload_ptr; redundant to offset_of
The `option_payload_ptr` intrinsic is no longer required as `offset_of` supports traversing enums (#114208). This PR removes it in order to dogfood offset_of (as suggested at https://github.com/rust-lang/rust/issues/106655#issuecomment-1790907626). However, it will not build until those changes reach beta (which I think is within the next 8 days?) so I've opened it as a draft.
Expose tests for {f32,f64}.total_cmp in docs
Expose tests for {f32,f64}.total_cmp in docs
Uncomment the helpful `assert_eq!` line, which is stripped out completely in docs, and leaves the reader to mentally play through the algorithm, or go to the playground and add a println!, to see what the result will be.
(If these tests are known to fail on some platforms, is there some mechanism to conditionalize this or escape the test so the `assert_eq!` source will be visible on the web? I am a newbie, which is why I was reading docs ;)
impl more traits for ptr::Alignment, add mask method
Changes:
* Adds `rustc_const_unstable` attributes where missing
* Makes `log2` method const
* Adds `mask` method
* Implements `Default`, which is equivalent to `Alignment::MIN`
No longer included in PR:
* Removes indirection of `AlignmentEnum` type alias (this was intentional)
* Implements `Display`, `Binary`, `Octal`, `LowerHex`, and `UpperHex` (should go through libs-api instead)
* Controversially implements `LowerExp` and `UpperExp` using `p` instead of `e` to indicate a power of 2 (also should go through libs-api)
Tracking issue for `ptr::Alignment`: #102070
Reenable effects in libcore
With #116670, #117531, and #117171, I think we would be comfortable with re-enabling the effects feature for more testing in libcore.
r? `@oli-obk`
cc `@fmease`
cc #110395
Add T: ?Sized to `RwLockReadGuard` and `RwLockWriteGuard`'s Debug impls.
For context, `MutexGuard` has `+ ?Sized` on its `Debug` impl, and all three have `+ ?Sized` on their `Display` impls.
It looks like the `?Sized` was just missed when the impls were added (the impl for `MutexGuard` was added in the same PR (https://github.com/rust-lang/rust/pull/38006) with support for `T: Debug + ?Sized`, and `RwLock*Guard`s did allow `T: ?Sized` types already); the `Display` impls were added later (https://github.com/rust-lang/rust/pull/42822) with support for `T: Debug + ?Sized` types.
I think this needs a T-libs-api FCP? I'm not sure if this also needs an ACP. If so I can make one.
These are changes to (stable) trait impls on stable types so will be insta-stable.
`@rustbot` label +T-libs-api
Remove asmjs
Fulfills [MCP 668](https://github.com/rust-lang/compiler-team/issues/668).
`asmjs-unknown-emscripten` does not work as-specified, and lacks essential upstream support for generating asm.js, so it should not exist at all.
feat: implement `DoubleEndedSearcher` for `CharArray[Ref]Searcher`
This PR implements `DoubleEndedSearcher` for both `CharArraySearcher` and `CharArrayRefSearcher`. I'm not sure whether this was just overlooked or if there is a reason for it, but since it behaves exactly like `CharSliceSearcher`, I think the implementations should be appropriate.
document ABI compatibility
I don't think we have any central place where we document our ABI compatibility rules, so let's create one. The `fn()` pointer type seems like a good place since ABI questions can only become relevant when invoking a function through a function pointer.
This will likely need T-lang FCP.
avoid exhaustive i16 test in Miri
https://github.com/rust-lang/rust/pull/116301 added a test that is way too slow to be running in Miri. So let's only test a few hopefully representative cases.
Custom MIR: Support cleanup blocks
Cleanup blocks are declared with `bb (cleanup) = { ... }`.
`Call` and `Drop` terminators take an additional argument describing the unwind action, which is one of the following:
* `UnwindContinue()`
* `UnwindUnreachable()`
* `UnwindTerminate(reason)`, where reason is `ReasonAbi` or `ReasonInCleanup`
* `UnwindCleanup(block)`
Also support unwind resume and unwind terminate terminators:
* `UnwindResume()`
* `UnwindTerminate(reason)`
Cleanup blocks are declared with `bb (cleanup) = { ... }`.
`Call` and `Drop` terminators take an additional argument describing the
unwind action, which is one of the following:
* `UnwindContinue()`
* `UnwindUnreachable()`
* `UnwindTerminate(reason)`, where reason is `ReasonAbi` or `ReasonInCleanup`
* `UnwindCleanup(block)`
Also support unwind resume and unwind terminate terminators:
* `UnwindResume()`
* `UnwindTerminate(reason)`
Add `std:#️⃣:{DefaultHasher, RandomState}` exports (needs FCP)
This implements rust-lang/libs-team#267 to move the libstd hasher types to `std::hash` where they belong, instead of `std::collections::hash_map`.
<details><summary>The below no longer applies, but is kept for clarity.</summary>
This is a small refactor for #27242, which moves the definitions of `RandomState` and `DefaultHasher` into `std::hash`, but in a way that won't be noticed in the public API.
I've opened rust-lang/libs-team#267 as a formal ACP to move these directly into the root of `std::hash`, but for now, they're at least separated out from the collections code in a way that will make moving that around easier.
I decided to simply copy the rustdoc for `std::hash` from `core::hash` since I think it would be ideal for the two to diverge longer-term, especially if the ACP is accepted. However, I would be willing to factor them out into a common markdown document if that's preferred.
</details>
Clarify UB in `get_unchecked(_mut)`
Inspired by #116915, it was unclear to me what exactly "out-of-bounds index" means in `get_unchecked`.
One could [potentially](https://rust.godbolt.org/z/hxM764orW) interpret it that `get_unchecked` is just another way to write `offset`, but I think `get_unchecked(len)` is supposed to be UB even though `.offet(len)` is well-defined (as is `.get_unchecked(..len)`), so write that more directly in the docs.
**libs-api folks**: Can you confirm whether this is what you expect this to mean? And is the situation any different for `<*const [T]>::get_unchecked`?
patterns: reject raw pointers that are not just integers
Matching against `0 as *const i32` is fine, matching against `&42 as *const i32` is not.
This extends the existing check against function pointers and wide pointers: we now uniformly reject all these pointer types during valtree construction, and then later lint because of that. See [here](https://github.com/rust-lang/rust/pull/116930#issuecomment-1784654073) for some more explanation and context.
Also fixes https://github.com/rust-lang/rust/issues/116929.
Cc `@oli-obk` `@lcnr`
Refactor the if/else checking on cmp::Ordering variants to a
"branchless" reassignment of left and right. This change results
in fewer branches and instructions.
Removes the private type `std::sys::solid::net::FileDesc`, replacing its
only usage in `std::sys::solid::net::Socket` with `std::os::solid::io::
OwnedFd`.
Documentation cleanup for core::error::Request.
This part of the documentation currently render like this:
![image](https://github.com/rust-lang/rust/assets/249196/b34cb907-4ce4-4e85-beca-510d8aa1fefb)
The new version renders like this:
![image](https://github.com/rust-lang/rust/assets/249196/fe18398a-15fb-42a7-82a4-f1856d48bd79)
Fixes:
* Add missing closing back tick.
* Remove spurious double back ticks.
* Add missing newline to render bullet point correctly.
* Fix grammar "there are methods calledrequest_ref and request_value are available" -> "there are methods calledrequest_ref and request_value".
* Change "methods" to "functions", which seems more appropriate for free functions.
document that the null pointer has the 0 address
Fixes https://github.com/rust-lang/rust/issues/116895
Will need t-lang FCP, but I think this is fairly uncontroversial -- there's probably already tons of code out there that relies on this.
Override `Waker::clone_from` to avoid cloning `Waker`s unnecessarily
This would be very useful for futures — I think it’s pretty much always what they want to do instead of `*waker = cx.waker().clone()`.
Tracking issue: https://github.com/rust-lang/rust/issues/98287
r? rust-lang/libs-api `@rustbot` label +T-libs-api -T-libs
Avoid unnecessary comparison in partition_equal
The branchy Hoare partition `partition_equal` as part of `slice::sort_unstable` has a bug that makes it perform a comparison of the last element twice.
Measuring inputs with a Zipfian distribution with characterizing exponent s == 1.0, yields a ~0.05% reduction in the total number of comparisons performed.
Rollup of 4 pull requests
Successful merges:
- #116017 (Don't pass `-stdlib=libc++` when building C files on macOS)
- #117524 (bootstrap/setup: create hooks directory if non-existing)
- #117588 (Remove unused LoadResult::DecodeIncrCache variant)
- #117596 (Add diagnostic items for a few of core's builtin macros)
r? `@ghost`
`@rustbot` modify labels: rollup
Add diagnostic items for a few of core's builtin macros
Specifically, `env`, `option_env`, and `include`. There are a number of reasons why people might want to look at these in lints (For example, to ensure that things behave consistently, detect things that might make builds less reproducible, etc).
Concretely, in PL/Rust (well, `plrustc`) we have lints that forbid these (which I'd like to [add to clippy as restriction lints](https://rust-lang.zulipchat.com/#narrow/stream/257328-clippy/topic/Landing.20a.20flotilla.20of.20lints.3F) eventually), and `dylint` also has [lints that look for `env!`/`option_env!`](109a07e9f2/examples/general/env_cargo_path/src/lib.rs) (although perhaps not `include`), which would benefit from this.
My experience is that it's pretty annoying to (robustly) check uses of builtin macros without these IME, although that's perhaps just my own fault (e.g. I could be doing it wrong).
At `@Nilstrieb's` suggestion, I've added a comment that explains why these are here, even though they are not used in the compiler. This is mostly to discourage removal, although it's not a big deal if it happens (I'm certainly not suggesting the presence of these be in any way stable).
---
In theory this is a library PR (in that it's in library/core), but I'm going to roll compiler because the existence of this or not is much more likely something they care about rather than libs. Hopefully nobody objects to this.
r? compiler
Remove obsolete support for linking unwinder on Android
Linking libgcc is no longer supported (see #103673), so remove the related link attributes and the check in unwind's build.rs. The check was the last remaining significant piece of logic in build.rs, so remove build.rs as well.
Stabilize `const_maybe_uninit_zeroed` and `const_mem_zeroed`
Make `MaybeUninit::zeroed` and `mem::zeroed` const stable. Newly stable API:
```rust
// core::mem
pub const unsafe fn zeroed<T>() ->;
impl<T> MaybeUninit<T> {
pub const fn zeroed() -> MaybeUninit<T>;
}
```
This relies on features based around `const_mut_refs`. Per `@RalfJung,` this should be OK since we do not leak any `&mut` to the user.
For this to be possible, intrinsics `assert_zero_valid` and `assert_mem_uninitialized_valid` were made const stable.
Tracking issue: #91850
Zulip discussion: https://rust-lang.zulipchat.com/#narrow/stream/146212-t-compiler.2Fconst-eval/topic/.60const_mut_refs.60.20dependents
r? libs-api
`@rustbot` label -T-libs +T-libs-api +A-const-eval
cc `@RalfJung` `@oli-obk` `@rust-lang/wg-const-eval`
Hint optimizer about try-reserved capacity
This is #116568, but limited only to the less-common `try_reserve` functions to reduce bloat in debug binaries from debug info, while still addressing the main use-case #116570