feat: specialize `SpecFromElem` for `()`
# Description
This PR adds a specialization `SpecFromElem for ()` which allows to significantly reduce `vec![(), N]` time in debug builds (specifically, tests) turning it from observable $O(n)$ to $O(1)$.
# Observing the change
The problem this PR aims to fix explicitly is slow `vec![(), N]` on big `N`s which may appear in tests (see [Background section](#Background) for more details).
The minimal example to see the problem:
```rust
#![feature(test)]
extern crate test;
#[cfg(test)]
mod tests {
const HUGE_SIZE: usize = i32::MAX as usize + 1;
#[bench]
fn bench_vec_literal(b: &mut test::Bencher) {
b.iter(|| vec![(); HUGE_SIZE]);
}
#[bench]
fn bench_vec_repeat(b: &mut test::Bencher) {
b.iter(|| [(); 1].repeat(HUGE_SIZE));
}
}
```
<details><summary>Output</summary>
<p>
```bash
cargo +nightly test -- --report-time -Zunstable-options
Compiling huge-zst-vec-literal-bench v0.1.0 (/home/progrm_jarvis/RustroverProjects/huge-zst-vec-literal-bench)
Finished test [unoptimized + debuginfo] target(s) in 0.31s
Running unittests src/lib.rs (target/debug/deps/huge_zst_vec_literal_bench-e43b1ef287ba8b36)
running 2 tests
test tests::bench_vec_repeat ... ok <0.000s>
test tests::bench_vec_literal ... ok <14.382s>
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 14.38s
Doc-tests huge-zst-vec-literal-bench
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
```
</p>
</details>
> [!IMPORTANT]
> This problem is only observable in Debug (unoptimized) builds, while Release (optimized) ones do not observe this problem. It still is worth fixing it, IMO, since the original scenario observes the problem in tests for which optimizations are disabled by default and it seems unreasonable to override this for the whole project while the problem is very local.
# Background
While working on a crate for a custom data format which has an `i32::MAX` limit on its list's sizes, I wrote the following test to ensure that this invariant is upheld:
```rust
#[test]
fn lists_should_have_i32_size() {
assert!(
RawNbtList::try_from(vec![(); i32::MAX as usize]).is_ok(),
"lists should permit the size up to {}",
i32::MAX
);
assert!(
RawNbtList::try_from(vec![(); i32::MAX as usize + 1]).is_err(),
"lists should have the size of at most {}",
i32::MAX
);
}
```
Soon I discovered that this takes $\approx 3--4s$ per assertion on my machine, almost all of which is spent on `vec![..]`.
While this would be logical for a non-ZST vector (which would require actual $O(n)$ allocation), here `()` was used intentionally considering that for ZSTs size-changing operations should anyway be $O(1)$ (at least from allocator perspective). Yet, this "overhead" is logical if we consider that in general case `clone()` (which is used by `Vec` literal) may have a non-trivial implementation and thus each element has to actually be visited (even if they occupy no space).
In my specific case, the following (contextual) equivalent solved the issue:
```rust
#[test]
fn lists_should_have_i32_size() {
assert!(
RawNbtList::try_from([(); 1].repeat(i32::MAX as usize)).is_ok(),
"lists should permit the size up to {}",
i32::MAX
);
assert!(
RawNbtList::try_from([(); 1].repeat(i32::MAX as usize + 1)).is_err(),
"lists should have the size of at most {}",
i32::MAX
);
}
```
which works since `repeat` explicitly uses `T: Copy` and so does not have to think about non-trivial `Clone`.
But it still may be counter-intuitive for users to observe such long time on the "canonical" vec literal thus the PR.
# Generic solution
This change is explicitly non-generic. Initially I considered it possible to implement in generically, but this would require the specialization to have the following type requirements:
- ✅ the type must be a ZST: easily done via
```rust
if core::mem::size_of::<T>() == 0 {
todo!("specialization")
}
```
or
```rust
use core::mem::SizedTypeProperties;
if T::IS_ZST {
todo!("specialization")
}
```
- :white_check_mark`: the type must implement `Copy`: implementable non-conflictable via a separate specialization:
```rust
trait IsCopyZst: Sized {
fn is_copy_zst() -> bool;
}
impl<T> IsCopyZst for T {
fn is_copy_zst() -> bool {
false
}
}
impl<T: Copy> IsCopyZst for T {
fn is_copy_zst() -> bool {
Self::IS_ZST
}
}
```
- ❌ the type should have a trivial `Clone` implementation: since `vec![t; n]` is specified to use `clone()`, we can only use this "performance optimization" when we are guaranteed that `clone()` does nothing except for copying.
The latter is the real blocker for a generic fix since I am unaware of any way to get this information in a compiler-guaranteed way.
While there may be a fix for this (my friend `@CertainLach` has suggested a potential solution by an perma-unstable fn in `Clone` like `is_trivially_cloneable()` defaulting to `false` and only overridable by `rustc` on derive), this is surely out of this PRs scope.
While a better approach would be to implement it for all ZSTs
which are `Copy` and have trivial `Clone`,
the last property cannot be detected for now.
Signed-off-by: Petr Portnov <me@progrm-jarvis.ru>
Adjust frame IP in backtraces relative to image base for SGX target
This is followup to https://github.com/rust-lang/backtrace-rs/pull/566.
The backtraces printed by `panic!` or generated by `std::backtrace::Backtrace` in SGX target are not usable. The frame addresses need to be relative to image base address so they can be used for symbol resolution. Here's an example panic backtrace generated before this change:
```
$ cargo r --target x86_64-fortanix-unknown-sgx
...
stack backtrace:
0: 0x7f8fe401d3a5 - <unknown>
1: 0x7f8fe4034780 - <unknown>
2: 0x7f8fe401c5a3 - <unknown>
3: 0x7f8fe401d1f5 - <unknown>
4: 0x7f8fe401e6f6 - <unknown>
```
Here's the same panic after this change:
```
$ cargo +stage1 r --target x86_64-fortanix-unknown-sgx
stack backtrace:
0: 0x198bf - <unknown>
1: 0x3d181 - <unknown>
2: 0x26164 - <unknown>
3: 0x19705 - <unknown>
4: 0x1ef36 - <unknown>
```
cc `@jethrogb` and `@workingjubilee`
Add Seek::seek_relative
The `BufReader` struct has a `seek_relative` method because its `Seek::seek` implementation involved dumping the internal buffer (https://github.com/rust-lang/rust/issues/31100).
Unfortunately, there isn't really a good way to take advantage of that method in generic code. This PR adds the same method to the main `Seek` trait with the straightforward default method, and an override for `BufReader` that calls its implementation.
_Also discussed in [this](https://internals.rust-lang.org/t/add-seek-seek-relative/19546) internals.rust-lang.org thread._
Remove option_payload_ptr; redundant to offset_of
The `option_payload_ptr` intrinsic is no longer required as `offset_of` supports traversing enums (#114208). This PR removes it in order to dogfood offset_of (as suggested at https://github.com/rust-lang/rust/issues/106655#issuecomment-1790907626). However, it will not build until those changes reach beta (which I think is within the next 8 days?) so I've opened it as a draft.
Expose tests for {f32,f64}.total_cmp in docs
Expose tests for {f32,f64}.total_cmp in docs
Uncomment the helpful `assert_eq!` line, which is stripped out completely in docs, and leaves the reader to mentally play through the algorithm, or go to the playground and add a println!, to see what the result will be.
(If these tests are known to fail on some platforms, is there some mechanism to conditionalize this or escape the test so the `assert_eq!` source will be visible on the web? I am a newbie, which is why I was reading docs ;)
impl more traits for ptr::Alignment, add mask method
Changes:
* Adds `rustc_const_unstable` attributes where missing
* Makes `log2` method const
* Adds `mask` method
* Implements `Default`, which is equivalent to `Alignment::MIN`
No longer included in PR:
* Removes indirection of `AlignmentEnum` type alias (this was intentional)
* Implements `Display`, `Binary`, `Octal`, `LowerHex`, and `UpperHex` (should go through libs-api instead)
* Controversially implements `LowerExp` and `UpperExp` using `p` instead of `e` to indicate a power of 2 (also should go through libs-api)
Tracking issue for `ptr::Alignment`: #102070
Reenable effects in libcore
With #116670, #117531, and #117171, I think we would be comfortable with re-enabling the effects feature for more testing in libcore.
r? `@oli-obk`
cc `@fmease`
cc #110395
Add T: ?Sized to `RwLockReadGuard` and `RwLockWriteGuard`'s Debug impls.
For context, `MutexGuard` has `+ ?Sized` on its `Debug` impl, and all three have `+ ?Sized` on their `Display` impls.
It looks like the `?Sized` was just missed when the impls were added (the impl for `MutexGuard` was added in the same PR (https://github.com/rust-lang/rust/pull/38006) with support for `T: Debug + ?Sized`, and `RwLock*Guard`s did allow `T: ?Sized` types already); the `Display` impls were added later (https://github.com/rust-lang/rust/pull/42822) with support for `T: Debug + ?Sized` types.
I think this needs a T-libs-api FCP? I'm not sure if this also needs an ACP. If so I can make one.
These are changes to (stable) trait impls on stable types so will be insta-stable.
`@rustbot` label +T-libs-api
Remove asmjs
Fulfills [MCP 668](https://github.com/rust-lang/compiler-team/issues/668).
`asmjs-unknown-emscripten` does not work as-specified, and lacks essential upstream support for generating asm.js, so it should not exist at all.
feat: implement `DoubleEndedSearcher` for `CharArray[Ref]Searcher`
This PR implements `DoubleEndedSearcher` for both `CharArraySearcher` and `CharArrayRefSearcher`. I'm not sure whether this was just overlooked or if there is a reason for it, but since it behaves exactly like `CharSliceSearcher`, I think the implementations should be appropriate.
document ABI compatibility
I don't think we have any central place where we document our ABI compatibility rules, so let's create one. The `fn()` pointer type seems like a good place since ABI questions can only become relevant when invoking a function through a function pointer.
This will likely need T-lang FCP.
avoid exhaustive i16 test in Miri
https://github.com/rust-lang/rust/pull/116301 added a test that is way too slow to be running in Miri. So let's only test a few hopefully representative cases.
Custom MIR: Support cleanup blocks
Cleanup blocks are declared with `bb (cleanup) = { ... }`.
`Call` and `Drop` terminators take an additional argument describing the unwind action, which is one of the following:
* `UnwindContinue()`
* `UnwindUnreachable()`
* `UnwindTerminate(reason)`, where reason is `ReasonAbi` or `ReasonInCleanup`
* `UnwindCleanup(block)`
Also support unwind resume and unwind terminate terminators:
* `UnwindResume()`
* `UnwindTerminate(reason)`
Cleanup blocks are declared with `bb (cleanup) = { ... }`.
`Call` and `Drop` terminators take an additional argument describing the
unwind action, which is one of the following:
* `UnwindContinue()`
* `UnwindUnreachable()`
* `UnwindTerminate(reason)`, where reason is `ReasonAbi` or `ReasonInCleanup`
* `UnwindCleanup(block)`
Also support unwind resume and unwind terminate terminators:
* `UnwindResume()`
* `UnwindTerminate(reason)`
Add `std:#️⃣:{DefaultHasher, RandomState}` exports (needs FCP)
This implements rust-lang/libs-team#267 to move the libstd hasher types to `std::hash` where they belong, instead of `std::collections::hash_map`.
<details><summary>The below no longer applies, but is kept for clarity.</summary>
This is a small refactor for #27242, which moves the definitions of `RandomState` and `DefaultHasher` into `std::hash`, but in a way that won't be noticed in the public API.
I've opened rust-lang/libs-team#267 as a formal ACP to move these directly into the root of `std::hash`, but for now, they're at least separated out from the collections code in a way that will make moving that around easier.
I decided to simply copy the rustdoc for `std::hash` from `core::hash` since I think it would be ideal for the two to diverge longer-term, especially if the ACP is accepted. However, I would be willing to factor them out into a common markdown document if that's preferred.
</details>
Clarify UB in `get_unchecked(_mut)`
Inspired by #116915, it was unclear to me what exactly "out-of-bounds index" means in `get_unchecked`.
One could [potentially](https://rust.godbolt.org/z/hxM764orW) interpret it that `get_unchecked` is just another way to write `offset`, but I think `get_unchecked(len)` is supposed to be UB even though `.offet(len)` is well-defined (as is `.get_unchecked(..len)`), so write that more directly in the docs.
**libs-api folks**: Can you confirm whether this is what you expect this to mean? And is the situation any different for `<*const [T]>::get_unchecked`?