Context is no longer Sync so this doesn't work.
error[E0277]: `*mut ()` cannot be shared between threads safely
--> library/core/tests/task.rs:24:21
|
24 | static CONTEXT: Context<'static> = Context::from_waker(&WAKER);
| ^^^^^^^^^^^^^^^^ `*mut ()` cannot be shared between threads safely
|
= help: within `Context<'static>`, the trait `Sync` is not implemented for `*mut ()`
= note: required because it appears within the type `PhantomData<*mut ()>`
= note: required because it appears within the type `Context<'static>`
= note: shared static variables must have a type that implements `Sync`
More inference-friendly API for lazy
The signature for new was
```
fn new<F>(f: F) -> Lazy<T, F>
```
Notably, with `F` unconstrained, `T` can be literally anything, and just `let _ = Lazy::new(|| 92)` would not typecheck.
This historiacally was a necessity -- `new` is a `const` function, it couldn't have any bounds. Today though, we can move `new` under the `F: FnOnce() -> T` bound, which gives the compiler enough data to infer the type of T from closure.
According to Godbolt¹, on x86_64 using binary and produces slightly
better code than using subtraction. Readability of both is pretty
much equivalent so might just as well use the shorter option.
¹ https://rust.godbolt.org/z/9jM3ejbMx
Add slice methods for indexing via an array of indices.
Disclaimer: It's been a while since I contributed to the main Rust repo, apologies in advance if this is large enough already that it should've been an RFC.
---
# Update:
- Based on feedback, removed the `&[T]` variant of this API, and removed the requirements for the indices to be sorted.
# Description
This adds the following slice methods to `core`:
```rust
impl<T> [T] {
pub unsafe fn get_many_unchecked_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
pub fn get_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> Option<[&mut T; N]>;
}
```
This allows creating multiple mutable references to disjunct positions in a slice, which previously required writing some awkward code with `split_at_mut()` or `iter_mut()`. For the bound-checked variant, the indices are checked against each other and against the bounds of the slice, which requires `N * (N + 1) / 2` comparison operations.
This has a proof-of-concept standalone implementation here: https://crates.io/crates/index_many
Care has been taken that the implementation passes miri borrow checks, and generates straight-forward assembly (though this was only checked on x86_64).
# Example
```rust
let v = &mut [1, 2, 3, 4];
let [a, b] = v.get_many_mut([0, 2]).unwrap();
std::mem::swap(a, b);
*v += 100;
assert_eq!(v, &[3, 2, 101, 4]);
```
# Codegen Examples
<details>
<summary>Click to expand!</summary>
Disclaimer: Taken from local tests with the standalone implementation.
## Unchecked Indexing:
```rust
pub unsafe fn example_unchecked(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
slice.get_many_unchecked_mut(indices)
}
```
```nasm
example_unchecked:
mov rcx, qword, ptr, [r9]
mov r8, qword, ptr, [r9, +, 8]
mov r9, qword, ptr, [r9, +, 16]
lea rcx, [rdx, +, 8*rcx]
lea r8, [rdx, +, 8*r8]
lea rdx, [rdx, +, 8*r9]
mov qword, ptr, [rax], rcx
mov qword, ptr, [rax, +, 8], r8
mov qword, ptr, [rax, +, 16], rdx
ret
```
## Checked Indexing (Option):
```rust
pub unsafe fn example_option(slice: &mut [usize], indices: [usize; 3]) -> Option<[&mut usize; 3]> {
slice.get_many_mut(indices)
}
```
```nasm
mov r10, qword, ptr, [r9, +, 8]
mov rcx, qword, ptr, [r9, +, 16]
cmp rcx, r10
je .LBB0_7
mov r9, qword, ptr, [r9]
cmp rcx, r9
je .LBB0_7
cmp rcx, r8
jae .LBB0_7
cmp r10, r9
je .LBB0_7
cmp r9, r8
jae .LBB0_7
cmp r10, r8
jae .LBB0_7
lea r8, [rdx, +, 8*r9]
lea r9, [rdx, +, 8*r10]
lea rcx, [rdx, +, 8*rcx]
mov qword, ptr, [rax], r8
mov qword, ptr, [rax, +, 8], r9
mov qword, ptr, [rax, +, 16], rcx
ret
.LBB0_7:
mov qword, ptr, [rax], 0
ret
```
## Checked Indexing (Panic):
```rust
pub fn example_panic(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
let len = slice.len();
match slice.get_many_mut(indices) {
Some(s) => s,
None => {
let tmp = indices;
index_many::sorted_bound_check_failed(&tmp, len)
}
}
}
```
```nasm
example_panic:
sub rsp, 56
mov rax, qword, ptr, [r9]
mov r10, qword, ptr, [r9, +, 8]
mov r9, qword, ptr, [r9, +, 16]
cmp r9, r10
je .LBB0_6
cmp r9, rax
je .LBB0_6
cmp r9, r8
jae .LBB0_6
cmp r10, rax
je .LBB0_6
cmp rax, r8
jae .LBB0_6
cmp r10, r8
jae .LBB0_6
lea rax, [rdx, +, 8*rax]
lea r8, [rdx, +, 8*r10]
lea rdx, [rdx, +, 8*r9]
mov qword, ptr, [rcx], rax
mov qword, ptr, [rcx, +, 8], r8
mov qword, ptr, [rcx, +, 16], rdx
mov rax, rcx
add rsp, 56
ret
.LBB0_6:
mov qword, ptr, [rsp, +, 32], rax
mov qword, ptr, [rsp, +, 40], r10
mov qword, ptr, [rsp, +, 48], r9
lea rcx, [rsp, +, 32]
mov edx, 3
call index_many::bound_check_failed
ud2
```
</details>
# Extensions
There are multiple optional extensions to this.
## Indexing With Ranges
This could easily be expanded to allow indexing with `[I; N]` where `I: SliceIndex<Self>`. I wanted to keep the initial implementation simple, so I didn't include it yet.
## Panicking Variant
We could also add this method:
```rust
impl<T> [T] {
fn index_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
}
```
This would work similar to the regular index operator and panic with out-of-bound indices. The advantage would be that we could more easily ensure good codegen with a useful panic message, which is non-trivial with the `Option` variant.
This is implemented in the standalone implementation, and used as basis for the codegen examples here and there.
`VecDeque::resize` should re-use the buffer in the passed-in element
Today it always copies it for *every* appended element, but one of those clones is avoidable.
This adds `iter::repeat_n` (https://github.com/rust-lang/rust/issues/104434) as the primitive needed to do this. If this PR is acceptable, I'll also use this in `Vec` rather than its custom `ExtendElement` type & infrastructure that is harder to share between multiple different containers:
101e1822c3/library/alloc/src/vec/mod.rs (L2479-L2492)
Fix mod_inv termination for the last iteration
On usize=u64 platforms, the 4th iteration would overflow the `mod_gate` back to 0. Similarly for usize=u32 platforms, the 3rd iteration would overflow much the same way.
I tested various approaches to resolving this, including approaches with `saturating_mul` and `widening_mul` to a double usize. Turns out LLVM likes `mul_with_overflow` the best. In fact now, that LLVM can see the iteration count is limited, it will happily unroll the loop into a nice linear sequence.
You will also notice that the code around the loop got simplified somewhat. Now that LLVM is handling the loop nicely, there isn’t any more reasons to manually unroll the first iteration out of the loop (though looking at the code today I’m not sure all that complexity was necessary in the first place).
Fixes#103361
Fix inconsistent rounding of 0.5 when formatted to 0 decimal places
As described in #70336, when displaying values to zero decimal places the value of 0.5 is rounded to 1, which is inconsistent with the display of other half-integer values which round to even.
From testing the flt2dec implementation, it looks like this comes down to the condition in the fixed-width Dragon implementation where an empty buffer is treated as a case to apply rounding up. I believe the change below fixes it and updates only the relevant tests.
Nevertheless I am aware this is very much a core piece of functionality, so please take a very careful look to make sure I haven't missed anything. I hope this change does not break anything in the wider ecosystem as having a consistent rounding behaviour in floating point formatting is in my opinion a useful feature to have.
Resolves#70336
Make `Hash`, `Hasher` and `BuildHasher` `#[const_trait]` and make `Sip` const `Hasher`
This PR enables using Hashes in const context.
r? ``@fee1-dead``
The signature for new was
```
fn new<F>(f: F) -> Lazy<T, F>
```
Notably, with `F` unconstrained, `T` can be literally anything, and just
`let _ = Lazy::new(|| 92)` would not typecheck.
This historiacally was a necessity -- `new` is a `const` function, it
couldn't have any bounds. Today though, we can move `new` under the `F:
FnOnce() -> T` bound, which gives the compiler enough data to infer the
type of T from closure.
Stabilize `duration_checked_float`
## Stabilization Report
This stabilization report is for a stabilization of `duration_checked_float`, tracking issue: https://github.com/rust-lang/rust/issues/83400.
### Implementation History
- https://github.com/rust-lang/rust/pull/82179
- https://github.com/rust-lang/rust/pull/90247
- https://github.com/rust-lang/rust/pull/96051
- Changed error type to `FromFloatSecsError` in https://github.com/rust-lang/rust/pull/90247
- https://github.com/rust-lang/rust/pull/96051 changes the rounding mode to round-to-nearest instead of truncate.
## API Summary
This stabilization report proposes the following API to be stabilized in `core`, along with their re-exports in `std`:
```rust
// core::time
impl Duration {
pub const fn try_from_secs_f32(secs: f32) -> Result<Duration, TryFromFloatSecsError>;
pub const fn try_from_secs_f64(secs: f64) -> Result<Duration, TryFromFloatSecsError>;
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TryFromFloatSecsError { ... }
impl core::fmt::Display for TryFromFloatSecsError { ... }
impl core::error::Error for TryFromFloatSecsError { ... }
```
These functions are made const unstable under `duration_consts_float`, tracking issue #72440.
There is an open question in the tracking issue around what the error type should be called which I was hoping to resolve in the context of an FCP.
In this stabilization PR, I have altered the name of the error type to `TryFromFloatSecsError`. In my opinion, the error type shares the name of the method (adjusted to accommodate both types of floats), which is consistent with other error types in `core`, `alloc` and `std` like `TryReserveError` and `TryFromIntError`.
## Experience Report
Code such as this is ready to be converted to a checked API to ensure it is panic free:
```rust
impl Time {
pub fn checked_add_f64(&self, seconds: f64) -> Result<Self, TimeError> {
// Fail safely during `f64` conversion to duration
if seconds.is_nan() || seconds.is_infinite() {
return Err(TzOutOfRangeError::new().into());
}
if seconds.is_sign_positive() {
self.checked_add(Duration::from_secs_f64(seconds))
} else {
self.checked_sub(Duration::from_secs_f64(-seconds))
}
}
}
```
See: https://github.com/artichoke/artichoke/issues/2194.
`@rustbot` label +T-libs-api -T-libs
cc `@mbartlett21`
On usize=u64 platforms, the 4th iteration would overflow the `mod_gate`
back to 0. Similarly for usize=u32 platforms, the 3rd iteration would
overflow much the same way.
I tested various approaches to resolving this, including approaches with
`saturating_mul` and `widening_mul` to a double usize. Turns out LLVM
likes `mul_with_overflow` the best. In fact now, that LLVM can see the
iteration count is limited, it will happily unroll the loop into a nice
linear sequence.
You will also notice that the code around the loop got simplified
somewhat. Now that LLVM is handling the loop nicely, there isn’t any
more reasons to manually unroll the first iteration out of the loop
(though looking at the code today I’m not sure all that complexity was
necessary in the first place).
Fixes#103361
introduce `{char, u8}::is_ascii_octdigit`
This feature adds two new APIs: `char::is_ascii_octdigit` and `u8::is_ascii_octdigit`, under the feature gate `is_ascii_octdigit`. These methods are shorthands for `char::is_digit(self, 8)` and `u8::is_digit(self, 8)`:
```rust
// core::char
impl char {
pub fn is_ascii_octdigit(self) -> bool;
}
// core::num
impl u8 {
pub fn is_ascii_octdigit(self) -> bool;
}
```
---
Couple of things I need help understanding:
- `const`ness: have I used the right attribute in this case?
- is there a way to run the tests for `core::char` alone, instead of `./x.py test library/core`?
Reimplement `carrying_add` and `borrowing_sub` for signed integers.
As per the discussion in #85532, this PR reimplements `carrying_add` and `borrowing_sub` for signed integers.
It also adds unit tests for both unsigned and signed integers, emphasing on the behaviours of the methods.
Make use of `[wrapping_]byte_{add,sub}`
These new methods trivially replace old `.cast().wrapping_offset().cast()` & similar code.
Note that [`arith_offset`](https://doc.rust-lang.org/std/intrinsics/fn.arith_offset.html) and `wrapping_offset` are the same thing.
r? ``@scottmcm``
_split off from #100746_
Expose `Utf8Lossy` as `Utf8Chunks`
This PR changes the feature for `Utf8Lossy` from `str_internals` to `utf8_lossy` and improves the API. This is done to eventually expose the API as stable.
Proposal: rust-lang/libs-team#54
Tracking Issue: #99543
Refactor iteration logic in the `Flatten` and `FlatMap` iterators
The `Flatten` and `FlatMap` iterators both delegate to `FlattenCompat`:
```rust
struct FlattenCompat<I, U> {
iter: Fuse<I>,
frontiter: Option<U>,
backiter: Option<U>,
}
```
Every individual iterator method that `FlattenCompat` implements needs to carefully manage this state, checking whether the `frontiter` and `backiter` are present, and storing the current iterator appropriately if iteration is aborted. This has led to methods such as `next`, `advance_by`, and `try_fold` all having similar code for managing the iterator's state.
I have extracted this common logic of iterating the inner iterators with the option to exit early into a `iter_try_fold` method:
```rust
impl<I, U> FlattenCompat<I, U>
where
I: Iterator<Item: IntoIterator<IntoIter = U>>,
{
fn iter_try_fold<Acc, Fold, R>(&mut self, acc: Acc, fold: Fold) -> R
where
Fold: FnMut(Acc, &mut U) -> R,
R: Try<Output = Acc>,
{ ... }
}
```
It passes each of the inner iterators to the given function as long as it keep succeeding. It takes care of managing `FlattenCompat`'s state, so that the actual `Iterator` methods don't need to. The resulting code that makes use of this abstraction is much more straightforward:
```rust
fn next(&mut self) -> Option<U::Item> {
#[inline]
fn next<U: Iterator>((): (), iter: &mut U) -> ControlFlow<U::Item> {
match iter.next() {
None => ControlFlow::CONTINUE,
Some(x) => ControlFlow::Break(x),
}
}
self.iter_try_fold((), next).break_value()
}
```
Note that despite being implemented in terms of `iter_try_fold`, `next` is still able to benefit from `U`'s `next` method. It therefore does not take the performance hit that implementing `next` directly in terms of `Self::try_fold` causes (in some benchmarks).
This PR also adds `iter_try_rfold` which captures the shared logic of `try_rfold` and `advance_back_by`, as well as `iter_fold` and `iter_rfold` for folding without early exits (used by `fold`, `rfold`, `count`, and `last`).
Benchmark results:
```
before after
bench_flat_map_sum 423,255 ns/iter 414,338 ns/iter
bench_flat_map_ref_sum 1,942,139 ns/iter 2,216,643 ns/iter
bench_flat_map_chain_sum 1,616,840 ns/iter 1,246,445 ns/iter
bench_flat_map_chain_ref_sum 4,348,110 ns/iter 3,574,775 ns/iter
bench_flat_map_chain_option_sum 780,037 ns/iter 780,679 ns/iter
bench_flat_map_chain_option_ref_sum 2,056,458 ns/iter 834,932 ns/iter
```
I added the last two benchmarks specifically to demonstrate an extreme case where `FlatMap::next` can benefit from custom internal iteration of the outer iterator, so take it with a grain of salt. We should probably do a perf run to see if the changes to `next` are worth it in practice.
Add `Iterator::array_chunks` (take N+1)
A revival of https://github.com/rust-lang/rust/pull/92393.
r? `@Mark-Simulacrum`
cc `@rossmacarthur` `@scottmcm` `@the8472`
I've tried to address most of the review comments on the previous attempt. The only thing I didn't address is `try_fold` implementation, I've left the "custom" one for now, not sure what exactly should it use.
Reoptimize layout array
This way it's one check instead of two, so hopefully (cc #99117) it'll be simpler for rustc perf too 🤞
Quick demonstration:
```rust
pub fn demo(n: usize) -> Option<Layout> {
Layout::array::<i32>(n).ok()
}
```
Nightly: <https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=e97bf33508aa03f38968101cdeb5322d>
```nasm
mov rax, rdi
mov ecx, 4
mul rcx
seto cl
movabs rdx, 9223372036854775805
xor esi, esi
cmp rax, rdx
setb sil
shl rsi, 2
xor edx, edx
test cl, cl
cmove rdx, rsi
ret
```
This PR (note no `mul`, in addition to being much shorter):
```nasm
xor edx, edx
lea rax, [4*rcx]
shr rcx, 61
sete dl
shl rdx, 2
ret
```
This is built atop `@CAD97` 's #99136; the new changes are cb8aba66ef6a0e17f08a0574e4820653e31b45a0.
I added a bunch more tests for `Layout::from_size_align` and `Layout::array` too.
Fix slice::ChunksMut aliasing
Fixes https://github.com/rust-lang/rust/issues/94231, details in that issue.
cc `@RalfJung`
This isn't done just yet, all the safety comments are placeholders. But otherwise, it seems to work.
I don't really like this approach though. There's a lot of unsafe code where there wasn't before, but as far as I can tell the only other way to uphold the aliasing requirement imposed by `__iterator_get_unchecked` is to use raw slices, which I think require the same amount of unsafe code. All that would do is tie the `len` and `ptr` fields together.
Oh I just looked and I'm pretty sure that `ChunksExactMut`, `RChunksMut`, and `RChunksExactMut` also need to be patched. Even more reason to put up a draft.
core::any: replace some generic types with impl Trait
This gives a cleaner API since the caller only specifies the concrete type they usually want to.
r? `@yaahc`
Fix `Skip::next` for non-fused inner iterators
`iter.skip(n).next()` will currently call `nth` and `next` in succession on `iter`, without checking whether `nth` exhausts the iterator. Using `?` to propagate a `None` value returned by `nth` avoids this.
Add assertion that `transmute_copy`'s U is not larger than T
This is called out as a safety requirement in the docs, but because knowing this can be done at compile time and constant folded (just like the `align_of` branch is removed), we can just panic here.
I've looked at the asm (using `cargo-asm`) of a function that both is correct and incorrect, and the panic is completely removed, or is unconditional, without needing build-std.
I don't expect this to cause much breakage in the wild. I scanned through https://miri.saethlin.dev/ub for issues that would look like this (error: Undefined Behavior: memory access failed: alloc1768 has size 1, so pointer to 8 bytes starting at offset 0 is out-of-bounds), but couldn't find any.
That doesn't rule out it happening in crates tested that fail earlier for some other reason, though, but it indicates that doing this is rare, if it happens at all. A crater run for this would need to be build and test, since this is a runtime thing.
Also added a few more transmute_copy tests.
Add a special case for align_offset /w stride != 1
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 produce ideal assembly, along
the lines of:
leaq 15(%rdi), %rax
andq $-16, %rax
This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:
pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
slice.offset(slice.align_offset(16) as isize)
}
// =>
movl %edi, %eax
andl $3, %eax
leaq 15(%rdi), %rcx
andq $-16, %rcx
subq %rdi, %rcx
shrq $2, %rcx
negq %rax
sbbq %rax, %rax
orq %rcx, %rax
leaq (%rdi,%rax,4), %rax
Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.
This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments. For example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.
Fixes#98809.
Sadly, this does not help with #72356.
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 produce ideal assembly, along
the lines of:
leaq 15(%rdi), %rax
andq $-16, %rax
This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:
pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
slice.offset(slice.align_offset(16) as isize)
}
// =>
movl %edi, %eax
andl $3, %eax
leaq 15(%rdi), %rcx
andq $-16, %rcx
subq %rdi, %rcx
shrq $2, %rcx
negq %rax
sbbq %rax, %rax
orq %rcx, %rax
leaq (%rdi,%rax,4), %rax
Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.
This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments. For example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.
Fixes#98809.
Sadly, this does not help with #72356.
Stabilize `core::ffi:c_*` and rexport in `std::ffi`
This only stabilizes the base types, not the non-zero variants, since
those have their own separate tracking issue and have not gone through
FCP to stabilize.
This only stabilizes the base types, not the non-zero variants, since
those have their own separate tracking issue and have not gone through
FCP to stabilize.
Allow arithmetic and certain bitwise ops on AtomicPtr
This is mainly to support migrating from `AtomicUsize`, for the strict provenance experiment.
This is a pretty dubious set of APIs, but it should be sufficient to allow code that's using `AtomicUsize` to manipulate a tagged pointer atomically. It's under a new feature gate, `#![feature(strict_provenance_atomic_ptr)]`, but I'm not sure if it needs its own tracking issue. I'm happy to make one, but it's not clear that it's needed.
I'm unsure if it needs changes in the various non-LLVM backends. Because we just cast things to integers anyway (and were already doing so), I doubt it.
API change proposal: https://github.com/rust-lang/libs-team/issues/60Fixes#95492
ptr::copy and ptr::swap are doing untyped copies
The consensus in https://github.com/rust-lang/rust/issues/63159 seemed to be that these operations should be "untyped", i.e., they should treat the data as raw bytes, should work when these bytes violate the validity invariant of `T`, and should exactly preserve the initialization state of the bytes that are being copied. This is already somewhat implied by the description of "copying/swapping size*N bytes" (rather than "N instances of `T`").
The implementations mostly already work that way (well, for LLVM's intrinsics the documentation is not precise enough to say what exactly happens to poison, but if this ever gets clarified to something that would *not* perfectly preserve poison, then I strongly assume there will be some way to make a copy that *does* perfectly preserve poison). However, I had to adjust `swap_nonoverlapping`; after ``@scottmcm's`` [recent changes](https://github.com/rust-lang/rust/pull/94212), that one (sometimes) made a typed copy. (Note that `mem::swap`, which works on mutable references, is unchanged. It is documented as "swapping the values at two mutable locations", which to me strongly indicates that it is indeed typed. It is also safe and can rely on `&mut T` pointing to a valid `T` as part of its safety invariant.)
On top of adding a test (that will be run by Miri), this PR then also adjusts the documentation to indeed stably promise the untyped semantics. I assume this means the PR has to go through t-libs (and maybe t-lang?) FCP.
Fixes https://github.com/rust-lang/rust/issues/63159
Add the Provider api to core::any
This is an implementation of [RFC 3192](https://github.com/rust-lang/rfcs/pull/3192) ~~(which is yet to be merged, thus why this is a draft PR)~~. It adds an API for type-driven requests and provision of data from trait objects. A primary use case is for the `Error` trait, though that is not implemented in this PR. The only major difference to the RFC is that the functionality is added to the `any` module, rather than being in a sibling `provide_any` module (as discussed in the RFC thread).
~~Still todo: improve documentation on items, including adding examples.~~
cc `@yaahc`
Extend ptr::null and null_mut to all thin (including extern) types
Fixes https://github.com/rust-lang/rust/issues/93959
This change was accepted in https://rust-lang.github.io/rfcs/2580-ptr-meta.html
Note that this changes the signature of **stable** functions. The change should be backward-compatible, but it is **insta-stable** since it cannot (easily, at all?) be made available only through a `#![feature(…)]` opt-in.
The RFC also proposed the same change for `NonNull::dangling`, which makes sense it terms of its signature but not in terms of its implementation. `dangling` uses `align_of()` as an address. But what `align_of()` should be for extern types or whether it should be allowed at all remains an open question.
This commit depends on https://github.com/rust-lang/rust/pull/93977, which is not yet part of the bootstrap compiler. So `#[cfg]` is used to only apply the change in stage 1+. As far a I know bounds cannot be made conditional with `#[cfg]`, so the entire functions are duplicated. This is unfortunate but temporary.
Since this duplication makes it less obvious in the diff, the new definitions differ in:
* More permissive bounds (`Thin` instead of implied `Sized`)
* Different implementation
* Having `rustc_allow_const_fn_unstable(const_fn_trait_bound)`
* Having `rustc_allow_const_fn_unstable(ptr_metadata)`
These guards changed to pointers in #97027, but their `Display` was
formatting that field directly, which made it show the raw pointer
value. Now we go through `Deref` to display the real value again.
Fixes https://github.com/rust-lang/rust/issues/93959
This change was accepted in https://rust-lang.github.io/rfcs/2580-ptr-meta.html
Note that this changes the signature of **stable** functions.
The change should be backward-compatible, but it is **insta-stable**
since it cannot (easily, at all?) be made available only
through a `#![feature(…)]` opt-in.
The RFC also proposed the same change for `NonNull::dangling`,
which makes sense it terms of its signature but not in terms of its implementation.
`dangling` uses `align_of()` as an address. But what `align_of()` should be for
extern types or whether it should be allowed at all remains an open question.
This commit depends on https://github.com/rust-lang/rust/pull/93977, which is not yet
part of the bootstrap compiler. So `#[cfg]` is used to only apply the change in
stage 1+. As far a I know bounds cannot be made conditional with `#[cfg]`, so the
entire functions are duplicated. This is unfortunate but temporary.
Since this duplication makes it less obvious in the diff,
the new definitions differ in:
* More permissive bounds (`Thin` instead of implied `Sized`)
* Different implementation
* Having `rustc_allow_const_fn_unstable(ptr_metadata)`
Add a dedicated length-prefixing method to `Hasher`
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Fixes#94026
r? rust-lang/libs
---
The core of this change is the following two new methods on `Hasher`:
```rust
pub trait Hasher {
/// Writes a length prefix into this hasher, as part of being prefix-free.
///
/// If you're implementing [`Hash`] for a custom collection, call this before
/// writing its contents to this `Hasher`. That way
/// `(collection![1, 2, 3], collection![4, 5])` and
/// `(collection![1, 2], collection![3, 4, 5])` will provide different
/// sequences of values to the `Hasher`
///
/// The `impl<T> Hash for [T]` includes a call to this method, so if you're
/// hashing a slice (or array or vector) via its `Hash::hash` method,
/// you should **not** call this yourself.
///
/// This method is only for providing domain separation. If you want to
/// hash a `usize` that represents part of the *data*, then it's important
/// that you pass it to [`Hasher::write_usize`] instead of to this method.
///
/// # Examples
///
/// ```
/// #![feature(hasher_prefixfree_extras)]
/// # // Stubs to make the `impl` below pass the compiler
/// # struct MyCollection<T>(Option<T>);
/// # impl<T> MyCollection<T> {
/// # fn len(&self) -> usize { todo!() }
/// # }
/// # impl<'a, T> IntoIterator for &'a MyCollection<T> {
/// # type Item = T;
/// # type IntoIter = std::iter::Empty<T>;
/// # fn into_iter(self) -> Self::IntoIter { todo!() }
/// # }
///
/// use std:#️⃣:{Hash, Hasher};
/// impl<T: Hash> Hash for MyCollection<T> {
/// fn hash<H: Hasher>(&self, state: &mut H) {
/// state.write_length_prefix(self.len());
/// for elt in self {
/// elt.hash(state);
/// }
/// }
/// }
/// ```
///
/// # Note to Implementers
///
/// If you've decided that your `Hasher` is willing to be susceptible to
/// Hash-DoS attacks, then you might consider skipping hashing some or all
/// of the `len` provided in the name of increased performance.
#[inline]
#[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
fn write_length_prefix(&mut self, len: usize) {
self.write_usize(len);
}
/// Writes a single `str` into this hasher.
///
/// If you're implementing [`Hash`], you generally do not need to call this,
/// as the `impl Hash for str` does, so you can just use that.
///
/// This includes the domain separator for prefix-freedom, so you should
/// **not** call `Self::write_length_prefix` before calling this.
///
/// # Note to Implementers
///
/// The default implementation of this method includes a call to
/// [`Self::write_length_prefix`], so if your implementation of `Hasher`
/// doesn't care about prefix-freedom and you've thus overridden
/// that method to do nothing, there's no need to override this one.
///
/// This method is available to be overridden separately from the others
/// as `str` being UTF-8 means that it never contains `0xFF` bytes, which
/// can be used to provide prefix-freedom cheaper than hashing a length.
///
/// For example, if your `Hasher` works byte-by-byte (perhaps by accumulating
/// them into a buffer), then you can hash the bytes of the `str` followed
/// by a single `0xFF` byte.
///
/// If your `Hasher` works in chunks, you can also do this by being careful
/// about how you pad partial chunks. If the chunks are padded with `0x00`
/// bytes then just hashing an extra `0xFF` byte doesn't necessarily
/// provide prefix-freedom, as `"ab"` and `"ab\u{0}"` would likely hash
/// the same sequence of chunks. But if you pad with `0xFF` bytes instead,
/// ensuring at least one padding byte, then it can often provide
/// prefix-freedom cheaper than hashing the length would.
#[inline]
#[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
fn write_str(&mut self, s: &str) {
self.write_length_prefix(s.len());
self.write(s.as_bytes());
}
}
```
With updates to the `Hash` implementations for slices and containers to call `write_length_prefix` instead of `write_usize`.
`write_str` defaults to using `write_length_prefix` since, as was pointed out in the issue, the `write_u8(0xFF)` approach is insufficient for hashers that work in chunks, as those would hash `"a\u{0}"` and `"a"` to the same thing. But since `SipHash` works byte-wise (there's an internal buffer to accumulate bytes until a full chunk is available) it overrides `write_str` to continue to use the add-non-UTF-8-byte approach.
---
Compatibility:
Because the default implementation of `write_length_prefix` calls `write_usize`, the changed hash implementation for slices will do the same thing the old one did on existing `Hasher`s.
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Update `int_roundings` methods from feedback
This updates `#![feature(int_roundings)]` (#88581) from feedback. All methods now take `NonZeroX`. The documentation makes clear that they panic in debug mode and wrap in release mode.
r? `@joshtriplett`
`@rustbot` label +T-libs +T-libs-api +S-waiting-on-review
Add slice::remainder
This adds a remainder function to the Slice iterator, so that a caller can access unused
elements if iteration stops.
Addresses #91733
Faster parsing for lower numbers for radix up to 16 (cont.)
( Continuation of https://github.com/rust-lang/rust/pull/83371 )
With LingMan's change I think this is potentially ready.
Make non-power-of-two alignments a validity error in `Layout`
Inspired by the zulip conversation about how `Layout` should better enforce `size <= isize::MAX as usize`, this uses an N-variant enum on N-bit platforms to require at the validity level that the existing invariant of "must be a power of two" is upheld.
This was MIRI can catch it, and means there's a more-specific type for `Layout` to store than just `NonZeroUsize`.
It's left as `pub(crate)` here; a future PR could consider giving it a tracking issue for non-internal usage.
Inspired by the zulip conversation about how `Layout` should better enforce `size < isize::MAX as usize`, this uses an N-variant enum on N-bit platforms to require at the validity level that the existing invariant of "must be a power of two" is upheld.
This was MIRI can catch it, and means there's a more-specific type for `Layout` to store than just `NonZeroUsize`.
Implement provenance preserving methods on NonNull
### Description
Add the `addr`, `with_addr`, `map_addr` methods to the `NonNull` type, and map the address type to `NonZeroUsize`.
### Motivation
The `NonNull` type is useful for implementing pointer types which have the 0-niche. It is currently possible to implement these provenance preserving functions by calling `NonNull::as_ptr` and `new_unchecked`. The adding these methods makes it more ergonomic.
### Testing
Added a unit test of a non-null tagged pointer type. This is based on some real code I have elsewhere, that currently routes the pointer through a `NonZeroUsize` and back out to produce a usable pointer. I wanted to produce an ideal version of the same tagged pointer struct that preserved pointer provenance.
### Related
Extension of APIs proposed in #95228 . I can also split this out into a separate tracking issue if that is better (though I may need some pointers on how to do that).
Handle rustc_const_stable attribute in library feature collector
The library feature collector in [compiler/rustc_passes/src/lib_features.rs](551b4fa395/compiler/rustc_passes/src/lib_features.rs) has only been looking at `#[stable(…)]`, `#[unstable(…)]`, and `#[rustc_const_unstable(…)]` attributes, while ignoring `#[rustc_const_stable(…)]`. The consequences of this were:
- When any const feature got stabilized (changing one or more `rustc_const_unstable` to `rustc_const_stable`), users who had previously enabled that unstable feature using `#![feature(…)]` would get told "unknown feature", rather than rustc's nicer "the feature … has been stable since … and no longer requires an attribute to enable".
This can be seen in the way that https://github.com/rust-lang/rust/pull/93957#issuecomment-1079794660 failed after rebase:
```console
error[E0635]: unknown feature `const_ptr_offset`
--> $DIR/offset_from_ub.rs:1:35
|
LL | #![feature(const_ptr_offset_from, const_ptr_offset)]
| ^^^^^^^^^^^^^^^^
```
- We weren't enforcing that a particular feature is either stable everywhere or unstable everywhere, and that a feature that has been stabilized has the same stabilization version everywhere, both of which we enforce for the other stability attributes.
This PR updates the library feature collector to handle `rustc_const_stable`, and fixes places in the standard library and test suite where `rustc_const_stable` was being used in a way that does not meet the rules for a stability attribute.
skip slow int_log tests in Miri
Iterating over i16::MAX many things takes a long time in Miri, let's not do that.
I added https://github.com/rust-lang/miri/pull/2044 on the Miri side to still give us some test coverage.
**Description**
Add the `addr`, `with_addr, `map_addr` methods to the `NonNull` type,
and map the address type to `NonZeroUsize`.
**Motiviation**
The `NonNull` type is useful for implementing pointer types which have
the 0-niche. It is currently possible to implement these provenance
preserving functions by calling `NonNull::as_ptr` and `new_unchecked`.
The addition of these methods simply make it more ergonomic to use.
**Testing**
Added a unit test of a nonnull tagged pointer type. This is based on
some real code I have elsewhere, that currently routes the pointer
through a `NonZeroUsize` and back out to produce a usable pointer.
Derive Eq for std::cmp::Ordering, instead of using manual impl.
This allows consts of type Ordering to be used in patterns, and with feature(adt_const_params) allows using `Ordering` as a const generic parameter.
Currently, `std::cmp::Ordering` implements `Eq` using a manually written `impl Eq for Ordering {}`, instead of `derive(Eq)`. This means that it does not implement `StructuralEq`.
This commit removes the manually written impl, and adds `derive(Eq)` to `Ordering`, so that it will implement `StructuralEq`.
Let `try_collect` take advantage of `try_fold` overrides
No public API changes.
With this change, `try_collect` (#94047) is no longer going through the `impl Iterator for &mut impl Iterator`, and thus will be able to use `try_fold` overrides instead of being forced through `next` for every element.
Here's the test added, to see that it fails before this PR (once a new enough nightly is out): https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=462f2896f2fed2c238ee63ca1a7e7c56
This might as well go to the same person as my last `try_process` PR (#93572), so
r? ``@yaahc``
Enable conditional checking of values in the Rust codebase
This pull-request enable conditional checking of (well known) values in the Rust codebase.
Well known values were added in https://github.com/rust-lang/rust/pull/94362. All the `target_*` values are taken from all the built-in targets which is why some extra values were needed do be added as they are not (yet ?) defined in any built-in targets.
r? `@Mark-Simulacrum`
This updates the standard library's documentation to use the new syntax. The
documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).
Add Iterator::collect_into
This PR adds `Iterator::collect_into` as proposed by ``@cormacrelf`` in #48597 (see https://github.com/rust-lang/rust/pull/48597#issuecomment-842083688).
Followup of #92982.
This adds the following method to the Iterator trait:
```rust
fn collect_into<E: Extend<Self::Item>>(self, collection: &mut E) -> &mut E
```
Implement `RawWaker` and `Waker` getters for underlying pointers
implement #87021
New APIs:
- `RawWaker::data(&self) -> *const ()`
- `RawWaker::vtable(&self) -> &'static RawWakerVTable`
- ~`Waker::as_raw_waker(&self) -> &RawWaker`~ `Waker::as_raw(&self) -> &RawWaker`
This third one is an auxiliary function to make the two APIs above more useful. Since we can only get `&Waker` in `Future::poll`, without this, we need to `transmute` it into `&RawWaker` (relying on `repr(transparent)`) in order to access its data/vtable pointers.
~Not sure if it should be named `as_raw` or `as_raw_waker`. Seems we always use `as_<something-raw>` instead of just `as_raw`. But `as_raw_waker` seems not quite consistent with `Waker::from_raw`.~ As suggested in https://github.com/rust-lang/rust/pull/91828#discussion_r770729837, use `as_raw`.
Carefully remove bounds checks from some chunk iterator functions
So, I was writing code that requires the equivalent of `rchunks(N).rev()` (which isn't the same as forward `chunks(N)` — in particular, if the buffer length is not a multiple of `N`, I must handle the "remainder" first).
I happened to look at the codegen output of the function (I was actually interested in whether or not a nested loop was being unrolled — it was), and noticed that in the outer `rchunks(n).rev()` loop, LLVM seemed to be unable to remove the bounds checks from the iteration: https://rust.godbolt.org/z/Tnz4MYY8f (this panic was from the split_at in `RChunks::next_back`).
After doing some experimentation, it seems all of the `next_back` in the non-exact chunk iterators have the issue: (`Chunks::next_back`, `RChunks::next_back`, `ChunksMut::next_back`, and `RChunksMut::next_back`)...
Even worse, the forward `rchunks` iterators sometimes have the issue as well (... but only sometimes). For example https://rust.godbolt.org/z/oGhbqv53r has bounds checks, but if I uncomment the loop body, it manages to remove the check (which is bizarre, since I'd expect the opposite...). I suspect it's highly dependent on the surrounding code, so I decided to remove the bounds checks from them anyway. Overall, this change includes:
- All `next_back` functions on the non-`Exact` iterators (e.g. `R?Chunks(Mut)?`).
- All `next` functions on the non-exact rchunks iterators (e.g. `RChunks(Mut)?`).
I wasn't able to catch any of the other chunk iterators failing to remove the bounds checks (I checked iterations over `r?chunks(_exact)?(_mut)?` with constant chunk sizes under `-O3`, `-Os`, and `-Oz`), which makes sense, since these were the cases where it was harder to prove the bounds check correct to remove...
In fact, it took quite a bit of thinking to convince myself that using unchecked_ here was valid — so I'm not really surprised that LLVM had trouble (although compilers are slightly better at this sort of reasoning than humans). A consequence of that is the fact that the `// SAFETY` comment for these are... kinda long...
---
I didn't do this for, or even think about it for, any of the other iteration methods; just `next` and `next_back` (where it mattered). If this PR is accepted, I'll file a follow up for someone (possibly me) to look at the others later (in particular, `nth`/`nth_back` looked like they had similar logic), but I wanted to do this now, as IMO `next`/`next_back` are the most important here, since they're what gets used by the iteration protocol.
---
Note: While I don't expect this to impact performance directly, the panic is a side effect, which would otherwise not exist in these loops. That is, this could prevent the compiler from being able to move/remove/otherwise rework a loop over these iterators (as an example, it could not delete the code for a loop whose body computes a value which doesn't get used).
Also, some like to be able to have confidence this code has no panicking branches in the optimized code, and "no bounds checks" is kinda part of the selling point of Rust's iterators anyway.
Remove deprecated and unstable slice_partition_at_index functions
They have been deprecated since commit 01ac5a97c9
which was part of the 1.49.0 release, so from the point of nightly,
11 releases ago.
Make `char::DecodeUtf16::size_hist` more precise
New implementation takes into account contents of `self.buf` and rounds lower bound up instead of down.
Fixes#88762
Revival of #88763
Add `intrinsics::const_deallocate`
Tracking issue: #79597
Related: #91884
This allows deallocation of a memory allocated by `intrinsics::const_allocate`. At the moment, this can be only used to reduce memory usage, but in the future this may be useful to detect memory leaks (If an allocated memory remains after evaluation, raise an error...?).
impl Not for !
The lack of this impl caused trouble for me in some degenerate cases of macro-generated code of the form `if !$cond {...}`, even without `feature(never_type)` on a stable compiler. Namely if `$cond` contains a `return` or `break` or similar diverging expression, which would otherwise be perfectly legal in boolean position, the code previously failed to compile with:
```console
error[E0600]: cannot apply unary operator `!` to type `!`
--> library/core/tests/ops.rs:239:8
|
239 | if !return () {}
| ^^^^^^^^^^ cannot apply unary operator `!`
```
Simplification of BigNum::bit_length
As indicated in the comment, the BigNum::bit_length function could be
optimized by using CLZ, which is often a single instruction instead a
loop.
I think the code is also simpler now without the loop.
I added some additional tests for Big8x3 and Big32x40 to ensure that
there were no regressions.
Partially stabilize `maybe_uninit_extra`
This covers:
```rust
impl<T> MaybeUninit<T> {
pub unsafe fn assume_init_read(&self) -> T { ... }
pub unsafe fn assume_init_drop(&mut self) { ... }
}
```
It does not cover the const-ness of `write` under `const_maybe_uninit_write` nor the const-ness of `assume_init_read` (this commit adds `const_maybe_uninit_assume_init_read` for that).
FCP: https://github.com/rust-lang/rust/issues/63567#issuecomment-958590287.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
This covers:
impl<T> MaybeUninit<T> {
pub unsafe fn assume_init_read(&self) -> T { ... }
pub unsafe fn assume_init_drop(&mut self) { ... }
}
It does not cover the const-ness of `write` under
`const_maybe_uninit_write` nor the const-ness of
`assume_init_read` (this commit adds
`const_maybe_uninit_assume_init_read` for that).
FCP: https://github.com/rust-lang/rust/issues/63567#issuecomment-958590287.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
As indicated in the comment, the BigNum::bit_length function could be
optimized by using CLZ, which is often a single instruction instead a
loop.
I think the code is also simpler now without the loop.
I added some additional tests for Big8x3 and Big32x40 to ensure that
there were no regressions.
disable test with self-referential generator on Miri
Running the libcore test suite in Miri currently fails due to the known incompatibility of self-referential generators with Miri's aliasing checks (https://github.com/rust-lang/unsafe-code-guidelines/issues/148). So let's disable that test in Miri for now.
Allow reverse iteration of lowercase'd/uppercase'd chars
The PR implements `DoubleEndedIterator` trait for `ToLowercase` and `ToUppercase`.
This enables reverse iteration of lowercase/uppercase variants of character sequences.
One of use cases: determining whether a char sequence is a suffix of another one.
Example:
```rust
fn endswith_ignore_case(s1: &str, s2: &str) -> bool {
for eob in s1
.chars()
.flat_map(|c| c.to_lowercase())
.rev()
.zip_longest(s2.chars().flat_map(|c| c.to_lowercase()).rev())
{
match eob {
EitherOrBoth::Both(c1, c2) => {
if c1 != c2 {
return false;
}
}
EitherOrBoth::Left(_) => return true,
EitherOrBoth::Right(_) => return false,
}
}
true
}
```
Revert "Temporarily rename int_roundings functions to avoid conflicts"
This reverts commit 3ece63b64e.
This should be okay because #90329 has been merged.
r? `@joshtriplett`
Mark defaulted `PartialEq`/`PartialOrd` methods as const
WIthout it, `const` impls of these traits are unpleasant to write. I think this kind of change is allowed now. although it looks like it might require some Miri tweaks. Let's find out.
r? ```@fee1-dead```
Do array-slice equality via array equality, rather than always via slices
~~Draft because it needs a rebase after #91766 eventually gets through bors.~~
This enables the optimizations from #85828 to be used for array-to-slice comparisons too, not just array-to-array.
For example, <https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=5f9ba69b3d5825a782f897c830d3a6aa>
```rust
pub fn demo(x: &[u8], y: [u8; 4]) -> bool {
*x == y
}
```
Currently writes the array to stack for no reason:
```nasm
sub rsp, 4
mov dword ptr [rsp], edx
cmp rsi, 4
jne .LBB0_1
mov eax, dword ptr [rdi]
cmp eax, dword ptr [rsp]
sete al
add rsp, 4
ret
.LBB0_1:
xor eax, eax
add rsp, 4
ret
```
Whereas with the change in this PR it just compares it directly:
```nasm
cmp rsi, 4
jne .LBB1_1
cmp dword ptr [rdi], edx
sete al
ret
.LBB1_1:
xor eax, eax
ret
```
Constify `bool::then{,_some}`
Note on `~const Drop`: it has no effect when called from runtime functions, when called from const contexts, the trait system ensures that the type can be dropped in const contexts.
This'll still go via slices eventually for large arrays, but this way slice comparisons to short arrays can use the same memcmp-avoidance tricks.
Added some tests for all the combinations to make sure I didn't accidentally infinitely-recurse something.
Minor improvements to `future::join!`'s implementation
This is a follow-up from #91645, regarding [some remarks I made](https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-async-foundations/topic/join!/near/264293660).
Mainly:
- it hides the recursive munching through a private `macro`, to avoid leaking such details (a corollary is getting rid of the need to use ``@`` to disambiguate);
- it uses a `match` binding, _outside_ the `async move` block, to better match the semantics from function-like syntax;
- it pre-pins the future before calling into `poll_fn`, since `poll_fn`, alone, cannot guarantee that its capture does not move (to clarify: I believe the previous code was sound, thanks to the outer layer of `async`. But I find it clearer / more robust to refactorings this way 🙂).
- it uses `@ibraheemdev's` very neat `.ready()?`;
- it renames `Took` to `Taken` for consistency with `Done` (tiny nit 😄).
~~TODO~~Done:
- [x] Add unit tests to enforce the function-like `:value` semantics are respected.
r? `@nrc`
Sync portable-simd to remove autosplats
This PR syncs portable-simd in up to a8385522ad in order to address the type inference breakages documented on nightly in https://github.com/rust-lang/rust/issues/90904 by removing the vector + scalar binary operations (called "autosplats", "broadcasting", or "rank promotion", depending on who you ask) that allow `{scalar} + &'_ {scalar}` to fail in some cases, because it becomes possible the programmer may have meant `{scalar} + &'_ {vector}`.
A few quality-of-life improvements make their way in as well:
- Lane counts can now go to 64, as LLVM seems to have fixed their miscompilation for those.
- `{i,u}8x64` to `__m512i` is now available.
- a bunch of `#[must_use]` notes appear throughout the module.
- Some implementations, mostly instances of `impl core::ops::{Op}<Simd> for Simd` that aren't `{vector} + {vector}` (e.g. `{vector} + &'_ {vector}`), leverage some generics and `where` bounds now to make them easier to understand by reducing a dozen implementations into one (and make it possible for people to open the docs on less burly devices).
- And some internal-only improvements.
None of these changes should affect a beta backport, only actual users of `core::simd` (and most aren't even visible in the programmatic sense), though I can extract an even more minimal changeset for beta if necessary. It seemed simpler to just keep moving forward.
Make `array::{try_from_fn, try_map}` and `Iterator::try_find` generic over `Try`
Fixes#85115
This only updates unstable functions.
`array::try_map` didn't actually exist before; this adds it under the still-open tracking issue #79711 from the old PR #79713.
Tracking issue for the new trait: #91285
This would also solve the return type question in for the proposed `Iterator::try_reduce` in #87054
disable tests in Miri that take too long
Comparing slices of length `usize::MAX` diverges in Miri. In fact these tests even diverge in rustc unless `-O` is passed. I tried this code to check that:
```rust
#![feature(slice_take)]
const EMPTY_MAX: &'static [()] = &[(); usize::MAX];
fn main() {
let mut slice: &[_] = &[(); usize::MAX];
println!("1");
assert_eq!(Some(&[] as _), slice.take(usize::MAX..));
println!("2");
let remaining: &[_] = EMPTY_MAX;
println!("3");
assert_eq!(remaining, slice);
println!("4");
}
```
So, disable these tests in Miri for now.
Fixes 85115
This only updates unstable functions.
`array::try_map` didn't actually exist before, despite the tracking issue 79711 still being open from the old PR 79713.
Fix Iterator::advance_by contract inconsistency
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
Methods that were only blocked on `const_panic` have been stabilized.
The remaining methods of `duration_consts_2` are all related to floats,
and as such have been placed behind the `duration_consts_float` feature
gate.
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
Stabilize `const_raw_ptr_deref` for `*const T`
This stabilizes dereferencing immutable raw pointers in const contexts.
It does not stabilize `*mut T` dereferencing. This is behind the
same feature gate as mutable references.
closes https://github.com/rust-lang/rust/issues/51911
pub use core::simd;
A portable abstraction over SIMD has been a major pursuit in recent years for several programming languages. In Rust, `std::arch` offers explicit SIMD acceleration via compiler intrinsics, but it does so at the cost of having to individually maintain each and every single such API, and is almost completely `unsafe` to use. `core::simd` offers safe abstractions that are resolved to the appropriate SIMD instructions by LLVM during compilation, including scalar instructions if that is all that is available.
`core::simd` is enabled by the `#![portable_simd]` nightly feature tracked in https://github.com/rust-lang/rust/issues/86656 and is introduced here by pulling in the https://github.com/rust-lang/portable-simd repository as a subtree. We built the repository out-of-tree to allow faster compilation and a stochastic test suite backed by the proptest crate to verify that different targets, features, and optimizations produce the same result, so that using this library does not introduce any surprises. As these tests are technically non-deterministic, and thus can introduce overly interesting Heisenbugs if included in the rustc CI, they are visible in the commit history of the subtree but do nothing here. Some tests **are** introduced via the documentation, but these use deterministic asserts.
There are multiple unsolved problems with the library at the current moment, including a want for better documentation, technical issues with LLVM scalarizing and lowering to libm, room for improvement for the APIs, and so far I have not added the necessary plumbing for allowing the more experimental or libm-dependent APIs to be used. However, I thought it would be prudent to open this for review in its current condition, as it is both usable and it is likely I am going to learn something else needs to be fixed when bors tries this out.
The major types are
- `core::simd::Simd<T, N>`
- `core::simd::Mask<T, N>`
There is also the `LaneCount` struct, which, together with the SimdElement and SupportedLaneCount traits, limit the implementation's maximum support to vectors we know will actually compile and provide supporting logic for bitmasks. I'm hoping to simplify at least some of these out of the way as the compiler and library evolve.
These tests just verify some basic APIs of core::simd function, and
guarantees that attempting to access the wrong things doesn't work.
The majority of tests are stochastic, and so remain upstream, but
a few deterministic tests arrive in the subtree as doc tests.
This stabilizes dereferencing immutable raw pointers in const contexts.
It does not stabilize `*mut T` dereferencing. This is placed behind the
`const_raw_mut_ptr_deref` feature gate.
Add #[must_use] to expensive computations
The unifying theme for this commit is weak, admittedly. I put together a list of "expensive" functions when I originally proposed this whole effort, but nobody's cared about that criterion. Still, it's a decent way to bite off a not-too-big chunk of work.
Given the grab bag nature of this commit, the messages I used vary quite a bit. I'm open to wording changes.
For some reason clippy flagged four `BTreeSet` methods but didn't say boo about equivalent ones on `HashSet`. I stared at them for a while but I can't figure out the difference so I added the `HashSet` ones in.
```rust
// Flagged by clippy.
alloc::collections::btree_set::BTreeSet<T> fn difference<'a>(&'a self, other: &'a BTreeSet<T>) -> Difference<'a, T>;
alloc::collections::btree_set::BTreeSet<T> fn symmetric_difference<'a>(&'a self, other: &'a BTreeSet<T>) -> SymmetricDifference<'a, T>
alloc::collections::btree_set::BTreeSet<T> fn intersection<'a>(&'a self, other: &'a BTreeSet<T>) -> Intersection<'a, T>;
alloc::collections::btree_set::BTreeSet<T> fn union<'a>(&'a self, other: &'a BTreeSet<T>) -> Union<'a, T>;
// Ignored by clippy, but not by me.
std::collections::HashSet<T, S> fn difference<'a>(&'a self, other: &'a HashSet<T, S>) -> Difference<'a, T, S>;
std::collections::HashSet<T, S> fn symmetric_difference<'a>(&'a self, other: &'a HashSet<T, S>) -> SymmetricDifference<'a, T, S>
std::collections::HashSet<T, S> fn intersection<'a>(&'a self, other: &'a HashSet<T, S>) -> Intersection<'a, T, S>;
std::collections::HashSet<T, S> fn union<'a>(&'a self, other: &'a HashSet<T, S>) -> Union<'a, T, S>;
```
Parent issue: #89692
r? ```@joshtriplett```
Implement split_array and split_array_mut
This implements `[T]::split_array::<const N>() -> (&[T; N], &[T])` and `[T; N]::split_array::<const M>() -> (&[T; M], &[T])` and their mutable equivalents. These are another few “missing” array implementations now that const generics are a thing, similar to #74373, #75026, etc. Fixes#74674.
This implements `[T; N]::split_array` returning an array and a slice. Ultimately, this is probably not what we want, we would want the second return value to be an array of length N-M, which will likely be possible with future const generics enhancements. We need to implement the array method now though, to immediately shadow the slice method. This way, when the slice methods get stabilized, calling them on an array will not be automatic through coercion, so we won't have trouble stabilizing the array methods later (cf. into_iter debacle).
An unchecked version of `[T]::split_array` could also be added as in #76014. This would not be needed for `[T; N]::split_array` as that can be compile-time checked. Edit: actually, since split_at_unchecked is internal-only it could be changed to be split_array-only.
Make more `From` impls `const` (libcore)
Adding `const` to `From` implementations in the core. `rustc_const_unstable` attribute is not added to unstable implementations.
Tracking issue: #88674
<details>
<summary>Done</summary><div>
- `T` from `T`
- `T` from `!`
- `Option<T>` from `T`
- `Option<&T>` from `&Option<T>`
- `Option<&mut T>` from `&mut Option<T>`
- `Cell<T>` from `T`
- `RefCell<T>` from `T`
- `UnsafeCell<T>` from `T`
- `OnceCell<T>` from `T`
- `Poll<T>` from `T`
- `u32` from `char`
- `u64` from `char`
- `u128` from `char`
- `char` from `u8`
- `AtomicBool` from `bool`
- `AtomicPtr<T>` from `*mut T`
- `AtomicI(bits)` from `i(bits)`
- `AtomicU(bits)` from `u(bits)`
- `i(bits)` from `NonZeroI(bits)`
- `u(bits)` from `NonZeroU(bits)`
- `NonNull<T>` from `Unique<T>`
- `NonNull<T>` from `&T`
- `NonNull<T>` from `&mut T`
- `Unique<T>` from `&mut T`
- `Infallible` from `!`
- `TryIntError` from `!`
- `TryIntError` from `Infallible`
- `TryFromSliceError` from `Infallible`
- `FromResidual for Option<T>`
</div></details>
<details>
<summary>Remaining</summary><dev>
- `NonZero` from `NonZero`
These can't be made const at this time because these use Into::into.
https://github.com/rust-lang/rust/blob/master/library/core/src/convert/num.rs#L393
- `std`, `alloc`
There may still be many implementations that can be made `const`.
</div></details>
Automatic exponential formatting in Debug
Context: See [this comment from the libs team](https://github.com/rust-lang/rfcs/pull/2729#issuecomment-853454204)
---
Makes `"{:?}"` switch to exponential for floats based on magnitude. The libs team suggested exploring this idea in the discussion thread for RFC rust-lang/rfcs#2729. (**note:** this is **not** an implementation of the RFC; it is an implementation of one of the alternatives)
Thresholds chosen were 1e-4 and 1e16. Justification described [here](https://github.com/rust-lang/rfcs/pull/2729#issuecomment-864482954).
**This will require a crater run.**
---
As mentioned in the commit message of 8731d4dfb4, this behavior will not apply when a precision is supplied, because I wanted to preserve the following existing and useful behavior of `{:.PREC?}` (which recursively applies `{:.PREC}` to floats in a struct):
```rust
assert_eq!(
format!("{:.2?}", [100.0, 0.000004]),
"[100.00, 0.00]",
)
```
I looked around and am not sure where there are any tests that actually use this in the test suite, though?
All things considered, I'm surprised that this change did not seem to break even a single existing test in `x.py test --stage 2`. (even when I tried a smaller threshold of 1e6)
The unifying theme for this commit is weak, admittedly. I put together a
list of "expensive" functions when I originally proposed this whole
effort, but nobody's cared about that criterion. Still, it's a decent
way to bite off a not-too-big chunk of work.
Given the grab bag nature of this commit, the messages I used vary quite
a bit.
Speedup int log10 branchless
This is achieved with a branchless bit-twiddling implementation of the case x < 100_000, and using this as building block.
Benchmark on an Intel i7-8700K (Coffee Lake):
```
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 165 169 4 2.42% x 0.98
num::int_log::u8_log10_random 438 423 -15 -3.42% x 1.04
num::int_log::u8_log10_random_small 438 423 -15 -3.42% x 1.04
num::int_log::u16_log10_predictable 633 417 -216 -34.12% x 1.52
num::int_log::u16_log10_random 908 471 -437 -48.13% x 1.93
num::int_log::u16_log10_random_small 945 471 -474 -50.16% x 2.01
num::int_log::u32_log10_predictable 1,496 1,340 -156 -10.43% x 1.12
num::int_log::u32_log10_random 1,076 873 -203 -18.87% x 1.23
num::int_log::u32_log10_random_small 1,145 874 -271 -23.67% x 1.31
num::int_log::u64_log10_predictable 4,005 3,171 -834 -20.82% x 1.26
num::int_log::u64_log10_random 1,247 1,021 -226 -18.12% x 1.22
num::int_log::u64_log10_random_small 1,265 921 -344 -27.19% x 1.37
num::int_log::u128_log10_predictable 39,667 39,579 -88 -0.22% x 1.00
num::int_log::u128_log10_random 6,456 6,696 240 3.72% x 0.96
num::int_log::u128_log10_random_small 4,108 3,903 -205 -4.99% x 1.05
```
Benchmark on an M1 Mac Mini:
```
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 143 130 -13 -9.09% x 1.10
num::int_log::u8_log10_random 375 325 -50 -13.33% x 1.15
num::int_log::u8_log10_random_small 376 325 -51 -13.56% x 1.16
num::int_log::u16_log10_predictable 500 322 -178 -35.60% x 1.55
num::int_log::u16_log10_random 794 405 -389 -48.99% x 1.96
num::int_log::u16_log10_random_small 1,035 405 -630 -60.87% x 2.56
num::int_log::u32_log10_predictable 1,144 894 -250 -21.85% x 1.28
num::int_log::u32_log10_random 832 786 -46 -5.53% x 1.06
num::int_log::u32_log10_random_small 832 787 -45 -5.41% x 1.06
num::int_log::u64_log10_predictable 2,681 2,057 -624 -23.27% x 1.30
num::int_log::u64_log10_random 1,015 806 -209 -20.59% x 1.26
num::int_log::u64_log10_random_small 1,004 795 -209 -20.82% x 1.26
num::int_log::u128_log10_predictable 56,825 56,526 -299 -0.53% x 1.01
num::int_log::u128_log10_random 9,056 8,861 -195 -2.15% x 1.02
num::int_log::u128_log10_random_small 1,528 1,527 -1 -0.07% x 1.00
```
The 128 bit case remains ridiculously slow because llvm fails to optimize division by a constant 128-bit value to multiplications. This could be worked around but it seems preferable to fix this in llvm.
From u32 up, table lookup (like suggested [here](https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813)) is still faster, but requires a hardware `leading_zeros` to be viable, and might clog up the cache.
Add 'core::array::from_fn' and 'core::array::try_from_fn'
These auxiliary methods fill uninitialized arrays in a safe way and are particularly useful for elements that don't implement `Default`.
```rust
// Foo doesn't implement Default
struct Foo(usize);
let _array = core::array::from_fn::<_, _, 2>(|idx| Foo(idx));
```
Different from `FromIterator`, it is guaranteed that the array will be fully filled and no error regarding uninitialized state will be throw. In certain scenarios, however, the creation of an **element** can fail and that is why the `try_from_fn` function is also provided.
```rust
#[derive(Debug, PartialEq)]
enum SomeError {
Foo,
}
let array = core::array::try_from_fn(|i| Ok::<_, SomeError>(i));
assert_eq!(array, Ok([0, 1, 2, 3, 4]));
let another_array = core::array::try_from_fn(|_| Err(SomeError::Foo));
assert_eq!(another_array, Err(SomeError::Foo));
```
Fix Lower/UpperExp formatting for integers and precision zero
Fixes the integer part of #89493 (I daren't touch the floating-point formatting code). The issue is that the "subtracted" precision essentially behaves like extra trailing zeros, but this is not currently reflected in the code properly.
implement advance_(back_)_by on more iterators
Add more efficient, non-default implementations for `feature(iter_advance_by)` (#77404) on more iterators and adapters.
This PR only contains implementations where skipping over items doesn't elide any observable side-effects such as user-provided closures or `clone()` functions. I'll put those in a separate PR.
const fn for option copied, take & replace
Tracking issue: [#67441](https://github.com/rust-lang/rust/issues/67441)
Adding const fn for the copied, take and replace method of Option. Also adding necessary unit test.
It's my first contribution so I am pretty sure I don't know what I'm doing but there's a first for everything!
Constify ?-operator for Result and Option
Try to make `?`-operator usable in `const fn` with `Result` and `Option`, see #74935 . Note that the try-operator itself was constified in #87237.
TODO
* [x] Add tests for const T -> T conversions
* [x] cleanup commits
* [x] Remove `#![allow(incomplete_features)]`
* [?] Await decision in #86808 - I'm not sure
* [x] Await support for parsing `~const` in bootstrapping compiler
* [x] Tracking issue(s)? - #88674
Make `Duration` respect `width` when formatting using `Debug`
When printing or writing a `std::time::Duration` using `Debug` formatting, it previously completely ignored any specified `width`. This is unlike types like integers and floats, which do pad to `width`, for both `Display` and `Debug`, though not all types consider `width` in their `Debug` output (see e.g. #30164). Curiously, `Duration`'s `Debug` formatting *did* consider `precision`.
This PR makes `Duration` pad to `width` just like integers and floats, so that
```rust
format!("|{:8?}|", Duration::from_millis(1234))
```
returns
```
|1.234s |
```
Before you ask "who formats `Debug` output?", note that `Duration` doesn't actually implement `Display`, so `Debug` is currently the only way to format `Duration`s. I think that's wrong, and `Duration` should get a `Display` implementation, but in the meantime there's no harm in making the `Debug` formatting respect `width` rather than ignore it.
I chose the default alignment to be left-aligned. The general rule Rust uses is: numeric types are right-aligned by default, non-numeric types left-aligned. It wasn't clear to me whether `Duration` is a numeric type or not. The fact that a formatted `Duration` can end with suffixes of variable length (`"s"`, `"ms"`, `"µs"`, etc.) made me lean towards left-alignment, but it would be trivial to change it.
Fixes issue #88059.
These functions are unstable, but because they're inherent they still
introduce conflicts with stable trait functions in crates. Temporarily
rename them to fix these conflicts, until we can resolve those conflicts
in a better way.
This is achieved with a branchless bit-twiddling implementation of the
case x < 100_000, and using this as building block.
Benchmark on an Intel i7-8700K (Coffee Lake):
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 165 169 4 2.42% x 0.98
num::int_log::u8_log10_random 438 423 -15 -3.42% x 1.04
num::int_log::u8_log10_random_small 438 423 -15 -3.42% x 1.04
num::int_log::u16_log10_predictable 633 417 -216 -34.12% x 1.52
num::int_log::u16_log10_random 908 471 -437 -48.13% x 1.93
num::int_log::u16_log10_random_small 945 471 -474 -50.16% x 2.01
num::int_log::u32_log10_predictable 1,496 1,340 -156 -10.43% x 1.12
num::int_log::u32_log10_random 1,076 873 -203 -18.87% x 1.23
num::int_log::u32_log10_random_small 1,145 874 -271 -23.67% x 1.31
num::int_log::u64_log10_predictable 4,005 3,171 -834 -20.82% x 1.26
num::int_log::u64_log10_random 1,247 1,021 -226 -18.12% x 1.22
num::int_log::u64_log10_random_small 1,265 921 -344 -27.19% x 1.37
num::int_log::u128_log10_predictable 39,667 39,579 -88 -0.22% x 1.00
num::int_log::u128_log10_random 6,456 6,696 240 3.72% x 0.96
num::int_log::u128_log10_random_small 4,108 3,903 -205 -4.99% x 1.05
Benchmark on an M1 Mac Mini:
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 143 130 -13 -9.09% x 1.10
num::int_log::u8_log10_random 375 325 -50 -13.33% x 1.15
num::int_log::u8_log10_random_small 376 325 -51 -13.56% x 1.16
num::int_log::u16_log10_predictable 500 322 -178 -35.60% x 1.55
num::int_log::u16_log10_random 794 405 -389 -48.99% x 1.96
num::int_log::u16_log10_random_small 1,035 405 -630 -60.87% x 2.56
num::int_log::u32_log10_predictable 1,144 894 -250 -21.85% x 1.28
num::int_log::u32_log10_random 832 786 -46 -5.53% x 1.06
num::int_log::u32_log10_random_small 832 787 -45 -5.41% x 1.06
num::int_log::u64_log10_predictable 2,681 2,057 -624 -23.27% x 1.30
num::int_log::u64_log10_random 1,015 806 -209 -20.59% x 1.26
num::int_log::u64_log10_random_small 1,004 795 -209 -20.82% x 1.26
num::int_log::u128_log10_predictable 56,825 56,526 -299 -0.53% x 1.01
num::int_log::u128_log10_random 9,056 8,861 -195 -2.15% x 1.02
num::int_log::u128_log10_random_small 1,528 1,527 -1 -0.07% x 1.00
The 128 bit case remains ridiculously slow because llvm fails to optimize division by
a constant 128-bit value to multiplications. This could be worked around but it seems
preferable to fix this in llvm.
From u32 up, table lookup (like suggested here
https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813) is still
faster, but requires a hardware leading_zero to be viable, and might clog up the
cache.
Stabilize `UnsafeCell::raw_get()`
This PR stabilizes the associated function `UnsafeCell::raw_get()`. The FCP has [already completed](https://github.com/rust-lang/rust/issues/66358#issuecomment-899095068). While there was some discussion about the naming after the close of the FCP, it looks like people have agreed on this name. Still, it would probably be best if a `libs-api` member had a look at this and stated whether more discussion is needed.
While I was at it, I added some tests for `UnsafeCell`, because there were barely any.
Closes#66358.
fix: move test that require mut to another
Adding TODOs for Option::take and Option::copied
TODO to FIXME + moving const stability under normal
Moving const stability attr under normal stab attr
move more rustc stability attributes
Added the `Option::unzip()` method
* Adds the `Option::unzip()` method to turn an `Option<(T, U)>` into `(Option<T>, Option<U>)` under the `unzip_option` feature
* Adds tests for both `Option::unzip()` and `Option::zip()`, I noticed that `.zip()` didn't have any
* Adds `#[inline]` to a few of `Option`'s methods that were missing it
implement TrustedLen for Flatten/FlatMap if the U: IntoIterator == [T; N]
This only works if arrays are passed directly instead of array iterators
because we need to be sure that they have not been advanced before
Flatten does its size calculation.
resolves#87094
Update Rust Float-Parsing Algorithms to use the Eisel-Lemire algorithm.
# Summary
Rust, although it implements a correct float parser, has major performance issues in float parsing. Even for common floats, the performance can be 3-10x [slower](https://arxiv.org/pdf/2101.11408.pdf) than external libraries such as [lexical](https://github.com/Alexhuszagh/rust-lexical) and [fast-float-rust](https://github.com/aldanor/fast-float-rust).
Recently, major advances in float-parsing algorithms have been developed by Daniel Lemire, along with others, and implement a fast, performant, and correct float parser, with speeds up to 1200 MiB/s on Apple's M1 architecture for the [canada](0e2b5d163d/data/canada.txt) dataset, 10x faster than Rust's 130 MiB/s.
In addition, [edge-cases](https://github.com/rust-lang/rust/issues/85234) in Rust's [dec2flt](868c702d0c/library/core/src/num/dec2flt) algorithm can lead to over a 1600x slowdown relative to efficient algorithms. This is due to the use of Clinger's correct, but slow [AlgorithmM and Bellepheron](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4152&rep=rep1&type=pdf), which have been improved by faster big-integer algorithms and the Eisel-Lemire algorithm, respectively.
Finally, this algorithm provides substantial improvements in the number of floats the Rust core library can parse. Denormal floats with a large number of digits cannot be parsed, due to use of the `Big32x40`, which simply does not have enough digits to round a float correctly. Using a custom decimal class, with much simpler logic, we can parse all valid decimal strings of any digit count.
```rust
// Issue in Rust's dec2fly.
"2.47032822920623272088284396434110686182e-324".parse::<f64>(); // Err(ParseFloatError { kind: Invalid })
```
# Solution
This pull request implements the Eisel-Lemire algorithm, modified from [fast-float-rust](https://github.com/aldanor/fast-float-rust) (which is licensed under Apache 2.0/MIT), along with numerous modifications to make it more amenable to inclusion in the Rust core library. The following describes both features in fast-float-rust and improvements in fast-float-rust for inclusion in core.
**Documentation**
Extensive documentation has been added to ensure the code base may be maintained by others, which explains the algorithms as well as various associated constants and routines. For example, two seemingly magical constants include documentation to describe how they were derived as follows:
```rust
// Round-to-even only happens for negative values of q
// when q ≥ −4 in the 64-bit case and when q ≥ −17 in
// the 32-bitcase.
//
// When q ≥ 0,we have that 5^q ≤ 2m+1. In the 64-bit case,we
// have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case,we have
// 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
//
// When q < 0, we have w ≥ (2m+1)×5^−q. We must have that w < 2^64
// so (2m+1)×5^−q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
// or 2m+1 > 2^24 (32-bit case). Hence,we must have 2^53×5^−q < 2^64
// (64-bit) and 2^24×5^−q < 2^64 (32-bit). Hence we have 5^−q < 2^11
// or q ≥ −4 (64-bit case) and 5^−q < 2^40 or q ≥ −17 (32-bitcase).
//
// Thus we have that we only need to round ties to even when
// we have that q ∈ [−4,23](in the 64-bit case) or q∈[−17,10]
// (in the 32-bit case). In both cases,the power of five(5^|q|)
// fits in a 64-bit word.
const MIN_EXPONENT_ROUND_TO_EVEN: i32;
const MAX_EXPONENT_ROUND_TO_EVEN: i32;
```
This ensures maintainability of the code base.
**Improvements for Disguised Fast-Path Cases**
The fast path in float parsing algorithms attempts to use native, machine floats to represent both the significant digits and the exponent, which is only possible if both can be exactly represented without rounding. In practice, this means that the significant digits must be 53-bits or less and the then exponent must be in the range `[-22, 22]` (for an f64). This is similar to the existing dec2flt implementation.
However, disguised fast-path cases exist, where there are few significant digits and an exponent above the valid range, such as `1.23e25`. In this case, powers-of-10 may be shifted from the exponent to the significant digits, discussed at length in https://github.com/rust-lang/rust/issues/85198.
**Digit Parsing Improvements**
Typically, integers are parsed from string 1-at-a-time, requiring unnecessary multiplications which can slow down parsing. An approach to parse 8 digits at a time using only 3 multiplications is described in length [here](https://johnnylee-sde.github.io/Fast-numeric-string-to-int/). This leads to significant performance improvements, and is implemented for both big and little-endian systems.
**Unsafe Changes**
Relative to fast-float-rust, this library makes less use of unsafe functionality and clearly documents it. This includes the refactoring and documentation of numerous unsafe methods undesirably marked as safe. The original code would look something like this, which is deceptively marked as safe for unsafe functionality.
```rust
impl AsciiStr {
#[inline]
pub fn step_by(&mut self, n: usize) -> &mut Self {
unsafe { self.ptr = self.ptr.add(n) };
self
}
}
...
#[inline]
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
// the first character is 'e'/'E' and scientific mode is enabled
let start = *s;
s.step();
...
}
```
The new code clearly documents safety concerns, and does not mark unsafe functionality as safe, leading to better safety guarantees.
```rust
impl AsciiStr {
/// Advance the view by n, advancing it in-place to (n..).
pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
// SAFETY: same as step_by, safe as long n is less than the buffer length
self.ptr = unsafe { self.ptr.add(n) };
self
}
}
...
/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
let start = *s;
// SAFETY: the first character is 'e'/'E' and scientific mode is enabled
unsafe {
s.step();
}
...
}
```
This allows us to trivially demonstrate the new implementation of dec2flt is safe.
**Inline Annotations Have Been Removed**
In the previous implementation of dec2flt, inline annotations exist practically nowhere in the entire module. Therefore, these annotations have been removed, which mostly does not impact [performance](https://github.com/aldanor/fast-float-rust/issues/15#issuecomment-864485157).
**Fixed Correctness Tests**
Numerous compile errors in `src/etc/test-float-parse` were present, due to deprecation of `time.clock()`, as well as the crate dependencies with `rand`. The tests have therefore been reworked as a [crate](https://github.com/Alexhuszagh/rust/tree/master/src/etc/test-float-parse), and any errors in `runtests.py` have been patched.
**Undefined Behavior**
An implementation of `check_len` which relied on undefined behavior (in fast-float-rust) has been refactored, to ensure that the behavior is well-defined. The original code is as follows:
```rust
#[inline]
pub fn check_len(&self, n: usize) -> bool {
unsafe { self.ptr.add(n) <= self.end }
}
```
And the new implementation is as follows:
```rust
/// Check if the slice at least `n` length.
fn check_len(&self, n: usize) -> bool {
n <= self.as_ref().len()
}
```
Note that this has since been fixed in [fast-float-rust](https://github.com/aldanor/fast-float-rust/pull/29).
**Inferring Binary Exponents**
Rather than explicitly store binary exponents, this new implementation infers them from the decimal exponent, reducing the amount of static storage required. This removes the requirement to store [611 i16s](868c702d0c/library/core/src/num/dec2flt/table.rs (L8)).
# Code Size
The code size, for all optimizations, does not considerably change relative to before for stripped builds, however it is **significantly** smaller prior to stripping the resulting binaries. These binary sizes were calculated on x86_64-unknown-linux-gnu.
**new**
Using rustc version 1.55.0-dev.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|400k|300K
1|396k|292K
2|392k|292K
3|392k|296K
s|396k|292K
z|396k|292K
**old**
Using rustc version 1.53.0-nightly.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|3.2M|304K
1|3.2M|292K
2|3.1M|284K
3|3.1M|284K
s|3.1M|284K
z|3.1M|284K
# Correctness
The dec2flt implementation passes all of Rust's unittests and comprehensive float parsing tests, along with numerous other tests such as Nigel Toa's comprehensive float [tests](https://github.com/nigeltao/parse-number-fxx-test-data) and Hrvoje Abraham [strtod_tests](https://github.com/ahrvoje/numerics/blob/master/strtod/strtod_tests.toml). Therefore, it is unlikely that this algorithm will incorrectly round parsed floats.
# Issues Addressed
This will fix and close the following issues:
- resolves#85198
- resolves#85214
- resolves#85234
- fixes#31407
- fixes#31109
- fixes#53015
- resolves#68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Implementation is based off fast-float-rust, with a few notable changes.
- Some unsafe methods have been removed.
- Safe methods with inherently unsafe functionality have been removed.
- All unsafe functionality is documented and provably safe.
- Extensive documentation has been added for simpler maintenance.
- Inline annotations on internal routines has been removed.
- Fixed Python errors in src/etc/test-float-parse/runtests.py.
- Updated test-float-parse to be a library, to avoid missing rand dependency.
- Added regression tests for #31109 and #31407 in core tests.
- Added regression tests for #31109 and #31407 in ui tests.
- Use the existing slice primitive to simplify shared dec2flt methods
- Remove Miri ignores from dec2flt, due to faster parsing times.
- resolves#85198
- resolves#85214
- resolves#85234
- fixes#31407
- fixes#31109
- fixes#53015
- resolves#68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Due to #20400 the corresponding TrustedLen impls need a helper trait
instead of directly adding `Item = &[T;N]` bounds.
Since TrustedLen is a public trait this in turn means
the helper trait needs to be public. Since it's just a workaround
for a compiler deficit it's marked hidden, unstable and unsafe.
This only works if arrays are passed directly instead of array iterators
because we need to be sure that they have not been advanced before
Flatten does its size calculation.
Add Integer::log variants
_This is another attempt at landing https://github.com/rust-lang/rust/pull/70835, which was approved by the libs team but failed on Android tests through Bors. The text copied here is from the original issue. The only change made so far is the addition of non-`checked_` variants of the log methods._
_Tracking issue: #70887_
---
This implements `{log,log2,log10}` methods for all integer types. The implementation was provided by `@substack` for use in the stdlib.
_Note: I'm not big on math, so this PR is a best effort written with limited knowledge. It's likely I'll be getting things wrong, but happy to learn and correct. Please bare with me._
## Motivation
Calculating the logarithm of a number is a generally useful operation. Currently the stdlib only provides implementations for floats, which means that if we want to calculate the logarithm for an integer we have to cast it to a float and then back to an int.
> would be nice if there was an integer log2 instead of having to either use the f32 version or leading_zeros() which i have to verify the results of every time to be sure
_— [`@substack,` 2020-03-08](https://twitter.com/substack/status/1236445105197727744)_
At higher numbers converting from an integer to a float we also risk overflows. This means that Rust currently only provides log operations for a limited set of integers.
The process of doing log operations by converting between floats and integers is also prone to rounding errors. In the following example we're trying to calculate `base10` for an integer. We might try and calculate the `base2` for the values, and attempt [a base swap](https://www.rapidtables.com/math/algebra/Logarithm.html#log-rules) to arrive at `base10`. However because we're performing intermediate rounding we arrive at the wrong result:
```rust
// log10(900) = ~2.95 = 2
dbg!(900f32.log10() as u64);
// log base change rule: logb(x) = logc(x) / logc(b)
// log2(900) / log2(10) = 9/3 = 3
dbg!((900f32.log2() as u64) / (10f32.log2() as u64));
```
_[playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=6bd6c68b3539e400f9ca4fdc6fc2eed0)_
This is somewhat nuanced as a lot of the time it'll work well, but in real world code this could lead to some hard to track bugs. By providing correct log implementations directly on integers we can help prevent errors around this.
## Implementation notes
I checked whether LLVM intrinsics existed before implementing this, and none exist yet. ~~Also I couldn't really find a better way to write the `ilog` function. One option would be to make it a private method on the number, but I didn't see any precedent for that. I also didn't know where to best place the tests, so I added them to the bottom of the file. Even though they might seem like quite a lot they take no time to execute.~~
## References
- [Log rules](https://www.rapidtables.com/math/algebra/Logarithm.html#log-rules)
- [Rounding error playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=6bd6c68b3539e400f9ca4fdc6fc2eed0)
- [substack's tweet asking about integer log2 in the stdlib](https://twitter.com/substack/status/1236445105197727744)
- [Integer Logarithm, A. Jaffer 2008](https://people.csail.mit.edu/jaffer/III/ilog.pdf)
core: add unstable no_fp_fmt_parse to disable float formatting code
In some projects (e.g. kernel), floating point is forbidden. They can disable
hardware floating point support and use `+soft-float` to avoid fp instructions
from being generated, but as libcore contains the formatting code for `f32`
and `f64`, some fp intrinsics are depended. One could define stubs for these
intrinsics that just panic [1], but it means that if any formatting functions
are accidentally used, mistake can only be caught during the runtime rather
than during compile-time or link-time, and they consume a lot of space without
LTO.
This patch provides an unstable cfg `no_fp_fmt_parse` to disable these.
A panicking stub is still provided for the `Debug` implementation (unfortunately)
because there are some SIMD types that use `#[derive(Debug)]`.
[1]: https://lkml.org/lkml/2021/4/14/1028