Add a special case for align_offset /w stride != 1
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 produce ideal assembly, along
the lines of:
leaq 15(%rdi), %rax
andq $-16, %rax
This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:
pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
slice.offset(slice.align_offset(16) as isize)
}
// =>
movl %edi, %eax
andl $3, %eax
leaq 15(%rdi), %rcx
andq $-16, %rcx
subq %rdi, %rcx
shrq $2, %rcx
negq %rax
sbbq %rax, %rax
orq %rcx, %rax
leaq (%rdi,%rax,4), %rax
Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.
This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments. For example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.
Fixes#98809.
Sadly, this does not help with #72356.
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 produce ideal assembly, along
the lines of:
leaq 15(%rdi), %rax
andq $-16, %rax
This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:
pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
slice.offset(slice.align_offset(16) as isize)
}
// =>
movl %edi, %eax
andl $3, %eax
leaq 15(%rdi), %rcx
andq $-16, %rcx
subq %rdi, %rcx
shrq $2, %rcx
negq %rax
sbbq %rax, %rax
orq %rcx, %rax
leaq (%rdi,%rax,4), %rax
Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.
This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments. For example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.
Fixes#98809.
Sadly, this does not help with #72356.
Stabilize `core::ffi:c_*` and rexport in `std::ffi`
This only stabilizes the base types, not the non-zero variants, since
those have their own separate tracking issue and have not gone through
FCP to stabilize.
This only stabilizes the base types, not the non-zero variants, since
those have their own separate tracking issue and have not gone through
FCP to stabilize.
Allow arithmetic and certain bitwise ops on AtomicPtr
This is mainly to support migrating from `AtomicUsize`, for the strict provenance experiment.
This is a pretty dubious set of APIs, but it should be sufficient to allow code that's using `AtomicUsize` to manipulate a tagged pointer atomically. It's under a new feature gate, `#![feature(strict_provenance_atomic_ptr)]`, but I'm not sure if it needs its own tracking issue. I'm happy to make one, but it's not clear that it's needed.
I'm unsure if it needs changes in the various non-LLVM backends. Because we just cast things to integers anyway (and were already doing so), I doubt it.
API change proposal: https://github.com/rust-lang/libs-team/issues/60Fixes#95492
ptr::copy and ptr::swap are doing untyped copies
The consensus in https://github.com/rust-lang/rust/issues/63159 seemed to be that these operations should be "untyped", i.e., they should treat the data as raw bytes, should work when these bytes violate the validity invariant of `T`, and should exactly preserve the initialization state of the bytes that are being copied. This is already somewhat implied by the description of "copying/swapping size*N bytes" (rather than "N instances of `T`").
The implementations mostly already work that way (well, for LLVM's intrinsics the documentation is not precise enough to say what exactly happens to poison, but if this ever gets clarified to something that would *not* perfectly preserve poison, then I strongly assume there will be some way to make a copy that *does* perfectly preserve poison). However, I had to adjust `swap_nonoverlapping`; after ``@scottmcm's`` [recent changes](https://github.com/rust-lang/rust/pull/94212), that one (sometimes) made a typed copy. (Note that `mem::swap`, which works on mutable references, is unchanged. It is documented as "swapping the values at two mutable locations", which to me strongly indicates that it is indeed typed. It is also safe and can rely on `&mut T` pointing to a valid `T` as part of its safety invariant.)
On top of adding a test (that will be run by Miri), this PR then also adjusts the documentation to indeed stably promise the untyped semantics. I assume this means the PR has to go through t-libs (and maybe t-lang?) FCP.
Fixes https://github.com/rust-lang/rust/issues/63159
Add the Provider api to core::any
This is an implementation of [RFC 3192](https://github.com/rust-lang/rfcs/pull/3192) ~~(which is yet to be merged, thus why this is a draft PR)~~. It adds an API for type-driven requests and provision of data from trait objects. A primary use case is for the `Error` trait, though that is not implemented in this PR. The only major difference to the RFC is that the functionality is added to the `any` module, rather than being in a sibling `provide_any` module (as discussed in the RFC thread).
~~Still todo: improve documentation on items, including adding examples.~~
cc `@yaahc`
Extend ptr::null and null_mut to all thin (including extern) types
Fixes https://github.com/rust-lang/rust/issues/93959
This change was accepted in https://rust-lang.github.io/rfcs/2580-ptr-meta.html
Note that this changes the signature of **stable** functions. The change should be backward-compatible, but it is **insta-stable** since it cannot (easily, at all?) be made available only through a `#![feature(…)]` opt-in.
The RFC also proposed the same change for `NonNull::dangling`, which makes sense it terms of its signature but not in terms of its implementation. `dangling` uses `align_of()` as an address. But what `align_of()` should be for extern types or whether it should be allowed at all remains an open question.
This commit depends on https://github.com/rust-lang/rust/pull/93977, which is not yet part of the bootstrap compiler. So `#[cfg]` is used to only apply the change in stage 1+. As far a I know bounds cannot be made conditional with `#[cfg]`, so the entire functions are duplicated. This is unfortunate but temporary.
Since this duplication makes it less obvious in the diff, the new definitions differ in:
* More permissive bounds (`Thin` instead of implied `Sized`)
* Different implementation
* Having `rustc_allow_const_fn_unstable(const_fn_trait_bound)`
* Having `rustc_allow_const_fn_unstable(ptr_metadata)`
These guards changed to pointers in #97027, but their `Display` was
formatting that field directly, which made it show the raw pointer
value. Now we go through `Deref` to display the real value again.
Fixes https://github.com/rust-lang/rust/issues/93959
This change was accepted in https://rust-lang.github.io/rfcs/2580-ptr-meta.html
Note that this changes the signature of **stable** functions.
The change should be backward-compatible, but it is **insta-stable**
since it cannot (easily, at all?) be made available only
through a `#![feature(…)]` opt-in.
The RFC also proposed the same change for `NonNull::dangling`,
which makes sense it terms of its signature but not in terms of its implementation.
`dangling` uses `align_of()` as an address. But what `align_of()` should be for
extern types or whether it should be allowed at all remains an open question.
This commit depends on https://github.com/rust-lang/rust/pull/93977, which is not yet
part of the bootstrap compiler. So `#[cfg]` is used to only apply the change in
stage 1+. As far a I know bounds cannot be made conditional with `#[cfg]`, so the
entire functions are duplicated. This is unfortunate but temporary.
Since this duplication makes it less obvious in the diff,
the new definitions differ in:
* More permissive bounds (`Thin` instead of implied `Sized`)
* Different implementation
* Having `rustc_allow_const_fn_unstable(ptr_metadata)`
Add a dedicated length-prefixing method to `Hasher`
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Fixes#94026
r? rust-lang/libs
---
The core of this change is the following two new methods on `Hasher`:
```rust
pub trait Hasher {
/// Writes a length prefix into this hasher, as part of being prefix-free.
///
/// If you're implementing [`Hash`] for a custom collection, call this before
/// writing its contents to this `Hasher`. That way
/// `(collection![1, 2, 3], collection![4, 5])` and
/// `(collection![1, 2], collection![3, 4, 5])` will provide different
/// sequences of values to the `Hasher`
///
/// The `impl<T> Hash for [T]` includes a call to this method, so if you're
/// hashing a slice (or array or vector) via its `Hash::hash` method,
/// you should **not** call this yourself.
///
/// This method is only for providing domain separation. If you want to
/// hash a `usize` that represents part of the *data*, then it's important
/// that you pass it to [`Hasher::write_usize`] instead of to this method.
///
/// # Examples
///
/// ```
/// #![feature(hasher_prefixfree_extras)]
/// # // Stubs to make the `impl` below pass the compiler
/// # struct MyCollection<T>(Option<T>);
/// # impl<T> MyCollection<T> {
/// # fn len(&self) -> usize { todo!() }
/// # }
/// # impl<'a, T> IntoIterator for &'a MyCollection<T> {
/// # type Item = T;
/// # type IntoIter = std::iter::Empty<T>;
/// # fn into_iter(self) -> Self::IntoIter { todo!() }
/// # }
///
/// use std:#️⃣:{Hash, Hasher};
/// impl<T: Hash> Hash for MyCollection<T> {
/// fn hash<H: Hasher>(&self, state: &mut H) {
/// state.write_length_prefix(self.len());
/// for elt in self {
/// elt.hash(state);
/// }
/// }
/// }
/// ```
///
/// # Note to Implementers
///
/// If you've decided that your `Hasher` is willing to be susceptible to
/// Hash-DoS attacks, then you might consider skipping hashing some or all
/// of the `len` provided in the name of increased performance.
#[inline]
#[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
fn write_length_prefix(&mut self, len: usize) {
self.write_usize(len);
}
/// Writes a single `str` into this hasher.
///
/// If you're implementing [`Hash`], you generally do not need to call this,
/// as the `impl Hash for str` does, so you can just use that.
///
/// This includes the domain separator for prefix-freedom, so you should
/// **not** call `Self::write_length_prefix` before calling this.
///
/// # Note to Implementers
///
/// The default implementation of this method includes a call to
/// [`Self::write_length_prefix`], so if your implementation of `Hasher`
/// doesn't care about prefix-freedom and you've thus overridden
/// that method to do nothing, there's no need to override this one.
///
/// This method is available to be overridden separately from the others
/// as `str` being UTF-8 means that it never contains `0xFF` bytes, which
/// can be used to provide prefix-freedom cheaper than hashing a length.
///
/// For example, if your `Hasher` works byte-by-byte (perhaps by accumulating
/// them into a buffer), then you can hash the bytes of the `str` followed
/// by a single `0xFF` byte.
///
/// If your `Hasher` works in chunks, you can also do this by being careful
/// about how you pad partial chunks. If the chunks are padded with `0x00`
/// bytes then just hashing an extra `0xFF` byte doesn't necessarily
/// provide prefix-freedom, as `"ab"` and `"ab\u{0}"` would likely hash
/// the same sequence of chunks. But if you pad with `0xFF` bytes instead,
/// ensuring at least one padding byte, then it can often provide
/// prefix-freedom cheaper than hashing the length would.
#[inline]
#[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
fn write_str(&mut self, s: &str) {
self.write_length_prefix(s.len());
self.write(s.as_bytes());
}
}
```
With updates to the `Hash` implementations for slices and containers to call `write_length_prefix` instead of `write_usize`.
`write_str` defaults to using `write_length_prefix` since, as was pointed out in the issue, the `write_u8(0xFF)` approach is insufficient for hashers that work in chunks, as those would hash `"a\u{0}"` and `"a"` to the same thing. But since `SipHash` works byte-wise (there's an internal buffer to accumulate bytes until a full chunk is available) it overrides `write_str` to continue to use the add-non-UTF-8-byte approach.
---
Compatibility:
Because the default implementation of `write_length_prefix` calls `write_usize`, the changed hash implementation for slices will do the same thing the old one did on existing `Hasher`s.
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Update `int_roundings` methods from feedback
This updates `#![feature(int_roundings)]` (#88581) from feedback. All methods now take `NonZeroX`. The documentation makes clear that they panic in debug mode and wrap in release mode.
r? `@joshtriplett`
`@rustbot` label +T-libs +T-libs-api +S-waiting-on-review
Add slice::remainder
This adds a remainder function to the Slice iterator, so that a caller can access unused
elements if iteration stops.
Addresses #91733
Faster parsing for lower numbers for radix up to 16 (cont.)
( Continuation of https://github.com/rust-lang/rust/pull/83371 )
With LingMan's change I think this is potentially ready.
Make non-power-of-two alignments a validity error in `Layout`
Inspired by the zulip conversation about how `Layout` should better enforce `size <= isize::MAX as usize`, this uses an N-variant enum on N-bit platforms to require at the validity level that the existing invariant of "must be a power of two" is upheld.
This was MIRI can catch it, and means there's a more-specific type for `Layout` to store than just `NonZeroUsize`.
It's left as `pub(crate)` here; a future PR could consider giving it a tracking issue for non-internal usage.
Inspired by the zulip conversation about how `Layout` should better enforce `size < isize::MAX as usize`, this uses an N-variant enum on N-bit platforms to require at the validity level that the existing invariant of "must be a power of two" is upheld.
This was MIRI can catch it, and means there's a more-specific type for `Layout` to store than just `NonZeroUsize`.
Implement provenance preserving methods on NonNull
### Description
Add the `addr`, `with_addr`, `map_addr` methods to the `NonNull` type, and map the address type to `NonZeroUsize`.
### Motivation
The `NonNull` type is useful for implementing pointer types which have the 0-niche. It is currently possible to implement these provenance preserving functions by calling `NonNull::as_ptr` and `new_unchecked`. The adding these methods makes it more ergonomic.
### Testing
Added a unit test of a non-null tagged pointer type. This is based on some real code I have elsewhere, that currently routes the pointer through a `NonZeroUsize` and back out to produce a usable pointer. I wanted to produce an ideal version of the same tagged pointer struct that preserved pointer provenance.
### Related
Extension of APIs proposed in #95228 . I can also split this out into a separate tracking issue if that is better (though I may need some pointers on how to do that).
Handle rustc_const_stable attribute in library feature collector
The library feature collector in [compiler/rustc_passes/src/lib_features.rs](551b4fa395/compiler/rustc_passes/src/lib_features.rs) has only been looking at `#[stable(…)]`, `#[unstable(…)]`, and `#[rustc_const_unstable(…)]` attributes, while ignoring `#[rustc_const_stable(…)]`. The consequences of this were:
- When any const feature got stabilized (changing one or more `rustc_const_unstable` to `rustc_const_stable`), users who had previously enabled that unstable feature using `#![feature(…)]` would get told "unknown feature", rather than rustc's nicer "the feature … has been stable since … and no longer requires an attribute to enable".
This can be seen in the way that https://github.com/rust-lang/rust/pull/93957#issuecomment-1079794660 failed after rebase:
```console
error[E0635]: unknown feature `const_ptr_offset`
--> $DIR/offset_from_ub.rs:1:35
|
LL | #![feature(const_ptr_offset_from, const_ptr_offset)]
| ^^^^^^^^^^^^^^^^
```
- We weren't enforcing that a particular feature is either stable everywhere or unstable everywhere, and that a feature that has been stabilized has the same stabilization version everywhere, both of which we enforce for the other stability attributes.
This PR updates the library feature collector to handle `rustc_const_stable`, and fixes places in the standard library and test suite where `rustc_const_stable` was being used in a way that does not meet the rules for a stability attribute.
skip slow int_log tests in Miri
Iterating over i16::MAX many things takes a long time in Miri, let's not do that.
I added https://github.com/rust-lang/miri/pull/2044 on the Miri side to still give us some test coverage.
**Description**
Add the `addr`, `with_addr, `map_addr` methods to the `NonNull` type,
and map the address type to `NonZeroUsize`.
**Motiviation**
The `NonNull` type is useful for implementing pointer types which have
the 0-niche. It is currently possible to implement these provenance
preserving functions by calling `NonNull::as_ptr` and `new_unchecked`.
The addition of these methods simply make it more ergonomic to use.
**Testing**
Added a unit test of a nonnull tagged pointer type. This is based on
some real code I have elsewhere, that currently routes the pointer
through a `NonZeroUsize` and back out to produce a usable pointer.
Derive Eq for std::cmp::Ordering, instead of using manual impl.
This allows consts of type Ordering to be used in patterns, and with feature(adt_const_params) allows using `Ordering` as a const generic parameter.
Currently, `std::cmp::Ordering` implements `Eq` using a manually written `impl Eq for Ordering {}`, instead of `derive(Eq)`. This means that it does not implement `StructuralEq`.
This commit removes the manually written impl, and adds `derive(Eq)` to `Ordering`, so that it will implement `StructuralEq`.
Let `try_collect` take advantage of `try_fold` overrides
No public API changes.
With this change, `try_collect` (#94047) is no longer going through the `impl Iterator for &mut impl Iterator`, and thus will be able to use `try_fold` overrides instead of being forced through `next` for every element.
Here's the test added, to see that it fails before this PR (once a new enough nightly is out): https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=462f2896f2fed2c238ee63ca1a7e7c56
This might as well go to the same person as my last `try_process` PR (#93572), so
r? ``@yaahc``
Enable conditional checking of values in the Rust codebase
This pull-request enable conditional checking of (well known) values in the Rust codebase.
Well known values were added in https://github.com/rust-lang/rust/pull/94362. All the `target_*` values are taken from all the built-in targets which is why some extra values were needed do be added as they are not (yet ?) defined in any built-in targets.
r? `@Mark-Simulacrum`
This updates the standard library's documentation to use the new syntax. The
documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).
Add Iterator::collect_into
This PR adds `Iterator::collect_into` as proposed by ``@cormacrelf`` in #48597 (see https://github.com/rust-lang/rust/pull/48597#issuecomment-842083688).
Followup of #92982.
This adds the following method to the Iterator trait:
```rust
fn collect_into<E: Extend<Self::Item>>(self, collection: &mut E) -> &mut E
```
Implement `RawWaker` and `Waker` getters for underlying pointers
implement #87021
New APIs:
- `RawWaker::data(&self) -> *const ()`
- `RawWaker::vtable(&self) -> &'static RawWakerVTable`
- ~`Waker::as_raw_waker(&self) -> &RawWaker`~ `Waker::as_raw(&self) -> &RawWaker`
This third one is an auxiliary function to make the two APIs above more useful. Since we can only get `&Waker` in `Future::poll`, without this, we need to `transmute` it into `&RawWaker` (relying on `repr(transparent)`) in order to access its data/vtable pointers.
~Not sure if it should be named `as_raw` or `as_raw_waker`. Seems we always use `as_<something-raw>` instead of just `as_raw`. But `as_raw_waker` seems not quite consistent with `Waker::from_raw`.~ As suggested in https://github.com/rust-lang/rust/pull/91828#discussion_r770729837, use `as_raw`.
Carefully remove bounds checks from some chunk iterator functions
So, I was writing code that requires the equivalent of `rchunks(N).rev()` (which isn't the same as forward `chunks(N)` — in particular, if the buffer length is not a multiple of `N`, I must handle the "remainder" first).
I happened to look at the codegen output of the function (I was actually interested in whether or not a nested loop was being unrolled — it was), and noticed that in the outer `rchunks(n).rev()` loop, LLVM seemed to be unable to remove the bounds checks from the iteration: https://rust.godbolt.org/z/Tnz4MYY8f (this panic was from the split_at in `RChunks::next_back`).
After doing some experimentation, it seems all of the `next_back` in the non-exact chunk iterators have the issue: (`Chunks::next_back`, `RChunks::next_back`, `ChunksMut::next_back`, and `RChunksMut::next_back`)...
Even worse, the forward `rchunks` iterators sometimes have the issue as well (... but only sometimes). For example https://rust.godbolt.org/z/oGhbqv53r has bounds checks, but if I uncomment the loop body, it manages to remove the check (which is bizarre, since I'd expect the opposite...). I suspect it's highly dependent on the surrounding code, so I decided to remove the bounds checks from them anyway. Overall, this change includes:
- All `next_back` functions on the non-`Exact` iterators (e.g. `R?Chunks(Mut)?`).
- All `next` functions on the non-exact rchunks iterators (e.g. `RChunks(Mut)?`).
I wasn't able to catch any of the other chunk iterators failing to remove the bounds checks (I checked iterations over `r?chunks(_exact)?(_mut)?` with constant chunk sizes under `-O3`, `-Os`, and `-Oz`), which makes sense, since these were the cases where it was harder to prove the bounds check correct to remove...
In fact, it took quite a bit of thinking to convince myself that using unchecked_ here was valid — so I'm not really surprised that LLVM had trouble (although compilers are slightly better at this sort of reasoning than humans). A consequence of that is the fact that the `// SAFETY` comment for these are... kinda long...
---
I didn't do this for, or even think about it for, any of the other iteration methods; just `next` and `next_back` (where it mattered). If this PR is accepted, I'll file a follow up for someone (possibly me) to look at the others later (in particular, `nth`/`nth_back` looked like they had similar logic), but I wanted to do this now, as IMO `next`/`next_back` are the most important here, since they're what gets used by the iteration protocol.
---
Note: While I don't expect this to impact performance directly, the panic is a side effect, which would otherwise not exist in these loops. That is, this could prevent the compiler from being able to move/remove/otherwise rework a loop over these iterators (as an example, it could not delete the code for a loop whose body computes a value which doesn't get used).
Also, some like to be able to have confidence this code has no panicking branches in the optimized code, and "no bounds checks" is kinda part of the selling point of Rust's iterators anyway.
Remove deprecated and unstable slice_partition_at_index functions
They have been deprecated since commit 01ac5a97c9
which was part of the 1.49.0 release, so from the point of nightly,
11 releases ago.
Make `char::DecodeUtf16::size_hist` more precise
New implementation takes into account contents of `self.buf` and rounds lower bound up instead of down.
Fixes#88762
Revival of #88763
Add `intrinsics::const_deallocate`
Tracking issue: #79597
Related: #91884
This allows deallocation of a memory allocated by `intrinsics::const_allocate`. At the moment, this can be only used to reduce memory usage, but in the future this may be useful to detect memory leaks (If an allocated memory remains after evaluation, raise an error...?).
impl Not for !
The lack of this impl caused trouble for me in some degenerate cases of macro-generated code of the form `if !$cond {...}`, even without `feature(never_type)` on a stable compiler. Namely if `$cond` contains a `return` or `break` or similar diverging expression, which would otherwise be perfectly legal in boolean position, the code previously failed to compile with:
```console
error[E0600]: cannot apply unary operator `!` to type `!`
--> library/core/tests/ops.rs:239:8
|
239 | if !return () {}
| ^^^^^^^^^^ cannot apply unary operator `!`
```
Simplification of BigNum::bit_length
As indicated in the comment, the BigNum::bit_length function could be
optimized by using CLZ, which is often a single instruction instead a
loop.
I think the code is also simpler now without the loop.
I added some additional tests for Big8x3 and Big32x40 to ensure that
there were no regressions.
Partially stabilize `maybe_uninit_extra`
This covers:
```rust
impl<T> MaybeUninit<T> {
pub unsafe fn assume_init_read(&self) -> T { ... }
pub unsafe fn assume_init_drop(&mut self) { ... }
}
```
It does not cover the const-ness of `write` under `const_maybe_uninit_write` nor the const-ness of `assume_init_read` (this commit adds `const_maybe_uninit_assume_init_read` for that).
FCP: https://github.com/rust-lang/rust/issues/63567#issuecomment-958590287.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
This covers:
impl<T> MaybeUninit<T> {
pub unsafe fn assume_init_read(&self) -> T { ... }
pub unsafe fn assume_init_drop(&mut self) { ... }
}
It does not cover the const-ness of `write` under
`const_maybe_uninit_write` nor the const-ness of
`assume_init_read` (this commit adds
`const_maybe_uninit_assume_init_read` for that).
FCP: https://github.com/rust-lang/rust/issues/63567#issuecomment-958590287.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
As indicated in the comment, the BigNum::bit_length function could be
optimized by using CLZ, which is often a single instruction instead a
loop.
I think the code is also simpler now without the loop.
I added some additional tests for Big8x3 and Big32x40 to ensure that
there were no regressions.
disable test with self-referential generator on Miri
Running the libcore test suite in Miri currently fails due to the known incompatibility of self-referential generators with Miri's aliasing checks (https://github.com/rust-lang/unsafe-code-guidelines/issues/148). So let's disable that test in Miri for now.
Allow reverse iteration of lowercase'd/uppercase'd chars
The PR implements `DoubleEndedIterator` trait for `ToLowercase` and `ToUppercase`.
This enables reverse iteration of lowercase/uppercase variants of character sequences.
One of use cases: determining whether a char sequence is a suffix of another one.
Example:
```rust
fn endswith_ignore_case(s1: &str, s2: &str) -> bool {
for eob in s1
.chars()
.flat_map(|c| c.to_lowercase())
.rev()
.zip_longest(s2.chars().flat_map(|c| c.to_lowercase()).rev())
{
match eob {
EitherOrBoth::Both(c1, c2) => {
if c1 != c2 {
return false;
}
}
EitherOrBoth::Left(_) => return true,
EitherOrBoth::Right(_) => return false,
}
}
true
}
```
Revert "Temporarily rename int_roundings functions to avoid conflicts"
This reverts commit 3ece63b64e.
This should be okay because #90329 has been merged.
r? `@joshtriplett`
Mark defaulted `PartialEq`/`PartialOrd` methods as const
WIthout it, `const` impls of these traits are unpleasant to write. I think this kind of change is allowed now. although it looks like it might require some Miri tweaks. Let's find out.
r? ```@fee1-dead```
Do array-slice equality via array equality, rather than always via slices
~~Draft because it needs a rebase after #91766 eventually gets through bors.~~
This enables the optimizations from #85828 to be used for array-to-slice comparisons too, not just array-to-array.
For example, <https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=5f9ba69b3d5825a782f897c830d3a6aa>
```rust
pub fn demo(x: &[u8], y: [u8; 4]) -> bool {
*x == y
}
```
Currently writes the array to stack for no reason:
```nasm
sub rsp, 4
mov dword ptr [rsp], edx
cmp rsi, 4
jne .LBB0_1
mov eax, dword ptr [rdi]
cmp eax, dword ptr [rsp]
sete al
add rsp, 4
ret
.LBB0_1:
xor eax, eax
add rsp, 4
ret
```
Whereas with the change in this PR it just compares it directly:
```nasm
cmp rsi, 4
jne .LBB1_1
cmp dword ptr [rdi], edx
sete al
ret
.LBB1_1:
xor eax, eax
ret
```
Constify `bool::then{,_some}`
Note on `~const Drop`: it has no effect when called from runtime functions, when called from const contexts, the trait system ensures that the type can be dropped in const contexts.
This'll still go via slices eventually for large arrays, but this way slice comparisons to short arrays can use the same memcmp-avoidance tricks.
Added some tests for all the combinations to make sure I didn't accidentally infinitely-recurse something.
Minor improvements to `future::join!`'s implementation
This is a follow-up from #91645, regarding [some remarks I made](https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-async-foundations/topic/join!/near/264293660).
Mainly:
- it hides the recursive munching through a private `macro`, to avoid leaking such details (a corollary is getting rid of the need to use ``@`` to disambiguate);
- it uses a `match` binding, _outside_ the `async move` block, to better match the semantics from function-like syntax;
- it pre-pins the future before calling into `poll_fn`, since `poll_fn`, alone, cannot guarantee that its capture does not move (to clarify: I believe the previous code was sound, thanks to the outer layer of `async`. But I find it clearer / more robust to refactorings this way 🙂).
- it uses `@ibraheemdev's` very neat `.ready()?`;
- it renames `Took` to `Taken` for consistency with `Done` (tiny nit 😄).
~~TODO~~Done:
- [x] Add unit tests to enforce the function-like `:value` semantics are respected.
r? `@nrc`
Sync portable-simd to remove autosplats
This PR syncs portable-simd in up to a8385522ad in order to address the type inference breakages documented on nightly in https://github.com/rust-lang/rust/issues/90904 by removing the vector + scalar binary operations (called "autosplats", "broadcasting", or "rank promotion", depending on who you ask) that allow `{scalar} + &'_ {scalar}` to fail in some cases, because it becomes possible the programmer may have meant `{scalar} + &'_ {vector}`.
A few quality-of-life improvements make their way in as well:
- Lane counts can now go to 64, as LLVM seems to have fixed their miscompilation for those.
- `{i,u}8x64` to `__m512i` is now available.
- a bunch of `#[must_use]` notes appear throughout the module.
- Some implementations, mostly instances of `impl core::ops::{Op}<Simd> for Simd` that aren't `{vector} + {vector}` (e.g. `{vector} + &'_ {vector}`), leverage some generics and `where` bounds now to make them easier to understand by reducing a dozen implementations into one (and make it possible for people to open the docs on less burly devices).
- And some internal-only improvements.
None of these changes should affect a beta backport, only actual users of `core::simd` (and most aren't even visible in the programmatic sense), though I can extract an even more minimal changeset for beta if necessary. It seemed simpler to just keep moving forward.
Make `array::{try_from_fn, try_map}` and `Iterator::try_find` generic over `Try`
Fixes#85115
This only updates unstable functions.
`array::try_map` didn't actually exist before; this adds it under the still-open tracking issue #79711 from the old PR #79713.
Tracking issue for the new trait: #91285
This would also solve the return type question in for the proposed `Iterator::try_reduce` in #87054
disable tests in Miri that take too long
Comparing slices of length `usize::MAX` diverges in Miri. In fact these tests even diverge in rustc unless `-O` is passed. I tried this code to check that:
```rust
#![feature(slice_take)]
const EMPTY_MAX: &'static [()] = &[(); usize::MAX];
fn main() {
let mut slice: &[_] = &[(); usize::MAX];
println!("1");
assert_eq!(Some(&[] as _), slice.take(usize::MAX..));
println!("2");
let remaining: &[_] = EMPTY_MAX;
println!("3");
assert_eq!(remaining, slice);
println!("4");
}
```
So, disable these tests in Miri for now.
Fixes 85115
This only updates unstable functions.
`array::try_map` didn't actually exist before, despite the tracking issue 79711 still being open from the old PR 79713.
Fix Iterator::advance_by contract inconsistency
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
Methods that were only blocked on `const_panic` have been stabilized.
The remaining methods of `duration_consts_2` are all related to floats,
and as such have been placed behind the `duration_consts_float` feature
gate.
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
Stabilize `const_raw_ptr_deref` for `*const T`
This stabilizes dereferencing immutable raw pointers in const contexts.
It does not stabilize `*mut T` dereferencing. This is behind the
same feature gate as mutable references.
closes https://github.com/rust-lang/rust/issues/51911
pub use core::simd;
A portable abstraction over SIMD has been a major pursuit in recent years for several programming languages. In Rust, `std::arch` offers explicit SIMD acceleration via compiler intrinsics, but it does so at the cost of having to individually maintain each and every single such API, and is almost completely `unsafe` to use. `core::simd` offers safe abstractions that are resolved to the appropriate SIMD instructions by LLVM during compilation, including scalar instructions if that is all that is available.
`core::simd` is enabled by the `#![portable_simd]` nightly feature tracked in https://github.com/rust-lang/rust/issues/86656 and is introduced here by pulling in the https://github.com/rust-lang/portable-simd repository as a subtree. We built the repository out-of-tree to allow faster compilation and a stochastic test suite backed by the proptest crate to verify that different targets, features, and optimizations produce the same result, so that using this library does not introduce any surprises. As these tests are technically non-deterministic, and thus can introduce overly interesting Heisenbugs if included in the rustc CI, they are visible in the commit history of the subtree but do nothing here. Some tests **are** introduced via the documentation, but these use deterministic asserts.
There are multiple unsolved problems with the library at the current moment, including a want for better documentation, technical issues with LLVM scalarizing and lowering to libm, room for improvement for the APIs, and so far I have not added the necessary plumbing for allowing the more experimental or libm-dependent APIs to be used. However, I thought it would be prudent to open this for review in its current condition, as it is both usable and it is likely I am going to learn something else needs to be fixed when bors tries this out.
The major types are
- `core::simd::Simd<T, N>`
- `core::simd::Mask<T, N>`
There is also the `LaneCount` struct, which, together with the SimdElement and SupportedLaneCount traits, limit the implementation's maximum support to vectors we know will actually compile and provide supporting logic for bitmasks. I'm hoping to simplify at least some of these out of the way as the compiler and library evolve.
These tests just verify some basic APIs of core::simd function, and
guarantees that attempting to access the wrong things doesn't work.
The majority of tests are stochastic, and so remain upstream, but
a few deterministic tests arrive in the subtree as doc tests.
This stabilizes dereferencing immutable raw pointers in const contexts.
It does not stabilize `*mut T` dereferencing. This is placed behind the
`const_raw_mut_ptr_deref` feature gate.
Add #[must_use] to expensive computations
The unifying theme for this commit is weak, admittedly. I put together a list of "expensive" functions when I originally proposed this whole effort, but nobody's cared about that criterion. Still, it's a decent way to bite off a not-too-big chunk of work.
Given the grab bag nature of this commit, the messages I used vary quite a bit. I'm open to wording changes.
For some reason clippy flagged four `BTreeSet` methods but didn't say boo about equivalent ones on `HashSet`. I stared at them for a while but I can't figure out the difference so I added the `HashSet` ones in.
```rust
// Flagged by clippy.
alloc::collections::btree_set::BTreeSet<T> fn difference<'a>(&'a self, other: &'a BTreeSet<T>) -> Difference<'a, T>;
alloc::collections::btree_set::BTreeSet<T> fn symmetric_difference<'a>(&'a self, other: &'a BTreeSet<T>) -> SymmetricDifference<'a, T>
alloc::collections::btree_set::BTreeSet<T> fn intersection<'a>(&'a self, other: &'a BTreeSet<T>) -> Intersection<'a, T>;
alloc::collections::btree_set::BTreeSet<T> fn union<'a>(&'a self, other: &'a BTreeSet<T>) -> Union<'a, T>;
// Ignored by clippy, but not by me.
std::collections::HashSet<T, S> fn difference<'a>(&'a self, other: &'a HashSet<T, S>) -> Difference<'a, T, S>;
std::collections::HashSet<T, S> fn symmetric_difference<'a>(&'a self, other: &'a HashSet<T, S>) -> SymmetricDifference<'a, T, S>
std::collections::HashSet<T, S> fn intersection<'a>(&'a self, other: &'a HashSet<T, S>) -> Intersection<'a, T, S>;
std::collections::HashSet<T, S> fn union<'a>(&'a self, other: &'a HashSet<T, S>) -> Union<'a, T, S>;
```
Parent issue: #89692
r? ```@joshtriplett```
Implement split_array and split_array_mut
This implements `[T]::split_array::<const N>() -> (&[T; N], &[T])` and `[T; N]::split_array::<const M>() -> (&[T; M], &[T])` and their mutable equivalents. These are another few “missing” array implementations now that const generics are a thing, similar to #74373, #75026, etc. Fixes#74674.
This implements `[T; N]::split_array` returning an array and a slice. Ultimately, this is probably not what we want, we would want the second return value to be an array of length N-M, which will likely be possible with future const generics enhancements. We need to implement the array method now though, to immediately shadow the slice method. This way, when the slice methods get stabilized, calling them on an array will not be automatic through coercion, so we won't have trouble stabilizing the array methods later (cf. into_iter debacle).
An unchecked version of `[T]::split_array` could also be added as in #76014. This would not be needed for `[T; N]::split_array` as that can be compile-time checked. Edit: actually, since split_at_unchecked is internal-only it could be changed to be split_array-only.
Make more `From` impls `const` (libcore)
Adding `const` to `From` implementations in the core. `rustc_const_unstable` attribute is not added to unstable implementations.
Tracking issue: #88674
<details>
<summary>Done</summary><div>
- `T` from `T`
- `T` from `!`
- `Option<T>` from `T`
- `Option<&T>` from `&Option<T>`
- `Option<&mut T>` from `&mut Option<T>`
- `Cell<T>` from `T`
- `RefCell<T>` from `T`
- `UnsafeCell<T>` from `T`
- `OnceCell<T>` from `T`
- `Poll<T>` from `T`
- `u32` from `char`
- `u64` from `char`
- `u128` from `char`
- `char` from `u8`
- `AtomicBool` from `bool`
- `AtomicPtr<T>` from `*mut T`
- `AtomicI(bits)` from `i(bits)`
- `AtomicU(bits)` from `u(bits)`
- `i(bits)` from `NonZeroI(bits)`
- `u(bits)` from `NonZeroU(bits)`
- `NonNull<T>` from `Unique<T>`
- `NonNull<T>` from `&T`
- `NonNull<T>` from `&mut T`
- `Unique<T>` from `&mut T`
- `Infallible` from `!`
- `TryIntError` from `!`
- `TryIntError` from `Infallible`
- `TryFromSliceError` from `Infallible`
- `FromResidual for Option<T>`
</div></details>
<details>
<summary>Remaining</summary><dev>
- `NonZero` from `NonZero`
These can't be made const at this time because these use Into::into.
https://github.com/rust-lang/rust/blob/master/library/core/src/convert/num.rs#L393
- `std`, `alloc`
There may still be many implementations that can be made `const`.
</div></details>
Automatic exponential formatting in Debug
Context: See [this comment from the libs team](https://github.com/rust-lang/rfcs/pull/2729#issuecomment-853454204)
---
Makes `"{:?}"` switch to exponential for floats based on magnitude. The libs team suggested exploring this idea in the discussion thread for RFC rust-lang/rfcs#2729. (**note:** this is **not** an implementation of the RFC; it is an implementation of one of the alternatives)
Thresholds chosen were 1e-4 and 1e16. Justification described [here](https://github.com/rust-lang/rfcs/pull/2729#issuecomment-864482954).
**This will require a crater run.**
---
As mentioned in the commit message of 8731d4dfb4, this behavior will not apply when a precision is supplied, because I wanted to preserve the following existing and useful behavior of `{:.PREC?}` (which recursively applies `{:.PREC}` to floats in a struct):
```rust
assert_eq!(
format!("{:.2?}", [100.0, 0.000004]),
"[100.00, 0.00]",
)
```
I looked around and am not sure where there are any tests that actually use this in the test suite, though?
All things considered, I'm surprised that this change did not seem to break even a single existing test in `x.py test --stage 2`. (even when I tried a smaller threshold of 1e6)
The unifying theme for this commit is weak, admittedly. I put together a
list of "expensive" functions when I originally proposed this whole
effort, but nobody's cared about that criterion. Still, it's a decent
way to bite off a not-too-big chunk of work.
Given the grab bag nature of this commit, the messages I used vary quite
a bit.
Speedup int log10 branchless
This is achieved with a branchless bit-twiddling implementation of the case x < 100_000, and using this as building block.
Benchmark on an Intel i7-8700K (Coffee Lake):
```
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 165 169 4 2.42% x 0.98
num::int_log::u8_log10_random 438 423 -15 -3.42% x 1.04
num::int_log::u8_log10_random_small 438 423 -15 -3.42% x 1.04
num::int_log::u16_log10_predictable 633 417 -216 -34.12% x 1.52
num::int_log::u16_log10_random 908 471 -437 -48.13% x 1.93
num::int_log::u16_log10_random_small 945 471 -474 -50.16% x 2.01
num::int_log::u32_log10_predictable 1,496 1,340 -156 -10.43% x 1.12
num::int_log::u32_log10_random 1,076 873 -203 -18.87% x 1.23
num::int_log::u32_log10_random_small 1,145 874 -271 -23.67% x 1.31
num::int_log::u64_log10_predictable 4,005 3,171 -834 -20.82% x 1.26
num::int_log::u64_log10_random 1,247 1,021 -226 -18.12% x 1.22
num::int_log::u64_log10_random_small 1,265 921 -344 -27.19% x 1.37
num::int_log::u128_log10_predictable 39,667 39,579 -88 -0.22% x 1.00
num::int_log::u128_log10_random 6,456 6,696 240 3.72% x 0.96
num::int_log::u128_log10_random_small 4,108 3,903 -205 -4.99% x 1.05
```
Benchmark on an M1 Mac Mini:
```
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 143 130 -13 -9.09% x 1.10
num::int_log::u8_log10_random 375 325 -50 -13.33% x 1.15
num::int_log::u8_log10_random_small 376 325 -51 -13.56% x 1.16
num::int_log::u16_log10_predictable 500 322 -178 -35.60% x 1.55
num::int_log::u16_log10_random 794 405 -389 -48.99% x 1.96
num::int_log::u16_log10_random_small 1,035 405 -630 -60.87% x 2.56
num::int_log::u32_log10_predictable 1,144 894 -250 -21.85% x 1.28
num::int_log::u32_log10_random 832 786 -46 -5.53% x 1.06
num::int_log::u32_log10_random_small 832 787 -45 -5.41% x 1.06
num::int_log::u64_log10_predictable 2,681 2,057 -624 -23.27% x 1.30
num::int_log::u64_log10_random 1,015 806 -209 -20.59% x 1.26
num::int_log::u64_log10_random_small 1,004 795 -209 -20.82% x 1.26
num::int_log::u128_log10_predictable 56,825 56,526 -299 -0.53% x 1.01
num::int_log::u128_log10_random 9,056 8,861 -195 -2.15% x 1.02
num::int_log::u128_log10_random_small 1,528 1,527 -1 -0.07% x 1.00
```
The 128 bit case remains ridiculously slow because llvm fails to optimize division by a constant 128-bit value to multiplications. This could be worked around but it seems preferable to fix this in llvm.
From u32 up, table lookup (like suggested [here](https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813)) is still faster, but requires a hardware `leading_zeros` to be viable, and might clog up the cache.
Add 'core::array::from_fn' and 'core::array::try_from_fn'
These auxiliary methods fill uninitialized arrays in a safe way and are particularly useful for elements that don't implement `Default`.
```rust
// Foo doesn't implement Default
struct Foo(usize);
let _array = core::array::from_fn::<_, _, 2>(|idx| Foo(idx));
```
Different from `FromIterator`, it is guaranteed that the array will be fully filled and no error regarding uninitialized state will be throw. In certain scenarios, however, the creation of an **element** can fail and that is why the `try_from_fn` function is also provided.
```rust
#[derive(Debug, PartialEq)]
enum SomeError {
Foo,
}
let array = core::array::try_from_fn(|i| Ok::<_, SomeError>(i));
assert_eq!(array, Ok([0, 1, 2, 3, 4]));
let another_array = core::array::try_from_fn(|_| Err(SomeError::Foo));
assert_eq!(another_array, Err(SomeError::Foo));
```