use strict provenance APIs
The stdlib was adjusted to avoid bare int2ptr casts, but recently some casts of that sort have sneaked back in. Let's fix that. :)
implement ptr.addr() via transmute
As per the discussion in https://github.com/rust-lang/unsafe-code-guidelines/issues/286, the semantics for ptr-to-int transmutes that we are going with for now is to make them strip provenance without exposing it. That's exactly what `ptr.addr()` does! So we can implement `ptr.addr()` via `transmute`. This also means that once https://github.com/rust-lang/rust/pull/97684 lands, Miri can distinguish `ptr.addr()` from `ptr.expose_addr()`, and the following code will correctly be called out as having UB (if permissive provenance mode is enabled, which will become the default once the [implementation is complete](https://github.com/rust-lang/miri/issues/2133)):
```rust
fn main() {
let x: i32 = 3;
let x_ptr = &x as *const i32;
let x_usize: usize = x_ptr.addr();
// Cast back an address that did *not* get exposed.
let ptr = std::ptr::from_exposed_addr::<i32>(x_usize);
assert_eq!(unsafe { *ptr }, 3); //~ ERROR Undefined Behavior: dereferencing pointer failed
}
```
This completes the Miri implementation of the new distinctions introduced by strict provenance. :)
Cc `@Gankra` -- for now I left in your `FIXME(strict_provenance_magic)` saying these should be intrinsics, but I do not necessarily agree that they should be. Or if we have an intrinsic, I think it should behave exactly like the `transmute` does, which makes one wonder why the intrinsic should be needed.
Stabilize `{slice,array}::from_ref`
This PR stabilizes the following APIs as `const` functions in Rust `1.63`:
```rust
// core::array
pub const fn from_ref<T>(s: &T) -> &[T; 1];
// core::slice
pub const fn from_ref<T>(s: &T) -> &[T];
```
Note that the `mut` versions are not stabilized as unique references (`&mut _`) are [unstable in const context].
FCP: https://github.com/rust-lang/rust/issues/90206#issuecomment-1134586665
r? rust-lang/libs-api `@rustbot` label +T-libs-api -T-libs
[unstable in const context]: https://github.com/rust-lang/rust/issues/57349
Additional `*mut [T]` methods
Split out from #94247
This adds the following methods to raw slices that already exist on regular slices
* `*mut [T]::is_empty`
* `*mut [T]::split_at_mut`
* `*mut [T]::split_at_mut_unchecked`
These methods reduce the amount of unsafe code needed to migrate `ChunksMut` and related iterators
to raw slices (#94247)
r? `@m-ou-se`
Corrected EBNF grammar for from_str
Hello! This is my first time contributing to an open-source project. I'm excited to have the chance to contribute to the rust community 🥳
I noticed an issue with the documentation for `from_str` in `f32` and `f64`. It states that "All strings that adhere to the following [EBNF](https://www.w3.org/TR/REC-xml/#sec-notation) grammar when lowercased will result in an `Ok` being returned. I believe this is incorrect for the string `"."`, which is valid for the given EBNF grammar, but does not result in an `Ok` being returned ([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=09f891aa87963a56d3b0d715d8cbc2b4)). I have simplified the grammar in a way which fixes that, but is otherwise identical.
Previously, the `Number` part of the EBNF grammar had an option for `'.' Digit*`, which would include the string `"."`. This is not valid, and does not return an Ok as stated. The corrected version removes this, and still allows for the `'.' Digit+` case with the already existing `Digit* '.' Digit+` case.
Add unicode fast path to `is_printable`
Before, it would enter the full expensive check even for normal ascii characters. Now, it skips the check for the ascii characters in `32..127`. This range was checked manually from the current behavior.
I ran the `tracing` test suite in miri, and it was really slow. I looked at a profile, and miri spent most of the time in `core::char::methods::escape_debug_ext`, where half of that was dominated by `core::unicode::printable::is_printable`. So I optimized it here.
The tracing profile:
![The tracing profile](https://user-images.githubusercontent.com/48135649/170883650-23876e7b-3fd1-4e8b-9001-47672e06d914.svg)
Before, it would enter the full expensive check even for normal ascii
characters. Now, it skips the check for the ascii characters in
`32..127`. This range was checked manually from the current behavior.
Use Box::new() instead of box syntax in library tests
The tests inside `library/*` have no reason to use `box` syntax as they have 0 performance relevance. Therefore, we can safely remove them (instead of having to use alternatives like the one in #97293).
Replace `#[default_method_body_is_const]` with `#[const_trait]`
pulled out of #96077
related issues: #67792 and #92158
cc `@fee1-dead`
This is groundwork to only allowing `impl const Trait` for traits that are marked with `#[const_trait]`. This is necessary to prevent adding a new default method from becoming a breaking change (as it could be a non-const fn).
Finish bumping stage0
It looks like the last time had left some remaining cfg's -- which made me think
that the stage0 bump was actually successful. This brings us to a released 1.62
beta though.
This now brings us to cfg-clean, with the exception of check-cfg-features in bootstrap;
I'd prefer to leave that for a separate PR at this time since it's likely to be more tricky.
cc https://github.com/rust-lang/rust/pull/97147#issuecomment-1132845061
r? `@pietroalbini`
ptr::invalid is not equivalent to a int2ptr cast
I just realized I forgot to update these docs when adding `from_exposed_addr`.
Right now the docs say `invalid` and `from_exposed_addr` are both equivalent to a cast, and that is clearly not what we want.
Cc ``@Gankra``
Previously, the `Number` part of the EBNF grammar had an option for `'.' Digit*`, which would include the string "." (a single decimal point). This is not valid, and does not return an Ok as stated. The corrected version removes this, and still allows for the `'.' Digit+` case with the already existing `Digit* '.' Digit+` case.
Partially stabilize `(const_)slice_ptr_len` feature by stabilizing `NonNull::len`
This PR partially stabilizes features `const_slice_ptr_len` and `slice_ptr_len` by only stabilizing `NonNull::len`. This partial stabilization is tracked under features `slice_ptr_len_nonnull` and `const_slice_ptr_len_nonnull`, for which this PR can serve as the tracking issue.
To summarize the discussion from #71146 leading up to this partial stabilization request:
It's currently a bit footgunny to obtain the length of a raw slice pointer, stabilization of `NonNull:len` will help with removing these footguns. Some example footguns are:
```rust
/// # Safety
/// The caller must ensure that `ptr`:
/// 1. does not point to memory that was previously allocated but is now deallocated;
/// 2. is within the bounds of a single allocated object;
/// 3. does not to point to a slice for which the length exceeds `isize::MAX` bytes;
/// 4. points to a properly aligned address;
/// 5. does not point to uninitialized memory;
/// 6. does not point to a mutably borrowed memory location.
pub unsafe fn ptr_len<T>(ptr: core::ptr::NonNull<[T]>) -> usize {
(&*ptr.as_ptr()).len()
}
```
A slightly less complicated version (but still more complicated than it needs to be):
```rust
/// # Safety
/// The caller must ensure that the start of `ptr`:
/// 1. does not point to memory that was previously allocated but is now deallocated;
/// 2. must be within the bounds of a single allocated object.
pub unsafe fn ptr_len<T>(ptr: NonNull<[T]>) -> usize {
(&*(ptr.as_ptr() as *const [()])).len()
}
```
This PR does not stabilize `<*const [T]>::len` and `<*mut [T]>::len` because the tracking issue #71146 list a potential blocker for these methods, but this blocker [does not apply](https://github.com/rust-lang/rust/issues/71146#issuecomment-808735714) to `NonNull::len`.
We should probably also ping the [Constant Evaluation WG](https://github.com/rust-lang/const-eval) since this PR includes a `#[rustc_allow_const_fn_unstable(const_slice_ptr_len)]`. My instinct here is that this will probably be okay because the pointer is not actually dereferenced and `len()` does not touch the address component of the pointer, but would be best to double check :)
One potential down-side was raised that stabilizing `NonNull::len` could lead to encouragement of coding patterns like:
```
pub fn ptr_len<T>(ptr: *mut [T]) -> usize {
NonNull::new(ptr).unwrap().len()
}
```
which unnecessarily assert non-nullness. However, these are much less of a footgun than the above examples and this should be resolved when `slice_ptr_len` fully stabilizes eventually.
It looks like the last time had left some remaining cfg's -- which made me think
that the stage0 bump was actually successful. This brings us to a released 1.62
beta though.
Add section on common message styles for Result::expect
Based on a question from https://github.com/rust-lang/project-error-handling/issues/50#issuecomment-1092339937
~~One thing I haven't decided on yet, should I duplicate this section on `Option::expect`, link to this section, or move it somewhere else and link to that location from both docs?~~: I ended up moving the section to `std::error` and referencing it from both `Result::expect` and `Option::expect`'s docs.
I think this section, when combined with the similar update I made on [`std::panic!`](https://doc.rust-lang.org/nightly/std/macro.panic.html#when-to-use-panic-vs-result) implies that we should possibly more aggressively encourage and support the "expect as precondition" style described in this section. The consensus among the libs team seems to be that panic should be used for bugs, not expected potential failure modes. The "expect as error message" style seems to align better with the panic for unrecoverable errors style where they're seen as normal errors where the only difference is a desire to kill the current execution unit (aka erlang style error handling). I'm wondering if we should be providing a panic hook similar to `human-panic` or more strongly recommending the "expect as precondition" style of expect message.
Extend ptr::null and null_mut to all thin (including extern) types
Fixes https://github.com/rust-lang/rust/issues/93959
This change was accepted in https://rust-lang.github.io/rfcs/2580-ptr-meta.html
Note that this changes the signature of **stable** functions. The change should be backward-compatible, but it is **insta-stable** since it cannot (easily, at all?) be made available only through a `#![feature(…)]` opt-in.
The RFC also proposed the same change for `NonNull::dangling`, which makes sense it terms of its signature but not in terms of its implementation. `dangling` uses `align_of()` as an address. But what `align_of()` should be for extern types or whether it should be allowed at all remains an open question.
This commit depends on https://github.com/rust-lang/rust/pull/93977, which is not yet part of the bootstrap compiler. So `#[cfg]` is used to only apply the change in stage 1+. As far a I know bounds cannot be made conditional with `#[cfg]`, so the entire functions are duplicated. This is unfortunate but temporary.
Since this duplication makes it less obvious in the diff, the new definitions differ in:
* More permissive bounds (`Thin` instead of implied `Sized`)
* Different implementation
* Having `rustc_allow_const_fn_unstable(const_fn_trait_bound)`
* Having `rustc_allow_const_fn_unstable(ptr_metadata)`
[RFC 2011] Library code
CC https://github.com/rust-lang/rust/pull/96496
Based on https://github.com/dtolnay/case-studies/tree/master/autoref-specialization.
Basically creates two traits with the same method name. One trait is generic over any `T` and the other is specialized to any `T: Printable`.
The compiler will then call the corresponding trait method through auto reference.
```rust
fn main() {
let mut a = Capture::new();
let mut b = Capture::new();
(&Wrapper(&1i32)).try_capture(&mut a); // `try_capture` from `TryCapturePrintable`
(&Wrapper(&vec![1i32])).try_capture(&mut b); // `try_capture` from `TryCaptureGeneric`
assert_eq!(format!("{:?}", a), "1");
assert_eq!(format!("{:?}", b), "N/A");
}
```
r? `@scottmcm`
Change orderings of `Debug` for the Atomic types to `Relaxed`.
This reduces synchronization between threads when debugging the atomic types. Reducing the synchronization means that executions with and without the debug calls will be more consistent, making it easier to debug.
We discussed this on the Rust Community Discord with `@ibraheemdev` before.
explain how to turn integers into fn ptrs
(with an intermediate raw ptr, not a direct transmute)
Direct int2ptr transmute, under the semantics I am imagining, will produce a ptr with "invalid" provenance that is invalid to deref or call. We cannot give it the same semantics as int2ptr casts since those do [something complicated](https://www.ralfj.de/blog/2022/04/11/provenance-exposed.html).
To my great surprise, that is already what the example in the `transmute` docs does. :) I still added a comment to say that that part is important, and I added a section explicitly talking about this to the `fn()` type docs.
With https://github.com/rust-lang/miri/pull/2151, Miri will start complaining about direct int-to-fnptr transmutes (in the sense that it is UB to call the resulting pointer).
Document rounding for floating-point primitive operations and string parsing
The docs for floating point don't have much to say at present about either the precision of their results or rounding behaviour.
As I understand it[^1][^2], Rust doesn't support operating with non-default rounding directions, so we need only describe roundTiesToEven.
[^1]: https://github.com/rust-lang/rust/issues/41753#issuecomment-299322887
[^2]: https://github.com/llvm/llvm-project/issues/8472#issuecomment-980888781
This PR makes a start by documenting that for primitive operations and `from_str()`.
Clarify slice and Vec iteration order
While already being inferable from the doc examples, it wasn't fully specified. This is the only logical way to do a slice iterator, so I think this should be uncontroversial. It also improves the `Vec::into_iter` example to better show the order and that the iterator returns owned values.
Change `NonNull::as_uninit_*` to take self by value (as opposed to reference), matching primitive pointers.
Copied from my comment on [#75402](https://github.com/rust-lang/rust/issues/75402#issuecomment-1100496823):
> I noticed that `as_uninit_*` on pointers take `self` by value (and pointers are `Copy`), e.g. see [`as_uninit_mut`](https://doc.rust-lang.org/core/primitive.pointer.html#method.as_uninit_mut).
>
> However, on `NonNull`, these functions take `self` by reference, e.g. see the function with the same name by for `NonNull`: [`as_uninit_mut`](https://doc.rust-lang.org/std/ptr/struct.NonNull.html#method.as_uninit_mut) takes `self` by mutable reference. Even more inconsistent, [`as_uninit_slice_mut`](https://doc.rust-lang.org/std/ptr/struct.NonNull.html#method.as_uninit_slice_mut) returns a mutable reference, but takes `self` by immutable reference.
>
> I think these methods should take `self` by value for consistency. The returned lifetime is unbounded anyways and not tied to the pointer/NonNull value anyways
I realized the change is trivial (if desired) so here I am creating my first PR. I think it's not a breaking change since (it's on nightly and) `NonNull` is `Copy`; all previous usages of these methods taking `self` by reference should continue to compile. However, it might cause warnings to appear on usages of `NonNull::as_uninit_mut`, which used to require the the `NonNull` variable be declared `mut`, but now it's not necessary.
Fix `Display` for `cell::{Ref,RefMut}`
These guards changed to pointers in #97027, but their `Display` was
formatting that field directly, which made it show the raw pointer
value. Now we go through `Deref` to display the real value again.
Miri noticed this change, #97204, so hopefully that will be fixed.
Stabilize `array_from_fn`
## Overall
Stabilizes `core::array::from_fn` ~~and `core::array::try_from_fn`~~ to allow the creation of custom infallible ~~and fallible~~ arrays.
Signature proposed for stabilization here, tweaked as requested in the meeting:
```rust
// in core::array
pub fn from_fn<T, const N: usize, F>(_: F) -> [T; N];
```
Examples in https://doc.rust-lang.org/nightly/std/array/fn.from_fn.html
## History
* On 2020-08-17, implementation was [proposed](https://github.com/rust-lang/rust/pull/75644).
* On 2021-09-29, tracking issue was [created](https://github.com/rust-lang/rust/issues/89379).
* On 2021-10-09, the proposed implementation was [merged](bc8ad24020).
* On 2021-12-03, the return type of `try_from_fn` was [changed](https://github.com/rust-lang/rust/pull/91286#issuecomment-985513407).
## Considerations
* It is being assumed that indices are useful and shouldn't be removed from the callbacks
* The fact that `try_from_fn` returns an unstable type `R: Try` does not prevent stabilization. Although I'm honestly not sure about it.
* The addition or not of repeat-like variants is orthogonal to this PR.
These considerations are not ways of saying what is better or what is worse. In reality, they are an attempt to move things forward, anything really.
cc https://github.com/rust-lang/rust/issues/89379
Implement Copy, Clone, PartialEq and Eq for core::fmt::Alignment
Alignment is a fieldless exhaustive enum, so it is already possible to
clone and compare it by matching, but it is inconvenient to do so. For
example, if one would like to create a struct describing a formatter
configuration and provide a clone implementation:
```rust
pub struct Format {
fill: char,
width: Option<usize>,
align: fmt::Alignment,
}
impl Clone for Format {
fn clone(&self) -> Self {
Format {
align: match self.align {
fmt::Alignment::Left => fmt::Alignment::Left,
fmt::Alignment::Right => fmt::Alignment::Right,
fmt::Alignment::Center => fmt::Alignment::Center,
},
.. *self
}
}
}
```
Derive Copy, Clone, PartialEq, and Eq for Alignment for convenience.
make ptr::invalid not the same as a regular int2ptr cast
In Miri, we would like to distinguish `ptr::invalid` from `ptr::from_exposed_provenance`, so that we can provide better diagnostics issues like https://github.com/rust-lang/miri/issues/2134, and so that we can detect the UB in programs like
```rust
fn main() {
let x = 0u8;
let original_ptr = &x as *const u8;
let addr = original_ptr.expose_addr();
let new_ptr: *const u8 = core::ptr::invalid(addr);
unsafe {
dbg!(*new_ptr);
}
}
```
To achieve that, the two functions need to have different implementations. Currently, both are just `as` casts. We *could* add an intrinsic for this, but it turns out `transmute` already has the right behavior, at least as far as Miri is concerned. So I propose we just use that.
Cc `@Gankra`
Add implicit call to from_str via parse in documentation
The documentation mentions "FromStr’s from_str method is often used implicitly,
through str’s parse method. See parse’s documentation for examples.".
It may be nicer to show that in the code example as well.
These guards changed to pointers in #97027, but their `Display` was
formatting that field directly, which made it show the raw pointer
value. Now we go through `Deref` to display the real value again.
Say "last" instead of "rightmost" in the documentation for `std::str:rfind`
In the documentation comment for `std::str::rfind`, say "last" instead
of "rightmost" to describe the match that `rfind` finds. This follows the
spirit of #30459, for which `trim_left` and `trim_right` were replaced by
`trim_start` and `trim_end` to be more clear about how they work on
text which is displayed right-to-left.
Use pointers in `cell::{Ref,RefMut}` to avoid `noalias`
When `Ref` and `RefMut` were based on references, they would get LLVM `noalias` attributes that were incorrect, because that alias guarantee is only true until the guard drops. A `&RefCell` on the same value can get a new borrow that aliases the previous guard, possibly leading to miscompilation. Using `NonNull` pointers in `Ref` and `RefCell` avoids `noalias`.
Fixes the library side of #63787, but we still might want to explore language solutions there.
In the documentation comment for `std::str::rfind`, say "last" instead
of "rightmost" to describe the match that `rfind` finds. This follows the
spirit of #30459, for which `trim_left` and `trim_right` were replaced by
`trim_start` and `trim_end` to be more clear about how they work on
text which is displayed right-to-left.
The documentation mentions "FromStr’s from_str method is often used implicitly,
through str’s parse method. See parse’s documentation for examples.".
It may be nicer to show that in the code example as well.
Remove potentially misleading realloc parenthetical
This parenthetical is problematic, because it suggests that the following is sound:
```rust
let layout = Layout:🆕:<[u8; 32]>();
let p1 = alloc(layout);
let p2 = realloc(p1, layout, 32);
if p1 == p2 {
p1.write([0; 32]);
dealloc(p1, layout);
} else {
dealloc(p2, layout);
}
```
At the very least, this isn't the case for [ANSI `realloc`](https://en.cppreference.com/w/c/memory/realloc)
> The original pointer `ptr` is invalidated and any access to it is undefined behavior (even if reallocation was in-place).
and [Windows `HeapReAlloc`](https://docs.microsoft.com/en-us/windows/win32/api/heapapi/nf-heapapi-heaprealloc) is unclear at best (`HEAP_REALLOC_IN_PLACE_ONLY`'s description may imply that the old pointer may be used if `HEAP_REALLOC_IN_PLACE_ONLY` is provided).
The conservative position is to just remove the parenthetical.
cc `@rust-lang/wg-unsafe-code-guidelines` `@rust-lang/wg-allocators`
This reduces synchronization between threads when
debugging the atomic types.
Reducing the synchronization means that executions with and
without the debug calls will be more consistent,
making it easier to debug.
Fixes https://github.com/rust-lang/rust/issues/93959
This change was accepted in https://rust-lang.github.io/rfcs/2580-ptr-meta.html
Note that this changes the signature of **stable** functions.
The change should be backward-compatible, but it is **insta-stable**
since it cannot (easily, at all?) be made available only
through a `#![feature(…)]` opt-in.
The RFC also proposed the same change for `NonNull::dangling`,
which makes sense it terms of its signature but not in terms of its implementation.
`dangling` uses `align_of()` as an address. But what `align_of()` should be for
extern types or whether it should be allowed at all remains an open question.
This commit depends on https://github.com/rust-lang/rust/pull/93977, which is not yet
part of the bootstrap compiler. So `#[cfg]` is used to only apply the change in
stage 1+. As far a I know bounds cannot be made conditional with `#[cfg]`, so the
entire functions are duplicated. This is unfortunate but temporary.
Since this duplication makes it less obvious in the diff,
the new definitions differ in:
* More permissive bounds (`Thin` instead of implied `Sized`)
* Different implementation
* Having `rustc_allow_const_fn_unstable(ptr_metadata)`
Expand core::hint::unreachable_unchecked() docs
Rework the docs for `unreachable_unchecked`, encouraging deliberate use, and providing a better example for action at a distance.
Fixes#95865
Apparently LLVM is unable to understand that if count_ones() == 1 then self != 0.
Adding `assume(align != 0)` helps generating better asm:
https://rust.godbolt.org/z/ja18YKq91
Since they work on byte pointers (by `.cast::<u8>()`ing them), there is
no need to know the size of `T` and so there is no need for `T: Sized`.
The `is_aligned_to` is similar, though it doesn't need the _alignment_
of `T`.
Like we have `add`/`sub` which are the `usize` version of `offset`, this adds the `usize` equivalent of `offset_from`. Like how `.add(d)` replaced a whole bunch of `.offset(d as isize)`, you can see from the changes here that it's fairly common that code actually knows the order between the pointers and *wants* a `usize`, not an `isize`.
As a bonus, this can do `sub nuw`+`udiv exact`, rather than `sub`+`sdiv exact`, which can be optimized slightly better because it doesn't have to worry about negatives. That's why the slice iterators weren't using `offset_from`, though I haven't updated that code in this PR because slices are so perf-critical that I'll do it as its own change.
This is an intrinsic, like `offset_from`, so that it can eventually be allowed in CTFE. It also allows checking the extra safety condition -- see the test confirming that CTFE catches it if you pass the pointers in the wrong order.
Warn on unused `#[doc(hidden)]` attributes on trait impl items
[Zulip conversation](https://rust-lang.zulipchat.com/#narrow/stream/266220-rustdoc/topic/.E2.9C.94.20Validy.20checks.20for.20.60.23.5Bdoc.28hidden.29.5D.60).
Whether an associated item in a trait impl is shown or hidden in the documentation entirely depends on the corresponding item in the trait declaration. Rustdoc completely ignores `#[doc(hidden)]` attributes on impl items. No error or warning is emitted:
```rust
pub trait Tr { fn f(); }
pub struct Ty;
impl Tr for Ty { #[doc(hidden)] fn f() {} }
// ^^^^^^^^^^^^^^ ignored by rustdoc and currently
// no error or warning issued
```
This may lead users to the wrong belief that the attribute has an effect. In fact, several such cases are found in the standard library (I've removed all of them in this PR).
There does not seem to exist any incentive to allow this in the future either: Impl'ing a trait for a type means the type *fully* conforms to its API. Users can add `#[doc(hidden)]` to the whole impl if they want to hide the implementation or add the attribute to the corresponding associated item in the trait declaration to hide the specific item. Hiding an implementation of an associated item does not make much sense: The associated item can still be found on the trait page.
This PR emits the warn-by-default lint `unused_attribute` for this case with a future-incompat warning.
`@rustbot` label T-compiler T-rustdoc A-lint
Improve floating point documentation
This is my attempt to improve/solve https://github.com/rust-lang/rust/issues/95468 and https://github.com/rust-lang/rust/issues/73328 .
Added/refined explanations:
- Refine the "NaN as a special value" top level explanation of f32
- Refine `const NAN` docstring: add an explanation about there being multitude of NaN bitpatterns and disclaimer about the portability/stability guarantees.
- Refine `fn is_sign_positive` and `fn is_sign_negative` docstrings: add disclaimer about the sign bit of NaNs.
- Refine `fn min` and `fn max` docstrings: explain the semantics and their relationship to the standard and libm better.
- Refine `fn trunc` docstrings: explain the semantics slightly more.
- Refine `fn powi` docstrings: add disclaimer that the rounding behaviour might be different from `powf`.
- Refine `fn copysign` docstrings: add disclaimer about payloads of NaNs.
- Refine `minimum` and `maximum`: add disclaimer that "propagating NaN" doesn't mean that propagating the NaN bit patterns is guaranteed.
- Refine `max` and `min` docstrings: add "ignoring NaN" to bring the one-row explanation to parity with `minimum` and `maximum`.
Cosmetic changes:
- Reword `NaN` and `NAN` as plain "NaN", unless they refer to the specific `const NAN`.
- Reword "a number" to `self` in function docstrings to clarify.
- Remove "Returns NAN if the number is NAN" from `abs`, as this is told to be the default behavior in the top explanation.
Remove `#[rustc_deprecated]`
This removes `#[rustc_deprecated]` and introduces diagnostics to help users to the right direction (that being `#[deprecated]`). All uses of `#[rustc_deprecated]` have been converted. CI is expected to fail initially; this requires #95958, which includes converting `stdarch`.
I plan on following up in a short while (maybe a bootstrap cycle?) removing the diagnostics, as they're only intended to be short-term.
Further elaborate the lack of guarantees from `Hasher`
I realized that I got too excited in #94598 by adding new methods, and forgot to do the documentation to really answer the core question in #94026.
This PR just has that doc update.
r? `@Amanieu`
Add more diagnostic items
This just adds a handful diagnostic items I noticed were missing.
Would it be worth doing this for all of the remaining types? I'm willing to do it if it'd be helpful.
Link to correct `as_mut` in docs for `pointer::as_ref`
It previously linked to the unstable const-mut-cast method instead of
the `mut` counterpart for `as_ref`.
Closes#96327
Add a dedicated length-prefixing method to `Hasher`
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Fixes#94026
r? rust-lang/libs
---
The core of this change is the following two new methods on `Hasher`:
```rust
pub trait Hasher {
/// Writes a length prefix into this hasher, as part of being prefix-free.
///
/// If you're implementing [`Hash`] for a custom collection, call this before
/// writing its contents to this `Hasher`. That way
/// `(collection![1, 2, 3], collection![4, 5])` and
/// `(collection![1, 2], collection![3, 4, 5])` will provide different
/// sequences of values to the `Hasher`
///
/// The `impl<T> Hash for [T]` includes a call to this method, so if you're
/// hashing a slice (or array or vector) via its `Hash::hash` method,
/// you should **not** call this yourself.
///
/// This method is only for providing domain separation. If you want to
/// hash a `usize` that represents part of the *data*, then it's important
/// that you pass it to [`Hasher::write_usize`] instead of to this method.
///
/// # Examples
///
/// ```
/// #![feature(hasher_prefixfree_extras)]
/// # // Stubs to make the `impl` below pass the compiler
/// # struct MyCollection<T>(Option<T>);
/// # impl<T> MyCollection<T> {
/// # fn len(&self) -> usize { todo!() }
/// # }
/// # impl<'a, T> IntoIterator for &'a MyCollection<T> {
/// # type Item = T;
/// # type IntoIter = std::iter::Empty<T>;
/// # fn into_iter(self) -> Self::IntoIter { todo!() }
/// # }
///
/// use std:#️⃣:{Hash, Hasher};
/// impl<T: Hash> Hash for MyCollection<T> {
/// fn hash<H: Hasher>(&self, state: &mut H) {
/// state.write_length_prefix(self.len());
/// for elt in self {
/// elt.hash(state);
/// }
/// }
/// }
/// ```
///
/// # Note to Implementers
///
/// If you've decided that your `Hasher` is willing to be susceptible to
/// Hash-DoS attacks, then you might consider skipping hashing some or all
/// of the `len` provided in the name of increased performance.
#[inline]
#[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
fn write_length_prefix(&mut self, len: usize) {
self.write_usize(len);
}
/// Writes a single `str` into this hasher.
///
/// If you're implementing [`Hash`], you generally do not need to call this,
/// as the `impl Hash for str` does, so you can just use that.
///
/// This includes the domain separator for prefix-freedom, so you should
/// **not** call `Self::write_length_prefix` before calling this.
///
/// # Note to Implementers
///
/// The default implementation of this method includes a call to
/// [`Self::write_length_prefix`], so if your implementation of `Hasher`
/// doesn't care about prefix-freedom and you've thus overridden
/// that method to do nothing, there's no need to override this one.
///
/// This method is available to be overridden separately from the others
/// as `str` being UTF-8 means that it never contains `0xFF` bytes, which
/// can be used to provide prefix-freedom cheaper than hashing a length.
///
/// For example, if your `Hasher` works byte-by-byte (perhaps by accumulating
/// them into a buffer), then you can hash the bytes of the `str` followed
/// by a single `0xFF` byte.
///
/// If your `Hasher` works in chunks, you can also do this by being careful
/// about how you pad partial chunks. If the chunks are padded with `0x00`
/// bytes then just hashing an extra `0xFF` byte doesn't necessarily
/// provide prefix-freedom, as `"ab"` and `"ab\u{0}"` would likely hash
/// the same sequence of chunks. But if you pad with `0xFF` bytes instead,
/// ensuring at least one padding byte, then it can often provide
/// prefix-freedom cheaper than hashing the length would.
#[inline]
#[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
fn write_str(&mut self, s: &str) {
self.write_length_prefix(s.len());
self.write(s.as_bytes());
}
}
```
With updates to the `Hash` implementations for slices and containers to call `write_length_prefix` instead of `write_usize`.
`write_str` defaults to using `write_length_prefix` since, as was pointed out in the issue, the `write_u8(0xFF)` approach is insufficient for hashers that work in chunks, as those would hash `"a\u{0}"` and `"a"` to the same thing. But since `SipHash` works byte-wise (there's an internal buffer to accumulate bytes until a full chunk is available) it overrides `write_str` to continue to use the add-non-UTF-8-byte approach.
---
Compatibility:
Because the default implementation of `write_length_prefix` calls `write_usize`, the changed hash implementation for slices will do the same thing the old one did on existing `Hasher`s.
We might want to change the default before stabilizing (or maybe even after), but for getting in the new unstable methods, leave it as-is for now. That way it won't break cargo and such.
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Fix typo in `offset_from` documentation
Small fix for what I think is a typo in the `offset_from` documentation.
Someone reading this may understand that the distance in bytes is obtained by dividing the distance by `mem::size_of::<T>()`, but here we just want to define "units of T" in terms of bytes (i.e., units of T == bytes / `mem::size_of::<T>()`).
Update `int_roundings` methods from feedback
This updates `#![feature(int_roundings)]` (#88581) from feedback. All methods now take `NonZeroX`. The documentation makes clear that they panic in debug mode and wrap in release mode.
r? `@joshtriplett`
`@rustbot` label +T-libs +T-libs-api +S-waiting-on-review
Include nonexported macro_rules! macros in the doctest target
Fixes#88038
This PR aims to include nonexported `macro_rules!` macros in the doctest target. For more details, please see the above issue.
Avoid using `rand::thread_rng` in the stdlib benchmarks.
This is kind of an anti-pattern because it introduces extra nondeterminism for no real reason. In thread_rng's case this comes both from the random seed and also from the reseeding operations it does, which occasionally does syscalls (which adds additional nondeterminism). The impact of this would be pretty small in most cases, but it's a good practice to avoid (particularly because avoiding it was not hard).
Anyway, several of our benchmarks already did the right thing here anyway, so the change was pretty easy and mostly just applying it more universally. That said, the stdlib benchmarks aren't particularly stable (nor is our benchmark framework particularly great), so arguably this doesn't matter that much in practice.
~~Anyway, this also bumps the `rand` dev-dependency to 0.8, since it had fallen somewhat out of date.~~ Nevermind, too much of a headache.
library/core: Fixes implement of c_uint, c_long, c_ulong
Fixes: aa67016624 ("make memcmp return a value of c_int_width instead of i32")
Introduce c_num_definition to getting the cfg_if logic easier to maintain
Add newlines for easier code reading
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
It improves the performance of iterators using unchecked access when building in incremental mode
(due to the larger CGU count?). It might negatively affect incremental compile times for better runtime results,
but considering that the equivalent `next()` implementations also are `#[inline]` and usually are more complex this
should be ok.
```
./x.py bench library/core -i --stage 0 --test-args bench_trusted_random_access
OLD: 119,172 ns/iter
NEW: 17,714 ns/iter
```
rustdoc: Resolve doc links referring to `macro_rules` items
cc https://github.com/rust-lang/rust/issues/81633
UPD: the fallback to considering *all* `macro_rules` in the crate for unresolved names is not removed in this PR, it will be removed separately and will be run through crater.
Using an obviously-placeholder syntax. An RFC would still be needed before this could have any chance at stabilization, and it might be removed at any point.
But I'd really like to have it in nightly at least to ensure it works well with try_trait_v2, especially as we refactor the traits.
Fix some confusing wording and improve slice-search-related docs
This adds more links between `contains` and `binary_search` because I do think they have some relevant connections. If your (big) slice happens to be sorted and you know it, surely you should be using `[3; 100].binary_search(&5).is_ok()` over `[3; 100].contains(&5)`?
This also fixes the confusing "searches this sorted X" wording which just sounds really weird because it doesn't know whether it's actually sorted. It should be but it may not be. The new wording should make it clearer that you will probably want to sort it and in the same sentence it also mentions the related function `contains`.
Similarly, this mentions `binary_search` on `contains`' docs.
This also fixes some other minor stuff and inconsistencies.
Unstably constify `impl<I: Iterator> IntoIterator for I`
This constifies the default `IntoIterator` implementation under the `const_intoiterator_identity` feature.
Tracking Issue: #90603
No "weird" floats in const fn {from,to}_bits
I suspect this code is subtly incorrect and that we don't even e.g. use x87-style floats in CTFE, so I don't have to guard against that case. A future PR will be hopefully removing them from concern entirely, anyways. But at the moment I wanted to get this rolling because small questions like that one seem best answered by review.
r? `@oli-obk`
cc `@eddyb` `@thomcc`
Fixes: aa67016624 ("make memcmp return a value of c_int_width instead of i32")
Introduce c_num_definition to getting the cfg_if logic easier to maintain
Add newlines for easier code reading
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Add slice::remainder
This adds a remainder function to the Slice iterator, so that a caller can access unused
elements if iteration stops.
Addresses #91733
Make some `usize`-typed masks definitions agnostic to the size of `usize`
Some masks where defined as
```rust
const NONASCII_MASK: usize = 0x80808080_80808080u64 as usize;
```
where it was assumed that `usize` is never wider than 64, which is currently true.
To make those constants valid in a hypothetical 128-bit target, these constants have been redefined in an `usize`-width-agnostic way
```rust
const NONASCII_MASK: usize = usize::from_ne_bytes([0x80; size_of::<usize>()]);
```
There are already some cases where Rust anticipates the possibility of supporting 128-bit targets, such as not implementing `From<usize>` for `u64`.
docs: add link from zip to unzip
The docs for `Iterator::unzip` explain that it is kind of an inverse operation to `Iterator::zip` and guide the reader to the `zip` docs, but the `zip` docs don't let the user know that they can undo the `zip` operation with `unzip`. This change modifies the docs to help the user find `unzip`.
reference to taking `self` by value. This is consistent with the methods
of the same names on primitive pointers. The returned lifetime was
already previously unbounded.
Create (unstable) 2024 edition
[On Zulip](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Deprecating.20macro.20scoping.20shenanigans/near/272860652), there was a small aside regarding creating the 2024 edition now as opposed to later. There was a reasonable amount of support and no stated opposition.
This change creates the 2024 edition in the compiler and creates a prelude for the 2024 edition. There is no current difference between the 2021 and 2024 editions. Cargo and other tools will need to be updated separately, as it's not in the same repository. This change permits the vast majority of work towards the next edition to proceed _now_ instead of waiting until 2024.
For sanity purposes, I've merged the "hello" UI tests into a single file with multiple revisions. Otherwise we'd end up with a file per edition, despite them being essentially identical.
````@rustbot```` label +T-lang +S-waiting-on-review
Not sure on the relevant team, to be honest.
Stabilize `derive_default_enum`
This stabilizes `#![feature(derive_default_enum)]`, as proposed in [RFC 3107](https://github.com/rust-lang/rfcs/pull/3107) and tracked in #87517. In short, it permits you to `#[derive(Default)]` on `enum`s, indicating what the default should be by placing a `#[default]` attribute on the desired variant (which must be a unit variant in the interest of forward compatibility).
```````@rustbot``````` label +S-waiting-on-review +T-lang
Some masks where defined as
```rust
const NONASCII_MASK: usize = 0x80808080_80808080u64 as usize;
```
where it was assumed that `usize` is never wider than 64, which is currently true.
To make those constants valid in a hypothetical 128-bit target, these constants have been redefined in an `usize`-width-agnostic way
```rust
const NONASCII_MASK: usize = usize::from_ne_bytes([0x80; size_of::<usize>()]);
```
There are already some cases where Rust anticipates the possibility of supporting 128-bit targets, such as not implementing `From<usize>` for `u64`.
The docs for `Iterator::unzip` explain that it is kind of an inverse operation to `Iterator::zip` and guide the reader to the `zip` docs, but the `zip` docs don't let the user know that they can undo the `zip` operation with `unzip`. This change modifies the docs to help the user find `unzip`.
Implement tuples using recursion
Because it is c00l3r™, requires less repetition and can be used as a reference for external people.
This change is non-essential and I am not sure about potential performance impacts so feel free to close this PR if desired.
r? `@petrochenkov`
Faster parsing for lower numbers for radix up to 16 (cont.)
( Continuation of https://github.com/rust-lang/rust/pull/83371 )
With LingMan's change I think this is potentially ready.
Clarify str::from_utf8_unchecked's invariants
Specifically, make it clear that it is immediately UB to pass ill-formed UTF-8 into the function. The previous wording left space to interpret that the UB only occurred when calling another function, which "assumes that `&str`s are valid UTF-8."
This does not change whether str being UTF-8 is a safety or a validity invariant. (As per previous discussion, it is a safety invariant, not a validity invariant.) It just makes it clear that valid UTF-8 is a precondition of str::from_utf8_unchecked, and that emitting an Abstract Machine fault (e.g. UB or a sanitizer error) on invalid UTF-8 is a valid thing to do.
If user code wants to create an unsafe `&str` pointing to ill-formed UTF-8, it must be done via transmutes. Also, just, don't.
Zulip discussion: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-lang.2Fwg-unsafe-code-guidelines/topic/str.3A.3Afrom_utf8_unchecked.20Safety.20requirement
Update binary_search example to instead redirect to partition_point
Inspired by discussion in the tracking issue for `Result::into_ok_or_err`: https://github.com/rust-lang/rust/issues/82223#issuecomment-1067098167
People are surprised by us not providing a `Result<T, T> -> T` conversion, and the main culprit for this confusion seems to be the `binary_search` API. We should instead redirect people to the equivalent API that implicitly does that `Result<T, T> -> T` conversion internally which should obviate the need for the `into_ok_or_err` function and give us time to work towards a more general solution that applies to all enums rather than just `Result` such as making or_patterns usable for situations like this via postfix `match`.
I choose to duplicate the example rather than simply moving it from `binary_search` to partition point because most of the confusion seems to arise when people are looking at `binary_search`. It makes sense to me to have the example presented immediately rather than requiring people to click through to even realize there is an example. If I had to put it in only one place I'd leave it in `binary_search` and remove it from `partition_point` but it seems pretty obviously relevant to `partition_point` so I figured the best option would be to duplicate it.
Specifically, make it clear that it is immediately UB to pass ill-formed UTF-8 into the function. The previous wording left space to interpret that the UB only occurred when calling another function, which "assumes that `&str`s are valid UTF-8."
This does not change whether str being UTF-8 is a safety or a validity invariant. (As per previous discussion, it is a safety invariant, not a validity invariant.) It just makes it clear that valid UTF-8 is a precondition of str::from_utf8_unchecked, and that emitting an Abstract Machine fault (e.g. UB or a sanitizer error) on invalid UTF-8 is a valid thing to do.
If user code wants to create an unsafe `&str` pointing to ill-formed UTF-8, it must be done via transmutes. Also, just, don't.
Avoid duplication of doc comments in `std::char` constants and functions
For those consts and functions, only the summary is kept and a reference to the `char` associated const/method is included.
Additionaly, re-exported functions have been converted to function definitions that call the previously re-exported function. This makes it easier to add a deprecated attribute to these functions in the future.
Use bitwise XOR in to_ascii_uppercase
This saves an instruction compared to the previous approach, which
was to unset the fifth bit with bitwise OR.
Comparison of generated assembly on x86: https://godbolt.org/z/GdfvdGs39
This can also affect autovectorization, saving SIMD instructions as well: https://godbolt.org/z/cnPcz75T9
Not sure if `u8::to_ascii_lowercase` should also be changed, since using bitwise OR for that function does not require an extra bitwise negate since the code is setting a bit rather than unsetting a bit. `char::to_ascii_uppercase` already uses XOR, so no change seems to be required there.
Left overs of #95761
These are just nits. Feel free to close this PR if all modifications are not worth merging.
* `#![feature(decl_macro)]` is not needed anymore in `rustc_expand`
* `tuple_impls` does not require `$Tuple:ident`. I guess it is there to enhance readability?
r? ```@petrochenkov```
Make non-power-of-two alignments a validity error in `Layout`
Inspired by the zulip conversation about how `Layout` should better enforce `size <= isize::MAX as usize`, this uses an N-variant enum on N-bit platforms to require at the validity level that the existing invariant of "must be a power of two" is upheld.
This was MIRI can catch it, and means there's a more-specific type for `Layout` to store than just `NonZeroUsize`.
It's left as `pub(crate)` here; a future PR could consider giving it a tracking issue for non-internal usage.
Inspired by the zulip conversation about how `Layout` should better enforce `size < isize::MAX as usize`, this uses an N-variant enum on N-bit platforms to require at the validity level that the existing invariant of "must be a power of two" is upheld.
This was MIRI can catch it, and means there's a more-specific type for `Layout` to store than just `NonZeroUsize`.
Correct safety reasoning in `str::make_ascii_{lower,upper}case()`
I don't understand why the previous comment was used (it was inserted in #66564), but it doesn't explain why these functions are safe, only why `str::as_bytes{_mut}()` are safe.
If someone thinks they make perfect sense, I'm fine with closing this PR.
Bump bootstrap compiler to 1.61.0 beta
This PR bumps the bootstrap compiler to the 1.61.0 beta. The first commit changes the stage0 compiler, the second commit applies the "mechanical" changes and the third and fourth commits apply changes explained in the relevant comments.
r? `@Mark-Simulacrum`