Custom MIR: Support `BinOp::Offset`
Since offset doesn't have an infix operator, a new function `Offset` is added which is lowered to `Rvalue::BinaryOp(BinOp::Offset, ..)`
r? ```@oli-obk``` or ```@tmiasko``` or ```@JakobDegen```
Improve the floating point parser in dec2flt.
Greetings everyone,
I've benn studying the rust floating point parser recently and made the following tweaks:
* Remove all remaining traces of `unsafe`. The parser is now 100% safe Rust.
* The trick in which eight digits are processed in parallel is now in a loop.
* Parsing of inf/NaN values has been reworked.
On my system, the changes result in performance improvements for some input values.
To avoid link time dependency between core and compiler-builtins, when
using opt-level that implicitly enables -Zshare-generics.
While compiler-builtins should be compiled with -Zshare-generics
disabled, the -Zbuild-std does not ensure this at the moment.
Change core::char::{EscapeUnicode, EscapeDefault and EscapeDebug}
structures from using a state machine to computing escaped sequence
upfront and during iteration just going through the characters.
This is arguably simpler since it’s easier to think about having
a buffer and start..end range to iterate over rather than thinking
about a state machine.
This also harmonises implementation of aforementioned iterators and
core::ascii::EscapeDefault struct. This is done by introducing a new
helper EscapeIterInner struct which holds the buffer and offers simple
methods for iterating over range.
As a side effect, this probably optimises Display implementation for
those types since rather than calling write_char repeatedly, write_str
is invoked once. On 64-bit platforms, it also reduces size of some of
the structs:
| Struct | Before | After |
|----------------------------+--------+-------+
| core::char::EscapeUnicode | 16 | 12 |
| core::char::EscapeDefault | 16 | 12 |
| core::char::EscapeDebug | 16 | 16 |
My ulterior motive and reason why I started looking into this is
addition of as_str method to the iterators. With this change this
will became trivial. It’s also going to be trivial to implement
DoubleEndedIterator if that’s ever desired.
Improve grammar of Iterator.partition_in_place
This is my first PR against Rust, please let me know if there's anything I should be providing here! I didn't find any instructions specific to documentation grammar in the [std-dev guide](https://std-dev-guide.rust-lang.org/documentation/summary.html).
Optimize `LazyCell` size
`LazyCell` can only store either the initializing function or the data it produces, so it does not need to reserve the space for both. Similar to #107329, but uses an `enum` instead of a `union`.
Move `doc(primitive)` future incompat warning to `invalid_doc_attributes`
Fixes#88070.
It's been a while since this was turned into a "future incompatible lint" so I think we can now turn it into a hard error without problem.
r? `@jyn514`
Insert alignment checks for pointer dereferences when debug assertions are enabled
Closes https://github.com/rust-lang/rust/issues/54915
- [x] Jake tells me this sounds like a place to use `MirPatch`, but I can't figure out how to insert a new basic block with a new terminator in the middle of an existing basic block, using `MirPatch`. (if nobody else backs up this point I'm checking this as "not actually a good idea" because the code looks pretty clean to me after rearranging it a bit)
- [x] Using `CastKind::PointerExposeAddress` is definitely wrong, we don't want to expose. Calling a function to get the pointer address seems quite excessive. ~I'll see if I can add a new `CastKind`.~ `CastKind::Transmute` to the rescue!
- [x] Implement a more helpful panic message like slice bounds checking.
r? `@oli-obk`
Rollup of 7 pull requests
Successful merges:
- #106985 (Enhanced doucmentation of binary search methods for `slice` and `VecDeque` for unsorted instances)
- #109509 (compiletest: Don't allow tests with overlapping prefix names)
- #109719 (RELEASES: Add "Only support Android NDK 25 or newer" to 1.68.0)
- #109748 (Don't ICE on `DiscriminantKind` projection in new solver)
- #109749 (Canonicalize float var as float in new solver)
- #109761 (Drop binutils on powerpc-unknown-freebsd)
- #109766 (Fix title for openharmony.md)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Enhanced doucmentation of binary search methods for `slice` and `VecDeque` for unsorted instances
Fixes#106746. Issue #106746 raises the concern that the binary search methods for slices and deques aren't explicit enough about the fact that they are only applicable to sorted slices/deques. I changed the explanation for these methods. I took the relatively harsh description of the behaviour of binary search on unsorted collections ("unspecified and meaningless") from the description of the [`partition_point`](https://doc.rust-lang.org/std/primitive.slice.html#method.partition_point) method:
> If this slice is not partitioned, the returned result is unspecified and meaningless, as this method performs a kind of binary search.
Rollup of 8 pull requests
Successful merges:
- #91793 (socket ancillary data implementation for FreeBSD (from 13 and above).)
- #92284 (Change advance(_back)_by to return the remainder instead of the number of processed elements)
- #102472 (stop special-casing `'static` in evaluation)
- #108480 (Use Rayon's TLV directly)
- #109321 (Erase impl regions when checking for impossible to eagerly monomorphize items)
- #109470 (Correctly substitute GAT's type used in `normalize_param_env` in `check_type_bounds`)
- #109562 (Update ar_archive_writer to 0.1.3)
- #109629 (remove obsolete `givens` from regionck)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Add a builtin `FnPtr` trait that is implemented for all function pointers
r? `@ghost`
Rebased version of https://github.com/rust-lang/rust/pull/99531 (plus adjustments mentioned in the PR).
If perf is happy with this version, I would like to land it, even if the diagnostics fix in 9df8e1befb5031a5bf9d8dfe25170620642d3c59 only works for `FnPtr` specifically, and does not generally improve blanket impls.
Change advance(_back)_by to return the remainder instead of the number of processed elements
When advance_by can't advance the iterator by the number of requested elements it now returns the amount by which it couldn't be advanced instead of the amount by which it did.
This simplifies adapters like chain, flatten or cycle because the remainder doesn't have to be calculated as the difference between requested steps and completed steps anymore.
Additionally switching from `Result<(), usize>` to `Result<(), NonZeroUsize>` reduces the size of the result and makes converting from/to a usize representing the number of remaining steps cheap.
Add `#[inline]` to CStr trait implementations
Fixes#109674
I noticed other usages of traits on `CStr` weren't being inlined, so also added hints to the other implementations
A successful advance is now signalled by returning `0` and other values now represent the remaining number
of steps that couldn't be advanced as opposed to the amount of steps that have been advanced during a partial advance_by.
This simplifies adapters a bit, replacing some `match`/`if` with arithmetic. Whether this is beneficial overall depends
on whether `advance_by` is mostly used as a building-block for other iterator methods and adapters or whether
we also see uses by users where `Result` might be more useful.
Stabilize `nonnull_slice_from_raw_parts`
FCP is done: https://github.com/rust-lang/rust/issues/71941#issuecomment-1100910416
Note that this doesn't const-stabilize `NonNull::slice_from_raw_parts` as `slice_from_raw_parts_mut` isn't const-stabilized yet. Given #67456 and #57349, it's not likely available soon, meanwhile, stabilizing only the feature makes some sense, I think.
Closes#71941
Add #[inline] to as_deref
While working on https://github.com/rust-lang/rust/pull/109247 I found an `as_deref` call in the compiler that should have been inlined. This fixes the missing inlining (but doesn't address the perf issues I was chasing).
r? `@thomcc`
Clarify that copied allocators must behave the same
Currently, the safety documentation for `Allocator` says that a cloned or moved allocator must behave the same as the original. However, it does not specify that a copied allocator must behave the same, and it's possible to construct an allocator that permits being moved or cloned, but sometimes produces a new allocator when copied.
<details>
<summary>Contrived example which results in a Miri error</summary>
```rust
#![feature(allocator_api, once_cell, strict_provenance)]
use std::{
alloc::{AllocError, Allocator, Global, Layout},
collections::HashMap,
hint,
marker::PhantomPinned,
num::NonZeroUsize,
pin::Pin,
ptr::{addr_of, NonNull},
sync::{LazyLock, Mutex},
};
mod source_allocator {
use super::*;
// `SourceAllocator` has 3 states:
// - invalid value: is_cloned == false, source != self.addr()
// - source value: is_cloned == false, source == self.addr()
// - cloned value: is_cloned == true
pub struct SourceAllocator {
is_cloned: bool,
source: usize,
_pin: PhantomPinned,
}
impl SourceAllocator {
// Returns a pinned source value (pointing to itself).
pub fn new_source() -> Pin<Box<Self>> {
let mut b = Box::new(Self {
is_cloned: false,
source: 0,
_pin: PhantomPinned,
});
b.source = b.addr();
Box::into_pin(b)
}
fn addr(&self) -> usize {
addr_of!(*self).addr()
}
// Invalid values point to source 0.
// Source values point to themselves.
// Cloned values point to their corresponding source.
fn source(&self) -> usize {
if self.is_cloned || self.addr() == self.source {
self.source
} else {
0
}
}
}
// Copying an invalid value produces an invalid value.
// Copying a source value produces an invalid value.
// Copying a cloned value produces a cloned value with the same source.
impl Copy for SourceAllocator {}
// Cloning an invalid value produces an invalid value.
// Cloning a source value produces a cloned value with that source.
// Cloning a cloned value produces a cloned value with the same source.
impl Clone for SourceAllocator {
fn clone(&self) -> Self {
if self.is_cloned || self.addr() != self.source {
*self
} else {
Self {
is_cloned: true,
source: self.source,
_pin: PhantomPinned,
}
}
}
}
static SOURCE_MAP: LazyLock<Mutex<HashMap<NonZeroUsize, usize>>> =
LazyLock::new(Default::default);
// SAFETY: Wraps `Global`'s methods with additional tracking.
// All invalid values share blocks with each other.
// Each source value shares blocks with all cloned values pointing to it.
// Cloning an allocator always produces a compatible allocator:
// - Cloning an invalid value produces another invalid value.
// - Cloning a source value produces a cloned value pointing to it.
// - Cloning a cloned value produces another cloned value with the same source.
// Moving an allocator always produces a compatible allocator:
// - Invalid values remain invalid when moved.
// - Source values cannot be moved, since they are always pinned to the heap.
// - Cloned values keep the same source when moved.
unsafe impl Allocator for SourceAllocator {
fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
let mut map = SOURCE_MAP.lock().unwrap();
let block = Global.allocate(layout)?;
let block_addr = block.cast::<u8>().addr();
map.insert(block_addr, self.source());
Ok(block)
}
unsafe fn deallocate(&self, block: NonNull<u8>, layout: Layout) {
let mut map = SOURCE_MAP.lock().unwrap();
let block_addr = block.addr();
// SAFETY: `block` came from an allocator that shares blocks with this allocator.
if map.remove(&block_addr) != Some(self.source()) {
hint::unreachable_unchecked()
}
Global.deallocate(block, layout)
}
}
}
use source_allocator::SourceAllocator;
// SAFETY: `alloc1` and `alloc2` must share blocks.
unsafe fn test_same(alloc1: &SourceAllocator, alloc2: &SourceAllocator) {
let ptr = alloc1.allocate(Layout:🆕:<i32>()).unwrap();
alloc2.deallocate(ptr.cast(), Layout:🆕:<i32>());
}
fn main() {
let orig = &*SourceAllocator::new_source();
let orig_cloned1 = &orig.clone();
let orig_cloned2 = &orig.clone();
let copied = &{ *orig };
let copied_cloned1 = &copied.clone();
let copied_cloned2 = &copied.clone();
unsafe {
test_same(orig, orig_cloned1);
test_same(orig_cloned1, orig_cloned2);
test_same(copied, copied_cloned1);
test_same(copied_cloned1, copied_cloned2);
test_same(orig, copied); // error
}
}
```
</details>
This could result in issues in the future for algorithms that specialize on `Copy` types. Right now, nothing in the standard library that depends on `Allocator + Clone` is susceptible to this issue, but I still think it would make sense to specify that copying an allocator is always as valid as cloning it.
Implement Default for some alloc/core iterators
Add `Default` impls to the following collection iterators:
* slice::{Iter, IterMut}
* binary_heap::IntoIter
* btree::map::{Iter, IterMut, Keys, Values, Range, IntoIter, IntoKeys, IntoValues}
* btree::set::{Iter, IntoIter, Range}
* linked_list::IntoIter
* vec::IntoIter
and these adapters:
* adapters::{Chain, Cloned, Copied, Rev, Enumerate, Flatten, Fuse, Rev}
For iterators which are generic over allocators it only implements it for the global allocator because we can't conjure an allocator from nothing or would have to turn the allocator field into an `Option` just for this change.
These changes will be insta-stable.
ACP: https://github.com/rust-lang/libs-team/issues/77
Add #[inline] to the Into for From impl
I was skimming through the standard library MIR and I noticed a handful of very suspicious `Into::into` calls in `alloc`. ~Since this is a trivial wrapper function, `#[inline(always)]` seems appropriate.;~ `#[inline]` works too and is a lot less spooky.
r? `@thomcc`
Shrink unicode case-mapping LUTs by 24k
I was looking into the binary bloat of a small program using `str::to_lowercase` and `str::to_uppercase`, and noticed that the lookup tables used for case mapping had a lot of zero-bytes in them. The reason for this is that since some characters map to up to three other characters when lower or uppercased, the LUTs store a `[char; 3]` for each character. However, the vast majority of cases only map to a single new character, in other words most of the entries are e.g. `(lowerc, [upperc, '\0', '\0'])`.
This PR introduces a new encoding scheme for these tables.
The changes reduces the size of my test binary by about 24K.
I've also done some `#[bench]`marks on unicode-heavy test data, and found that the performance of both `str::to_lowercase` and `str::to_uppercase` improves by up to 20%. These measurements are obviously very dependent on the character distribution of the data.
Someone else will have to decide whether this more complex scheme is worth it or not, I was just goofing around a bit and here's what came out of it 🤷♂️ No hard feelings if this isn't wanted!
panic_immediate_abort requires abort as a panic strategy
Guide `panic_immediate_abort` users away from `-Cpanic=unwind` and towards `-Cpanic=abort` to avoid an accidental use of the feature with the unwind strategy, e.g., on a targets where unwind is the default.
The `-Cpanic=unwind` combination doesn't offer the same benefits, since the code would still be generated under the assumption that functions implemented in Rust can unwind.
Updates `interpret`, `codegen_ssa`, and `codegen_cranelift` to consume the new cast instead of the intrinsic.
Includes `CastTransmute` for custom MIR building, to be able to test the extra UB.
Custom MIR: Allow optional RET type annotation
This currently doesn't compile because the type of `RET` is inferred, which fails if RET is a composite type and fields are initialised separately.
```rust
#![feature(custom_mir, core_intrinsics)]
extern crate core;
use core::intrinsics::mir::*;
#[custom_mir(dialect = "runtime", phase = "optimized")]
fn fn0() -> (i32, bool) {
mir! ({
RET.0 = 0;
RET.1 = true;
Return()
})
}
```
```
error[E0282]: type annotations needed
--> src/lib.rs:8:9
|
8 | RET.0 = 0;
| ^^^ cannot infer type
For more information about this error, try `rustc --explain E0282`.
```
This PR allows the user to manually specify the return type with `type RET = ...;` if required:
```rust
#[custom_mir(dialect = "runtime", phase = "optimized")]
fn fn0() -> (i32, bool) {
mir! (
type RET = (i32, bool);
{
RET.0 = 0;
RET.1 = true;
Return()
}
)
}
```
The syntax is not optimal, I'm happy to see other suggestions. Ideally I wanted it to be a normal type annotation like `let RET: ...;`, but this runs into the multiple parsing options error during macro expansion, as it can be parsed as a normal `let` declaration as well.
r? ```@oli-obk``` or ```@tmiasko``` or ```@JakobDegen```
move Option::as_slice to intrinsic
````@scottmcm```` suggested on #109095 I use a direct approach of unpacking the operation in MIR lowering, so here's the implementation.
cc ````@nikic```` as this should hopefully unblock #107224 (though perhaps other changes to the prior implementation, which I left for bootstrapping, are needed).
Document `Iterator::sum/product` for Option/Result
Closes#105266
We already document the similar behavior for `collect()` so I believe it makes sense to add this too. The Option/Result implementations *are* documented on their respective pages and the page for `Sum`, but buried amongst many other trait impls which doesn't make it very discoverable.
`````@rustbot````` label +A-docs
Add inlining annotations in `dec2flt`.
Currently, the combination of `dec2flt` being generic and the `FromStr` implementaions
containing inline anttributes causes massive amounts of assembly to be generated whenever
these implementation are used. In addition, the assembly has calls to function which ought to
be inlined, but they are not (even when using lto).
This Pr fixes this.
Improve `Iterator::collect_into` documentation
This improves the examples in the documentation of `Iterator::collect_into`, replacing the usages of `println!` with `assert_eq!` as suggested on [IRLO](https://internals.rust-lang.org/t/18534/9).
Beautify pin! docs
This makes pin docs a little bit less jargon-y and easier to read, by
* splitting up the sentences
* making them less interrupted by punctuation
* turning the footnotes into paragraphs, as they contain useful information that shouldn't be hidden in footnotes. Footnotes also interrupt the read flow.
Use `size_of_val` instead of manual calculation
Very minor thing that I happened to notice in passing, but it's both shorter and [means it gets `mul nsw`](https://rust.godbolt.org/z/Y9KxYETv5), so why not.
The indices are encoded as `u32`s in the range of invalid `char`s, so
that we know that if any mapping fails to parse as a `char` we should
use the value for lookup in the multi-table.
This avoids the second binary search in cases where a multi-`char`
mapping is needed.
Idea from @nikic
This makes pin docs a little bit less jargon-y and easier to read, by
* splitting up the sentences
* making them less interrupted by punctuation
* turning the footnotes into paragraphs, as they contain useful information
that shouldn't be hidden in footnotes. Footnotes also interrupt the read flow.
* other improvements and simplifications
Flatten/inline format_args!() and (string and int) literal arguments into format_args!()
Implements https://github.com/rust-lang/rust/issues/78356
Gated behind `-Zflatten-format-args=yes`.
Part of #99012
This change inlines string literals, integer literals and nested format_args!() into format_args!() during ast lowering, making all of the following pairs result in equivalent hir:
```rust
println!("Hello, {}!", "World");
println!("Hello, World!");
```
```rust
println!("[info] {}", format_args!("error"));
println!("[info] error");
```
```rust
println!("[{}] {}", status, format_args!("error: {}", msg));
println!("[{}] error: {}", status, msg);
```
```rust
println!("{} + {} = {}", 1, 2, 1 + 2);
println!("1 + 2 = {}", 1 + 2);
```
And so on.
This is useful for macros. E.g. a `log::info!()` macro could just pass the tokens from the user directly into a `format_args!()` that gets efficiently flattened/inlined into a `format_args!("info: {}")`.
It also means that `dbg!(x)` will have its file, line, and expression name inlined:
```rust
eprintln!("[{}:{}] {} = {:#?}", file!(), line!(), stringify!(x), x); // before
eprintln!("[example.rs:1] x = {:#?}", x); // after
```
Which can be nice in some cases, but also means a lot more unique static strings than before if dbg!() is used a lot.
The majority of char case replacements are single char replacements,
so storing them as [char; 3] wastes a lot of space.
This commit splits the replacement tables for both `to_lower` and
`to_upper` into two separate tables, one with single-character mappings
and one with multi-character mappings.
This reduces the binary size for programs using all of these tables
with roughly 24K bytes.
Since ascii chars are already handled by a special case in the
`to_lower` and `to_upper` functions, there's no need to waste space on
them in the LUTs.
Ensure `ptr::read` gets all the same LLVM `load` metadata that dereferencing does
I was looking into `array::IntoIter` optimization, and noticed that it wasn't annotating the loads with `noundef` for simple things like `array::IntoIter<i32, N>`. Trying to narrow it down, it seems that was because `MaybeUninit::assume_init_read` isn't marking the load as initialized (<https://rust.godbolt.org/z/Mxd8TPTnv>), which is unfortunate since that's basically its reason to exist.
The root cause is that `ptr::read` is currently implemented via the *untyped* `copy_nonoverlapping`, and thus the `load` doesn't get any type-aware metadata: no `noundef`, no `!range`. This PR solves that by lowering `ptr::read(p)` to `copy *p` in MIR, for which the backends already do the right thing.
Fortuitiously, this also improves the IR we give to LLVM for things like `mem::replace`, and fixes a couple of long-standing bugs where `ptr::read` on `Copy` types was worse than `*`ing them.
Zulip conversation: <https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Move.20array.3A.3AIntoIter.20to.20ManuallyDrop/near/341189936>
cc `@erikdesjardins` `@JakobDegen` `@workingjubilee` `@the8472`
Fixes#106369Fixes#73258
Remove `identity_future` indirection
This was previously needed because the indirection used to hide some unexplained lifetime errors, which it turned out were related to the `min_choice` algorithm.
Removing the indirection also solves a couple of cycle errors, large moves and makes async blocks support the `#[track_caller]`annotation.
Fixes https://github.com/rust-lang/rust/issues/104826.
Stabilize `atomic_as_ptr`
Fixes#66893
This stabilizes the `as_ptr` methods for atomics. The stabilization feature gate used here is `atomic_as_ptr` which supersedes `atomic_mut_ptr` to match the change in https://github.com/rust-lang/rust/pull/107736.
This needs FCP.
New stable API:
```rust
impl AtomicBool {
pub const fn as_ptr(&self) -> *mut bool;
}
impl AtomicI32 {
pub const fn as_ptr(&self) -> *mut i32;
}
// Includes all other atomic types
impl<T> AtomicPtr<T> {
pub const fn as_ptr(&self) -> *mut *mut T;
}
```
r? libs-api
``@rustbot`` label +needs-fcp
Move `Option::as_slice` to an always-sound implementation
This approach depends on CSE to not have any branches or selects when the guessed offset is correct -- which it always will be right now -- but to also be *sound* (just less efficient) if the layout algorithms change such that the guess is incorrect.
The codegen test confirms that CSE handles this as expected, leaving the optimal codegen.
cc JakobDegen #108545
This approach depends on CSE to not have any branches or selects when the guessed offset is correct -- which it always will be right now -- but to also be *sound* (just less efficient) if the layout algorithms change such that the guess is incorrect.
I was looking into `array::IntoIter` optimization, and noticed that it wasn't annotating the loads with `noundef` for simple things like `array::IntoIter<i32, N>`.
Turned out to be a more general problem as `MaybeUninit::assume_init_read` isn't marking the load as initialized (<https://rust.godbolt.org/z/Mxd8TPTnv>), which is unfortunate since that's basically its reason to exist.
This PR lowers `ptr::read(p)` to `copy *p` in MIR, which fortuitiously also improves the IR we give to LLVM for things like `mem::replace`.
Stabilize `nonzero_min_max`
## Overall
Stabilizes `nonzero_min_max` to allow the "infallible" construction of ordinary minimum and maximum `NonZero*` instances.
The feature is fairly straightforward and already matured for some time in stable toolchains.
```rust
let _ = NonZeroU8::MIN;
let _ = NonZeroI32::MAX;
```
## History
* On 2022-01-25, implementation was [created](https://github.com/rust-lang/rust/pull/93293).
## Considerations
* This report is fruit of the inanition observed after two unsuccessful attempts at getting feedback.
* Other constant variants discussed at https://github.com/rust-lang/rust/issues/89065#issuecomment-923238190 are orthogonal to this feature.
Fixes https://github.com/rust-lang/rust/issues/89065
Fix the docs for pointer method with_metadata_of
The name of the argument to `{*const T, *mut T}::with_metadata_of` was changed from `val` to `meta` recently, but the docs weren't updated to match.
Relevant pull request: #103701
Rollup of 8 pull requests
Successful merges:
- #108754 (Retry `pred_known_to_hold_modulo_regions` with fulfillment if ambiguous)
- #108759 (1.41.1 supported 32-bit Apple targets)
- #108839 (Canonicalize root var when making response from new solver)
- #108856 (Remove DropAndReplace terminator)
- #108882 (Tweak E0740)
- #108898 (Set `LIBC_CHECK_CFG=1` when building Rust code in bootstrap)
- #108911 (Improve rustdoc-gui/tester.js code a bit)
- #108916 (Remove an unused return value in `rustc_hir_typeck`)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Guide `panic_immediate_abort` users away from `-Cpanic=unwind` and
towards `-Cpanic=abort` to avoid an accidental use of the feature with
the unwind strategy, e.g., on a targets where unwind is the default.
The `-Cpanic=unwind` combination doesn't offer the same benefits, since
the code would still be generated under the assumption that functions
implemented in Rust can unwind.
This was previously needed because the indirection used to hide some unexplained lifetime errors, which it turned out were related to the `min_choice` algorithm.
Removing the indirection also solves a couple of cycle errors, large moves and makes async blocks support the `#[track_caller]` annotation.
Use `nuw` when calculating slice lengths from `Range`s
An `assume` would definitely not be worth it, but since the flag is almost free we might as well tell LLVM this, especially on `_unchecked` calls where there's no obvious way for it to deduce it.
(Today neither safe nor unsafe indexing gets it: <https://rust.godbolt.org/z/G1jYT548s>)
Add `round_ties_even` to `f32` and `f64`
Tracking issue: #96710
Redux of #82273. See also #55107
Adds a new method, `round_ties_even`, to `f32` and `f64`, that rounds the float to the nearest integer , rounding halfway cases to the number with an even least significant bit. Uses the `roundeven` LLVM intrinsic to do this.
Of the five IEEE 754 rounding modes, this is the only one that doesn't already have a round-to-integer function exposed by Rust (others are `round`, `floor`, `ceil`, and `trunc`). Ties-to-even is also the rounding mode used for int-to-float and float-to-float `as` casts, as well as float arithmentic operations. So not having an explicit rounding method for it seems like an oversight.
Bikeshed: this PR currently uses `round_ties_even` for the name of the method. But maybe `round_ties_to_even` is better, or `round_even`, or `round_to_even`?
An `assume` would definitely not be worth it, but since the flag is almost free we might as well tell LLVM this, especially on `_unchecked` calls where there's no obvious way for it to deduce it.
(Today neither safe nor unsafe indexing gets it: <https://rust.godbolt.org/z/G1jYT548s>)
Use `partial_cmp` to implement tuple `lt`/`le`/`ge`/`gt`
In today's implementation, `(A, B)::gt` contains calls to *both* `A::eq` *and* `A::gt`.
That's fine for primitives, but for things like `String`s it's kinda weird -- `(String, usize)::gt` has a call to both `bcmp` and `memcmp` (<https://rust.godbolt.org/z/7jbbPMesf>) because when `bcmp` says the `String`s aren't equal, it turns around and calls `memcmp` to find out which one's bigger.
This PR changes the implementation to instead implement `(A, …, C, Z)::gt` using `A::partial_cmp`, `…::partial_cmp`, `C::partial_cmp`, and `Z::gt`. (And analogously for `lt`, `le`, and `ge`.) That way expensive comparisons don't need to be repeated.
Technically this is an observable change on stable, so I've marked it `needs-fcp` + `T-libs-api` and will
r? rust-lang/libs-api
I'm hoping that this will be non-controversial, however, since it's very similar to the observable changes that were made to the derives (#81384#98655) -- like those, this only changes behaviour if a type overrode behaviour in a way inconsistent with the rules for the various traits involved.
(The first commit here is #108156, adding the codegen test, which I used to make sure this doesn't regress behaviour for primitives.)
Zulip conversation about this change: <https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/.60.3E.60.20on.20Tuples/near/328392927>.
Match unmatched backticks in library/
Found with GNU grep:
```
grep -rEn '^(([^`]*`){2})*[^`]*`[^`]*$' library/ | rg -v '\s*[//]?.{1,2}```'
```
split out from #108685 as per advice.
Add `Atomic*::from_ptr`
This PR adds functions in the following form to all atomic types:
```rust
impl AtomicT {
pub const unsafe fn from_ptr<'a>(ptr: *mut T) -> &'a AtomicT;
}
```
r? `@m-ou-se` (we've talked about it before)
I'm not sure about docs & safety requirements, I'd appreciate some feedback on them.
Add support for QNX Neutrino to standard library
This change:
- adds standard library support for QNX Neutrino (7.1).
- upgrades `libc` to version `0.2.139` which supports QNX Neutrino
`@gh-tr`
⚠️ Backtraces on QNX require https://github.com/rust-lang/backtrace-rs/pull/507 which is not yet merged! (But everything else works without these changes) ⚠️
Tested mainly with a x86_64 virtual machine (see qnx-nto.md) and partially with an aarch64 hardware (some tests fail due to constrained resources).
Merge two different equality specialization traits in `core`
Arrays and slices each had their own version of this, without a matching set of `impl`s.
Merge them into one (still-`pub(crate)`) `cmp::BytewiseEq` trait, so we can stop doing all these things twice.
And that means that the `[T]::eq` → `memcmp` specialization picks up a bunch of types where that previously only worked for arrays, so examples like <https://rust.godbolt.org/z/KjsG8MGGT> will use it now instead of emitting loops.
r? the8472
Add `Option::as_`(`mut_`)`slice`
This adds the following functions:
* `Option<T>::as_slice(&self) -> &[T]`
* `Option<T>::as_mut_slice(&mut self) -> &[T]`
The `as_slice` and `as_mut_slice_mut` functions benefit from an optimization that makes them completely branch-free. ~~Unfortunately, this optimization is not available on by-value Options, therefore the `into_slice` implementations use the plain `match` + `slice::from_ref` approach.~~
Note that the optimization's soundness hinges on the fact that either the niche optimization makes the offset of the `Some(_)` contents zero or the mempory layout of `Option<T>` is equal to that of `Option<MaybeUninit<T>>`.
The idea has been discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Option.3A.3Aas_slice). Notably the idea for the `as_slice_mut` and `into_slice´ methods came from `@cuviper` and `@Sp00ph` hardened the optimization against niche-optimized Options.
The [rust playground](https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=74f8e4239a19f454c183aaf7b4a969e0) shows that the generated assembly of the optimized method is basically only a copy while the naive method generates code containing a `test dx, dx` on x86_64.
---
EDIT from reviewer: ACP is https://github.com/rust-lang/libs-team/issues/150
add missing feature in core/tests
https://github.com/rust-lang/rust/pull/104265 introduced the `ip_in_core` feature. For some reason core tests seem to still build without that feature -- no idea how that is possible. Might be related to https://github.com/rust-lang/rust/issues/15702? I was under the impression that `pub use` with different stability doesn't actually work. That's why `intrinsics::transmute` is stable, for example.
Either way, core tests fail to build in miri-test-libstd, and adding the feature fixes that.
r? ```@thomcc```
This adds the following functions:
* `Option<T>::as_slice(&self) -> &[T]`
* `Option<T>::as_slice_mut(&mut self) -> &[T]`
The `as_slice` and `as_slice_mut` functions benefit from an
optimization that makes them completely branch-free.
Note that the optimization's soundness hinges on the fact that either
the niche optimization makes the offset of the `Some(_)` contents zero
or the mempory layout of `Option<T>` is equal to that of
`Option<MaybeUninit<T>>`.
Inline `Poll` methods
With `opt-level="z"`, the `Poll::map*` methods are sometimes not inlined (see <https://godbolt.org/z/ca5ajKTEK>). This PR adds `#[inline]` to these methods. I have a project that can benefit from this change, but do we want to enable this behavior universally?
Fixes#101080.
Stabilize `#![feature(target_feature_11)]`
## Stabilization report
### Summary
Allows for safe functions to be marked with `#[target_feature]` attributes.
Functions marked with `#[target_feature]` are generally considered as unsafe functions: they are unsafe to call, cannot be assigned to safe function pointers, and don't implement the `Fn*` traits.
However, calling them from other `#[target_feature]` functions with a superset of features is safe.
```rust
// Demonstration function
#[target_feature(enable = "avx2")]
fn avx2() {}
fn foo() {
// Calling `avx2` here is unsafe, as we must ensure
// that AVX is available first.
unsafe {
avx2();
}
}
#[target_feature(enable = "avx2")]
fn bar() {
// Calling `avx2` here is safe.
avx2();
}
```
### Test cases
Tests for this feature can be found in [`src/test/ui/rfcs/rfc-2396-target_feature-11/`](b67ba9ba20/src/test/ui/rfcs/rfc-2396-target_feature-11/).
### Edge cases
- https://github.com/rust-lang/rust/issues/73631
Closures defined inside functions marked with `#[target_feature]` inherit the target features of their parent function. They can still be assigned to safe function pointers and implement the appropriate `Fn*` traits.
```rust
#[target_feature(enable = "avx2")]
fn qux() {
let my_closure = || avx2(); // this call to `avx2` is safe
let f: fn() = my_closure;
}
```
This means that in order to call a function with `#[target_feature]`, you must show that the target-feature is available while the function executes *and* for as long as whatever may escape from that function lives.
### Documentation
- Reference: https://github.com/rust-lang/reference/pull/1181
---
cc tracking issue #69098
r? `@ghost`
Move IpAddr, SocketAddr and V4+V6 related types to `core`
Implements RFC https://github.com/rust-lang/rfcs/pull/2832. The RFC has completed FCP with disposition merge, but is not yet merged.
Moves IP types to `core` as specified in the RFC.
The full list of moved types is: `IpAddr`, `Ipv4Addr`, `Ipv6Addr`, `SocketAddr`, `SocketAddrV4`, `SocketAddrV6`, `Ipv6MulticastScope` and `AddrParseError`.
Doing this move was one of the main driving arguments behind #78802.
Remove `from` lang item
It was probably a leftover from the old `?` desugaring but anyways, it's unused now except for clippy, which can just use a diagnostics item.
Require `literal`s for some `(u)int_impl!` parameters
The point of these is to be seen *lexically* in the docs, so they should always be passed as the correct literal, not as an expression.
(Otherwise we could just compute `Min`/`Max` from `BITS`, for example.)
r? Nilstrieb
Optimize break patterns
Use `wyrand` instead of calling `XORSHIFT` 2 times in break patterns for the 64-bit platform. The new PRNG is 2x faster than the previous one.
Bench result(via https://gist.github.com/zhangyunhao116/11ef41a150f5c23bb47d86255fbeba89):
```
old time: [1.3258 ns 1.3262 ns 1.3266 ns]
change: [+0.5901% +0.6731% +0.7791%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
7 (7.00%) high mild
6 (6.00%) high severe
new time: [657.65 ps 657.89 ps 658.18 ps]
change: [-1.6910% -1.6110% -1.5256%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
```
implement const iterator using `rustc_do_not_const_check`
Previous experiment: #102225.
Explanation: rather than making all default methods work under `const` all at once, this uses `rustc_do_not_const_check` as a workaround to "trick" the compiler to not run any checks on those other default methods. Any const implementations are only required to implement the `next` method. Any actual calls to the trait methods other than `next` will either error in compile time (at CTFE runs), or run the methods correctly if they do not have any non-const operations. This is extremely easy to maintain, remove, or improve.
The point of these is to be seen lexically in the docs, so they should always be passed as the correct literal, not as an expression.
(Otherwise we could just compute `Min`/`Max` from `BITS`, for example.)
Rename atomic 'as_mut_ptr' to 'as_ptr' to match Cell (ref #66893)
Originally discussed in https://github.com/rust-lang/rust/issues/66893#issuecomment-1419198623
~~This uses #107706 as a base to avoid a merge conflict once that gets rolled up (so disregard const changes in the diff until it does)~~ all merged & rebased
`@rustbot` label +T-libs-api
r? m-ou-se
Document that CStr::as_ptr returns a type alias
Rustdoc resolves type aliases too eagerly #15823 which makes the [std re-export](https://doc.rust-lang.org/stable/std/ffi/struct.CStr.html#method.as_ptr) of `CStr::as_ptr` show `i8` instead of `c_char`. To work around this I've added info about `c_char` in the method's description.
BTW, I've also added a comment to what-not-to-do example in case someone copypasted it without reading the surrounding text.
Improve the `array::map` codegen
The `map` method on arrays [is documented as sometimes performing poorly](https://doc.rust-lang.org/std/primitive.array.html#note-on-performance-and-stack-usage), and after [a question on URLO](https://users.rust-lang.org/t/try-trait-residual-o-trait-and-try-collect-into-array/88510?u=scottmcm) prompted me to take another look at the core [`try_collect_into_array`](7c46fb2111/library/core/src/array/mod.rs (L865-L912)) function, I had some ideas that ended up working better than I'd expected.
There's three main ideas in here, split over three commits:
1. Don't use `array::IntoIter` when we can avoid it, since that seems to not get SRoA'd, meaning that every step writes things like loop counters into the stack unnecessarily
2. Don't return arrays in `Result`s unnecessarily, as that doesn't seem to optimize away even with `unwrap_unchecked` (perhaps because it needs to get moved into a new LLVM type to account for the discriminant)
3. Don't distract LLVM with all the `Option` dances when we know for sure we have enough items (like in `map` and `zip`). This one's a larger commit as to do it I ended up adding a new `pub(crate)` trait, but hopefully those changes are still straight-forward.
(No libs-api changes; everything should be completely implementation-detail-internal.)
It's still not completely fixed -- I think it needs pcwalton's `memcpy` optimizations still (#103830) to get further -- but this seems to go much better than before. And the remaining `memcpy`s are just `transmute`-equivalent (`[T; N] -> ManuallyDrop<[T; N]>` and `[MaybeUninit<T>; N] -> [T; N]`), so hopefully those will be easier to remove with LLVM16 than the previous subobject copies 🤞
r? `@thomcc`
As a simple example, this test
```rust
pub fn long_integer_map(x: [u32; 64]) -> [u32; 64] {
x.map(|x| 13 * x + 7)
}
```
On nightly <https://rust.godbolt.org/z/xK7548TGj> takes `sub rsp, 808`
```llvm
start:
%array.i.i.i.i = alloca [64 x i32], align 4
%_3.sroa.5.i.i.i = alloca [65 x i32], align 4
%_5.i = alloca %"core::iter::adapters::map::Map<core::array::iter::IntoIter<u32, 64>, [closure@/app/example.rs:2:11: 2:14]>", align 8
```
(and yes, that's a 6**5**-element array `alloca` despite 6**4**-element input and output)
But with this PR it's only `sub rsp, 520`
```llvm
start:
%array.i.i.i.i.i.i = alloca [64 x i32], align 4
%array1.i.i.i = alloca %"core::mem::manually_drop::ManuallyDrop<[u32; 64]>", align 4
```
Similarly, the loop it emits on nightly is scalar-only and horrifying
```nasm
.LBB0_1:
mov esi, 64
mov edi, 0
cmp rdx, 64
je .LBB0_3
lea rsi, [rdx + 1]
mov qword ptr [rsp + 784], rsi
mov r8d, dword ptr [rsp + 4*rdx + 528]
mov edi, 1
lea edx, [r8 + 2*r8]
lea r8d, [r8 + 4*rdx]
add r8d, 7
.LBB0_3:
test edi, edi
je .LBB0_11
mov dword ptr [rsp + 4*rcx + 272], r8d
cmp rsi, 64
jne .LBB0_6
xor r8d, r8d
mov edx, 64
test r8d, r8d
jne .LBB0_8
jmp .LBB0_11
.LBB0_6:
lea rdx, [rsi + 1]
mov qword ptr [rsp + 784], rdx
mov edi, dword ptr [rsp + 4*rsi + 528]
mov r8d, 1
lea esi, [rdi + 2*rdi]
lea edi, [rdi + 4*rsi]
add edi, 7
test r8d, r8d
je .LBB0_11
.LBB0_8:
mov dword ptr [rsp + 4*rcx + 276], edi
add rcx, 2
cmp rcx, 64
jne .LBB0_1
```
whereas with this PR it's unrolled and vectorized
```nasm
vpmulld ymm1, ymm0, ymmword ptr [rsp + 64]
vpaddd ymm1, ymm1, ymm2
vmovdqu ymmword ptr [rsp + 328], ymm1
vpmulld ymm1, ymm0, ymmword ptr [rsp + 96]
vpaddd ymm1, ymm1, ymm2
vmovdqu ymmword ptr [rsp + 360], ymm1
```
(though sadly still stack-to-stack)
Rollup of 7 pull requests
Successful merges:
- #107654 (reword descriptions of the deprecated int modules)
- #107915 (Add `array::map` benchmarks)
- #107961 (Avoid copy-pasting the `ilog` panic string in a bunch of places)
- #107962 (Add a doc note about why `Chain` is not `ExactSizeIterator`)
- #107966 (Update browser-ui-test version to 0.14.3)
- #107970 (Hermit: Remove floor symbol)
- #107973 (Fix unintentional UB in SIMD tests)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Avoid copy-pasting the `ilog` panic string in a bunch of places
I also ended up changing the implementations to `if let` because it doesn't work to
```rust
self.checked_ilog2().unwrap_or_else(panic_for_nonpositive_argument)
```
due to the `!`. But as a bonus that meant I could remove the `rustc_allow_const_fn_unstable` too.
Add `array::map` benchmarks
Since there were no previous benchmarks for `array::map`, and it is known to have mediocre/poor performance, add some simple benchmarks. These benchmarks vary the length of the array and size of each item.
avoid mixing accesses of ptrs derived from a mutable ref and parent ptrs
``@Vanille-N`` is working on a successor for Stacked Borrows. It will mostly accept strictly more code than Stacked Borrows did, with one exception: the following pattern no longer works.
```rust
let mut root = 6u8;
let mref = &mut root;
let ptr = mref as *mut u8;
*ptr = 0; // Write
assert_eq!(root, 0); // Parent Read
*ptr = 0; // Attempted Write
```
This worked in Stacked Borrows kind of by accident: when doing the "parent read", under SB we Disable `mref`, but the raw ptrs derived from it remain usable. The fact that we can still use the "children" of a reference that is no longer usable is quite nasty and leads to some undesirable effects (in particular it is the major blocker for resolving https://github.com/rust-lang/unsafe-code-guidelines/issues/257). So in Tree Borrows we no longer do that; instead, reading from `root` makes `mref` and all its children read-only.
Due to other improvements in Tree Borrows, the entire Miri test suite still passes with this new behavior, and even the entire libcore and liballoc test suite, except for these 2 cases this PR fixes. Both of these involve code where the programmer wrote `&mut` but then used pointers derived from that reference in ways that alias with the parent pointer, which arguably is violating uniqueness. They are fixed by properly using raw pointers throughout.
Document `PointerLike`
I forgot to document this, and even though it's currently more of an implementation detail, the old doc was kinda embarrassing 😅
Use associated items of `char` instead of freestanding items in `core::char`
The associated functions and constants on `char` have been stable since 1.52 and the freestanding items have soft-deprecated since 1.62 (https://github.com/rust-lang/rust/pull/95566). This PR ~~marks them as "deprecated in future", similar to the integer and floating point modules (`core::{i32, f32}` etc)~~ replaces all uses of `core::char::*` with `char::*` to prepare for future deprecation of `core::char::*`.
Speedup heapsort by 1.5x by making it branchless
`slice::sort_unstable` will fall back to heapsort if it repeatedly fails to find a good pivot. By making the core child update code branchless it is much faster. On Zen3 sorting 10k `u64` and forcing the sort to pick heapsort, results in:
455us -> 278us
`slice::sort_unstable` will fall back to heapsort if it repeatedly fails to find
a good pivot. By making the core child update code branchless it is much faster.
On Zen3 sorting 10k `u64` and forcing the sort to pick heapsort, results in:
455us -> 278us
Stabilize feature `cstr_from_bytes_until_nul`
This PR seeks to stabilize `cstr_from_bytes_until_nul`.
Partially addresses #95027
This function has only been on nightly for about 10 months, but I think it is simple enough that there isn't harm discussing stabilization. It has also had at least a handful of mentions on both the user forum and the discord, so it seems like it's already in use or at least known.
This needs FCP still.
Comment on potential discussion points:
- eventual conversion of `CStr` to be a single thin pointer: this function will still be useful to provide a safe way to create a `CStr` after this change.
- should this return a length too, to address concerns about the `CStr` change? I don't see it as being particularly useful, and it seems less ergonomic (i.e. returning `Result<(&CStr, usize), FromBytesUntilNulError>`). I think users that also need this length without the additional `strlen` call are likely better off using a combination of other methods, but this is up for discussion
- `CString::from_vec_until_nul`: this is also useful, but it doesn't even have a nightly implementation merged yet. I propose feature gating that separately, as opposed to blocking this `CStr` implementation on that
Possible alternatives:
A user can use `from_bytes_with_nul` on a slice up to `my_slice[..my_slice.iter().find(|c| c == 0).unwrap()]`. However; that is significantly less ergonomic, and is a bit more work for the compiler to optimize compared the direct `memchr` call that this wraps.
## New stable API
```rs
// both in core::ffi
pub struct FromBytesUntilNulError(());
impl CStr {
pub const fn from_bytes_until_nul(
bytes: &[u8]
) -> Result<&CStr, FromBytesUntilNulError>
}
```
cc ```@ericseppanen``` original author, ```@Mark-Simulacrum``` original reviewer, ```@m-ou-se``` brought up some issues on the thin pointer CStr
```@rustbot``` modify labels: +T-libs-api +needs-fcp
Rename `PointerSized` to `PointerLike`
The old name was unnecessarily vague. This PR renames a nightly language feature that I added, so I don't think it needs any additional approval, though anyone can feel free to speak up if you dislike the rename.
It's still unsatisfying that we don't the user which of {size, alignment} is wrong, but this trait really is just a stepping stone for a more generalized mechanism to create `dyn*`, just meant for nightly testing, so I don't think it really deserves additional diagnostic machinery for now.
Fixes#107696, cc ``@RalfJung``
r? ``@eholk``
Mark 'atomic_mut_ptr' methods const
There's nothing that would block these methods from being const (just an UnsafeCell get), and it would be helpful for FFI interfaces in static contexts
Related tracking issue: #66893