Run most `core::num` tests in const context too
This adds some infrastructure for something I was going to use in #131566, but it felt worthwhile enough on its own to merge/discuss separately.
Essentially, right now we tend to rely on UI tests to ensure that things work in const context, rather than just using library tests. This uses a few simple macro tricks to make it *relatively* painless to execute tests in both runtime and compile-time context. And this only applies to the numeric tests, and not anything else.
Recommended to review without whitespace in the diff.
cc `@RalfJung`
As part of the "arbitrary self types v2" project, we are going to
replace the current `Receiver` trait with a new mechanism based on a
new, different `Receiver` trait.
This PR renames the old trait to get it out the way. Naming is hard.
Options considered included:
* HardCodedReceiver (because it should only be used for things in the
standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver
These are all bad names, but fortunately this will be temporary.
Assuming the new mechanism proceeds to stabilization as intended, the
legacy trait will be removed altogether.
Although we expect this trait to be used only in the standard library,
we suspect it may be in use elsehwere, so we're landing this change
separately to identify any surprising breakages.
It's known that this trait is used within the Rust for Linux project; a
patch is in progress to remove their dependency.
This is a part of the arbitrary self types v2 project,
https://github.com/rust-lang/rfcs/pull/3519https://github.com/rust-lang/rust/issues/44874
r? @wesleywiser
update ABI compatibility docs for new option-like rules
Documents the rules decided [here](https://github.com/rust-lang/rust/pull/130628#issuecomment-2402761599) for our ABI compatibility rules.
Long-term this should be moved to the reference, but for now this is what we got.
Cc `@rust-lang/lang` `@rust-lang/opsem`
Remove outdated documentation for `repeat_n`
After #106943, which made `Take<Repeat<I>>` implement `ExactSizeIterator`, part of documentation about difference from `repeat(x).take(n)` is no longer valid.
````@rustbot```` labels: +A-docs, +A-iterators
Avoid superfluous UB checks in `IndexRange`
`IndexRange::len` is justified as an overall invariant, and
`take_prefix` and `take_suffix` are justified by local branch
conditions. A few more UB-checked calls remain in cases that are only
supported locally by `debug_assert!`, which won't do anything in
distributed builds, so those UB checks may still be useful.
We generally expect core's `#![rustc_preserve_ub_checks]` to optimize
away in user's release builds, but the mere presence of that extra code
can sometimes inhibit optimization, as seen in #131563.
optimize str.replace
Adds a fast path for str.replace for the ascii to ascii case. This allows for autovectorizing the code. Also should this instead be done with specialization? This way we could remove one branch. I think it is the kind of branch that is easy to predict though.
Benchmark for the fast path (replace all "a" with "b" in the rust wikipedia article, using criterion) :
| N | Speedup | Time New (ns) | Time Old (ns) |
|----------|---------|---------------|---------------|
| 2 | 2.03 | 13.567 | 27.576 |
| 8 | 1.73 | 17.478 | 30.259 |
| 11 | 2.46 | 18.296 | 45.055 |
| 16 | 2.71 | 17.181 | 46.526 |
| 37 | 4.43 | 18.526 | 81.997 |
| 64 | 8.54 | 18.670 | 159.470 |
| 200 | 9.82 | 29.634 | 291.010 |
| 2000 | 24.34 | 81.114 | 1974.300 |
| 20000 | 30.61 | 598.520 | 18318.000 |
| 1000000 | 29.31 | 33458.000 | 980540.000 |
Refactor some `core::fmt` macros
While looking at the macros in `core::fmt`, find that the macros are not well organized. So I created a patch to fix it.
[`core/src/fmt/num.rs`](https://github.com/rust-lang/rust/blob/master/library/core/src/fmt/num.rs)
* `impl_int!` and `impl_uint!` macro are **completly** same. It would be better to combine for readability
* `impl_int!` has a problem that the indenting is not uniform. It has unified into 4 spaces
* `debug` macro in `num` renamed to `impl_Debug`, And it was moved to a position close to the `impl_Display`.
[`core/src/fmt/float.rs`](https://github.com/rust-lang/rust/blob/master/library/core/src/fmt/float.rs)
[`core/src/fmt/nofloat.rs`](https://github.com/rust-lang/rust/blob/master/library/core/src/fmt/nofloat.rs)
* `floating` macro now receive multiple idents at once. It makes the code cleaner.
* Modified the panic message more clearly in fallback function of `cfg(no_fp_fmt_parse)`
Add `from_ref` and `from_mut` constructors to `core::ptr::NonNull`.
Relevant tracking issue: #130823
The `core::ptr::NonNull` type should have the convenience constructors `from_ref` and `from_mut` for parity with `core::ptr::from_ref` and `core::ptr::from_mut`.
Although the type in question already implements `From<&T>` and `From<&mut T>`, these new functions also carry the ability to be used in constant expressions (due to not being behind a trait).
Mark the unstable LazyCell::into_inner const
Other cell `into_inner` functions are const and there shouldn't be any problem here. Make the unstable `LazyCell::into_inner` const under the same gate as its stability (`lazy_cell_into_inner`).
Tracking issue: https://github.com/rust-lang/rust/issues/125623
Rollup of 9 pull requests
Successful merges:
- #122670 (Fix bug where `option_env!` would return `None` when env var is present but not valid Unicode)
- #131095 (Use environment variables instead of command line arguments for merged doctests)
- #131339 (Expand set_ptr_value / with_metadata_of docs)
- #131652 (Move polarity into `PolyTraitRef` rather than storing it on the side)
- #131675 (Update lint message for ABI not supported)
- #131681 (Fix up-to-date checking for run-make tests)
- #131702 (Suppress import errors for traits that couldve applied for method lookup error)
- #131703 (Resolved python deprecation warning in publish_toolstate.py)
- #131710 (Remove `'apostrophes'` from `rustc_parse_format`)
r? `@ghost`
`@rustbot` modify labels: rollup
Some float methods are now `const fn` under the `const_float_methods` feature gate.
In order to support `min`, `max`, `abs` and `copysign`, the implementation of some intrinsics had to be moved from Miri to rustc_const_eval.
Expand set_ptr_value / with_metadata_of docs
In preparation of a potential FCP, intends to clean up and expand the documentation of this operation.
Rewrite these blobs to explicitly mention the case of a sized operand. The previous made that seem wrong instead of emphasizing it is nothing but a simple cast. Instead, the explanation now emphasizes that the address portion of the argument, together with its provenance, is discarded which previously had to be inferred by the reader. Then an example demonstrates a simple line of incorrect usage based on this idea of provenance.
Tracking issue: https://github.com/rust-lang/rust/issues/75091
Fix bug where `option_env!` would return `None` when env var is present but not valid Unicode
Fixes#122669 by making `option_env!` emit an error when the value of the environment variable is not valid Unicode.
Autodiff Upstreaming - enzyme frontend
This is an upstream PR for the `autodiff` rustc_builtin_macro that is part of the autodiff feature.
For the full implementation, see: https://github.com/rust-lang/rust/pull/129175
**Content:**
It contains a new `#[autodiff(<args>)]` rustc_builtin_macro, as well as a `#[rustc_autodiff]` builtin attribute.
The autodiff macro is applied on function `f` and will expand to a second function `df` (name given by user).
It will add a dummy body to `df` to make sure it type-checks. The body will later be replaced by enzyme on llvm-ir level,
we therefore don't really care about the content. Most of the changes (700 from 1.2k) are in `compiler/rustc_builtin_macros/src/autodiff.rs`, which expand the macro. Nothing except expansion is implemented for now.
I have a fallback implementation for relevant functions in case that rustc should be build without autodiff support. The default for now will be off, although we want to flip it later (once everything landed) to on for nightly. For the sake of CI, I have flipped the defaults, I'll revert this before merging.
**Dummy function Body:**
The first line is an `inline_asm` nop to make inlining less likely (I have additional checks to prevent this in the middle end of rustc. If `f` gets inlined too early, we can't pass it to enzyme and thus can't differentiate it.
If `df` gets inlined too early, the call site will just compute this dummy code instead of the derivatives, a correctness issue. The following black_box lines make sure that none of the input arguments is getting optimized away before we replace the body.
**Motivation:**
The user facing autodiff macro can verify the user input. Then I write it as args to the rustc_attribute, so from here on I can know that these values should be sensible. A rustc_attribute also turned out to be quite nice to attach this information to the corresponding function and carry it till the backend.
This is also just an experiment, I expect to adjust the user facing autodiff macro based on user feedback, to improve usability.
As a simple example of what this will do, we can see this expansion:
From:
```
#[autodiff(df, Reverse, Duplicated, Const, Active)]
pub fn f1(x: &[f64], y: f64) -> f64 {
unimplemented!()
}
```
to
```
#[rustc_autodiff]
#[inline(never)]
pub fn f1(x: &[f64], y: f64) -> f64 {
::core::panicking::panic("not implemented")
}
#[rustc_autodiff(Reverse, Duplicated, Const, Active,)]
#[inline(never)]
pub fn df(x: &[f64], dx: &mut [f64], y: f64, dret: f64) -> f64 {
unsafe { asm!("NOP"); };
::core::hint::black_box(f1(x, y));
::core::hint::black_box((dx, dret));
::core::hint::black_box(f1(x, y))
}
```
I will add a few more tests once I figured out why rustc rebuilds every time I touch a test.
Tracking:
- https://github.com/rust-lang/rust/issues/124509
try-job: dist-x86_64-msvc
Other cell `into_inner` functions are const and there shouldn't be any
problem here. Make the unstable `LazyCell::into_inner` const under the
same gate as its stability (`lazy_cell_into_inner`).
Tracking issue: https://github.com/rust-lang/rust/issues/125623
Update precondition tests (especially for zero-size access to null)
I don't much like the current way I've updated the precondition check helpers, but I couldn't come up with anything better. Ideas welcome.
I've organized `tests/ui/precondition-checks` mostly with one file per function that has `assert_unsafe_precondition` in it, with revisions that check each precondition. The important new test is `tests/ui/precondition-checks/zero-size-null.rs`.
Stabilize `Pin::as_deref_mut()`
Tracking issue: closes#86918
Stabilizing the following API:
```rust
impl<Ptr: DerefMut> Pin<Ptr> {
pub fn as_deref_mut(self: Pin<&mut Pin<Ptr>>) -> Pin<&mut Ptr::Target>;
}
```
I know that an FCP has not been started yet, but this isn't a very complex stabilization, and I'm hoping this can motivate an FCP to *get* started - this has been pending for a while and it's a very useful function when writing Future impls.
r? ``@jonhoo``
merge const_ipv4 / const_ipv6 feature gate into 'ip' feature gate
https://github.com/rust-lang/rust/issues/76205 has been closed a while ago, but there are still some functions that reference it. Those functions are all unstable *and* const-unstable. There's no good reason to use a separate feature gate for their const-stability, so this PR moves their const-stability under the same gate as their regular stability, and therefore removes the remaining references to https://github.com/rust-lang/rust/issues/76205.
core/net: add Ipv[46]Addr::from_octets, Ipv6Addr::from_segments.
Adds:
- `Ipv4Address::from_octets([u8;4])`
- `Ipv6Address::from_octets([u8;16])`
- `Ipv6Address::from_segments([u16;8])`
equivalent to the existing `From` impls.
Advantages:
- Consistent with `to_bits, from_bits`.
- More discoverable than the `From` impls.
- Helps with type inference: it's common to want to convert byte slices to IP addrs. If you try this
```rust
fn foo(x: &[u8]) -> Ipv4Addr {
Ipv4Addr::from(foo.try_into().unwrap())
}
```
it [doesn't work](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0e2873312de275a58fa6e33d1b213bec). You have to write `Ipv4Addr::from(<[u8;4]>::try_from(x).unwrap())` instead, which is not great. With `from_octets` it is able to infer the right types.
Found this while porting [smoltcp](https://github.com/smoltcp-rs/smoltcp/) from its own IP address types to the `core::net` types.
~~Tracking issues #27709 #76205~~
Tracking issue: https://github.com/rust-lang/rust/issues/131360
Optimize `escape_ascii` using a lookup table
Based upon my suggestion here: https://github.com/rust-lang/rust/pull/125340#issuecomment-2130441817
Effectively, we can take advantage of the fact that ASCII only needs 7 bits to make the eighth bit store whether the value should be escaped or not. This adds a 256-byte lookup table, but 256 bytes *should* be small enough that very few people will mind, according to my probably not incontrovertible opinion.
The generated assembly isn't clearly better (although has fewer branches), so, I decided to benchmark on three inputs: first on a random 200KiB, then on `/bin/cat`, then on `Cargo.toml` for this repo. In all cases, the generated code ran faster on my machine. (an old i7-8700)
But, if you want to try my benchmarking code for yourself:
<details><summary>Criterion code below. Replace <code>/home/ltdk/rustsrc</code> with the appropriate directory.</summary>
```rust
#![feature(ascii_char)]
#![feature(ascii_char_variants)]
#![feature(const_option)]
#![feature(let_chains)]
use core::ascii;
use core::ops::Range;
use criterion::{criterion_group, criterion_main, Criterion};
use rand::{thread_rng, Rng};
const HEX_DIGITS: [ascii::Char; 16] = *b"0123456789abcdef".as_ascii().unwrap();
#[inline]
const fn backslash<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 2) };
let mut output = [ascii::Char::Null; N];
output[0] = ascii::Char::ReverseSolidus;
output[1] = a;
(output, 0..2)
}
#[inline]
const fn hex_escape<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 4) };
let mut output = [ascii::Char::Null; N];
let hi = HEX_DIGITS[(byte >> 4) as usize];
let lo = HEX_DIGITS[(byte & 0xf) as usize];
output[0] = ascii::Char::ReverseSolidus;
output[1] = ascii::Char::SmallX;
output[2] = hi;
output[3] = lo;
(output, 0..4)
}
#[inline]
const fn verbatim<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 1) };
let mut output = [ascii::Char::Null; N];
output[0] = a;
(output, 0..1)
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_old<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 4) };
match byte {
b'\t' => backslash(ascii::Char::SmallT),
b'\r' => backslash(ascii::Char::SmallR),
b'\n' => backslash(ascii::Char::SmallN),
b'\\' => backslash(ascii::Char::ReverseSolidus),
b'\'' => backslash(ascii::Char::Apostrophe),
b'\"' => backslash(ascii::Char::QuotationMark),
0x00..=0x1F => hex_escape(byte),
_ => match ascii::Char::from_u8(byte) {
Some(a) => verbatim(a),
None => hex_escape(byte),
},
}
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_new<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
/// Lookup table helps us determine how to display character.
///
/// Since ASCII characters will always be 7 bits, we can exploit this to store the 8th bit to
/// indicate whether the result is escaped or unescaped.
///
/// We additionally use 0x80 (escaped NUL character) to indicate hex-escaped bytes, since
/// escaped NUL will not occur.
const LOOKUP: [u8; 256] = {
let mut arr = [0; 256];
let mut idx = 0;
loop {
arr[idx as usize] = match idx {
// use 8th bit to indicate escaped
b'\t' => 0x80 | b't',
b'\r' => 0x80 | b'r',
b'\n' => 0x80 | b'n',
b'\\' => 0x80 | b'\\',
b'\'' => 0x80 | b'\'',
b'"' => 0x80 | b'"',
// use NUL to indicate hex-escaped
0x00..=0x1F | 0x7F..=0xFF => 0x80 | b'\0',
_ => idx,
};
if idx == 255 {
break;
}
idx += 1;
}
arr
};
let lookup = LOOKUP[byte as usize];
// 8th bit indicates escape
let lookup_escaped = lookup & 0x80 != 0;
// SAFETY: We explicitly mask out the eighth bit to get a 7-bit ASCII character.
let lookup_ascii = unsafe { ascii::Char::from_u8_unchecked(lookup & 0x7F) };
if lookup_escaped {
// NUL indicates hex-escaped
if matches!(lookup_ascii, ascii::Char::Null) {
hex_escape(byte)
} else {
backslash(lookup_ascii)
}
} else {
verbatim(lookup_ascii)
}
}
fn escape_bytes(bytes: &[u8], f: impl Fn(u8) -> ([ascii::Char; 4], Range<u8>)) -> Vec<ascii::Char> {
let mut vec = Vec::new();
for b in bytes {
let (buf, range) = f(*b);
vec.extend_from_slice(&buf[range.start as usize..range.end as usize]);
}
vec
}
pub fn criterion_benchmark(c: &mut Criterion) {
let mut group = c.benchmark_group("escape_ascii");
group.sample_size(1000);
let rand_200k = &mut [0; 200 * 1024];
thread_rng().fill(&mut rand_200k[..]);
let cat = include_bytes!("/bin/cat");
let cargo_toml = include_bytes!("/home/ltdk/rustsrc/Cargo.toml");
group.bench_function("old_rand", |b| {
b.iter(|| escape_bytes(rand_200k, escape_ascii_old));
});
group.bench_function("new_rand", |b| {
b.iter(|| escape_bytes(rand_200k, escape_ascii_new));
});
group.bench_function("old_bin", |b| {
b.iter(|| escape_bytes(cat, escape_ascii_old));
});
group.bench_function("new_bin", |b| {
b.iter(|| escape_bytes(cat, escape_ascii_new));
});
group.bench_function("old_cargo_toml", |b| {
b.iter(|| escape_bytes(cargo_toml, escape_ascii_old));
});
group.bench_function("new_cargo_toml", |b| {
b.iter(|| escape_bytes(cargo_toml, escape_ascii_new));
});
group.finish();
}
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
```
</details>
My benchmark results:
```
escape_ascii/old_rand time: [1.6965 ms 1.7006 ms 1.7053 ms]
Found 22 outliers among 1000 measurements (2.20%)
4 (0.40%) high mild
18 (1.80%) high severe
escape_ascii/new_rand time: [1.6749 ms 1.6953 ms 1.7158 ms]
Found 38 outliers among 1000 measurements (3.80%)
38 (3.80%) high mild
escape_ascii/old_bin time: [224.59 µs 225.40 µs 226.33 µs]
Found 39 outliers among 1000 measurements (3.90%)
17 (1.70%) high mild
22 (2.20%) high severe
escape_ascii/new_bin time: [164.86 µs 165.63 µs 166.58 µs]
Found 107 outliers among 1000 measurements (10.70%)
43 (4.30%) high mild
64 (6.40%) high severe
escape_ascii/old_cargo_toml
time: [23.397 µs 23.699 µs 24.014 µs]
Found 204 outliers among 1000 measurements (20.40%)
21 (2.10%) high mild
183 (18.30%) high severe
escape_ascii/new_cargo_toml
time: [16.404 µs 16.438 µs 16.483 µs]
Found 88 outliers among 1000 measurements (8.80%)
56 (5.60%) high mild
32 (3.20%) high severe
```
Random: 1.7006ms => 1.6953ms (<1% speedup)
Binary: 225.40µs => 165.63µs (26% speedup)
Text: 23.699µs => 16.438µs (30% speedup)
Stabilize const `ptr::write*` and `mem::replace`
Since `const_mut_refs` and `const_refs_to_cell` have been stabilized, we may now also stabilize the ability to write to places during const evaluation inside our library API. So, we now propose the `const fn` version of `ptr::write` and its variants. This allows us to also stabilize `mem::replace` and `ptr::replace`.
- const `mem::replace`: https://github.com/rust-lang/rust/issues/83164#issuecomment-2338660862
- const `ptr::write{,_bytes,_unaligned}`: https://github.com/rust-lang/rust/issues/86302#issuecomment-2330275266
Their implementation requires an additional internal stabilization of `const_intrinsic_forget`, which is required for `*::write*` and thus `*::replace`. Thus we const-stabilize the internal intrinsics `forget`, `write_bytes`, and `write_via_move`.
intrinsics fmuladdf{32,64}: expose llvm.fmuladd.* semantics
Add intrinsics `fmuladd{f32,f64}`. This computes `(a * b) + c`, to be fused if the code generator determines that (i) the target instruction set has support for a fused operation, and (ii) that the fused operation is more efficient than the equivalent, separate pair of `mul` and `add` instructions.
https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
The codegen_cranelift uses the `fma` function from libc, which is a correct implementation, but without the desired performance semantic. I think this requires an update to cranelift to expose a suitable instruction in its IR.
I have not tested with codegen_gcc, but it should behave the same way (using `fma` from libc).
---
This topic has been discussed a few times on Zulip and was suggested, for example, by `@workingjubilee` in [Effect of fma disabled](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/Effect.20of.20fma.20disabled/near/274179331).
`IndexRange::len` is justified as an overall invariant, and
`take_prefix` and `take_suffix` are justified by local branch
conditions. A few more UB-checked calls remain in cases that are only
supported locally by `debug_assert!`, which won't do anything in
distributed builds, so those UB checks may still be useful.
We generally expect core's `#![rustc_preserve_ub_checks]` to optimize
away in user's release builds, but the mere presence of that extra code
can sometimes inhibit optimization, as seen in #131563.
Stabilise `const_char_encode_utf8`.
Closes: #130512
This PR stabilises the `const_char_encode_utf8` feature gate (i.e. support for `char::encode_utf8` in const scenarios).
Note that the linked tracking issue is currently awaiting FCP.
Port sort-research-rs test suite to Rust stdlib tests
This PR is a followup to https://github.com/rust-lang/rust/pull/124032. It replaces the tests that test the various sort functions in the standard library with a test-suite developed as part of https://github.com/Voultapher/sort-research-rs. The current tests suffer a couple of problems:
- They don't cover important real world patterns that the implementations take advantage of and execute special code for.
- The input lengths tested miss out on code paths. For example, important safety property tests never reach the quicksort part of the implementation.
- The miri side is often limited to `len <= 20` which means it very thoroughly tests the insertion sort, which accounts for 19 out of 1.5k LoC.
- They are split into to core and alloc, causing code duplication and uneven coverage.
- ~~The randomness is tied to a caller location, wasting the space exploration capabilities of randomized testing.~~ The randomness is not repeatable, as it relies on `std:#️⃣:RandomState::new().build_hasher()`.
Most of these issues existed before https://github.com/rust-lang/rust/pull/124032, but they are intensified by it. One thing that is new and requires additional testing, is that the new sort implementations specialize based on type properties. For example `Freeze` and non `Freeze` execute different code paths.
Effectively there are three dimensions that matter:
- Input type
- Input length
- Input pattern
The ported test-suite tests various properties along all three dimensions, greatly improving test coverage. It side-steps the miri issue by preferring sampled approaches. For example the test that checks if after a panic the set of elements is still the original one, doesn't do so for every single possible panic opportunity but rather it picks one at random, and performs this test across a range of input length, which varies the panic point across them. This allows regular execution to easily test inputs of length 10k, and miri execution up to 100 which covers significantly more code. The randomness used is tied to a fixed - but random per process execution - seed. This allows for fully repeatable tests and fuzzer like exploration across multiple runs.
Structure wise, the tests are previously found in the core integration tests for `sort_unstable` and alloc unit tests for `sort`. The new test-suite was developed to be a purely black-box approach, which makes integration testing the better place, because it can't accidentally rely on internal access. Because unwinding support is required the tests can't be in core, even if the implementation is, so they are now part of the alloc integration tests. Are there architectures that can only build and test core and not alloc? If so, do such platforms require sort testing? For what it's worth the current implementation state passes miri `--target mips64-unknown-linux-gnuabi64` which is big endian.
The test-suite also contains tests for properties that were and are given by the current and previous implementations, and likely relied upon by users but weren't tested. For example `self_cmp` tests that the two parameters `a` and `b` passed into the comparison function are never references to the same object, which if the user is sorting for example a `&mut [Mutex<i32>]` could lead to a deadlock.
Instead of using the hashed caller location as rand seed, it uses seconds since unix epoch / 10, which given timestamps in the CI should be reasonably easy to reproduce, but also allows fuzzer like space exploration.
---
Test run-time changes:
Setup:
```
Linux 6.10
rustc 1.83.0-nightly (f79a912d9 2024-09-18)
AMD Ryzen 9 5900X 12-Core Processor (Zen 3 micro-architecture)
CPU boost enabled.
```
master: e9df22f
Before core integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/coretests-219cbd0308a49e2f
Time (mean ± σ): 869.6 ms ± 21.1 ms [User: 1327.6 ms, System: 95.1 ms]
Range (min … max): 845.4 ms … 917.0 ms 10 runs
# MIRIFLAGS="-Zmiri-disable-isolation" to get real time
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/core
finished in 738.44s
```
After core integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/coretests-219cbd0308a49e2f
Time (mean ± σ): 865.1 ms ± 14.7 ms [User: 1283.5 ms, System: 88.4 ms]
Range (min … max): 836.2 ms … 885.7 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/core
finished in 752.35s
```
Before alloc unit tests:
```
LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloc-19c15e6e8565aa54
Time (mean ± σ): 295.0 ms ± 9.9 ms [User: 719.6 ms, System: 35.3 ms]
Range (min … max): 284.9 ms … 319.3 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 322.75s
```
After alloc unit tests:
```
LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloc-19c15e6e8565aa54
Time (mean ± σ): 97.4 ms ± 4.1 ms [User: 297.7 ms, System: 28.6 ms]
Range (min … max): 92.3 ms … 109.2 ms 27 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 309.18s
```
Before alloc integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloctests-439e7300c61a8046
Time (mean ± σ): 103.2 ms ± 1.7 ms [User: 135.7 ms, System: 39.4 ms]
Range (min … max): 99.7 ms … 107.3 ms 28 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 231.35s
```
After alloc integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloctests-439e7300c61a8046
Time (mean ± σ): 379.8 ms ± 4.7 ms [User: 4620.5 ms, System: 1157.2 ms]
Range (min … max): 373.6 ms … 386.9 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 449.24s
```
In my opinion the results don't change iterative library development or CI execution in meaningful ways. For example currently the library doc-tests take ~66s and incremental compilation takes 10+ seconds. However I only have limited knowledge of the various local development workflows that exist, and might be missing one that is significantly impacted by this change.
Add intrinsics `fmuladd{f16,f32,f64,f128}`. This computes `(a * b) +
c`, to be fused if the code generator determines that (i) the target
instruction set has support for a fused operation, and (ii) that the
fused operation is more efficient than the equivalent, separate pair
of `mul` and `add` instructions.
https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
MIRI support is included for f32 and f64.
The codegen_cranelift uses the `fma` function from libc, which is a
correct implementation, but without the desired performance semantic. I
think this requires an update to cranelift to expose a suitable
instruction in its IR.
I have not tested with codegen_gcc, but it should behave the same
way (using `fma` from libc).
Stabilize const `{slice,array}::from_mut`
This PR stabilizes the following APIs as const stable as of rust `1.83`:
```rs
// core::array
pub const fn from_mut<T>(s: &mut T) -> &mut [T; 1];
// core::slice
pub const fn from_mut<T>(s: &mut T) -> &mut [T];
```
This is made possible by `const_mut_refs` being stabilized (yay).
Tracking issue: #90206
Rewrite these blobs to explicitly mention the case of a sized operand.
The previous made that seem wrong instead of emphasizing it is nothing
but a simple cast. Instead, the explanation now emphasizes that the
address portion of the argument, together with its provenance, is
discarded which previously had to be inferred by the reader. Then an
example demonstrates a simple line of incorrect usage based on this
idea of provenance.
make Cell unstably const
Now that we can do interior mutability in `const`, most of the Cell API can be `const fn`. :) The main exception is `set`, because it drops the old value. So from const context one has to use `replace`, which delegates the responsibility for dropping to the caller.
Tracking issue: https://github.com/rust-lang/rust/issues/131283
`as_array_of_cells` is itself still unstable to I added the const-ness to the feature gate for that function and not to `const_cell`, Cc #88248.
r? libs-api
Stabilize the `map`/`value` methods on `ControlFlow`
And fix the stability attribute on the `pub use` in `core::ops`.
libs-api in https://github.com/rust-lang/rust/issues/75744#issuecomment-2231214910 seemed reasonably happy with naming for these, so let's try for an FCP.
Summary:
```rust
impl<B, C> ControlFlow<B, C> {
pub fn break_value(self) -> Option<B>;
pub fn map_break<T>(self, f: impl FnOnce(B) -> T) -> ControlFlow<T, C>;
pub fn continue_value(self) -> Option<C>;
pub fn map_continue<T>(self, f: impl FnOnce(C) -> T) -> ControlFlow<B, T>;
}
```
Resolves#75744
``@rustbot`` label +needs-fcp +t-libs-api -t-libs
---
Aside, in case it keeps someone else from going down the same dead end: I looked at the `{break,continue}_value` methods and tried to make them `const` as part of this, but that's disallowed because of not having `const Drop`, so put it back to not even unstably-const.
Add `[Option<T>; N]::transpose`
This PR as a new unstable libs API, `[Option<T>; N]::transpose`, which permits going from `[Option<T>; N]` to `Option<[T; N]>`.
This new API doesn't have an ACP as it was directly asked by T-libs-api in https://github.com/rust-lang/rust/issues/97601#issuecomment-2372109119:
> [..] but it'd be trivial to provide a helper method `.transpose()` that turns array-of-Option into Option-of-array (**and we think that method should exist**; it already does for array-of-MaybeUninit).
r? libs
Update Unicode escapes in `/library/core/src/char/methods.rs`
`char::MAX` is inconsistent on how Unicode escapes should be formatted. This PR resolves that.
ptr::add/sub: do not claim equivalence with `offset(c as isize)`
In https://github.com/rust-lang/rust/pull/110837, the `offset` intrinsic got changed to also allow a `usize` offset parameter. The intention is that this will do an unsigned multiplication with the size, and we have UB if that overflows -- and we also have UB if the result is larger than `usize::MAX`, i.e., if a subsequent cast to `isize` would wrap. ~~The LLVM backend sets some attributes accordingly.~~
This updates the docs for `add`/`sub` to match that intent, in preparation for adjusting codegen to exploit this UB. We use this opportunity to clarify what the exact requirements are: we compute the offset using mathematical multiplication (so it's no problem to have an `isize * usize` multiplication, we just multiply integers), and the result must fit in an `isize`.
Cc `@rust-lang/opsem` `@nikic`
https://github.com/rust-lang/rust/pull/130239 updates Miri to detect this UB.
`sub` still has some cases of UB not reflected in the underlying intrinsic semantics (and Miri does not catch): when we subtract `usize::MAX`, then after casting to `isize` that's just `-1` so we end up adding one unit without noticing any UB, but actually the offset we gave does not fit in an `isize`. Miri will currently still not complain for such cases:
```rust
fn main() {
let x = &[0i32; 2];
let x = x.as_ptr();
// This should be UB, we are subtracting way too much.
unsafe { x.sub(usize::MAX).read() };
}
```
However, the LLVM IR we generate here also is UB-free. This is "just" library UB but not language UB.
Cc `@saethlin;` might be worth adding precondition checks against overflow on `offset`/`add`/`sub`?
Fixes https://github.com/rust-lang/rust/issues/130211
make ptr metadata functions callable from stable const fn
So far this was done with a bunch of `rustc_allow_const_fn_unstable`. But those should be the exception, not the norm. If we are confident we can expose the ptr metadata APIs *indirectly* in stable const fn, we should just mark them as `rustc_const_stable`. And we better be confident we can do that since it's already been done a while ago. ;)
In particular this marks two intrinsics as const-stable: `aggregate_raw_ptr`, `ptr_metadata`. This should be uncontroversial, they are trivial to implement in the interpreter.
Cc `@rust-lang/wg-const-eval` `@rust-lang/lang`
This commit is a followup to https://github.com/rust-lang/rust/pull/124032. It
replaces the tests that test the various sort functions in the standard library
with a test-suite developed as part of
https://github.com/Voultapher/sort-research-rs. The current tests suffer a
couple of problems:
- They don't cover important real world patterns that the implementations take
advantage of and execute special code for.
- The input lengths tested miss out on code paths. For example, important safety
property tests never reach the quicksort part of the implementation.
- The miri side is often limited to `len <= 20` which means it very thoroughly
tests the insertion sort, which accounts for 19 out of 1.5k LoC.
- They are split into to core and alloc, causing code duplication and uneven
coverage.
- The randomness is not repeatable, as it
relies on `std:#️⃣:RandomState::new().build_hasher()`.
Most of these issues existed before
https://github.com/rust-lang/rust/pull/124032, but they are intensified by it.
One thing that is new and requires additional testing, is that the new sort
implementations specialize based on type properties. For example `Freeze` and
non `Freeze` execute different code paths.
Effectively there are three dimensions that matter:
- Input type
- Input length
- Input pattern
The ported test-suite tests various properties along all three dimensions,
greatly improving test coverage. It side-steps the miri issue by preferring
sampled approaches. For example the test that checks if after a panic the set of
elements is still the original one, doesn't do so for every single possible
panic opportunity but rather it picks one at random, and performs this test
across a range of input length, which varies the panic point across them. This
allows regular execution to easily test inputs of length 10k, and miri execution
up to 100 which covers significantly more code. The randomness used is tied to a
fixed - but random per process execution - seed. This allows for fully
repeatable tests and fuzzer like exploration across multiple runs.
Structure wise, the tests are previously found in the core integration tests for
`sort_unstable` and alloc unit tests for `sort`. The new test-suite was
developed to be a purely black-box approach, which makes integration testing the
better place, because it can't accidentally rely on internal access. Because
unwinding support is required the tests can't be in core, even if the
implementation is, so they are now part of the alloc integration tests. Are
there architectures that can only build and test core and not alloc? If so, do
such platforms require sort testing? For what it's worth the current
implementation state passes miri `--target mips64-unknown-linux-gnuabi64` which
is big endian.
The test-suite also contains tests for properties that were and are given by the
current and previous implementations, and likely relied upon by users but
weren't tested. For example `self_cmp` tests that the two parameters `a` and `b`
passed into the comparison function are never references to the same object,
which if the user is sorting for example a `&mut [Mutex<i32>]` could lead to a
deadlock.
Instead of using the hashed caller location as rand seed, it uses seconds since
unix epoch / 10, which given timestamps in the CI should be reasonably easy to
reproduce, but also allows fuzzer like space exploration.
stabilize const_cell_into_inner
This const-stabilizes
- `UnsafeCell::into_inner`
- `Cell::into_inner`
- `RefCell::into_inner`
- `OnceCell::into_inner`
`@rust-lang/wg-const-eval` this uses `rustc_allow_const_fn_unstable(const_precise_live_drops)`, so we'd be comitting to always finding *some* way to accept this code. IMO that's fine -- what these functions do is to move out the only field of a struct, and that struct has no destructor itself. The field's destructor does not get run as it gets returned to the caller.
`@rust-lang/libs-api` this was FCP'd already [years ago](https://github.com/rust-lang/rust/issues/78729#issuecomment-811409860), except that `OnceCell::into_inner` was added to the same feature gate since then (Cc `@tgross35).` Does that mean we have to re-run the FCP? If yes, I'd honestly prefer to move `OnceCell` into its own feature gate to not risk missing the next release. (That's why it's not great to add new functions to an already FCP'd feature gate.) OTOH if this needs an FCP either way since the previous FCP was so long ago, then we might as well do it all at once.
Improve Ord docs
- Makes wording more clear and re-structures some sections that can be overwhelming for someone not already in the know.
- Adds examples of how *not* to implement Ord, inspired by various anti-patterns found in real world code.
Many of the wording changes are inspired directly by my personal experience of being confused by the `Ord` docs and seeing other people get it wrong as well, especially lately having looked at a number of `Ord` implementations as part of #128899.
Created with help by `@orlp.`
r? `@workingjubilee`
Update `catch_unwind` doc comments for `c_unwind`
Updates `catch_unwind` doc comments to indicate that catching a foreign exception _will no longer_ be UB. Instead, there are two possible behaviors, though it is not specified which one an implementation will choose.
Nominated for t-lang to confirm that they are okay with making such a promise based on t-opsem FCP, or whether they would like to be included in the FCP.
Related: https://github.com/rust-lang/rust/issues/74990, https://github.com/rust-lang/rust/issues/115285, https://github.com/rust-lang/reference/pull/1226
- Makes wording more clear and re-structures some
sections that can be overwhelming for some not
already in the know.
- Adds examples of how *not* to implement Ord,
inspired by various anti-patterns found in real
world code.
update `compiler-builtins` to 0.1.126
this requires the addition of a bootstrap variant of the new `naked_asm!` macro
r? `@tgross35`
extracted from https://github.com/rust-lang/rust/pull/128651
[`cfg_match`] Generalize inputs
cc #115585
Changes the input type from `item` to `tt`, which makes the macro have the same functionality of `cfg_if`.
Also adds a test to ensure that `stmt_expr_attributes` is not triggered.