feat(byte_sub_ptr): unstably add ptr::byte_sub_ptr
This is an API that naturally should exist as a combination of byte_offset_from and sub_ptr
both existing (they showed up at similar times so this union was never made). Adding these
is a logical (and perhaps final) precondition of stabilizing ptr_sub_ptr (https://github.com/rust-lang/rust/issues/95892).
Original PR by ``@Gankra`` (https://github.com/rust-lang/rust/pull/121919), I am just reviving it. The 2nd commit (with a small docs tweak) is by me.
make const_alloc_layout feature gate only about functions that are already stable
The const_alloc_layout feature gate has two kinds of functions: those that are stable, but not yet const-stable, and those that are fully unstable.
I think we should split that up. So this PR makes const_alloc_layout just about functions that are already stable but waiting for const-stability; all the other functions now have their constness guarded by the gate that also guards their regular stability.
Cc https://github.com/rust-lang/rust/issues/67521
remove some unnecessary rustc_allow_const_fn_unstable
These are either unstable functions that don't need the attribute, or the attribute refers to a feature that is already stable.
Cleanup attributes around unchecked shifts and unchecked negation in const
The underlying intrinsic is marked as "safe to expose on stable", so we shouldn't need any `rustc_allow_const_fn_unstable(unchecked_shifts)` anywhere. However, bootstrap rustc doesn't yet have the new const stability checks, so these changes only apply under `cfg(not(bootstrap))`.
This is an API that naturally should exist as a combination of byte_offset_from and sub_ptr
both existing (they showed up at similar times so this union was never made). Adding these
is a logical (and perhaps final) precondition of stabilizing ptr_sub_ptr (#95892).
Use Hacker's Delight impl in `i64::midpoint` instead of wide `i128` impl
This PR switches `i64::midpoint` and (`isize::midpoint` where `isize == i64`) to using our Hacker's Delight impl instead of wide `i128` implementation.
As LLVM seems to be outperformed by the complexity of signed 128-bits number compared to our Hacker's Delight implementation.[^1]
It doesn't seems like it's an improvement for the other sizes[^2], so we let them with the wide implementation.
[^1]: https://rust.godbolt.org/z/ravE75EYj
[^2]: https://rust.godbolt.org/z/fzr171zKh
r? libs
Mark `str::is_char_boundary` and `str::split_at*` unstably `const`.
Tracking issues: #131516, #131518
First commit implements `const_is_char_boundary`, second commit implements `const_str_split_at` (which depends on `const_is_char_boundary`)
~~I used `const_eval_select` for `is_char_boundary` since there is a comment about optimizations that would theoretically not happen with the simple `const`-compatible version (since `slice::get` is not `const`ifiable) cc #84751. I have not checked if this code difference is still required for the optimization, so it might not be worth the code complication, but 🤷.~~
This changes `str::split_at_checked` to use a new private helper function `split_at_unchecked` (copied from `split_at_mut_unchecked`) that does pointer stuff instead of `get_unchecked`, since that is not currently `const`ifiable due to using the `SliceIndex` trait.
Lint against getting pointers from immediately dropped temporaries
Fixes#123613
## Changes:
1. New lint: `dangling_pointers_from_temporaries`. Is a generalization of `temporary_cstring_as_ptr` for more types and more ways to get a temporary.
2. `temporary_cstring_as_ptr` is removed and marked as renamed to `dangling_pointers_from_temporaries`.
3. `clippy::temporary_cstring_as_ptr` is marked as renamed to `dangling_pointers_from_temporaries`.
4. Fixed a false positive[^fp] for when the pointer is not actually dangling because of lifetime extension for function/method call arguments.
5. `core::cell::Cell` is now `rustc_diagnostic_item = "Cell"`
## Questions:
- [ ] Instead of manually checking for a list of known methods and diagnostic items, maybe add some sort of annotation to those methods in library and check for the presence of that annotation? https://github.com/rust-lang/rust/pull/128985#issuecomment-2318714312
## Known limitations:
### False negatives[^fn]:
See the comments in `compiler/rustc_lint/src/dangling.rs`
1. Method calls that are not checked for:
- `temporary_unsafe_cell.get()`
- `temporary_sync_unsafe_cell.get()`
2. Ways to get a temporary that are not recognized:
- `owning_temporary.field`
- `owning_temporary[index]`
3. No checks for ref-to-ptr conversions:
- `&raw [mut] temporary`
- `&temporary as *(const|mut) _`
- `ptr::from_ref(&temporary)` and friends
[^fn]: lint **should** be emitted, but **is not**
[^fp]: lint **should not** be emitted, but **is**
Make clearer that guarantees in ABI compatibility are for Rust only
cc https://github.com/rust-lang/rust/pull/132136#issuecomment-2439737631 -- it looks like we already had a note that I missed in my initial look here, but this goes further to emphasize the guarantees, including uplifting it to the top of the general documentation.
r? `@RalfJung`
As LLVM seems to be outperformed by the complexity of signed 128-bits
number compared to our Hacker's Delight implementation.[^1]
It doesn't seems like it's an improvement for the other sizes[^2], so we
let them with the wide implementation.
[^1]: https://rust.godbolt.org/z/ravE75EYj
[^2]: https://rust.godbolt.org/z/fzr171zKh
Rename macro `SmartPointer` to `CoercePointee`
As per resolution #129104 we will rename the macro to better reflect the technical specification of the feature and clarify the communication.
- `SmartPointer` is renamed to `CoerceReferent`
- `#[pointee]` attribute is renamed to `#[referent]`
- `#![feature(derive_smart_pointer)]` gate is renamed to `#![feature(derive_coerce_referent)]`.
- Any mention of `SmartPointer` in the file names are renamed accordingly.
r? `@compiler-errors`
cc `@nikomatsakis` `@Darksonn`
Round negative signed integer towards zero in `iN::midpoint`
This PR changes the implementation of `iN::midpoint` (the signed variants) to round negative signed integers **towards zero** *instead* of negative infinity as is currently the case.
This is done so that the obvious expectations[^1] of `midpoint(a, b) == midpoint(b, a)` and `midpoint(-a, -b) == -midpoint(a, b)` are true, which makes the even more obvious implementation `(a + b) / 2` always true.
The unsigned variants `uN::midpoint` (which are being [FCP-ed](https://github.com/rust-lang/rust/pull/131784#issuecomment-2417188117)) already rounds towards zero, so there is no consistency issue.
cc `@scottmcm`
r? `@dtolnay`
[^1]: https://github.com/rust-lang/rust/issues/110840#issuecomment-2336753931
Instead of towards negative infinity as is currently the case.
This done so that the obvious expectations of
`midpoint(a, b) == midpoint(b, a)` and
`midpoint(-a, -b) == -midpoint(a, b)` are true, which makes the even
more obvious implementation `(a + b) / 2` true.
https://github.com/rust-lang/rust/issues/110840#issuecomment-2336753931
Const stability checks v2
The const stability system has served us well ever since `const fn` were first stabilized. It's main feature is that it enforces *recursive* validity -- a stable const fn cannot internally make use of unstable const features without an explicit marker in the form of `#[rustc_allow_const_fn_unstable]`. This is done to make sure that we don't accidentally expose unstable const features on stable in a way that would be hard to take back. As part of this, it is enforced that a `#[rustc_const_stable]` can only call `#[rustc_const_stable]` functions. However, some problems have been coming up with increased usage:
- It is baffling that we have to mark private or even unstable functions as `#[rustc_const_stable]` when they are used as helpers in regular stable `const fn`, and often people will rather add `#[rustc_allow_const_fn_unstable]` instead which was not our intention.
- The system has several gaping holes: a private `const fn` without stability attributes whose inherited stability (walking up parent modules) is `#[stable]` is allowed to call *arbitrary* unstable const operations, but can itself be called from stable `const fn`. Similarly, `#[allow_internal_unstable]` on a macro completely bypasses the recursive nature of the check.
Fundamentally, the problem is that we have *three* disjoint categories of functions, and not enough attributes to distinguish them:
1. const-stable functions
2. private/unstable functions that are meant to be callable from const-stable functions
3. functions that can make use of unstable const features
Functions in the first two categories cannot use unstable const features and they can only call functions from the first two categories.
This PR implements the following system:
- `#[rustc_const_stable]` puts functions in the first category. It may only be applied to `#[stable]` functions.
- `#[rustc_const_unstable]` by default puts functions in the third category. The new attribute `#[rustc_const_stable_indirect]` can be added to such a function to move it into the second category.
- `const fn` without a const stability marker are in the second category if they are still unstable. They automatically inherit the feature gate for regular calls, it can now also be used for const-calls.
Also, all the holes mentioned above have been closed. There's still one potential hole that is hard to avoid, which is when MIR building automatically inserts calls to a particular function in stable functions -- which happens in the panic machinery. Those need to be manually marked `#[rustc_const_stable_indirect]` to be sure they follow recursive const stability. But that's a fairly rare and special case so IMO it's fine.
The net effect of this is that a `#[unstable]` or unmarked function can be constified simply by marking it as `const fn`, and it will then be const-callable from stable `const fn` and subject to recursive const stability requirements. If it is publicly reachable (which implies it cannot be unmarked), it will be const-unstable under the same feature gate. Only if the function ever becomes `#[stable]` does it need a `#[rustc_const_unstable]` or `#[rustc_const_stable]` marker to decide if this should also imply const-stability.
Adding `#[rustc_const_unstable]` is only needed for (a) functions that need to use unstable const lang features (including intrinsics), or (b) `#[stable]` functions that are not yet intended to be const-stable. Adding `#[rustc_const_stable]` is only needed for functions that are actually meant to be directly callable from stable const code. `#[rustc_const_stable_indirect]` is used to mark intrinsics as const-callable and for `#[rustc_const_unstable]` functions that are actually called from other, exposed-on-stable `const fn`. No other attributes are required.
Also see the updated dev-guide at https://github.com/rust-lang/rustc-dev-guide/pull/2098.
I think in the future we may want to tweak this further, so that in the hopefully common case where a public function's const-stability just exactly mirrors its regular stability, we never have to add any attribute. But right now, once the function is stable this requires `#[rustc_const_stable]`.
### Open question
There is one point I could see we might want to do differently, and that is putting `#[rustc_const_unstable]` functions (but not intrinsics) in category 2 by default, and requiring an extra attribute for `#[rustc_const_not_exposed_on_stable]` or so. This would require a bunch of extra annotations, but would have the advantage that turning a `#[rustc_const_unstable]` into `#[rustc_const_stable]` will never change the way the function is const-checked. Currently, we often discover in the const stabilization PR that a function needs some other unstable const things, and then we rush to quickly deal with that. In this alternative universe, we'd work towards getting rid of the `rustc_const_not_exposed_on_stable` before stabilization, and once that is done stabilization becomes a trivial matter. `#[rustc_const_stable_indirect]` would then only be used for intrinsics.
I think I like this idea, but might want to do it in a follow-up PR, as it will need a whole bunch of annotations in the standard library. Also, we probably want to convert all const intrinsics to the "new" form (`#[rustc_intrinsic]` instead of an `extern` block) before doing this to avoid having to deal with two different ways of declaring intrinsics.
Cc `@rust-lang/wg-const-eval` `@rust-lang/libs-api`
Part of https://github.com/rust-lang/rust/issues/129815 (but not finished since this is not yet sufficient to safely let us expose `const fn` from hashbrown)
Fixes https://github.com/rust-lang/rust/issues/131073 by making it so that const-stable functions are always stable
try-job: test-various
library: consistently use American spelling for 'behavior'
We use "behavior" a lot more often than "behaviour", but some "behaviour" have even snuck into user-facing docs. This makes the spelling consistent.
Fundamentally, we have *three* disjoint categories of functions:
1. const-stable functions
2. private/unstable functions that are meant to be callable from const-stable functions
3. functions that can make use of unstable const features
This PR implements the following system:
- `#[rustc_const_stable]` puts functions in the first category. It may only be applied to `#[stable]` functions.
- `#[rustc_const_unstable]` by default puts functions in the third category. The new attribute `#[rustc_const_stable_indirect]` can be added to such a function to move it into the second category.
- `const fn` without a const stability marker are in the second category if they are still unstable. They automatically inherit the feature gate for regular calls, it can now also be used for const-calls.
Also, several holes in recursive const stability checking are being closed.
There's still one potential hole that is hard to avoid, which is when MIR
building automatically inserts calls to a particular function in stable
functions -- which happens in the panic machinery. Those need to *not* be
`rustc_const_unstable` (or manually get a `rustc_const_stable_indirect`) to be
sure they follow recursive const stability. But that's a fairly rare and special
case so IMO it's fine.
The net effect of this is that a `#[unstable]` or unmarked function can be
constified simply by marking it as `const fn`, and it will then be
const-callable from stable `const fn` and subject to recursive const stability
requirements. If it is publicly reachable (which implies it cannot be unmarked),
it will be const-unstable under the same feature gate. Only if the function ever
becomes `#[stable]` does it need a `#[rustc_const_unstable]` or
`#[rustc_const_stable]` marker to decide if this should also imply
const-stability.
Adding `#[rustc_const_unstable]` is only needed for (a) functions that need to
use unstable const lang features (including intrinsics), or (b) `#[stable]`
functions that are not yet intended to be const-stable. Adding
`#[rustc_const_stable]` is only needed for functions that are actually meant to
be directly callable from stable const code. `#[rustc_const_stable_indirect]` is
used to mark intrinsics as const-callable and for `#[rustc_const_unstable]`
functions that are actually called from other, exposed-on-stable `const fn`. No
other attributes are required.
This commit contains a new Receiver trait, which is the basis for the
Arbitrary Self Types v2 RFC. This allows smart pointers to be method
receivers even if they're not Deref.
This is currently unused by the compiler - a subsequent PR will start to
use this for method resolution if the arbitrary_self_types feature gate
is enabled. This is being landed first simply to make review
simpler: if people feel this should all be in an atomic PR let me know.
This is a part of the arbitrary self types v2 project,
https://github.com/rust-lang/rfcs/pull/3519https://github.com/rust-lang/rust/issues/44874
r? @wesleywiser
Expand `ptr::fn_addr_eq()` documentation.
* Describe more clearly what is (not) guaranteed, and de-emphasize the description of rustc implementation details.
* Explain what you *can* reliably use it for.
Tracking issue for `ptr_fn_addr_eq`: #129322
The motivation for this PR is that I just learned that `ptr::fn_addr_eq()` exists, read the documentation, and thought: “*I* know what this means, but someone not already familiar with how `rustc` works could be left wondering whether this is even good for anything.” Fixing that seems especially important if we’re going to recommend people use it instead of `==` (as per #118833).
Rollup of 6 pull requests
Successful merges:
- #131851 ([musl] use posix_spawn if a directory change was requested)
- #132048 (AIX: use /dev/urandom for random implementation )
- #132093 (compiletest: suppress Windows Error Reporting (WER) for `run-make` tests)
- #132101 (Avoid using imports in thread_local_inner! in static)
- #132113 (Provide a default impl for Pattern::as_utf8_pattern)
- #132115 (rustdoc: Extend fake_variadic to "wrapped" tuples)
r? `@ghost`
`@rustbot` modify labels: rollup
Provide a default impl for Pattern::as_utf8_pattern
Newly added ```Pattern::as_utf8_pattern()``` causes needless breakage for crates that implement Pattern. This provides a default implementation instead.
r? `@BurntSushi`
Document textual format of SocketAddrV{4,6}
This commit adds new "Textual representation" documentation sections to SocketAddrV4 and SocketAddrV6, by analogy to the existing "textual representation" sections of Ipv4Addr and Ipv6Addr.
Rationale: Without documentation about which formats are actually accepted, it's hard for a programmer to be sure that their code will actually behave as expected when implementing protocols that require support (or rejection) for particular representations. This lack of clarity can in turn can lead to ambiguities and security problems like those discussed in RFC 6942.
(I've tried to describe the governing RFCs or standards where I could, but it's possible that the actual implementers had something else in mind. I could not find any standards that corresponded _exactly_ to the one implemented in SocketAddrv6, but I have linked the relevant documents that I could find.)
This commit adds new "Textual representation" documentation sections to
SocketAddrV4 and SocketAddrV6, by analogy to the existing
"textual representation" sections of Ipv4Addr and Ipv6Addr.
Rationale: Without documentation about which formats are actually
accepted, it's hard for a programmer to be sure that their code
will actually behave as expected when implementing protocols that
require support (or rejection) for particular representations.
This lack of clarity can in turn can lead to ambiguities and
security problems like those discussed in RFC 6942.
(I've tried to describe the governing RFCs or standards where I
could, but it's possible that the actual implementers had something
else in mind. I could not find any standards that corresponded
_exactly_ to the one implemented in SocketAddrv6, but I have linked
the relevant documents that I could find.)
Rename Receiver -> LegacyReceiver
As part of the "arbitrary self types v2" project, we are going to replace the current `Receiver` trait with a new mechanism based on a new, different `Receiver` trait.
This PR renames the old trait to get it out the way. Naming is hard. Options considered included:
* HardCodedReceiver (because it should only be used for things in the standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver
These are all bad names, but fortunately this will be temporary. Assuming the new mechanism proceeds to stabilization as intended, the legacy trait will be removed altogether.
Although we expect this trait to be used only in the standard library, we suspect it may be in use elsehwere, so we're landing this change separately to identify any surprising breakages.
It's known that this trait is used within the Rust for Linux project; a patch is in progress to remove their dependency.
This is a part of the arbitrary self types v2 project,
https://github.com/rust-lang/rfcs/pull/3519https://github.com/rust-lang/rust/issues/44874
r? `@wesleywiser`
"innermost", "outermost", "leftmost", and "rightmost" don't need hyphens
These are all standard dictionary words and don't require hyphenation.
-----
Encountered an instance of this in error messages and it bugged me, so I
figured I'd fix it across the entire codebase.
Run most `core::num` tests in const context too
This adds some infrastructure for something I was going to use in #131566, but it felt worthwhile enough on its own to merge/discuss separately.
Essentially, right now we tend to rely on UI tests to ensure that things work in const context, rather than just using library tests. This uses a few simple macro tricks to make it *relatively* painless to execute tests in both runtime and compile-time context. And this only applies to the numeric tests, and not anything else.
Recommended to review without whitespace in the diff.
cc `@RalfJung`
As part of the "arbitrary self types v2" project, we are going to
replace the current `Receiver` trait with a new mechanism based on a
new, different `Receiver` trait.
This PR renames the old trait to get it out the way. Naming is hard.
Options considered included:
* HardCodedReceiver (because it should only be used for things in the
standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver
These are all bad names, but fortunately this will be temporary.
Assuming the new mechanism proceeds to stabilization as intended, the
legacy trait will be removed altogether.
Although we expect this trait to be used only in the standard library,
we suspect it may be in use elsehwere, so we're landing this change
separately to identify any surprising breakages.
It's known that this trait is used within the Rust for Linux project; a
patch is in progress to remove their dependency.
This is a part of the arbitrary self types v2 project,
https://github.com/rust-lang/rfcs/pull/3519https://github.com/rust-lang/rust/issues/44874
r? @wesleywiser
update ABI compatibility docs for new option-like rules
Documents the rules decided [here](https://github.com/rust-lang/rust/pull/130628#issuecomment-2402761599) for our ABI compatibility rules.
Long-term this should be moved to the reference, but for now this is what we got.
Cc `@rust-lang/lang` `@rust-lang/opsem`
Remove outdated documentation for `repeat_n`
After #106943, which made `Take<Repeat<I>>` implement `ExactSizeIterator`, part of documentation about difference from `repeat(x).take(n)` is no longer valid.
````@rustbot```` labels: +A-docs, +A-iterators
Avoid superfluous UB checks in `IndexRange`
`IndexRange::len` is justified as an overall invariant, and
`take_prefix` and `take_suffix` are justified by local branch
conditions. A few more UB-checked calls remain in cases that are only
supported locally by `debug_assert!`, which won't do anything in
distributed builds, so those UB checks may still be useful.
We generally expect core's `#![rustc_preserve_ub_checks]` to optimize
away in user's release builds, but the mere presence of that extra code
can sometimes inhibit optimization, as seen in #131563.
optimize str.replace
Adds a fast path for str.replace for the ascii to ascii case. This allows for autovectorizing the code. Also should this instead be done with specialization? This way we could remove one branch. I think it is the kind of branch that is easy to predict though.
Benchmark for the fast path (replace all "a" with "b" in the rust wikipedia article, using criterion) :
| N | Speedup | Time New (ns) | Time Old (ns) |
|----------|---------|---------------|---------------|
| 2 | 2.03 | 13.567 | 27.576 |
| 8 | 1.73 | 17.478 | 30.259 |
| 11 | 2.46 | 18.296 | 45.055 |
| 16 | 2.71 | 17.181 | 46.526 |
| 37 | 4.43 | 18.526 | 81.997 |
| 64 | 8.54 | 18.670 | 159.470 |
| 200 | 9.82 | 29.634 | 291.010 |
| 2000 | 24.34 | 81.114 | 1974.300 |
| 20000 | 30.61 | 598.520 | 18318.000 |
| 1000000 | 29.31 | 33458.000 | 980540.000 |
Refactor some `core::fmt` macros
While looking at the macros in `core::fmt`, find that the macros are not well organized. So I created a patch to fix it.
[`core/src/fmt/num.rs`](https://github.com/rust-lang/rust/blob/master/library/core/src/fmt/num.rs)
* `impl_int!` and `impl_uint!` macro are **completly** same. It would be better to combine for readability
* `impl_int!` has a problem that the indenting is not uniform. It has unified into 4 spaces
* `debug` macro in `num` renamed to `impl_Debug`, And it was moved to a position close to the `impl_Display`.
[`core/src/fmt/float.rs`](https://github.com/rust-lang/rust/blob/master/library/core/src/fmt/float.rs)
[`core/src/fmt/nofloat.rs`](https://github.com/rust-lang/rust/blob/master/library/core/src/fmt/nofloat.rs)
* `floating` macro now receive multiple idents at once. It makes the code cleaner.
* Modified the panic message more clearly in fallback function of `cfg(no_fp_fmt_parse)`
Add `from_ref` and `from_mut` constructors to `core::ptr::NonNull`.
Relevant tracking issue: #130823
The `core::ptr::NonNull` type should have the convenience constructors `from_ref` and `from_mut` for parity with `core::ptr::from_ref` and `core::ptr::from_mut`.
Although the type in question already implements `From<&T>` and `From<&mut T>`, these new functions also carry the ability to be used in constant expressions (due to not being behind a trait).
Mark the unstable LazyCell::into_inner const
Other cell `into_inner` functions are const and there shouldn't be any problem here. Make the unstable `LazyCell::into_inner` const under the same gate as its stability (`lazy_cell_into_inner`).
Tracking issue: https://github.com/rust-lang/rust/issues/125623
Rollup of 9 pull requests
Successful merges:
- #122670 (Fix bug where `option_env!` would return `None` when env var is present but not valid Unicode)
- #131095 (Use environment variables instead of command line arguments for merged doctests)
- #131339 (Expand set_ptr_value / with_metadata_of docs)
- #131652 (Move polarity into `PolyTraitRef` rather than storing it on the side)
- #131675 (Update lint message for ABI not supported)
- #131681 (Fix up-to-date checking for run-make tests)
- #131702 (Suppress import errors for traits that couldve applied for method lookup error)
- #131703 (Resolved python deprecation warning in publish_toolstate.py)
- #131710 (Remove `'apostrophes'` from `rustc_parse_format`)
r? `@ghost`
`@rustbot` modify labels: rollup
Some float methods are now `const fn` under the `const_float_methods` feature gate.
In order to support `min`, `max`, `abs` and `copysign`, the implementation of some intrinsics had to be moved from Miri to rustc_const_eval.
Expand set_ptr_value / with_metadata_of docs
In preparation of a potential FCP, intends to clean up and expand the documentation of this operation.
Rewrite these blobs to explicitly mention the case of a sized operand. The previous made that seem wrong instead of emphasizing it is nothing but a simple cast. Instead, the explanation now emphasizes that the address portion of the argument, together with its provenance, is discarded which previously had to be inferred by the reader. Then an example demonstrates a simple line of incorrect usage based on this idea of provenance.
Tracking issue: https://github.com/rust-lang/rust/issues/75091
Fix bug where `option_env!` would return `None` when env var is present but not valid Unicode
Fixes#122669 by making `option_env!` emit an error when the value of the environment variable is not valid Unicode.
Autodiff Upstreaming - enzyme frontend
This is an upstream PR for the `autodiff` rustc_builtin_macro that is part of the autodiff feature.
For the full implementation, see: https://github.com/rust-lang/rust/pull/129175
**Content:**
It contains a new `#[autodiff(<args>)]` rustc_builtin_macro, as well as a `#[rustc_autodiff]` builtin attribute.
The autodiff macro is applied on function `f` and will expand to a second function `df` (name given by user).
It will add a dummy body to `df` to make sure it type-checks. The body will later be replaced by enzyme on llvm-ir level,
we therefore don't really care about the content. Most of the changes (700 from 1.2k) are in `compiler/rustc_builtin_macros/src/autodiff.rs`, which expand the macro. Nothing except expansion is implemented for now.
I have a fallback implementation for relevant functions in case that rustc should be build without autodiff support. The default for now will be off, although we want to flip it later (once everything landed) to on for nightly. For the sake of CI, I have flipped the defaults, I'll revert this before merging.
**Dummy function Body:**
The first line is an `inline_asm` nop to make inlining less likely (I have additional checks to prevent this in the middle end of rustc. If `f` gets inlined too early, we can't pass it to enzyme and thus can't differentiate it.
If `df` gets inlined too early, the call site will just compute this dummy code instead of the derivatives, a correctness issue. The following black_box lines make sure that none of the input arguments is getting optimized away before we replace the body.
**Motivation:**
The user facing autodiff macro can verify the user input. Then I write it as args to the rustc_attribute, so from here on I can know that these values should be sensible. A rustc_attribute also turned out to be quite nice to attach this information to the corresponding function and carry it till the backend.
This is also just an experiment, I expect to adjust the user facing autodiff macro based on user feedback, to improve usability.
As a simple example of what this will do, we can see this expansion:
From:
```
#[autodiff(df, Reverse, Duplicated, Const, Active)]
pub fn f1(x: &[f64], y: f64) -> f64 {
unimplemented!()
}
```
to
```
#[rustc_autodiff]
#[inline(never)]
pub fn f1(x: &[f64], y: f64) -> f64 {
::core::panicking::panic("not implemented")
}
#[rustc_autodiff(Reverse, Duplicated, Const, Active,)]
#[inline(never)]
pub fn df(x: &[f64], dx: &mut [f64], y: f64, dret: f64) -> f64 {
unsafe { asm!("NOP"); };
::core::hint::black_box(f1(x, y));
::core::hint::black_box((dx, dret));
::core::hint::black_box(f1(x, y))
}
```
I will add a few more tests once I figured out why rustc rebuilds every time I touch a test.
Tracking:
- https://github.com/rust-lang/rust/issues/124509
try-job: dist-x86_64-msvc
Other cell `into_inner` functions are const and there shouldn't be any
problem here. Make the unstable `LazyCell::into_inner` const under the
same gate as its stability (`lazy_cell_into_inner`).
Tracking issue: https://github.com/rust-lang/rust/issues/125623
Update precondition tests (especially for zero-size access to null)
I don't much like the current way I've updated the precondition check helpers, but I couldn't come up with anything better. Ideas welcome.
I've organized `tests/ui/precondition-checks` mostly with one file per function that has `assert_unsafe_precondition` in it, with revisions that check each precondition. The important new test is `tests/ui/precondition-checks/zero-size-null.rs`.
Stabilize `Pin::as_deref_mut()`
Tracking issue: closes#86918
Stabilizing the following API:
```rust
impl<Ptr: DerefMut> Pin<Ptr> {
pub fn as_deref_mut(self: Pin<&mut Pin<Ptr>>) -> Pin<&mut Ptr::Target>;
}
```
I know that an FCP has not been started yet, but this isn't a very complex stabilization, and I'm hoping this can motivate an FCP to *get* started - this has been pending for a while and it's a very useful function when writing Future impls.
r? ``@jonhoo``
merge const_ipv4 / const_ipv6 feature gate into 'ip' feature gate
https://github.com/rust-lang/rust/issues/76205 has been closed a while ago, but there are still some functions that reference it. Those functions are all unstable *and* const-unstable. There's no good reason to use a separate feature gate for their const-stability, so this PR moves their const-stability under the same gate as their regular stability, and therefore removes the remaining references to https://github.com/rust-lang/rust/issues/76205.
core/net: add Ipv[46]Addr::from_octets, Ipv6Addr::from_segments.
Adds:
- `Ipv4Address::from_octets([u8;4])`
- `Ipv6Address::from_octets([u8;16])`
- `Ipv6Address::from_segments([u16;8])`
equivalent to the existing `From` impls.
Advantages:
- Consistent with `to_bits, from_bits`.
- More discoverable than the `From` impls.
- Helps with type inference: it's common to want to convert byte slices to IP addrs. If you try this
```rust
fn foo(x: &[u8]) -> Ipv4Addr {
Ipv4Addr::from(foo.try_into().unwrap())
}
```
it [doesn't work](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0e2873312de275a58fa6e33d1b213bec). You have to write `Ipv4Addr::from(<[u8;4]>::try_from(x).unwrap())` instead, which is not great. With `from_octets` it is able to infer the right types.
Found this while porting [smoltcp](https://github.com/smoltcp-rs/smoltcp/) from its own IP address types to the `core::net` types.
~~Tracking issues #27709 #76205~~
Tracking issue: https://github.com/rust-lang/rust/issues/131360
Optimize `escape_ascii` using a lookup table
Based upon my suggestion here: https://github.com/rust-lang/rust/pull/125340#issuecomment-2130441817
Effectively, we can take advantage of the fact that ASCII only needs 7 bits to make the eighth bit store whether the value should be escaped or not. This adds a 256-byte lookup table, but 256 bytes *should* be small enough that very few people will mind, according to my probably not incontrovertible opinion.
The generated assembly isn't clearly better (although has fewer branches), so, I decided to benchmark on three inputs: first on a random 200KiB, then on `/bin/cat`, then on `Cargo.toml` for this repo. In all cases, the generated code ran faster on my machine. (an old i7-8700)
But, if you want to try my benchmarking code for yourself:
<details><summary>Criterion code below. Replace <code>/home/ltdk/rustsrc</code> with the appropriate directory.</summary>
```rust
#![feature(ascii_char)]
#![feature(ascii_char_variants)]
#![feature(const_option)]
#![feature(let_chains)]
use core::ascii;
use core::ops::Range;
use criterion::{criterion_group, criterion_main, Criterion};
use rand::{thread_rng, Rng};
const HEX_DIGITS: [ascii::Char; 16] = *b"0123456789abcdef".as_ascii().unwrap();
#[inline]
const fn backslash<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 2) };
let mut output = [ascii::Char::Null; N];
output[0] = ascii::Char::ReverseSolidus;
output[1] = a;
(output, 0..2)
}
#[inline]
const fn hex_escape<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 4) };
let mut output = [ascii::Char::Null; N];
let hi = HEX_DIGITS[(byte >> 4) as usize];
let lo = HEX_DIGITS[(byte & 0xf) as usize];
output[0] = ascii::Char::ReverseSolidus;
output[1] = ascii::Char::SmallX;
output[2] = hi;
output[3] = lo;
(output, 0..4)
}
#[inline]
const fn verbatim<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 1) };
let mut output = [ascii::Char::Null; N];
output[0] = a;
(output, 0..1)
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_old<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 4) };
match byte {
b'\t' => backslash(ascii::Char::SmallT),
b'\r' => backslash(ascii::Char::SmallR),
b'\n' => backslash(ascii::Char::SmallN),
b'\\' => backslash(ascii::Char::ReverseSolidus),
b'\'' => backslash(ascii::Char::Apostrophe),
b'\"' => backslash(ascii::Char::QuotationMark),
0x00..=0x1F => hex_escape(byte),
_ => match ascii::Char::from_u8(byte) {
Some(a) => verbatim(a),
None => hex_escape(byte),
},
}
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_new<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
/// Lookup table helps us determine how to display character.
///
/// Since ASCII characters will always be 7 bits, we can exploit this to store the 8th bit to
/// indicate whether the result is escaped or unescaped.
///
/// We additionally use 0x80 (escaped NUL character) to indicate hex-escaped bytes, since
/// escaped NUL will not occur.
const LOOKUP: [u8; 256] = {
let mut arr = [0; 256];
let mut idx = 0;
loop {
arr[idx as usize] = match idx {
// use 8th bit to indicate escaped
b'\t' => 0x80 | b't',
b'\r' => 0x80 | b'r',
b'\n' => 0x80 | b'n',
b'\\' => 0x80 | b'\\',
b'\'' => 0x80 | b'\'',
b'"' => 0x80 | b'"',
// use NUL to indicate hex-escaped
0x00..=0x1F | 0x7F..=0xFF => 0x80 | b'\0',
_ => idx,
};
if idx == 255 {
break;
}
idx += 1;
}
arr
};
let lookup = LOOKUP[byte as usize];
// 8th bit indicates escape
let lookup_escaped = lookup & 0x80 != 0;
// SAFETY: We explicitly mask out the eighth bit to get a 7-bit ASCII character.
let lookup_ascii = unsafe { ascii::Char::from_u8_unchecked(lookup & 0x7F) };
if lookup_escaped {
// NUL indicates hex-escaped
if matches!(lookup_ascii, ascii::Char::Null) {
hex_escape(byte)
} else {
backslash(lookup_ascii)
}
} else {
verbatim(lookup_ascii)
}
}
fn escape_bytes(bytes: &[u8], f: impl Fn(u8) -> ([ascii::Char; 4], Range<u8>)) -> Vec<ascii::Char> {
let mut vec = Vec::new();
for b in bytes {
let (buf, range) = f(*b);
vec.extend_from_slice(&buf[range.start as usize..range.end as usize]);
}
vec
}
pub fn criterion_benchmark(c: &mut Criterion) {
let mut group = c.benchmark_group("escape_ascii");
group.sample_size(1000);
let rand_200k = &mut [0; 200 * 1024];
thread_rng().fill(&mut rand_200k[..]);
let cat = include_bytes!("/bin/cat");
let cargo_toml = include_bytes!("/home/ltdk/rustsrc/Cargo.toml");
group.bench_function("old_rand", |b| {
b.iter(|| escape_bytes(rand_200k, escape_ascii_old));
});
group.bench_function("new_rand", |b| {
b.iter(|| escape_bytes(rand_200k, escape_ascii_new));
});
group.bench_function("old_bin", |b| {
b.iter(|| escape_bytes(cat, escape_ascii_old));
});
group.bench_function("new_bin", |b| {
b.iter(|| escape_bytes(cat, escape_ascii_new));
});
group.bench_function("old_cargo_toml", |b| {
b.iter(|| escape_bytes(cargo_toml, escape_ascii_old));
});
group.bench_function("new_cargo_toml", |b| {
b.iter(|| escape_bytes(cargo_toml, escape_ascii_new));
});
group.finish();
}
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
```
</details>
My benchmark results:
```
escape_ascii/old_rand time: [1.6965 ms 1.7006 ms 1.7053 ms]
Found 22 outliers among 1000 measurements (2.20%)
4 (0.40%) high mild
18 (1.80%) high severe
escape_ascii/new_rand time: [1.6749 ms 1.6953 ms 1.7158 ms]
Found 38 outliers among 1000 measurements (3.80%)
38 (3.80%) high mild
escape_ascii/old_bin time: [224.59 µs 225.40 µs 226.33 µs]
Found 39 outliers among 1000 measurements (3.90%)
17 (1.70%) high mild
22 (2.20%) high severe
escape_ascii/new_bin time: [164.86 µs 165.63 µs 166.58 µs]
Found 107 outliers among 1000 measurements (10.70%)
43 (4.30%) high mild
64 (6.40%) high severe
escape_ascii/old_cargo_toml
time: [23.397 µs 23.699 µs 24.014 µs]
Found 204 outliers among 1000 measurements (20.40%)
21 (2.10%) high mild
183 (18.30%) high severe
escape_ascii/new_cargo_toml
time: [16.404 µs 16.438 µs 16.483 µs]
Found 88 outliers among 1000 measurements (8.80%)
56 (5.60%) high mild
32 (3.20%) high severe
```
Random: 1.7006ms => 1.6953ms (<1% speedup)
Binary: 225.40µs => 165.63µs (26% speedup)
Text: 23.699µs => 16.438µs (30% speedup)
Stabilize const `ptr::write*` and `mem::replace`
Since `const_mut_refs` and `const_refs_to_cell` have been stabilized, we may now also stabilize the ability to write to places during const evaluation inside our library API. So, we now propose the `const fn` version of `ptr::write` and its variants. This allows us to also stabilize `mem::replace` and `ptr::replace`.
- const `mem::replace`: https://github.com/rust-lang/rust/issues/83164#issuecomment-2338660862
- const `ptr::write{,_bytes,_unaligned}`: https://github.com/rust-lang/rust/issues/86302#issuecomment-2330275266
Their implementation requires an additional internal stabilization of `const_intrinsic_forget`, which is required for `*::write*` and thus `*::replace`. Thus we const-stabilize the internal intrinsics `forget`, `write_bytes`, and `write_via_move`.
intrinsics fmuladdf{32,64}: expose llvm.fmuladd.* semantics
Add intrinsics `fmuladd{f32,f64}`. This computes `(a * b) + c`, to be fused if the code generator determines that (i) the target instruction set has support for a fused operation, and (ii) that the fused operation is more efficient than the equivalent, separate pair of `mul` and `add` instructions.
https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
The codegen_cranelift uses the `fma` function from libc, which is a correct implementation, but without the desired performance semantic. I think this requires an update to cranelift to expose a suitable instruction in its IR.
I have not tested with codegen_gcc, but it should behave the same way (using `fma` from libc).
---
This topic has been discussed a few times on Zulip and was suggested, for example, by `@workingjubilee` in [Effect of fma disabled](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/Effect.20of.20fma.20disabled/near/274179331).
`IndexRange::len` is justified as an overall invariant, and
`take_prefix` and `take_suffix` are justified by local branch
conditions. A few more UB-checked calls remain in cases that are only
supported locally by `debug_assert!`, which won't do anything in
distributed builds, so those UB checks may still be useful.
We generally expect core's `#![rustc_preserve_ub_checks]` to optimize
away in user's release builds, but the mere presence of that extra code
can sometimes inhibit optimization, as seen in #131563.
Stabilise `const_char_encode_utf8`.
Closes: #130512
This PR stabilises the `const_char_encode_utf8` feature gate (i.e. support for `char::encode_utf8` in const scenarios).
Note that the linked tracking issue is currently awaiting FCP.
Port sort-research-rs test suite to Rust stdlib tests
This PR is a followup to https://github.com/rust-lang/rust/pull/124032. It replaces the tests that test the various sort functions in the standard library with a test-suite developed as part of https://github.com/Voultapher/sort-research-rs. The current tests suffer a couple of problems:
- They don't cover important real world patterns that the implementations take advantage of and execute special code for.
- The input lengths tested miss out on code paths. For example, important safety property tests never reach the quicksort part of the implementation.
- The miri side is often limited to `len <= 20` which means it very thoroughly tests the insertion sort, which accounts for 19 out of 1.5k LoC.
- They are split into to core and alloc, causing code duplication and uneven coverage.
- ~~The randomness is tied to a caller location, wasting the space exploration capabilities of randomized testing.~~ The randomness is not repeatable, as it relies on `std:#️⃣:RandomState::new().build_hasher()`.
Most of these issues existed before https://github.com/rust-lang/rust/pull/124032, but they are intensified by it. One thing that is new and requires additional testing, is that the new sort implementations specialize based on type properties. For example `Freeze` and non `Freeze` execute different code paths.
Effectively there are three dimensions that matter:
- Input type
- Input length
- Input pattern
The ported test-suite tests various properties along all three dimensions, greatly improving test coverage. It side-steps the miri issue by preferring sampled approaches. For example the test that checks if after a panic the set of elements is still the original one, doesn't do so for every single possible panic opportunity but rather it picks one at random, and performs this test across a range of input length, which varies the panic point across them. This allows regular execution to easily test inputs of length 10k, and miri execution up to 100 which covers significantly more code. The randomness used is tied to a fixed - but random per process execution - seed. This allows for fully repeatable tests and fuzzer like exploration across multiple runs.
Structure wise, the tests are previously found in the core integration tests for `sort_unstable` and alloc unit tests for `sort`. The new test-suite was developed to be a purely black-box approach, which makes integration testing the better place, because it can't accidentally rely on internal access. Because unwinding support is required the tests can't be in core, even if the implementation is, so they are now part of the alloc integration tests. Are there architectures that can only build and test core and not alloc? If so, do such platforms require sort testing? For what it's worth the current implementation state passes miri `--target mips64-unknown-linux-gnuabi64` which is big endian.
The test-suite also contains tests for properties that were and are given by the current and previous implementations, and likely relied upon by users but weren't tested. For example `self_cmp` tests that the two parameters `a` and `b` passed into the comparison function are never references to the same object, which if the user is sorting for example a `&mut [Mutex<i32>]` could lead to a deadlock.
Instead of using the hashed caller location as rand seed, it uses seconds since unix epoch / 10, which given timestamps in the CI should be reasonably easy to reproduce, but also allows fuzzer like space exploration.
---
Test run-time changes:
Setup:
```
Linux 6.10
rustc 1.83.0-nightly (f79a912d9 2024-09-18)
AMD Ryzen 9 5900X 12-Core Processor (Zen 3 micro-architecture)
CPU boost enabled.
```
master: e9df22f
Before core integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/coretests-219cbd0308a49e2f
Time (mean ± σ): 869.6 ms ± 21.1 ms [User: 1327.6 ms, System: 95.1 ms]
Range (min … max): 845.4 ms … 917.0 ms 10 runs
# MIRIFLAGS="-Zmiri-disable-isolation" to get real time
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/core
finished in 738.44s
```
After core integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/coretests-219cbd0308a49e2f
Time (mean ± σ): 865.1 ms ± 14.7 ms [User: 1283.5 ms, System: 88.4 ms]
Range (min … max): 836.2 ms … 885.7 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/core
finished in 752.35s
```
Before alloc unit tests:
```
LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloc-19c15e6e8565aa54
Time (mean ± σ): 295.0 ms ± 9.9 ms [User: 719.6 ms, System: 35.3 ms]
Range (min … max): 284.9 ms … 319.3 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 322.75s
```
After alloc unit tests:
```
LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloc-19c15e6e8565aa54
Time (mean ± σ): 97.4 ms ± 4.1 ms [User: 297.7 ms, System: 28.6 ms]
Range (min … max): 92.3 ms … 109.2 ms 27 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 309.18s
```
Before alloc integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloctests-439e7300c61a8046
Time (mean ± σ): 103.2 ms ± 1.7 ms [User: 135.7 ms, System: 39.4 ms]
Range (min … max): 99.7 ms … 107.3 ms 28 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 231.35s
```
After alloc integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloctests-439e7300c61a8046
Time (mean ± σ): 379.8 ms ± 4.7 ms [User: 4620.5 ms, System: 1157.2 ms]
Range (min … max): 373.6 ms … 386.9 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 449.24s
```
In my opinion the results don't change iterative library development or CI execution in meaningful ways. For example currently the library doc-tests take ~66s and incremental compilation takes 10+ seconds. However I only have limited knowledge of the various local development workflows that exist, and might be missing one that is significantly impacted by this change.
Add intrinsics `fmuladd{f16,f32,f64,f128}`. This computes `(a * b) +
c`, to be fused if the code generator determines that (i) the target
instruction set has support for a fused operation, and (ii) that the
fused operation is more efficient than the equivalent, separate pair
of `mul` and `add` instructions.
https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
MIRI support is included for f32 and f64.
The codegen_cranelift uses the `fma` function from libc, which is a
correct implementation, but without the desired performance semantic. I
think this requires an update to cranelift to expose a suitable
instruction in its IR.
I have not tested with codegen_gcc, but it should behave the same
way (using `fma` from libc).
Stabilize const `{slice,array}::from_mut`
This PR stabilizes the following APIs as const stable as of rust `1.83`:
```rs
// core::array
pub const fn from_mut<T>(s: &mut T) -> &mut [T; 1];
// core::slice
pub const fn from_mut<T>(s: &mut T) -> &mut [T];
```
This is made possible by `const_mut_refs` being stabilized (yay).
Tracking issue: #90206