Operations like is_aligned would return actively wrong results at compile-time,
i.e. calling it on the same pointer at compiletime and runtime could yield
different results. That's no good.
Instead of having hacks to make align_offset kind-of work in const-eval, just
use const_eval_select in the few places where it makes sense, which also ensures
those places are all aware they need to make sure the fallback behavior is
consistent.
Rename `rustc_abi::Abi` to `BackendRepr`
Remove the confabulation of `rustc_abi::Abi` with what "ABI" actually means by renaming it to `BackendRepr`, and rename `Abi::Aggregate` to `BackendRepr::Memory`. The type never actually represented how things are passed, as that has to have `PassMode` considered, at minimum, but rather it just is how we represented some things to the backend. This conflation arose because LLVM, the primary backend at the time, would lower certain IR forms using certain ABIs. Even that only somewhat was true, as it broke down when one ventured significantly afield of what is described by the System V AMD64 ABI either by using different architectures, ABI-modifying IR annotations, the same architecture **with different ISA extensions enabled**, or other... unexpected delights.
Unfortunately both names are still somewhat of a misnomer right now, as people have written code for years based on this misunderstanding. Still, their original names are even moreso, and for better or worse, this backend code hasn't received as much maintenance as the rest of the compiler, lately. Actually arriving at a correct end-state will simply require us to disentangle a lot of code in order to fix, much of it pointlessly repeated in several places. Thus this is not an "actual fix", just a way to deflect further misunderstandings.
This is a standard pattern:
```
MyAnalysis.into_engine(tcx, body).iterate_to_fixpoint()
```
`into_engine` and `iterate_to_fixpoint` are always called in pairs, but
sometimes with a builder-style `pass_name` call between them. But a
builder-style interface is overkill here. This has been bugging me a for
a while.
This commit:
- Merges `Engine::new` and `Engine::iterate_to_fixpoint`. This removes
the need for `Engine` to have fields, leaving it as a trivial type
that the next commit will remove.
- Renames `Analysis::into_engine` as `Analysis::iterate_to_fixpoint`,
gives it an extra argument for the optional pass name, and makes it
call `Engine::iterate_to_fixpoint` instead of `Engine::new`.
This turns the pattern from above into this:
```
MyAnalysis.iterate_to_fixpoint(tcx, body, None)
```
which is shorter at every call site, and there's less plumbing required
to support it.
The initial naming of "Abi" was an awful mistake, conveying wrong ideas
about how psABIs worked and even more about what the enum meant.
It was only meant to represent the way the value would be described to
a codegen backend as it was lowered to that intermediate representation.
It was never meant to mean anything about the actual psABI handling!
The conflation is because LLVM typically will associate a certain form
with a certain ABI, but even that does not hold when the special cases
that actually exist arise, plus the IR annotations that modify the ABI.
Reframe `rustc_abi::Abi` as the `BackendRepr` of the type, and rename
`BackendRepr::Aggregate` as `BackendRepr::Memory`. Unfortunately, due to
the persistent misunderstandings, this too is now incorrect:
- Scattered ABI-relevant code is entangled with BackendRepr
- We do not always pre-compute a correct BackendRepr that reflects how
we "actually" want this value to be handled, so we leave the backend
interface to also inject various special-cases here
- In some cases `BackendRepr::Memory` is a "real" aggregate, but in
others it is in fact using memory, and in some cases it is a scalar!
Our rustc-to-backend lowering code handles this sort of thing right now.
That will eventually be addressed by lifting duplicated lowering code
to either rustc_codegen_ssa or rustc_target as appropriate.
Then we can rename the _raw functions to drop their suffix, and instead
explicitly use is_stable_const_fn for the few cases where that is really what
you want.
Fundamentally, we have *three* disjoint categories of functions:
1. const-stable functions
2. private/unstable functions that are meant to be callable from const-stable functions
3. functions that can make use of unstable const features
This PR implements the following system:
- `#[rustc_const_stable]` puts functions in the first category. It may only be applied to `#[stable]` functions.
- `#[rustc_const_unstable]` by default puts functions in the third category. The new attribute `#[rustc_const_stable_indirect]` can be added to such a function to move it into the second category.
- `const fn` without a const stability marker are in the second category if they are still unstable. They automatically inherit the feature gate for regular calls, it can now also be used for const-calls.
Also, several holes in recursive const stability checking are being closed.
There's still one potential hole that is hard to avoid, which is when MIR
building automatically inserts calls to a particular function in stable
functions -- which happens in the panic machinery. Those need to *not* be
`rustc_const_unstable` (or manually get a `rustc_const_stable_indirect`) to be
sure they follow recursive const stability. But that's a fairly rare and special
case so IMO it's fine.
The net effect of this is that a `#[unstable]` or unmarked function can be
constified simply by marking it as `const fn`, and it will then be
const-callable from stable `const fn` and subject to recursive const stability
requirements. If it is publicly reachable (which implies it cannot be unmarked),
it will be const-unstable under the same feature gate. Only if the function ever
becomes `#[stable]` does it need a `#[rustc_const_unstable]` or
`#[rustc_const_stable]` marker to decide if this should also imply
const-stability.
Adding `#[rustc_const_unstable]` is only needed for (a) functions that need to
use unstable const lang features (including intrinsics), or (b) `#[stable]`
functions that are not yet intended to be const-stable. Adding
`#[rustc_const_stable]` is only needed for functions that are actually meant to
be directly callable from stable const code. `#[rustc_const_stable_indirect]` is
used to mark intrinsics as const-callable and for `#[rustc_const_unstable]`
functions that are actually called from other, exposed-on-stable `const fn`. No
other attributes are required.
terminology: #[feature] *enables* a feature (instead of "declaring" or "activating" it)
Mostly, we currently call a feature that has a corresponding `#[feature(name)]` attribute in the current crate a "declared" feature. I think that is confusing as it does not align with what "declaring" usually means. Furthermore, we *also* refer to `#[stable]`/`#[unstable]` as *declaring* a feature (e.g. in [these diagnostics](f25e5abea2/compiler/rustc_passes/messages.ftl (L297-L301))), which aligns better with what "declaring" usually means. To make things worse, the functions `tcx.features().active(...)` and `tcx.features().declared(...)` both exist and they are doing almost the same thing (testing whether a corresponding `#[feature(name)]` exists) except that `active` would ICE if the feature is not an unstable lang feature. On top of this, the callback when a feature is activated/declared is called `set_enabled`, and many comments also talk about "enabling" a feature.
So really, our terminology is just a mess.
I would suggest we use "declaring a feature" for saying that something is/was guarded by a feature (e.g. `#[stable]`/`#[unstable]`), and "enabling a feature" for `#[feature(name)]`. This PR implements that.
stabilize Strict Provenance and Exposed Provenance APIs
Given that [RFC 3559](https://rust-lang.github.io/rfcs/3559-rust-has-provenance.html) has been accepted, t-lang has approved the concept of provenance to exist in the language. So I think it's time that we stabilize the strict provenance and exposed provenance APIs, and discuss provenance explicitly in the docs:
```rust
// core::ptr
pub const fn without_provenance<T>(addr: usize) -> *const T;
pub const fn dangling<T>() -> *const T;
pub const fn without_provenance_mut<T>(addr: usize) -> *mut T;
pub const fn dangling_mut<T>() -> *mut T;
pub fn with_exposed_provenance<T>(addr: usize) -> *const T;
pub fn with_exposed_provenance_mut<T>(addr: usize) -> *mut T;
impl<T: ?Sized> *const T {
pub fn addr(self) -> usize;
pub fn expose_provenance(self) -> usize;
pub fn with_addr(self, addr: usize) -> Self;
pub fn map_addr(self, f: impl FnOnce(usize) -> usize) -> Self;
}
impl<T: ?Sized> *mut T {
pub fn addr(self) -> usize;
pub fn expose_provenance(self) -> usize;
pub fn with_addr(self, addr: usize) -> Self;
pub fn map_addr(self, f: impl FnOnce(usize) -> usize) -> Self;
}
impl<T: ?Sized> NonNull<T> {
pub fn addr(self) -> NonZero<usize>;
pub fn with_addr(self, addr: NonZero<usize>) -> Self;
pub fn map_addr(self, f: impl FnOnce(NonZero<usize>) -> NonZero<usize>) -> Self;
}
```
I also did a pass over the docs to adjust them, because this is no longer an "experiment". The `ptr` docs now discuss the concept of provenance in general, and then they go into the two families of APIs for dealing with provenance: Strict Provenance and Exposed Provenance. I removed the discussion of how pointers also have an associated "address space" -- that is not actually tracked in the pointer value, it is tracked in the type, so IMO it just distracts from the core point of provenance. I also adjusted the docs for `with_exposed_provenance` to make it clear that we cannot guarantee much about this function, it's all best-effort.
There are two unstable lints associated with the strict_provenance feature gate; I moved them to a new [strict_provenance_lints](https://github.com/rust-lang/rust/issues/130351) feature since I didn't want this PR to have an even bigger FCP. ;)
`@rust-lang/opsem` Would be great to get some feedback on the docs here. :)
Nominating for `@rust-lang/libs-api.`
Part of https://github.com/rust-lang/rust/issues/95228.
[FCP comment](https://github.com/rust-lang/rust/pull/130350#issuecomment-2395114536)
Rollup of 4 pull requests
Successful merges:
- #126588 (Added more scenarios where comma to be removed in the function arg)
- #131728 (bootstrap: extract builder cargo to its own module)
- #131968 (Rip out old effects var handling code from traits)
- #131981 (Remove the `BoundConstness::NotConst` variant)
r? `@ghost`
`@rustbot` modify labels: rollup
Continue to get rid of `ty::Const::{try_}eval*`
This PR mostly does:
* Removes all of the `try_eval_*` and `eval_*` helpers from `ty::Const`, and replace their usages with `try_to_*`.
* Remove `ty::Const::eval`.
* Rename `ty::Const::normalize` to `ty::Const::normalize_internal`. This function is still used in the normalization code itself.
* Fix some weirdness around the `TransmuteFrom` goal.
I'm happy to split it out further; for example, I could probably land the first part which removes the helpers, or the changes to codegen which are more obvious than the changes to tools.
r? BoxyUwU
Part of https://github.com/rust-lang/rust/issues/130704
Remove `GenKillAnalysis`
There are two kinds of dataflow analysis in the compiler: `Analysis`, which is the basic kind, and `GenKillAnalysis`, which is a more specialized kind for gen/kill analyses that is intended as an optimization. However, it turns out that `GenKillAnalysis` is actually a pessimization! It's faster (and much simpler) to do all the gen/kill analyses via `Analysis`. This lets us remove `GenKillAnalysis`, and `GenKillSet`, and a few other things, and also merge `AnalysisDomain` into `Analysis`. The PR removes 500 lines of code and improves performance.
r? `@tmiasko`
Some float methods are now `const fn` under the `const_float_methods` feature gate.
In order to support `min`, `max`, `abs` and `copysign`, the implementation of some intrinsics had to be moved from Miri to rustc_const_eval.
fix/update teach_note from 'escaping mutable ref/ptr' const-check
The old note was quite confusing since it talked about statics, but the message is also shown for consts. So let's reword to something that is true for both of them.
Don't use Immediate::offset to transmute pointers to integers
This applies the relatively new `assert_matches_abi` check in the `offset` operation on immediates, which makes sure that if offsets are used to alter the layout (which is possible because the field layout is arbitrarily picked by the caller), this is not done in a way that breaks the invariant of the `Immediate` type.
This leads to ICEs in a GVN mir-opt test, so the second commit fixes GVN.
Fixes https://github.com/rust-lang/rust/issues/131064.
- fix for divergence
- fix error message
- fix another cranelift test
- fix some cranelift things
- don't set the NORETURN option for naked asm
- fix use of naked_asm! in doc comment
- fix use of naked_asm! in run-make test
- use `span_bug` in unreachable branch
interpret: always enable write_immediate sanity checks
Writing a wrongly-sized scalar somewhere can have quite confusing effects. Let's see how expensive it is to catch this early.
Check vtable projections for validity in miri
Currently, miri does not catch when we transmute `dyn Trait<Assoc = A>` to `dyn Trait<Assoc = B>`. This PR implements such a check, and fixes https://github.com/rust-lang/miri/issues/3905.
To do this, we modify `GlobalAlloc::VTable` to contain the *whole* list of `PolyExistentialPredicate`, and then modify `check_vtable_for_type` to validate the `PolyExistentialProjection`s of the vtable, along with the principal trait that was already being validated.
cc ``@RalfJung``
r? ``@lcnr`` or types
I also tweaked the diagnostics a bit.
---
**Open question:** We don't validate the auto traits. You can transmute `dyn Foo` into `dyn Foo + Send`. Should we check that? We currently have a test that *exercises* this as not being UB:
6c6d210089/src/tools/miri/tests/pass/dyn-upcast.rs (L14-L20)
I'm not actually sure if we ever decided that's actually UB or not 🤔
We could perhaps still check that the underlying type of the object (i.e. the concrete type that was unsized) implements the auto traits, to catch UB like:
```rust
fn main() {
let x: &dyn Trait = &std::ptr::null_mut::<()>();
let _: &(dyn Trait + Send) = std::mem::transmute(x);
//~^ this vtable is not allocated for a type that is `Send`!
}
```
interpret: remove outdated FIXME
The rule about `repr(C)` types with compatible fields got removed from the ABI compat docs before they landed, so this FIXME here is no longer correct. (So this is basically a follow-up to https://github.com/rust-lang/rust/pull/130185, doing some more cleanup around deciding not to guarantee ABI compatibility for structurally compatible `repr(C)` types.)
fix rustc_nonnull_optimization_guaranteed docs
As far as I can tell, even back when this was [added](https://github.com/rust-lang/rust/pull/60300) it never *enabled* any optimizations. It just indicates that the FFI compat lint should accept those types for NPO.
Prevent Deduplication of `LongRunningWarn`
Fixes#118612
As mention in the issue, `LongRunningWarn` is meant to be repeated multiple times.
Therefore, this PR stores a unique number in every instance of `LongRunningWarn` so that it's not hashed into the same value and omitted by the deduplication mechanism.
interpret, miri: fix dealing with overflow during slice indexing and allocation
This is mostly to fix https://github.com/rust-lang/rust/issues/130284.
I then realized we're using somewhat sketchy arguments for a similar multiplication in `copy`/`copy_nonoverlapping`/`write_bytes`, so I made them all share the same function that checks exactly the right thing. (The intrinsics would previously fail on allocations larger than `1 << 47` bytes... which are theoretically possible maybe? Anyway it seems conceptually wrong to use any other bound than `isize::MAX` here.)
miri: treat non-memory local variables properly for data race detection
Fixes https://github.com/rust-lang/miri/issues/3242
Miri has an optimization where some local variables are not represented in memory until something forces them to be stored in memory (most notably, creating a pointer/reference to the local will do that). However, for a subsystem triggering on memory accesses -- such as the data race detector -- this means that the memory access seems to happen only when the local is moved to memory, instead of at the time that it actually happens. This can lead to UB reports in programs that do not actually have UB.
This PR fixes that by adding machine hooks for reads and writes to such efficiently represented local variables. The data race system tracks those very similar to how it would track reads and writes to addressable memory, and when a local is moved to memory, the clocks get overwritten with the information stored for the local.
Rollup of 5 pull requests
Successful merges:
- #129195 (Stabilize `&mut` (and `*mut`) as well as `&Cell` (and `*const Cell`) in const)
- #130118 (move Option::unwrap_unchecked into const_option feature gate)
- #130295 (Fix target-cpu fpu features on Armv8-R.)
- #130371 (Correctly account for niche-optimized tags in rustc_transmute)
- #130381 (library: Compute Rust exception class from its string repr)
r? `@ghost`
`@rustbot` modify labels: rollup
const-eval interning: accept interior mutable pointers in final value
…but keep rejecting mutable references
This fixes https://github.com/rust-lang/rust/issues/121610 by no longer firing the lint when there is a pointer with interior mutability in the final value of the constant. On stable, such pointers can be created with code like:
```rust
pub enum JsValue {
Undefined,
Object(Cell<bool>),
}
impl Drop for JsValue {
fn drop(&mut self) {}
}
// This does *not* get promoted since `JsValue` has a destructor.
// However, the outer scope rule applies, still giving this 'static lifetime.
const UNDEFINED: &JsValue = &JsValue::Undefined;
```
It's not great to accept such values since people *might* think that it is legal to mutate them with unsafe code. (This is related to how "infectious" `UnsafeCell` is, which is a [wide open question](https://github.com/rust-lang/unsafe-code-guidelines/issues/236).) However, we [explicitly document](https://doc.rust-lang.org/reference/behavior-considered-undefined.html) that things created by `const` are immutable. Furthermore, we also accept the following even more questionable code without any lint today:
```rust
let x: &'static Option<Cell<i32>> = &None;
```
This is even more questionable since it does *not* involve a `const`, and yet still puts the data into immutable memory. We could view this as promotion [potentially introducing UB](https://github.com/rust-lang/unsafe-code-guidelines/issues/493). However, we've accepted this since ~forever and it's [too late to reject this now](https://github.com/rust-lang/rust/pull/122789); the pattern is just too useful.
So basically, if you think that `UnsafeCell` should be tracked fully precisely, then you should want the lint we currently emit to be removed, which this PR does. If you think `UnsafeCell` should "infect" surrounding `enum`s, the big problem is really https://github.com/rust-lang/unsafe-code-guidelines/issues/493 which does not trigger the lint -- the cases the lint triggers on are actually the "harmless" ones as there is an explicit surrounding `const` explaining why things end up being immutable.
What all this goes to show is that the hard error added in https://github.com/rust-lang/rust/pull/118324 (later turned into the future-compat lint that I am now suggesting we remove) was based on some wrong assumptions, at least insofar as it concerns shared references. Furthermore, that lint does not help at all for the most problematic case here where the potential UB is completely implicit. (In fact, the lint is actively in the way of [my preferred long-term strategy](https://github.com/rust-lang/unsafe-code-guidelines/issues/493#issuecomment-2028674105) for dealing with this UB.) So I think we should go back to square one and remove that error/lint for shared references. For mutable references, it does seem to work as intended, so we can keep it. Here it serves as a safety net in case the static checks that try to contain mutable references to the inside of a const initializer are not working as intended; I therefore made the check ICE to encourage users to tell us if that safety net is triggered.
Closes https://github.com/rust-lang/rust/issues/122153 by removing the lint.
Cc `@rust-lang/opsem` `@rust-lang/lang`
- Replace non-standard names like 's, 'p, 'rg, 'ck, 'parent, 'this, and
'me with vanilla 'a. These are cases where the original name isn't
really any more informative than 'a.
- Replace names like 'cx, 'mir, and 'body with vanilla 'a when the lifetime
applies to multiple fields and so the original lifetime name isn't
really accurate.
- Put 'tcx last in lifetime lists, and 'a before 'b.
Fix `clippy::useless_conversion`
Self-explanatory. Probably the last clippy change I'll actually put up since this is the only other one I've actually seen in the wild.
Simplify some nested `if` statements
Applies some but not all instances of `clippy::collapsible_if`. Some ended up looking worse afterwards, though, so I left those out. Also applies instances of `clippy::collapsible_else_if`
Review with whitespace disabled please.
miri: fix overflow detection for unsigned pointer offset
This is the Miri part of https://github.com/rust-lang/rust/pull/130229. This is already UB in codegen so we better make Miri detect it; updating the docs may take time if we have to follow some approval process, but let's make Miri match reality ASAP.
r? ``@scottmcm``
Rollup of 11 pull requests
Successful merges:
- #128523 (Add release notes for 1.81.0)
- #129605 (Add missing `needs-llvm-components` directives for run-make tests that need target-specific codegen)
- #129650 (Clean up `library/profiler_builtins/build.rs`)
- #129651 (skip stage 0 target check if `BOOTSTRAP_SKIP_TARGET_SANITY` is set)
- #129684 (Enable Miri to pass pointers through FFI)
- #129762 (Update the `wasm-component-ld` binary dependency)
- #129782 (couple more crash tests)
- #129816 (tidy: say which feature gate has a stability issue mismatch)
- #129818 (make the const-unstable-in-stable error more clear)
- #129824 (Fix code examples buttons not appearing on click on mobile)
- #129826 (library: Fix typo in `core::mem`)
r? `@ghost`
`@rustbot` modify labels: rollup
make the const-unstable-in-stable error more clear
The default should be to add `rustc_const_unstable`, not `rustc_allow_const_fn_unstable`.
Also I discovered our check for missing const stability attributes on stable functions -- but strangely that check only kicks in for "reachable" functions. `check_missing_stability` checks for reachability since all reachable functions must have a stability attribute, but I would say if a function has `#[stable]` it should also have const-stability attributes regardless of reachability.
Enable Miri to pass pointers through FFI
Following https://github.com/rust-lang/rust/pull/126787, the purpose of this PR is to now enable Miri to execute native calls that make use of pointers.
> <details>
>
> <summary> Simple example </summary>
>
> ```rust
> extern "C" {
> fn ptr_printer(ptr: *mut i32);
> }
>
> fn main() {
> let ptr = &mut 42 as *mut i32;
> unsafe {
> ptr_printer(ptr);
> }
> }
> ```
> ```c
> void ptr_printer(int *ptr) {
> printf("printing pointer dereference from C: %d\n", *ptr);
> }
> ```
> should now show `printing pointer dereference from C: 42`.
>
> </details>
Note that this PR does not yet implement any logic involved in updating Miri's "analysis" state (byte initialization, provenance) upon such a native call.
r? ``@RalfJung``
const fn stability checking: also check declared language features
Fixes https://github.com/rust-lang/rust/issues/129656
`@oli-obk` I assume it is just an oversight that this didn't use `features().declared()`? Or is there a deep reason that this must only check `declared_lib_features`?
interpret: do not make const-eval query result depend on tcx.sess
The check against calling functions with missing target features uses `tcx.sess` to determine which target features are available. However, this can differ between different crates in a crate graph, so the same const-eval query can come to different conclusions about whether a constant evaluates successfully or not -- which is bad, we should consistently get the same result everywhere.
const-eval: do not make UbChecks behavior depend on current crate's flags
Fixes https://github.com/rust-lang/rust/issues/129552
Let's see if we can get away with just always enabling these checks.
Stop storing a special inner body for the coroutine by-move body for async closures
...and instead, just synthesize an item which is treated mostly normally by the MIR pipeline.
This PR does a few things:
* We synthesize a new `DefId` for the by-move body of a closure, which has its `mir_built` fed with the output of the `ByMoveBody` MIR transformation, and some other relevant queries.
* This has the `DefKind::ByMoveBody`, which we use to distinguish it from "real" bodies (that come from HIR) which need to be borrowck'd. Introduce `TyCtxt::is_synthetic_mir` to skip over `mir_borrowck` which is called by `mir_promoted`; borrowck isn't really possible to make work ATM since it heavily relies being called on a body generated from HIR, and is redundant by the construction of the by-move-body.
* Remove the special `PassManager` hacks for handling the inner `by_move_body` stored within the coroutine's mir body. Instead, this body is fed like a regular MIR body, so it's goes through all of the `tcx.*_mir` stages normally (build -> promoted -> ...etc... -> optimized) ✨.
* Remove the `InstanceKind::ByMoveBody` shim, since now we have a "regular" def id, we can just use `InstanceKind::Item`. This also allows us to remove the corresponding hacks from codegen, such as in `fn_sig_for_fn_abi` ✨.
Notable remarks:
* ~~I know it's kind of weird to be using `DefKind::Closure` here, since it's not a distinct closure but just a new MIR body. I don't believe it really matters, but I could also use a different `DefKind`... maybe one that we could use for synthetic MIR bodies in general?~~ edit: We're doing this now.
make it possible to enable const_precise_live_drops per-function
This makes const_precise_live_drops work with rustc_allow_const_fn_unstable so that we can stabilize individual functions that rely on const_precise_live_drops.
The goal is that we can use that to stabilize some of https://github.com/rust-lang/rust/issues/67441 without having to stabilize const_precise_live_drops.
miri weak memory emulation: put previous value into initial store buffer
Fixes https://github.com/rust-lang/miri/issues/2164 by doing a read before each atomic write so that we can initialize the store buffer. The read suppresses memory access hooks and UB exceptions, to avoid otherwise influencing the program behavior. If the read fails, we store that as `None` in the store buffer, so that when an atomic read races with the first atomic write to some memory and previously the memory was uninitialized, we can report UB due to reading uninit memory.
``@cbeuw`` this changes a bit the way we initialize the store buffers. Not sure if you still remember all this code, but if you could have a look to make sure this still makes sense, that would be great. :)
r? ``@saethlin``
const checking: properly compute the set of transient locals
For const-checking the MIR of a const/static initializer, we have to know the set of "transient" locals. The reason for this is that creating a mutable (or interior mutable) reference to a transient local is "safe" in the sense that this reference cannot possibly end up in the final value of the const -- even if it is turned into a raw pointer and stored in a union, we will see that pointer during interning and reliably reject it as dangling.
So far, we determined the set of transient locals as "locals that have a `StorageDead` somewhere". But that's not quite right -- if we had MIR like
```rust
StorageLive(_5);
StorageDead(_5);
StorageLive(_5);
// no further storage annotations for _5
```
Then we'd consider `_5` to be "transient", but it is not actually transient.
We do not currently generate such MIR, but I feel uneasy relying on subtle invariants like this. So this PR implements a proper analysis for computing the set of "transient" locals: a local is "transient" if it is guaranteed dead at all `Return` terminators.
Cc `@cjgillot`
make writes_through_immutable_pointer a hard error
This turns the lint added in https://github.com/rust-lang/rust/pull/118324 into a hard error. This has been reported in cargo's future-compat reports since Rust 1.76 (released in February). Given that const_mut_refs is still unstable, it should be impossible to even hit this error on stable: we did accidentally stabilize some functions that can cause this error, but that got reverted in https://github.com/rust-lang/rust/pull/117905. Still, let's do a crater run just to be sure.
Given that this should only affect unstable code, I don't think it needs an FCP, but let's Cc ``@rust-lang/lang`` anyway -- any objection to making this unambiguous UB into a hard error during const-eval? This can be viewed as part of https://github.com/rust-lang/rust/pull/129195 which is already nominated for discussion.
Use `bool` in favor of `Option<()>` for diagnostics
We originally only supported `Option<()>` for optional notes/labels, but we now support `bool`. Let's use that, since it usually leads to more readable code.
I'm not removing the support from the derive macro, though I guess we could error on it... 🤔
Shrink `TyKind::FnPtr`.
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.
r? `@compiler-errors`
miri: make vtable addresses not globally unique
Miri currently gives vtables a unique global address. That's not actually matching reality though. So this PR enables Miri to generate different addresses for the same type-trait pair.
To avoid generating an unbounded number of `AllocId` (and consuming unbounded amounts of memory), we use the "salt" technique that we also already use for giving constants non-unique addresses: the cache is keyed on a "salt" value n top of the actually relevant key, and Miri picks a random salt (currently in the range `0..16`) each time it needs to choose an `AllocId` for one of these globals -- that means we'll get up to 16 different addresses for each vtable. The salt scheme is integrated into the global allocation deduplication logic in `tcx`, and also used for functions and string literals. (So this also fixes the problem that casting the same function to a fn ptr over and over will consume unbounded memory.)
r? `@saethlin`
Fixes https://github.com/rust-lang/miri/issues/3737
Normalize struct tail properly for `dyn` ptr-to-ptr casting in new solver
Realized that the new solver didn't handle ptr-to-ptr casting correctly.
r? lcnr
Built on #128694
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and
`FnHeader`, which can be packed more efficiently. This reduces the size
of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms.
This reduces peak memory usage by a few percent on some benchmarks. It
also reduces cache misses and page faults similarly, though this doesn't
translate to clear cycles or wall-time improvements on CI.
Pass the right `ParamEnv` to `might_permit_raw_init_strict`
Fixes#119620
`might_permit_raw_init_strict` currently passes an empty `ParamEnv` to the `InterpCx`, instead of the actual `ParamEnv` that was passed in to `check_validity_requirement` at callsite.
This leads to ICEs such as the linked issue where for `UnsafeCell<*mut T>` we initially get the layout with the right `ParamEnv` (which suceeds because it can prove that `T: Sized` and therefore `UnsafeCell<*mut T>` has a known layout) but then do the rest with an empty `ParamEnv` where `T: Sized` is not known to hold so getting the layout for `*mut T` later fails.
This runs into an assertion in other layout code where it's making the (valid) assumption that, when we already have a layout for a struct (`UnsafeCell<*mut T>`), getting the layout of one of its fields (`*mut T`) should also succeed, which wasn't the case here due to using the wrong `ParamEnv`.
So, this PR changes it to just use the same `ParamEnv` all the way throughout.
MIR required_consts, mentioned_items: ensure we do not forget to fill these lists
Bodies initially get created with empty required_consts and mentioned_items, but at some point those should be filled. Make sure we notice when that is forgotten.
raw_eq: using it on bytes with provenance is not UB (outside const-eval)
The current behavior of raw_eq violates provenance monotonicity. See https://github.com/rust-lang/rust/pull/124921 for an explanation of provenance monotonicity. It is violated in raw_eq because comparing bytes without provenance is well-defined, but adding provenance makes the operation UB.
So remove the no-provenance requirement from raw_eq. However, the requirement stays in-place for compile-time invocations of raw_eq, that indeed cannot deal with provenance.
Cc `@rust-lang/opsem`
miri: fix offset_from behavior on wildcard pointers
offset_from wouldn't behave correctly when the "end" pointer was a wildcard pointer (result of an int2ptr cast) just at the end of the allocation. Fix that by expressing the "same allocation" check in terms of two `check_ptr_access_signed` instead of something specific to offset_from, which is both more canonical and works better with wildcard pointers.
The second commit just improves diagnostics: I wanted the "pointer is dangling (has no provenance)" message to say how many bytes of memory it expected to see (since if it were 0 bytes, this would actually be legal, so it's good to tell the user that it's not 0 bytes). And then I was annoying that the error looks so different for when you deref a dangling pointer vs an out-of-bounds pointer so I made them more similar.
Fixes https://github.com/rust-lang/miri/issues/3767
Use `#[rustfmt::skip]` on some `use` groups to prevent reordering.
`use` declarations will be reformatted in #125443. Very rarely, there is a desire to force a group of `use` declarations together in a way that auto-formatting will break up. E.g. when you want a single comment to apply to a group. #126776 dealt with all of these in the codebase, ensuring that no comments intended for multiple `use` declarations would end up in the wrong place. But some people were unhappy with it.
This commit uses `#[rustfmt::skip]` to create these custom `use` groups in an idiomatic way for a few of the cases changed in #126776. This works because rustfmt treats any `use` item annotated with `#[rustfmt::skip]` as a barrier and won't reorder other `use` items around it.
r? `@cuviper`
interpret: add sanity check in dyn upcast to double-check what codegen does
For dyn receiver calls, we already have two codepaths: look up the function to call by indexing into the vtable, or alternatively resolve the DefId given the dynamic type of the receiver. With debug assertions enabled, the interpreter does both and compares the results. (Without debug assertions we always use the vtable as it is simpler.)
This PR does the same for dyn trait upcasts. However, for casts *not* using the vtable is the easier thing to do, so now the vtable path is the debug-assertion-only path. In particular, there are cases where the vtable does not contain a pointer for upcasts but instead reuses the old pointer: when the supertrait vtable is a prefix of the larger vtable. We don't want to expose this optimization and detect UB if people do a transmute assuming this optimization, so we cannot in general use the vtable indexing path.
r? ``@oli-obk``
`use` declarations will be reformatted in #125443. Very rarely, there is
a desire to force a group of `use` declarations together in a way that
auto-formatting will break up. E.g. when you want a single comment to
apply to a group. #126776 dealt with all of these in the codebase,
ensuring that no comments intended for multiple `use` declarations would
end up in the wrong place. But some people were unhappy with it.
This commit uses `#[rustfmt::skip]` to create these custom `use` groups
in an idiomatic way for a few of the cases changed in #126776. This
works because rustfmt treats any `use` item annotated with
`#[rustfmt::skip]` as a barrier and won't reorder other `use` items
around it.
Clean up more comments near use declarations
#125443 will reformat all use declarations in the repository. There are a few edge cases involving comments on use declarations that require care. This PR fixes them up so #125443 can go ahead with a simple `x fmt --all`. A follow-up to #126717.
r? ``@cuviper``
There are some comments describing multiple subsequent `use` items. When
the big `use` reformatting happens some of these `use` items will be
reordered, possibly moving them away from the comment. With this
additional level of formatting it's not really feasible to have comments
of this type. This commit removes them in various ways:
- merging separate `use` items when appropriate;
- inserting blank lines between the comment and the first `use` item;
- outright deletion (for comments that are relatively low-value);
- adding a separate "top-level" comment.
We also entirely skip formatting for four library files that contain
nothing but `pub use` re-exports, where reordering would be painful.
offset_from: always allow pointers to point to the same address
This PR implements the last remaining part of the t-opsem consensus in https://github.com/rust-lang/unsafe-code-guidelines/issues/472: always permits offset_from when both pointers have the same address, no matter how they are computed. This is required to achieve *provenance monotonicity*.
Tracking issue: https://github.com/rust-lang/rust/issues/117945
### What is provenance monotonicity and why does it matter?
Provenance monotonicity is the property that adding arbitrary provenance to any no-provenance pointer must never make the program UB. More specifically, in the program state, data in memory is stored as a sequence of [abstract bytes](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#abstract-byte), where each byte can optionally carry provenance. When a pointer is stored in memory, all of the bytes it is stored in carry that provenance. Provenance monotonicity means: if we take some byte that does not have provenance, and give it some arbitrary provenance, then that cannot change program behavior or introduce UB into a UB-free program.
We care about provenance monotonicity because we want to allow the optimizer to remove provenance-stripping operations. Removing a provenance-stripping operation effectively means the program after the optimization has provenance where the program before the optimization did not -- since the provenance removal does not happen in the optimized program. IOW, the compiler transformation added provenance to previously provenance-free bytes. This is exactly what provenance monotonicity lets us do.
We care about removing provenance-stripping operations because `*ptr = *ptr` is, in general, (likely) a provenance-stripping operation. Specifically, consider `ptr: *mut usize` (or any integer type), and imagine the data at `*ptr` is actually a pointer (i.e., we are type-punning between pointers and integers). Then `*ptr` on the right-hand side evaluates to the data in memory *without* any provenance (because [integers do not have provenance](https://rust-lang.github.io/rfcs/3559-rust-has-provenance.html#integers-do-not-have-provenance)). Storing that back to `*ptr` means that the abstract bytes `ptr` points to are the same as before, except their provenance is now gone. This makes `*ptr = *ptr` a provenance-stripping operation (Here we assume `*ptr` is fully initialized. If it is not initialized, evaluating `*ptr` to a value is UB, so removing `*ptr = *ptr` is trivially correct.)
### What does `offset_from` have to do with provenance monotonicity?
With `ptr = without_provenance(N)`, `ptr.offset_from(ptr)` is always well-defined and returns 0. By provenance monotonicity, I can now add provenance to the two arguments of `offset_from` and it must still be well-defined. Crucially, I can add *different* provenance to the two arguments, and it must still be well-defined. In other words, this must always be allowed: `ptr1.with_addr(N).offset_from(ptr2.with_addr(N))` (and it returns 0). But the current spec for `offset_from` says that the two pointers must either both be derived from an integer or both be derived from the same allocation, which is not in general true for arbitrary `ptr1`, `ptr2`.
To obtain provenance monotonicity, this PR hence changes the spec for offset_from to say that if both pointers have the same address, the function is always well-defined.
### What further consequences does this have?
It means the compiler can no longer transform `end2 = begin.offset(end.offset_from(begin))` into `end2 = end`. However, it can still be transformed into `end2 = begin.with_addr(end.addr())`, which later parts of the backend (when provenance has been erased) can trivially turn into `end2 = end`.
The only alternative I am aware of is a fundamentally different handling of zero-sized accesses, where a "no provenance" pointer is not allowed to do zero-sized accesses and instead we have a special provenance that indicates "may be used for zero-sized accesses (and nothing else)". `offset` and `offset_from` would then always be UB on a "no provenance" pointer, and permit zero-sized offsets on a "zero-sized provenance" pointer. This achieves provenance monotonicity. That is, however, a breaking change as it contradicts what we landed in https://github.com/rust-lang/rust/pull/117329. It's also a whole bunch of extra UB, which doesn't seem worth it just to achieve that transformation.
### What about the backend?
LLVM currently doesn't have an intrinsic for pointer difference, so we anyway cast to integer and subtract there. That's never UB so it is compatible with any relaxation we may want to apply.
If LLVM gets a `ptrsub` in the future, then plausibly it will be consistent with `ptradd` and [consider two equal pointers to be inbounds](https://github.com/rust-lang/rust/pull/124921#issuecomment-2205795829).
Support tail calls in mir via `TerminatorKind::TailCall`
This is one of the interesting bits in tail call implementation — MIR support.
This adds a new `TerminatorKind` which represents a tail call:
```rust
TailCall {
func: Operand<'tcx>,
args: Vec<Operand<'tcx>>,
fn_span: Span,
},
```
*Structurally* this is very similar to a normal `Call` but is missing a few fields:
- `destination` — tail calls don't write to destination, instead they pass caller's destination to the callee (such that eventual `return` will write to the caller of the function that used tail call)
- `target` — similarly to `destination` tail calls pass the caller's return address to the callee, so there is nothing to do
- `unwind` — I _think_ this is applicable too, although it's a bit confusing
- `call_source` — `become` forbids operators and is not created as a lowering of something else; tail calls always come from HIR (at least for now)
It might be helpful to read the interpreter implementation to understand what `TailCall` means exactly, although I've tried documenting it too.
-----
There are a few `FIXME`-questions still left, ideally we'd be able to answer them during review ':)
-----
r? `@oli-obk`
cc `@scottmcm` `@DrMeepster` `@JakobDegen`
offset_from, offset: clearly separate safety requirements the user needs to prove from corollaries that automatically follow
By landing https://github.com/rust-lang/rust/pull/116675 we decided that objects larger than `isize::MAX` cannot exist in the address space of a Rust program, which lets us simplify these rules.
For `offset_from`, we can even state that the *absolute* distance fits into an `isize`, and therefore exclude `isize::MIN`. This PR also changes Miri to treat an `isize::MIN` difference like the other isize-overflowing cases.
Miri function identity hack: account for possible inlining
Having a non-lifetime generic is not the only reason a function can be duplicated. Another possibility is that the function may be eligible for cross-crate inlining. So also take into account the inlining attribute in this Miri hack for function pointer identity.
That said, `cross_crate_inlinable` will still sometimes return true even for `inline(never)` functions:
- when they are `DefKind::Ctor(..) | DefKind::Closure` -- I assume those cannot be `InlineAttr::Never` anyway?
- when `cross_crate_inline_threshold == InliningThreshold::Always`
so maybe this is still not quite the right criterion to use for function pointer identity.
Re-implement a type-size based limit
r? lcnr
This PR reintroduces the type length limit added in #37789, which was accidentally made practically useless by the caching changes to `Ty::walk` in #72412, which caused the `walk` function to no longer walk over identical elements.
Hitting this length limit is not fatal unless we are in codegen -- so it shouldn't affect passes like the mir inliner which creates potentially very large types (which we observed, for example, when the new trait solver compiles `itertools` in `--release` mode).
This also increases the type length limit from `1048576 == 2 ** 20` to `2 ** 24`, which covers all of the code that can be reached with craterbot-check. Individual crates can increase the length limit further if desired.
Perf regression is mild and I think we should accept it -- reinstating this limit is important for the new trait solver and to make sure we don't accidentally hit more type-size related regressions in the future.
Fixes#125460
Implement new effects desugaring
cc `@rust-lang/project-const-traits.` Will write down notes once I have finished.
* [x] See if we want `T: Tr` to desugar into `T: Tr, T::Effects: Compat<true>`
* [x] Fix ICEs on `type Assoc: ~const Tr` and `type Assoc<T: ~const Tr>`
* [ ] add types and traits to minicore test
* [ ] update rustc-dev-guide
Fixes#119717Fixes#123664Fixes#124857Fixes#126148
Add a tidy rule to check that fluent messages and attrs don't end in `.`
This adds a new dependency on `fluent-parse` to `tidy` -- we already rely on it in rustc so I feel like it's not that big of a deal.
This PR also adjusts many error messages that currently end in `.`; not all of them since I added an `ALLOWLIST`, excluded `rustc_codegen_*` ftl files, and `.teach_note` attributes.
r? ``@estebank`` ``@oli-obk``
`PtrMetadata` doesn't care about `*const`/`*mut`/`&`/`&mut`, so GVN away those casts in its argument.
This includes updating MIR to allow calling PtrMetadata on references too, not just raw pointers. That means that `[T]::len` can be just `_0 = PtrMetadata(_1)`, for example.
# Conflicts:
# tests/mir-opt/pre-codegen/slice_index.slice_get_unchecked_mut_range.PreCodegen.after.panic-abort.mir
# tests/mir-opt/pre-codegen/slice_index.slice_get_unchecked_mut_range.PreCodegen.after.panic-unwind.mir
safe transmute: support non-ZST, variantful, uninhabited enums
Previously, `Tree::from_enum`'s implementation branched into three disjoint cases:
1. enums that uninhabited
2. enums for which all but one variant is uninhabited
3. enums with multiple variants
This branching (incorrectly) did not differentiate between variantful and variantless uninhabited enums. In both cases, we assumed (and asserted) that uninhabited enums are zero-sized types. This assumption is false for enums like:
enum Uninhabited { A(!, u128) }
...which, currently, has the same size as `u128`. This faulty assumption manifested as the ICE reported in #126460.
In this PR, we revise the first case of `Tree::from_enum` to consider only the narrow category of "enums that are uninhabited ZSTs". These enums, whose layouts are described with `Variants::Single { index }`, are special in their layouts otherwise resemble the `!` type and cannot be descended into like typical enums. This first case captures uninhabited enums like:
enum Uninhabited { A(!, !), B(!) }
The second case is revised to consider the broader category of "enums that defer their layout to one of their variants"; i.e., enums whose layouts are described with `Variants::Single { index }` and that do have a variant at `index`. This second case captures uninhabited enums that are not ZSTs, like:
enum Uninhabited { A(!, u128) }
...which represent their variants with `Variants::Single`.
Finally, the third case is revised to cover the broader category of "enums with multiple variants", which captures uninhabited enums like:
enum Uninhabited { A(u8, !), B(!, u32) }
...which represent their variants with `Variants::Multiple`.
This PR also adds a comment requested by ````@RalfJung```` in his review of #126358 to `compiler/rustc_const_eval/src/interpret/discriminant.rs`.
Fixes#126460
r? ````@compiler-errors````
Rename `InstanceDef` -> `InstanceKind`
Renames `InstanceDef` to `InstanceKind`. The `Def` here is confusing, and makes it hard to distinguish `Instance` and `InstanceDef`. `InstanceKind` makes this more obvious, since it's really just describing what *kind* of instance we have.
Not sure if this is large enough to warrant a types team MCP -- it's only 53 files. I don't personally think it does, but happy to write one if anyone disagrees. cc ``@rust-lang/types``
r? types
Rollup of 9 pull requests
Successful merges:
- #125829 (rustc_span: Add conveniences for working with span formats)
- #126361 (Unify intrinsics body handling in StableMIR)
- #126417 (Add `f16` and `f128` inline ASM support for `x86` and `x86-64`)
- #126424 ( Also sort `crt-static` in `--print target-features` output)
- #126428 (Polish `std::path::absolute` documentation.)
- #126429 (Add `f16` and `f128` const eval for binary and unary operationations)
- #126448 (End support for Python 3.8 in tidy)
- #126488 (Use `std::path::absolute` in bootstrap)
- #126511 (.mailmap: Associate both my work and my private email with me)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `f16` and `f128` const eval for binary and unary operationations
Add const evaluation and Miri support for f16 and f128, including unary and binary operations. Casts are not yet included.
Fixes https://github.com/rust-lang/rust/issues/124583
r? ``@RalfJung``
Previously, `Tree::from_enum`'s implementation branched into three disjoint
cases:
1. enums that uninhabited
2. enums for which all but one variant is uninhabited
3. enums with multiple inhabited variants
This branching (incorrectly) did not differentiate between variantful and
variantless uninhabited enums. In both cases, we assumed (and asserted) that
uninhabited enums are zero-sized types. This assumption is false for enums like:
enum Uninhabited { A(!, u128) }
...which, currently, has the same size as `u128`. This faulty assumption
manifested as the ICE reported in #126460.
In this PR, we revise the first case of `Tree::from_enum` to consider only the
narrow category of "enums that are uninhabited ZSTs". These enums, whose layouts
are described with `Variants::Single { index }`, are special in their layouts
otherwise resemble the `!` type and cannot be descended into like typical enums.
This first case captures uninhabited enums like:
enum Uninhabited { A(!, !), B(!) }
The second case is revised to consider the broader category of "enums that defer
their layout to one of their variants"; i.e., enums whose layouts are described
with `Variants::Single { index }` and that do have a variant at `index`. This
second case captures uninhabited enums that are not ZSTs, like:
enum Uninhabited { A(!, u128) }
...which represent their variants with `Variants::Single`.
Finally, the third case is revised to cover the broader category of "enums with
multiple variants", which captures uninhabited, non-ZST enums like:
enum Uninhabited { A(u8, !), B(!, u32) }
...which represent their variants with `Variants::Multiple`.
This PR also adds a comment requested by RalfJung in his review of #126358 to
`compiler/rustc_const_eval/src/interpret/discriminant.rs`.
Fixes#126460
const validation: fix ICE on dangling ZST reference
Fixes https://github.com/rust-lang/rust/issues/126393
I'm not super happy with this fix but I can't think of a better one.
r? `@oli-obk`
safe transmute: support `Single` enums
Previously, the implementation of `Tree::from_enum` incorrectly treated enums with `Variants::Single` and `Variants::Multiple` identically. This is incorrect for `Variants::Single` enums, which delegate their layout to that of a variant with a particular index (or no variant at all if the enum is empty).
This flaw manifested first as an ICE. `Tree::from_enum` attempted to compute the tag of variants other than the one at `Variants::Single`'s `index`, and fell afoul of a sanity-checking assertion in `compiler/rustc_const_eval/src/interpret/discriminant.rs`. This assertion is non-load-bearing, and can be removed; the routine its in is well-behaved even without it.
With the assertion removed, the proximate issue becomes apparent: calling `Tree::from_variant` on a variant that does not exist is ill-defined. A sanity check the given variant has `FieldShapes::Arbitrary` fails, and the analysis is (correctly) aborted with `Err::NotYetSupported`.
This commit corrects this chain of failures by ensuring that `Tree::from_variant` is not called on variants that are, as far as layout is concerned, nonexistent. Specifically, the implementation of `Tree::from_enum` is now partitioned into three cases:
1. enums that are uninhabited
2. enums for which all but one variant is uninhabited
3. enums with multiple inhabited variants
`Tree::from_variant` is now only invoked in the third case. In the first case, `Tree::uninhabited()` is produced. In the second case, the layout is delegated to `Variants::Single`'s index.
Fixes#125811
Previously, the implementation of `Tree::from_enum` incorrectly
treated enums with `Variants::Single` and `Variants::Multiple`
identically. This is incorrect for `Variants::Single` enums,
which delegate their layout to that of a variant with a particular
index (or no variant at all if the enum is empty).
This flaw manifested first as an ICE. `Tree::from_enum` attempted
to compute the tag of variants other than the one at
`Variants::Single`'s `index`, and fell afoul of a sanity-checking
assertion in `compiler/rustc_const_eval/src/interpret/discriminant.rs`.
This assertion is non-load-bearing, and can be removed; the routine
its in is well-behaved even without it.
With the assertion removed, the proximate issue becomes apparent:
calling `Tree::from_variant` on a variant that does not exist is
ill-defined. A sanity check the given variant has
`FieldShapes::Arbitrary` fails, and the analysis is (correctly)
aborted with `Err::NotYetSupported`.
This commit corrects this chain of failures by ensuring that
`Tree::from_variant` is not called on variants that are, as far as
layout is concerned, nonexistent. Specifically, the implementation
of `Tree::from_enum` is now partitioned into three cases:
1. enums that are uninhabited
2. enums for which all but one variant is uninhabited
3. enums with multiple inhabited variants
`Tree::from_variant` is now only invoked in the third case. In the
first case, `Tree::uninhabited()` is produced. In the second case,
the layout is delegated to `Variants::Single`'s index.
Fixes#125811