Suppress niches in coroutines to avoid aliasing violations
As mentioned [here](https://github.com/rust-lang/rust/issues/63818#issuecomment-2264915918), using niches in fields of coroutines that are referenced by other fields is unsound: the discriminant accesses violate the aliasing requirements of the reference pointing to the relevant field. This issue causes [Miri errors in practice](https://github.com/rust-lang/miri/issues/3780).
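For illustration, here is a minimal sketch (an assumed example, not taken from the issue report) of the kind of coroutine where this matters: one saved field has a niche and another saved field borrows it across an await point, so packing the state discriminant into that niche would make every discriminant access touch memory that the reference still borrows.

```rust
async fn yield_point() {}

async fn example(x: &u32) {
    let val: Option<&u32> = Some(x); // saved field with a niche (null-pointer `None`)
    let r = &val;                    // another saved field borrowing `val`
    yield_point().await;             // both live across the await, so both are stored in the coroutine
    assert!(r.is_some());
}
```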
The "obvious" fix for this is to suppress niches in coroutines. That's what this PR does. However, we have several tests explicitly ensuring that we *do* use niches in coroutines. So I see two options:
- We guard this behavior behind a `-Z` flag (that Miri will set by default). There is no known case of these aliasing violations causing miscompilations. But absence of evidence is not evidence of absence...
- (What this PR does right now.) We temporarily adjust the coroutine layout logic and the associated tests until the proper fix lands. The "proper fix" here is to wrap fields that other fields can point to in [`UnsafePinned`](https://github.com/rust-lang/rust/issues/125735) and make `UnsafePinned` suppress niches; that would then still permit using niches of *other* fields (those that never get borrowed). However, I know that coroutine sizes are already a problem, so I am not sure if this temporary size regression is acceptable.
`@compiler-errors` any opinion? Also who else should be Cc'd here?
Stop storing a special inner body for the coroutine by-move body for async closures
...and instead, just synthesize an item which is treated mostly normally by the MIR pipeline.
This PR does a few things:
* We synthesize a new `DefId` for the by-move body of a closure, which has its `mir_built` fed with the output of the `ByMoveBody` MIR transformation, and some other relevant queries.
* This has the `DefKind::ByMoveBody`, which we use to distinguish it from "real" bodies (the ones that come from HIR) that need to be borrowck'd. Introduce `TyCtxt::is_synthetic_mir` to skip over `mir_borrowck`, which is called by `mir_promoted`; borrowck isn't really possible to make work ATM since it heavily relies on being called on a body generated from HIR, and it is redundant anyway by the construction of the by-move body.
* Remove the special `PassManager` hacks for handling the inner `by_move_body` stored within the coroutine's mir body. Instead, this body is fed like a regular MIR body, so it goes through all of the `tcx.*_mir` stages normally (build -> promoted -> ...etc... -> optimized) ✨.
* Remove the `InstanceKind::ByMoveBody` shim; since we now have a "regular" def id, we can just use `InstanceKind::Item`. This also allows us to remove the corresponding hacks from codegen, such as in `fn_sig_for_fn_abi` ✨.
Notable remarks:
* ~~I know it's kind of weird to be using `DefKind::Closure` here, since it's not a distinct closure but just a new MIR body. I don't believe it really matters, but I could also use a different `DefKind`... maybe one that we could use for synthetic MIR bodies in general?~~ edit: We're doing this now.
Document & implement the transmutation modeled by `BikeshedIntrinsicFrom`
Documents that `BikeshedIntrinsicFrom` models transmute-via-union, which is slightly more expressive than the transmute-via-cast implemented by `transmute_copy`. Additionally, we provide an implementation of transmute-via-union as a method on the `BikeshedIntrinsicFrom` trait with additional documentation on the boundary between trait invariants and caller obligations.
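As a rough illustration of the difference, here is a minimal sketch of transmute-via-union written as a free function; the name and signature are illustrative and are not the actual trait method.

```rust
use core::mem::ManuallyDrop;

/// Minimal sketch of transmute-via-union: write `Src` into a union and read
/// `Dst` back out, rather than reinterpreting memory through a pointer cast
/// as `transmute_copy` does. Validity of the resulting `Dst` is entirely the
/// caller's obligation.
pub unsafe fn transmute_via_union<Src, Dst>(src: Src) -> Dst {
    union Transmute<Src, Dst> {
        src: ManuallyDrop<Src>,
        dst: ManuallyDrop<Dst>,
    }
    let u = Transmute::<Src, Dst> { src: ManuallyDrop::new(src) };
    // SAFETY: the caller guarantees that the bytes written via `src`
    // constitute a valid `Dst`.
    ManuallyDrop::into_inner(unsafe { u.dst })
}
```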
Whether or not transmute-via-union is the right kind of transmute to model remains up for discussion [1]. Regardless, it seems wise to document the present behavior.
[1] https://rust-lang.zulipchat.com/#narrow/stream/216762-project-safe-transmute/topic/What.20'kind'.20of.20transmute.20to.20model.3F/near/426331967
Tracking Issue: https://github.com/rust-lang/rust/issues/99571
r? `@compiler-errors`
cc `@scottmcm`, `@Lokathor`
Stabilize `raw_ref_op` (RFC 2582)
This stabilizes the syntax `&raw const $expr` and `&raw mut $expr`. It has existed unstably for ~4 years now, and has been exposed on stable via the `addr_of` and `addr_of_mut` macros since Rust 1.51 (released more than 3 years ago). I think it has become clear that these operations are here to stay. So it is about time we give them proper primitive syntax. This has two advantages over the macro:
- Being macros, `addr_of`/`addr_of_mut` could in theory do arbitrary magic with the expression on which they work. The only "magic" they actually do is using the argument as a place expression rather than as a value expression. Place expressions are already a subtle topic and poorly understood by many programmers; having this hidden behind a macro using unstable language features makes this even worse. Conversely, people do have an idea of what happens below `&`/`&mut`, so we can make the subtle topic a lot more approachable by connecting to existing intuition.
- The name `addr_of` is quite unfortunate from today's perspective, given that we have accepted provenance as a reality, which means that a pointer is *not* just an address. Strict provenance has a method, `addr`, which extracts the address of a pointer; using the term `addr` in two different ways is quite unfortunate. That's why this PR soft-deprecates `addr_of` -- we will wait a long time before actually showing any warning here, but we should start telling people that the "addr" part of this name is somewhat misleading, and `&raw` avoids that potential confusion.
In summary, this syntax improves developers' ability to conceptualize the operational semantics of Rust, while making a fundamental operation frequently used in unsafe code feel properly built in.
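For example, a small usage sketch using a packed struct, where taking a normal reference to a field is already a hard error:

```rust
#[repr(packed)]
struct Packed {
    flag: u8,
    value: u32,
}

fn main() {
    let p = Packed { flag: 1, value: 42 };
    // `&p.value` is rejected because the field may be unaligned; `&raw const`
    // creates the raw pointer directly from the place expression, without an
    // intermediate reference.
    let ptr: *const u32 = &raw const p.value;
    // The pointer may be unaligned, so read it accordingly.
    assert_eq!(unsafe { ptr.read_unaligned() }, 42);
}
```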
Possible questions to consider, based on the RFC and [this](https://github.com/rust-lang/rust/issues/64490#issuecomment-1163802912) great summary by `@CAD97`:
- Some questions are entirely about the semantics. The semantics are the same as with the macros so I don't think this should have any impact on this syntax PR. Still, for completeness' sake:
- Should `&raw const *mut_ref` give a read-only pointer?
- Tracked at: https://github.com/rust-lang/unsafe-code-guidelines/issues/257
- I think ideally the answer is "no". Stacked Borrows says that pointer is read-only, but Tree Borrows says it is mutable.
- What exactly does `&raw const (*ptr).field` require? Answered in [the reference](https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html): the arithmetic to compute the field offset follows the rules of `ptr::offset`, making it UB if it goes out-of-bounds. Making this a safe operation (using `wrapping_offset` rules) is considered too much of a loss for alias analysis.
- Choose a different syntax? I don't want to re-litigate the RFC. The only credible alternative that has been proposed is `&raw $place` instead of `&raw const $place`, which (IIUC) could be achieved by making `raw` a contextual keyword in a new edition. The type is named `*const T`, so the explicit `const` is consistent in that regard. `&raw expr` lacks the explicit indication of immutability. However, `&raw const expr` is quite a bit longer than `addr_of!(expr)`.
- Shouldn't we have a completely new, better raw pointer type instead? Yes we all want to see that happen -- but I don't think we should block stabilization on that, given that such a nicer type is not on the horizon currently and given the issues with `addr_of!` mentioned above. (If we keep the `&raw $place` syntax free for this, we could use it in the future for that new type.)
- What about the lint the RFC talked about? It hasn't been implemented yet. Given that the problematic code is UB with or without this stabilization, I don't think the lack of the lint should block stabilization.
- I created an issue to track adding it: https://github.com/rust-lang/rust/issues/127724
- Other points from the "future possibilities" of the RFC:
- "Syntactic sugar" extension: this has not been implemented. I'd argue this is too confusing, we should stick to what the RFC suggested and if we want to do anything about such expressions, add the lint.
- Encouraging / requiring `&raw` in situations where references are often/definitely incorrect: this has been / is being implemented. On packed fields this already is a hard error, and for `static mut` a lint suggesting raw pointers is being rolled out.
- Lowering of casts: this has been implemented. (It's also an invisible implementation detail.)
- `offsetof` woes: we now have native `offset_of` so this is not relevant any more.
To be done before landing:
- [x] Suppress `unused_parens` lint around `&raw {const|mut}` expressions
- See bottom of https://github.com/rust-lang/rust/pull/127679#issuecomment-2264073752 for rationale
- Implementation: https://github.com/rust-lang/rust/pull/128782
- [ ] Update the Reference.
- https://github.com/rust-lang/reference/pull/1567
Fixes https://github.com/rust-lang/rust/issues/64490
cc `@rust-lang/lang` `@rust-lang/opsem`
try-job: x86_64-msvc
try-job: test-various
try-job: dist-various-1
try-job: armhf-gnu
try-job: aarch64-apple
Move ZST ABI handling to `rustc_target`
Currently, target specific handling of ZST function call ABI (specifically passing them indirectly instead of ignoring them) is handled in `rustc_ty_utils`, whereas all other target specific function call ABI handling is located in `rustc_target`. This PR moves the ZST handling to `rustc_target` so that all the target-specific function call ABI handling is in one place. In the process of doing so, this PR fixes #125850 by ensuring that ZST arguments are always correctly ignored in the x86-64 `"sysv64"` ABI; any code which would be affected by this fix would have ICEd before this PR. Tests are also added using `#[rustc_abi(debug)]` to ensure this behaviour does not regress.
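For illustration, a minimal sketch (an assumed example, for an x86-64 target; the exact shapes that previously ICEd are in the linked issue) of a ZST argument in an `extern "sysv64"` function, which is now simply ignored by the ABI:

```rust
// Illustrative: the 1-ZST argument `z` is ignored by the x86-64 "sysv64" ABI.
#[derive(Clone, Copy)]
struct Zst;

extern "sysv64" fn takes_zst(z: Zst, x: u64) -> u64 {
    let _ = z;
    x
}

fn main() {
    assert_eq!(takes_zst(Zst, 7), 7);
}
```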
Fixes #125850
Shrink `TyKind::FnPtr`.
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.
r? `@compiler-errors`
Normalize struct tail properly for `dyn` ptr-to-ptr casting in new solver
Realized that the new solver didn't handle ptr-to-ptr casting correctly.
r? lcnr
Built on #128694
Ensure floats are returned losslessly by the Rust ABI on 32-bit x86
Solves #115567 for the (default) `"Rust"` ABI. When compiling for 32-bit x86, this PR changes the `"Rust"` ABI to return floats indirectly instead of in x87 registers (with the exception of single `f32`s, which this PR returns in general purpose registers as they are small enough to fit in one). No change is made to the `"C"` ABI as that ABI requires x87 register usage and therefore will need a different solution.
Remove the unstable `extern "wasm"` ABI (`wasm_abi` feature tracked
in #83788).
As discussed in https://github.com/rust-lang/rust/pull/127513#issuecomment-2220410679
and following, this ABI is a failed experiment that did not end
up being used for anything. Keeping support for this ABI in LLVM 19
would require us to switch wasm targets to the `experimental-mv`
ABI, which we do not want to do.
It should be noted that `Abi::Wasm` was internally used for two
things: The `-Z wasm-c-abi=legacy` ABI that is still used by
default on some wasm targets, and the `extern "wasm"` ABI. Despite
both being `Abi::Wasm` internally, they were not the same. An
explicit `extern "wasm"` additionally enabled the `+multivalue`
feature.
I've opted to remove `Abi::Wasm` in this patch entirely, instead
of keeping it as an ABI with only internal usage. Both
`-Z wasm-c-abi` variants are now treated as part of the normal
C ABI, just with different treatment in `adjust_for_foreign_abi`.
Re-implement a type-size based limit
r? lcnr
This PR reintroduces the type length limit added in #37789, which was accidentally made practically useless by the caching changes to `Ty::walk` in #72412, which caused the `walk` function to no longer walk over identical elements.
Hitting this length limit is not fatal unless we are in codegen -- so it shouldn't affect passes like the MIR inliner, which creates potentially very large types (which we observed, for example, when the new trait solver compiles `itertools` in `--release` mode).
This also increases the type length limit from `1048576 == 2 ** 20` to `2 ** 24`, which covers all of the code that can be reached with craterbot-check. Individual crates can increase the length limit further if desired.
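For reference, the opt-in looks roughly like this (a sketch; the value shown here is arbitrary):

```rust
// A crate that legitimately builds very large types (e.g. deeply nested
// iterator adapters) can raise the limit above the new default via the
// existing crate attribute:
#![type_length_limit = "33554432"] // 2 * 2^24, i.e. double the new default

fn main() {}
```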
Perf regression is mild and I think we should accept it -- reinstating this limit is important for the new trait solver and to make sure we don't accidentally hit more type-size related regressions in the future.
Fixes #125460
Fix `FnMut::call_mut`/`Fn::call` shim for async closures that capture references
I adjusted async closures to be able to implement `Fn` and `FnMut` *even if* they capture references, as long as those references did not need to borrow data from the closure captures themselves. See #125259.
However, when I did this, I didn't actually relax an assertion in the `build_construct_coroutine_by_move_shim` shim code, which builds the `Fn`/`FnMut`/`FnOnce` implementations for async closures. Therefore, if we actually tried to *call* `FnMut`/`Fn` on async closures, it would ICE.
This PR adjusts this assertion to ensure that we only capture immutable references in closures if they implement `Fn`/`FnMut`. It also adds a bunch of tests and makes more of the async-closure tests into `build-pass` since we often care about these tests actually generating the right closure shims and stuff. I think it might be excessive to *always* use build-pass here, but 🤷 it's not that big of a deal.
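A rough sketch of the case being fixed (an assumed example; at the time of this PR it required nightly and `#![feature(async_closure)]`): an async closure that only captures an immutable reference to outer data still implements `Fn`, and calling it through that impl exercises the shim that previously ICEd.

```rust
#![feature(async_closure)]
use std::future::Future;

// Calling through the `Fn` bound goes through the `Fn::call` shim.
async fn call_via_fn<F, Fut>(f: F) -> String
where
    F: Fn() -> Fut,
    Fut: Future<Output = String>,
{
    f().await
}

async fn demo() -> String {
    let name = String::from("world");
    // Captures `&name` (a reference to data outside the closure), but the
    // returned future doesn't borrow from the closure's own captures, so the
    // closure can still implement `Fn`.
    let greet = async || format!("hello, {name}");
    call_via_fn(greet).await
}
```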
Fixes #127019
Fixes #127012
r? oli-obk
Only compute vtable information during codegen
This PR removes vtable information from the `Object` and `TraitUpcasting` candidate sources in the trait solvers, and defers the computation of relevant information to `Instance::resolve`. This is because vtables really aren't a thing in the trait world -- they're an implementation detail in codegen.
Previously it was just easiest to tangle this information together since we were already doing the work of looking at all the supertraits in the trait solver, and specifically because we use traits to represent when it's possible to call a method via a vtable (`Object` candidate) and do upcasting (`Unsize` candidate). But I am somewhat suspicious we're doing a *lot* of extra work, especially in polymorphic contexts, so let's see what perf says.
We already do this for a number of crates, e.g. `rustc_middle`,
`rustc_span`, `rustc_metadata`, `rustc_errors`.
For the ones we don't, in many cases the attributes are a mess.
- There is no consistency about order of attribute kinds (e.g.
`allow`/`deny`/`feature`).
- Within attribute kind groups (e.g. the `feature` attributes),
sometimes the order is alphabetical, and sometimes there is no
particular order.
- Sometimes the attributes of a particular kind aren't even grouped
all together, e.g. there might be a `feature`, then an `allow`, then
another `feature`.
This commit extends the existing sorting to all compiler crates,
increasing consistency. If any new attribute line is added, there is now
only one place it can go -- no need for arbitrary decisions.
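For example, the target style looks roughly like this (an illustrative set of crate attributes, not taken from any particular crate): attribute kinds grouped together, with each group sorted alphabetically.

```rust
#![allow(internal_features)]
#![allow(rustc::usage_of_ty_tykind)]
#![deny(unsafe_op_in_unsafe_fn)]
#![feature(box_patterns)]
#![feature(if_let_guard)]
#![feature(let_chains)]
```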
Exceptions:
- `rustc_log`, `rustc_next_trait_solver` and `rustc_type_ir_macros`,
because they have no crate attributes.
- `rustc_codegen_gcc`, because it's quasi-external to rustc (e.g. it's
ignored in `rustfmt.toml`).
Implement `needs_async_drop` in rustc and optimize async drop glue
This PR expands on #121801 and implements `Ty::needs_async_drop` which works almost exactly the same as `Ty::needs_drop`, which is needed for #123948.
Also made the compiler's async drop code look more like its regular drop code, which enabled me to write an optimization where types that do not use `AsyncDrop` can simply forward async drop glue to `drop_in_place`. This made the size of the async block from the [async_drop test](67980dd6fb/tests/ui/async-await/async-drop.rs) decrease by 12%.
Make `body_owned_by` return the `Body` instead of just the `BodyId`
fixes #125677
Almost all `body_owned_by` callers immediately called `body`, too, so just return `Body` directly.
This makes the inline-const query feeding more robust, as all calls to `body_owned_by` will now yield a body for inline consts, too.
I have not yet figured out a good way to make `tcx.hir().body()` return an inline-const body, but that can be done as a follow-up.
Uplift `EarlyBinder` into `rustc_type_ir`
We also need to give `EarlyBinder` a `'tcx` param, so that we can carry the `Interner` in the `EarlyBinder` too. This is necessary because otherwise we have an unconstrained `I: Interner` parameter in many of the `EarlyBinder`'s inherent impls.
I also generally think that this is desirable to have, in case we later want to track some state in the `EarlyBinder`.
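A minimal sketch of the constraint problem, with simplified types that are not the actual rustc definitions:

```rust
use std::marker::PhantomData;

// Without an interner parameter on the binder itself, an impl like the one
// below would leave `I` unconstrained (error E0207). Carrying `I` (or a
// `'tcx` lifetime) in the type makes such impls well-formed.
struct EarlyBinder<I, T> {
    value: T,
    _interner: PhantomData<I>,
}

impl<I, T> EarlyBinder<I, T> {
    fn as_ref(&self) -> EarlyBinder<I, &T> {
        EarlyBinder { value: &self.value, _interner: PhantomData }
    }
}
```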
r? lcnr
Rename Unsafe to Safety
Alternative to #124455, which proposed having just one `Safety` enum to use everywhere; this opens the possibility of adding `ast::Safety::Safe`, which is useful for unsafe extern blocks.
This leaves us today with:
```rust
enum ast::Safety {
    Unsafe(Span),
    Default,
    // Safe (going to be added for unsafe extern blocks)
}

enum hir::Safety {
    Unsafe,
    Safe,
}
```
We would convert from `ast::Safety::Default` into the right Safety level according to the context.
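A minimal sketch of that conversion (names and signature illustrative, not the actual lowering code):

```rust
// Illustrative: lowering picks a concrete safety level when the AST says
// `Default`, based on the surrounding context's default.
fn lower_safety(safety: ast::Safety, default: hir::Safety) -> hir::Safety {
    match safety {
        ast::Safety::Unsafe(_) => hir::Safety::Unsafe,
        ast::Safety::Default => default,
    }
}
```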
Split out `ty::AliasTerm` from `ty::AliasTy`
Splitting out `AliasTerm` (for use in projection and normalizes-to goals) and `AliasTy` (for use in `ty::Alias`)
r? lcnr
Refactor float `Primitive`s to a separate `Float` type
Now there are 4 of them, it makes sense to refactor `F16`, `F32`, `F64` and `F128` out of `Primitive` and into a separate `Float` type (like integers already are). This allows patterns like `F16 | F32 | F64 | F128` to be simplified into `Float(_)`, and is consistent with `ty::FloatTy`.
As a side effect, this PR also makes the `Ty::primitive_size` method work with `f16` and `f128`.
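A simplified sketch of the shape of the change (not the exact `rustc_abi` definitions):

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum Float {
    F16,
    F32,
    F64,
    F128,
}

#[derive(Clone, Copy, PartialEq, Eq)]
enum Primitive {
    Int { bits: u16, signed: bool }, // simplified; the real variant uses `Integer`
    Float(Float),
    Pointer,
}

// Matches that previously spelled out `F16 | F32 | F64 | F128` become:
fn is_float(p: Primitive) -> bool {
    matches!(p, Primitive::Float(_))
}

fn main() {
    assert!(is_float(Primitive::Float(Float::F64)));
    assert!(!is_float(Primitive::Int { bits: 32, signed: true }));
}
```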
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
Add simple async drop glue generation
This is a prototype of the async drop glue generation for some simple types. Async drop glue is intended to behave very similarly to the regular drop glue except for being asynchronous. Currently it does not execute synchronous drops but only calls user implementations of the `AsyncDrop::async_drop` associated function and awaits the returned future. It is not complete, as it only recurses into arrays, slices, tuples, and structs, and it does not yet have the same sensible restrictions as the old `Drop` trait implementation (such as requiring the same bounds as the type definition), even though code assumes their existence (this requires future work).
This current design uses a workaround as it does not create any custom async destructor state machine types for ADTs, but instead uses types defined in the std library called future combinators (`deferred_async_drop`, `chain`, `ready_unit`).
Also I recommend reading my [explainer](https://zetanumbers.github.io/book/async-drop-design.html).
This is a part of the [MCP: Low level components for async drop](https://github.com/rust-lang/compiler-team/issues/727) work.
Feature completeness:
- [x] `AsyncDrop` trait
- [ ] `async_drop_in_place_raw`/async drop glue generation support for
- [x] Trivially destructible types (integers, bools, floats, string slices, pointers, references, etc.)
- [x] Arrays and slices (array pointer is unsized into slice pointer)
- [x] ADTs (enums, structs, unions)
- [x] tuple-like types (tuples, closures)
- [ ] Dynamic types (`dyn Trait`, see explainer's [proposed design](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#async-drop-glue-for-dyn-trait))
- [ ] coroutines (https://github.com/rust-lang/rust/pull/123948)
- [x] Async drop glue includes sync drop glue code
- [x] Cleanup branch generation for `async_drop_in_place_raw`
- [ ] Union rejects non-trivially async destructible fields
- [ ] `AsyncDrop` implementation requires same bounds as type definition
- [ ] Skip trivially destructible fields (optimization)
- [ ] New [`TyKind::AdtAsyncDestructor`](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#adt-async-destructor-types) and get rid of combinators
- [ ] [Synchronously undroppable types](https://github.com/zetanumbers/posts/blob/main/async-drop-design.md#exclusively-async-drop)
- [ ] Automatic async drop at the end of the scope in async context
Introduce perma-unstable `wasm-c-abi` flag
Now that `wasm-bindgen` v0.2.88 supports the spec-compliant C ABI, the idea is to switch to that in a future version of Rust. In the meantime it would be good to let people test and play around with it.
This PR introduces a new perma-unstable `-Zwasm-c-abi` compiler flag, which switches to the new spec-compliant C ABI when targeting `wasm32-unknown-unknown`.
Alternatively, we could also stabilize this and then deprecate it when we switch. I will leave this to the Rust maintainers to decide.
This is a companion PR to #117918, but they could be merged independently.
MCP: https://github.com/rust-lang/compiler-team/issues/703
Tracking issue: https://github.com/rust-lang/rust/issues/122532
Trait predicates for types which have errors may still
evaluate to OK, leading to downstream ICEs. Now we return
a selection error for such types in candidate assembly and
thereby prevent such issues.
Cleanup: Rename `HAS_PROJECTIONS` to `HAS_ALIASES` etc.
The name of the bitflag `HAS_PROJECTIONS` and of its corresponding method `has_projections` is quite historical, dating back to a time when projections were the only kind of alias type.
I think it's time to update it to clear up any potential confusion for newcomers and to reduce unnecessary friction during contributor onboarding.
r? types
Only inspect user-written predicates for privacy concerns
fixes #123288
Previously we looked at the elaborated predicates, which, due to adding various bounds on fields, end up requiring trivially true bounds. But these bounds can contain private types, which the privacy visitor then found and errored about.
Assert that args are actually compatible with their generics, rather than just their count
Right now we just check that the number of args is right, rather than actually checking the kinds. Uplift a helper fn that I wrote from trait selection to do just that. Found a couple bugs along the way.
r? `@lcnr` or `@fmease` (or anyone really lol)
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR changes it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
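To make the "dozens of reasonable implementations" point concrete, here is a small sketch (illustrative, not libcore's actual code) of two source-level ways to write a three-way compare that are equivalent at the Rust level; with `BinOp::Cmp` in MIR the choice of lowering is left to the backend instead of being baked into libcore.

```rust
use std::cmp::Ordering;

fn cmp_branches(a: i32, b: i32) -> Ordering {
    if a < b { Ordering::Less } else if a == b { Ordering::Equal } else { Ordering::Greater }
}

fn cmp_bitops(a: i32, b: i32) -> Ordering {
    // (a > b) as i8 - (a < b) as i8  is -1, 0, or 1
    match (a > b) as i8 - (a < b) as i8 {
        -1 => Ordering::Less,
        0 => Ordering::Equal,
        _ => Ordering::Greater,
    }
}

fn main() {
    for (a, b) in [(1, 2), (2, 2), (3, 2)] {
        assert_eq!(cmp_branches(a, b), cmp_bitops(a, b));
    }
}
```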
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
Make `TyCtxt::coroutine_layout` take coroutine's kind parameter
For coroutines that come from coroutine-closures (i.e. async closures), we may have two kinds of bodies stored in the coroutine; one that takes the closure's captures by reference, and one that takes the captures by move.
These currently have identical layouts, but if we do any optimization for these layouts that are related to the upvars, then they will diverge -- e.g. https://github.com/rust-lang/rust/pull/120168#discussion_r1536943728.
This PR relaxes the assertion I added in #121122, and instead makes the `TyCtxt::coroutine_layout` method take the `coroutine_kind_ty` argument from the coroutine, which will allow us to differentiate these by-move and by-ref bodies.