I've been experimenting with transforming the StableMIR to instrument
the code with potential UB checks. The modified body will only
be used by our analysis tool, however, constants in StableMIR must be
backed by rustc constants. Thus, I'm adding a few functions to build
constants, such as building string and other primitives.
Add asm goto support to `asm!`
Tracking issue: #119364
This PR implements asm-goto support, using the syntax described in "future possibilities" section of [RFC2873](https://rust-lang.github.io/rfcs/2873-inline-asm.html#asm-goto).
Currently I have only implemented the `label` part, not the `fallthrough` part (i.e. fallthrough is implicit). This doesn't reduce the expressive though, since you can use label-break to get arbitrary control flow or simply set a value and rely on jump threading optimisation to get the desired control flow. I can add that later if deemed necessary.
r? ``@Amanieu``
cc ``@ojeda``
`CompilerError` has `CompilationFailed` and `ICE` variants, which seems
reasonable at first. But the way it identifies them is flawed:
- If compilation errors out, i.e. `RunCompiler::run` returns an `Err`,
it uses `CompilationFailed`, which is reasonable.
- If compilation panics with `FatalError`, it catches the panic and uses
`ICE`. This is sometimes right, because ICEs do cause `FatalError`
panics, but sometimes wrong, because certain compiler errors also
cause `FatalError` panics. (The compiler/rustdoc/clippy/whatever just
catches the `FatalError` with `catch_with_exit_code` in `main`.)
In other words, certain non-ICE compilation failures get miscategorized
as ICEs. It's not possible to reliably distinguish the two cases, so
this commit merges them. It also renames the combined variant as just
`Failed`, to better match the existing `Interrupted` and `Skipped`
variants.
Here is an example of a non-ICE failure that causes a `FatalError`
panic, from `tests/ui/recursion_limit/issue-105700.rs`:
```
#![recursion_limit="4"]
#![invalid_attribute]
#![invalid_attribute]
#![invalid_attribute]
#![invalid_attribute]
#![invalid_attribute]
//~^ERROR recursion limit reached while expanding
fn main() {{}}
```
The internal function was unsound, it could cause UB in rare cases where
the user inadvertly stored the returned object in a location that could
outlive the TyCtxt.
In order to make it safe, we now take a type context as an argument to
the internal fn, and we ensure that interned items are lifted using the
provided context.
Thus, this change ensures that the compiler can properly enforce
that the object does not outlive the type context it was lifted to.
Make tcx optional from StableMIR run macro and extend it to accept closures
Change `run` macro to avoid sometimes unnecessary dependency on `TyCtxt`, and introduce `run_with_tcx` to capture use cases where `tcx` is required. Additionally, extend both macros to accept closures that may capture variables.
I've also modified the `internal()` method to make it safer, by accepting the type context to force the `'tcx` lifetime to match the context lifetime.
These are non-backward compatible changes, but they only affect internal APIs which are provided today as helper functions until we have a stable API to start the compiler.
I added `tcx` argument to `internal` to force 'tcx to be the same
lifetime as TyCtxt. The only other solution I could think is to change
this function to be `unsafe`.
Simplify the `run` macro to avoid sometimes unnecessary dependency
on `TyCtxt`. Instead, users can use the new internal method `tcx()`.
Additionally, extend the macro to accept closures that may capture
variables.
These are non-backward compatible changes, but they only affect
internal APIs which are provided today as helper functions until we
have a stable API to start the compiler.
Add method to get instance instantiation arguments
Add a method to get the instance instantiation arguments, and include that information in the instance debug.
Add function ABI and type layout to StableMIR
This change introduces a new module to StableMIR named `abi` with information from `rustc_target::abi` and `rustc_abi`, that allow users to retrieve more low level information required to perform bit-precise analysis.
The layout of a type can be retrieved via `Ty::layout`, and the instance ABI can be retrieved via `Instance::fn_abi()`.
To properly handle errors while retrieve layout information, we had to implement a few layout related traits.
r? ```@compiler-errors```
This change introduces a new module to StableMIR named `abi` with
information from `rustc_target::abi` and `rustc_abi`, that allow users
to retrieve more low level information required to perform
bit-precise analysis.
The layout of a type can be retrieved via `Ty::layout`, and the instance
ABI can be retrieved via `Instance::fn_abi()`.
To properly handle errors while retrieve layout information, we had
to implement a few layout related traits.
- Remove `fn_sig()` from Instance.
- Change return value of `AssertMessage::description` to `Cow<>`.
- Add assert to instance `ty()`.
- Generalize uint / int type creation.
Fix BinOp `ty()` assertion and `fn_sig()` for closures
`BinOp::ty()` was asserting that the argument types were primitives. However, the primitive check doesn't include pointers, which can be used in a `BinaryOperation`. Thus extend the arguments to include them.
Since I had to add methods to check for pointers in TyKind, I just went ahead and added a bunch more utility checks that can be handy for our users and fixed the `fn_sig()` method to also include closures.
`@compiler-errors` just wanted to confirm that today no `BinaryOperation` accept SIMD types. Is that correct?
r? `@compiler-errors`
detects redundant imports that can be eliminated.
for #117772 :
In order to facilitate review and modification, split the checking code and
removing redundant imports code into two PR.
Add instance evaluation and methods to read an allocation in StableMIR
The instance evaluation is needed to handle intrinsics such as `type_id` and `type_name`.
Since we now use Allocation to represent all evaluated constants, provide a few methods to help process the data inside an allocation.
I've also started to add a structured way to get information about the compilation target machine. For now, I've only added information needed to process an allocation.
r? ``````@ouz-a``````
The instance evaluation is needed to handle intrinsics such as
`type_id` and `type_name`.
Since we now use Allocation to represent all evaluated constants,
provide a few methods to help process the data inside an allocation.
Add method to get type of an Rvalue in StableMIR
Provide a method to StableMIR users to retrieve the type of an Rvalue operation. There were two possible implementation:
1. Create the logic inside stable_mir to process the type according to the Rvalue semantics, which duplicates the logic of `rustc_middle::mir::Rvalue::ty()`.
2. Implement the Rvalue translation from StableMIR back to internal representation, invoke the `rustc_middle::mir::Rvalue::ty()`, and translate the return value to StableMIR.
I chose the first one for now since the duplication was fairly small, and the option 2 would require way more work to translate everything back to rustc internal representation. If we eventually add those translations, we could easily swap to the option 2.
```@compiler-errors``` / ```@ouz-a``` Please let me know if you have any strong opinion here.
r? ```@compiler-errors```
compile-time evaluation: detect writes through immutable pointers
This has two motivations:
- it unblocks https://github.com/rust-lang/rust/pull/116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in https://github.com/rust-lang/rust/pull/117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB
When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.
The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.
I just hope perf works out.
Fix is_foreign_item for StableMIR instance
Change the implementation of `Instance::is_foreign_item` to directly query the compiler for the instance `def_id` instead of incorrectly relying on the conversion to `CrateItem`. I also added a method to check if the instance has body, since the function already existed and it just wasn't exposed via public APIs. This makes it much cheaper for the user to check if the instance has body.
## Background:
- In pull https://github.com/rust-lang/rust/pull/118524, I fixed the conversion from Instance to CrateItem to avoid the conversion if the instance didn't have a body available. This broke the `is_foreign_item`.
r? `@ouz-a`
Change the implementation of `Instance::is_foreign_item` to directly
query the compiler for the instance `def_id` instead of incorrectly
relying on the conversion to `CrateItem`.
Background:
- In pull https://github.com/rust-lang/rust/pull/118524, I fixed the
conversion from Instance to CrateItem to avoid the conversion if the
instance didn't have a body available. This broke the `is_foreign_item`.
Although, we would like to avoid crashes whenever
possible, and that's why I wanted to make this API fallible. It's
looking pretty hard to do proper validation.
I think many of our APIs will unfortunately depend on the user doing
the correct thing since at the MIR level we are working on,
we expect types to have been checked already.
The new structure encodes its invariant, which reduces the likelihood
of having an inconsistent representation. It is also more intuitive and
user friendly.
I encapsulated the structure for now in case we decide to change it back.
Add `pretty_terminator` to pretty stable-mir
~Because we don't have successors in `stable_mir` this is somewhat lacking but it's better than nothing~, also fixed bug(?) with `Opaque` which printed extra `"` when we try to print opaqued `String`.
**Edit**: Added successors so this covers Terminators as a whole.
r? `@celinval`
Add common trait for crate definitions
In stable mir, we specialize DefId, however some functionality is the same for every definition, such as def paths, and getting their crate. Use a trait to implement those.
Remove `PredicateKind::ClosureKind`
We don't need the `ClosureKind` predicate kind -- instead, `Fn`-family trait goals are left as ambiguous, and we only need to make progress on `FnOnce` projection goals for inference purposes.
This is similar to how we do confirmation of `Fn`-family trait and projection goals in the new trait solver, which also doesn't use the `ClosureKind` predicate.
Some hacky logic is added in the second commit so that we can keep the error messages the same.
Rollup of 5 pull requests
Successful merges:
- #117972 (Add VarDebugInfo to Stable MIR)
- #118109 (rustdoc-search: simplify `checkPath` and `sortResults`)
- #118110 (Document `DefiningAnchor` a bit more)
- #118112 (Don't ICE when ambiguity is found when selecting `Index` implementation in typeck)
- #118135 (Remove quotation from filename in stable_mir)
Failed merges:
- #118012 (Add support for global allocation in smir)
r? `@ghost`
`@rustbot` modify labels: rollup
Remove quotation from filename in stable_mir
Previously we had quotation marks in filenames which is obviously wrong this fixes that.
r? ```@celinval```
Fixed the `has_body()` function operator. Before that, this function was
returning false for all shims.
Change resolve_drop_in_place() to also return an instance for empty
shims, since they may still be required for vtable construction.
Add more APIs to retrieve information about types, and add more instance
resolution options.
Make `Instance::body()` return an Option<Body>, since not every instance
might have an available body. For example, foreign instances, virtual
instances, dependencies.
In cases like Kani, we will invoke the rustc_internal run command
directly for now. It would be handly to be able to have a callback
that can return a value.
We also need extra methods to convert stable constructs into internal
ones, so we can break down the transition into finer grain commits.
finish `RegionKind` renaming
second step of https://github.com/rust-lang/types-team/issues/95
continues the work from #117876. While working on this and I encountered a bunch of further cleanup which I'll either open a tracking issue for or will do in a separate PR:
- rewrite the `RegionKind` docs, they still talk about `ReEmpty` and are generally out of date
- rename `DescriptionCtx` to `DescriptionCtxt`
- what is `CheckRegions::Bound`?
- `collect_late_bound_regions` et al
- `erase_late_bound_regions` -> `instantiate_bound_regions_with_erased`?
- `EraseEarlyRegions` visitor should be removed, feels duplicate
r? `@BoxyUwU`
Add richer structure for Stable MIR Projections
Resolves https://github.com/rust-lang/project-stable-mir/issues/49.
Projections in Stable MIR are currently just strings. This PR replaces that representation with a richer structure, namely projections become vectors of `ProjectionElem`s, just as in MIR. The `ProjectionElem` enum is heavily based off of the MIR `ProjectionElem`.
This PR is a draft since there are several outstanding issues to resolve, including:
- How should `UserTypeProjection`s be represented in Stable MIR? In MIR, the projections are just a vector of `ProjectionElem<(),()>`, meaning `ProjectionElem`s that don't have Local or Type arguments (for `Index`, `Field`, etc. objects). Should `UserTypeProjection`s be represented this way in Stable MIR as well? Or is there a more user-friendly representation that wouldn't drag along all the `ProjectionElem` variants that presumably can't appear?
- What is the expected behavior of a `Place`'s `ty` function? Should it resolve down the chain of projections so that something like `*_1.f` would return the type referenced by field `f`?
- Tests should be added for `UserTypeProjection`
Also shifts comments explaining why Stable MIR drops an optional variant
name field, for `Downcast` projection elements, to the `Place::stable`
function.
Add CoroutineWitness to covered types in smir
Previously we accepted `CouroutineWitness` as `unreachable!` but https://github.com/rust-lang/project-stable-mir/issues/50 shows it is indeed reachable, this pr fixes that and covers `CouroutineWitness`
It's not clear to me (klinvill) that UserTypeProjections are produced
anymore with the removal of type ascriptions as per
https://github.com/rust-lang/rfcs/pull/3307. Furthermore, it's not clear
to me which variants of ProjectionElem could appear in such projections.
For these reasons, I'm reverting projections in UserTypeProjections to
simple strings until I can get more clarity on UserTypeProjections.
This commit includes richer projections for both Places and
UserTypeProjections. However, the tests only touch on Places. There are
also outstanding TODOs regarding how projections should be resolved to
produce Place types, and regarding if UserTypeProjections should just
contain ProjectionElem<(),()> objects as in MIR.
Support enum variants in offset_of!
This MR implements support for navigating through enum variants in `offset_of!`, placing the enum variant name in the second argument to `offset_of!`. The RFC placed it in the first argument, but I think it interacts better with nested field access in the second, as you can then write things like
```rust
offset_of!(Type, field.Variant.field)
```
Alternatively, a syntactic distinction could be made between variants and fields (e.g. `field::Variant.field`) but I'm not convinced this would be helpful.
[RFC 3308 # Enum Support](https://rust-lang.github.io/rfcs/3308-offset_of.html#enum-support-offset_ofsomeenumstructvariant-field_on_variant)
Tracking Issue #106655.
- Sort dependencies and features sections.
- Add `tidy` markers to the sorted sections so they stay sorted.
- Remove empty `[lib`] sections.
- Remove "See more keys..." comments.
Excluded files:
- rustc_codegen_{cranelift,gcc}, because they're external.
- rustc_lexer, because it has external use.
- stable_mir, because it has external use.
Implement `gen` blocks in the 2024 edition
Coroutines tracking issue https://github.com/rust-lang/rust/issues/43122
`gen` block tracking issue https://github.com/rust-lang/rust/issues/117078
This PR implements `gen` blocks that implement `Iterator`. Most of the logic with `async` blocks is shared, and thus I renamed various types that were referring to `async` specifically.
An example usage of `gen` blocks is
```rust
fn foo() -> impl Iterator<Item = i32> {
gen {
yield 42;
for i in 5..18 {
if i.is_even() { continue }
yield i * 2;
}
}
}
```
The limitations (to be resolved) of the implementation are listed in the tracking issue
Create a new ConstantKind variant (ZeroSized) for StableMIR
ZeroSized constants can be represented as `mir::Const::Val` even if their layout is not yet known. In those cases, CrateItem::body() was crashing when trying to convert a `ConstValue::ZeroSized` into its stable counterpart `ConstantKind::Allocated`.
Instead, we now map `ConstValue::ZeroSized` into a new variant: `ConstantKind::ZeroSized`.
**Note:** I didn't add any new test here since we already have covering tests in our project repository which I manually confirmed that will fix the issue.
ZeroSized constants can be represented as `mir::Const::Val` even if
their layout is not yet known. In those cases, CrateItem::body() was
crashing when trying to convert a `ConstValue::ZeroSized` into its
stable counterpart `ConstantKind::Allocated`.
Instead, we now map `ConstValue::ZeroSized` into a new variant:
`ConstantKind::ZeroSized`.
Add way to differentiate argument locals from other locals in Stable MIR
This PR resolvesrust-lang/project-stable-mir#47 which request a way to differentiate argument locals in a SMIR `Body` from other locals.
Specifically, this PR exposes the `arg_count` field from the MIR `Body`. However, I'm opening this as a draft PR because I think there are a few outstanding questions on how this information should be exposed and described. Namely:
- Is exposing `arg_count` the best way to surface this information to SMIR users? Would it be better to leave `arg_count` as a private field and add public methods (e.g. `fn arguments(&self) -> Iter<'_, LocalDecls>`) that may use the underlying `arg_count` info from the MIR body, but expose this information to users in a more convenient form? Or is it best to stick close to the current MIR convention?
- If the answer to the above point is to stick with the current MIR convention (`arg_count`), is it reasonable to also commit to sticking to the current MIR convention that the first local is always the return local, while the next `arg_count` locals are always the (in-order) argument locals?
- Should `Body` in SMIR only represent function bodies (as implied by the comment I added)? That seems to be the current case in MIR, but should this restriction always be the case for SMIR?
r? `@celinval`
r? `@oli-obk`
Rename AsyncCoroutineKind to CoroutineSource
pulled out of https://github.com/rust-lang/rust/pull/116447
Also refactors the printing infra of `CoroutineSource` to be ready for easily extending it with a `Gen` variant for `gen` blocks
This commit hides the arg_count field in Body and instead exposes more
stable and user-friendly methods to get the return and argument locals.
As a result, Body instances must now be constructed using the `new`
function.
This field allows SMIR consumers to identify which locals correspond to
argument locals. It simply exposes the arg_count field from the MIR
representation.
Remove fold code and add `Const::internal()` to StableMIR
We are not planning to support user generated constant in the foreseeable future, so we are cleaning up the fold logic and user created type for now. Users should use `Instance::resolve` in order to trigger monomorphization.
The Instance::resolve was however incomplete, since we weren't handling internalizing constants yet. Thus, I added that.
I decided to keep the `Const` fields private in case we decide to translate them lazily.
Uplift `ClauseKind` and `PredicateKind` into `rustc_type_ir`
Uplift `ClauseKind` and `PredicateKind` into `rustc_type_ir`.
Blocked on #116951
r? `@ghost`
We are not planning to support user generated constant in the
foreseeable future, so we are removing the Fold logic for now in
favor of the Instance::resolve logic.
The Instance::resolve was however incomplete, since we weren't handling
internalizing constants yet. Thus, I added that.
I decided to keep the Const fields private in case we decide to
translate them lazily.
Note: We do not expect to provide internalizing methods for all
StableMIR constructs. They exist only to help migrating efforts to allow
users to mix StableMIR and internal constructs.