Extend Miri to correctly pass mutable pointers through FFI
Based off of https://github.com/rust-lang/rust/pull/129684, this PR further extends Miri to execute native calls that make use of pointers to *mutable* memory.
We adapt Miri's bookkeeping of internal state upon any FFI call that gives external code permission to mutate memory.
Native code may now possibly write and therefore initialize and change the pointer provenance of bytes it has access to: Such memory is assumed to be *initialized* afterwards and bytes are given *arbitrary (wildcard) provenance*. This enables programs that correctly use mutating FFI calls to run Miri without errors, at the cost of possibly missing Undefined Behaviour caused by incorrect usage of mutating FFI.
> <details>
>
> <summary> Simple example </summary>
>
> ```rust
> extern "C" {
> fn init_int(ptr: *mut i32);
> }
>
> fn main() {
> let mut x = std::mem::MaybeUninit::<i32>::uninit();
> let x = unsafe {
> init_int(x.as_mut_ptr());
> x.assume_init()
> };
>
> println!("C initialized my memory to: {x}");
> }
> ```
> ```c
> void init_int(int *ptr) {
> *ptr = 42;
> }
> ```
> should now show `C initialized my memory to: 42`.
>
> </details>
r? ``@RalfJung``
rust_for_linux: -Zreg-struct-return commandline flag for X86 (#116973)
Command line flag `-Zreg-struct-return` for X86 (32-bit) for rust-for-linux.
This flag enables the same behavior as the `abi_return_struct_as_int` target spec key.
- Tracking issue: https://github.com/rust-lang/rust/issues/116973
If a type has unsafe fields, its safety invariants are not simply
the conjunction of its field types' safety invariants. Consequently,
it's invalid to reason about the safety properties of these types
in a purely structural manner — i.e., the manner in which `auto`
traits are implemented.
Makes progress towards #132922.
implement checks for tail calls
Quoting the [RFC draft](https://github.com/phi-go/rfcs/blob/guaranteed-tco/text/0000-explicit-tail-calls.md):
> The argument to become is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intent is to ensure a matching ABI. Note that lifetimes may differ as long as they pass borrow checking, see [below](https://github.com/phi-go/rfcs/blob/guaranteed-tco/text/0000-explicit-tail-calls.md#return-type-coercion) for specifics on the return type.
> Tail calling closures and tail calling from closures is not allowed. This is due to the high implementation effort, see below, this restriction can be lifted by a future RFC.
> Invocations of operators were considered as valid targets but were rejected on grounds of being too error-prone. In any case, these can still be called as methods.
> Tail calling [variadic functions](https://doc.rust-lang.org/beta/unstable-book/language-features/c-variadic.html) and tail calling from variadic functions is not allowed. As support for variadic function is stabilized on a per target level, support for tail-calls regarding variadic functions would need to follow a similar approach. To avoid this complexity and to minimize implementation effort for backends, this interaction is currently not allowed but support can be added with a future RFC.
-----
The checks are implemented as a query, similarly to `check_unsafety`.
The code is cherry-picked straight out of #112657 which was written more than a year ago, so I expect we might need to change some things ^^"
improve TagEncoding::Niche docs, sanity check, and UB checks
Turns out the `niche_variants` range can actually contain the `untagged_variant`. We should report this as UB in Miri, so this PR implements that.
Also rename `partially_check_layout` to `layout_sanity_check` for better consistency with how similar functions are called in other parts of the compiler.
Turns out my adjustments to the transmutation logic also fix https://github.com/rust-lang/rust/issues/126267.
Get rid of HIR const checker
As far as I can tell, the HIR const checker was implemented in https://github.com/rust-lang/rust/pull/66170 because we were not able to issue useful const error messages in the MIR const checker.
This seems to have changed in the last 5 years, probably due to work like #90532. I've tweaked the diagnostics slightly and think the error messages have gotten *better* in fact.
Thus I think the HIR const checker has reached the end of its usefulness, and we can retire it.
cc `@RalfJung`
fix ICE when promoted has layout size overflow
Turns out there is no reason to distinguish `tainted_by_errors` and `can_be_spurious` here, we can just track whether we allow this even in "infallible" constants.
Fixes https://github.com/rust-lang/rust/issues/125476
Move `Const::{from_anon_const,try_from_lit}` to hir_ty_lowering
Fixes#128176.
This accomplishes one of the followup items from #131081.
These operations are much more about lowering the HIR than about
`Const`s themselves. They fit better in hir_ty_lowering with
`lower_const_arg` (formerly `Const::from_const_arg`) and the rest.
To accomplish this, `const_evaluatable_predicates_of` had to be changed
to not use `from_anon_const` anymore. Instead of visiting the HIR and
lowering anon consts on the fly, it now visits the `rustc_middle::ty`
data structures instead and directly looks for `UnevaluatedConst`s. This
approach was proposed in:
https://github.com/rust-lang/rust/pull/131081#discussion_r1821189257
r? `@BoxyUwU`
These operations are much more about lowering the HIR than about
`Const`s themselves. They fit better in hir_ty_lowering with
`lower_const_arg` (formerly `Const::from_const_arg`) and the rest.
To accomplish this, `const_evaluatable_predicates_of` had to be changed
to not use `from_anon_const` anymore. Instead of visiting the HIR and
lowering anon consts on the fly, it now visits the `rustc_middle::ty`
data structures instead and directly looks for `UnevaluatedConst`s. This
approach was proposed in:
https://github.com/rust-lang/rust/pull/131081#discussion_r1821189257
remove `Ty::is_copy_modulo_regions`
Using these functions is likely incorrect if an `InferCtxt` is available, I moved this function to `TyCtxt` (and added it to `LateContext`) and added a note to the documentation that one should prefer `Infer::type_is_copy_modulo_regions` instead.
I didn't yet move `is_sized` and `is_freeze`, though I think we should move these as well.
r? `@compiler-errors` cc #132279
Remove `hir::ArrayLen`
This refactoring removes `hir::ArrayLen`, replacing it with `hir::ConstArg`. To represent inferred array lengths (previously `hir::ArrayLen::Infer`), a new variant `ConstArgKind::Infer` is added.
r? `@BoxyUwU`
coverage: Use a query to identify which counter/expression IDs are used
Given that we already have a query to identify the highest-numbered counter ID in a MIR body, we can extend that query to also build bitsets of used counter/expression IDs. That lets us avoid some messy coverage bookkeeping during the main MIR traversal for codegen.
This does mean that we fail to treat some IDs as used in certain MIR-inlining scenarios, but I think that's fine, because it means that the results will be consistent across all instantiations of a function.
---
There's some more cleanup I want to do in the function coverage collector, since it isn't really collecting anything any more, but I'll leave that for future work.
support revealing defined opaque post borrowck
By adding a new `TypingMode::PostBorrowckAnalysis`. Currently only supported with the new solver and I didn't look into the way we replace `ReErased`. ``@compiler-errors`` mentioned that always using existentials may be unsound.
r? ``@compiler-errors``
this implements checks necessary to guarantee that we can actually
perform a tail call. while extremely restrictive, this is what is
documented in the RFC, and all these checks are needed for one reason or
another.
Enable -Zshare-generics for inline(never) functions
This avoids inlining cross-crate generic items when possible that are
already marked inline(never), implying that the author is not intending
for the function to be inlined by callers. As such, having a local copy
may make it easier for LLVM to optimize but mostly just adds to binary
bloat and codegen time. In practice our benchmarks indicate this is
indeed a win for larger compilations, where the extra cost in dynamic
linking to these symbols is diminished compared to the advantages in
fewer copies that need optimizing in each binary.
It might also make sense it expand this with other heuristics (e.g.,
`#[cold]`) in the future, but this seems like a good starting point.
FWIW, I expect that doing cleanup in where we make the decision
what should/shouldn't be shared is also a good idea. Way too
much code needed to be tweaked to check this. But I'm hoping
to leave that for a follow-up PR rather than blocking this on it.
This reduces code sizes and better respects programmer intent when
marking inline(never). Previously such a marking was essentially ignored
for generic functions, as we'd still inline them in remote crates.
coverage: Store coverage source regions as `Span` until codegen
Historically, coverage spans were converted into line/column coordinates during the MIR instrumentation pass.
This PR moves that conversion step into codegen, so that coverage spans spend most of their time stored as `Span` instead.
In addition to being conceptually nicer, this also reduces the size of coverage mappings in MIR, because `Span` is smaller than 4x u32.
---
There should be no changes to coverage output.
Some more refactorings towards removing driver queries
Follow up to https://github.com/rust-lang/rust/pull/127184
## Custom driver breaking change
The `after_analysis` callback is changed to accept `TyCtxt` instead of `Queries`. The only safe query in `Queries` to call at this point is `global_ctxt()` which allows you to enter the `TyCtxt` either way. To fix your custom driver, replace the `queries: &'tcx Queries<'tcx>` argument with `tcx: TyCtxt<'tcx>` and remove your `queries.global_ctxt().unwrap().enter(|tcx| { ... })` call and only keep the contents of the closure.
## Custom driver deprecation
The `after_crate_root_parsing` callback is now deprecated. Several custom drivers are incorrectly calling `queries.global_ctxt()` from inside of it, which causes some driver code to be skipped. As such I would like to either remove it in the future or if custom drivers still need it, change it to accept an `&rustc_ast::Crate` instead.
Remove -Zfuel.
I'm not sure this feature is used. I only found 2 references in a google search, both referring to its introduction.
Meanwhile, it's a global mutable state, untracked by incremental compilation, so incompatible with it.
Simplify array length mismatch error reporting (to not try to turn consts into target usizes)
This changes `TypeError::FixedArrayLen` to use `ExpectedFound<ty::Const<'tcx>>` (instead of `ExpectedFound<u64>`), and renames it to `TypeError::ArrayLen`. This allows us to avoid a `try_to_target_usize` call in the type relation, which ICEs when we have a scalar of the wrong bit length (i.e. u8).
This also makes `structurally_relate_tys` to always use this type error kind any time we have a const mismatch resulting from relating the array-len part of `[T; N]`.
This has the effect of changing the error message we issue for array length mismatches involving non-valtree consts. I actually quite like the change, though, since before:
```
LL | fn test<const N: usize, const M: usize>() -> [u8; M] {
| ------- expected `[u8; M]` because of return type
LL | [0; N]
| ^^^^^^ expected `M`, found `N`
|
= note: expected array `[u8; M]`
found array `[u8; N]`
```
and after, which I think is far less verbose:
```
LL | fn test<const N: usize, const M: usize>() -> [u8; M] {
| ------- expected `[u8; M]` because of return type
LL | [0; N]
| ^^^^^^ expected an array with a size of M, found one with a size of N
```
The only questions I have are:
1. Should we do something about backticks here? Right now we don't backtick either fully evaluated consts like `2`, or rigid consts like `Foo::BAR`.... but maybe we should? It seems kinda verbose to do for numbers -- maybe we could intercept those specifically.
2. I guess we may still run the risk of leaking unevaluated consts into error reporting like `2 + 1`...?
r? ``@BoxyUwU``
Fixes#126359Fixes#131101
No need to re-sort existential preds in relate impl
We already assert that these predicates are in the right ordering in `mk_poly_existential_predicates`.
r? types
finish `Reveal` removal
After #133212 changed the `TypingMode` to be the only source of truth, this entirely rips out `Reveal`.
cc #132279
r? `@compiler-errors`
Rollup of 8 pull requests
Successful merges:
- #132090 (Stop being so bail-y in candidate assembly)
- #132658 (Detect const in pattern with typo)
- #132911 (Pretty print async fn sugar in opaques and trait bounds)
- #133102 (aarch64 softfloat target: always pass floats in int registers)
- #133159 (Don't allow `-Zunstable-options` to take a value )
- #133208 (generate-copyright: Now generates a library file too.)
- #133215 (Fix missing submodule in `./x vendor`)
- #133264 (implement OsString::truncate)
r? `@ghost`
`@rustbot` modify labels: rollup
Rollup of 6 pull requests
Successful merges:
- #129838 (uefi: process: Add args support)
- #130800 (Mark `get_mut` and `set_position` in `std::io::Cursor` as const.)
- #132708 (Point at `const` definition when used instead of a binding in a `let` statement)
- #133226 (Make `PointerLike` opt-in instead of built-in)
- #133244 (Account for `wasm32v1-none` when exporting TLS symbols)
- #133257 (Add `UnordMap::clear` method)
r? `@ghost`
`@rustbot` modify labels: rollup
Make `PointerLike` opt-in instead of built-in
The `PointerLike` trait currently is a built-in trait that computes the layout of the type. This is a bit problematic, because types implement this trait automatically. Since this can be broken due to semver-compatible changes to a type's layout, this is undesirable. Also, calling `layout_of` in the trait system also causes cycles.
This PR makes the trait implemented via regular impls, and adds additional validation on top to make sure that those impls are valid. This could eventually be `derive()`d for custom smart pointers, and we can trust *that* as a semver promise rather than risking library authors accidentally breaking it.
On the other hand, we may never expose `PointerLike`, but at least now the implementation doesn't invoke `layout_of` which could cause ICEs or cause cycles.
Right now for a `PointerLike` impl to be valid, it must be an ADT that is `repr(transparent)` and the non-1zst field needs to implement `PointerLike`. There are also some primitive impls for `&T`/ `&mut T`/`*const T`/`*mut T`/`Box<T>`.
Point at `const` definition when used instead of a binding in a `let` statement
Modify `PatKind::InlineConstant` to be `ExpandedConstant` standing in not only for inline `const` blocks but also for `const` items. This allows us to track named `const`s used in patterns when the pattern is a single binding. When we detect that there is a refutable pattern involving a `const` that could have been a binding instead, we point at the `const` item, and suggest renaming. We do this for both `let` bindings and `match` expressions missing a catch-all arm if there's at least one single binding pattern referenced.
After:
```
error[E0005]: refutable pattern in local binding
--> $DIR/bad-pattern.rs:19:13
|
LL | const PAT: u32 = 0;
| -------------- missing patterns are not covered because `PAT` is interpreted as a constant pattern, not a new variable
...
LL | let PAT = v1;
| ^^^ pattern `1_u32..=u32::MAX` not covered
|
= note: `let` bindings require an "irrefutable pattern", like a `struct` or an `enum` with only one variant
= note: for more information, visit https://doc.rust-lang.org/book/ch18-02-refutability.html
= note: the matched value is of type `u32`
help: introduce a variable instead
|
LL | let PAT_var = v1;
| ~~~~~~~
```
Before:
```
error[E0005]: refutable pattern in local binding
--> $DIR/bad-pattern.rs:19:13
|
LL | let PAT = v1;
| ^^^
| |
| pattern `1_u32..=u32::MAX` not covered
| missing patterns are not covered because `PAT` is interpreted as a constant pattern, not a new variable
| help: introduce a variable instead: `PAT_var`
|
= note: `let` bindings require an "irrefutable pattern", like a `struct` or an `enum` with only one variant
= note: for more information, visit https://doc.rust-lang.org/book/ch18-02-refutability.html
= note: the matched value is of type `u32`
```
CC #132582.
Reduce false positives of tail-expr-drop-order from consumed values (attempt #2)
r? `@nikomatsakis`
Tracked by #123739.
Related to #129864 but not replacing, yet.
Related to #130836.
This is an implementation of the approach suggested in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/temporary.20drop.20order.20changes). A new MIR statement `BackwardsIncompatibleDrop` is added to the MIR syntax. The lint now works by inspecting possibly live move paths before at the `BackwardsIncompatibleDrop` location and the actual drop under the current edition, which should be one before Edition 2024 in practice.
take 2
open up coroutines
tweak the wordings
the lint works up until 2021
We were missing one case, for ADTs, which was
causing `Result` to yield incorrect results.
only include field spans with significant types
deduplicate and eliminate field spans
switch to emit spans to impl Drops
Co-authored-by: Niko Matsakis <nikomat@amazon.com>
collect drops instead of taking liveness diff
apply some suggestions and add explantory notes
small fix on the cache
let the query recurse through coroutine
new suggestion format with extracted variable name
fine-tune the drop span and messages
bugfix on runtime borrows
tweak message wording
filter out ecosystem types earlier
apply suggestions
clippy
check lint level at session level
further restrict applicability of the lint
translate bid into nop for stable mir
detect cycle in type structure
lints_that_dont_need_to_run: never skip future-compat-reported lints
Follow-up to https://github.com/rust-lang/rust/pull/125116: future-compat lints show up with `--json=future-incompat` even if they are otherwise allowed in the crate. So let's ensure we do not skip those as part of the `lints_that_dont_need_to_run` logic.
I could not find a current future compat lint that is emitted by a lint pass, so there's no clear way to add a test for this.
Cc `@blyxyas` `@cjgillot`
Rollup of 4 pull requests
Successful merges:
- #131081 (Use `ConstArgKind::Path` for all single-segment paths, not just params under `min_generic_const_args`)
- #132577 (Report the `unexpected_cfgs` lint in external macros)
- #133023 (Merge `-Zhir-stats` into `-Zinput-stats`)
- #133200 (ignore an occasionally-failing test in Miri)
r? `@ghost`
`@rustbot` modify labels: rollup
Use `ConstArgKind::Path` for all single-segment paths, not just params under `min_generic_const_args`
r? `@BoxyUwU`
edit by `@BoxyUwU:`
This PR introduces a `min_generic_const_args` feature gate and implements some preliminary work for it, representing all const arguments that are single segment paths as `ConstArg::Path` instead of only those that resolve to a const generic parameter. There are a few bits of follow up work after this lands:
- Figure out how to represent `Foo<{ STATIC }>`
- Figure out how to evaluate `Foo<{ EnumVariantConstructor }>`
- Make param env normalization handle non-anon-consts
- Move `try_from_lit` and `from_anon_const` to hir ty lowering too
Improve VecCache under parallel frontend
This replaces the single Vec allocation with a series of progressively larger buckets. With the cfg for parallel enabled but with -Zthreads=1, this looks like a slight regression in i-count and cycle counts (~1%).
With the parallel frontend at -Zthreads=4, this is an improvement (-5% wall-time from 5.788 to 5.4688 on libcore) than our current Lock-based approach, likely due to reducing the bouncing of the cache line holding the lock. At -Zthreads=32 it's a huge improvement (-46%: 8.829 -> 4.7319 seconds).
try-job: i686-gnu-nopt
try-job: dist-x86_64-linux
Use `TypingMode` throughout the compiler instead of `ParamEnv`
Hopefully the biggest single PR as part of https://github.com/rust-lang/types-team/issues/128.
## `infcx.typing_env` while defining opaque types
I don't know how'll be able to correctly handle opaque types when using something taking a `TypingEnv` while defining opaque types. To correctly handle the opaques we need to be able to pass in the current `opaque_type_storage` and return constraints, i.e. we need to use a proper canonical query. We should migrate all the queries used during HIR typeck and borrowck where this matters to proper canonical queries. This is
## `layout_of` and `Reveal::All`
We convert the `ParamEnv` to `Reveal::All` right at the start of the `layout_of` query, so I've changed callers of `layout_of` to already use a post analysis `TypingEnv` when encountering it.
ca87b535a0/compiler/rustc_ty_utils/src/layout.rs (L51)
## `Ty::is_[unpin|sized|whatever]`
I haven't migrated `fn is_item_raw` to use `TypingEnv`, will do so in a followup PR, this should significantly reduce the amount of `typing_env.param_env`. At some point there will probably be zero such uses as using the type system while ignoring the `typing_mode` is incorrect.
## `MirPhase` and phase-transitions
When inside of a MIR-body, we can mostly use its `MirPhase` to figure out the right `typing_mode`. This does not work during phase transitions, most notably when transitioning from `Analysis` to `Runtime`:
dae7ac133b/compiler/rustc_mir_transform/src/lib.rs (L606-L625)
All these passes still run with `MirPhase::Analysis`, but we should only use `Reveal::All` once we're run the `RevealAll` pass. This required me to manually construct the right `TypingEnv` in all these passes. Given that it feels somewhat easy to accidentally miss this going forward, I would maybe like to change `Body::phase` to an `Option` and replace it at the start of phase transitions. This then makes it clear that the MIR is currently in a weird state.
r? `@ghost`
stability: remove skip_stability_check_due_to_privacy
This was added in https://github.com/rust-lang/rust/pull/38689 to deal with https://github.com/rust-lang/rust/issues/38412. However, even after removing the check, the relevant tests still pass. Let's see if CI finds any other tests that rely on this. If not, it seems like logic elsewhere in the compiler changed so this is not required any more.
the behavior of the type system not only depends on the current
assumptions, but also the currentnphase of the compiler. This is
mostly necessary as we need to decide whether and how to reveal
opaque types. We track this via the `TypingMode`.
After:
```
error[E0005]: refutable pattern in local binding
--> $DIR/bad-pattern.rs:19:13
|
LL | const PAT: u32 = 0;
| -------------- missing patterns are not covered because `PAT` is interpreted as a constant pattern, not a new variable
...
LL | let PAT = v1;
| ^^^
| |
| pattern `1_u32..=u32::MAX` not covered
| help: introduce a variable instead: `PAT_var`
|
= note: `let` bindings require an "irrefutable pattern", like a `struct` or an `enum` with only one variant
= note: for more information, visit https://doc.rust-lang.org/book/ch18-02-refutability.html
= note: the matched value is of type `u32`
```
Before:
```
error[E0005]: refutable pattern in local binding
--> $DIR/bad-pattern.rs:19:13
|
LL | let PAT = v1;
| ^^^
| |
| pattern `1_u32..=u32::MAX` not covered
| missing patterns are not covered because `PAT` is interpreted as a constant pattern, not a new variable
| help: introduce a variable instead: `PAT_var`
|
= note: `let` bindings require an "irrefutable pattern", like a `struct` or an `enum` with only one variant
= note: for more information, visit https://doc.rust-lang.org/book/ch18-02-refutability.html
= note: the matched value is of type `u32`
```
Querify MonoItem collection
Factored out of https://github.com/rust-lang/rust/pull/131650. These changes are required for post-mono MIR opts, because the previous implementation would load the MIR for every Instance that we traverse (as well as invoke queries on it). The cost of that would grow massively with post-mono MIR opts because we'll need to load new MIR for every Instance, instead of re-using the `optimized_mir` for every Instance with the same DefId.
So the approach here is to add two new queries, `items_of_instance` and `size_estimate`, which contain the specific information about an Instance's MIR that MirUsedCollector and CGU partitioning need, respectively. Caching these significantly increases the size of the query cache, but that's justified by our improved incrementality (I'm sure walking all the MIR for a huge crate scales quite poorly).
This also changes `MonoItems` into a type that will retain the traversal order (otherwise we perturb a bunch of diagnostics), and will also eliminate duplicate findings. Eliminating duplicates removes about a quarter of the query cache size growth.
The perf improvements in this PR are inflated because rustc-perf uses `-Zincremental-verify-ich`, which makes loading MIR a lot slower because MIR contains a lot of Spans and computing the stable hash of a Span is slow. And the primary goal of this PR is to load less MIR. Some squinting at `collector profile_local perf-record +stage1` runs suggests the magnitude of the improvements in this PR would be decreased by between a third and a half if that flag weren't being used. Though this effect may apply to the regressions too since most are incr-full and this change also causes such builds to encode more Spans.
This replaces the single Vec allocation with a series of progressively
larger buckets. With the cfg for parallel enabled but with -Zthreads=1,
this looks like a slight regression in i-count and cycle counts (<0.1%).
With the parallel frontend at -Zthreads=4, this is an improvement (-5%
wall-time from 5.788 to 5.4688 on libcore) than our current Lock-based
approach, likely due to reducing the bouncing of the cache line holding
the lock. At -Zthreads=32 it's a huge improvement (-46%: 8.829 -> 4.7319
seconds).
Mention both release *and* edition breakage for never type lints
This PR makes ~~two changes~~ a change to the never type lints (`dependency_on_unit_never_type_fallback` and `never_type_fallback_flowing_into_unsafe`):
1. Change the wording of the note to mention that the breaking change will be made in an edition _and_ in a future release
2. ~~Make these warnings be reported in deps (hopefully the lints are matured enough)~~
r? ``@compiler-errors``
cc ``@ehuss``
closes#132930