Document the current MIR semantics that are clear from existing code
This PR adds documentation to places, operands, rvalues, statementkinds, and terminatorkinds that describes their existing semantics and requirements. In many places the semantics depend on the Rust memory model or other T-Lang decisions - when this is the case, it is just noted as such with links to UCG issues where possible. I'm hopeful that none of the documentation added here can be used to justify optimizations that depend on the memory model. The documentation for places and operands probably comes closest to running afoul of this - if people think that it cannot be merged as is, it can definitely also be taken out.
The goal here is to only document parts of MIR that seem to be decided already, or are at least depended on by existing code. That leaves quite a number of open questions - those are marked as "needs clarification." I'm not sure what to do with those in this PR - we obviously can't decide all these questions here. Should I just leave them in as is? Take them out? Keep them in but as `//` instead of `///` comments?
If this is too big to review at once, I can split this up.
r? rust-lang/mir-opt
Respect -Z verify-llvm-ir and other flags that add extra passes when combined with -C no-prepopulate-passes in the new LLVM Pass Manager.
As part of the switch to the new LLVM Pass Manager the behaviour of flags such as `-Z verify-llvm-ir` (e.g. sanitizer, instrumentation) was modified when combined with `-C no-prepopulate-passes`. With the old PM, rustc was the one manually constructing the pipeline and respected those flags but in the new pass manager, those flags are used to build a list of callbacks that get invoked at certain extension points in the pipeline. Unfortunately, `-C no-prepopulate-passes` would skip building the pipeline altogether meaning we'd never add the corresponding passes. The fix here is to just manually invoke those callbacks as needed.
Fixes#95874
Demonstrating the current vs fixed behaviour using the bug in #95864
```console
$ rustc +nightly asm-miscompile.rs --edition 2021 --emit=llvm-ir -C no-prepopulate-passes -Z verify-llvm-ir
$ echo $?
0
$ rustc +stage1 asm-miscompile.rs --edition 2021 --emit=llvm-ir -C no-prepopulate-passes -Z verify-llvm-ir
Basic Block in function '_ZN14asm_miscompile3foo28_$u7b$$u7b$closure$u7d$$u7d$17h360e2f7eee1275c5E' does not have terminator!
label %bb1
LLVM ERROR: Broken module found, compilation aborted!
```
Fix miscompilation of inline assembly with outputs in cases where we emit an invoke instead of call instruction.
We ran into this bug where rustc would segfault while trying to compile certain uses of inline assembly.
Here is a simple repro that demonstrates the issue:
```rust
#![feature(asm_unwind)]
fn main() {
let _x = String::from("string here just cause we need something with a non-trivial drop");
let foo: u64;
unsafe {
std::arch::asm!(
"mov {}, 1",
out(reg) foo,
options(may_unwind)
);
}
println!("{}", foo);
}
```
([playground link](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=7d6641e83370d2536a07234aca2498ff))
But crucially `feature(asm_unwind)` is not actually needed and this can be triggered on stable as a result of the way async functions/generators are handled in the compiler. e.g.:
```rust
extern crate futures; // 0.3.21
async fn bar() {
let foo: u64;
unsafe {
std::arch::asm!(
"mov {}, 1",
out(reg) foo,
);
}
println!("{}", foo);
}
fn main() {
futures::executor::block_on(bar());
}
```
([playground link](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=1c7781c34dd4a3e80ae4bd936a0c82fc))
An example of the incorrect LLVM generated:
```llvm
bb1: ; preds = %start
%1 = invoke i64 asm sideeffect alignstack inteldialect unwind "mov ${0:q}, 1", "=&r,~{dirflag},~{fpsr},~{flags},~{memory}"()
to label %bb2 unwind label %cleanup, !srcloc !9
store i64 %1, i64* %foo, align 8
bb2:
[...snip...]
```
The store should not be placed after the asm invoke but rather should be in the normal control flow basic block (`bb2` in this case).
[Here](https://gist.github.com/luqmana/be1af5b64d2cda5a533e3e23a7830b44) is a writeup of the investigation that lead to finding this.
[`let_chains`] Forbid `let` inside parentheses
Parenthesizes are mostly a no-op in let chains, in other words, they are mostly ignored.
```rust
let opt = Some(Some(1i32));
if (let Some(a) = opt && (let Some(b) = a)) && b == 1 {
println!("`b` is declared inside but used outside");
}
```
As seen above, such behavior can lead to confusion.
A proper fix or nested encapsulation would probably require research, time and a modified MIR graph so in this PR I simply denied any `let` inside parentheses. Non-let stuff are still allowed.
```rust
fn main() {
let fun = || true;
if let true = (true && fun()) && (true) {
println!("Allowed");
}
}
```
It is worth noting that `let ...` is not an expression and the RFC did not mention this specific situation.
cc `@matthewjasper`
Rollup of 7 pull requests
Successful merges:
- #95743 (Update binary_search example to instead redirect to partition_point)
- #95771 (Update linker-plugin-lto.md to 1.60)
- #95861 (Note that CI tests Windows 10)
- #95875 (bootstrap: show available paths help text for aliased subcommands)
- #95876 (Add a note for unsatisfied `~const Drop` bounds)
- #95907 (address fixme for diagnostic variable name)
- #95917 (thin_box test: import from std, not alloc)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Only suggest removing semicolon when expression is compatible with `impl Trait`
https://github.com/rust-lang/rust/issues/54771#issuecomment-476423690
> It still needs checking that the last statement's expr can actually conform to the trait, but the naïve behavior is there.
Only suggest removing a semicolon when the type behind the semicolon actually implements the trait in an RPIT `-> impl Trait`. Also upgrade the label that suggests removing the semicolon to a suggestion (should it be verbose?).
cc #54771
When a `macro_rules! foo { ... }` invocation is compiled the name used
is `foo`, not `macro_rules!`. This is different to all other macro
invocations, and confused me when I was inserted debugging println
statements for macro evaluation.
This commit changes it to `macro_rules` (or just `macro`), which is what
I expected. There are no externally visible changes.
Rollup of 7 pull requests
Successful merges:
- #95566 (Avoid duplication of doc comments in `std::char` constants and functions)
- #95784 (Suggest replacing `typeof(...)` with an actual type)
- #95807 (Suggest adding a local for vector to fix borrowck errors)
- #95849 (Check for git submodules in non-git source tree.)
- #95852 (Fix missing space in lossy provenance cast lint)
- #95857 (Allow multiple derefs to be splitted in deref_separator)
- #95868 (rustdoc: Reduce allocations in a `html::markdown` function)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Allow multiple derefs to be splitted in deref_separator
Previously in #95649 only a single deref within projection was supported and multiple derefs caused a bunch of issues, this PR fixes those issues.
```@oli-obk``` helped a ton again ❤️
Suggest replacing `typeof(...)` with an actual type
This PR adds suggestion to replace `typeof(...)` with an actual type of `...`, for example in case of `typeof(1)` we suggest replacing it with `i32`.
If the expression
1. Is not const (`{ let a = 1; let _: typeof(a); }`)
2. Can't be found (`let _: typeof(this_variable_does_not_exist)`)
3. Or has non-suggestable type (closure, generator, error, etc)
we don't suggest anything.
The 1 one is sad, but it's not clear how to support non-consts expressions for `typeof`.
_This PR is inspired by [this tweet]._
[this tweet]: https://twitter.com/compiler_errors/status/1511945354752638976
Make def names and HIR names consistent.
The name in the `DefKey` is interned to create the `DefId`, so it does not
require any query to access. This can be leveraged to avoid a few useless
HIR accesses for names.
~In order to achieve that, generic parameters created from universal
impl-trait are given the pretty-printed ast as a name, instead of
`{{opaque}}`.~
~Drive-by: the `TyCtxt::opt_item_name` used a dummy span for non-local
definitions. We have access to `def_ident_span`, so we use it.~
We may sometimes emit an `invoke` instead of a `call` for inline
assembly during the MIR -> LLVM IR lowering. But we failed to update
the IR builder's current basic block before writing the results to the
outputs. This would result in invalid IR because the basic block would
end in a `store` instruction, which isn't a valid terminator.
expand: Remove `ParseSess::missing_fragment_specifiers`
It was used for deduplicating some errors for legacy code which are mostly deduplicated even without that, but at cost of global mutable state, which is not a good tradeoff.
cc https://github.com/rust-lang/rust/pull/95747#issuecomment-1091619403
r? ``@nnethercote``
Left overs of #95761
These are just nits. Feel free to close this PR if all modifications are not worth merging.
* `#![feature(decl_macro)]` is not needed anymore in `rustc_expand`
* `tuple_impls` does not require `$Tuple:ident`. I guess it is there to enhance readability?
r? ```@petrochenkov```
refactor: simplify few string related interactions
Few small optimizations:
check_doc_keyword: don't alloc string for emptiness check
check_doc_alias_value: get argument as Symbol to prevent needless string convertions
check_doc_attrs: don't alloc vec, iterate over slice.
replace as_str() check with symbol check
get_single_str_from_tts: don't prealloc string
trivial string to str replace
LifetimeScopeForPath::NonElided use Vec<Symbol> instead of Vec<String>
AssertModuleSource use FxHashSet<Symbol> instead of BTreeSet<String>
CrateInfo.crate_name replace FxHashMap<CrateNum, String> with FxHashMap<CrateNum, Symbol>
It was used for deduplicating some errors for legacy code which are mostly deduplicated even without that, but at cost of global mutable state, which is not a good tradeoff.
interpret: err instead of ICE on size mismatches in to_bits_or_ptr_internal
We did this a while ago already for `to_i32()` and friends, but missed this one. That became quite annoying when I was debugging an ICE caused by `read_pointer` in a Miri shim where the code was passing an argument at the wrong type.
Having `scalar_to_ptr` be fallible is consistent with all the other `Scalar::to_*` methods being fallible. I added `unwrap` only in code outside the interpreter, which is no worse off than before now in terms of panics.
r? ````@oli-obk````
Remove explicit delimiter token trees from `Delimited`.
They were introduced by the final commit in #95159 and gave a
performance win. But since the introduction of `MatcherLoc` they are no
longer needed. This commit reverts that change, making the code a bit
simpler.
r? `@petrochenkov`
Strict provenance lints
See #95488.
This PR introduces two unstable (allow by default) lints to which lint on int2ptr and ptr2int casts, as the former is not possible in the strict provenance model and the latter can be written nicer using the `.addr()` API.
Based on an initial version of the lint by ```@Gankra``` in #95199.
Cached stable hash cleanups
r? `@nnethercote`
Add a sanity assertion in debug mode to check that the cached hashes are actually the ones we get if we compute the hash each time.
Add a new data structure that bundles all the hash-caching work to make it easier to re-use it for different interned data structures
They were introduced by the final commit in #95159 and gave a
performance win. But since the introduction of `MatcherLoc` they are no
longer needed. This commit reverts that change, making the code a bit
simpler.
Enforce well formedness for type alias impl trait's hidden type
fixes#84657
This was not an issue with return-position-impl-trait because the generic bounds of the function are the same as those of the opaque type, and the hidden type must already be well formed within the function.
With type-alias-impl-trait the hidden type could be defined in a function that has *more* lifetime bounds than the type alias. This is fine, but the hidden type must still be well formed without those additional bounds.
* split `fuzzy_provenance_casts` into a ptr2int and a int2ptr lint
* feature gate both lints
* update documentation to be more realistic short term
* add tests for these lints
check_doc_alias_value: get argument as Symbol to prevent needless string convertions
check_doc_attrs: don't alloc vec, iterate over slice. Vec introduced in #83149, but no perf run posted on merge
replace as_str() check with symbol check
get_single_str_from_tts: don't prealloc string
trivial string to str replace
LifetimeScopeForPath::NonElided use Vec<Symbol> instead of Vec<String>
AssertModuleSource use BTreeSet<Symbol> instead of BTreeSet<String>
CrateInfo.crate_name replace FxHashMap<CrateNum, String> with FxHashMap<CrateNum, Symbol>
This commit suggests replacing typeof(...) with an actual type of "...",
for example in case of `typeof(1)` we suggest replacing it with `i32`.
If the expression
- Is not const (`{ let a = 1; let _: typeof(a); }`)
- Can't be found (`let _: typeof(this_variable_does_not_exist)`)
- Or has non-suggestable type (closure, generator, error, etc)
we don't suggest anything.
Report opaque type mismatches directly during borrowck of the function instead of within the `type_of` query.
This allows us to only store a single hidden type per opaque type instead of having to store one per set of substitutions.
r? `@compiler-errors`
This does not affect diagnostics, because the diagnostic messages are exactly the same.
This allows profiling costly arguments to be recorded only when `-Zself-profile-events=args` is on: using a closure that takes an `EventArgRecorder` and call its `record_arg` or `record_args` methods.
Use gender neutral terms
#95508 was not executed well, but it did find a couple of legitimate issues: some uses of unnecessarily gendered language, and some typos. This PR fixes (properly) the legitimate issues it found.
Stop flagging unexpected inner attributes as outer ones in certain diagnostics
Fixes#94340.
In the issue to-be-fixed I write that the general message _an inner attribute is not permitted in this context_ should be more specific noting that the “context” is the `include` macro. This, however, cannot be achieved without touching a lot of things and passing a flag to the `parse_expr` and `parse_item` calls in `expand_include`. This seems rather hacky to me. That's why I left it as it. `Span::from_expansion` does not apply either AFAIK.
`@rustbot` label A-diagnostics T-compiler
Bump bootstrap compiler to 1.61.0 beta
This PR bumps the bootstrap compiler to the 1.61.0 beta. The first commit changes the stage0 compiler, the second commit applies the "mechanical" changes and the third and fourth commits apply changes explained in the relevant comments.
r? `@Mark-Simulacrum`
By heap allocating the argument within `NtPath`, `NtVis`, and `NtStmt`.
This slightly reduces cumulative and peak allocation amounts, most
notably on `deep-vector`.
Check that all hidden types are the same and then deduplicate them.
fixes#95538
This used to trigger a sanity check. Now we accept that there may be multiple places where a hidden type is constrained and we merge all of these at the end.
Ideally we'd merge eagerly, but that is a larger refactoring that I don't want to put into a backport
New mir-opt deref_separator
This adds a new mir-opt that split certain derefs into this form:
`let x = (*a.b).c;` to => `tmp = a.b; let x = (*tmp).c;`
Huge thanks to ``@oli-obk`` for his patient mentoring.
interp/validity: enforce Scalar::Initialized
This is a follow-up to https://github.com/rust-lang/rust/pull/94527, to also account for the new kind of `Scalar` layout inside the validity checker.
r? `@oli-obk`
enhance `ConstGoto` mir-opt by moving up `StorageDead` statements
From the `FIXME` in the implementation of `ConstGoto` miropt. We can move `StorageDead` statements up to the predecessor. This can expand the scope of application of this opt.
interp: pass TyCtxt to Machine methods that do not take InterpCx
This just seems like something you might need, so let's consistently have it.
One day we might have to add `ParamEnv` as well, though that seems less likely (and in Miri you can always use `reveal_all` anyway). It might make sense to have a type that packages `TyCtxt` and `ParamEnv`, this pairing occurs quite frequently in rustc...
r? `@oli-obk`
Currently it's called in `parse_tt` every time a match rule is invoked.
This commit moves it so it's called instead once per match rule, in
`compile_declarative_macro. This is a performance win.
The commit also moves `compute_locs` out of `TtParser`, because there's
no longer any reason for it to be in there.
Let CTFE to handle partially uninitialized unions without marking the entire value as uninitialized.
follow up to #94411
To fix https://github.com/rust-lang/rust/issues/69488 and by extension fix https://github.com/rust-lang/rust/issues/94371, we should stop treating types like `MaybeUninit<usize>` as something that the `Scalar` type in the interpreter engine can represent. So we add a new field to `abi::Primitive` that records whether the primitive is nested in a union
cc `@RalfJung`
r? `@ghost`
Use the proc-macro descr to track their individual expansions with
self-profiling events. This will help diagnose performance issues
with slow proc-macros.