coverage: Emit mappings for unused functions without generating stubs
For a while I've been annoyed by the fact that generating coverage maps for unused functions involves generating a stub function at the LLVM level.
As I suspected, generating that stub function isn't actually necessary, as long as we specifically tell LLVM about the symbol names of all the functions that have coverage mappings but weren't codegenned (due to being unused).
---
There is some helper code that gets moved around in the follow-up patches, so look at the first patch to see the most important functional changes.
---
`@rustbot` label +A-code-coverage
fix spans for removing `.await` on `for` expressions
We need to use a span with the outer syntax context of a desugared `for` expression to join it with the `.await` span.
fixes https://github.com/rust-lang/rust/issues/117014
Lint `non_exhaustive_omitted_patterns` by columns
This is a rework of the `non_exhaustive_omitted_patterns` lint to make it more consistent. The intent of the lint is to help consumers of `non_exhaustive` enums ensure they stay up-to-date with all upstream variants. This rewrite fixes two cases we didn't handle well before:
First, because of details of exhaustiveness checking, the following wouldn't lint `Enum::C` as missing:
```rust
match Some(x) {
Some(Enum::A) => {}
Some(Enum::B) => {}
_ => {}
}
```
Second, because of the fundamental workings of exhaustiveness checking, the following would treat the `true` and `false` cases separately and thus lint about missing variants:
```rust
match (true, x) {
(true, Enum::A) => {}
(true, Enum::B) => {}
(false, Enum::C) => {}
_ => {}
}
```
Moreover, it would correctly not lint in the case where the pair is flipped, because of asymmetry in how exhaustiveness checking proceeds.
A drawback is that it no longer makes sense to set the lint level per-arm. This will silently break the lint for current users of it (but it's behind a feature gate so that's ok).
The new approach is now independent of the exhaustiveness algorithm; it's a separate pass that looks at patterns column by column. This is another of the motivations for this: I'm glad to move it out of the algorithm, it was akward there.
This PR is almost identical to https://github.com/rust-lang/rust/pull/111651. cc `@eholk` who reviewed it at the time. Compared to then, I'm more confident this is the right approach.
This simplifies the code by removing all the `self` assignments and
makes the flow of data clearer - always into the printer.
Especially in v0 mangling, which already used `&mut self` in some
places, it gets a lot more uniform.
Point at assoc fn definition on type param divergence
When the number of type parameters in the associated function of an impl and its trait differ, we now *always* point at the trait one, even if it comes from a foreign crate. When it is local, we point at the specific params, when it is foreign, we point at the whole associated item.
Fix#69944.
Mention `into_iter` on borrow errors suggestions when appropriate
If we encounter a borrow error on `vec![1, 2, 3].iter()`, suggest `into_iter`.
Fix#68445.
coverage: Fix inconsistent handling of function signature spans
While doing some more cleanup of `spans`, I noticed a strange inconsistency in how function signatures are handled. Normally the function signature span is treated as though it were executable as part of the start of the function, but in some cases the signature span disappears entirely from coverage, for no obvious reason.
This is caused by the fact that spans created by `CoverageSpan::for_fn_sig` don't add the span to their `merged_spans` field (unlike normal statement/terminator spans). In cases where the span-processing code looks at those merged spans, it thinks the signature span is no longer visible and deletes it.
Adding the signature span to `merged_spans` resolves the inconsistency.
(Prior to #116409 this wouldn't have been possible, because there was no case in the old `CoverageStatement` enum representing a signature. Now that `merged_spans` is just a list of spans, that's no longer an obstacle.)
Add stable Instance::body() and RustcInternal trait
The `Instance::body()` returns a monomorphized body.
For that, we had to implement visitor that monomorphize types and constants. We are also introducing the RustcInternal trait that will allow us to convert back from Stable to Internal.
Note that this trait is not yet visible for our users as it depends on Tables. We should probably add a new trait that can be exposed.
The tests here are very simple, and I'm planning on creating more exhaustive tests in the project-mir repo. But I was hoping to get some feedback here first.
r? ```@oli-obk```
Typo suggestion to change bindings with leading underscore
When encountering a binding that isn't found but has a typo suggestion for a binding with a leading underscore, suggest changing the binding definition instead of the use place.
Fix#60164.
coverage: Simplify the injection of coverage statements
This is a follow-up to #116046 that I left out of that PR because I didn't want to make it any larger.
After the various changes we've made to how coverage data is stored and transferred, the old code structure for injecting coverage statements into MIR is built around a lot of constraints that don't exist any more. We can simplify it by replacing it with a handful of loops over the BCB node/edge counters and the BCB spans.
---
`@rustbot` label +A-code-coverage
This query has a name that sounds general-purpose, but in fact it has
coverage-specific semantics, and (fortunately) is only used by coverage code.
Because it is only ever called once (from one designated CGU), it doesn't need
to be a query, and we can change it to a regular function instead.
When the number of type parameters in the associated function of an impl
and its trait differ, we now *always* point at the trait one, even if it
comes from a foreign crate. When it is local, we point at the specific
params, when it is foreign, we point at the whole associated item.
Fix#69944.
When encountering a binding that isn't found but has a typo suggestion
for a binding with a leading underscore, suggest changing the binding
definition instead of the use place.
Fix#60164.
Move where doc comment meant as comment check
The new place makes more sense and covers more cases beyond individual statements.
```
error: expected one of `.`, `;`, `?`, `else`, or an operator, found doc comment `//!foo
--> $DIR/doc-comment-in-stmt.rs:25:22
|
LL | let y = x.max(1) //!foo
| ^^^^^^ expected one of `.`, `;`, `?`, `else`, or an operator
|
help: add a space before `!` to write a regular comment
|
LL | let y = x.max(1) // !foo
| +
```
Fix#65329.
Uplift movability and mutability, the simple way
Just make type_ir a dependency of ast. This can be relaxed later if we want to make the dependency less heavy. Part of rust-lang/types-team#124.
r? `@lcnr` or `@jackh726`
The new place makes more sense and covers more cases beyond individual
statements.
```
error: expected one of `.`, `;`, `?`, `else`, or an operator, found doc comment `//!foo
--> $DIR/doc-comment-in-stmt.rs:25:22
|
LL | let y = x.max(1) //!foo
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected one of `.`, `;`, `?`, `else`, or an operator
|
help: add a space before `!` to write a regular comment
|
LL | let y = x.max(1) // !foo
| +
```
Fix#65329.
Fix duplicate labels emitted in `render_multispan_macro_backtrace()`
This PR replaces the `Vec` used to store labels with an `FxIndexSet` in order to eliminate duplicates
Fixes#116836
The `Instance::body()` returns a monomorphized body.
For that, we had to implement visitor that monomorphize types and
constants. We are also introducing the RustcInternal trait that will
allow us to convert back from Stable to Internal.
Note that this trait is not yet visible for our users as it depends on
Tables. We should probably add a new trait that can be exposed.
Implement rustc part of RFC 3127 trim-paths
This PR implements (or at least tries to) [RFC 3127 trim-paths](https://github.com/rust-lang/rust/issues/111540), the rustc part. That is `-Zremap-path-scope` with all of it's components/scopes.
`@rustbot` label: +F-trim-paths
Use v0.0.0 in compiler crates
I may be totally off base here, but my understanding is that it's conventional to use v0.0.0 to reflect the unversioned nature of the compiler crates. Fix that for some of the compiler crates that were created recently.
Don't ICE when encountering unresolved regions in `fully_resolve`
We can encounter unresolved regions due to unconstrained impl lifetime arguments because `collect_return_position_impl_trait_in_trait_tys` runs before WF actually checks that the impl is well-formed.
Fixes#116525
Bump `COINDUCTIVE_OVERLAP_IN_COHERENCE` to deny + warn in deps
1.73 is the first place this shows up in stable (recall that there was only 1 regression), so let's bump this to deny on nightly.
r? lcnr
coverage: Move most per-function coverage info into `mir::Body`
Currently, all of the coverage information collected by the `InstrumentCoverage` pass is smuggled through MIR in the form of individual `StatementKind::Coverage` statements, which must then be reassembled by coverage codegen.
That's awkward for a number of reasons:
- While some of the coverage statements do care about their specific position in the MIR control-flow graph, many of them don't, and are just tacked onto the function's first BB as metadata carriers.
- MIR inlining can result in coverage statements being duplicated, so coverage codegen has to jump through hoops to avoid emitting duplicate mappings.
- MIR optimizations that would delete coverage statements need to carefully copy them into the function's first BB so as not to omit them from coverage reports.
- The order in which coverage codegen sees coverage statements is dependent on MIR optimizations/inlining, which can cause unnecessary churn in the emitted coverage mappings.
- We don't have a good way to annotate MIR-level functions with extra coverage info that doesn't belong in a statement.
---
This PR therefore takes most of the per-function coverage info and stores it in a field in `mir::Body` as `Option<Box<FunctionCoverageInfo>>`.
(This adds one pointer to the size of `mir::Body`, even when coverage is not enabled.)
Coverage statements still need to be injected into MIR in some cases, but only when they actually affect codegen (counters) or are needed to detect code that has been optimized away as unreachable (counters/expressions).
---
By the end of this PR, the information stored in `FunctionCoverageInfo` is:
- A hash of the function's source code (needed by LLVM's coverage map format)
- The number of coverage counters added by coverage instrumentation
- A table of coverage expressions, associating each expression ID with its operator (add or subtract) and its two operands
- The list of mappings, associating each covered code region with a counter/expression/zero value
---
~~This is built on top of #115301, so I'll rebase and roll a reviewer once that lands.~~
r? `@ghost`
`@rustbot` label +A-code-coverage
This new description reflects the changes made in this PR, and should hopefully
be more useful to non-coverage developers who need to care about coverage
statements.
Even though expression details are now stored in the info structure, we still
need to inject `ExpressionUsed` statements into MIR, because if one is missing
during codegen then we know that it was optimized out and we can remap all of
its associated code regions to zero.
Previously, mappings were attached to individual coverage statements in MIR.
That necessitated special handling in MIR optimizations to avoid deleting those
statements, since otherwise codegen would be unable to reassemble the original
list of mappings.
With this change, a function's list of mappings is now attached to its MIR
body, and survives intact even if individual statements are deleted by
optimizations.
Instead of modifying the accumulated expressions in-place, we now build a set
of expressions that are known to be zero, and then consult that set on the fly
when converting the expression data for FFI.
This will be necessary when moving mappings and expression data into function
coverage info, which can't be mutated during codegen.
Don't compare host param by name
Seems sketchy to be searching for `sym::host` by name, especially when we can get the actual index with not very much work.
r? fee1-dead
Coverage codegen can now allocate arrays based on the number of
counters/expressions originally used by the instrumentor.
The existing query that inspects coverage statements is still used for
determining the number of counters passed to `llvm.instrprof.increment`. If
some high-numbered counters were removed by MIR optimizations, the instrumented
binary can potentially use less memory and disk space at runtime.
This allows coverage information to be attached to the function as a whole when
appropriate, instead of being smuggled through coverage statements in the
function's basic blocks.
As an example, this patch moves the `function_source_hash` value out of
individual `CoverageKind::Counter` statements and into the per-function info.
When synthesizing unused functions for coverage purposes, the absence of this
info is taken to indicate that a function was not eligible for coverage and
should not be synthesized.
Remove lots of generics from `ty::print`
All of these generics mostly resolve to the same thing, which means we can remove them, greatly simplifying the types involved in pretty printing and unlocking another simplification (that is not performed in this PR): Using `&mut self` instead of passing `self` through the return type.
cc `@eddyb` you probably know why it's like this, just checking in and making sure I didn't do anything bad
r? oli-obk
Use `YYYY-MM-DDTHH_MM_SS` as datetime format for ICE dump files
Windows paths do not support `:`, so use a datetime format in ICE dump paths that Windows will accept.
CC #116809, fix#115180.
Remove `IdFunctor` trait.
It's defined in `rustc_data_structures` but is only used in
`rustc_type_ir`. The code is shorter and easier to read if we remove
this layer of abstraction and just do the things directly where they are
needed.
r? `@BoxyUwU`
Implement an internal lint encouraging use of `Span::eq_ctxt`
Adds a new Rustc internal lint that forbids use of `.ctxt() == .ctxt()` for spans, encouraging use of `.eq_ctxt()` instead (see https://github.com/rust-lang/rust/issues/49509).
Also fixed a few violations of the lint in the Rustc codebase (a fun additional way I could test my code). Edit: MIR opt folks I believe that's why you're CC'ed, just a heads up.
Two things I'm not sure about:
1. The way I chose to detect calls to `Span::ctxt`. I know adding diagnostic items to methods is generally discouraged, but after some searching and experimenting I couldn't find anything else that worked, so I went with it. That said, I'm happy to implement something different if there's a better way out there. (For what it's worth, if there is a better way, it might be worth documenting in the rustc-dev-guide, which I'm happy to take care of)
2. The error message for the lint. Ideally it would probably be good to give some context as to why the suggestion is made (e.g. `rustc::default_hash_types` tells the user that it's because of performance), but I don't have that context so I couldn't put it in the error message. Happy to iterate on the error message based on feedback during review.
r? ``@oli-obk``
Add MonoItems and Instance to stable_mir
Also add a few methods to instantiate instances and get an instance definition. We're still missing support to actually monomorphize the instance body.
This is related to https://github.com/rust-lang/project-stable-mir/issues/36
r? ``@oli-obk``
``@oli-obk`` is that what you were thinking? I incorporated ``@bjorn3`` idea of just adding a Shim instance definition in https://github.com/rust-lang/rust/pull/116465.
Special case iterator chain checks for suggestion
When encountering method call chains of `Iterator`, check for trailing `;` in the body of closures passed into `Iterator::map`, as well as calls to `<T as Clone>::clone` when `T` is a type param and `T: !Clone`.
Fix#9082.
Add new simpler and more explicit syntax for check-cfg
<details>
<summary>
Old proposition (before the MCP)
</summary>
This PR adds a new simpler and more explicit syntax for check-cfg. It consist of two new form:
- `exhaustive(names, values)`
- `configure(name, "value1", "value2", ... "valueN")`
The preview forms `names(...)` and `values(...)` have implicit meaning that are not strait-forward. In particular `values(foo)`&`values(bar)` and `names(foo, bar)` are not equivalent which has created [some confusions](https://github.com/rust-lang/rust/pull/98080).
Also the `names()` and `values()` form are not clear either and again created some confusions where peoples believed that `values()`&`values(foo)` could be reduced to just `values(foo)`.
To fix that the two new forms are made to be explicit and simpler. See the table of correspondence:
- `names()` -> `exhaustive(names)`
- `values()` -> `exhaustive(values)`
- `names(foo)` -> `exhaustive(names)`&`configure(foo)`
- `values(foo)` -> `configure(foo)`
- `values(feat, "foo", "bar")` -> `configure(feat, "foo", "bar")`
- `values(foo)`&`values(bar)` -> `configure(foo, bar)`
- `names()`&`values()`&`values(my_cfg)` -> `exhaustive(names, values)`&`configure(my_cfg)`
Another benefits of the new syntax is that it allow for further options (like conditional checking for --cfg, currently always on) without syntax change.
The two previous forms are deprecated and will be removed once cargo and beta rustc have the necessary support.
</details>
This PR is the first part of the implementation of [MCP636 - Simplify and improve explicitness of the check-cfg syntax](https://github.com/rust-lang/compiler-team/issues/636).
## New `cfg` form
It introduces the new [`cfg` form](https://github.com/rust-lang/compiler-team/issues/636) and deprecate the other two:
```
rustc --check-cfg 'cfg(name1, ..., nameN, values("value1", "value2", ... "valueN"))'
```
## Default built-in names and values
It also changes the default for the built-in names and values checking.
- Built-in values checking would always be activated as long as a `--check-cfg` argument is present
- Built-in names checking would always be activated as long as a `--check-cfg` argument is present **unless** if any `cfg(any())` arg is passed
~~**Note: depends on https://github.com/rust-lang/rust/pull/111068 but is reviewable (last two commits)!**~~
Resolve https://github.com/rust-lang/compiler-team/issues/636
r? `@petrochenkov`
These are `Self` in almost all printers except one, which can just store
the state as a field instead. This simplifies the printer and allows for
further simplifications, for example using `&mut self` instead of
passing around the printer.
It's defined in `rustc_data_structures` but is only used in
`rustc_type_ir`. The code is shorter and easier to read if we remove
this layer of abstraction and just do the things directly where they are
needed.