coverage: Move most per-function coverage info into `mir::Body`
Currently, all of the coverage information collected by the `InstrumentCoverage` pass is smuggled through MIR in the form of individual `StatementKind::Coverage` statements, which must then be reassembled by coverage codegen.
That's awkward for a number of reasons:
- While some of the coverage statements do care about their specific position in the MIR control-flow graph, many of them don't, and are just tacked onto the function's first BB as metadata carriers.
- MIR inlining can result in coverage statements being duplicated, so coverage codegen has to jump through hoops to avoid emitting duplicate mappings.
- MIR optimizations that would delete coverage statements need to carefully copy them into the function's first BB so as not to omit them from coverage reports.
- The order in which coverage codegen sees coverage statements is dependent on MIR optimizations/inlining, which can cause unnecessary churn in the emitted coverage mappings.
- We don't have a good way to annotate MIR-level functions with extra coverage info that doesn't belong in a statement.
---
This PR therefore takes most of the per-function coverage info and stores it in a field in `mir::Body` as `Option<Box<FunctionCoverageInfo>>`.
(This adds one pointer to the size of `mir::Body`, even when coverage is not enabled.)
Coverage statements still need to be injected into MIR in some cases, but only when they actually affect codegen (counters) or are needed to detect code that has been optimized away as unreachable (counters/expressions).
---
By the end of this PR, the information stored in `FunctionCoverageInfo` is:
- A hash of the function's source code (needed by LLVM's coverage map format)
- The number of coverage counters added by coverage instrumentation
- A table of coverage expressions, associating each expression ID with its operator (add or subtract) and its two operands
- The list of mappings, associating each covered code region with a counter/expression/zero value
---
~~This is built on top of #115301, so I'll rebase and roll a reviewer once that lands.~~
r? `@ghost`
`@rustbot` label +A-code-coverage
This new description reflects the changes made in this PR, and should hopefully
be more useful to non-coverage developers who need to care about coverage
statements.
Even though expression details are now stored in the info structure, we still
need to inject `ExpressionUsed` statements into MIR, because if one is missing
during codegen then we know that it was optimized out and we can remap all of
its associated code regions to zero.
Previously, mappings were attached to individual coverage statements in MIR.
That necessitated special handling in MIR optimizations to avoid deleting those
statements, since otherwise codegen would be unable to reassemble the original
list of mappings.
With this change, a function's list of mappings is now attached to its MIR
body, and survives intact even if individual statements are deleted by
optimizations.
Instead of modifying the accumulated expressions in-place, we now build a set
of expressions that are known to be zero, and then consult that set on the fly
when converting the expression data for FFI.
This will be necessary when moving mappings and expression data into function
coverage info, which can't be mutated during codegen.
Don't compare host param by name
Seems sketchy to be searching for `sym::host` by name, especially when we can get the actual index with not very much work.
r? fee1-dead
Coverage codegen can now allocate arrays based on the number of
counters/expressions originally used by the instrumentor.
The existing query that inspects coverage statements is still used for
determining the number of counters passed to `llvm.instrprof.increment`. If
some high-numbered counters were removed by MIR optimizations, the instrumented
binary can potentially use less memory and disk space at runtime.
This allows coverage information to be attached to the function as a whole when
appropriate, instead of being smuggled through coverage statements in the
function's basic blocks.
As an example, this patch moves the `function_source_hash` value out of
individual `CoverageKind::Counter` statements and into the per-function info.
When synthesizing unused functions for coverage purposes, the absence of this
info is taken to indicate that a function was not eligible for coverage and
should not be synthesized.
Remove lots of generics from `ty::print`
All of these generics mostly resolve to the same thing, which means we can remove them, greatly simplifying the types involved in pretty printing and unlocking another simplification (that is not performed in this PR): Using `&mut self` instead of passing `self` through the return type.
cc `@eddyb` you probably know why it's like this, just checking in and making sure I didn't do anything bad
r? oli-obk
Use `YYYY-MM-DDTHH_MM_SS` as datetime format for ICE dump files
Windows paths do not support `:`, so use a datetime format in ICE dump paths that Windows will accept.
CC #116809, fix#115180.
Remove `IdFunctor` trait.
It's defined in `rustc_data_structures` but is only used in
`rustc_type_ir`. The code is shorter and easier to read if we remove
this layer of abstraction and just do the things directly where they are
needed.
r? `@BoxyUwU`
Implement an internal lint encouraging use of `Span::eq_ctxt`
Adds a new Rustc internal lint that forbids use of `.ctxt() == .ctxt()` for spans, encouraging use of `.eq_ctxt()` instead (see https://github.com/rust-lang/rust/issues/49509).
Also fixed a few violations of the lint in the Rustc codebase (a fun additional way I could test my code). Edit: MIR opt folks I believe that's why you're CC'ed, just a heads up.
Two things I'm not sure about:
1. The way I chose to detect calls to `Span::ctxt`. I know adding diagnostic items to methods is generally discouraged, but after some searching and experimenting I couldn't find anything else that worked, so I went with it. That said, I'm happy to implement something different if there's a better way out there. (For what it's worth, if there is a better way, it might be worth documenting in the rustc-dev-guide, which I'm happy to take care of)
2. The error message for the lint. Ideally it would probably be good to give some context as to why the suggestion is made (e.g. `rustc::default_hash_types` tells the user that it's because of performance), but I don't have that context so I couldn't put it in the error message. Happy to iterate on the error message based on feedback during review.
r? ``@oli-obk``
Add MonoItems and Instance to stable_mir
Also add a few methods to instantiate instances and get an instance definition. We're still missing support to actually monomorphize the instance body.
This is related to https://github.com/rust-lang/project-stable-mir/issues/36
r? ``@oli-obk``
``@oli-obk`` is that what you were thinking? I incorporated ``@bjorn3`` idea of just adding a Shim instance definition in https://github.com/rust-lang/rust/pull/116465.
Special case iterator chain checks for suggestion
When encountering method call chains of `Iterator`, check for trailing `;` in the body of closures passed into `Iterator::map`, as well as calls to `<T as Clone>::clone` when `T` is a type param and `T: !Clone`.
Fix#9082.
Add new simpler and more explicit syntax for check-cfg
<details>
<summary>
Old proposition (before the MCP)
</summary>
This PR adds a new simpler and more explicit syntax for check-cfg. It consist of two new form:
- `exhaustive(names, values)`
- `configure(name, "value1", "value2", ... "valueN")`
The preview forms `names(...)` and `values(...)` have implicit meaning that are not strait-forward. In particular `values(foo)`&`values(bar)` and `names(foo, bar)` are not equivalent which has created [some confusions](https://github.com/rust-lang/rust/pull/98080).
Also the `names()` and `values()` form are not clear either and again created some confusions where peoples believed that `values()`&`values(foo)` could be reduced to just `values(foo)`.
To fix that the two new forms are made to be explicit and simpler. See the table of correspondence:
- `names()` -> `exhaustive(names)`
- `values()` -> `exhaustive(values)`
- `names(foo)` -> `exhaustive(names)`&`configure(foo)`
- `values(foo)` -> `configure(foo)`
- `values(feat, "foo", "bar")` -> `configure(feat, "foo", "bar")`
- `values(foo)`&`values(bar)` -> `configure(foo, bar)`
- `names()`&`values()`&`values(my_cfg)` -> `exhaustive(names, values)`&`configure(my_cfg)`
Another benefits of the new syntax is that it allow for further options (like conditional checking for --cfg, currently always on) without syntax change.
The two previous forms are deprecated and will be removed once cargo and beta rustc have the necessary support.
</details>
This PR is the first part of the implementation of [MCP636 - Simplify and improve explicitness of the check-cfg syntax](https://github.com/rust-lang/compiler-team/issues/636).
## New `cfg` form
It introduces the new [`cfg` form](https://github.com/rust-lang/compiler-team/issues/636) and deprecate the other two:
```
rustc --check-cfg 'cfg(name1, ..., nameN, values("value1", "value2", ... "valueN"))'
```
## Default built-in names and values
It also changes the default for the built-in names and values checking.
- Built-in values checking would always be activated as long as a `--check-cfg` argument is present
- Built-in names checking would always be activated as long as a `--check-cfg` argument is present **unless** if any `cfg(any())` arg is passed
~~**Note: depends on https://github.com/rust-lang/rust/pull/111068 but is reviewable (last two commits)!**~~
Resolve https://github.com/rust-lang/compiler-team/issues/636
r? `@petrochenkov`
These are `Self` in almost all printers except one, which can just store
the state as a field instead. This simplifies the printer and allows for
further simplifications, for example using `&mut self` instead of
passing around the printer.
It's defined in `rustc_data_structures` but is only used in
`rustc_type_ir`. The code is shorter and easier to read if we remove
this layer of abstraction and just do the things directly where they are
needed.
docs: add Rust logo to more compiler crates
c6e6ecb1af added it to some of the compiler's crates, but avoided adding it to all of them to reduce bit-rot. This commit adds to more.
r? `@GuillaumeGomez`
Use tidy to enforce alphabetical dependency ordering
I get annoyed when dependencies in `Cargo.toml` files are not in alphabetical order. The [style guide](https://github.com/rust-lang/rust/blob/master/src/doc/style-guide/src/cargo.md) agrees with me.
There are ongoing efforts to provide linting/formatting of `Cargo.toml` files, e.g. https://github.com/rust-lang/rustfmt/pull/5240, https://crates.io/crates/cargo-toml-lint, and https://github.com/TimonPost/cargo-toml-format. But it's far from clear what's the right approach.
So this PR does something very simple: it uses the order checking already present in tidy. This allows incremental application of ordering, starting right now, and avoiding the need for any kind of all-at-once conversion.
If we do end up using some more comprehensive `Cargo.toml` linting/formatting solution in the future, the `tidy-alphabetical` lines will be easy to remove.
r? `@wesleywiser`