coverage: Move most per-function coverage info into `mir::Body`
Currently, all of the coverage information collected by the `InstrumentCoverage` pass is smuggled through MIR in the form of individual `StatementKind::Coverage` statements, which must then be reassembled by coverage codegen.
That's awkward for a number of reasons:
- While some of the coverage statements do care about their specific position in the MIR control-flow graph, many of them don't, and are just tacked onto the function's first BB as metadata carriers.
- MIR inlining can result in coverage statements being duplicated, so coverage codegen has to jump through hoops to avoid emitting duplicate mappings.
- MIR optimizations that would delete coverage statements need to carefully copy them into the function's first BB so as not to omit them from coverage reports.
- The order in which coverage codegen sees coverage statements is dependent on MIR optimizations/inlining, which can cause unnecessary churn in the emitted coverage mappings.
- We don't have a good way to annotate MIR-level functions with extra coverage info that doesn't belong in a statement.
---
This PR therefore takes most of the per-function coverage info and stores it in a field in `mir::Body` as `Option<Box<FunctionCoverageInfo>>`.
(This adds one pointer to the size of `mir::Body`, even when coverage is not enabled.)
Coverage statements still need to be injected into MIR in some cases, but only when they actually affect codegen (counters) or are needed to detect code that has been optimized away as unreachable (counters/expressions).
---
By the end of this PR, the information stored in `FunctionCoverageInfo` is:
- A hash of the function's source code (needed by LLVM's coverage map format)
- The number of coverage counters added by coverage instrumentation
- A table of coverage expressions, associating each expression ID with its operator (add or subtract) and its two operands
- The list of mappings, associating each covered code region with a counter/expression/zero value
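As a rough illustration of the shape described by the list above, here is a simplified sketch using only standard-library types. Apart from `FunctionCoverageInfo` and `function_source_hash`, every name here (including the `Body` field name and the `Term`, `Expression`, `CodeRegion` and `Mapping` types) is an assumption made for this sketch and may not match the compiler's actual definitions.

```rust
// Simplified stand-ins; the names and layouts below are illustrative assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Op {
    Add,
    Subtract,
}

// A value that a mapping or an expression operand can refer to.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Term {
    Zero,
    Counter(u32),
    Expression(u32),
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct Expression {
    lhs: Term,
    op: Op,
    rhs: Term,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct CodeRegion {
    start_line: u32,
    start_col: u32,
    end_line: u32,
    end_col: u32,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct Mapping {
    term: Term,
    region: CodeRegion,
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct FunctionCoverageInfo {
    // Hash of the function's source code.
    function_source_hash: u64,
    // Number of counters added by instrumentation.
    num_counters: usize,
    // Expression table; the index plays the role of the expression ID.
    expressions: Vec<Expression>,
    // Covered code regions and the counter/expression/zero value for each.
    mappings: Vec<Mapping>,
}

// Hypothetical stand-in for `mir::Body`, showing only the new field.
struct Body {
    function_coverage_info: Option<Box<FunctionCoverageInfo>>,
}
```

Because `Option<Box<T>>` is represented as a single nullable pointer, a body without coverage info pays for one pointer-sized field and nothing more.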
---
~~This is built on top of #115301, so I'll rebase and roll a reviewer once that lands.~~
r? `@ghost`
`@rustbot` label +A-code-coverage
This new description reflects the changes made in this PR, and should hopefully
be more useful to non-coverage developers who need to care about coverage
statements.
Even though expression details are now stored in the info structure, we still
need to inject `ExpressionUsed` statements into MIR, because if one is missing
during codegen then we know that it was optimized out and we can remap all of
its associated code regions to zero.
Previously, mappings were attached to individual coverage statements in MIR.
That necessitated special handling in MIR optimizations to avoid deleting those
statements, since otherwise codegen would be unable to reassemble the original
list of mappings.
With this change, a function's list of mappings is now attached to its MIR
body, and survives intact even if individual statements are deleted by
optimizations.
Instead of modifying the accumulated expressions in-place, we now build a set
of expressions that are known to be zero, and then consult that set on the fly
when converting the expression data for FFI.
This will be necessary when moving mappings and expression data into function
coverage info, which can't be mutated during codegen.
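A minimal sketch of the "zero set" approach, assuming simplified stand-in types. The names `Term`, `Expression`, `resolve`, `zero_expressions` and `convert` are hypothetical, and the only simplification rule shown (an expression whose operands are both zero is itself zero) is an assumption chosen for illustration rather than the compiler's full rule set.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Term {
    Zero,
    Counter(u32),
    Expression(u32),
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Op {
    Add,
    Subtract,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Expression {
    lhs: Term,
    op: Op,
    rhs: Term,
}

/// Replaces a reference to a known-zero expression with `Term::Zero`.
fn resolve(term: Term, zero: &HashSet<u32>) -> Term {
    match term {
        Term::Expression(id) if zero.contains(&id) => Term::Zero,
        other => other,
    }
}

/// Builds the set of expression IDs known to be zero, without mutating
/// `expressions`. The slice index is the expression ID, and operands are
/// assumed to refer only to lower-numbered expressions. `initially_zero`
/// seeds the set (for example, expressions assumed to be optimized out).
fn zero_expressions(expressions: &[Expression], initially_zero: &HashSet<u32>) -> HashSet<u32> {
    let mut zero = initially_zero.clone();
    for (id, expr) in expressions.iter().enumerate() {
        let both_zero = resolve(expr.lhs, &zero) == Term::Zero
            && resolve(expr.rhs, &zero) == Term::Zero;
        if both_zero {
            zero.insert(id as u32);
        }
    }
    zero
}

/// Produces the converted expression table, consulting the zero set on the fly.
fn convert(expressions: &[Expression], zero: &HashSet<u32>) -> Vec<Expression> {
    expressions
        .iter()
        .map(|e| Expression {
            lhs: resolve(e.lhs, zero),
            op: e.op,
            rhs: resolve(e.rhs, zero),
        })
        .collect()
}
```

The input slice is only ever read, which is the property that matters once the expression data lives in immutable per-function info.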
Don't compare host param by name
Seems sketchy to be searching for `sym::host` by name, especially when we can get the actual index with not very much work.
r? fee1-dead
Coverage codegen can now allocate arrays based on the number of
counters/expressions originally used by the instrumentor.
The existing query that inspects coverage statements is still used for
determining the number of counters passed to `llvm.instrprof.increment`. If
some high-numbered counters were removed by MIR optimizations, the instrumented
binary can potentially use less memory and disk space at runtime.
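To illustrate why the two numbers can differ, here is a small sketch that assumes the counter count passed to `llvm.instrprof.increment` is one more than the highest counter ID still present after MIR optimizations. The function name and the slice-of-IDs input are hypothetical.

```rust
/// Hypothetical helper: given the counter IDs that survive in MIR,
/// returns how many counters need to exist at runtime.
/// (Assumes IDs are zero-based, so the count is `max_id + 1`.)
fn num_counters_for_increment(surviving_counter_ids: &[u32]) -> u32 {
    surviving_counter_ids
        .iter()
        .max()
        .map_or(0, |max_id| max_id + 1)
}
```

For example, if the instrumentor originally created counters 0 through 4 but only 0, 1 and 2 survive, this yields 3 rather than 5.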
This allows coverage information to be attached to the function as a whole when
appropriate, instead of being smuggled through coverage statements in the
function's basic blocks.
As an example, this patch moves the `function_source_hash` value out of
individual `CoverageKind::Counter` statements and into the per-function info.
When synthesizing unused functions for coverage purposes, the absence of this
info is taken to indicate that a function was not eligible for coverage and
should not be synthesized.
Remove lots of generics from `ty::print`
Almost all of these generics resolve to the same thing, which means we can remove them, greatly simplifying the types involved in pretty printing and unlocking another simplification (that is not performed in this PR): using `&mut self` instead of passing `self` through the return type.
cc `@eddyb` you probably know why it's like this, just checking in and making sure I didn't do anything bad
r? oli-obk
Use `YYYY-MM-DDTHH_MM_SS` as datetime format for ICE dump files
Windows paths do not support `:`, so use a datetime format in ICE dump paths that Windows will accept.
CC #116809, fix #115180.
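A minimal sketch of the `YYYY-MM-DDTHH_MM_SS` layout, assuming the date and time components are already available as integers. The function name is hypothetical, and this only formats the timestamp portion, not the full dump file name.

```rust
/// Formats a timestamp as `YYYY-MM-DDTHH_MM_SS`, using `_` instead of `:`
/// so the result is usable in a Windows path.
fn ice_timestamp(year: u32, month: u32, day: u32, hour: u32, minute: u32, second: u32) -> String {
    format!("{year:04}-{month:02}-{day:02}T{hour:02}_{minute:02}_{second:02}")
}
```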
Remove `IdFunctor` trait.
It's defined in `rustc_data_structures` but is only used in
`rustc_type_ir`. The code is shorter and easier to read if we remove
this layer of abstraction and just do the things directly where they are
needed.
r? `@BoxyUwU`
Automatically enable cross-crate inlining for small functions
This is basically reviving https://github.com/rust-lang/rust/pull/70550
The `#[inline]` attribute can have a significant impact on code generation or runtime performance (because it enables inlining between CGUs where it would normally not happen) and also on compile-time performance (because it enables MIR inlining). But it has to be added manually, which is awkward.
This PR factors whether a DefId is cross-crate inlinable into a query, and replaces all uses of `CodegenFnAttrs::requests_inline` with this new query. The new query incorporates all the other logic that is used to determine whether a Def should be treated as cross-crate-inlinable, and as a last step inspects the function's optimized_mir to determine if it should be treated as cross-crate-inlinable.
The heuristic implemented here is deliberately conservative; we only infer inlinability for functions whose optimized_mir does not contain any calls or asserts. I plan to study adjusting the cost model later, but for now the compile time implications of this change are so significant that I think this very crude heuristic is well worth landing.
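The conservative check can be pictured roughly as follows. This is a sketch over a toy model of MIR: the `Terminator` enum, the `Body` struct and the function name are all stand-ins invented for illustration, not the compiler's types.

```rust
// Toy stand-in for the kinds of block terminators a function body can contain.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Terminator {
    Goto,
    SwitchInt,
    Return,
    Call,
    Assert,
}

// Toy stand-in for a function's optimized MIR: one terminator per basic block.
struct Body {
    terminators: Vec<Terminator>,
}

/// Hypothetical heuristic: infer cross-crate inlinability only when the body
/// contains no calls and no asserts.
fn looks_cross_crate_inlinable(body: &Body) -> bool {
    !body
        .terminators
        .iter()
        .any(|t| matches!(t, Terminator::Call | Terminator::Assert))
}
```

A simple getter-like body (only `Goto`/`SwitchInt`/`Return`) passes, while any body that calls another function or contains an assert (for example, a bounds check) does not.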
Normalize alloc-id in tests.
AllocIds are globally numbered in a rustc invocation. This makes them very sensitive to changes unrelated to what is being tested. This commit normalizes them by renumbering, in order of appearance in the output.
Renumbering preserves the identity of each allocation, which a simple `allocN` placeholder would not. This is useful when we have memory dumps.
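A rough sketch of renumbering by order of appearance, using only the standard library. The function name, the zero-based numbering and the lowercase `alloc` output prefix are assumptions for illustration; the real test harness may format normalized IDs differently. Note that this naive scan matches the substring `alloc` anywhere, so it would also rewrite something like `malloc12`.

```rust
use std::collections::HashMap;

/// Rewrites every `alloc<digits>` so that IDs are numbered in order of first
/// appearance, while repeated occurrences of the same ID stay identical.
fn normalize_alloc_ids(input: &str) -> String {
    const PREFIX: &str = "alloc";
    let mut seen: HashMap<&str, usize> = HashMap::new();
    let mut out = String::with_capacity(input.len());
    let mut rest = input;

    while let Some(pos) = rest.find(PREFIX) {
        let after = &rest[pos + PREFIX.len()..];
        // Length of the run of ASCII digits following the prefix.
        let digits_len = after.bytes().take_while(|b| b.is_ascii_digit()).count();

        // Copy everything up to and including the prefix unchanged.
        out.push_str(&rest[..pos + PREFIX.len()]);

        if digits_len > 0 {
            let original_id = &after[..digits_len];
            let next = seen.len();
            let renumbered = *seen.entry(original_id).or_insert(next);
            out.push_str(&renumbered.to_string());
        }
        rest = &after[digits_len..];
    }

    out.push_str(rest);
    out
}
```

For instance, `alloc17 -> alloc3, alloc17` becomes `alloc0 -> alloc1, alloc0`: both mentions of the first allocation still agree with each other.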
cc `@saethlin`
r? `@oli-obk`
Rollup of 5 pull requests
Successful merges:
- #111072 (Add new simpler and more explicit syntax for check-cfg)
- #116717 (Special case iterator chain checks for suggestion)
- #116719 (Add MonoItems and Instance to stable_mir)
- #116787 (Implement an internal lint encouraging use of `Span::eq_ctxt`)
- #116827 (Make `handle_options` public again.)
r? `@ghost`
`@rustbot` modify labels: rollup
Implement an internal lint encouraging use of `Span::eq_ctxt`
Adds a new Rustc internal lint that forbids use of `.ctxt() == .ctxt()` for spans, encouraging use of `.eq_ctxt()` instead (see https://github.com/rust-lang/rust/issues/49509).
Also fixed a few violations of the lint in the Rustc codebase (a fun additional way I could test my code). Edit: MIR opt folks I believe that's why you're CC'ed, just a heads up.
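For readers unfamiliar with the pattern, here is a small self-contained sketch of what the lint flags and what it suggests. The `Span` and `SyntaxContext` types below are simplified stand-ins written for this example, not the real compiler types; only the method names `ctxt` and `eq_ctxt` are taken from the description above.

```rust
// Simplified stand-ins for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SyntaxContext(u32);

#[derive(Clone, Copy, Debug)]
struct Span {
    lo: u32,
    hi: u32,
    ctxt: SyntaxContext,
}

impl Span {
    fn ctxt(self) -> SyntaxContext {
        self.ctxt
    }

    fn eq_ctxt(self, other: Span) -> bool {
        self.ctxt == other.ctxt
    }
}

// The pattern the lint forbids: comparing the results of two `.ctxt()` calls.
fn same_context_flagged(a: Span, b: Span) -> bool {
    a.ctxt() == b.ctxt()
}

// The replacement the lint encourages.
fn same_context_preferred(a: Span, b: Span) -> bool {
    a.eq_ctxt(b)
}
```

Both functions return the same answer; the lint is only about which spelling is used.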
Two things I'm not sure about:
1. The way I chose to detect calls to `Span::ctxt`. I know adding diagnostic items to methods is generally discouraged, but after some searching and experimenting I couldn't find anything else that worked, so I went with it. That said, I'm happy to implement something different if there's a better way out there. (For what it's worth, if there is a better way, it might be worth documenting in the rustc-dev-guide, which I'm happy to take care of)
2. The error message for the lint. Ideally it would probably be good to give some context as to why the suggestion is made (e.g. `rustc::default_hash_types` tells the user that it's because of performance), but I don't have that context so I couldn't put it in the error message. Happy to iterate on the error message based on feedback during review.
r? ``@oli-obk``
Add MonoItems and Instance to stable_mir
Also add a few methods to instantiate instances and get an instance definition. We're still missing support to actually monomorphize the instance body.
This is related to https://github.com/rust-lang/project-stable-mir/issues/36
r? ``@oli-obk``
``@oli-obk`` is that what you were thinking? I incorporated ``@bjorn3``'s idea of just adding a Shim instance definition in https://github.com/rust-lang/rust/pull/116465.
Special case iterator chain checks for suggestion
When encountering method call chains of `Iterator`, check for trailing `;` in the body of closures passed into `Iterator::map`, as well as calls to `<T as Clone>::clone` when `T` is a type param and `T: !Clone`.
Fix #9082.
Add new simpler and more explicit syntax for check-cfg
<details>
<summary>
Old proposition (before the MCP)
</summary>
This PR adds a new simpler and more explicit syntax for check-cfg. It consists of two new forms:
- `exhaustive(names, values)`
- `configure(name, "value1", "value2", ... "valueN")`
The previous forms `names(...)` and `values(...)` have implicit meanings that are not straightforward. In particular, `values(foo)`&`values(bar)` and `names(foo, bar)` are not equivalent, which has created [some confusion](https://github.com/rust-lang/rust/pull/98080).
The `names()` and `values()` forms are not clear either, and again created some confusion: people believed that `values()`&`values(foo)` could be reduced to just `values(foo)`.
To fix that, the two new forms are made explicit and simpler. See the table of correspondence:
- `names()` -> `exhaustive(names)`
- `values()` -> `exhaustive(values)`
- `names(foo)` -> `exhaustive(names)`&`configure(foo)`
- `values(foo)` -> `configure(foo)`
- `values(feat, "foo", "bar")` -> `configure(feat, "foo", "bar")`
- `values(foo)`&`values(bar)` -> `configure(foo, bar)`
- `names()`&`values()`&`values(my_cfg)` -> `exhaustive(names, values)`&`configure(my_cfg)`
Another benefit of the new syntax is that it allows for further options (like conditional checking for --cfg, currently always on) without a syntax change.
The two previous forms are deprecated and will be removed once cargo and beta rustc have the necessary support.
</details>
This PR is the first part of the implementation of [MCP636 - Simplify and improve explicitness of the check-cfg syntax](https://github.com/rust-lang/compiler-team/issues/636).
## New `cfg` form
It introduces the new [`cfg` form](https://github.com/rust-lang/compiler-team/issues/636) and deprecates the other two:
```
rustc --check-cfg 'cfg(name1, ..., nameN, values("value1", "value2", ... "valueN"))'
```
## Default built-in names and values
It also changes the default for the built-in names and values checking.
- Built-in values checking would always be activated as long as a `--check-cfg` argument is present
- Built-in names checking would always be activated as long as a `--check-cfg` argument is present, **unless** any `cfg(any())` argument is passed
~~**Note: depends on https://github.com/rust-lang/rust/pull/111068 but is reviewable (last two commits)!**~~
Resolve https://github.com/rust-lang/compiler-team/issues/636
r? `@petrochenkov`
Update cargo
17 commits in 6fa6fdc7606cfa664f9bee2fb33ee2ed904f4e88..ff768b45b302efd488178b31b35489e4fabb8799
2023-10-10 23:06:08 +0000 to 2023-10-17 12:51:31 +0000
- Clarify flag behavior in `cargo remove --help` (rust-lang/cargo#12823)
- doc(cargo-login): mention args after `--` in manpage (rust-lang/cargo#12832)
- changelog: add compat notice for `cargo login -- <arg>` (rust-lang/cargo#12830)
- update SPDX License info (rust-lang/cargo#12827)
- Add test for `-V` short argument (rust-lang/cargo#12822)
- add detailed message when target folder path is invalid (rust-lang/cargo#12820)
- chore(deps): update rust crate toml_edit to 0.20.2 (rust-lang/cargo#12761)
- Support `public` dependency configuration with workspace deps (rust-lang/cargo#12817)
- Update rustix to 0.38.18 (rust-lang/cargo#12815)
- contrib docs: add some conveniences (rust-lang/cargo#12812)
- Better suggestion for unsupported `--path` flag (rust-lang/cargo#12811)
- contrib docs: update rfc and roadmap links (rust-lang/cargo#12814)
- contrib doc: remove extraneous word (rust-lang/cargo#12813)
- Update curl-sys to pull in curl 8.4.0 (rust-lang/cargo#12808)
- feat: add package name and version to warning messages (rust-lang/cargo#12799)
- Do not call it "Downgrading" when difference is only build metadata (rust-lang/cargo#12796)
- Add unsupported short flag suggestion for `--target` and `--exclude` flags (rust-lang/cargo#12805)
r? ghost
These are `Self` in almost all printers except one, which can just store
the state as a field instead. This simplifies the printer and allows for
further simplifications, for example using `&mut self` instead of
passing around the printer.
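To make the `&mut self` idea concrete, here is a sketch contrasting the two calling styles on a toy printer. The traits, the `StringPrinter` type and the method names are invented for this example and are far simpler than the real printing traits.

```rust
use std::fmt::{self, Write};

// Toy printer that accumulates output in a `String`.
struct StringPrinter {
    out: String,
}

// Style 1: the printer is consumed and handed back through the return type.
trait PrintByValue: Sized {
    fn path_by_value(self, segments: &[&str]) -> Result<Self, fmt::Error>;
}

// Style 2: the printer is borrowed mutably and nothing needs to be returned.
trait PrintByRef {
    fn path_by_ref(&mut self, segments: &[&str]) -> fmt::Result;
}

impl PrintByValue for StringPrinter {
    fn path_by_value(mut self, segments: &[&str]) -> Result<Self, fmt::Error> {
        write!(self.out, "{}", segments.join("::"))?;
        Ok(self)
    }
}

impl PrintByRef for StringPrinter {
    fn path_by_ref(&mut self, segments: &[&str]) -> fmt::Result {
        write!(self.out, "{}", segments.join("::"))
    }
}
```

With the by-value style every call site has to rebind the printer (`let p = p.path_by_value(..)?;`), whereas the `&mut self` style is just `p.path_by_ref(..)?;`.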