coverage: Emit mappings for unused functions without generating stubs
For a while I've been annoyed by the fact that generating coverage maps for unused functions involves generating a stub function at the LLVM level.
As I suspected, generating that stub function isn't actually necessary, as long as we specifically tell LLVM about the symbol names of all the functions that have coverage mappings but weren't codegenned (due to being unused).
---
There is some helper code that gets moved around in the follow-up patches, so look at the first patch to see the most important functional changes.
---
`@rustbot` label +A-code-coverage
This query has a name that sounds general-purpose, but in fact it has
coverage-specific semantics, and (fortunately) is only used by coverage code.
Because it is only ever called once (from one designated CGU), it doesn't need
to be a query, and we can change it to a regular function instead.
Uplift movability and mutability, the simple way
Just make type_ir a dependency of ast. This can be relaxed later if we want to make the dependency less heavy. Part of rust-lang/types-team#124.
r? `@lcnr` or `@jackh726`
coverage: Move most per-function coverage info into `mir::Body`
Currently, all of the coverage information collected by the `InstrumentCoverage` pass is smuggled through MIR in the form of individual `StatementKind::Coverage` statements, which must then be reassembled by coverage codegen.
That's awkward for a number of reasons:
- While some of the coverage statements do care about their specific position in the MIR control-flow graph, many of them don't, and are just tacked onto the function's first BB as metadata carriers.
- MIR inlining can result in coverage statements being duplicated, so coverage codegen has to jump through hoops to avoid emitting duplicate mappings.
- MIR optimizations that would delete coverage statements need to carefully copy them into the function's first BB so as not to omit them from coverage reports.
- The order in which coverage codegen sees coverage statements is dependent on MIR optimizations/inlining, which can cause unnecessary churn in the emitted coverage mappings.
- We don't have a good way to annotate MIR-level functions with extra coverage info that doesn't belong in a statement.
---
This PR therefore takes most of the per-function coverage info and stores it in a field in `mir::Body` as `Option<Box<FunctionCoverageInfo>>`.
(This adds one pointer to the size of `mir::Body`, even when coverage is not enabled.)
Coverage statements still need to be injected into MIR in some cases, but only when they actually affect codegen (counters) or are needed to detect code that has been optimized away as unreachable (counters/expressions).
---
By the end of this PR, the information stored in `FunctionCoverageInfo` is:
- A hash of the function's source code (needed by LLVM's coverage map format)
- The number of coverage counters added by coverage instrumentation
- A table of coverage expressions, associating each expression ID with its operator (add or subtract) and its two operands
- The list of mappings, associating each covered code region with a counter/expression/zero value
---
~~This is built on top of #115301, so I'll rebase and roll a reviewer once that lands.~~
r? `@ghost`
`@rustbot` label +A-code-coverage
This new description reflects the changes made in this PR, and should hopefully
be more useful to non-coverage developers who need to care about coverage
statements.
Even though expression details are now stored in the info structure, we still
need to inject `ExpressionUsed` statements into MIR, because if one is missing
during codegen then we know that it was optimized out and we can remap all of
its associated code regions to zero.
Previously, mappings were attached to individual coverage statements in MIR.
That necessitated special handling in MIR optimizations to avoid deleting those
statements, since otherwise codegen would be unable to reassemble the original
list of mappings.
With this change, a function's list of mappings is now attached to its MIR
body, and survives intact even if individual statements are deleted by
optimizations.
Don't compare host param by name
Seems sketchy to be searching for `sym::host` by name, especially when we can get the actual index with not very much work.
r? fee1-dead
Coverage codegen can now allocate arrays based on the number of
counters/expressions originally used by the instrumentor.
The existing query that inspects coverage statements is still used for
determining the number of counters passed to `llvm.instrprof.increment`. If
some high-numbered counters were removed by MIR optimizations, the instrumented
binary can potentially use less memory and disk space at runtime.
This allows coverage information to be attached to the function as a whole when
appropriate, instead of being smuggled through coverage statements in the
function's basic blocks.
As an example, this patch moves the `function_source_hash` value out of
individual `CoverageKind::Counter` statements and into the per-function info.
When synthesizing unused functions for coverage purposes, the absence of this
info is taken to indicate that a function was not eligible for coverage and
should not be synthesized.
Remove lots of generics from `ty::print`
All of these generics mostly resolve to the same thing, which means we can remove them, greatly simplifying the types involved in pretty printing and unlocking another simplification (that is not performed in this PR): Using `&mut self` instead of passing `self` through the return type.
cc `@eddyb` you probably know why it's like this, just checking in and making sure I didn't do anything bad
r? oli-obk
These are `Self` in almost all printers except one, which can just store
the state as a field instead. This simplifies the printer and allows for
further simplifications, for example using `&mut self` instead of
passing around the printer.
don't UB on dangling ptr deref, instead check inbounds on projections
This implements https://github.com/rust-lang/reference/pull/1387 in Miri. See that PR for what the change is about.
Detecting dangling references in `let x = &...;` is now done by validity checking only, so some tests need to have validity checking enabled. There is no longer inherently a "nodangle" check in evaluating the expression `&*ptr` (aside from the aliasing model).
r? `@oli-obk`
Based on:
- https://github.com/rust-lang/reference/pull/1387
- https://github.com/rust-lang/rust/pull/115524
interpret: clean up AllocBytes
Fixes https://github.com/rust-lang/miri/issues/2836
Nothing has moved here in half a year, so let's just remove these unused stubs -- they need a proper re-design anyway.
r? `@oli-obk`
Format all the let-chains in compiler crates
Since rust-lang/rustfmt#5910 has landed, soon we will have support for formatting let-chains (as soon as rustfmt syncs and beta gets bumped).
This PR applies the changes [from master rustfmt to rust-lang/rust eagerly](https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/out.20formatting.20of.20prs/near/374997516), so that the next beta bump does not have to deal with a 200+ file diff and can remain concerned with other things like `cfg(bootstrap)` -- #113637 was a pain to land, for example, because of let-else.
I will also add this commit to the ignore list after it has landed.
The commands that were run -- I'm not great at bash-foo, but this applies rustfmt to every compiler crate, and then reverts the two crates that should probably be formatted out-of-tree.
```
~/rustfmt $ ls -1d ~/rust/compiler/* | xargs -I@ cargo run --bin rustfmt -- `@/src/lib.rs` --config-path ~/rust --edition=2021 # format all of the compiler crates
~/rust $ git checkout HEAD -- compiler/rustc_codegen_{gcc,cranelift} # revert changes to cg-gcc and cg-clif
```
cc `@rust-lang/rustfmt`
r? `@WaffleLapkin` or `@Nilstrieb` who said they may be able to review this purely mechanical PR :>
cc `@Mark-Simulacrum` and `@petrochenkov,` who had some thoughts on the order of operations with big formatting changes in https://github.com/rust-lang/rust/pull/95262#issue-1178993801. I think the situation has changed since then, given that let-chains support exists on master rustfmt now, and I'm fairly confident that this formatting PR should land even if *bootstrap* rustfmt doesn't yet format let-chains in order to lessen the burden of the next beta bump.
Relate alias ty with variance
In the new solver, turns out that the subst-relate branch of the alias-relate predicate was relating args invariantly even for opaques, which have variance 💀.
This change is a bit more invasive, but I'd rather not special-case it [here](aeaa5c30e5/compiler/rustc_trait_selection/src/solve/alias_relate.rs (L171-L190)) and then have it break elsewhere. I'm doing a perf run to see if the extra call to `def_kind` is that expensive, if it is, I'll reconsider.
r? ``@lcnr``