coverage: Emit the filenames section before encoding per-function mappings
When embedding coverage information in LLVM IR (and ultimately in the resulting binary), there are two main things that each CGU needs to emit:
- A single `__llvm_covmap` record containing a coverage header, which mostly consists of a list of filenames used by the CGU's coverage mappings.
- Several `__llvm_covfun` records, one for each instrumented function, each of which contains the hash of the list of filenames in the header.
There is a kind of loose cyclic dependency between the two: we need the hash of the file table before we can emit the covfun records, but we need to traverse all of the instrumented functions in order to build the file table.
The existing code works by processing the individual functions first. It lazily adds filenames to the file table, and stores the mostly-complete function records in a temporary list. After this it hashes the file table, emits the header (containing the file table), and then uses the hash to emit all of the function records.
This PR reverses that order: first we traverse all of the functions (without trying to prepare their function records) to build a *complete* file table, and then emit it immediately. At this point we have the file table hash, so we can then proceed to build and emit all of the function records, without needing to store them in an intermediate list.
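The new ordering can be sketched roughly as follows. This is a simplified illustration, not the actual compiler code: `InstrumentedFn`, `FnRecord`, `build_file_table`, `hash_file_table` and `emit_coverage` are hypothetical names, and `DefaultHasher` stands in for whatever hash the coverage format really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for one instrumented function in the CGU.
struct InstrumentedFn {
    name: &'static str,
    files: Vec<&'static str>,
}

/// Hypothetical stand-in for an emitted `__llvm_covfun` record.
struct FnRecord {
    name: &'static str,
    filenames_hash: u64,
}

/// First traversal: collect every distinct filename, in first-seen order,
/// without preparing any per-function records yet.
fn build_file_table(fns: &[InstrumentedFn]) -> Vec<&'static str> {
    let mut table = Vec::new();
    for f in fns {
        for &file in &f.files {
            if !table.contains(&file) {
                table.push(file);
            }
        }
    }
    table
}

/// Placeholder hash (an assumption, not the real coverage-format hash).
fn hash_file_table(table: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    table.hash(&mut hasher);
    hasher.finish()
}

/// The file table is completed and hashed first; each function record is
/// then built with the hash already known, so no temporary list of
/// half-finished records is needed.
fn emit_coverage(fns: &[InstrumentedFn]) -> (Vec<&'static str>, Vec<FnRecord>) {
    let table = build_file_table(fns);
    let filenames_hash = hash_file_table(&table);
    let records = fns
        .iter()
        .map(|f| FnRecord { name: f.name, filenames_hash })
        .collect();
    (table, records)
}
```

In this sketch every record carries the same hash, which is available before the first record is built.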
---
Along the way, this PR makes some necessary changes that are also worthwhile in their own right:
- We split `FunctionCoverage` into distinct collector/finished phases, which neatly avoids some borrow-checker hassles when extracting a function's final expression/mapping data.
- We avoid having to re-sort a function's mappings when preparing the list of filenames that it uses.
Use beta cargo in opt-dist
Using the new stage2 cargo caused issues when a backwards-incompatible change was made to cargo. This means that we won't be testing the LTO/1-CGU optimized cargo, but I don't think that's a big issue, as we primarily want to test the compiler.
Should fix [this](https://github.com/rust-lang/rust/pull/117000#issuecomment-1773639109) failure.
Most coverage metadata is encoded into two sections in the final executable.
The `__llvm_covmap` section mostly just contains a list of filenames, while the
`__llvm_covfun` section contains encoded coverage maps for each instrumented
function.
The catch is that each per-function record also needs to contain a hash of the
filenames list that it refers to. Historically this was handled by assembling
most of the per-function data into a temporary list, then assembling the
filenames buffer, then using the filenames hash to emit the per-function data,
and then finally emitting the filenames table itself.
However, now that we build the filenames table up-front (via a separate
traversal of the per-function data), we can hash and emit that part first, and
then emit each of the per-function records immediately after building. This
removes the awkwardness of having to temporarily store nearly-complete
per-function records.
The main change here is that `VirtualFileMapping` now uses an internal hashmap
to de-duplicate incoming global file IDs. That removes the need for
`encode_mappings_for_function` to re-sort its mappings by filename in order to
de-duplicate them.
(We still de-duplicate runs of identical filenames to save work, but this is
not load-bearing for correctness, so a sort is not necessary.)
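A minimal sketch of the hashmap-based de-duplication, assuming a much simpler shape than the real type: the field names and the `local_id_for_global` method below are illustrative assumptions, and file IDs are plain `u32`s.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical version of `VirtualFileMapping`.
#[derive(Default)]
struct VirtualFileMapping {
    /// Local file ID -> global file ID, in first-seen order.
    local_to_global: Vec<u32>,
    /// Global file ID -> local file ID, used only for de-duplication.
    global_to_local: HashMap<u32, u32>,
}

impl VirtualFileMapping {
    /// Returns the local ID for `global_id`, allocating a new one the first
    /// time that global ID is seen. The order of incoming IDs does not
    /// matter, so callers do not need to sort their mappings first.
    fn local_id_for_global(&mut self, global_id: u32) -> u32 {
        if let Some(&local) = self.global_to_local.get(&global_id) {
            return local;
        }
        let local = self.local_to_global.len() as u32;
        self.local_to_global.push(global_id);
        self.global_to_local.insert(global_id, local);
        local
    }
}
```

Because the lookup is keyed on the global ID, repeated IDs map to the same local ID even when they are not adjacent in the input.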
The combined `get_expressions_and_counter_regions` method was an artifact of
having to prepare the expressions and mappings at the same time, to avoid
ownership/lifetime problems with temporary data used by both.
Now that we have an explicit transition from `FunctionCoverageCollector` to the
final `FunctionCoverage`, we can prepare any shared data during that step and
store it in the final struct.
This gives us a clearly-defined place to run code after the instance's MIR has
been traversed by codegen, but before we emit its `__llvm_covfun` record.
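The collector/finished split might look roughly like this. Only the two type names come from the text above; the fields, `mark_counter_seen`, `into_finished` and the accessors are hypothetical and greatly simplified.

```rust
/// Collector phase: mutated while codegen traverses the instance's MIR.
#[derive(Default)]
struct FunctionCoverageCollector {
    seen_counters: Vec<u32>,
}

/// Finished phase: read-only, with shared derived data computed once.
struct FunctionCoverage {
    seen_counters: Vec<u32>,
    max_counter_id: Option<u32>,
}

impl FunctionCoverageCollector {
    /// Hypothetical recording method used during the collector phase.
    fn mark_counter_seen(&mut self, id: u32) {
        self.seen_counters.push(id);
    }

    /// Consumes the collector, so nothing can be recorded afterwards.
    /// Data needed by several later queries is computed here and stored.
    fn into_finished(self) -> FunctionCoverage {
        let max_counter_id = self.seen_counters.iter().copied().max();
        FunctionCoverage { seen_counters: self.seen_counters, max_counter_id }
    }
}

impl FunctionCoverage {
    /// Independent read-only accessors: neither needs temporary state
    /// owned by the other, so they no longer have to be one combined call.
    fn counters(&self) -> &[u32] {
        &self.seen_counters
    }

    fn max_counter_id(&self) -> Option<u32> {
        self.max_counter_id
    }
}
```

Taking `self` by value in `into_finished` is what makes the transition explicit: the type system rules out further collection once the finished value exists.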
Rollup of 4 pull requests
Successful merges:
- #116985 (Use gdb.ValuePrinter tag class)
- #116989 (Skip test if Unix sockets are unsupported)
- #117034 (Don't crash on empty match in the `nonexhaustive_omitted_patterns` lint)
- #117037 (rustdoc book doc example error)
r? `@ghost`
`@rustbot` modify labels: rollup
rustdoc book doc example error
Closes #117036
This is the minimal change required to make the second what-to-include.md example valid.
Another more modern solution could be considered:
````
/// Example
/// ```rust
/// let fortytwo = "42".parse::<u32>()?;
/// println!("{} + 10 = {}", fortytwo, fortytwo+10);
/// # Ok::<(), <u32 as std::str::FromStr>::Err>(())
/// ```
````
Use gdb.ValuePrinter tag class
GDB 14 has a "gdb.ValuePrinter" tag class that was introduced to let GDB evolve the pretty-printer API. Users of this tag are required to hide any local attributes or methods. This patch makes this change to the Rust pretty-printers. At present this is just a cleanup, providing the basis for any future changes.
ci: add a runner for vanilla LLVM 17
For CI cost, this can be seen as replacing the llvm-14 runner we dropped in #114148.
Also, I've set `IS_NOT_LATEST_LLVM` in the llvm-16 runner, since that's not the latest anymore.
Fix x86_64-gnu-llvm-15 CI tests
The CI script was broken - if there was a test failure in the first command chain (inside the `if`), CI would not report the failure.
This happened because the script contained two command chains joined by `&&`. `set -e` does not exit when a command other than the last one in an `&&` list fails, so when the first chain failed, the script simply carried on and the test failure was ignored.
This could be fixed e.g. by adding some `|| exit 1` to the first chain, but I suppose that the `&&` chaining is unnecessary here anyway.
Reported [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/test.20failure.20didn't.20stop.20CI).
Fixes: https://github.com/rust-lang/rust/issues/116867
Sync rustc_codegen_cranelift
The main highlight this time is new support for riscv64 linux, enabled by a cranelift update. I have also updated some of the crates built as part of cg_clif's test suite, which allowed removing several patches for them. Finally, I have fixed a couple of tests in rustc's test suite with cg_clif.
r? `@ghost`
`@rustbot` label +A-codegen +A-cranelift +T-compiler +subtree-sync
Rollup of 7 pull requests
Successful merges:
- #116312 (Initiate the inner usage of `cfg_match` (Compiler))
- #116928 (fix bootstrap paths in triagebot.toml)
- #116955 (Updated README with expandable table of content.)
- #116981 (update the registers of csky target)
- #116992 (Mention the syntax for `use` on `mod foo;` if `foo` doesn't exist)
- #117026 (Fix broken link to Ayu theme in the rustdoc book)
- #117028 (Remove unnecessary `all` in Box)
r? `@ghost`
`@rustbot` modify labels: rollup