This code was calling `sort_unstable_by`, but its comparator failed to impose a
total order on the initial spans. That resulted in unpredictable handling of
closure spans,
producing inconsistencies in the coverage maps and in user-visible coverage
reports.
This patch fixes the problem by always sorting closure spans before
otherwise-identical non-closure spans, and also switches to a stable sort in
case the ordering is still not total.
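A minimal sketch of that fix, not rustc's actual code; `CoverageSpan` and its fields here are illustrative stand-ins:

```rust
#[derive(Debug)]
struct CoverageSpan {
    lo: u32,
    hi: u32,
    is_closure: bool,
}

fn sort_spans(spans: &mut [CoverageSpan]) {
    // `sort_by` (stable) instead of `sort_unstable_by`: even if the
    // comparator still leaves the order non-total for some inputs,
    // equal elements keep their original relative order.
    spans.sort_by(|a, b| {
        (a.lo, a.hi)
            .cmp(&(b.lo, b.hi))
            // For identical spans, reversing the bool comparison
            // (`true > false`) sorts closure spans first.
            .then_with(|| b.is_closure.cmp(&a.is_closure))
    });
}
```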
Both of the coverage queries can now use this one helper function to iterate
over all of the `mir::Coverage` payloads in the statements of a `mir::Body`.
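A hedged sketch of what such a helper looks like, using simplified stand-in types; the real one walks rustc's actual `mir::Body`, where coverage payloads live in `StatementKind::Coverage`:

```rust
struct Coverage { /* counter or expression payload */ }

enum StatementKind {
    Coverage(Box<Coverage>),
    Other,
}

struct Statement { kind: StatementKind }
struct BasicBlockData { statements: Vec<Statement> }
struct Body { basic_blocks: Vec<BasicBlockData> }

// Both coverage queries can share this single iteration.
fn all_coverage_payloads(body: &Body) -> impl Iterator<Item = &Coverage> {
    body.basic_blocks.iter().flat_map(|bb| {
        bb.statements.iter().filter_map(|stmt| match &stmt.kind {
            StatementKind::Coverage(coverage) => Some(coverage.as_ref()),
            _ => None,
        })
    })
}
```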
This shows one small benefit of separating `BcbCounter` from `CoverageKind`.
The function source hash will be the same for all counters within a function,
so instead of passing it through `CoverageCounters` and storing it in every
counter, we can just supply it during the final conversion to `CoverageKind`.
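A hedged sketch of that conversion step, with assumed shapes for both types: `BcbCounter` carries no hash, and the hash is supplied only once, at conversion time.

```rust
enum BcbCounter {
    Counter { id: u32 },
    Expression { id: u32 },
}

enum CoverageKind {
    Counter { function_source_hash: u64, id: u32 },
    Expression { id: u32 },
}

impl BcbCounter {
    fn into_coverage_kind(self, function_source_hash: u64) -> CoverageKind {
        match self {
            // The hash is the same for every counter in the function,
            // so it is passed in here rather than stored per counter.
            BcbCounter::Counter { id } => {
                CoverageKind::Counter { function_source_hash, id }
            }
            BcbCounter::Expression { id } => CoverageKind::Expression { id },
        }
    }
}
```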
Storing coverage counter information in `CoverageCounters` has a few advantages
over storing it directly inside BCB graph nodes:
- The graph doesn't need to be mutable while the counters are being made, which
makes it easier to see that the graph itself is not modified during this step.
- All of the counter data is clearly visible in one place.
- It becomes possible to use a representation that doesn't correspond 1:1 to
graph nodes, e.g. storing all the edge counters in a single hashmap instead of
several.
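A hedged sketch of the storage layout this enables; the field names and types are illustrative, not rustc's actual definitions:

```rust
use std::collections::HashMap;

type BcbId = usize;

struct CoverageCounters {
    // One slot per BCB node, instead of a field inside each graph node,
    // so building counters only needs `&CoverageGraph`, not `&mut`.
    bcb_counters: Vec<Option<u32>>,
    // Edge counters don't correspond 1:1 to nodes, so a single map
    // keyed by (from, to) replaces several per-node side tables.
    bcb_edge_counters: HashMap<(BcbId, BcbId), u32>,
}
```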
Operand types are now tracked explicitly, so there is no need to reserve ID 0
for the special always-zero counter.
As part of the renumbering, this change fixes an off-by-one error in the way
counters were counted by the `coverageinfo` query. As a result, functions
should now have exactly the number of counters they actually need, instead of
always having an extra counter that is never used.
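For illustration only (a hypothetical helper, not the actual query code): with IDs numbered from 0 and no reserved slot, the count a function needs is exactly one more than the highest ID in use.

```rust
// Hypothetical helper: the `coverageinfo` query only needs to report
// how many counters the function actually uses.
fn num_counters(counter_ids: &[u32]) -> u32 {
    counter_ids.iter().copied().max().map_or(0, |max_id| max_id + 1)
}
```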
Operand types are now tracked explicitly, so there is no need for expression
IDs to avoid counter IDs by descending from `u32::MAX`. Instead they can just
count up from 0, and can be used directly as indices when necessary.
Because the three kinds of operand are now distinguished explicitly, we no
longer need fiddly code to disambiguate counter IDs and expression IDs based on
the total number of counters/expressions in a function.
This does increase the size of operands from 4 bytes to 8 bytes, but that
shouldn't be a big deal since they are mostly stored inside boxed structures,
and the current coverage code is not particularly size-optimized anyway.
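A hedged sketch of the explicitly-tagged operand; the names approximate the real types:

```rust
// With an explicit tag plus a u32 ID, the operand is 8 bytes instead
// of 4, but every variant's IDs can start at 0 and be used directly
// as indices, with no disambiguation against counter/expression totals.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Operand {
    Zero,            // the always-zero operand needs no reserved ID
    Counter(u32),    // counter IDs count up from 0
    Expression(u32), // expression IDs count up from 0, not down from u32::MAX
}
```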
Work around `rust-analyzer` false-positive type errors
rust-analyzer incorrectly reports two type errors in `debug.rs`:
> expected &dyn Display, found &i32
> expected &dyn Display, found &i32
This is due to a known bug in rust-analyzer: https://github.com/rust-lang/rust-analyzer/issues/11847.
In these particular cases, changing `&0` to `&0i32` seems to be enough to avoid the bug.
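For illustration, a hedged standalone version of the pattern involved (the actual code is in `debug.rs`):

```rust
use std::fmt::Display;

fn main() {
    // rust-analyzer falsely flags the unsuffixed literal's coercion:
    let _flagged: &dyn Display = &0;
    // Spelling out the literal's type sidesteps the false positive:
    let _ok: &dyn Display = &0i32;
}
```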
Preprocess and cache dominator tree
Preprocessing dominators significantly speeds up https://github.com/rust-lang/rust/pull/111344.
That pass repeatedly checks that assignments dominate their uses. Walking the unprocessed dominator tree for each check made the runtime quadratic (number of basic blocks × depth of the dominator tree).
This PR also caches the dominator tree and the pre-processed dominators in the MIR cfg cache.
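A minimal sketch of the classic preprocessing trick, not rustc's actual implementation: one DFS over the dominator tree assigns pre/post-order timestamps, after which each "does `a` dominate `b`?" query is O(1) instead of a walk up the tree. Here `idom` is assumed to hold immediate dominators indexed by basic block, with `None` for the entry block.

```rust
struct Dominators {
    pre: Vec<u32>,  // timestamp when the DFS enters each node
    post: Vec<u32>, // timestamp when the DFS leaves each node
}

impl Dominators {
    fn preprocess(idom: &[Option<usize>]) -> Self {
        let n = idom.len();
        // Build the dominator tree's child lists and find its root.
        let mut children = vec![Vec::new(); n];
        let mut root = 0;
        for (node, &parent) in idom.iter().enumerate() {
            match parent {
                Some(p) => children[p].push(node),
                None => root = node,
            }
        }
        // Iterative DFS assigning entry/exit timestamps.
        let (mut pre, mut post) = (vec![0u32; n], vec![0u32; n]);
        let mut clock = 0;
        let mut stack = vec![(root, false)];
        while let Some((node, exiting)) = stack.pop() {
            if exiting {
                post[node] = clock;
            } else {
                pre[node] = clock;
                stack.push((node, true));
                stack.extend(children[node].iter().map(|&c| (c, false)));
            }
            clock += 1;
        }
        Dominators { pre, post }
    }

    /// `a` dominates `b` iff `b` lies in `a`'s dominator-tree subtree,
    /// i.e. `a`'s DFS interval encloses `b`'s.
    fn dominates(&self, a: usize, b: usize) -> bool {
        self.pre[a] <= self.pre[b] && self.post[b] <= self.post[a]
    }
}
```

Caching this preprocessed structure alongside the MIR cfg cache means the DFS cost is paid once per body rather than once per query.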
Rebase of https://github.com/rust-lang/rust/pull/107157
cc `@tmiasko`
coverage: Don't underflow column number
I noticed this when running coverage on a debug build of rustc. There
may be other places that do this but I'm just fixing the one I hit.
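A minimal sketch of the kind of fix, not the actual rustc code: columns are unsigned, so a plain `col - 1` wraps around when `col == 0`, and panics under the overflow checks enabled in a debug build.

```rust
// Clamp at zero instead of wrapping when converting a column number.
fn to_zero_based_column(col: u32) -> u32 {
    col.saturating_sub(1)
}
```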
r? `@wesleywiser` `@richkadel`
Encode hashes as bytes, not varint
In a few places, we store hashes as `u64` or `u128` and then apply `derive(Decodable, Encodable)` to the enclosing struct/enum. It is more efficient to encode hashes directly than to apply some varint encoding. This PR adds two new types `Hash64` and `Hash128` which are produced by `StableHasher` and replace every use of storing a `u64` or `u128` that represents a hash.
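A hedged sketch of the core idea; the real `Hash64`/`Hash128` live in rustc and hook into its `Encodable`/`Decodable` machinery rather than exposing methods like these:

```rust
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct Hash64(u64);

impl Hash64 {
    // Hash bits are uniformly distributed, so a leb128 varint of a
    // hash almost always needs 9 or 10 bytes instead of 8; emitting
    // the raw bytes is both smaller and cheaper to encode/decode.
    fn encode(self, out: &mut Vec<u8>) {
        out.extend_from_slice(&self.0.to_le_bytes());
    }

    fn decode(bytes: [u8; 8]) -> Self {
        Hash64(u64::from_le_bytes(bytes))
    }
}
```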
Distribution of the byte lengths of leb128 encodings, from `x build --stage 2` with `incremental = true`
Before:
```
( 1) 373418203 (53.7%, 53.7%): 1
( 2) 196240113 (28.2%, 81.9%): 3
( 3) 108157958 (15.6%, 97.5%): 2
( 4) 17213120 ( 2.5%, 99.9%): 4
( 5) 223614 ( 0.0%,100.0%): 9
( 6) 216262 ( 0.0%,100.0%): 10
( 7) 15447 ( 0.0%,100.0%): 5
( 8) 3633 ( 0.0%,100.0%): 19
( 9) 3030 ( 0.0%,100.0%): 8
( 10) 1167 ( 0.0%,100.0%): 18
( 11) 1032 ( 0.0%,100.0%): 7
( 12) 1003 ( 0.0%,100.0%): 6
( 13) 10 ( 0.0%,100.0%): 16
( 14) 10 ( 0.0%,100.0%): 17
( 15) 5 ( 0.0%,100.0%): 12
( 16) 4 ( 0.0%,100.0%): 14
```
After:
```
( 1) 372939136 (53.7%, 53.7%): 1
( 2) 196240140 (28.3%, 82.0%): 3
( 3) 108014969 (15.6%, 97.5%): 2
( 4) 17192375 ( 2.5%,100.0%): 4
( 5) 435 ( 0.0%,100.0%): 5
( 6) 83 ( 0.0%,100.0%): 18
( 7) 79 ( 0.0%,100.0%): 10
( 8) 50 ( 0.0%,100.0%): 9
( 9) 6 ( 0.0%,100.0%): 19
```
The remaining 9- or 10-byte and 18- or 19-byte encodings are `u64` and `u128` values, respectively, that have their high bits set. As far as I can tell these come primarily from `SwitchTargets`.
Unify terminology used in unwind action and terminator, and reflect
the fact that a nounwind panic, rather than an immediate abort, is
triggered for this terminator.
As I've been trying to replace a `Vec` with an `IndexVec`, having `last` exist on both but returning very different types makes the transition a bit awkward -- the errors are later, where you get things like "there's no `ty` method on `mir::Field`" rather than a more localized error like "hey, there's no `last` on `IndexVec`".
So I propose renaming `last` to `last_index` to help distinguish `Vec::last`, which returns an element, and `IndexVec::last_index`, which returns an index.
(Similarly, `Iterator::last` also returns an element, not an index.)
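A self-contained mock of the shape of the API (the real `IndexVec` in `rustc_index` is generic over a typed index, not `usize`):

```rust
struct IndexVec<T> {
    raw: Vec<T>,
}

impl<T> IndexVec<T> {
    /// Returns the *index* of the last element, unlike `Vec::last`
    /// (and `Iterator::last`), which return the element itself.
    fn last_index(&self) -> Option<usize> {
        self.raw.len().checked_sub(1)
    }
}
```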
Avoid unnecessary hashing
I noticed some stable hashing being done in a non-incremental build. It turns out that some of this is necessary to compute the crate hash, but some of it is not. Removing the unnecessary hashing is a perf win.
r? `@cjgillot`