coverage: Split out counter increment sites from BCB node/edge counters
This makes it possible for two nodes/edges in the coverage graph to share the same counter, without causing the instrumentor to inject unwanted duplicate counter-increment statements.
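As a purely illustrative sketch of that split (these types and field names are stand-ins, not the actual rustc structures): which counter a node or edge reads from is recorded separately from where the single increment statement for that counter gets injected.

```rust
// Hypothetical sketch only; the real rustc types and fields differ.
use std::collections::BTreeMap;

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct CounterId(u32);

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct BcbNode(u32);

#[derive(Default)]
struct CoverageCounters {
    /// Which counter each coverage-graph node reads its count from.
    /// Two different nodes may map to the same `CounterId`.
    node_counters: BTreeMap<BcbNode, CounterId>,
    /// Where the increment statement for each counter is injected:
    /// exactly one site per counter, no matter how many nodes share it.
    increment_sites: BTreeMap<CounterId, BcbNode>,
}
```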
---
`@rustbot` label +A-code-coverage
This sidesteps the normal span refinement code in cases where we know that we
are only dealing with the special signature span that represents having called
an async function.
The query accepts arbitrary `DefId`s, not just owner `DefId`s.
The return type can be an `Option` because, if there are no nodes, it doesn't matter whether that's due to `NonOwner` or `Phantom`.
Also rename the query to `opt_hir_owner_nodes`.
coverage: Dismantle `Instrumentor` and flatten span refinement
This is a combination of two refactorings that are unrelated, but would otherwise have a merge conflict.
No functional changes, other than a small tweak to debug logging as part of rearranging some functions.
Ignoring whitespace is highly recommended, since most of the modified lines have just been reindented.
---
The first change is to dismantle `Instrumentor` into ordinary functions.
This is one of those cases where encapsulating several values into a struct ultimately hurts more than it helps. With everything stored as local variables in one main function, and passed explicitly into helper functions, it's easier to see what is used where, and make changes as necessary.
---
The second change is to flatten the functions for extracting/refining coverage spans.
Consolidating this code into flatter functions reduces the amount of pointer-chasing required to read and modify it.
coverage: Don't instrument `#[automatically_derived]` functions
This PR makes the coverage instrumentor detect and skip functions that have [`#[automatically_derived]`](https://doc.rust-lang.org/reference/attributes/derive.html#the-automatically_derived-attribute) on their enclosing impl block.
Most notably, this means that methods generated by built-in derives (e.g. `Clone`, `Debug`, `PartialEq`) are now ignored by coverage instrumentation, and won't appear as executed or not-executed in coverage reports.
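For illustration (this snippet is not taken from the PR), a built-in derive expands to an impl block tagged with `#[automatically_derived]`, which is the attribute the instrumentor now looks for on the enclosing impl:

```rust
// Roughly what `#[derive(Clone)]` on `Point` expands to. Because the generated
// impl carries `#[automatically_derived]`, its methods are no longer
// instrumented by `-Cinstrument-coverage`.
struct Point {
    x: i32,
    y: i32,
}

#[automatically_derived]
impl Clone for Point {
    fn clone(&self) -> Point {
        Point {
            x: Clone::clone(&self.x),
            y: Clone::clone(&self.y),
        }
    }
}
```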
This is a noticeable change in user-visible behaviour, but overall I think it's a net improvement. For example, we've had a few user requests for this sort of change (e.g. #105055, https://github.com/rust-lang/rust/issues/84605#issuecomment-1902069040), and I believe it's the behaviour that most users will expect/prefer by default.
It's possible to imagine situations where users would want to instrument these derived implementations, but I think it's OK to treat that as an opportunity to consider adding more fine-grained option flags to control the details of coverage instrumentation, while leaving this new behaviour as the default.
(Also note that while `-Cinstrument-coverage` is a stable feature, the exact details of coverage instrumentation are allowed to change. So we *can* make this change; the main question is whether we *should*.)
Fixes #105055.
coverage: Never emit improperly-ordered coverage regions
If we emit a coverage region that is improperly ordered (end < start), `llvm-cov` will fail with `coveragemap_error::malformed`, which is inconvenient for users and also very hard to debug.
Ideally we would fix the root causes of these situations, but they tend to occur in very obscure edge-case scenarios (often involving nested macros), and we don't always have a good MCVE to work from. So it makes sense to also have a catch-all check that will prevent improperly-ordered regions from ever being emitted.
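A minimal sketch of the kind of catch-all guard described (the real check lives in rustc's coverage-mapping code and may differ in detail):

```rust
// Illustrative only: refuse to emit any region whose end precedes its start,
// rather than producing a coverage map that llvm-cov rejects as malformed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Pos {
    line: u32,
    col: u32,
}

#[derive(Debug)]
struct CodeRegion {
    start: Pos,
    end: Pos,
}

fn push_region(region: CodeRegion, regions: &mut Vec<CodeRegion>) {
    if region.end < region.start {
        // Improperly ordered: drop the region instead of emitting it.
        eprintln!("skipping improperly-ordered coverage region: {region:?}");
        return;
    }
    regions.push(region);
}
```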
---
This is mainly aimed at resolving #119453. We don't have a specific way to reproduce it, which is why I haven't been able to add a test case in this PR. But based on the information provided in that issue, this change seems likely to avoid the error in `llvm-cov`.
`@rustbot` label +A-code-coverage
This also switches from `split_off(0)` to `std::mem::take` when emptying the
accumulated list of blocks, because `split_off(0)` handles capacity in a way
that is unintuitive when used in a loop.
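To illustrate the difference with a plain `Vec` (this is just a sketch of the two flushing idioms, not the rustc code):

```rust
fn main() {
    let mut acc: Vec<u32> = Vec::with_capacity(64);
    acc.extend([1, 2, 3]);

    // `split_off(0)` also empties `acc`, but how the old and new vectors end
    // up sharing or reallocating capacity is easy to misjudge when this flush
    // happens repeatedly inside a loop.
    let flushed = acc.split_off(0);
    assert_eq!(flushed, [1, 2, 3]);
    assert!(acc.is_empty());

    // `mem::take` is unambiguous: the whole vector (elements and capacity)
    // moves into `flushed`, and `acc` is left as a fresh `Vec::new()` with
    // zero capacity.
    acc.extend([4, 5, 6]);
    let flushed = std::mem::take(&mut acc);
    assert_eq!(flushed, [4, 5, 6]);
    assert!(acc.is_empty());
    assert_eq!(acc.capacity(), 0);
}
```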
The old loop had two separate places where it would flush the accumulated list
of straight-line blocks into a new BCB. One occurred at the start of the loop
body when the current block couldn't be chained into, and the other occurred at
the end of the loop body when the current block couldn't be chained from.
The latter check can be hoisted to the start of the loop body by making it
examine the previous block (which has added itself to the list) instead of the
current block. With that done, we can combine the two separate flushes into one
flush with two possible trigger conditions.
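A hypothetical, self-contained sketch of that restructured loop (the block type, the chaining predicates, and the flush helper are stand-ins for the real rustc code):

```rust
type BasicBlock = usize;

// Stand-in predicates: in rustc these depend on the terminator kind and the
// number of predecessors/successors of each MIR block.
fn can_chain_from(_prev: BasicBlock) -> bool {
    true // e.g. the previous block has exactly one successor
}
fn can_chain_into(_curr: BasicBlock) -> bool {
    true // e.g. the current block has exactly one predecessor
}

fn flush_into_bcb(chain: Vec<BasicBlock>) {
    println!("new BCB from straight-line chain: {chain:?}");
}

fn build_bcbs(traversal_order: &[BasicBlock]) {
    let mut chain: Vec<BasicBlock> = Vec::new();
    for &bb in traversal_order {
        // Single flush point with two trigger conditions: the previous block
        // (already in the chain) can't be chained from, or the current block
        // can't be chained into.
        if let Some(&prev) = chain.last() {
            if !can_chain_from(prev) || !can_chain_into(bb) {
                flush_into_bcb(std::mem::take(&mut chain));
            }
        }
        chain.push(bb);
    }
    // Flush whatever straight-line chain remains at the end.
    if !chain.is_empty() {
        flush_into_bcb(std::mem::take(&mut chain));
    }
}

fn main() {
    build_bcbs(&[0, 1, 2, 3, 4]);
}
```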
Filtering out unreachable successors is only needed by the main graph traversal
loop, so we can move the filtering step into that loop instead, eliminating the
need to pass the MIR body into `bcb_filtered_successors`.
This is less elegant in some ways, since we no longer visit a BCB's spans as a
batch, but will make it much easier to add support for other kinds of coverage
mapping regions (e.g. branch regions or gap regions).
This draws a clear distinction between the fields/methods that are needed by
initial span extraction and preprocessing, and those that are needed by the
main "refinement" loop.
These two tasks historically needed to be interleaved, but after various recent
changes (including #116046 and #116917) they can now be fully separated.
The old code used a heuristic to detect async functions and adjust their
coverage spans to produce better output. But there's no need to resort to a
heuristic when we can just check whether the current function is actually an
`async fn`.
If we want to know whether two byte positions are in the same file, we don't
need to clone and compare `Lrc<SourceFile>`; we can just get their indices and
compare those instead.
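A hypothetical, self-contained sketch of that idea (the stand-in `SourceMap` and `SourceFile` here are far simpler than the real rustc_span types):

```rust
use std::rc::Rc;

// Stand-ins for rustc_span's `SourceFile` / `SourceMap`.
struct SourceFile {
    start_pos: u32,
    end_pos: u32,
}

struct SourceMap {
    files: Vec<Rc<SourceFile>>,
}

impl SourceMap {
    /// Index of the file containing `pos`.
    fn lookup_file_idx(&self, pos: u32) -> usize {
        self.files
            .iter()
            .position(|f| f.start_pos <= pos && pos < f.end_pos)
            .expect("byte position is not in any file")
    }

    /// Compare the two file indices directly, instead of cloning two
    /// reference-counted `SourceFile` handles just to ask whether they point
    /// at the same file.
    fn same_file(&self, a: u32, b: u32) -> bool {
        self.lookup_file_idx(a) == self.lookup_file_idx(b)
    }
}

fn main() {
    let sm = SourceMap {
        files: vec![
            Rc::new(SourceFile { start_pos: 0, end_pos: 100 }),
            Rc::new(SourceFile { start_pos: 100, end_pos: 250 }),
        ],
    };
    assert!(sm.same_file(10, 90));
    assert!(!sm.same_file(10, 120));
}
```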