coverage: Consolidate creation of covmap/covfun records
This code for creating covmap/covfun records during codegen was split across multiple functions and files for dubious historical reasons. Having it all in one place makes it easier to follow.
This PR also includes two semi-related cleanups:
- Getting the codegen context's `coverage_cx` state is made infallible, since it should always exist when running the code paths that need it.
- The value of `covfun_section_name` is saved in the codegen context, since it never changes at runtime, and the code that needs it has access to the context anyway.
---
Background: Coverage instrumentation generates two kinds of metadata that are embedded in the final binary. There is per-CGU information that goes in the `__llvm_covmap` linker section, and per-function information that goes in the `__llvm_covfun` section (except on Windows, where slightly different section names are used).
Adding an extra `OnceCell` to `CrateCoverageContext` is much nicer than trying
to thread this string through multiple layers of function calls that already
have access to the context.
coverage: Pass coverage mappings to LLVM as separate structs
Instead of trying to cram *N* different kinds of coverage mapping data into a single list for FFI, pass *N* different lists of simpler structs.
This avoids the need to fill unused fields with dummy values, and avoids the need to tag structs with their underlying kind. It also lets us call the dedicated LLVM constructors for each different mapping type, instead of having to go through the complex general-purpose constructor.
Even though this adds multiple new structs to the FFI surface area, the resulting C++ code is simpler and shorter.
---
I've structured this mostly as a single atomic patch, rather than a series of incremental changes, because that avoids the need to make fiddly fixes to code that is about to be deleted anyway.
Migrate `llvm::set_comdat` and `llvm::SetUniqueComdat` to LLVM-C FFI.
Note, now we can call `llvm::set_comdat` only when the target actually
supports adding comdat. As this has no convenient LLVM-C API, we
implement this as `TargetOptions::supports_comdat`.
Co-authored-by: Stuart Cook <Zalathar@users.noreply.github.com>
Supertraits of `BuilderMethods` are all called `XyzBuilderMethods`.
Supertraits of `CodegenMethods` are all called `XyzMethods`. This commit
changes the latter to `XyzCodegenMethods`, for consistency.
This code for recalculating `mcdc_bitmap_bytes` doesn't provide any benefit,
because its result won't have changed from the value in `FunctionCoverageInfo`
that was computed during the MIR instrumentation pass.
Because this now always takes place at the start of the function, we can just
use the normal `alloca` method and then initialize each bitmap immediately.
This patch also moves bitmap setup out of the `mcdc_parameters` method, because
there is no longer any particular reason for it to be there.
Add decision_depth field to TVBitmapUpdate/CondBitmapUpdate statements
Add decision_depth field to BcbMappingKinds MCDCBranch and MCDCDecision
Add decision_depth field to MCDCBranchSpan and MCDCDecisionSpan
Use the `Align` type when parsing alignment attributes
Use the `Align` type in `rustc_attr::parse_alignment`, removing the need to call `Align::from_bytes(...).unwrap()` later in the compilation process.
The payload of coverage statements was historically a structure with several
fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace
`Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid
accidentally bloating it in the future.
Some of the marker statements used by coverage are added during MIR building
for use by the InstrumentCoverage pass (during analysis), and are not needed
afterwards.
When #118865 started enforcing the `rustc::potential_query_instability` lint in
`rustc_codegen_llvm`, it added an exemption for this site, arguing that the
entries are only used to create a list of filenames that is later sorted.
However, the list of entries also gets traversed when creating the function
coverage records in LLVM IR, which may be sensitive to hash-based ordering.
This patch therefore changes `function_coverage_map` to use `FxIndexMap`, which
should avoid hash-based instability by iterating in insertion order.
Coverage marker statements should have no effect on codegen, but in some cases
they could have the side-effect of creating a `func_coverage` entry for their
enclosing function. That can lead to an ICE for functions that don't actually
have any coverage spans.
There are cases where coverage instrumentation wants to show a span for some
syntax element, but there is no MIR node that naturally carries that span, so
the instrumentor can't see it.
MIR building can now use this new kind of coverage statement to deliberately
include those spans in MIR, attached to a dummy statement that has no other
effect.
This gives us a clearly-defined place to run code after the instance's MIR has
been traversed by codegen, but before we emit its `__llvm_covfun` record.
Even though expression details are now stored in the info structure, we still
need to inject `ExpressionUsed` statements into MIR, because if one is missing
during codegen then we know that it was optimized out and we can remap all of
its associated code regions to zero.
Previously, mappings were attached to individual coverage statements in MIR.
That necessitated special handling in MIR optimizations to avoid deleting those
statements, since otherwise codegen would be unable to reassemble the original
list of mappings.
With this change, a function's list of mappings is now attached to its MIR
body, and survives intact even if individual statements are deleted by
optimizations.
Coverage codegen can now allocate arrays based on the number of
counters/expressions originally used by the instrumentor.
The existing query that inspects coverage statements is still used for
determining the number of counters passed to `llvm.instrprof.increment`. If
some high-numbered counters were removed by MIR optimizations, the instrumented
binary can potentially use less memory and disk space at runtime.
This allows coverage information to be attached to the function as a whole when
appropriate, instead of being smuggled through coverage statements in the
function's basic blocks.
As an example, this patch moves the `function_source_hash` value out of
individual `CoverageKind::Counter` statements and into the per-function info.
When synthesizing unused functions for coverage purposes, the absence of this
info is taken to indicate that a function was not eligible for coverage and
should not be synthesized.