Tools, tests, and experimenting with MIR-derived coverage counters
Leverages the new mir_dump output file in HTML+CSS (from #76074) to visualize coverage code regions
and the MIR features that they came from (including overlapping spans).
See example below.
The `run-make-fulldeps/instrument-coverage` test has been refactored to maximize test coverage and reduce code duplication. The new tests support testing with and without `-Clink-dead-code`, so Rust coverage can be tested on MSVC (which, currently, only works with `link-dead-code` _disabled_).
New tests validate coverage region generation and coverage reports with multiple counters per function. Starting with a simple `if-else` branch tests, coverage tests for each additional syntax type can be added by simply dropping in a new Rust sample program.
Includes a basic, MIR-block-based implementation of coverage injection,
available via `-Zexperimental-coverage`. This implementation has known
flaws and omissions, but is simple enough to validate the new tools and
tests.
The existing `-Zinstrument-coverage` option currently enables
function-level coverage only, which at least appears to generate
accurate coverage reports at that level.
Experimental coverage is not accurate at this time. When branch coverage
works as intended, the `-Zexperimental-coverage` option should be
removed.
This PR replaces the bulk of PR #75828, with the remaining parts of
that PR distributed among other separate and indentpent PRs.
This PR depends on two of those other PRs: #76002, #76003 and #76074
Rust compiler MCP rust-lang/compiler-team#278
Relevant issue: #34701 - Implement support for LLVMs code coverage
instrumentation
![Screen-Recording-2020-08-21-at-2](https://user-images.githubusercontent.com/3827298/90972923-ff417880-e4d1-11ea-92bb-8713c6198f6d.gif)
r? @tmandry
FYI: @wesleywiser
diagnostics: shorten paths of unique symbols
This is a step towards implementing a fix for #50310, and continuation of the discussion in [Pre-RFC: Nicer Types In Diagnostics - compiler - Rust Internals](https://internals.rust-lang.org/t/pre-rfc-nicer-types-in-diagnostics/11139). Impressed upon me from previous discussion in #21934 that an RFC for this is not needed, and I should just come up with code.
The recent improvements to `use` suggestions that I've contributed have given rise to this implementation. Contrary to previous suggestions, it's rather simple logic, and I believe it only reduces the amount of cognitive load that a developer would need when reading type errors.
-----
If a symbol name can only be imported from one place, and as long as it was not glob-imported anywhere in the current crate, we can trim its printed path to the last component.
This has wide implications on error messages with types, for example, shortening `std::vec::Vec` to just `Vec`, as long as there is no other `Vec` importable from anywhere.
Adds a new mir_dump output file in HTML/CSS to visualize code regions
and the MIR features that they came from (including overlapping spans).
See example below:
Includes a basic, MIR-block-based implementation of coverage injection,
available via `-Zexperimental-coverage`. This implementation has known
flaws and omissions, but is simple enough to validate the new tools and
tests.
The existing `-Zinstrument-coverage` option currently enables
function-level coverage only, which at least appears to generate
accurate coverage reports at that level.
Experimental coverage is not accurate at this time. When branch coverage
works as intended, the `-Zexperimental-coverage` option should be
removed.
This PR replaces the bulk of PR #75828, with the remaining parts of
that PR distributed among other separate and indentpent PRs.
This PR depends on three of those other PRs: #76000, #76002, and
Rust compiler MCP rust-lang/compiler-team#278
Relevant issue: #34701 - Implement support for LLVMs code coverage
instrumentation
![Screen-Recording-2020-08-21-at-2](https://user-images.githubusercontent.com/3827298/90972923-ff417880-e4d1-11ea-92bb-8713c6198f6d.gif)
`run_compiler` is used by clippy and other tools, which should not have
the trimmed paths feature enabled by default, until we see it works well
for them.
Would also be nice to rename `TimePassesCallbacks` however it's a
submodule change.
inliner: Avoid query cycles when optimizing generators
The HIR Id trick is insufficient to prevent query cycles when optimizing
generators, since merely requesting a layout of a generator also
computes its `optimized_mir`.
Make no attempts to inline functions into generators within the same
crate to avoid query cycles.
Fixes#76181.
Avoid rehashing Fingerprint as a map key
This introduces a no-op `Unhasher` for map keys that are already hash-
like, for example `Fingerprint` and its wrapper `DefPathHash`. For these
we can directly produce the `u64` hash for maps. The first use of this
is `def_path_hash_to_def_id: Option<UnhashMap<DefPathHash, DefId>>`.
cc #56308
r? @eddyb
Improve recovery on malformed format call
The token following a format expression should be a comma. However, when it is replaced with a similar token (such as a dot), then the corresponding error is emitted, but the token is treated as a comma, and the parsing step continues.
r? @petrochenkov
If a symbol name can only be imported from one place for a type, and
as long as it was not glob-imported anywhere in the current crate, we
can trim its printed path and print only the name.
This has wide implications on error messages with types, for example,
shortening `std::vec::Vec` to just `Vec`, as long as there is no other
`Vec` importable anywhere.
This adds a new '-Z trim-diagnostic-paths=false' option to control this
feature.
On the good path, with no diagnosis printed, we should try to avoid
issuing this query, so we need to prevent trimmed_def_paths query on
several cases.
This change also relies on a previous commit that differentiates
between `Debug` and `Display` on various rustc types, where the latter
is trimmed and presented to the user and the former is not.
The first use case of this detection of regression for trimmed paths
computation, that is in the case of rustc, which should be computed only
in case of errors or warnings.
Our current user of this method is deeply nested, being a side effect
from `Display` formatting on lots of rustc types. So taking only the
caller to the error message is not enough - we should collect the
traceback instead.
While formatting for user diagnostics used `Display` for all most cases,
some small amount of cases used `Debug` instead. Until now, `Display`
and `Debug` yielded the same output for many types. However, with path
trimming, we want to show a shorter path for the user, these cases need
fixing.
compiler: use `OnceCell` from std
Fixes#76192
The only remaining direct use of `lazy_static` crate is in `src/bootstrap` but I am not sure how I can remove that dependency for now.
r? @matklad
This introduces a no-op `Unhasher` for map keys that are already hash-
like, for example `Fingerprint` and its wrapper `DefPathHash`. For these
we can directly produce the `u64` hash for maps. The first use of this
is `def_path_hash_to_def_id: Option<UnhashMap<DefPathHash, DefId>>`.
lexer: Tiny improvement to shebang detection
Lexer now discerns between regular comments and doc comments, so use that.
The change only affects the choice of reported errors.
Give a better error message for duplicate built-in macros
Minor follow-up to https://github.com/rust-lang/rust/pull/75176 giving a better error message for duplicate builtin macros. This would have made it a little easier to debug.
r? @petrochenkov
Add new `-Z dump-mir-spanview` option
Similar to `-Z dump-mir-graphviz`, this adds the option to write
HTML+CSS files that allow users to analyze the spans associated with MIR
elements (by individual statement, just terminator, or overall basic
block).
This PR was split out from PR #76004, and exposes an API for spanview
HTML+CSS files that is also used to analyze code regions chosen for
coverage instrumentation (in a follow-on PR).
Rust compiler MCP rust-lang/compiler-team#278
Relevant issue: #34701 - Implement support for LLVMs code coverage
instrumentation
r? @tmandry
FYI @wesleywiser
The HIR Id trick is insufficient to prevent query cycles when optimizing
generators, since merely requesting a layout of a generator also
computes its `optimized_mir`.
Make no attempts to inline functions into generators within the same
crate to avoid query cycles.
Run cfg-stripping on generic parameters before invoking derive macros
Fixes#75930
This changes the tokens seen by a proc-macro. However, ising a `#[cfg]` attribute
on a generic paramter is unusual, and combining it with a proc-macro
derive is probably even more unusual. I don't expect this to cause any
breakage.
Eliminate some other bound checks when index comes from an enum
#36962 introduced an assumption for the upper limit of the enum's value. This PR adds an assumption to the lower value as well.
I've modified the original codegen test to show that derived (in that case, adding 1) values also don't generate bounds checks.
However, this test is actually carefully crafted to not hit a bug: if the enum's variants are modified to 1 and 2 instead of 2 and 3, the test fails by adding a bounds check. I suppose this is an LLVM issue and #75525, while not exactly in this context should be tracking it.
I'm not at all confident if this patch can be accepted, or even if it _should_ be accepted in this state. But I'm curious about what others think :)
~Improves~ Should improve #13926 but does not close it because it's not exactly predictable, where bounds checks may pop up against the assumptions.
Previously, this would say no such macro existed, but this was
misleading, since the macro _did_ exist, it was just already seen.
- Say where the macro was previously defined
- Add long-form error message
Make to_immediate/from_immediate configurable by backends
`librustc_codegen_ssa` has the concept of an immediate vs. memory type, and `librustc_codegen_llvm` uses this distinction to implement `bool`s being `i8` in memory, and `i1` in immediate contexts. However, some of that implementation leaked into `codegen_ssa` when converting to/from immediate values. So, move those methods into builder traits, so that behavior can be configured by backends.
This is useful if a backend is able to keep bools as bools, or, needs to do more trickery than just bools to bytes.
(Note that there's already a large amount of things abstracted with "immediate types" - this is just bringing this particular thing in line to be abstracted as well)
---
Pinging @eddyb since that's who I was talking about this change with when they suggested I submit a PR.
Similar to `-Z dump-mir-graphviz`, this adds the option to write
HTML+CSS files that allow users to analyze the spans associated with MIR
elements (by individual statement, just terminator, or overall basic
block).
This PR was split out from PR #76004, and exposes an API for spanview
HTML+CSS files that is also used to analyze code regions chosen for
coverage instrumentation (in a follow-on PR).
Rust compiler MCP rust-lang/compiler-team#278
Relevant issue: #34701 - Implement support for LLVMs code coverage
instrumentation
Update expect-test to 1.0
The only change is that `expect_file` now uses path relative to the
current file (same as `include!`). Before, it used paths relative to
the workspace root, which makes no sense.
Adds two source span utility functions used in source-based coverage
`span.is_empty()` - returns true if `lo()` and `hi()` are equal. This is
not only a convenience, but makes it clear that a `Span` can be empty
(that is, retrieving the source for an empty `Span` will return an empty
string), and codifies the (otherwise undocumented--in the rustc_span
package, at least) fact that `Span` is a half-open interval (where
`hi()` is the open end).
`source_map.lookup_file_span()` - returns an enclosing `Span`
representing the start and end positions of the file enclosing the given
`BytePos`. This gives developers a clear way to quickly determine if any
any other `BytePos` or `Span` is also from the same file (for example,
by simply calling `file_span.contains(span)`).
This results in much simpler code and is much more runtime efficient
compared with the obvious alternative: calling `source_map.lookup_line()`
for any two `Span`'s byte positions, handle both arms of the `Result`
(both contain the file), and then compare files. It is also more
efficient than the non-public method `lookup_source_file_idx()` for each
`BytePos`, because, while comparing the internal source file indexes
would be efficient, looking up the source file index for every `BytePos`
or `Span` to be compared requires a binary search (worst case
performance being O(log n) for every lookup).
`source_map.lookup_file_span()` performs the binary search only once, to
get the `file_span` result that can be used to compare to any number of
other `BytePos` or `Span` values and those comparisons are always O(1).
This PR was split out from PR #75828 .
r? @tmandry
FYI: @wesleywiser
Fix `-Z instrument-coverage` on MSVC
Found that `-C link-dead-code` (which was enabled automatically
under `-Z instrument-coverage`) was causing the linking error that
resulted in segmentation faults in coverage instrumented binaries. Link
dead code is now disabled under MSVC, allowing `-Z instrument-coverage`
to be enabled under MSVC for the first time.
More details are included in Issue #76038 .
Note this PR makes it possible to support `Z instrument-coverage` but
does not enable instrument coverage for MSVC in existing tests. It will be
enabled in another PR to follow this one (both PRs coming from original
PR #75828).
r? @tmandry
FYI: @wesleywiser
`span.is_empty()` - returns true if `lo()` and `hi()` are equal. This is
not only a convenience, but makes it clear that a `Span` can be empty
(that is, retrieving the source for an empty `Span` will return an empty
string), and codifies the (otherwise undocumented--in the rustc_span
package, at least) fact that `Span` is a half-open interval (where
`hi()` is the open end).
`source_map.lookup_file_span()` - returns an enclosing `Span`
representing the start and end positions of the file enclosing the given
`BytePos`. This gives developers a clear way to quickly determine if any
any other `BytePos` or `Span` is also from the same file (for example,
by simply calling `file_span.contains(span)`).
This results in much simpler code and is much more runtime efficient
compared with the obvious alternative: calling `source_map.lookup_line()`
for any two `Span`'s byte positions, handle both arms of the `Result`
(both contain the file), and then compare files. It is also more
efficient than the non-public method `lookup_source_file_idx()` for each
`BytePos`, because, while comparing the internal source file indexes
would be efficient, looking up the source file index for every `BytePos`
or `Span` to be compared requires a binary search (worst case
performance being O(log n) for every lookup).
`source_map.lookup_file_span()` performs the binary search only once, to
get the `file_span` result that can be used to compare to any number of
other `BytePos` or `Span` values and those comparisons are always O(1).