Fix `--remap-path-prefix` not correctly remapping `rust-src` component paths and unify handling of path mapping with virtualized paths
This PR fixes#73167 ("Binaries end up containing path to the rust-src component despite `--remap-path-prefix`") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.
`RealFileName::Named` introduced in #72767 is now renamed as `LocalPath`, because this variant wraps a (most likely) valid local filesystem path.
`RealFileName::Devirtualized` is renamed as `Remapped` to be used for remapped path from a real path via `--remap-path-prefix` argument, as well as real path inferred from a virtualized (during compiler bootstrapping) `/rustc/...` path. The `local_path` field is now an `Option<PathBuf>`, as it will be set to `None` before serialisation, so it never reaches any build output. Attempting to serialise a non-`None` `local_path` will cause an assertion faliure.
When a path is remapped, a `RealFileName::Remapped` variant is created. The original path is preserved in `local_path` field and the remapped path is saved in `virtual_name` field. Previously, the `local_path` is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".
`rustc_span::SourceFile`'s fields `unmapped_path` (introduced by #44940) and `name_was_remapped` (introduced by #41508 when `--remap-path-prefix` feature originally added) are removed, as these two pieces of information can be inferred from the `name` field: if it's anything other than a `FileName::Real(_)`, or if it is a `FileName::Real(RealFileName::LocalPath(_))`, then clearly `name_was_remapped` would've been false and `unmapped_path` would've been `None`. If it is a `FileName::Real(RealFileName::Remapped{local_path, virtual_name})`, then `name_was_remapped` would've been true and `unmapped_path` would've been `Some(local_path)`.
cc `@eddyb` who implemented `/rustc/...` path devirtualisation
This PR implements span quoting, allowing proc-macros to produce spans
pointing *into their own crate*. This is used by the unstable
`proc_macro::quote!` macro, allowing us to get error messages like this:
```
error[E0412]: cannot find type `MissingType` in this scope
--> $DIR/auxiliary/span-from-proc-macro.rs:37:20
|
LL | pub fn error_from_attribute(_args: TokenStream, _input: TokenStream) -> TokenStream {
| ----------------------------------------------------------------------------------- in this expansion of procedural macro `#[error_from_attribute]`
...
LL | field: MissingType
| ^^^^^^^^^^^ not found in this scope
|
::: $DIR/span-from-proc-macro.rs:8:1
|
LL | #[error_from_attribute]
| ----------------------- in this macro invocation
```
Here, `MissingType` occurs inside the implementation of the proc-macro
`#[error_from_attribute]`. Previosuly, this would always result in a
span pointing at `#[error_from_attribute]`
This will make many proc-macro-related error message much more useful -
when a proc-macro generates code containing an error, users will get an
error message pointing directly at that code (within the macro
definition), instead of always getting a span pointing at the macro
invocation site.
This is implemented as follows:
* When a proc-macro crate is being *compiled*, it causes the `quote!`
macro to get run. This saves all of the sapns in the input to `quote!`
into the metadata of *the proc-macro-crate* (which we are currently
compiling). The `quote!` macro then expands to a call to
`proc_macro::Span::recover_proc_macro_span(id)`, where `id` is an
opaque identifier for the span in the crate metadata.
* When the same proc-macro crate is *run* (e.g. it is loaded from disk
and invoked by some consumer crate), the call to
`proc_macro::Span::recover_proc_macro_span` causes us to load the span
from the proc-macro crate's metadata. The proc-macro then produces a
`TokenStream` containing a `Span` pointing into the proc-macro crate
itself.
The recursive nature of 'quote!' can be difficult to understand at
first. The file `src/test/ui/proc-macro/quote-debug.stdout` shows
the output of the `quote!` macro, which should make this eaier to
understand.
This PR also supports custom quoting spans in custom quote macros (e.g.
the `quote` crate). All span quoting goes through the
`proc_macro::quote_span` method, which can be called by a custom quote
macro to perform span quoting. An example of this usage is provided in
`src/test/ui/proc-macro/auxiliary/custom-quote.rs`
Custom quoting currently has a few limitations:
In order to quote a span, we need to generate a call to
`proc_macro::Span::recover_proc_macro_span`. However, proc-macros
support renaming the `proc_macro` crate, so we can't simply hardcode
this path. Previously, the `quote_span` method used the path
`crate::Span` - however, this only works when it is called by the
builtin `quote!` macro in the same crate. To support being called from
arbitrary crates, we need access to the name of the `proc_macro` crate
to generate a path. This PR adds an additional argument to `quote_span`
to specify the name of the `proc_macro` crate. Howver, this feels kind
of hacky, and we may want to change this before stabilizing anything
quote-related.
Additionally, using `quote_span` currently requires enabling the
`proc_macro_internals` feature. The builtin `quote!` macro
has an `#[allow_internal_unstable]` attribute, but this won't work for
custom quote implementations. This will likely require some additional
tricks to apply `allow_internal_unstable` to the span of
`proc_macro::Span::recover_proc_macro_span`.
Fixes: #84884
This solution might be considered a compromise, but I think it is the
better choice.
The results in the `closure.rs` test correctly resolve all test cases
broken as described in #84884.
One test pattern (in both `closure_macro.rs` and
`closure_macro_async.rs`) was also affected, and removes coverage
statistics for the lines inside the closure, because the closure
includes a macro. (The coverage remains at the callsite of the macro, so
we lose some detail, but there isn't a perfect choice with macros.
Often macro implementations are split across the macro and the callsite,
and there doesn't appear to be a single "right choice" for which body
should be covered. For the current implementation, we can't do both.
The callsite is most likely to be the preferred site for coverage.
I applied this fix to all `MacroKinds`, not just `Bang`.
I'm trying to resolve an issue of lost coverage in a
`MacroKind::Attr`-based, function-scoped macro. Instead of only
searching for a body_span that is "not a function-like macro" (that is,
MacroKind::Bang), I'm expanding this to all `MacroKind`s. Maybe I should
expand this to `ExpnKind::Desugaring` and `ExpnKind::AstPass` (or
subsets, depending on their sub-kinds) as well, but I'm not sure that's
a good idea.
I'd like to add a test of the `Attr` macro on functions, but I need time
to figure out how to constract a good, simple example without external
crate dependencies. For the moment, all tests still work as expected (no
change), this new commit shouldn't have a negative affect, and more
importantly, I believe it will have a positive effect. I will try to
confirm this.
Adds feature-gated `#[no_coverage]` function attribute, to fix derived Eq `0` coverage issue #83601
Derived Eq no longer shows uncovered
The Eq trait has a special hidden function. MIR `InstrumentCoverage`
would add this function to the coverage map, but it is never called, so
the `Eq` trait would always appear uncovered.
Fixes: #83601
The fix required creating a new function attribute `no_coverage` to mark
functions that should be ignored by `InstrumentCoverage` and the
coverage `mapgen` (during codegen).
Adding a `no_coverage` feature gate with tracking issue #84605.
r? `@tmandry`
cc: `@wesleywiser`
Improve coverage spans for chained function calls
Fixes: #84180
For chained function calls separated by the `?` try operator, the
function call following the try operator produced a MIR `Call` span that
matched the span of the first call. The `?` try operator started a new
span, so the second call got no span.
It turns out the MIR `Call` terminator has a `func` `Operand`
for the `Constant` representing the function name, and the function
name's Span can be used to reset the starting position of the span.
r? `@tmandry`
cc: `@wesleywiser`
The Eq trait has a special hidden function. MIR `InstrumentCoverage`
would add this function to the coverage map, but it is never called, so
the `Eq` trait would always appear uncovered.
Fixes: #83601
The fix required creating a new function attribute `no_coverage` to mark
functions that should be ignored by `InstrumentCoverage` and the
coverage `mapgen` (during codegen).
While testing, I also noticed two other issues:
* spanview debug file output ICEd on a function with no body. The
workaround for this is included in this PR.
* `assert_*!()` macro coverage can appear covered if followed by another
`assert_*!()` macro. Normally they appear uncovered. I submitted a new
Issue #84561, and added a coverage test to demonstrate this issue.
Fixes: #84180
For chained function calls separated by the `?` try operator, the
function call following the try operator produced a MIR `Call` span that
matched the span of the first call. The `?` try operator started a new
span, so the second call got no span.
It turns out the MIR `Call` terminator has a `func` `Operand`
for the `Constant` representing the function name, and the function
name's Span can be used to reset the starting position of the span.
Fixes: #83792
MIR `InstrumentCoverage` assumed the `FnSig` span was contained within a
single file, but this is not always the case. Some macro constructions
can result in a span that starts in one `SourceFile` and ends in a
different one.
The `FnSig` span is included in coverage results as long as that span is
in the same `SourceFile` and the same macro context, but by assuming the
`FnSig` span's `hi()` and `lo()` were in the same file, I took this for
granted, and checked only that the `FnSig` `hi()` was in the same
`SourceFile` as the `body_span`.
I actually drop the `hi()` though, and extend the `FnSig` span to the
`body_span.lo()`, so I really should have simply checked that the
`FnSig` span's `lo()` was in the `SourceFile` of the `body_span`.
coverage of async function bodies should match non-async
This fixes some missing coverage within async function bodies.
Commit 1 demonstrates the problem in the fixed issue, and commit 2 corrects it.
Fixes: #83985
Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to
address multiple, somewhat related issues.
Fixed a significant flaw in prior coverage solution: Every counter
generated a new counter variable, but there should have only been one
counter variable per function. This appears to have bloated .profraw
files significantly. (For a small program, it increased the size by
about 40%. I have not tested large programs, but there is anecdotal
evidence that profraw files were way too large. This is a good fix,
regardless, but hopefully it also addresses related issues.
Fixes: #82144
Invalid LLVM coverage data produced when compiled with -C opt-level=1
Existing tests now work up to at least `opt-level=3`. This required a
detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR
when compiled with coverage, and a lot of trial and error with codegen
adjustments.
The biggest hurdle was figuring out how to continue to support coverage
results for unused functions and generics. Rust's coverage results have
three advantages over Clang's coverage results:
1. Rust's coverage map does not include any overlapping code regions,
making coverage counting unambiguous.
2. Rust generates coverage results (showing zero counts) for all unused
functions, including generics. (Clang does not generate coverage for
uninstantiated template functions.)
3. Rust's unused functions produce minimal stubbed functions in LLVM IR,
sufficient for including in the coverage results; while Clang must
generate the complete LLVM IR for each unused function, even though
it will never be called.
This PR removes the previous hack of attempting to inject coverage into
some other existing function instance, and generates dedicated instances
for each unused function. This change, and a few other adjustments
(similar to what is required for `-C link-dead-code`, but with lower
impact), makes it possible to support LLVM optimizations.
Fixes: #79651
Coverage report: "Unexecuted instantiation:..." for a generic function
from multiple crates
Fixed by removing the aforementioned hack. Some "Unexecuted
instantiation" notices are unavoidable, as explained in the
`used_crate.rs` test, but `-Zinstrument-coverage` has new options to
back off support for either unused generics, or all unused functions,
which avoids the notice, at the cost of less coverage of unused
functions.
Fixes: #82875
Invalid LLVM coverage data produced with crate brotli_decompressor
Fixed by disabling the LLVM function attribute that forces inlining, if
`-Z instrument-coverage` is enabled. This attribute is applied to
Rust functions with `#[inline(always)], and in some cases, the forced
inlining breaks coverage instrumentation and reports.
When codegenning code coverage use the instance that coverage data was
originally generated for, to ensure basic level of compatibility with
MIR inlining.
Add Option::get_or_default
Tracking issue: #82901
The original issue is #55042, which was closed, but for an invalid reason (see discussion there). Opening this to reconsider (I hope that's okay). It seems like the only gap for `Option` being "entry-like".
I ran into a need for this method where I had a `Vec<Option<MyData>>` and wanted to do `vec[n].get_or_default().my_data_method()`. Using an `Option` as an inner component of a data structure is probably where the need for this will normally arise.
Make CTFE able to check for UB...
... by not doing any optimizations on the `const fn` MIR used in CTFE. This means we duplicate all `const fn`'s MIR now, once for CTFE, once for runtime. This PR is for checking the perf effect, so we have some data when talking about https://github.com/rust-lang/const-eval/blob/master/rfcs/0000-const-ub.md
To do this, we now have two queries for obtaining mir: `optimized_mir` and `mir_for_ctfe`. It is now illegal to invoke `optimized_mir` to obtain the MIR of a const/static item's initializer, an array length, an inline const expression or an enum discriminant initializer. For `const fn`, both `optimized_mir` and `mir_for_ctfe` work, the former returning the MIR that LLVM should use if the function is called at runtime. Similarly it is illegal to invoke `mir_for_ctfe` on regular functions.
This is all checked via appropriate assertions and I don't think it is easy to get wrong, as there should be no `mir_for_ctfe` calls outside the const evaluator or metadata encoding. Almost all rustc devs should keep using `optimized_mir` (or `instance_mir` for that matter).
See
https://github.com/rust-lang/rust/issues/80045#issuecomment-745733339
Coverage statements are moved to the beginning of the BCB. This does
also affect what's counted before a panic, changing some results, but I
think these results may even be preferred? In any case, there are no
guarantees about what's counted when a panic occurs (by design).
Fixes reported bugs in Rust Coverage
Fixes: #79569Fixes: #79566Fixes: #79565
For the first issue (#79569), I got hit a `debug_assert!()` before
encountering the reported error message (because I have `debug = true`
enabled in my config.toml).
The assertion showed me that some `SwitchInt`s can have more than one
target pointing to the same `BasicBlock`.
I had thought that was invalid, but since it seems to be possible, I'm
allowing this now.
I added a new test for this.
----
In the last two cases above, both tests (intentionally) fail to compile,
but the `InstrumentCoverage` pass is invoked anyway.
The MIR starts with an `Unreachable` `BasicBlock`, which I hadn't
encountered before. (I had assumed the `InstrumentCoverage` pass
would only be invoked with MIRs from successful compilations.)
I don't have test infrastructure set up to test coverage on files that
fail to compile, so I didn't add a new test.
r? `@tmandry`
FYI: `@wesleywiser`
Remove an unused dependency that made `rustdoc` crash
Whilst struggling with https://github.com/rust-lang/rust/issues/79980 I discovered that this dependency was unused, and that made rustdoc crash. This PR removes it.
Fixes: #79569Fixes: #79566Fixes: #79565
For the first issue (#79569), I got hit a `debug_assert!()` before
encountering the reported error message (because I have `debug = true`
enabled in my config.toml).
The assertion showed me that some `SwitchInt`s can have more than one
target pointing to the same `BasicBlock`.
I had thought that was invalid, but since it seems to be possible, I'm
allowing this now.
I added a new test for this.
----
In the last two cases above, both tests (intentionally) fail to compile,
but the `InstrumentCoverage` pass is invoked anyway.
The MIR starts with an `Unreachable` `BasicBlock`, which I hadn't
encountered before. (I had assumed the `InstrumentCoverage` pass
would only be invoked with MIRs from successful compilations.)
I don't have test infrastructure set up to test coverage on files that
fail to compile, so I didn't add a new test.
Dogfood `str_split_once()`
Part of https://github.com/rust-lang/rust/issues/74773.
Beyond increased clarity, this fixes some instances of a common confusion with how `splitn(2)` behaves: the first element will always be `Some()`, regardless of the delimiter, and even if the value is empty.
Given this code:
```rust
fn main() {
let val = "...";
let mut iter = val.splitn(2, '=');
println!("Input: {:?}, first: {:?}, second: {:?}", val, iter.next(), iter.next());
}
```
We get:
```
Input: "no_delimiter", first: Some("no_delimiter"), second: None
Input: "k=v", first: Some("k"), second: Some("v")
Input: "=", first: Some(""), second: Some("")
```
Using `str_split_once()` makes more clear what happens when the delimiter is not found.
Fixes: #79725
Some macros can create a situation where `fn_sig_span` and `body_span`
map to different files.
New documentation on coverage tests incorrectly assumed multiple test
binaries could just be listed at the end of the `llvm-cov` command,
but it turns out each binary needs a `--object` prefix.
This PR fixes the bug and updates the documentation to correct that
issue. It also fixes a few other minor issues in internal implementation
comments, and adds documentation on getting coverage results for doc
tests.
Added one more test (two files) showing coverage of generics and unused
functions across crates.
Created and referenced new Issues, as requested.
Added comments.
Added a note about the possible effects of compiler options on LLVM
coverage maps.
Fixes multiple issue with counters, with simplification
Includes a change to the implicit else span in ast_lowering, so coverage
of the implicit else no longer spans the `then` block.
Adds coverage for unused closures and async function bodies.
Fixes: #78542
Adding unreachable regions for known MIR missing from coverage map
Cleaned up PR commits, and removed link-dead-code requirement and tests
Coverage no longer depends on Issue #76038 (`-C link-dead-code` is
no longer needed or enforced, so MSVC can use the same tests as
Linux and MacOS now)
Restrict adding unreachable regions to covered files
Improved the code that adds coverage for uncalled functions (with MIR
but not-codegenned) to avoid generating coverage in files not already
included in the files with covered functions.
Resolved last known issue requiring --emit llvm-ir workaround
Fixed bugs in how unreachable code spans were added.