Implement Modified Condition/Decision Coverage
This is an implementation based on llvm backend support (>= 18) by `@evodius96` and branch coverage support by `@Zalathar.`
### Major changes:
* Add -Zcoverage-options=mcdc as switch. Now coverage options accept either `no-branch`, `branch`, or `mcdc`. `mcdc` also enables `branch` because it is essential to work.
* Add coverage mapping for MCDCBranch and MCDCDecision. Note that MCDCParameter evolves from llvm 18 to llvm 19. The mapping in rust side mainly references to 19 and is casted to 18 types in llvm wrapper.
* Add wrapper for mcdc instrinc functions from llvm. And inject associated statements to mir.
* Add BcbMappingKind::Decision, I'm not sure is it proper but can't find a better way temporarily.
* Let coverage-dump support parsing MCDCBranch and MCDCDecision from llvm ir.
* Add simple tests to check whether mcdc works.
* Same as clang, currently rustc does not generate instrument for decision with more than 6 condtions or only 1 condition due to considerations of resource.
### Implementation Details
1. To get information about conditions and decisions, `MCDCState` in `BranchInfoBuilder` is used during hir lowering to mir. For expressions with logical op we call `Builder::visit_coverage_branch_operation` to record its sub conditions, generate condition ids for them and save their spans (to construct the span of whole decision). This process mainly references to the implementation in clang and is described in comments over `MCDCState::record_conditions`. Also true marks and false marks introduced by branch coverage are used to detect where the decision evaluation ends: the next id of the condition == 0.
2. Once the `MCDCState::decision_stack` popped all recorded conditions, we can ensure that the decision is checked over and push it into `decision_spans`. We do not manually insert decision span to avoid complexity from then_else_break in nested if scopes.
3. When constructing CoverageSpans, add condition info to BcbMappingKind::Branch and decision info to BcbMappingKind::Decision. If the branch mapping has non-zero condition id it will be transformed to MCDCBranch mapping and insert `CondBitmapUpdate` statements to its evaluated blocks. While decision bcb mapping will insert `TestVectorBitmapUpdate` in all its end blocks.
### Usage
```bash
echo "[build]\nprofiler=true" >> config.toml
./x build --stage 1
./x test tests/coverage/mcdc_if.rs
```
to build the compiler and run tests.
```shell
export PATH=path/to/llvm-build:$PATH
rustup toolchain link mcdc build/host/stage1
cargo +mcdc rustc --bin foo -- -Cinstrument-coverage -Zcoverage-options=mcdc
cd target/debug
LLVM_PROFILE_FILE="foo.profraw" ./foo
llvm-profdata merge -sparse foo.profraw -o foo.profdata
llvm-cov show ./foo -instr-profile=foo.profdata --show-mcdc
```
to check "foo" code.
### Problems to solve
For now decision mapping will insert statements to its all end blocks, which may be optimized by inserting a final block of the decision. To do this we must also trace the evaluated value at each end of the decision and join them separately.
This implementation is not heavily tested so there should be some unrevealed issues. We are going to check our rust products in the next. Please let me know if you had any suggestions or comments.
Disable SimplifyToExp in MatchBranchSimplification
Due to the miscompilation mentioned in #124150, We need to disable MatchBranchSimplification temporarily.
To fully resolve this issue, my plan is:
1. Disable SimplifyToExp in MatchBranchSimplification (this PR).
2. Remove all potentially unclear transforms in #124122.
3. Gradually add back the removed transforms (possibly multiple PRs).
r? `@Nilstrieb` or `@oli-obk`
async closure coroutine by move body MirPass refactoring
Unsure about the last commit, but I think the other changes help in simplifying the control flow
Stop making any assumption about the projections applied to the upvars in the `ByMoveBody` pass
So it turns out that because of subtle optimizations like [`truncate_capture_for_optimization`](ab5bda1aa7/compiler/rustc_hir_typeck/src/upvar.rs (L2351)), we simply cannot make any assumptions about the shape of the projections applied to the upvar locals in a coroutine body.
So stop doing that -- the code is resilient to such projections, so the assertion really existed only to "protect against the unknown".
r? oli-obk
Fixes#123650
Re-enable the early otherwise branch optimization
Closes#95162. Fixes#119014.
This is the first part of #121397.
An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement. This should have a pass to hoist instructions.
r? cjgillot
Fix `ByMove` coroutine-closure shim (for 2021 precise closure capturing behavior)
This PR reworks the way that we perform the `ByMove` coroutine-closure shim to account for the fact that the upvars of the outer coroutine-closure and the inner coroutine might not line up due to edition-2021 closure capture rules changes.
Specifically, the number of upvars may differ *and/or* the inner coroutine may have additional projections applied to an upvar. This PR reworks the information we pass into the `ByMoveBody` MIR visitor to account for both of these facts.
I tried to leave comments explaining exactly what everything is doing, but let me know if you have questions.
r? oli-obk
Actually use the inferred `ClosureKind` from signature inference in coroutine-closures
A follow-up to https://github.com/rust-lang/rust/pull/123349, which fixes another subtle bug: We were not taking into account the async closure kind we infer during closure signature inference.
When I pass a closure directly to an arg like `fn(x: impl async FnOnce())`, that should have the side-effect of artificially restricting the kind of the async closure to `ClosureKind::FnOnce`. We weren't doing this -- that's a quick fix; however, it uncovers a second, more subtle bug with the way that `move`, async closures, and `FnOnce` interact.
Specifically, when we have an async closure like:
```
let x = Struct;
let c = infer_as_fnonce(async move || {
println!("{x:?}");
}
```
The outer closure captures `x` by move, but the inner coroutine still immutably borrows `x` from the outer closure. Since we've forced the closure to by `async FnOnce()`, we can't actually *do* a self borrow, since the signature of `AsyncFnOnce::call_once` doesn't have a borrowed lifetime. This means that all `async move` closures that are constrained to `FnOnce` will fail borrowck.
We can fix that by detecting this case specifically, and making the *inner* async closure `move` as well. This is always beneficial to closure analysis, since if we have an `async FnOnce()` that's `move`, there's no reason to ever borrow anything, so `move` isn't artificially restrictive.
Cleanup: Rename `HAS_PROJECTIONS` to `HAS_ALIASES` etc.
The name of the bitflag `HAS_PROJECTIONS` and of its corresponding method `has_projections` is quite historical dating back to a time when projections were the only kind of alias type.
I think it's time to update it to clear up any potential confusion for newcomers and to reduce unnecessary friction during contributor onboarding.
r? types
coverage: Remove useless constants
After #122972 and #123419, these constants don't serve any useful purpose, so get rid of them.
`@rustbot` label +A-code-coverage
CFI: Support function pointers for trait methods
Adds support for both CFI and KCFI for function pointers to trait methods by attaching both concrete and abstract types to functions.
KCFI does this through generation of a `ReifyShim` on any function pointer for a method that could go into a vtable, and keeping this separate from `ReifyShim`s that are *intended* for vtable us by setting a `ReifyReason` on them.
CFI does this by setting both the concrete and abstract type on every instance.
This should land after #123024 or a similar PR, as it diverges the implementation of CFI vs KCFI.
r? `@compiler-errors`
Remove MIR unsafe check
Now that THIR unsafeck is enabled by default in stable I think we can remove MIR unsafeck entirely. This PR also removes safety information from MIR.
Rollup of 4 pull requests
Successful merges:
- #122411 ( Provide cabi_realloc on wasm32-wasip2 by default )
- #123349 (Fix capture analysis for by-move closure bodies)
- #123359 (Link against libc++abi and libunwind as well when building LLVM wrappers on AIX)
- #123388 (use a consistent style for links)
r? `@ghost`
`@rustbot` modify labels: rollup
Rename `UninhabitedEnumBranching` to `UnreachableEnumBranching`
Per [#120268](https://github.com/rust-lang/rust/pull/120268#discussion_r1517492060), I rename `UninhabitedEnumBranching` to `UnreachableEnumBranching` .
I solved some nits to add some comments.
I adjusted the workaround restrictions. This should be useful for `a <= b` and `if let Some/Ok(v)`. For enum with few variants, `early-tailduplication` should not cause compile time overhead.
r? RalfJung
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
KCFI needs to be able to tell which kind of `ReifyShim` it is examining
in order to decide whether to use a concrete type (`FnPtr` case) or an
abstract case (`Vtable` case). You can *almost* tell this from context,
but there is one case where you can't - if a trait has a method which is
*not* `#[track_caller]`, with an impl that *is* `#[track_caller]`, both
the vtable and a function pointer created from that method will be
`ReifyShim(def_id)`.
Currently, the reason is optional to ensure no additional unique
`ReifyShim`s are added without KCFI on. However, the case in which an
extra `ReifyShim` is created is sufficiently rare that this may be worth
revisiting to reduce complexity.
Rollup of 4 pull requests
Successful merges:
- #123176 (Normalize the result of `Fields::ty_with_args`)
- #123186 (copy any file from stage0/lib to stage0-sysroot/lib)
- #123187 (Forward port 1.77.1 release notes)
- #123188 (compiler: fix few unused_peekable and needless_pass_by_ref_mut clippy lints)
r? `@ghost`
`@rustbot` modify labels: rollup
compiler: fix few unused_peekable and needless_pass_by_ref_mut clippy lints
This fixes few instances of `unused_peekable` and `needless_pass_by_ref_mut`. While i expected to fix more warnings, `needless_pass_by_ref_mut` produced too much for one PR, so i stopped here.
Better reviewed commit by commit, as fixes splitted by chunks.
Simplify trim-paths feature by merging all debuginfo options together
This PR simplifies the trim-paths feature by merging all debuginfo options together, as described in https://github.com/rust-lang/rust/issues/111540#issuecomment-1994010274.
And also do some correctness fixes found during the review.
cc `@weihanglo`
r? `@michaelwoerister`
warning: this argument is a mutable reference, but not used mutably
--> compiler\rustc_mir_transform\src\coroutine.rs:1229:11
|
1229 | body: &mut Body<'tcx>,
| ^^^^^^^^^^^^^^^ help: consider changing to: `&Body<'tcx>`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut
warning: this argument is a mutable reference, but not used mutably
--> compiler\rustc_mir_transform\src\nrvo.rs:123:11
|
123 | body: &mut mir::Body<'_>,
| ^^^^^^^^^^^^^^^^^^ help: consider changing to: `&mir::Body<'_>`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut
warning: this argument is a mutable reference, but not used mutably
--> compiler\rustc_mir_transform\src\nrvo.rs:87:34
|
87 | fn local_eligible_for_nrvo(body: &mut mir::Body<'_>) -> Option<Local> {
| ^^^^^^^^^^^^^^^^^^ help: consider changing to: `&mir::Body<'_>`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut
coverage: Re-enable `UnreachablePropagation` for coverage builds
This is a sequence of 3 related changes:
- Clean up the existing code that scans for unused functions
- Detect functions that were instrumented for coverage, but have had all their coverage statements removed by later MIR transforms (e.g. `UnreachablePropagation`)
- Re-enable `UnreachablePropagation` in coverage builds
Because we now detect functions that have lost their coverage statements, and treat them as unused, we don't need to worry about `UnreachablePropagation` removing all of those statements. This is demonstrated by `tests/coverage/unreachable.rs`.
Fixes#116171.
In `ConstructCoroutineInClosureShim`, pass receiver by mut ref, not mut pointer
The receivers were compatible at codegen time, but did not necessarily have the same layouts due to niches, which was caught by miri.
Fixesrust-lang/miri#3400
r? oli-obk
Replace `mir_built` query with a hook and use mir_const everywhere instead
A small perf improvement due to less dep graph handling.
Mostly just a cleanup to get rid of one of our many mir queries
Unbox and unwrap the contents of `StatementKind::Coverage`
The payload of coverage statements was historically a structure with several fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace `Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid accidentally bloating it in the future.
``@rustbot`` label +A-code-coverage
Fix validation on substituted callee bodies in MIR inliner
When inlining a coroutine, we will substitute the MIR body with the args of the call. There is code in the MIR validator that attempts to prevent query cycles, and will use the coroutine body directly when it detects that's the body that's being validated. That means that when inlining a coroutine body that has been substituted, it may no longer be parameterized over the original args of the coroutine, which will lead to substitution ICEs.
Fixes#119064
The payload of coverage statements was historically a structure with several
fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace
`Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid
accidentally bloating it in the future.
Rollup of 8 pull requests
Successful merges:
- #114009 (compiler: allow transmute of ZST arrays with generics)
- #122195 (Note that the caller chooses a type for type param)
- #122651 (Suggest `_` for missing generic arguments in turbofish)
- #122784 (Add `tag_for_variant` query)
- #122839 (Split out `PredicatePolarity` from `ImplPolarity`)
- #122873 (Merge my contributor emails into one using mailmap)
- #122885 (Adjust better spastorino membership to triagebot's adhoc_groups)
- #122888 (add a couple more tests)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `tag_for_variant` query
This query allows for sharing code between `rustc_const_eval` and `rustc_transmutability`. It's a precursor to a PR I'm working on to entirely replace the bespoke layout computations in `rustc_transmutability`.
r? `@compiler-errors`