CFI: Support function pointers for trait methods
Adds support for both CFI and KCFI for function pointers to trait methods by attaching both concrete and abstract types to functions.
KCFI does this through generation of a `ReifyShim` on any function pointer for a method that could go into a vtable, and keeping this separate from `ReifyShim`s that are *intended* for vtable us by setting a `ReifyReason` on them.
CFI does this by setting both the concrete and abstract type on every instance.
This should land after #123024 or a similar PR, as it diverges the implementation of CFI vs KCFI.
r? `@compiler-errors`
Rename `expose_addr` to `expose_provenance`
`expose_addr` is a bad name, an address is just a number and cannot be exposed. The operation is actually about the provenance of the pointer.
This PR thus changes the name of the method to `expose_provenance` without changing its return type. There is sufficient precedence for returning a useful value from an operation that does something else without the name indicating such, e.g. [`Option::insert`](https://doc.rust-lang.org/nightly/std/option/enum.Option.html#method.insert) and [`MaybeUninit::write`](https://doc.rust-lang.org/nightly/std/mem/union.MaybeUninit.html#method.write).
Returning the address is merely convenient, not a fundamental part of the operation. This is implied by the fact that integers do not have provenance since
```rust
let addr = ptr.addr();
ptr.expose_provenance();
let new = ptr::with_exposed_provenance(addr);
```
must behave exactly like
```rust
let addr = ptr.expose_provenance();
let new = ptr::with_exposed_provenance(addr);
```
as the result of `ptr.expose_provenance()` and `ptr.addr()` is the same integer. Therefore, this PR removes the `#[must_use]` annotation on the function and updates the documentation to reflect the important part.
~~An alternative name would be `expose_provenance`. I'm not at all opposed to that, but it makes a stronger implication than we might want that the provenance of the pointer returned by `ptr::with_exposed_provenance`[^1] is the same as that what was exposed, which is not yet specified as such IIUC. IMHO `expose` does not make that connection.~~
A previous version of this PR suggested `expose` as name, libs-api [decided on](https://github.com/rust-lang/rust/pull/122964#issuecomment-2033194319) `expose_provenance` to keep the symmetry with `with_exposed_provenance`.
CC `@RalfJung`
r? libs-api
[^1]: I'm using the new name for `from_exposed_addr` suggested by #122935 here.
rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/To.20expose.20or.20not.20to.20expose/near/427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
Use the `Align` type when parsing alignment attributes
Use the `Align` type in `rustc_attr::parse_alignment`, removing the need to call `Align::from_bytes(...).unwrap()` later in the compilation process.
Simplify trim-paths feature by merging all debuginfo options together
This PR simplifies the trim-paths feature by merging all debuginfo options together, as described in https://github.com/rust-lang/rust/issues/111540#issuecomment-1994010274.
And also do some correctness fixes found during the review.
cc `@weihanglo`
r? `@michaelwoerister`
CFI: Fix methods as function pointer cast
Fix casting between methods and function pointers by assigning a secondary type id to methods with their concrete self so they can be used as function pointers.
This was split off from #116404.
cc `@compiler-errors` `@workingjubilee`
Fix casting between methods and function pointers by assigning a
secondary type id to methods with their concrete self so they can be
used as function pointers.
Make `TyCtxt::coroutine_layout` take coroutine's kind parameter
For coroutines that come from coroutine-closures (i.e. async closures), we may have two kinds of bodies stored in the coroutine; one that takes the closure's captures by reference, and one that takes the captures by move.
These currently have identical layouts, but if we do any optimization for these layouts that are related to the upvars, then they will diverge -- e.g. https://github.com/rust-lang/rust/pull/120168#discussion_r1536943728.
This PR relaxes the assertion I added in #121122, and instead make the `TyCtxt::coroutine_layout` method take the `coroutine_kind_ty` argument from the coroutine, which will allow us to differentiate these by-move and by-ref bodies.
coverage: Re-enable `UnreachablePropagation` for coverage builds
This is a sequence of 3 related changes:
- Clean up the existing code that scans for unused functions
- Detect functions that were instrumented for coverage, but have had all their coverage statements removed by later MIR transforms (e.g. `UnreachablePropagation`)
- Re-enable `UnreachablePropagation` in coverage builds
Because we now detect functions that have lost their coverage statements, and treat them as unused, we don't need to worry about `UnreachablePropagation` removing all of those statements. This is demonstrated by `tests/coverage/unreachable.rs`.
Fixes#116171.
If a function was instrumented for coverage, but all of its coverage statements
have been removed by later MIR transforms, it should be treated as "unused"
even if the compiler generates an unreachable stub for it.
Unbox and unwrap the contents of `StatementKind::Coverage`
The payload of coverage statements was historically a structure with several fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace `Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid accidentally bloating it in the future.
``@rustbot`` label +A-code-coverage
We already use `Instance` at declaration sites when available to glean
additional information about possible abstractions of the type in use.
This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it
generates type information for indirect calls through `Virtual`
instances.
The payload of coverage statements was historically a structure with several
fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace
`Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid
accidentally bloating it in the future.
Some of the marker statements used by coverage are added during MIR building
for use by the InstrumentCoverage pass (during analysis), and are not needed
afterwards.
various clippy fixes
We need to keep the order of the given clippy lint rules before passing them.
Since clap doesn't offer any useful interface for this purpose out of the box,
we have to handle it manually.
Additionally, this PR makes `-D` rules work as expected. Previously, lint rules were limited to `-W`. By enabling `-D`, clippy began to complain numerous lines in the tree, all of which have been resolved in this PR as well.
Fixes#121481
cc `@matthiaskrgr`
Update the minimum external LLVM to 17
With this change, we'll have stable support for LLVM 17 and 18.
For reference, the previous increase to LLVM 16 was #117947.
LLVM's default bad-alloc handler may throw if exceptions are enabled,
and `operator new` isn't hooked at all by default. Now we register our
own handler that prints a message similar to fatal errors, then aborts.
We also call the function that registers the C++ `std::new_handler`.
coverage: Initial support for branch coverage instrumentation
(This is a review-ready version of the changes that were drafted in #118305.)
This PR adds support for branch coverage instrumentation, gated behind the unstable flag value `-Zcoverage-options=branch`. (Coverage instrumentation must also be enabled with `-Cinstrument-coverage`.)
During THIR-to-MIR lowering (MIR building), if branch coverage is enabled, we collect additional information about branch conditions and their corresponding then/else blocks. We inject special marker statements into those blocks, so that the `InstrumentCoverage` MIR pass can reliably identify them even after the initially-built MIR has been simplified and renumbered.
The rest of the changes are mostly just plumbing needed to gather up the information that was collected during MIR building, and include it in the coverage metadata that we embed in the final binary.
Note that `llvm-cov show` doesn't print branch coverage information in its source views by default; that needs to be explicitly enabled with `--show-branches=count` or similar.
---
The current implementation doesn't have any support for instrumenting `if let` or let-chains. I think it's still useful without that, and adding it would be non-trivial, so I'm happy to leave that for future work.
coverage: Remove or migrate all unstable values of `-Cinstrument-coverage`
(This PR was substantially overhauled from its original version, which migrated all of the existing unstable values intact.)
This PR takes the three nightly-only values that are currently accepted by `-Cinstrument-coverage`, completely removes two of them (`except-unused-functions` and `except-unused-generics`), and migrates the third (`branch`) over to a newly-introduced unstable flag `-Zcoverage-options`.
I have a few motivations for wanting to do this:
- It's unclear whether anyone actually uses the `except-unused-*` values, so this serves as an opportunity to either remove them, or prompt existing users to object to their removal.
- After #117199, the stable values of `-Cinstrument-coverage` treat it as a boolean-valued flag, so having nightly-only extra values feels out-of-place.
- Nightly-only values also require extra ad-hoc code to make sure they aren't accidentally exposed to stable users.
- The new system allows multiple different settings to be toggled independently, which isn't possible in the current single-value system.
- The new system makes it easier to introduce new behaviour behind an unstable toggle, and then gather nightly-user feedback before possibly making it the default behaviour for all users.
- The new system also gives us a convenient place to put relatively-narrow options that won't ever be the default, but that nightly users might still want access to.
- It's likely that we will eventually want to give stable users more fine-grained control over coverage instrumentation. The new flag serves as a prototype of what that stable UI might eventually look like.
The `branch` option is a placeholder that currently does nothing. It will be used by #122322 to opt into branch coverage instrumentation.
---
I see `-Zcoverage-options` as something that will exist more-or-less indefinitely, though individual sub-options might come and go as appropriate. I think there will always be some demand for nightly-only toggles, so I don't see `-Zcoverage-options` itself ever being stable, though we might eventually stabilize something similar to it.