Add `f16` and `f128` math functions
This adds intrinsics and math functions for `f16` and `f128` floating point types. Support is quite limited and some things are broken so tests don't run on many platforms, but this provides a starting point.
Line 0 has a special meaning in DWARF. From the version 5 spec:
The compiler may emit the value 0 in cases
where an instruction cannot be attributed to any
source line.
DUMMY_SP spans cannot be attributed to any line. However, because rustc
internally stores line numbers starting at zero, lookup_debug_loc() adjusts
every line number by one. Special casing DUMMY_SP to actually emit line 0
ensures rustc communicates to the debugger that there's no meaningful source
code for this instruction, rather than telling the debugger to jump to line 1
randomly.
Since LLVM <https://reviews.llvm.org/D99439> (4c7f820b2b20, "Update
@llvm.powi to handle different int sizes for the exponent"), the size of
the integer can be specified for the `powi` intrinsic. Make use of this
so it is more obvious that integer size is consistent across all float
types.
This feature is available since LLVM 13 (October 2021). Based on
bootstrap we currently support >= 17.0, so there should be no support
problems.
When an archive fails to build, print the path
Currently the output on failure is as follows:
Compiling block-buffer v0.10.4
Compiling crypto-common v0.1.6
Compiling digest v0.10.7
Compiling sha2 v0.10.8
Compiling xz2 v0.1.7
error: failed to build archive: No such file or directory
error: could not compile `bootstrap` (lib) due to 1 previous error
Change this to print which file is being constructed, to give some hint about what is going on.
error: failed to build archive at `path/to/output`: No such file or directory
Rollup of 4 pull requests
Successful merges:
- #127574 (elaborate unknowable goals)
- #128141 (Set branch protection function attributes)
- #128315 (Fix vita build of std and forbid unsafe in unsafe in the os/vita module)
- #128339 ([rustdoc] Make the buttons remain when code example is clicked)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `select_unpredictable` to force LLVM to use CMOV
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs into branches if it comes from a `select` marked with an `unpredictable` metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits such a `select` and uses it in the implementation of `binary_search_by`.
Set branch protection function attributes
Since LLVM 19, it is necessary to set not only module flags, but also function attributes for branch protection on aarch64. See e15d67cfc2 for the relevant LLVM change.
Fixes https://github.com/rust-lang/rust/issues/127829.
Update compiler_builtins to 0.1.114
The `weak-intrinsics` feature was removed from compiler_builtins in https://github.com/rust-lang/compiler-builtins/pull/598, so dropped the `compiler-builtins-weak-intrinsics` feature from alloc/std/sysroot.
In https://github.com/rust-lang/compiler-builtins/pull/593, some builtins for f16/f128 were added. These don't work for all compiler backends, so add a `compiler-builtins-no-f16-f128` feature and disable it for cranelift and gcc.
deps: dedup object, wasmparser, wasm-encoder
* dedups one `object`, additional dupe will be removed, with next `thorin-dwp` update
* `wasmparser` pinned to minor versions, so full merge isn't possible
* same with `wasm-encoder`
Turned off some features for `wasmparser` (see features https://github.com/bytecodealliance/wasm-tools/blob/v1.208.1/crates/wasmparser/Cargo.toml) in `run-make-support`, looks working?
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs
into branches if it comes from a `select` marked with an `unpredictable`
metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits
such a `select` and uses it in the implementation of `binary_search_by`.
compiler: Never debug_assert in codegen
In the name of Turing and his Hoarey heralds, assert our truths before creating a monster!
The `rustc_codegen_llvm` and `rustc_codegen_ssa` crates are fairly critical for rustc's correctness. Small mistakes here can easily result in undefined behavior, since a "small mistake" can mean something like "link and execute the wrong code". We should probably run any and all asserts in these modules unconditionally on whether this is a "debug build", and damn the costs in performance.
...Especially because the costs in performance seem to be *nothing*. It is not clear how much correctness we gain here, but I'll take free correctness improvements.
Since LLVM 19, it is necessary to set not only module flags, but
also function attributes for branch protection on aarch64. See
e15d67cfc2
for the relevant LLVM change.
rustc_target: add known safe s390x target features
This pull request adds known safe target features for s390x (aka IBM Z systems).
Currently, these features are unstable since stabilizing the target features requires submitting proposals.
The `vector` feature was added in IBM Z13 (`arch11`), and this is a SIMD feature for the newer IBM Z systems.
The `backchain` attribute is the IBM Z way of adding frame pointers like unwinding capabilities (the "frame-pointer" switch on IBM Z and IBM POWER platforms will add _emulated_ frame pointers to the binary, which profilers can't use for unwinding the stack).
Both attributes can be applied at the LLVM module or function levels. However, the `backchain` attribute has to be enabled for all the functions in the call stack to get a successful unwind process.
Handle .init_array link_section specially on wasm
Given that wasm-ld now has support for [.init_array](8f2bd8ae68/llvm/lib/MC/WasmObjectWriter.cpp (L1852)), it appears we can easily implement that section by falling through to the normal path rather than taking the typical custom_section path for wasm.
The wasm-ld appears to have a bunch of limitations. Only one static with the `link_section` in a crate or else you hit the fatal error in the link above "only one .init_array section fragment supported". They do not get merged.
You can still call multiple constructors by setting it to an array.
```
unsafe extern "C" fn ctor() {
println!("foo");
}
#[used]
#[link_section = ".init_array"]
static FOO: [unsafe extern "C" fn(); 2] = [ctor, ctor];
```
Another issue appears to be that if crate *A* depends on crate *B*, but *A* doesn't call any symbols from *B* and *B* doesn't `#[export_name = ...]` any symbols, then crate *B*'s constructor will not be called. The workaround to this is to provide an exported symbol in crate *B*.
... this is a special attribute that was made to be a target-feature in
LLVM 18+, but in all previous versions, this "feature" is a naked
attribute. We will have to handle this situation differently than all
other target-features.
Sync ar_archive_writer to LLVM 18.1.3
From LLVM 15.0.0-rc3. This adds support for COFF archives containing Arm64EC object files and has various fixes for AIX big archive files.
Currently the output on failure is as follows:
Compiling block-buffer v0.10.4
Compiling crypto-common v0.1.6
Compiling digest v0.10.7
Compiling sha2 v0.10.8
Compiling xz2 v0.1.7
error: failed to build archive: No such file or directory
error: could not compile `bootstrap` (lib) due to 1 previous error
Print which file is being constructed to give some hint about what is
going on.
In the future, branch and MC/DC mappings might have expressions that don't
correspond to any single point in the control-flow graph. That makes it
trickier to keep track of which expressions should expect an `ExpressionUsed`
node.
We therefore sidestep that complexity by only performing `ExpressionUsed`
simplification for expressions associated directly with ordinary `Code`
mappings.
Fix incorrect NDEBUG handling in LLVM bindings
We currently compile our LLVM bindings using `-DNDEBUG` if debuginfo for LLVM is disabled. However, `NDEBUG` doesn't have any relation to debuginfo, it controls whether assertions are enabled.
Split the LLVM_NDEBUG environment variable into two, so that assertions and debuginfo are controlled independently.
After this change, `LLVMRustDIBuilderInsertDeclareAtEnd` triggers an assertion failure on LLVM 19 due to an incorrect cast. Fix it by removing the unused return value entirely.
r? `@cuviper`
The return value changed from an Instruction to a DbgRecord in
LLVM 19. As we don't actually use the result, drop the return
value entirely to support both.
Rollup of 8 pull requests
Successful merges:
- #124599 (Suggest borrowing on fn argument that is `impl AsRef`)
- #127572 (Don't mark `DEBUG_EVENT` struct as `repr(packed)`)
- #127588 (core: Limit remaining f16 doctests to x86_64 linux)
- #127591 (Make sure that labels are defined after the primary span in diagnostics)
- #127598 (Allows `#[diagnostic::do_not_recommend]` to supress trait impls in suggestions as well)
- #127599 (Rename `lazy_cell_consume` to `lazy_cell_into_inner`)
- #127601 (check is_ident before parse_ident)
- #127605 (Remove extern "wasm" ABI)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `f16` and `f128` as simd types in LLVM
`@sayantn` is working on adding SIMD for `f16` and hitting the `FloatingPointVector` error. This should fix it and unblock adding support for `simd_fma` and `simd_fabs` in stdarch.
Remove the unstable `extern "wasm"` ABI (`wasm_abi` feature tracked
in #83788).
As discussed in https://github.com/rust-lang/rust/pull/127513#issuecomment-2220410679
and following, this ABI is a failed experiment that did not end
up being used for anything. Keeping support for this ABI in LLVM 19
would require us to switch wasm targets to the `experimental-mv`
ABI, which we do not want to do.
It should be noted that `Abi::Wasm` was internally used for two
things: The `-Z wasm-c-abi=legacy` ABI that is still used by
default on some wasm targets, and the `extern "wasm"` ABI. Despite
both being `Abi::Wasm` internally, they were not the same. An
explicit `extern "wasm"` additionally enabled the `+multivalue`
feature.
I've opted to remove `Abi::Wasm` in this patch entirely, instead
of keeping it as an ABI with only internal usage. Both
`-Z wasm-c-abi` variants are now treated as part of the normal
C ABI, just with different different treatment in
adjust_for_foreign_abi.
Add Natvis visualiser and debuginfo tests for `f16`
To render `f16`s in debuggers on MSVC targets, this PR changes the compiler to output `f16`s as `struct f16 { bits: u16 }`, and includes a Natvis visualiser that manually converts the `f16`'s bits to a `float` which is can then be displayed by debuggers. `gdb`, `lldb` and `cdb` tests are also included for `f16` .
`f16`/`f128` MSVC debug info issue: #121837
Tracking issue: #116909
Miri function identity hack: account for possible inlining
Having a non-lifetime generic is not the only reason a function can be duplicated. Another possibility is that the function may be eligible for cross-crate inlining. So also take into account the inlining attribute in this Miri hack for function pointer identity.
That said, `cross_crate_inlinable` will still sometimes return true even for `inline(never)` functions:
- when they are `DefKind::Ctor(..) | DefKind::Closure` -- I assume those cannot be `InlineAttr::Never` anyway?
- when `cross_crate_inline_threshold == InliningThreshold::Always`
so maybe this is still not quite the right criterion to use for function pointer identity.
Re-implement a type-size based limit
r? lcnr
This PR reintroduces the type length limit added in #37789, which was accidentally made practically useless by the caching changes to `Ty::walk` in #72412, which caused the `walk` function to no longer walk over identical elements.
Hitting this length limit is not fatal unless we are in codegen -- so it shouldn't affect passes like the mir inliner which creates potentially very large types (which we observed, for example, when the new trait solver compiles `itertools` in `--release` mode).
This also increases the type length limit from `1048576 == 2 ** 20` to `2 ** 24`, which covers all of the code that can be reached with craterbot-check. Individual crates can increase the length limit further if desired.
Perf regression is mild and I think we should accept it -- reinstating this limit is important for the new trait solver and to make sure we don't accidentally hit more type-size related regressions in the future.
Fixes#125460
Since this codegen flag now only controls LLVM-generated comments rather than
all assembly comments, make the name more accurate (and also match Clang).
`-Z patchable-function-entry` works like `-fpatchable-function-entry`
on clang/gcc. The arguments are total nop count and function offset.
See MCP rust-lang/compiler-team#704
Deprecate no-op codegen option `-Cinline-threshold=...`
This deprecates `-Cinline-threshold` since using it has no effect. This has been the case since the new LLVM pass manager started being used, more than 2 years ago.
Recommend using `-Cllvm-args=--inline-threshold=...` instead.
Closes#89742 which is E-help-wanted.
Add `f16` inline ASM support for 32-bit ARM
Adds `f16` inline ASM support for 32-bit ARM. SIMD vector types are taken from [here](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:`@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&f:@navigationhierarchiesarchitectures=[A32]).`
Relevant issue: #125398
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
Add `f16` inline ASM support for RISC-V
This PR adds `f16` inline ASM support for RISC-V. A `FIXME` is left for `f128` support as LLVM does not support the required `Q` (Quad-Precision Floating-Point) extension yet.
Relevant issue: #125398
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
Honor collapse_debuginfo for statics.
fixes#126363
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.
This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using
r? <reviewer name>
-->
Also sort `crt-static` in `--print target-features` output
I didn't find `crt-static` at first (for `x86_64-unknown-linux-gnu`), because it was put at the bottom of the large and otherwise sorted list.
Fully sort the list before we print it.
Note that `llvm_target_features` starts out and remains sorted and does not need to be sorted an extra time.
On my machine the diff is just:
```diff
$ diff -u /tmp/before2.txt /tmp/after2.txt
--- /tmp/before2.txt 2024-06-13 20:40:27.091636592 +0200
+++ /tmp/after2.txt 2024-06-13 20:39:54.584894891 +0200
``@@`` -20,6 +20,7 ``@@``
bmi1 - Support BMI instructions.
bmi2 - Support BMI2 instructions.
cmpxchg16b - 64-bit with cmpxchg16b (this is true for most x86-64 chips, but not the first AMD chips).
+ crt-static - Enables C Run-time Libraries to be statically linked.
ermsb - REP MOVS/STOS are fast.
f16c - Support 16-bit floating point conversion instructions.
fma - Enable three-operand fused multiple-add.
``@@`` -49,7 +50,6 ``@@``
xsavec - Support xsavec instructions.
xsaveopt - Support xsaveopt instructions.
xsaves - Support xsaves instructions.
- crt-static - Enables C Run-time Libraries to be statically linked.
Code-generation features supported by LLVM for this target:
16bit-mode - 16-bit mode (i8086).
```
I couldn't find a ui test that tested this output. Let's see if CI finds a regression tests.
This deprecates `-Cinline-threshold` since using it has no effect. This
has been the case since the new LLVM pass manager started being used,
more than 2 years ago.
I didn't find `crt-static` at first (for `x86_64-unknown-linux-gnu`),
because it was put at the bottom the large and otherwise sorted list.
Fully sort the list before we print it.
Note that `llvm_target_features` starts out sorted and does not need to
be sorted an extra time.
We already do this for a number of crates, e.g. `rustc_middle`,
`rustc_span`, `rustc_metadata`, `rustc_span`, `rustc_errors`.
For the ones we don't, in many cases the attributes are a mess.
- There is no consistency about order of attribute kinds (e.g.
`allow`/`deny`/`feature`).
- Within attribute kind groups (e.g. the `feature` attributes),
sometimes the order is alphabetical, and sometimes there is no
particular order.
- Sometimes the attributes of a particular kind aren't even grouped
all together, e.g. there might be a `feature`, then an `allow`, then
another `feature`.
This commit extends the existing sorting to all compiler crates,
increasing consistency. If any new attribute line is added there is now
only one place it can go -- no need for arbitrary decisions.
Exceptions:
- `rustc_log`, `rustc_next_trait_solver` and `rustc_type_ir_macros`,
because they have no crate attributes.
- `rustc_codegen_gcc`, because it's quasi-external to rustc (e.g. it's
ignored in `rustfmt.toml`).
Directly add extension instead of using `Path::with_extension`
`Path::with_extension` has a nice footgun when the original path doesn't contain an extension: Anything after the last dot gets removed.
Make repr(packed) vectors work with SIMD intrinsics
In #117116 I fixed `#[repr(packed, simd)]` by doing the expected thing and removing padding from the layout. This should be the last step in providing a solution to rust-lang/portable-simd#319
Rollup of 6 pull requests
Successful merges:
- #125263 (rust-lld: fallback to rustc's sysroot if there's no path to the linker in the target sysroot)
- #125345 (rustc_codegen_llvm: add support for writing summary bitcode)
- #125362 (Actually use TAIT instead of emulating it)
- #125412 (Don't suggest adding the unexpected cfgs to the build-script it-self)
- #125445 (Migrate `run-make/rustdoc-with-short-out-dir-option` to `rmake.rs`)
- #125452 (Cleanup check-cfg handling in core and std)
r? `@ghost`
`@rustbot` modify labels: rollup
rustc_codegen_llvm: add support for writing summary bitcode
Typical uses of ThinLTO don't have any use for this as a standalone file, but distributed ThinLTO uses this to make the linker phase more efficient. With clang you'd do something like `clang -flto=thin -fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o (full of bitcode) and foo.indexing.o (just the summary or index part of the bitcode). That's then usable by a two-stage linking process that's more friendly to distributed build systems like bazel, which is why I'm working on this area.
I talked some to `@teresajohnson` about naming in this area, as things seem to be a little confused between various blog posts and build systems. "bitcode index" and "bitcode summary" tend to be a little too ambiguous, and she tends to use "thin link bitcode" and "minimized bitcode" (which matches the descriptions in LLVM). Since the clang option is thin-link-bitcode, I went with that to try and not add a new spelling in the world.
Per `@dtolnay,` you can work around the lack of this by using `lld --thinlto-index-only` to do the indexing on regular .o files of bitcode, but that is a bit wasteful on actions when we already have all the information in rustc and could just write out the matching minimized bitcode. I didn't test that at all in our infrastructure, because by the time I learned that I already had this patch largely written.
If we don't do this, some versions of LLVM (at least 17, experimentally)
will double-emit some error messages, which is how I noticed this. Given
that it seems to be costing some extra work, let's only request the
summary bitcode production if we'll actually bother writing it down,
otherwise skip it.
Typical uses of ThinLTO don't have any use for this as a standalone
file, but distributed ThinLTO uses this to make the linker phase more
efficient. With clang you'd do something like `clang -flto=thin
-fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o
(full of bitcode) and foo.indexing.o (just the summary or index part of
the bitcode). That's then usable by a two-stage linking process that's
more friendly to distributed build systems like bazel, which is why I'm
working on this area.
I talked some to @teresajohnson about naming in this area, as things
seem to be a little confused between various blog posts and build
systems. "bitcode index" and "bitcode summary" tend to be a little too
ambiguous, and she tends to use "thin link bitcode" and "minimized
bitcode" (which matches the descriptions in LLVM). Since the clang
option is thin-link-bitcode, I went with that to try and not add a new
spelling in the world.
Per @dtolnay, you can work around the lack of this by using `lld
--thinlto-index-only` to do the indexing on regular .o files of
bitcode, but that is a bit wasteful on actions when we already have all
the information in rustc and could just write out the matching minimized
bitcode. I didn't test that at all in our infrastructure, because by the
time I learned that I already had this patch largely written.
This code for recalculating `mcdc_bitmap_bytes` doesn't provide any benefit,
because its result won't have changed from the value in `FunctionCoverageInfo`
that was computed during the MIR instrumentation pass.
Rollup of 5 pull requests
Successful merges:
- #124615 (coverage: Further simplify extraction of mapping info from MIR)
- #124778 (Fix parse error message for meta items)
- #124797 (Refactor float `Primitive`s to a separate `Float` type)
- #124888 (Migrate `run-make/rustdoc-output-path` to rmake)
- #124957 (Make `Ty::builtin_deref` just return a `Ty`)
r? `@ghost`
`@rustbot` modify labels: rollup
Refactor float `Primitive`s to a separate `Float` type
Now there are 4 of them, it makes sense to refactor `F16`, `F32`, `F64` and `F128` out of `Primitive` and into a separate `Float` type (like integers already are). This allows patterns like `F16 | F32 | F64 | F128` to be simplified into `Float(_)`, and is consistent with `ty::FloatTy`.
As a side effect, this PR also makes the `Ty::primitive_size` method work with `f16` and `f128`.
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
codegen: memmove/memset cannot be non-temporal
non-temporal memset is not a thing.
And for memmove, since the LLVM backend doesn't support this, surely we don't need it in the GCC backend.
LLVM has updated data layouts to specify `Fn32` on 64-bit ARM to avoid
C++ accidentally underaligning functions when trying to comply with
member function ABIs.
This should only affect Rust in cases where we had a similar bug (I
don't believe we have one), but our data layout must match to generate
code.
As a compatibility adaptatation, if LLVM is not version 19 yet, `Fn32`
gets voided from the data layout.
See llvm/llvm-project#90415
coverage: Clean up creation of MC/DC condition bitmaps
This PR improves the code for creating and initializing [MC/DC](https://en.wikipedia.org/wiki/Modified_condition/decision_coverage) condition bitmap variables, as introduced by #123409 and modified by #124255.
- The condition bitmap variables are now created eagerly at the start of per-function codegen, via a new `init_coverage` method in `CoverageInfoBuilderMethods`. This avoids having to retroactively create the bitmaps while doing codegen for an individual coverage statement.
- As a result, we can now create and initialize those bitmaps using existing safe APIs, instead of having to perform our own unsafe call to `llvm::LLVMBuildAlloca`.
- This PR also tweaks the way we count the number of condition bitmaps needed, by tracking the total number of bitmaps needed (max depth + 1), instead of only tracking the maximum depth. This reduces the potential for subtle off-by-one confusion.
Stabilize the size of incr comp object file names
The current implementation does not produce stable-length paths, and we create the paths in a way that makes our allocation behavior is nondeterministic. I think `@eddyb` fixed a number of other cases like this in the past, and this PR fixes another one. Whether that actually matters I have no idea, but we still have bimodal behavior in rustc-perf and the non-uniformity in `find` and `ls` was bothering me.
I've also removed the truncation of the mangled CGU names. Before this PR incr comp paths look like this:
```
target/debug/incremental/scratch-38izrrq90cex7/s-gux6gz0ow8-1ph76gg-ewe1xj434l26w9up5bedsojpd/261xgo1oqnd90ry5.o
```
And after, they look like this:
```
target/debug/incremental/scratch-035omutqbfkbw/s-gux6borni0-16r3v1j-6n64tmwqzchtgqzwwim5amuga/55v2re42sztc8je9bva6g8ft3.o
```
On the one hand, I'm sure this will break some people's builds because they're on Windows and only a few bytes from the path length limit. But if we're that seriously worried about the length of our file names, I have some other ideas on how to make them smaller. And last time I deleted some hash truncations from the compiler, there was a huge drop in the number if incremental compilation ICEs that were reported: https://github.com/rust-lang/rust/pull/110367https://github.com/rust-lang/rust/pull/110367
---
Upon further reading, this PR actually fixes a bug. This comment says the CGU names are supposed to be a fixed-length hash, and before this PR they aren't: ca7d34efa9/compiler/rustc_monomorphize/src/partitioning.rs (L445-L448)
Because this now always takes place at the start of the function, we can just
use the normal `alloca` method and then initialize each bitmap immediately.
This patch also moves bitmap setup out of the `mcdc_parameters` method, because
there is no longer any particular reason for it to be there.
Remove many `#[macro_use] extern crate foo` items
This requires the addition of more `use` items, which often make the code more verbose. But they also make the code easier to read, because `#[macro_use]` obscures where macros are defined.
r? `@fee1-dead`
MCDC coverage: support nested decision coverage
#123409 provided the initial MCDC coverage implementation.
As referenced in #124144, it does not currently support "nested" decisions, like the following example :
```rust
fn nested_if_in_condition(a: bool, b: bool, c: bool) {
if a && if b || c { true } else { false } {
say("yes");
} else {
say("no");
}
}
```
Note that there is an if-expression (`if b || c ...`) embedded inside a boolean expression in the decision of an outer if-expression.
This PR proposes a workaround for this cases, by introducing a Decision context stack, and by handing several `temporary condition bitmaps` instead of just one.
When instrumenting boolean expressions, if the current node is a leaf condition (i.e. not a `||`/`&&` logical operator nor a `!` not operator), we insert a new decision context, such that if there are more boolean expressions inside the condition, they are handled as separate expressions.
On the codegen LLVM side, we allocate as many `temp_cond_bitmap`s as necessary to handle the maximum encountered decision depth.
Add decision_depth field to TVBitmapUpdate/CondBitmapUpdate statements
Add decision_depth field to BcbMappingKinds MCDCBranch and MCDCDecision
Add decision_depth field to MCDCBranchSpan and MCDCDecisionSpan
Set writable and dead_on_unwind attributes for sret arguments
Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's.
See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that:
* The return slot is writable.
* The return slot will not be read if the function unwinds.
Fixes https://github.com/rust-lang/rust/issues/90595.
Stop using LLVM struct types for alloca
The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.
Split out from #121577.
r? `@ghost`
Dellvmize some intrinsics (use `u32` instead of `Self` in some integer intrinsics)
This implements https://github.com/rust-lang/compiler-team/issues/693 minus what was implemented in #123226.
Note: I decided to _not_ change `shl`/... builder methods, as it just doesn't seem worth it.
r? ``@scottmcm``
[cleanup] [llvm backend] Prevent creating the same `Instance::mono` multiple times
Just a little thing I came across while going through the code.
r? ```@oli-obk```
Add support for Arm64EC to the Standard Library
Adds the final pieces so that the standard library can be built for arm64ec-pc-windows-msvc (initially added in #119199)
* Bumps `windows-sys` to 0.56.0, which adds support for Arm64EC.
* Correctly set the `isEC` parameter for LLVM's `writeArchive` function.
* Add `#![feature(asm_experimental_arch)]` to library crates where Arm64EC inline assembly is used, as it is currently unstable.
This makes sure that &[] is just as efficient as indirecting through
unsafe code (from_raw_parts). No new stable guarantee is intended about
whether or not we do this, this is just an optimization.
Co-authored-by: Ralf Jung <post@ralfj.de>
Add the missing inttoptr when we ptrtoint in ptr atomics
Ralf noticed this here: https://github.com/rust-lang/rust/pull/122220#discussion_r1535172094
Our previous codegen forgot to add the cast back to integer type. The code compiles anyway, because of course all locals are in-memory to start with, so previous codegen would do the integer atomic, store the integer to a local, then load a pointer from that local. Which is definitely _not_ what we wanted: That's an integer-to-pointer transmute, so all pointers returned by these `AtomicPtr` methods didn't have provenance. Yikes.
Here's the IR for `AtomicPtr::fetch_byte_add` on 1.76: https://godbolt.org/z/8qTEjeraY
```llvm
define noundef ptr `@atomicptr_fetch_byte_add(ptr` noundef nonnull align 8 %a, i64 noundef %v) unnamed_addr #0 !dbg !7 {
start:
%0 = alloca ptr, align 8, !dbg !12
%val = inttoptr i64 %v to ptr, !dbg !12
call void `@llvm.lifetime.start.p0(i64` 8, ptr %0), !dbg !28
%1 = ptrtoint ptr %val to i64, !dbg !28
%2 = atomicrmw add ptr %a, i64 %1 monotonic, align 8, !dbg !28
store i64 %2, ptr %0, align 8, !dbg !28
%self = load ptr, ptr %0, align 8, !dbg !28
call void `@llvm.lifetime.end.p0(i64` 8, ptr %0), !dbg !28
ret ptr %self, !dbg !33
}
```
r? `@RalfJung`
cc `@nikic`
Make `PlaceRef` and `OperandValue::Ref` share a common `PlaceValue` type
Both `PlaceRef` and `OperandValue::Ref` need the triple of the backend pointer immediate, the optional backend metadata for DSTs, and the actual alignment of the place (since it can differ from the ABI alignment).
This PR introduces a new `PlaceValue` type for those three values, leaving [`PlaceRef`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/place/struct.PlaceRef.html) with the `TyAndLayout` and a `PlaceValue`, just like how [`OperandRef`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/operand/struct.OperandRef.html) is a `TyAndLayout` and an `OperandValue`.
This means that various places that use `Ref`s as places can just pass the `PlaceValue` along, like in the below excerpt from the diff:
```diff
match operand.val {
- OperandValue::Ref(ptr, meta, align) => {
- debug_assert_eq!(meta, None);
+ OperandValue::Ref(source_place_val) => {
+ debug_assert_eq!(source_place_val.llextra, None);
debug_assert!(matches!(operand_kind, OperandValueKind::Ref));
- let fake_place = PlaceRef::new_sized_aligned(ptr, cast, align);
+ let fake_place = PlaceRef { val: source_place_val, layout: cast };
Some(bx.load_operand(fake_place).val)
}
```
There's more refactoring that I'd like to do after this, but I wanted to stop the PR here where it's hopefully easy (albeit probably not quick) to review since I tried to keep every change line-by-line clear. (Most are just adding `.val` to get to a field.)
You can also go commit-at-a-time if you'd like. Each passed tidy and the codegen tests on my machine (though I didn't run the cg_gcc ones).
I added this back in 111999, but I no longer think it's a good idea
- It had to get scaled back to only power-of-two things to not break a bunch of targets
- LLVM seems to be getting better at memcpy removal anyway
- Introducing vector instructions has seemed to sometimes (115515) make autovectorization worse
So this removes it from the codegen crates entirely, and instead just tries to use <https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/builder/trait.BuilderMethods.html#method.typed_place_copy> instead of direct `memcpy` so things will still use load/store for immediates.
sanitizers: Create the rustc_sanitizers crate
Create the `rustc_sanitizers` crate and move the source code for the CFI and KCFI sanitizers to it. The tracking issue for reviewing and moving sanitizers into a compiler crate is #123619. This is part of our work to organize and stabilize support for the sanitizers. (See our roadmap at https://hackmd.io/`@rcvalle/S1Ou9K6H6.)`
Create the rustc_sanitizers crate and move the source code for the CFI
and KCFI sanitizers to it.
Co-authored-by: David Wood <agile.lion3441@fuligin.ink>
The actual ABI implication here is that in some cases the values
are required to be "consecutive", i.e. must either all be passed
in registers or all on stack (without padding).
Adjust the code to either use Uniform::new() or Uniform::consecutive()
depending on which behavior is needed.
Then, when lowering this in LLVM, skip the [1 x i128] to i128
simplification if is_consecutive is set. i128 is the only case
I'm aware of where this is problematic right now. If we find
other cases, we can extend this (either based on target information
or possibly just by not simplifying for is_consecutive entirely).
When passing a 16 (or higher) aligned struct by value on ppc64le,
it needs to be passed as an array of `i128` rather than an array
of `i64`. This will force the use of an even starting register.
For the case of a 16 byte struct with alignment 16 it is important
that `[1 x i128]` is used instead of `i128` -- apparently, the
latter will get treated similarly to `[2 x i64]`, not exhibiting
the correct ABI. Add a `force_array` flag to `Uniform` to support
this.
The relevant clang code can be found here:
fe2119a7b0/clang/lib/CodeGen/Targets/PPC.cpp (L878-L884)fe2119a7b0/clang/lib/CodeGen/Targets/PPC.cpp (L780-L784)
I think the corresponding psABI wording is this:
> Fixed size aggregates and unions passed by value are mapped to as
> many doublewords of the parameter save area as the value uses in
> memory. Aggregrates and unions are aligned according to their
> alignment requirements. This may result in doublewords being
> skipped for alignment.
In particular the last sentence.
Fixes https://github.com/rust-lang/rust/issues/122767.
CFI: Restore typeid_for_instance default behavior
Restore typeid_for_instance default behavior of performing self type erasure, since it's the most common case and what it does most of the time. Using concrete self (or not performing self type erasure) is for assigning a secondary type id, and secondary type ids are only assigned when they're unique and to methods, and also are only tested for when methods are used as function pointers.
Add aarch64-apple-visionos and aarch64-apple-visionos-sim tier 3 targets
Introduces `aarch64-apple-visionos` and `aarch64-apple-visionos-sim` as tier 3 targets. This allows native development for the Apple Vision Pro's visionOS platform.
This work has been tracked in https://github.com/rust-lang/compiler-team/issues/642. There is a corresponding `libc` change https://github.com/rust-lang/libc/pull/3568 that is not required for merge.
Ideally we would be able to incorporate [this change](https://github.com/gimli-rs/object/pull/626) to the `object` crate, but the author has stated that a release will not be cut for quite a while. Therefore, the two locations that would reference the xrOS constant from `object` are hardcoded to their MachO values of 11 and 12, accompanied by TODOs to mark the code as needing change. I am open to suggestions on what to do here to get this checked in.
# Tier 3 Target Policy
At this tier, the Rust project provides no official support for a target, so we place minimal requirements on the introduction of targets.
> A tier 3 target must have a designated developer or developers (the "target maintainers") on record to be CCed when issues arise regarding the target. (The mechanism to track and CC such developers may evolve over time.)
See [src/doc/rustc/src/platform-support/apple-visionos.md](e88379034a/src/doc/rustc/src/platform-support/apple-visionos.md)
> Targets must use naming consistent with any existing targets; for instance, a target for the same CPU or OS as an existing Rust target should use the same name for that CPU or OS. Targets should normally use the same names and naming conventions as used elsewhere in the broader ecosystem beyond Rust (such as in other toolchains), unless they have a very good reason to diverge. Changing the name of a target can be highly disruptive, especially once the target reaches a higher tier, so getting the name right is important even for a tier 3 target.
> * Target names should not introduce undue confusion or ambiguity unless absolutely necessary to maintain ecosystem compatibility. For example, if the name of the target makes people extremely likely to form incorrect beliefs about what it targets, the name should be changed or augmented to disambiguate it.
> * If possible, use only letters, numbers, dashes and underscores for the name. Periods (.) are known to cause issues in Cargo.
This naming scheme matches `$ARCH-$VENDOR-$OS-$ABI` which is matches the iOS Apple Silicon simulator (`aarch64-apple-ios-sim`) and other Apple targets.
> Tier 3 targets may have unusual requirements to build or use, but must not
create legal issues or impose onerous legal terms for the Rust project or for
Rust developers or users.
> - The target must not introduce license incompatibilities.
> - Anything added to the Rust repository must be under the standard Rust license (`MIT OR Apache-2.0`).
> - The target must not cause the Rust tools or libraries built for any other host (even when supporting cross-compilation to the target) to depend on any new dependency less permissive than the Rust licensing policy. This applies whether the dependency is a Rust crate that would require adding new license exceptions (as specified by the `tidy` tool in the rust-lang/rust repository), or whether the dependency is a native library or binary. In other words, the introduction of the target must not cause a user installing or running a version of Rust or the Rust tools to besubject to any new license requirements.
> - Compiling, linking, and emitting functional binaries, libraries, or other code for the target (whether hosted on the target itself or cross-compiling from another target) must not depend on proprietary (non-FOSS) libraries. Host tools built for the target itself may depend on the ordinary runtime libraries supplied by the platform and commonly used by other applications built for the target, but those libraries must not be required for code generation for the target; cross-compilation to the target must not require such libraries at all. For instance, `rustc` built for the target may depend on a common proprietary C runtime library or console output library, but must not depend on a proprietary code generation library or code optimization library. Rust's license permits such combinations, but the Rust project has no interest in maintaining such combinations within the scope of Rust itself, even at tier 3.
> - "onerous" here is an intentionally subjective term. At a minimum, "onerous" legal/licensing terms include but are *not* limited to: non-disclosure requirements, non-compete requirements, contributor license agreements (CLAs) or equivalent, "non-commercial"/"research-only"/etc terms, requirements conditional on the employer or employment of any particular Rust developers, revocable terms, any requirements that create liability for the Rust project or its developers or users, or any requirements that adversely affect the livelihood or prospects of the Rust project or its developers or users.
This contribution is fully available under the standard Rust license with no additional legal restrictions whatsoever. This PR does not introduce any new dependency less permissive than the Rust license policy.
The new targets do not depend on proprietary libraries.
> Tier 3 targets should attempt to implement as much of the standard libraries as possible and appropriate (core for most targets, alloc for targets that can support dynamic memory allocation, std for targets with an operating system or equivalent layer of system-provided functionality), but may leave some code unimplemented (either unavailable or stubbed out as appropriate), whether because the target makes it impossible to implement or challenging to implement. The authors of pull requests are not obligated to avoid calling any portions of the standard library on the basis of a tier 3 target not implementing those portions.
This new target mirrors the standard library for watchOS and iOS, with minor divergences.
> The target must provide documentation for the Rust community explaining how to build for the target, using cross-compilation if possible. If the target supports running binaries, or running tests (even if they do not pass), the documentation must explain how to run such binaries or tests for the target, using emulation if possible or dedicated hardware if necessary.
Documentation is provided in [src/doc/rustc/src/platform-support/apple-visionos.md](e88379034a/src/doc/rustc/src/platform-support/apple-visionos.md)
> Neither this policy nor any decisions made regarding targets shall create any binding agreement or estoppel by any party. If any member of an approving Rust team serves as one of the maintainers of a target, or has any legal or employment requirement (explicit or implicit) that might affect their decisions regarding a target, they must recuse themselves from any approval decisions regarding the target's tier status, though they may otherwise participate in discussions.
> * This requirement does not prevent part or all of this policy from being cited in an explicit contract or work agreement (e.g. to implement or maintain support for a target). This requirement exists to ensure that a developer or team responsible for reviewing and approving a target does not face any legal threats or obligations that would prevent them from freely exercising their judgment in such approval, even if such judgment involves subjective matters or goes beyond the letter of these requirements.
> Tier 3 targets must not impose burden on the authors of pull requests, or other developers in the community, to maintain the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on a tier 3 target. Do not send automated messages or notifications (via any medium, including via `@)` to a PR author or others involved with a PR regarding a tier 3 target, unless they have opted into such messages.
> * Backlinks such as those generated by the issue/PR tracker when linking to an issue or PR are not considered a violation of this policy, within reason. However, such messages (even on a separate repository) must not generate notifications to anyone involved with a PR who has not requested such notifications.
> Patches adding or updating tier 3 targets must not break any existing tier 2 or tier 1 target, and must not knowingly break another tier 3 target without approval of either the compiler team or the maintainers of the other tier 3 target.
> * In particular, this may come up when working on closely related targets, such as variations of the same architecture with different features. Avoid introducing unconditional uses of features that another variation of the target may not have; use conditional compilation or runtime detection, as appropriate, to let each target run code supported by that target.
I acknowledge these requirements and intend to ensure that they are met.
This target does not touch any existing tier 2 or tier 1 targets and should not break any other targets.
Restore typeid_for_instance default behavior of performing self type
erasure, since it's the most common case and what it does most of the
time. Using concrete self (or not performing self type erasure) is for
assigning a secondary type id, and secondary type ids are only assigned
when they're unique and to methods, and also are only tested for when
methods are used as function pointers.
Rollup of 9 pull requests
Successful merges:
- #121546 (Error out of layout calculation if a non-last struct field is unsized)
- #122448 (Port hir-tree run-make test to ui test)
- #123212 (CFI: Change type transformation to use TypeFolder)
- #123218 (Add test for getting parent HIR for synthetic HIR node)
- #123324 (match lowering: make false edges more precise)
- #123389 (Avoid panicking unnecessarily on startup)
- #123397 (Fix diagnostic for qualifier in extern block)
- #123431 (Stabilize `proc_macro_byte_character` and `proc_macro_c_str_literals`)
- #123439 (coverage: Remove useless constants)
r? `@ghost`
`@rustbot` modify labels: rollup
coverage: Remove useless constants
After #122972 and #123419, these constants don't serve any useful purpose, so get rid of them.
`@rustbot` label +A-code-coverage
coverage: Correctly report and check LLVM's coverage mapping version
I was puzzled by the fact that the LLVM 18 update (#120055) didn't need to modify this version check, despite the fact that LLVM 18 uses a newer version of the coverage mapping format.
This turned out to be because we were inappropriately hard-coding a specific version (`Version6`) in the C++ wrapper, instead of using `CovMapVersion::CurrentVersion` to reflect the version actually used by LLVM on our behalf.
This PR fixes that, and also changes the Rust-side version check to accept the new coverage mapping version used by LLVM 18, since the necessary compatibility work has already been done.
---
### Quick history of `LLVMRustCoverageMappingVersion`:
- Originally it returned LLVM's `coverage::CovMapVersion::CurrentVersion`, as intended. The Rust-side code would verify it, and also embed it as the actual coverage version number in the output binary.
- At some point it was changed to a hard-coded value, to work around a (now-irrelevant) compatibility issue. This was incorrect (but mostly benign), because the override should have been performed on the Rust side instead, after verifying LLVM's value.
- Later contributors dutifully updated the hard-coded value, because they didn't have enough context to identify the problem.
- With this PR, it once again returns LLVM's current coverage version number, and the Rust-side code checks it against an expected range. We don't override the result, but we do indicate where that override should occur if it ever becomes necessary.
CFI: Support function pointers for trait methods
Adds support for both CFI and KCFI for function pointers to trait methods by attaching both concrete and abstract types to functions.
KCFI does this through generation of a `ReifyShim` on any function pointer for a method that could go into a vtable, and keeping this separate from `ReifyShim`s that are *intended* for vtable us by setting a `ReifyReason` on them.
CFI does this by setting both the concrete and abstract type on every instance.
This should land after #123024 or a similar PR, as it diverges the implementation of CFI vs KCFI.
r? `@compiler-errors`
Rename `expose_addr` to `expose_provenance`
`expose_addr` is a bad name, an address is just a number and cannot be exposed. The operation is actually about the provenance of the pointer.
This PR thus changes the name of the method to `expose_provenance` without changing its return type. There is sufficient precedence for returning a useful value from an operation that does something else without the name indicating such, e.g. [`Option::insert`](https://doc.rust-lang.org/nightly/std/option/enum.Option.html#method.insert) and [`MaybeUninit::write`](https://doc.rust-lang.org/nightly/std/mem/union.MaybeUninit.html#method.write).
Returning the address is merely convenient, not a fundamental part of the operation. This is implied by the fact that integers do not have provenance since
```rust
let addr = ptr.addr();
ptr.expose_provenance();
let new = ptr::with_exposed_provenance(addr);
```
must behave exactly like
```rust
let addr = ptr.expose_provenance();
let new = ptr::with_exposed_provenance(addr);
```
as the result of `ptr.expose_provenance()` and `ptr.addr()` is the same integer. Therefore, this PR removes the `#[must_use]` annotation on the function and updates the documentation to reflect the important part.
~~An alternative name would be `expose_provenance`. I'm not at all opposed to that, but it makes a stronger implication than we might want that the provenance of the pointer returned by `ptr::with_exposed_provenance`[^1] is the same as that what was exposed, which is not yet specified as such IIUC. IMHO `expose` does not make that connection.~~
A previous version of this PR suggested `expose` as name, libs-api [decided on](https://github.com/rust-lang/rust/pull/122964#issuecomment-2033194319) `expose_provenance` to keep the symmetry with `with_exposed_provenance`.
CC `@RalfJung`
r? libs-api
[^1]: I'm using the new name for `from_exposed_addr` suggested by #122935 here.
rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/To.20expose.20or.20not.20to.20expose/near/427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
Use the `Align` type when parsing alignment attributes
Use the `Align` type in `rustc_attr::parse_alignment`, removing the need to call `Align::from_bytes(...).unwrap()` later in the compilation process.
Simplify trim-paths feature by merging all debuginfo options together
This PR simplifies the trim-paths feature by merging all debuginfo options together, as described in https://github.com/rust-lang/rust/issues/111540#issuecomment-1994010274.
And also do some correctness fixes found during the review.
cc `@weihanglo`
r? `@michaelwoerister`
CFI: Fix methods as function pointer cast
Fix casting between methods and function pointers by assigning a secondary type id to methods with their concrete self so they can be used as function pointers.
This was split off from #116404.
cc `@compiler-errors` `@workingjubilee`
Fix casting between methods and function pointers by assigning a
secondary type id to methods with their concrete self so they can be
used as function pointers.
Make `TyCtxt::coroutine_layout` take coroutine's kind parameter
For coroutines that come from coroutine-closures (i.e. async closures), we may have two kinds of bodies stored in the coroutine; one that takes the closure's captures by reference, and one that takes the captures by move.
These currently have identical layouts, but if we do any optimization for these layouts that are related to the upvars, then they will diverge -- e.g. https://github.com/rust-lang/rust/pull/120168#discussion_r1536943728.
This PR relaxes the assertion I added in #121122, and instead make the `TyCtxt::coroutine_layout` method take the `coroutine_kind_ty` argument from the coroutine, which will allow us to differentiate these by-move and by-ref bodies.
coverage: Re-enable `UnreachablePropagation` for coverage builds
This is a sequence of 3 related changes:
- Clean up the existing code that scans for unused functions
- Detect functions that were instrumented for coverage, but have had all their coverage statements removed by later MIR transforms (e.g. `UnreachablePropagation`)
- Re-enable `UnreachablePropagation` in coverage builds
Because we now detect functions that have lost their coverage statements, and treat them as unused, we don't need to worry about `UnreachablePropagation` removing all of those statements. This is demonstrated by `tests/coverage/unreachable.rs`.
Fixes#116171.
If a function was instrumented for coverage, but all of its coverage statements
have been removed by later MIR transforms, it should be treated as "unused"
even if the compiler generates an unreachable stub for it.
Unbox and unwrap the contents of `StatementKind::Coverage`
The payload of coverage statements was historically a structure with several fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace `Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid accidentally bloating it in the future.
``@rustbot`` label +A-code-coverage
We already use `Instance` at declaration sites when available to glean
additional information about possible abstractions of the type in use.
This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it
generates type information for indirect calls through `Virtual`
instances.
The payload of coverage statements was historically a structure with several
fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace
`Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid
accidentally bloating it in the future.
Some of the marker statements used by coverage are added during MIR building
for use by the InstrumentCoverage pass (during analysis), and are not needed
afterwards.
various clippy fixes
We need to keep the order of the given clippy lint rules before passing them.
Since clap doesn't offer any useful interface for this purpose out of the box,
we have to handle it manually.
Additionally, this PR makes `-D` rules work as expected. Previously, lint rules were limited to `-W`. By enabling `-D`, clippy began to complain numerous lines in the tree, all of which have been resolved in this PR as well.
Fixes#121481
cc `@matthiaskrgr`
Update the minimum external LLVM to 17
With this change, we'll have stable support for LLVM 17 and 18.
For reference, the previous increase to LLVM 16 was #117947.
LLVM's default bad-alloc handler may throw if exceptions are enabled,
and `operator new` isn't hooked at all by default. Now we register our
own handler that prints a message similar to fatal errors, then aborts.
We also call the function that registers the C++ `std::new_handler`.
coverage: Initial support for branch coverage instrumentation
(This is a review-ready version of the changes that were drafted in #118305.)
This PR adds support for branch coverage instrumentation, gated behind the unstable flag value `-Zcoverage-options=branch`. (Coverage instrumentation must also be enabled with `-Cinstrument-coverage`.)
During THIR-to-MIR lowering (MIR building), if branch coverage is enabled, we collect additional information about branch conditions and their corresponding then/else blocks. We inject special marker statements into those blocks, so that the `InstrumentCoverage` MIR pass can reliably identify them even after the initially-built MIR has been simplified and renumbered.
The rest of the changes are mostly just plumbing needed to gather up the information that was collected during MIR building, and include it in the coverage metadata that we embed in the final binary.
Note that `llvm-cov show` doesn't print branch coverage information in its source views by default; that needs to be explicitly enabled with `--show-branches=count` or similar.
---
The current implementation doesn't have any support for instrumenting `if let` or let-chains. I think it's still useful without that, and adding it would be non-trivial, so I'm happy to leave that for future work.
coverage: Remove or migrate all unstable values of `-Cinstrument-coverage`
(This PR was substantially overhauled from its original version, which migrated all of the existing unstable values intact.)
This PR takes the three nightly-only values that are currently accepted by `-Cinstrument-coverage`, completely removes two of them (`except-unused-functions` and `except-unused-generics`), and migrates the third (`branch`) over to a newly-introduced unstable flag `-Zcoverage-options`.
I have a few motivations for wanting to do this:
- It's unclear whether anyone actually uses the `except-unused-*` values, so this serves as an opportunity to either remove them, or prompt existing users to object to their removal.
- After #117199, the stable values of `-Cinstrument-coverage` treat it as a boolean-valued flag, so having nightly-only extra values feels out-of-place.
- Nightly-only values also require extra ad-hoc code to make sure they aren't accidentally exposed to stable users.
- The new system allows multiple different settings to be toggled independently, which isn't possible in the current single-value system.
- The new system makes it easier to introduce new behaviour behind an unstable toggle, and then gather nightly-user feedback before possibly making it the default behaviour for all users.
- The new system also gives us a convenient place to put relatively-narrow options that won't ever be the default, but that nightly users might still want access to.
- It's likely that we will eventually want to give stable users more fine-grained control over coverage instrumentation. The new flag serves as a prototype of what that stable UI might eventually look like.
The `branch` option is a placeholder that currently does nothing. It will be used by #122322 to opt into branch coverage instrumentation.
---
I see `-Zcoverage-options` as something that will exist more-or-less indefinitely, though individual sub-options might come and go as appropriate. I think there will always be some demand for nightly-only toggles, so I don't see `-Zcoverage-options` itself ever being stable, though we might eventually stabilize something similar to it.
Only generate a ptrtoint in AtomicPtr codegen when absolutely necessary
This special case was added in this PR: https://github.com/rust-lang/rust/pull/77611 in response to this error message:
```
Intrinsic has incorrect argument type!
void ({}*)* `@llvm.ppc.cfence.p0sl_s`
in function rust_oom
LLVM ERROR: Broken function found, compilation aborted!
[RUSTC-TIMING] std test:false 20.161
error: could not compile `std`
```
But when I tried searching for more information about that intrinsic I found this: https://github.com/llvm/llvm-project/issues/55983 which is a report of someone hitting this same error and a fix was landed in LLVM, 2 years after the above Rust PR.
Ensure nested allocations in statics neither get deduplicated nor duplicated
This PR generates new `DefId`s for nested allocations in static items and feeds all the right queries to make the compiler believe these are regular `static` items. I chose this design, because all other designs are fragile and make the compiler horribly complex for such a niche use case.
At present this wrecks incremental compilation performance *in case nested allocations exist* (because any query creating a `DefId` will be recomputed and never loaded from the cache). This will be resolved later in https://github.com/rust-lang/rust/pull/115613 . All other statics are unaffected by this change and will not have performance regressions (heh, famous last words)
This PR contains various smaller refactorings that can be pulled out into separate PRs. It is best reviewed commit-by-commit. The last commit is where the actual magic happens.
r? `@RalfJung` on the const interner and engine changes
fixes https://github.com/rust-lang/rust/issues/79738
Fix 32-bit overflows in LLVM composite constants
Inspired by #121868. Fixes unsoundness created when constructing constant arrays, strings, and structs with 2^32 or more elements on x86_64. This introduces copies of a few LLVM functions that have their signatures updated to use size_t in place of unsigned int. Alternatively we could just add overflow checks and just disallow huge composite constants. That introduces less code, but maybe a huge static block of memory is useful in embedded/no-os situations?
Remove the unused `field_remapping` field from `TypeLowering`
The `field_remapping` field of `TypeLowering` has been unused since #121665. This PR removes it, then replaces the `TypeLowering` struct with its only remaining member `&'ll Type`.
std support for wasm32 panic=unwind
Tracking issue: #118168
This adds std support for `-Cpanic=unwind` on wasm, and with it slightly more fleshed out rustc support. Now, the stable default is still panic=abort without exception-handling, but if you `-Zbuild-std` with `RUSTFLAGS=-Cpanic=unwind`, you get wasm exception-handling try/catch blocks in the binary:
```rust
#[no_mangle]
pub fn foo_bar(x: bool) -> *mut u8 {
let s = Box::<str>::from("hello");
maybe_panic(x);
Box::into_raw(s).cast()
}
#[inline(never)]
#[no_mangle]
fn maybe_panic(x: bool) {
if x {
panic!("AAAAA");
}
}
```
```wat
;; snip...
(try $label$5
(do
(call $maybe_panic
(local.get $0)
)
(br $label$1)
)
(catch_all
(global.set $__stack_pointer
(local.get $1)
)
(call $__rust_dealloc
(local.get $2)
(i32.const 5)
(i32.const 1)
)
(rethrow $label$5)
)
)
;; snip...
```
Stop using LLVM struct types for byval/sret
For `byval` and `sret`, the type has no semantic meaning, only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout.
\*: The alignment would matter, if we didn't explicitly specify it. From what I can tell, we always specified the alignment for `sret`; for `byval`, we didn't until #112157.
†: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue.
Split out from #121577.
r? `@nikic`
Add asm goto support to `asm!`
Tracking issue: #119364
This PR implements asm-goto support, using the syntax described in "future possibilities" section of [RFC2873](https://rust-lang.github.io/rfcs/2873-inline-asm.html#asm-goto).
Currently I have only implemented the `label` part, not the `fallthrough` part (i.e. fallthrough is implicit). This doesn't reduce the expressive though, since you can use label-break to get arbitrary control flow or simply set a value and rely on jump threading optimisation to get the desired control flow. I can add that later if deemed necessary.
r? ``@Amanieu``
cc ``@ojeda``
Introduces the `arm64ec-pc-windows-msvc` target for building Arm64EC ("Emulation Compatible") binaries for Windows.
For more information about Arm64EC see <https://learn.microsoft.com/en-us/windows/arm/arm64ec>.
Tier 3 policy:
> A tier 3 target must have a designated developer or developers (the "target maintainers") on record to be CCed when issues arise regarding the target. (The mechanism to track and CC such developers may evolve over time.)
I will be the maintainer for this target.
> Targets must use naming consistent with any existing targets; for instance, a target for the same CPU or OS as an existing Rust target should use the same name for that CPU or OS. Targets should normally use the same names and naming conventions as used elsewhere in the broader ecosystem beyond Rust (such as in other toolchains), unless they have a very good reason to diverge. Changing the name of a target can be highly disruptive, especially once the target reaches a higher tier, so getting the name right is important even for a tier 3 target.
Target uses the `arm64ec` architecture to match LLVM and MSVC, and the `-pc-windows-msvc` suffix to indicate that it targets Windows via the MSVC environment.
> Target names should not introduce undue confusion or ambiguity unless absolutely necessary to maintain ecosystem compatibility. For example, if the name of the target makes people extremely likely to form incorrect beliefs about what it targets, the name should be changed or augmented to disambiguate it.
Target name exactly specifies the type of code that will be produced.
> If possible, use only letters, numbers, dashes and underscores for the name. Periods (.) are known to cause issues in Cargo.
Done.
> Tier 3 targets may have unusual requirements to build or use, but must not create legal issues or impose onerous legal terms for the Rust project or for Rust developers or users.
> The target must not introduce license incompatibilities.
Uses the same dependencies, requirements and licensing as the other `*-pc-windows-msvc` targets.
> Anything added to the Rust repository must be under the standard Rust license (MIT OR Apache-2.0).
Understood.
> The target must not cause the Rust tools or libraries built for any other host (even when supporting cross-compilation to the target) to depend on any new dependency less permissive than the Rust licensing policy. This applies whether the dependency is a Rust crate that would require adding new license exceptions (as specified by the tidy tool in the rust-lang/rust repository), or whether the dependency is a native library or binary. In other words, the introduction of the target must not cause a user installing or running a version of Rust or the Rust tools to be subject to any new license requirements.
> Compiling, linking, and emitting functional binaries, libraries, or other code for the target (whether hosted on the target itself or cross-compiling from another target) must not depend on proprietary (non-FOSS) libraries. Host tools built for the target itself may depend on the ordinary runtime libraries supplied by the platform and commonly used by other applications built for the target, but those libraries must not be required for code generation for the target; cross-compilation to the target must not require such libraries at all. For instance, rustc built for the target may depend on a common proprietary C runtime library or console output library, but must not depend on a proprietary code generation library or code optimization library. Rust's license permits such combinations, but the Rust project has no interest in maintaining such combinations within the scope of Rust itself, even at tier 3.
> "onerous" here is an intentionally subjective term. At a minimum, "onerous" legal/licensing terms include but are not limited to: non-disclosure requirements, non-compete requirements, contributor license agreements (CLAs) or equivalent, "non-commercial"/"research-only"/etc terms, requirements conditional on the employer or employment of any particular Rust developers, revocable terms, any requirements that create liability for the Rust project or its developers or users, or any requirements that adversely affect the livelihood or prospects of the Rust project or its developers or users.
Uses the same dependencies, requirements and licensing as the other `*-pc-windows-msvc` targets.
> Neither this policy nor any decisions made regarding targets shall create any binding agreement or estoppel by any party. If any member of an approving Rust team serves as one of the maintainers of a target, or has any legal or employment requirement (explicit or implicit) that might affect their decisions regarding a target, they must recuse themselves from any approval decisions regarding the target's tier status, though they may otherwise participate in discussions.
> This requirement does not prevent part or all of this policy from being cited in an explicit contract or work agreement (e.g. to implement or maintain support for a target). This requirement exists to ensure that a developer or team responsible for reviewing and approving a target does not face any legal threats or obligations that would prevent them from freely exercising their judgment in such approval, even if such judgment involves subjective matters or goes beyond the letter of these requirements.
Understood, I am not a member of the Rust team.
> Tier 3 targets should attempt to implement as much of the standard libraries as possible and appropriate (core for most targets, alloc for targets that can support dynamic memory allocation, std for targets with an operating system or equivalent layer of system-provided functionality), but may leave some code unimplemented (either unavailable or stubbed out as appropriate), whether because the target makes it impossible to implement or challenging to implement. The authors of pull requests are not obligated to avoid calling any portions of the standard library on the basis of a tier 3 target not implementing those portions.
Both `core` and `alloc` are supported.
Support for `std` dependends on making changes to the standard library, `stdarch` and `backtrace` which cannot be done yet as the bootstrapping compiler raises a warning ("unexpected `cfg` condition value") for `target_arch = "arm64ec"`.
> The target must provide documentation for the Rust community explaining how to build for the target, using cross-compilation if possible. If the target supports running binaries, or running tests (even if they do not pass), the documentation must explain how to run such binaries or tests for the target, using emulation if possible or dedicated hardware if necessary.
Documentation is provided in src/doc/rustc/src/platform-support/arm64ec-pc-windows-msvc.md
> Tier 3 targets must not impose burden on the authors of pull requests, or other developers in the community, to maintain the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on a tier 3 target. Do not send automated messages or notifications (via any medium, including via @) to a PR author or others involved with a PR regarding a tier 3 target, unless they have opted into such messages.
> Backlinks such as those generated by the issue/PR tracker when linking to an issue or PR are not considered a violation of this policy, within reason. However, such messages (even on a separate repository) must not generate notifications to anyone involved with a PR who has not requested such notifications.
> Patches adding or updating tier 3 targets must not break any existing tier 2 or tier 1 target, and must not knowingly break another tier 3 target without approval of either the compiler team or the maintainers of the other tier 3 target.
> In particular, this may come up when working on closely related targets, such as variations of the same architecture with different features. Avoid introducing unconditional uses of features that another variation of the target may not have; use conditional compilation or runtime detection, as appropriate, to let each target run code supported by that target.
Understood.
Rollup of 5 pull requests
Successful merges:
- #120761 (Add initial support for DataFlowSanitizer)
- #121622 (Preserve same vtable pointer when cloning raw waker, to fix Waker::will_wake)
- #121716 (match lowering: Lower bindings in a predictable order)
- #121731 (Now that inlining, mir validation and const eval all use reveal-all, we won't be constraining hidden types here anymore)
- #121841 (`f16` and `f128` step 2: intrinsics)
r? `@ghost`
`@rustbot` modify labels: rollup
`f16` and `f128` step 2: intrinsics
Continuation of https://github.com/rust-lang/rust/pull/121728, another portion of https://github.com/rust-lang/rust/pull/114607.
This PR adds `f16` and `f128` intrinsics, and hooks them up to both HIR and LLVM. This is all still unexposed to the frontend, which will probably be the next step. Also update itanium mangling per `@rcvalle's` in https://github.com/rust-lang/rust/pull/121728/files#r1506570300, and fix a typo from step 1.
Once these types are usable in code, I will add the codegen tests from #114607 (codegen is passing on that branch)
This does add more `unimplemented!`s to Clippy, but I still don't think we can do better until library support is added.
r? `@compiler-errors`
cc `@Nilstrieb`
`@rustbot` label +T-compiler +F-f16_and_f128
Adds initial support for DataFlowSanitizer to the Rust compiler. It
currently supports `-Zsanitizer-dataflow-abilist`. Additional options
for it can be passed to LLVM command line argument processor via LLVM
arguments using `llvm-args` codegen option (e.g.,
`-Cllvm-args=-dfsan-combine-pointer-labels-on-load=false`).
Add stubs in IR and ABI for `f16` and `f128`
This is the very first step toward the changes in https://github.com/rust-lang/rust/pull/114607 and the [`f16` and `f128` RFC](https://rust-lang.github.io/rfcs/3453-f16-and-f128.html). It adds the types to `rustc_type_ir::FloatTy` and `rustc_abi::Primitive`, and just propagates those out as `unimplemented!` stubs where necessary.
These types do not parse yet so there is no feature gate, and it should be okay to use `unimplemented!`.
The next steps will probably be AST support with parsing and the feature gate.
r? `@compiler-errors`
cc `@Nilstrieb` suggested breaking the PR up in https://github.com/rust-lang/rust/pull/120645#issuecomment-1925900572
Remove useless lifetime of ArchiveBuilder
`trait ArchiveBuilder<'a>` has a seemingly useless lifetime a, so I remove it. If this is intentional, please reject this PR.
```rust
pub trait ArchiveBuilder<'a> {
fn add_file(&mut self, path: &Path);
fn add_archive(
&mut self,
archive: &Path,
skip: Box<dyn FnMut(&str) -> bool + 'static>,
) -> io::Result<()>;
fn build(self: Box<Self>, output: &Path) -> bool;
}
```
rename 'try' intrinsic to 'catch_unwind'
The intrinsic has nothing to do with `try` blocks, and corresponds to the stable `catch_unwind` function, so this makes a lot more sense IMO.
Also rename Miri's special function while we are at it, to reflect the level of abstraction it works on: it's an unwinding mechanism, on which Rust implements panics.
Add newtypes for bool fields/params/return types
Fixed all the cases of this found with some simple searches for `*/ bool` and `bool /*`; probably many more
Improve codegen diagnostic handling
Clarify the workings of the temporary `Diagnostic` type used to send diagnostics from codegen threads to the main thread.
r? `@estebank`
First, introduce a typedef `DiagnosticArgMap`.
Second, make the `args` field public, and remove the `args` getter and
`replace_args` setter. These were necessary previously because the getter
had a `#[allow(rustc::potential_query_instability)]` attribute, but that
was removed in #120931 when the args were changed from `FxHashMap` to
`FxIndexMap`. (All the other `Diagnostic` fields are public.)
Add "algebraic" fast-math intrinsics, based on fast-math ops that cannot return poison
Setting all of LLVM's fast-math flags makes our fast-math intrinsics very dangerous, because some inputs are UB. This set of flags permits common algebraic transformations, but according to the [LangRef](https://llvm.org/docs/LangRef.html#fastmath), only the flags `nnan` (no nans) and `ninf` (no infs) can produce poison.
And this uses the algebraic float ops to fix https://github.com/rust-lang/rust/issues/120720
cc `@orlp`