It's crazy to have the integer methods in something close to random
order.
The reordering makes the gaps clear: `const_i64`, `const_i128`,
`const_isize`, and `const_u16`. I guess they just aren't needed.
In `get_fn` there is a complicated set of if/elses to determine if
`hidden` visibility should be applied. There are five calls to
`LLVMRustSetVisibility` and some repetition in the comments.
This commit streamlines it a bit:
- Computes `hidden` and then uses it to determine if a single call to
`LLVMRustSetVisibility` occurs.
- Converts some of the if/elses into boolean expressions.
- Removes the repetitive comments.
Overall this makes it quite a bit shorter, and I find it easier to read.
Supertraits of `BuilderMethods` are all called `XyzBuilderMethods`.
Supertraits of `CodegenMethods` are all called `XyzMethods`. This commit
changes the latter to `XyzCodegenMethods`, for consistency.
It has `Backend` and `Deref` boudns, plus an associated type
`CodegenCx`, and it has a single use. This commit "inlines" it into
`BuilderMethods`, which makes the complicated backend trait situation a
little simpler.
Use -0.0 in `intrinsics::simd::reduce_add_unordered`
-0.0 is the actual neutral additive float, not +0.0, and this matters to codegen.
try-job: aarch64-gnu
deprecate -Csoft-float because it is unsound (and not fixable)
See https://github.com/rust-lang/rust/issues/129893 for details. The general sentiment there seems to be that this flag has no use and sound alternatives exist, so let's add this warning and see if anyone out there disagrees.
Also show a different warning on targets where it does nothing (as documented since https://github.com/rust-lang/rust/pull/36261): it seems to correspond to `-mfloat-abi` in GCC/clang, which is an ARM-specific option. To be really sure it does nothing, only forward the flag to LLVM for eabihf targets. This should not change behavior but makes me sleep better ;)
Don't leave debug locations for constants sitting on the builder indefinitely
Because constants are currently emitted *before* the prologue, leaving the debug location on the IRBuilder spills onto other instructions in the prologue and messes up both line numbers as well as the point LLVM chooses to be the prologue end.
Example LLVM IR (irrelevant IR elided):
Before:
```
define internal { i64, i64 } `@_ZN3tmp3Foo18var_return_opt_try17he02116165b0fc08cE(ptr` align 8 %self) !dbg !347 { start:
%self.dbg.spill = alloca [8 x i8], align 8
%_0 = alloca [16 x i8], align 8
%residual.dbg.spill = alloca [0 x i8], align 1
#dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357)
store ptr %self, ptr %self.dbg.spill, align 8, !dbg !357
#dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358)
```
After:
```
define internal { i64, i64 } `@_ZN3tmp3Foo18var_return_opt_try17h00b17d08874ddd90E(ptr` align 8 %self) !dbg !347 { start:
%self.dbg.spill = alloca [8 x i8], align 8
%_0 = alloca [16 x i8], align 8
%residual.dbg.spill = alloca [0 x i8], align 1
#dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357)
store ptr %self, ptr %self.dbg.spill, align 8
#dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358)
```
Note in particular how !357 from %residual.dbg.spill's dbg_declare no longer falls through onto the store to %self.dbg.spill. This fixes argument values at entry when the constant is a ZST (e.g. `<Option as Try>::Residual`). This fixes#130003 (but note that it does *not* fix issues with argument values and non-ZST constants, which emit their own stores that have debug info on them, like #128945).
r? `@michaelwoerister`
Simplify some nested `if` statements
Applies some but not all instances of `clippy::collapsible_if`. Some ended up looking worse afterwards, though, so I left those out. Also applies instances of `clippy::collapsible_else_if`
Review with whitespace disabled please.
Add -Z small-data-threshold
This flag allows specifying the threshold size above which LLVM should not consider placing small objects in a `.sdata` or `.sbss` section.
Support is indicated in the target options via the
small-data-threshold-support target option, which can indicate either an
LLVM argument or an LLVM module flag. To avoid duplicate specifications
in a large number of targets, the default value for support is
DefaultForArch, which is translated to a concrete value according to the
target's architecture.
This flag allows specifying the threshold size above which LLVM should
not consider placing small objects in a .sdata or .sbss section.
Support is indicated in the target options via the
small-data-threshold-support target option, which can indicate either an
LLVM argument or an LLVM module flag. To avoid duplicate specifications
in a large number of targets, the default value for support is
DefaultForArch, which is translated to a concrete value according to the
target's architecture.
s390x: Fix a regression related to backchain feature
In #127506, we introduced a new IBM Z-specific target feature, `backchain`.
This particular `target-feature` was available as a function-level attribute in LLVM 17 and below, so some hacks were used to avoid blowing up LLVM when querying the supported LLVM features.
This led to an unfortunate regression where `cfg!(target-feature = "backchain")` will always return true.
This pull request aims to fix this issue, and a test has been introduced to ensure it will never happen again.
Fixes#129927.
r? `@RalfJung`
Do not request sanitizers for naked functions
Naked functions can only contain inline asm, so any instrumentation inserted by sanitizers is illegal. Don't request it.
Fixes https://github.com/rust-lang/rust/issues/129224.
Because constants are currently emitted *before* the prologue, leaving the
debug location on the IRBuilder spills onto other instructions in the prologue
and messes up both line numbers as well as the point LLVM chooses to be the
prologue end.
Example LLVM IR (irrelevant IR elided):
Before:
define internal { i64, i64 } @_ZN3tmp3Foo18var_return_opt_try17he02116165b0fc08cE(ptr align 8 %self) !dbg !347 {
start:
%self.dbg.spill = alloca [8 x i8], align 8
%_0 = alloca [16 x i8], align 8
%residual.dbg.spill = alloca [0 x i8], align 1
#dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357)
store ptr %self, ptr %self.dbg.spill, align 8, !dbg !357
#dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358)
After:
define internal { i64, i64 } @_ZN3tmp3Foo18var_return_opt_try17h00b17d08874ddd90E(ptr align 8 %self) !dbg !347 {
start:
%self.dbg.spill = alloca [8 x i8], align 8
%_0 = alloca [16 x i8], align 8
%residual.dbg.spill = alloca [0 x i8], align 1
#dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357)
store ptr %self, ptr %self.dbg.spill, align 8
#dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358)
Note in particular how !357 from %residual.dbg.spill's dbg_declare no longer
falls through onto the store to %self.dbg.spill. This fixes argument values
at entry when the constant is a ZST (e.g. <Option as Try>::Residual). This
fixes#130003 (but note that it does *not* fix issues with argument values and
non-ZST constants, which emit their own stores that have debug info on them,
like #128945).
Make `Ty::boxed_ty` return an `Option`
Looks like a good place to use Rust's type system.
---
Most of 4ac7bcbaad/compiler/rustc_middle/src/ty/sty.rs (L971-L1963) looks like it could be moved to `TyKind` (then I guess `Ty` should be made to deref to `TyKind`).
Don't emit `expect`/`assume` in opt-level=0
LLVM does not make use of expect/assume calls in `opt-level=0`, so we can simplify IR by not emitting them in this case.
rustc_target: Add various aarch64 features
Add various aarch64 features already supported by LLVM and Linux.
Additionally include some comment fixes to ensure consistency of feature names with the Arm ARM.
Compiler support for features added to stdarch by https://github.com/rust-lang/stdarch/pull/1614.
Tracking issue for unstable aarch64 features is https://github.com/rust-lang/rust/issues/127764.
List of added features:
- FEAT_CSSC
- FEAT_ECV
- FEAT_FAMINMAX
- FEAT_FLAGM2
- FEAT_FP8
- FEAT_FP8DOT2
- FEAT_FP8DOT4
- FEAT_FP8FMA
- FEAT_HBC
- FEAT_LSE128
- FEAT_LSE2
- FEAT_LUT
- FEAT_MOPS
- FEAT_LRCPC3
- FEAT_SVE_B16B16
- FEAT_SVE2p1
- FEAT_WFxT
- FEAT_SME
- FEAT_SME_F16F16
- FEAT_SME_F64F64
- FEAT_SME_F8F16
- FEAT_SME_F8F32
- FEAT_SME_FA64
- FEAT_SME_I16I64
- FEAT_SME_LUTv2
- FEAT_SME2
- FEAT_SME2p1
- FEAT_SSVE_FP8DOT2
- FEAT_SSVE_FP8DOT4
- FEAT_SSVE_FP8FMA
FEAT_FPMR is added in the first commit and then removed in a separate one to highlight it being removed from upstream LLVM 19. The intention is for it to be detectable at runtime through stdarch but not have a corresponding Rust compile-time feature.
LLVM uses the word "code" to refer to a particular kind of coverage mapping.
This unrelated usage of the word is confusing, and makes it harder to introduce
types whose names correspond to the LLVM classification of coverage kinds.
Convert to_llvm_features to return Option<LLVMFeature> so that it can
return None if the requested feature is not available for the current
LLVM version.
Add match rules to filter out aarch64 features not available in LLVM 17.
FEAT_FPMR has been removed from upstream LLVM as of LLVM 19.
Remove the feature from the target features list and temporarily hack
the LLVM codegen to always enable it until the minimum LLVM version is
bumped to 19.
Add various aarch64 features already supported by LLVM and Linux.
The features are marked as unstable using a newly added symbol, i.e.
aarch64_unstable_target_feature.
Additionally include some comment fixes to ensure consistency of
feature names with the Arm ARM and support for architecture version
target features up to v9.5a.
This commit adds compiler support for the following features:
- FEAT_CSSC
- FEAT_ECV
- FEAT_FAMINMAX
- FEAT_FLAGM2
- FEAT_FP8
- FEAT_FP8DOT2
- FEAT_FP8DOT4
- FEAT_FP8FMA
- FEAT_FPMR
- FEAT_HBC
- FEAT_LSE128
- FEAT_LSE2
- FEAT_LUT
- FEAT_MOPS
- FEAT_LRCPC3
- FEAT_SVE_B16B16
- FEAT_SVE2p1
- FEAT_WFxT
Add `f16` and `f128` inline ASM support for `aarch64`
Adds `f16` and `f128` inline ASM support for `aarch64`. SIMD vector types are taken from [the ARM intrinsics list](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:`@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&f:@navigationhierarchiesarchitectures=[A64]).` Based on the work of `@lengrongfu` in #127043.
Relevant issue: #125398
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
try-job: aarch64-gnu
try-job: aarch64-apple
simd_shuffle intrinsic: allow argument to be passed as vector
See https://github.com/rust-lang/rust/issues/128738 for context.
I'd like to get rid of [this hack](6c0b89dfac/compiler/rustc_codegen_ssa/src/mir/block.rs (L922-L935)). https://github.com/rust-lang/rust/pull/128537 almost lets us do that since constant SIMD vectors will then be passed as immediate arguments. However, simd_shuffle for some reason actually takes an *array* as argument, not a vector, so the hack is still required to ensure that the array becomes an immediate (which then later stages of codegen convert into a vector, as that's what LLVM needs).
This PR prepares simd_shuffle to also support a vector as the `idx` argument. Once this lands, stdarch can hopefully be updated to pass `idx` as a vector, and then support for arrays can be removed, which finally lets us get rid of that hack.
Add `#[warn(unreachable_pub)]` to a bunch of compiler crates
By default `unreachable_pub` identifies things that need not be `pub` and tells you to make them `pub(crate)`. But sometimes those things don't need any kind of visibility. So they way I did these was to remove the visibility entirely for each thing the lint identifies, and then add `pub(crate)` back in everywhere the compiler said it was necessary. (Or occasionally `pub(super)` when context suggested that was appropriate.) Tedious, but results in more `pub` removal.
There are plenty more crates to do but this seems like enough for a first PR.
r? `@compiler-errors`
Set the cfi-normalize-integers and kcfi-offset module flags when
Control-Flow Integrity sanitizers are used, so functions generated by
the LLVM backend use the same CFI/KCFI options as rustc.
cfi-normalize-integers tells LLVM to also use integer normalization
for generated functions when -Zsanitizer-cfi-normalize-integers is
used.
kcfi-offset specifies the number of prefix nops between the KCFI
type hash and the function entry when -Z patchable-function-entry is
used. Note that LLVM assumes all indirectly callable functions use the
same number of prefix NOPs with -Zsanitizer=kcfi.
Avoid extra `cast()`s after `CStr::as_ptr()`
These used to be `&str` literals that did need a pointer cast, but that
became a no-op after switching to `c""` literals in #118566.
Special case DUMMY_SP to emit line 0/column 0 locations on DWARF platforms.
Line 0 has a special meaning in DWARF. From the version 5 spec:
The compiler may emit the value 0 in cases
where an instruction cannot be attributed to any
source line.
DUMMY_SP spans cannot be attributed to any line. However, because rustc internally stores line numbers starting at zero, lookup_debug_loc() adjusts every line number by one. Special casing DUMMY_SP to actually emit line 0 ensures rustc communicates to the debugger that there's no meaningful source code for this instruction, rather than telling the debugger to jump to line 1 randomly.
With the new resolver, a few dependencies get brought in twice with
different licenses. For example, all dependencies from `wasm-tools`
gained Apache-2.0 and MIT options, and with the v2 resolver we were
using one version from before and one version from after this change.
This made tidy's license check difficult.
Update some minimum versions to remove duplicate dependencies and smooth
out license checking.
Fix `is_val_statically_known` for floats
The LLVM intrinsic name for floats differs from the LLVM type name, so handle them explicitly. Also adds support for `f16` and `f128`.
`f16`/`f128` tracking issue: #116909
Rework MIR inlining debuginfo so function parameters show up in debuggers.
Line numbers of multiply-inlined functions were fixed in #114643 by using a single DISubprogram. That, however, triggered assertions because parameters weren't deduplicated. The "solution" to that in #115417 was to insert a DILexicalScope below the DISubprogram and parent all of the parameters to that scope. That fixed the assertion, but debuggers (including gdb and lldb) don't recognize variables that are not parented to the subprogram itself as parameters, even if they are emitted with DW_TAG_formal_parameter.
Consider the program:
```rust
use std::env;
#[inline(always)]
fn square(n: i32) -> i32 {
n * n
}
#[inline(never)]
fn square_no_inline(n: i32) -> i32 {
n * n
}
fn main() {
let x = square(env::vars().count() as i32);
let y = square_no_inline(env::vars().count() as i32);
println!("{x} == {y}");
}
```
When making a release build with debug=2 and rustc 1.82.0-nightly (8b3870784 2024-08-07)
```
(gdb) r
Starting program: /ephemeral/tmp/target/release/tmp [Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, tmp::square () at src/main.rs:5
5 n * n
(gdb) info args
No arguments.
(gdb) info locals
n = 31
(gdb) c
Continuing.
Breakpoint 2, tmp::square_no_inline (n=31) at src/main.rs:10
10 n * n
(gdb) info args
n = 31
(gdb) info locals
No locals.
```
This issue is particularly annoying because it removes arguments from stack traces.
The DWARF for the inlined function looks like this:
```
< 2><0x00002132 GOFF=0x00002132> DW_TAG_subprogram
DW_AT_linkage_name _ZN3tmp6square17hc507052ff3d2a488E
DW_AT_name square
DW_AT_decl_file 0x0000000f /ephemeral/tmp/src/main.rs
DW_AT_decl_line 0x00000004
DW_AT_type 0x00001a56<.debug_info+0x00001a56>
DW_AT_inline DW_INL_inlined
< 3><0x00002142 GOFF=0x00002142> DW_TAG_lexical_block
< 4><0x00002143 GOFF=0x00002143> DW_TAG_formal_parameter
DW_AT_name n
DW_AT_decl_file 0x0000000f /ephemeral/tmp/src/main.rs
DW_AT_decl_line 0x00000004
DW_AT_type 0x00001a56<.debug_info+0x00001a56>
< 4><0x0000214e GOFF=0x0000214e> DW_TAG_null
< 3><0x0000214f GOFF=0x0000214f> DW_TAG_null
```
That DW_TAG_lexical_block inhibits every debugger I've tested from recognizing 'n' as a parameter.
This patch removes the additional lexical scope. Parameters can be easily deduplicated by a tuple of their scope and the argument index, at the trivial cost of taking a Hash + Eq bound on DIScope.
Use the `enum2$` Natvis visualiser for repr128 C-style enums
Use the preexisting `enum2$` Natvis visualiser to allow PDB debuggers to display fieldless `#[repr(u128)]]`/`#[repr(i128)]]` enums correctly.
Tracking issue: #56071
try-job: x86_64-msvc
Shrink `TyKind::FnPtr`.
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.
r? `@compiler-errors`
Line numbers of multiply-inlined functions were fixed in #114643 by using a
single DISubprogram. That, however, triggered assertions because parameters
weren't deduplicated. The "solution" to that in #115417 was to insert a
DILexicalScope below the DISubprogram and parent all of the parameters to that
scope. That fixed the assertion, but debuggers (including gdb and lldb) don't
recognize variables that are not parented to the subprogram itself as parameters,
even if they are emitted with DW_TAG_formal_parameter.
Consider the program:
use std::env;
fn square(n: i32) -> i32 {
n * n
}
fn square_no_inline(n: i32) -> i32 {
n * n
}
fn main() {
let x = square(env::vars().count() as i32);
let y = square_no_inline(env::vars().count() as i32);
println!("{x} == {y}");
}
When making a release build with debug=2 and rustc 1.82.0-nightly (8b3870784 2024-08-07)
(gdb) r
Starting program: /ephemeral/tmp/target/release/tmp
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, tmp::square () at src/main.rs:5
5 n * n
(gdb) info args
No arguments.
(gdb) info locals
n = 31
(gdb) c
Continuing.
Breakpoint 2, tmp::square_no_inline (n=31) at src/main.rs:10
10 n * n
(gdb) info args
n = 31
(gdb) info locals
No locals.
This issue is particularly annoying because it removes arguments from stack traces.
The DWARF for the inlined function looks like this:
< 2><0x00002132 GOFF=0x00002132> DW_TAG_subprogram
DW_AT_linkage_name _ZN3tmp6square17hc507052ff3d2a488E
DW_AT_name square
DW_AT_decl_file 0x0000000f /ephemeral/tmp/src/main.rs
DW_AT_decl_line 0x00000004
DW_AT_type 0x00001a56<.debug_info+0x00001a56>
DW_AT_inline DW_INL_inlined
< 3><0x00002142 GOFF=0x00002142> DW_TAG_lexical_block
< 4><0x00002143 GOFF=0x00002143> DW_TAG_formal_parameter
DW_AT_name n
DW_AT_decl_file 0x0000000f /ephemeral/tmp/src/main.rs
DW_AT_decl_line 0x00000004
DW_AT_type 0x00001a56<.debug_info+0x00001a56>
< 4><0x0000214e GOFF=0x0000214e> DW_TAG_null
< 3><0x0000214f GOFF=0x0000214f> DW_TAG_null
That DW_TAG_lexical_block inhibits every debugger I've tested from recognizing
'n' as a parameter.
This patch removes the additional lexical scope. Parameters can be easily
deduplicated by a tuple of their scope and the argument index, at the trivial
cost of taking a Hash + Eq bound on DIScope.
const vector passed through to codegen
This allows constant vectors using a repr(simd) type to be propagated
through to the backend by reusing the functionality used to do a similar
thing for the simd_shuffle intrinsic
#118209
r? RalfJung
nontemporal_store: make sure that the intrinsic is truly just a hint
The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](https://github.com/llvm/llvm-project/issues/64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.
~~Blocked on https://github.com/rust-lang/stdarch/pull/1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~
Fixes https://github.com/rust-lang/rust/issues/114582
Cc `@Amanieu` `@workingjubilee`
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and
`FnHeader`, which can be packed more efficiently. This reduces the size
of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms.
This reduces peak memory usage by a few percent on some benchmarks. It
also reduces cache misses and page faults similarly, though this doesn't
translate to clear cycles or wall-time improvements on CI.
codegen: better centralize function declaration attribute computation
For some reason, the codegen backend has two functions that compute which attributes a function declaration gets: `apply_attrs_llfn` and `attributes::from_fn_attrs`. They are called in different places, on entirely different layers of abstraction.
To me the code seems cleaner if we centralize this entirely in `apply_attrs_llfn`, so that's what this PR does.
Make create_dll_import_lib easier to implement
This will make it easier to implement raw-dylib support in cg_clif and cg_gcc. This PR doesn't yet include an create_dll_import_lib implementation for cg_clif as I need to correctly implement dllimport in cg_clif first before raw-dylib can work at all with cg_clif.
Required for https://github.com/rust-lang/rustc_codegen_cranelift/issues/1345
Add `f16` and `f128` math functions
This adds intrinsics and math functions for `f16` and `f128` floating point types. Support is quite limited and some things are broken so tests don't run on many platforms, but this provides a starting point.
Line 0 has a special meaning in DWARF. From the version 5 spec:
The compiler may emit the value 0 in cases
where an instruction cannot be attributed to any
source line.
DUMMY_SP spans cannot be attributed to any line. However, because rustc
internally stores line numbers starting at zero, lookup_debug_loc() adjusts
every line number by one. Special casing DUMMY_SP to actually emit line 0
ensures rustc communicates to the debugger that there's no meaningful source
code for this instruction, rather than telling the debugger to jump to line 1
randomly.
Since LLVM <https://reviews.llvm.org/D99439> (4c7f820b2b20, "Update
@llvm.powi to handle different int sizes for the exponent"), the size of
the integer can be specified for the `powi` intrinsic. Make use of this
so it is more obvious that integer size is consistent across all float
types.
This feature is available since LLVM 13 (October 2021). Based on
bootstrap we currently support >= 17.0, so there should be no support
problems.
When an archive fails to build, print the path
Currently the output on failure is as follows:
Compiling block-buffer v0.10.4
Compiling crypto-common v0.1.6
Compiling digest v0.10.7
Compiling sha2 v0.10.8
Compiling xz2 v0.1.7
error: failed to build archive: No such file or directory
error: could not compile `bootstrap` (lib) due to 1 previous error
Change this to print which file is being constructed, to give some hint about what is going on.
error: failed to build archive at `path/to/output`: No such file or directory
Rollup of 4 pull requests
Successful merges:
- #127574 (elaborate unknowable goals)
- #128141 (Set branch protection function attributes)
- #128315 (Fix vita build of std and forbid unsafe in unsafe in the os/vita module)
- #128339 ([rustdoc] Make the buttons remain when code example is clicked)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `select_unpredictable` to force LLVM to use CMOV
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs into branches if it comes from a `select` marked with an `unpredictable` metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits such a `select` and uses it in the implementation of `binary_search_by`.
Set branch protection function attributes
Since LLVM 19, it is necessary to set not only module flags, but also function attributes for branch protection on aarch64. See e15d67cfc2 for the relevant LLVM change.
Fixes https://github.com/rust-lang/rust/issues/127829.
Update compiler_builtins to 0.1.114
The `weak-intrinsics` feature was removed from compiler_builtins in https://github.com/rust-lang/compiler-builtins/pull/598, so dropped the `compiler-builtins-weak-intrinsics` feature from alloc/std/sysroot.
In https://github.com/rust-lang/compiler-builtins/pull/593, some builtins for f16/f128 were added. These don't work for all compiler backends, so add a `compiler-builtins-no-f16-f128` feature and disable it for cranelift and gcc.
deps: dedup object, wasmparser, wasm-encoder
* dedups one `object`, additional dupe will be removed, with next `thorin-dwp` update
* `wasmparser` pinned to minor versions, so full merge isn't possible
* same with `wasm-encoder`
Turned off some features for `wasmparser` (see features https://github.com/bytecodealliance/wasm-tools/blob/v1.208.1/crates/wasmparser/Cargo.toml) in `run-make-support`, looks working?
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs
into branches if it comes from a `select` marked with an `unpredictable`
metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits
such a `select` and uses it in the implementation of `binary_search_by`.
compiler: Never debug_assert in codegen
In the name of Turing and his Hoarey heralds, assert our truths before creating a monster!
The `rustc_codegen_llvm` and `rustc_codegen_ssa` crates are fairly critical for rustc's correctness. Small mistakes here can easily result in undefined behavior, since a "small mistake" can mean something like "link and execute the wrong code". We should probably run any and all asserts in these modules unconditionally on whether this is a "debug build", and damn the costs in performance.
...Especially because the costs in performance seem to be *nothing*. It is not clear how much correctness we gain here, but I'll take free correctness improvements.
Since LLVM 19, it is necessary to set not only module flags, but
also function attributes for branch protection on aarch64. See
e15d67cfc2
for the relevant LLVM change.
rustc_target: add known safe s390x target features
This pull request adds known safe target features for s390x (aka IBM Z systems).
Currently, these features are unstable since stabilizing the target features requires submitting proposals.
The `vector` feature was added in IBM Z13 (`arch11`), and this is a SIMD feature for the newer IBM Z systems.
The `backchain` attribute is the IBM Z way of adding frame pointers like unwinding capabilities (the "frame-pointer" switch on IBM Z and IBM POWER platforms will add _emulated_ frame pointers to the binary, which profilers can't use for unwinding the stack).
Both attributes can be applied at the LLVM module or function levels. However, the `backchain` attribute has to be enabled for all the functions in the call stack to get a successful unwind process.
Handle .init_array link_section specially on wasm
Given that wasm-ld now has support for [.init_array](8f2bd8ae68/llvm/lib/MC/WasmObjectWriter.cpp (L1852)), it appears we can easily implement that section by falling through to the normal path rather than taking the typical custom_section path for wasm.
The wasm-ld appears to have a bunch of limitations. Only one static with the `link_section` in a crate or else you hit the fatal error in the link above "only one .init_array section fragment supported". They do not get merged.
You can still call multiple constructors by setting it to an array.
```
unsafe extern "C" fn ctor() {
println!("foo");
}
#[used]
#[link_section = ".init_array"]
static FOO: [unsafe extern "C" fn(); 2] = [ctor, ctor];
```
Another issue appears to be that if crate *A* depends on crate *B*, but *A* doesn't call any symbols from *B* and *B* doesn't `#[export_name = ...]` any symbols, then crate *B*'s constructor will not be called. The workaround to this is to provide an exported symbol in crate *B*.
... this is a special attribute that was made to be a target-feature in
LLVM 18+, but in all previous versions, this "feature" is a naked
attribute. We will have to handle this situation differently than all
other target-features.
Sync ar_archive_writer to LLVM 18.1.3
From LLVM 15.0.0-rc3. This adds support for COFF archives containing Arm64EC object files and has various fixes for AIX big archive files.
Currently the output on failure is as follows:
Compiling block-buffer v0.10.4
Compiling crypto-common v0.1.6
Compiling digest v0.10.7
Compiling sha2 v0.10.8
Compiling xz2 v0.1.7
error: failed to build archive: No such file or directory
error: could not compile `bootstrap` (lib) due to 1 previous error
Print which file is being constructed to give some hint about what is
going on.
In the future, branch and MC/DC mappings might have expressions that don't
correspond to any single point in the control-flow graph. That makes it
trickier to keep track of which expressions should expect an `ExpressionUsed`
node.
We therefore sidestep that complexity by only performing `ExpressionUsed`
simplification for expressions associated directly with ordinary `Code`
mappings.
Fix incorrect NDEBUG handling in LLVM bindings
We currently compile our LLVM bindings using `-DNDEBUG` if debuginfo for LLVM is disabled. However, `NDEBUG` doesn't have any relation to debuginfo, it controls whether assertions are enabled.
Split the LLVM_NDEBUG environment variable into two, so that assertions and debuginfo are controlled independently.
After this change, `LLVMRustDIBuilderInsertDeclareAtEnd` triggers an assertion failure on LLVM 19 due to an incorrect cast. Fix it by removing the unused return value entirely.
r? `@cuviper`
The return value changed from an Instruction to a DbgRecord in
LLVM 19. As we don't actually use the result, drop the return
value entirely to support both.
Rollup of 8 pull requests
Successful merges:
- #124599 (Suggest borrowing on fn argument that is `impl AsRef`)
- #127572 (Don't mark `DEBUG_EVENT` struct as `repr(packed)`)
- #127588 (core: Limit remaining f16 doctests to x86_64 linux)
- #127591 (Make sure that labels are defined after the primary span in diagnostics)
- #127598 (Allows `#[diagnostic::do_not_recommend]` to supress trait impls in suggestions as well)
- #127599 (Rename `lazy_cell_consume` to `lazy_cell_into_inner`)
- #127601 (check is_ident before parse_ident)
- #127605 (Remove extern "wasm" ABI)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `f16` and `f128` as simd types in LLVM
`@sayantn` is working on adding SIMD for `f16` and hitting the `FloatingPointVector` error. This should fix it and unblock adding support for `simd_fma` and `simd_fabs` in stdarch.
Remove the unstable `extern "wasm"` ABI (`wasm_abi` feature tracked
in #83788).
As discussed in https://github.com/rust-lang/rust/pull/127513#issuecomment-2220410679
and following, this ABI is a failed experiment that did not end
up being used for anything. Keeping support for this ABI in LLVM 19
would require us to switch wasm targets to the `experimental-mv`
ABI, which we do not want to do.
It should be noted that `Abi::Wasm` was internally used for two
things: The `-Z wasm-c-abi=legacy` ABI that is still used by
default on some wasm targets, and the `extern "wasm"` ABI. Despite
both being `Abi::Wasm` internally, they were not the same. An
explicit `extern "wasm"` additionally enabled the `+multivalue`
feature.
I've opted to remove `Abi::Wasm` in this patch entirely, instead
of keeping it as an ABI with only internal usage. Both
`-Z wasm-c-abi` variants are now treated as part of the normal
C ABI, just with different different treatment in
adjust_for_foreign_abi.
Add Natvis visualiser and debuginfo tests for `f16`
To render `f16`s in debuggers on MSVC targets, this PR changes the compiler to output `f16`s as `struct f16 { bits: u16 }`, and includes a Natvis visualiser that manually converts the `f16`'s bits to a `float` which is can then be displayed by debuggers. `gdb`, `lldb` and `cdb` tests are also included for `f16` .
`f16`/`f128` MSVC debug info issue: #121837
Tracking issue: #116909
Miri function identity hack: account for possible inlining
Having a non-lifetime generic is not the only reason a function can be duplicated. Another possibility is that the function may be eligible for cross-crate inlining. So also take into account the inlining attribute in this Miri hack for function pointer identity.
That said, `cross_crate_inlinable` will still sometimes return true even for `inline(never)` functions:
- when they are `DefKind::Ctor(..) | DefKind::Closure` -- I assume those cannot be `InlineAttr::Never` anyway?
- when `cross_crate_inline_threshold == InliningThreshold::Always`
so maybe this is still not quite the right criterion to use for function pointer identity.
Re-implement a type-size based limit
r? lcnr
This PR reintroduces the type length limit added in #37789, which was accidentally made practically useless by the caching changes to `Ty::walk` in #72412, which caused the `walk` function to no longer walk over identical elements.
Hitting this length limit is not fatal unless we are in codegen -- so it shouldn't affect passes like the mir inliner which creates potentially very large types (which we observed, for example, when the new trait solver compiles `itertools` in `--release` mode).
This also increases the type length limit from `1048576 == 2 ** 20` to `2 ** 24`, which covers all of the code that can be reached with craterbot-check. Individual crates can increase the length limit further if desired.
Perf regression is mild and I think we should accept it -- reinstating this limit is important for the new trait solver and to make sure we don't accidentally hit more type-size related regressions in the future.
Fixes#125460
Since this codegen flag now only controls LLVM-generated comments rather than
all assembly comments, make the name more accurate (and also match Clang).
`-Z patchable-function-entry` works like `-fpatchable-function-entry`
on clang/gcc. The arguments are total nop count and function offset.
See MCP rust-lang/compiler-team#704
Deprecate no-op codegen option `-Cinline-threshold=...`
This deprecates `-Cinline-threshold` since using it has no effect. This has been the case since the new LLVM pass manager started being used, more than 2 years ago.
Recommend using `-Cllvm-args=--inline-threshold=...` instead.
Closes#89742 which is E-help-wanted.
Add `f16` inline ASM support for 32-bit ARM
Adds `f16` inline ASM support for 32-bit ARM. SIMD vector types are taken from [here](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:`@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&f:@navigationhierarchiesarchitectures=[A32]).`
Relevant issue: #125398
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
Add `f16` inline ASM support for RISC-V
This PR adds `f16` inline ASM support for RISC-V. A `FIXME` is left for `f128` support as LLVM does not support the required `Q` (Quad-Precision Floating-Point) extension yet.
Relevant issue: #125398
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
Honor collapse_debuginfo for statics.
fixes#126363
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.
This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using
r? <reviewer name>
-->
Also sort `crt-static` in `--print target-features` output
I didn't find `crt-static` at first (for `x86_64-unknown-linux-gnu`), because it was put at the bottom of the large and otherwise sorted list.
Fully sort the list before we print it.
Note that `llvm_target_features` starts out and remains sorted and does not need to be sorted an extra time.
On my machine the diff is just:
```diff
$ diff -u /tmp/before2.txt /tmp/after2.txt
--- /tmp/before2.txt 2024-06-13 20:40:27.091636592 +0200
+++ /tmp/after2.txt 2024-06-13 20:39:54.584894891 +0200
``@@`` -20,6 +20,7 ``@@``
bmi1 - Support BMI instructions.
bmi2 - Support BMI2 instructions.
cmpxchg16b - 64-bit with cmpxchg16b (this is true for most x86-64 chips, but not the first AMD chips).
+ crt-static - Enables C Run-time Libraries to be statically linked.
ermsb - REP MOVS/STOS are fast.
f16c - Support 16-bit floating point conversion instructions.
fma - Enable three-operand fused multiple-add.
``@@`` -49,7 +50,6 ``@@``
xsavec - Support xsavec instructions.
xsaveopt - Support xsaveopt instructions.
xsaves - Support xsaves instructions.
- crt-static - Enables C Run-time Libraries to be statically linked.
Code-generation features supported by LLVM for this target:
16bit-mode - 16-bit mode (i8086).
```
I couldn't find a ui test that tested this output. Let's see if CI finds a regression tests.
This deprecates `-Cinline-threshold` since using it has no effect. This
has been the case since the new LLVM pass manager started being used,
more than 2 years ago.
I didn't find `crt-static` at first (for `x86_64-unknown-linux-gnu`),
because it was put at the bottom the large and otherwise sorted list.
Fully sort the list before we print it.
Note that `llvm_target_features` starts out sorted and does not need to
be sorted an extra time.
We already do this for a number of crates, e.g. `rustc_middle`,
`rustc_span`, `rustc_metadata`, `rustc_span`, `rustc_errors`.
For the ones we don't, in many cases the attributes are a mess.
- There is no consistency about order of attribute kinds (e.g.
`allow`/`deny`/`feature`).
- Within attribute kind groups (e.g. the `feature` attributes),
sometimes the order is alphabetical, and sometimes there is no
particular order.
- Sometimes the attributes of a particular kind aren't even grouped
all together, e.g. there might be a `feature`, then an `allow`, then
another `feature`.
This commit extends the existing sorting to all compiler crates,
increasing consistency. If any new attribute line is added there is now
only one place it can go -- no need for arbitrary decisions.
Exceptions:
- `rustc_log`, `rustc_next_trait_solver` and `rustc_type_ir_macros`,
because they have no crate attributes.
- `rustc_codegen_gcc`, because it's quasi-external to rustc (e.g. it's
ignored in `rustfmt.toml`).
Directly add extension instead of using `Path::with_extension`
`Path::with_extension` has a nice footgun when the original path doesn't contain an extension: Anything after the last dot gets removed.
Make repr(packed) vectors work with SIMD intrinsics
In #117116 I fixed `#[repr(packed, simd)]` by doing the expected thing and removing padding from the layout. This should be the last step in providing a solution to rust-lang/portable-simd#319
Rollup of 6 pull requests
Successful merges:
- #125263 (rust-lld: fallback to rustc's sysroot if there's no path to the linker in the target sysroot)
- #125345 (rustc_codegen_llvm: add support for writing summary bitcode)
- #125362 (Actually use TAIT instead of emulating it)
- #125412 (Don't suggest adding the unexpected cfgs to the build-script it-self)
- #125445 (Migrate `run-make/rustdoc-with-short-out-dir-option` to `rmake.rs`)
- #125452 (Cleanup check-cfg handling in core and std)
r? `@ghost`
`@rustbot` modify labels: rollup
rustc_codegen_llvm: add support for writing summary bitcode
Typical uses of ThinLTO don't have any use for this as a standalone file, but distributed ThinLTO uses this to make the linker phase more efficient. With clang you'd do something like `clang -flto=thin -fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o (full of bitcode) and foo.indexing.o (just the summary or index part of the bitcode). That's then usable by a two-stage linking process that's more friendly to distributed build systems like bazel, which is why I'm working on this area.
I talked some to `@teresajohnson` about naming in this area, as things seem to be a little confused between various blog posts and build systems. "bitcode index" and "bitcode summary" tend to be a little too ambiguous, and she tends to use "thin link bitcode" and "minimized bitcode" (which matches the descriptions in LLVM). Since the clang option is thin-link-bitcode, I went with that to try and not add a new spelling in the world.
Per `@dtolnay,` you can work around the lack of this by using `lld --thinlto-index-only` to do the indexing on regular .o files of bitcode, but that is a bit wasteful on actions when we already have all the information in rustc and could just write out the matching minimized bitcode. I didn't test that at all in our infrastructure, because by the time I learned that I already had this patch largely written.