"trampolines", or "aliases (the default)) to allow targets to opt out of
the MergeFunctions LLVM pass. Also add a corresponding -Z option with
the same name and values.
This works around: https://github.com/rust-lang/rust/issues/57356
Motivation:
Basically, the problem is that the MergeFunctions pass, which rustc
currently enables by default at -O2 and -O3, and `extern "ptx-kernel"`
functions (specific to the NVPTX target) are currently not compatible
with each other. If the MergeFunctions pass is allowed to run, rustc can
generate invalid PTX assembly (i.e. a PTX file that is not accepted by
the native PTX assembler ptxas). Therefore we would like a way to opt
out of the MergeFunctions pass, which is what our target option does.
Related work:
The current behavior of rustc is to enable MergeFunctions at -O2 and -O3,
and also to enable the use of function aliases within MergeFunctions.
MergeFunctions both with and without function aliases is incompatible with
the NVPTX target.
clang's "solution" is to have a "-fmerge-functions" flag that opts in to
the MergeFunctions pass, but it is not enabled by default.
Instead disable creation of assumptions during inlining using an
LLVM opt flag.
The -Z arg-align-attributes option which previously controlled this
behavior is removed.
This appears to be called `cx16` in LLVM and a few other locations, but
the Intel Intrinsic Guide doesn't have a name for this and the CPU
manual from Intel only mentions `cmpxchg16b`, so that's the name chosen
here.
If the Rust LLVM fork is used, enable the -mergefunc-use-aliases
flag, which will create aliases for merged functions, rather than
inserting a call from one to the other.
A number of codegen tests needed to be adjusted, because functions
that previously fell below the thunk limit are now being merged.
Merging is prevented either using -C no-prepopulate-passes, or by
making the functions non-identical.
I expect that this is going to break something, somewhere, because
it isn't able to deal with aliases properly, but we won't find out
until we try :)
This fixes#52651.
Implement rotate using funnel shift on LLVM >= 7
Implement the rotate_left and rotate_right operations using
llvm.fshl and llvm.fshr if they are available (LLVM >= 7).
Originally I wanted to expose the funnel_shift_left and
funnel_shift_right intrinsics and implement rotate_left and
rotate_right on top of them. However, emulation of funnel
shifts requires emitting a conditional to check for zero shift
amount, which is not necessary for rotates. I was uncomfortable
doing that here, as I don't want to rely on LLVM to optimize
away that conditional (and for variable rotates, I'm not sure it
can). We should revisit that question when we raise our minimum
version requirement to LLVM 7 and don't need emulation code
anymore.
Fixes#52457.
Implement the rotate_left and rotate_right operations using
llvm.fshl and llvm.fshr if they are available (LLVM >= 7).
Originally I wanted to expose the funnel_shift_left and
funnel_shift_right intrinsics and implement rotate_left and
rotate_right on top of them. However, emulation of funnel
shifts requires emitting a conditional to check for zero shift
amount, which is not necessary for rotates. I was uncomfortable
doing that here, as I don't want to rely on LLVM to optimize
away that conditional (and for variable rotates, I'm not sure it
can). We should revisit that question when we raise our minimum
version requirement to LLVM 7 and don't need emulation code
anymore.
rustc: Prepare the `atomics` feature for wasm
This commit adds a few changes for atomic instructions on the
`wasm32-unknown-unknown` target. Atomic instructions are not yet stable in
WebAssembly itself but there are multiple implementations and LLVM has support
for the proposed instruction set, so let's work on exposing it!
Here there are a few inclusions:
* The `atomics` feature was whitelisted for LLVM, allowing code in Rust to
enable/disable/gate on this.
* The `singlethread` option is turned off for wasm when the `atomics` feature is
enabled. This means that by default wasm won't be lowering with atomics, but
when atomics are enabled globally we'll turn off single-threaded mode to
actually codegen atomics. This probably isn't what we'll want in the long term
but for now it should work.
* Finally the maximum atomic width is increased to 64 to reflect the current
wasm spec.
This commit adds a few changes for atomic instructions on the
`wasm32-unknown-unknown` target. Atomic instructions are not yet stable in
WebAssembly itself but there are multiple implementations and LLVM has support
for the proposed instruction set, so let's work on exposing it!
Here there are a few inclusions:
* The `atomics` feature was whitelisted for LLVM, allowing code in Rust to
enable/disable/gate on this.
* The `singlethread` option is turned off for wasm when the `atomics` feature is
enabled. This means that by default wasm won't be lowering with atomics, but
when atomics are enabled globally we'll turn off single-threaded mode to
actually codegen atomics. This probably isn't what we'll want in the long term
but for now it should work.
* Finally the maximum atomic width is increased to 64 to reflect the current
wasm spec.
This fixes a regression from #53031 where specifying `-C target-cpu=native` is
printing a lot of warnings from LLVM about `native` being an unknown CPU. It
turns out that `native` is indeed an unknown CPU and we have to perform a
mapping to an actual CPU name, but this mapping is only performed in one
location rather than all locations we inform LLVM about the target CPU.
This commit centralizes the mapping of `native` to LLVM's value of the native
CPU, ensuring that all locations we inform LLVM about the `target-cpu` it's
never `native`.
Closes#53322
- `dsp`: the subtarget supports the DSP (saturating arith. and such)
instructions
- `rclass`: target is a Cortex-R
Both features are useful to support ARM MCUs on `coresimd`.
Note: Cortex-R52 is the first Armv8-R with `neon` support