Add intrinsics `fmuladd{f16,f32,f64,f128}`. This computes `(a * b) +
c`, to be fused if the code generator determines that (i) the target
instruction set has support for a fused operation, and (ii) that the
fused operation is more efficient than the equivalent, separate pair
of `mul` and `add` instructions.
https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
MIRI support is included for f32 and f64.
The codegen_cranelift uses the `fma` function from libc, which is a
correct implementation, but without the desired performance semantic. I
think this requires an update to cranelift to expose a suitable
instruction in its IR.
I have not tested with codegen_gcc, but it should behave the same
way (using `fma` from libc).
llvm: replace some deprecated functions
`LLVMMDStringInContext` and `LLVMMDNodeInContext` are deprecated, replace them with `LLVMMDStringInContext2` and `LLVMMDNodeInContext2`.
Also replace `Value` with `Metadata` in some function signatures for better consistency.
Use -0.0 in `intrinsics::simd::reduce_add_unordered`
-0.0 is the actual neutral additive float, not +0.0, and this matters to codegen.
try-job: aarch64-gnu
simd_shuffle intrinsic: allow argument to be passed as vector
See https://github.com/rust-lang/rust/issues/128738 for context.
I'd like to get rid of [this hack](6c0b89dfac/compiler/rustc_codegen_ssa/src/mir/block.rs (L922-L935)). https://github.com/rust-lang/rust/pull/128537 almost lets us do that since constant SIMD vectors will then be passed as immediate arguments. However, simd_shuffle for some reason actually takes an *array* as argument, not a vector, so the hack is still required to ensure that the array becomes an immediate (which then later stages of codegen convert into a vector, as that's what LLVM needs).
This PR prepares simd_shuffle to also support a vector as the `idx` argument. Once this lands, stdarch can hopefully be updated to pass `idx` as a vector, and then support for arrays can be removed, which finally lets us get rid of that hack.
Since LLVM <https://reviews.llvm.org/D99439> (4c7f820b2b20, "Update
@llvm.powi to handle different int sizes for the exponent"), the size of
the integer can be specified for the `powi` intrinsic. Make use of this
so it is more obvious that integer size is consistent across all float
types.
This feature is available since LLVM 13 (October 2021). Based on
bootstrap we currently support >= 17.0, so there should be no support
problems.
Add `select_unpredictable` to force LLVM to use CMOV
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs into branches if it comes from a `select` marked with an `unpredictable` metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits such a `select` and uses it in the implementation of `binary_search_by`.
Since https://reviews.llvm.org/D118118, LLVM will no longer turn CMOVs
into branches if it comes from a `select` marked with an `unpredictable`
metadata attribute.
This PR introduces `core::intrinsics::select_unpredictable` which emits
such a `select` and uses it in the implementation of `binary_search_by`.
Make repr(packed) vectors work with SIMD intrinsics
In #117116 I fixed `#[repr(packed, simd)]` by doing the expected thing and removing padding from the layout. This should be the last step in providing a solution to rust-lang/portable-simd#319
Stop using LLVM struct types for alloca
The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.
Split out from #121577.
r? `@ghost`
rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/To.20expose.20or.20not.20to.20expose/near/427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
We already use `Instance` at declaration sites when available to glean
additional information about possible abstractions of the type in use.
This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it
generates type information for indirect calls through `Virtual`
instances.