Instead of parsing the different components of a function signature,
eagerly look for either the `where` keyword or the function body.
- Also address feedback to use `From` instead of `TryFrom` in cranelift
contract and ubcheck codegen.
Rename `tcx.ensure()` to `tcx.ensure_ok()`, and improve the associated docs
This is all based on my archaeology for https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/.60TyCtxtEnsure.60.
The main renamings are:
- `tcx.ensure()` → `tcx.ensure_ok()`
- `tcx.ensure_with_value()` → `tcx.ensure_done()`
- Query modifier `ensure_forwards_result_if_red` → `return_result_from_ensure_ok`
Hopefully these new names are a better fit for the *actual* function and purpose of these query call modes.
Implement MIR lowering for unsafe binders
This is the final bit of the unsafe binders puzzle. It implements MIR, CTFE, and codegen for unsafe binders, and enforces that (for now) they are `Copy`. Later on, I'll introduce a new trait that relaxes this requirement to being "is `Copy` or `ManuallyDrop<T>`" which more closely models how we treat union fields.
Namely, wrapping unsafe binders is now `Rvalue::WrapUnsafeBinder`, which acts much like an `Rvalue::Aggregate`. Unwrapping unsafe binders are implemented as a MIR projection `ProjectionElem::UnwrapUnsafeBinder`, which acts much like `ProjectionElem::Field`.
Tracking:
- https://github.com/rust-lang/rust/issues/130516
Insert null checks for pointer dereferences when debug assertions are enabled
Similar to how the alignment is already checked, this adds a check
for null pointer dereferences in debug mode. It is implemented similarly
to the alignment check as a `MirPass`.
This inserts checks in the same places as the `CheckAlignment` pass and additionally
also inserts checks for `Borrows`, so code like
```rust
let ptr: *const u32 = std::ptr::null();
let val: &u32 = unsafe { &*ptr };
```
will have a check inserted on dereference. This is done because null references
are UB. The alignment check doesn't cover these places, because in `&(*ptr).field`,
the exact requirement is that the final reference must be aligned. This is something to
consider further enhancements of the alignment check.
For now this is implemented as a separate `MirPass`, to make it easy to disable
this check if necessary.
This is related to a 2025H1 project goal for better UB checks in debug
mode: https://github.com/rust-lang/rust-project-goals/pull/177.
r? `@saethlin`
Similar to how the alignment is already checked, this adds a check
for null pointer dereferences in debug mode. It is implemented similarly
to the alignment check as a MirPass.
This is related to a 2025H1 project goal for better UB checks in debug
mode: https://github.com/rust-lang/rust-project-goals/pull/177.
Fix deduplication mismatches in vtables leading to upcasting unsoundness
We currently have two cases where subtleties in supertraits can trigger disagreements in the vtable layout, e.g. leading to a different vtable layout being accessed at a callsite compared to what was prepared during unsizing. Namely:
### #135315
In this example, we were not normalizing supertraits when preparing vtables. In the example,
```
trait Supertrait<T> {
fn _print_numbers(&self, mem: &[usize; 100]) {
println!("{mem:?}");
}
}
impl<T> Supertrait<T> for () {}
trait Identity {
type Selff;
}
impl<Selff> Identity for Selff {
type Selff = Selff;
}
trait Middle<T>: Supertrait<()> + Supertrait<T> {
fn say_hello(&self, _: &usize) {
println!("Hello!");
}
}
impl<T> Middle<T> for () {}
trait Trait: Middle<<() as Identity>::Selff> {}
impl Trait for () {}
fn main() {
(&() as &dyn Trait as &dyn Middle<()>).say_hello(&0);
}
```
When we prepare `dyn Trait`, we see a supertrait of `Middle<<() as Identity>::Selff>`, which itself has two supertraits `Supertrait<()>` and `Supertrait<<() as Identity>::Selff>`. These two supertraits are identical, but they are not duplicated because we were using structural equality and *not* considering normalization. This leads to a vtable layout with two trait pointers.
When we upcast to `dyn Middle<()>`, those two supertraits are now the same, leading to a vtable layout with only one trait pointer. This leads to an offset error, and we call the wrong method.
### #135316
This one is a bit more interesting, and is the bulk of the changes in this PR. It's a bit similar, except it uses binder equality instead of normalization to make the compiler get confused about two vtable layouts. In the example,
```
trait Supertrait<T> {
fn _print_numbers(&self, mem: &[usize; 100]) {
println!("{mem:?}");
}
}
impl<T> Supertrait<T> for () {}
trait Trait<T, U>: Supertrait<T> + Supertrait<U> {
fn say_hello(&self, _: &usize) {
println!("Hello!");
}
}
impl<T, U> Trait<T, U> for () {}
fn main() {
(&() as &'static dyn for<'a> Trait<&'static (), &'a ()>
as &'static dyn Trait<&'static (), &'static ()>)
.say_hello(&0);
}
```
When we prepare the vtable for `dyn for<'a> Trait<&'static (), &'a ()>`, we currently consider the PolyTraitRef of the vtable as the key for a supertrait. This leads two two supertraits -- `Supertrait<&'static ()>` and `for<'a> Supertrait<&'a ()>`.
However, we can upcast[^up] without offsetting the vtable from `dyn for<'a> Trait<&'static (), &'a ()>` to `dyn Trait<&'static (), &'static ()>`. This is just instantiating the principal trait ref for a specific `'a = 'static`. However, when considering those supertraits, we now have only one distinct supertrait -- `Supertrait<&'static ()>` (which is deduplicated since there are two supertraits with the same substitutions). This leads to similar offsetting issues, leading to the wrong method being called.
[^up]: I say upcast but this is a cast that is allowed on stable, since it's not changing the vtable at all, just instantiating the binder of the principal trait ref for some lifetime.
The solution here is to recognize that a vtable isn't really meaningfully higher ranked, and to just treat a vtable as corresponding to a `TraitRef` so we can do this deduplication more faithfully. That is to say, the vtable for `dyn for<'a> Tr<'a>` and `dyn Tr<'x>` are always identical, since they both would correspond to a set of free regions on an impl... Do note that `Tr<for<'a> fn(&'a ())>` and `Tr<fn(&'static ())>` are still distinct.
----
There's a bit more that can be cleaned up. In codegen, we can stop using `PolyExistentialTraitRef` basically everywhere. We can also fix SMIR to stop storing `PolyExistentialTraitRef` in its vtable allocations.
As for testing, it's difficult to actually turn this into something that can be tested with `rustc_dump_vtable`, since having multiple supertraits that are identical is a recipe for ambiguity errors. Maybe someone else is more creative with getting that attr to work, since the tests I added being run-pass tests is a bit unsatisfying. Miri also doesn't help here, since it doesn't really generate vtables that are offset by an index in the same way as codegen.
r? `@lcnr` for the vibe check? Or reassign, idk. Maybe let's talk about whether this makes sense.
<sup>(I guess an alternative would also be to not do any deduplication of vtable supertraits (or only a really conservative subset) rather than trying to normalize and deduplicate more faithfully here. Not sure if that works and is sufficient tho.)</sup>
cc `@steffahn` -- ty for the minimizations
cc `@WaffleLapkin` -- since you're overseeing the feature stabilization :3
Fixes#135315Fixes#135316
Windows x86: Change i128 to return via the vector ABI
Clang and GCC both return `i128` in xmm0 on windows-msvc and windows-gnu. Currently, Rust returns the type on the stack. Add a calling convention adjustment so we also return scalar `i128`s using the vector ABI, which makes our `i128` compatible with C.
In the future, Clang may change to return `i128` on the stack for its `-msvc` targets (more at [1]). If this happens, the change here will need to be adjusted to only affect MinGW.
Link: https://github.com/rust-lang/rust/issues/134288 (does not fix) [1]
try-job: x86_64-msvc
try-job: x86_64-msvc-ext1
try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
Clang and GCC both return `i128` in xmm0 on windows-msvc and
windows-gnu. Currently, Rust returns the type on the stack. Add a
calling convention adjustment so we also return scalar `i128`s using the
vector ABI, which makes our `i128` compatible with C.
In the future, Clang may change to return `i128` on the stack for its
`-msvc` targets (more at [1]). If this happens, the change here will
need to be adjusted to only affect MinGW.
Link: https://github.com/rust-lang/rust/issues/134288
remove support for the (unstable) #[start] attribute
As explained by `@Noratrieb:`
`#[start]` should be deleted. It's nothing but an accidentally leaked implementation detail that's a not very useful mix between "portable" entrypoint logic and bad abstraction.
I think the way the stable user-facing entrypoint should work (and works today on stable) is pretty simple:
- `std`-using cross-platform programs should use `fn main()`. the compiler, together with `std`, will then ensure that code ends up at `main` (by having a platform-specific entrypoint that gets directed through `lang_start` in `std` to `main` - but that's just an implementation detail)
- `no_std` platform-specific programs should use `#![no_main]` and define their own platform-specific entrypoint symbol with `#[no_mangle]`, like `main`, `_start`, `WinMain` or `my_embedded_platform_wants_to_start_here`. most of them only support a single platform anyways, and need cfg for the different platform's ways of passing arguments or other things *anyways*
`#[start]` is in a super weird position of being neither of those two. It tries to pretend that it's cross-platform, but its signature is a total lie. Those arguments are just stubbed out to zero on ~~Windows~~ wasm, for example. It also only handles the platform-specific entrypoints for a few platforms that are supported by `std`, like Windows or Unix-likes. `my_embedded_platform_wants_to_start_here` can't use it, and neither could a libc-less Linux program.
So we have an attribute that only works in some cases anyways, that has a signature that's a total lie (and a signature that, as I might want to add, has changed recently, and that I definitely would not be comfortable giving *any* stability guarantees on), and where there's a pretty easy way to get things working without it in the first place.
Note that this feature has **not** been RFCed in the first place.
*This comment was posted [in May](https://github.com/rust-lang/rust/issues/29633#issuecomment-2088596042) and so far nobody spoke up in that issue with a usecase that would require keeping the attribute.*
Closes https://github.com/rust-lang/rust/issues/29633
try-job: x86_64-gnu-nopt
try-job: x86_64-msvc-1
try-job: x86_64-msvc-2
try-job: test-various
Subtree sync for rustc_codegen_cranelift
Nothing too exciting this time, but this includes a fix for a linker hang on Windows: https://github.com/rust-lang/rustc_codegen_cranelift/pull/1554
r? ``@ghost``
``@rustbot`` label +A-codegen +A-cranelift +T-compiler
Partial progress on #132735: Replace extern "rust-intrinsic" with #[rustc_intrinsic] across the codebase
Part of #132735: Replace `extern "rust-intrinsic"` with `#[rustc_intrinsic]` macro
- Updated all instances of `extern "rust-intrinsic"` to use the `#[rustc_intrinsic]` macro.
- Skipped `.md` files and test files to avoid unnecessary changes.
Add gpu-kernel calling convention
The amdgpu-kernel calling convention was reverted in commit f6b21e90d1 (#120495 and https://github.com/rust-lang/rust-analyzer/pull/16463) due to inactivity in the amdgpu target.
Introduce a `gpu-kernel` calling convention that translates to `ptx_kernel` or `amdgpu_kernel`, depending on the target that rust compiles for.
Tracking issue: #135467
amdgpu target tracking issue: #135024
The amdgpu-kernel calling convention was reverted in commit
f6b21e90d1 due to inactivity in the amdgpu
target.
Introduce a `gpu-kernel` calling convention that translates to
`ptx_kernel` or `amdgpu_kernel`, depending on the target that rust
compiles for.
Fuchsia explicitly builds rust and all rust targets with `-C
panic=abort` to minimize code generation size. However, when compiling a
proc-macro with this setting it can cause a warning to be emitted, which
breaks `tests/ui/invalid-compile-flags/crate-type-flag.rs`. This hasn't
been a problem in the past for us since we compile our proc macros on
host, rather than inside Fuchsia.
This attempts to fix the issue by explicitly requiring that we're using
the unwinder when compiling this test to avoid the warning being
emitted.
Fixes#135223
Variants::Single: do not use invalid VariantIdx for uninhabited enums
~~Stacked on top of https://github.com/rust-lang/rust/pull/133681, only the last commit is new.~~
Currently, `Variants::Single` for an empty enum contains a `VariantIdx` of 0; looking that up in the enum variant list will ICE. That's quite confusing. So let's fix that by adding a new `Variants::Empty` case for types that have 0 variants.
try-job: i686-msvc
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.
This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
A bunch of cleanups (part 2)
Just like https://github.com/rust-lang/rust/pull/133567 these were all found while looking at the respective code, but are not blocking any other changes I want to make in the short term.
It is treated as a map already. This is using FxIndexMap rather than
UnordMap because the latter doesn't provide an api to pick a single
value iff all values are equal, which each_linked_rlib depends on.
Make `Copy` unsafe to implement for ADTs with `unsafe` fields
As a rule, the application of `unsafe` to a declaration requires that use-sites of that declaration also entail `unsafe`. For example, a field declared `unsafe` may only be read in the lexical context of an `unsafe` block.
For nearly all safe traits, the safety obligations of fields are explicitly discharged when they are mentioned in method definitions. For example, idiomatically implementing `Clone` (a safe trait) for a type with unsafe fields will require `unsafe` to clone those fields.
Prior to this commit, `Copy` violated this rule. The trait is marked safe, and although it has no explicit methods, its implementation permits reads of `Self`.
This commit resolves this by making `Copy` conditionally safe to implement. It remains safe to implement for ADTs without unsafe fields, but unsafe to implement for ADTs with unsafe fields.
Tracking: #132922
r? ```@compiler-errors```
A bunch of cleanups
These are all extracted from a branch I have to get rid of driver queries. Most of the commits are not directly necessary for this, but were found in the process of implementing the removal of driver queries.
Previous PR: https://github.com/rust-lang/rust/pull/132410
As a rule, the application of `unsafe` to a declaration requires that use-sites
of that declaration also require `unsafe`. For example, a field declared
`unsafe` may only be read in the lexical context of an `unsafe` block.
For nearly all safe traits, the safety obligations of fields are explicitly
discharged when they are mentioned in method definitions. For example,
idiomatically implementing `Clone` (a safe trait) for a type with unsafe fields
will require `unsafe` to clone those fields.
Prior to this commit, `Copy` violated this rule. The trait is marked safe, and
although it has no explicit methods, its implementation permits reads of `Self`.
This commit resolves this by making `Copy` conditionally safe to implement. It
remains safe to implement for ADTs without unsafe fields, but unsafe to
implement for ADTs with unsafe fields.
Tracking: #132922
Fix clobber_abi in RV32E and RV64E inline assembly
Currently clobber_abi in RV32E and RV64E inline assembly is implemented using InlineAsmClobberAbi::RiscV, but broken since x16-x31 cannot be used in RV32E and RV64E.
```
error: cannot use register `x16`: register can't be used with the `e` target feature
--> <source>:42:14
|
42 | asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
| ^^^^^^^^^^^^^^^^
error: cannot use register `x17`: register can't be used with the `e` target feature
--> <source>:42:14
|
42 | asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
| ^^^^^^^^^^^^^^^^
error: cannot use register `x28`: register can't be used with the `e` target feature
--> <source>:42:14
|
42 | asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
| ^^^^^^^^^^^^^^^^
error: cannot use register `x29`: register can't be used with the `e` target feature
--> <source>:42:14
|
42 | asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
| ^^^^^^^^^^^^^^^^
error: cannot use register `x30`: register can't be used with the `e` target feature
--> <source>:42:14
|
42 | asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
| ^^^^^^^^^^^^^^^^
error: cannot use register `x31`: register can't be used with the `e` target feature
--> <source>:42:14
|
42 | asm!("", clobber_abi("C"), options(nostack, nomem, preserves_flags));
| ^^^^^^^^^^^^^^^^
```
r? `@Amanieu`
`@rustbot` label O-riscv +A-inline-assembly
Support input/output in vector registers of s390x inline assembly (under asm_experimental_reg feature)
This extends currently clobber-only vector registers (`vreg`) support to allow passing `#[repr(simd)]` types, floats (f32/f64/f128), and integers (i32/i64/i128) as input/output.
This is unstable and gated under new `#![feature(asm_experimental_reg)]` (tracking issue: https://github.com/rust-lang/rust/issues/133416). If the feature is not enabled, only clober is supported as before.
| Architecture | Register class | Target feature | Allowed types |
| ------------ | -------------- | -------------- | -------------- |
| s390x | `vreg` | `vector` | `i32`, `f32`, `i64`, `f64`, `i128`, `f128`, `i8x16`, `i16x8`, `i32x4`, `i64x2`, `f32x4`, `f64x2` |
This matches the list of types that are supported by the vector registers in LLVM:
https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td#L301-L313
In addition to `core::simd` types and floats listed above, custom `#[repr(simd)]` types of the same size and type are also allowed. All allowed types other than i32/f32/i64/f64/i128, and relevant target features are currently unstable.
Currently there is no SIMD type for s390x in `core::arch`, but this is tracked in https://github.com/rust-lang/rust/issues/130869.
cc https://github.com/rust-lang/rust/issues/130869 about vector facility support in s390x
cc https://github.com/rust-lang/rust/issues/125398 & https://github.com/rust-lang/rust/issues/116909 about f128 support in asm
`@rustbot` label +O-SystemZ +A-inline-assembly
take 2
open up coroutines
tweak the wordings
the lint works up until 2021
We were missing one case, for ADTs, which was
causing `Result` to yield incorrect results.
only include field spans with significant types
deduplicate and eliminate field spans
switch to emit spans to impl Drops
Co-authored-by: Niko Matsakis <nikomat@amazon.com>
collect drops instead of taking liveness diff
apply some suggestions and add explantory notes
small fix on the cache
let the query recurse through coroutine
new suggestion format with extracted variable name
fine-tune the drop span and messages
bugfix on runtime borrows
tweak message wording
filter out ecosystem types earlier
apply suggestions
clippy
check lint level at session level
further restrict applicability of the lint
translate bid into nop for stable mir
detect cycle in type structure
Use `TypingMode` throughout the compiler instead of `ParamEnv`
Hopefully the biggest single PR as part of https://github.com/rust-lang/types-team/issues/128.
## `infcx.typing_env` while defining opaque types
I don't know how'll be able to correctly handle opaque types when using something taking a `TypingEnv` while defining opaque types. To correctly handle the opaques we need to be able to pass in the current `opaque_type_storage` and return constraints, i.e. we need to use a proper canonical query. We should migrate all the queries used during HIR typeck and borrowck where this matters to proper canonical queries. This is
## `layout_of` and `Reveal::All`
We convert the `ParamEnv` to `Reveal::All` right at the start of the `layout_of` query, so I've changed callers of `layout_of` to already use a post analysis `TypingEnv` when encountering it.
ca87b535a0/compiler/rustc_ty_utils/src/layout.rs (L51)
## `Ty::is_[unpin|sized|whatever]`
I haven't migrated `fn is_item_raw` to use `TypingEnv`, will do so in a followup PR, this should significantly reduce the amount of `typing_env.param_env`. At some point there will probably be zero such uses as using the type system while ignoring the `typing_mode` is incorrect.
## `MirPhase` and phase-transitions
When inside of a MIR-body, we can mostly use its `MirPhase` to figure out the right `typing_mode`. This does not work during phase transitions, most notably when transitioning from `Analysis` to `Runtime`:
dae7ac133b/compiler/rustc_mir_transform/src/lib.rs (L606-L625)
All these passes still run with `MirPhase::Analysis`, but we should only use `Reveal::All` once we're run the `RevealAll` pass. This required me to manually construct the right `TypingEnv` in all these passes. Given that it feels somewhat easy to accidentally miss this going forward, I would maybe like to change `Body::phase` to an `Option` and replace it at the start of phase transitions. This then makes it clear that the MIR is currently in a weird state.
r? `@ghost`
the behavior of the type system not only depends on the current
assumptions, but also the currentnphase of the compiler. This is
mostly necessary as we need to decide whether and how to reveal
opaque types. We track this via the `TypingMode`.
As a side effect this should add raw-dylib support to cg_gcc as the
default ArchiveBuilderBuilder that is used implements
create_dll_import_lib. I haven't tested if the raw-dylib support
actually works however.
Subtree sync for rustc_codegen_cranelift
The highlight this time is an update to Cranelift 0.113,
r? `@ghost`
`@rustbot` label +A-codegen +A-cranelift +T-compiler
The OS version depends on the deployment target environment variables,
the access of which we want to move to later in the compilation pipeline
that has access to more information, for example `env_depinfo`.
Effects cleanup
- removed extra bits from predicates queries that are no longer needed in the new system
- removed the need for `non_erasable_generics` to take in tcx and DefId, removed unused arguments in callers
r? compiler-errors
- removed extra bits from predicates queries that are no longer needed in the new system
- removed the need for `non_erasable_generics` to take in tcx and DefId, removed unused arguments in callers
Fundamentally, we have *three* disjoint categories of functions:
1. const-stable functions
2. private/unstable functions that are meant to be callable from const-stable functions
3. functions that can make use of unstable const features
This PR implements the following system:
- `#[rustc_const_stable]` puts functions in the first category. It may only be applied to `#[stable]` functions.
- `#[rustc_const_unstable]` by default puts functions in the third category. The new attribute `#[rustc_const_stable_indirect]` can be added to such a function to move it into the second category.
- `const fn` without a const stability marker are in the second category if they are still unstable. They automatically inherit the feature gate for regular calls, it can now also be used for const-calls.
Also, several holes in recursive const stability checking are being closed.
There's still one potential hole that is hard to avoid, which is when MIR
building automatically inserts calls to a particular function in stable
functions -- which happens in the panic machinery. Those need to *not* be
`rustc_const_unstable` (or manually get a `rustc_const_stable_indirect`) to be
sure they follow recursive const stability. But that's a fairly rare and special
case so IMO it's fine.
The net effect of this is that a `#[unstable]` or unmarked function can be
constified simply by marking it as `const fn`, and it will then be
const-callable from stable `const fn` and subject to recursive const stability
requirements. If it is publicly reachable (which implies it cannot be unmarked),
it will be const-unstable under the same feature gate. Only if the function ever
becomes `#[stable]` does it need a `#[rustc_const_unstable]` or
`#[rustc_const_stable]` marker to decide if this should also imply
const-stability.
Adding `#[rustc_const_unstable]` is only needed for (a) functions that need to
use unstable const lang features (including intrinsics), or (b) `#[stable]`
functions that are not yet intended to be const-stable. Adding
`#[rustc_const_stable]` is only needed for functions that are actually meant to
be directly callable from stable const code. `#[rustc_const_stable_indirect]` is
used to mark intrinsics as const-callable and for `#[rustc_const_unstable]`
functions that are actually called from other, exposed-on-stable `const fn`. No
other attributes are required.
As part of the "arbitrary self types v2" project, we are going to
replace the current `Receiver` trait with a new mechanism based on a
new, different `Receiver` trait.
This PR renames the old trait to get it out the way. Naming is hard.
Options considered included:
* HardCodedReceiver (because it should only be used for things in the
standard library, and hence is sort-of hard coded)
* LegacyReceiver
* TargetLessReceiver
* OldReceiver
These are all bad names, but fortunately this will be temporary.
Assuming the new mechanism proceeds to stabilization as intended, the
legacy trait will be removed altogether.
Although we expect this trait to be used only in the standard library,
we suspect it may be in use elsehwere, so we're landing this change
separately to identify any surprising breakages.
It's known that this trait is used within the Rust for Linux project; a
patch is in progress to remove their dependency.
This is a part of the arbitrary self types v2 project,
https://github.com/rust-lang/rfcs/pull/3519https://github.com/rust-lang/rust/issues/44874
r? @wesleywiser
Add intrinsics `fmuladd{f16,f32,f64,f128}`. This computes `(a * b) +
c`, to be fused if the code generator determines that (i) the target
instruction set has support for a fused operation, and (ii) that the
fused operation is more efficient than the equivalent, separate pair
of `mul` and `add` instructions.
https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic
MIRI support is included for f32 and f64.
The codegen_cranelift uses the `fma` function from libc, which is a
correct implementation, but without the desired performance semantic. I
think this requires an update to cranelift to expose a suitable
instruction in its IR.
I have not tested with codegen_gcc, but it should behave the same
way (using `fma` from libc).
- fix for divergence
- fix error message
- fix another cranelift test
- fix some cranelift things
- don't set the NORETURN option for naked asm
- fix use of naked_asm! in doc comment
- fix use of naked_asm! in run-make test
- use `span_bug` in unreachable branch