Stabilize `naked_functions`
tracking issue: https://github.com/rust-lang/rust/issues/90957
request for stabilization on tracking issue: https://github.com/rust-lang/rust/issues/90957#issuecomment-2539270352
reference PR: https://github.com/rust-lang/reference/pull/1689
# Request for Stabilization
Two years later, we're ready to try this again. Even though this issue is already marked as having passed FCP, given the amount of time that has passed and the changes in implementation strategy, we should follow the process again.
## Summary
The `naked_functions` feature has two main parts: the `#[naked]` function attribute, and the `naked_asm!` macro.
An example of a naked function:
```rust
const THREE: usize = 3;
#[naked]
pub extern "sysv64" fn add_n(number: usize) -> usize {
// SAFETY: the validity of the used registers
// is guaranteed according to the "sysv64" ABI
unsafe {
core::arch::naked_asm!(
"add rdi, {}",
"mov rax, rdi",
"ret",
const THREE,
)
}
}
```
When the `#[naked]` attribute is applied to a function, the compiler won't emit a [function prologue](https://en.wikipedia.org/wiki/Function_prologue_and_epilogue) or epilogue when generating code for this function. This attribute is analogous to [`__attribute__((naked))`](https://developer.arm.com/documentation/100067/0608/Compiler-specific-Function--Variable--and-Type-Attributes/--attribute----naked---function-attribute) in C. The use of this feature allows the programmer to have precise control over the assembly that is generated for a given function.
The body of a naked function must consist of a single `naked_asm!` invocation, a heavily restricted variant of the `asm!` macro: the only legal operands are `const` and `sym`, and the only legal options are `raw` and `att_syntax`. In lieu of specifying operands, the `naked_asm!` within a naked function relies on the function's calling convention to determine the validity of registers.
## Documentation
The Rust Reference: https://github.com/rust-lang/reference/pull/1689
(Previous PR: https://github.com/rust-lang/reference/pull/1153)
## Tests
* [tests/run-make/naked-symbol-visiblity](https://github.com/rust-lang/rust/tree/master/tests/codegen/naked-fn) verifies that `pub`, `#[no_mangle]` and `#[linkage = "..."]` work correctly for naked functions
* [tests/codegen/naked-fn](https://github.com/rust-lang/rust/tree/master/tests/codegen/naked-fn) has tests for function alignment, use of generics, and validates the exact assembly output on linux, macos, windows and thumb
* [tests/ui/asm/naked-*](https://github.com/rust-lang/rust/tree/master/tests/ui/asm) tests for incompatible attributes, generating errors around incorrect use of `naked_asm!`, etc
## Interaction with other (unstable) features
### [fn_align](https://github.com/rust-lang/rust/issues/82232)
Combining `#[naked]` with `#[repr(align(N))]` works well, and is tested e.g. here
- https://github.com/rust-lang/rust/blob/master/tests/codegen/naked-fn/aligned.rs
- https://github.com/rust-lang/rust/blob/master/tests/codegen/naked-fn/min-function-alignment.rs
It's tested extensively because we do need to explicitly support the `repr(align)` attribute (and make sure we e.g. don't mistake powers of two for number of bytes).
## History
This feature was originally proposed in [RFC 1201](https://github.com/rust-lang/rfcs/pull/1201), filed on 2015-07-10 and accepted on 2016-03-21. Support for this feature was added in [#32410](https://github.com/rust-lang/rust/pull/32410), landing on 2016-03-23. Development languished for several years as it was realized that the semantics given in RFC 1201 were insufficiently specific. To address this, a minimal subset of naked functions was specified by [RFC 2972](https://github.com/rust-lang/rfcs/pull/2972), filed on 2020-08-07 and accepted on 2021-11-16. Prior to the acceptance of RFC 2972, all of the stricter behavior specified by RFC 2972 was implemented as a series of warn-by-default lints that would trigger on existing uses of the `naked` attribute; these lints became hard errors in [#93153](https://github.com/rust-lang/rust/pull/93153) on 2022-01-22. As a result, today RFC 2972 has completely superseded RFC 1201 in describing the semantics of the `naked` attribute.
More recently, the `naked_asm!` macro was added to replace the earlier use of a heavily restricted `asm!` invocation. The `naked_asm!` name is clearer in error messages, and provides a place for documenting the specific requirements of inline assembly in naked functions.
The implementation strategy was changed to emitting a global assembly block. In effect, an extern function
```rust
extern "C" fn foo() {
core::arch::naked_asm!("ret")
}
```
is emitted as something similar to
```rust
core::arch::global_asm!(
"foo:",
"ret"
);
extern "C" {
fn foo();
}
```
The codegen approach was chosen over the llvm naked function attribute because:
- the rust compiler can guarantee the behavior (no sneaky additional instructions, no inlining, etc.)
- behavior is the same on all backends (llvm, cranelift, gcc, etc)
Finally, there is now an allow list of compatible attributes on naked functions, so that e.g. `#[inline]` is rejected with an error. The `#[target_feature]` attribute on naked functions was later made separately unstable, because implementing it is complex and we did not want to block naked functions themselves on how target features work on them. See also https://github.com/rust-lang/rust/issues/138568.
relevant PRs for these recent changes
- https://github.com/rust-lang/rust/pull/127853
- https://github.com/rust-lang/rust/pull/128651
- https://github.com/rust-lang/rust/pull/128004
- https://github.com/rust-lang/rust/pull/138570
-
### Various historical notes
#### `noreturn`
[RFC 2972](https://github.com/rust-lang/rfcs/blob/master/text/2972-constrained-naked.md) mentions that naked functions
> must have a body which contains only a single asm!() statement which:
> iii. must contain the noreturn option.
Instead of `asm!`, the current implementation mandates that the body contain a single `naked_asm!` statement. The `naked_asm!` macro is a heavily restricted version of the `asm!` macro, making it easier to talk about and document the rules of assembly in naked functions and give dedicated error messages.
For `naked_asm!`, the behavior of the `asm!`'s `noreturn` option is implicit. The `noreturn` option means that it is UB for control flow to fall through the end of the assembly block. With `asm!`, this option is usually used for blocks that diverge (and thus have no return and can be typed as `!`). With `naked_asm!`, the intent is different: usually naked funtions do return, but they must do so from within the assembly block. The `noreturn` option was used so that the compiler would not itself also insert a `ret` instruction at the very end.
#### padding / `ud2`
A `naked_asm!` block that violates the safety assumption that control flow must not fall through the end of the assembly block is UB. Because no return instruction is emitted, whatever bytes follow the naked function will be executed, resulting in truly undefined behavior. There has been discussion whether rustc should emit an invalid instruction (e.g. `ud2` on x86) after the `naked_asm!` block to at least fail early in the case of an invalid `naked_asm!`. It was however decided that it is more useful to guarantee that `#[naked]` functions NEVER contain any instructions besides those in the `naked_asm!` block.
# unresolved questions
None
r? ``@Amanieu``
I've validated the tests on x86_64 and aarch64
Add target-specific NaN payloads for the missing tier 2 targets
This PR adds target-specific NaN payloads for the remaining tier 2 targets:
- `arm64ec`: This target is a mix of `x86_64` and `aarch64`, meaning as they both have no extra payloads `arm64ec` also has no extra payloads.
- `loongarch64`: Per [LoongArch Reference Manual - Volume 1: Basic Architecture](https://github.com/loongson/LoongArch-Documentation/releases/download/2023.04.20/LoongArch-Vol1-v1.10-EN.pdf) section 3.1.1.3, LoongArch does quieting NaN propagation with the Rust preferred NaN as its default NaN, meaning it has no extra payloads.
- `nvptx64`: Per [PTX ISA documentation](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions) section 9.7.3 (and section 9.7.4 for `f16`), all payloads are possible. The documentation explicitly states that `f16` and `f32` operations result in an unspecified NaN payload, while for `f64` it states "NaN payloads are supported" without specifying how or what payload will be generated if there are no input NaNs.
- `powerpc` and `powerpc64`: Per [Power Instruction Set Architecture](https://files.openpower.foundation/s/9izgC5Rogi5Ywmm/download/OPF_PowerISA_v3.1C.pdf) Book I section 4.3.2, PowerPC does quieting NaN propagation with the Rust preferred NaN being generated if no there are no input NaNs, meaning it has no extra payloads.
- `s390x`: Per [IBM z/Architecture Principles of Operation](https://www.vm.ibm.com/library/other/22783213.pdf#page=965) page 9-3, s390x does quieting NaN propagation with the Rust's preferred NaN as its default NaN, meaning it has no extra payloads.
Tracking issue: #128288
cc ``@RalfJung``
``@rustbot`` label +T-lang
Also cc relevant target maintainers of tier 2 targets:
- `arm64ec`: ``@dpaoliello``
- `loongarch64`: ``@heiher`` ``@xiangzhai`` ``@zhaixiaojuan`` ``@xen0n``
- `nvptx64`: ``@RDambrosio016`` ``@kjetilkjeka``
- `powerpc`: the only documented maintainer is ``@BKPepe`` for the tier 3 `powerpc-unknown-linux-muslspe`.
- `powerpc64`: ``@daltenty`` ``@gilamn5tr`` ``@Gelbpunkt`` ``@famfo`` ``@neuschaefer``
- `s390x`: ``@uweigand`` ``@cuviper``
simd intrinsics with mask: accept unsigned integer masks, and fix some of the errors
It's not clear at all why the mask would have to be signed, it is anyway interpreted bitwise. The backend should just make sure that works no matter the surface-level type; our LLVM backend already does this correctly. The note of "the mask may be widened, which only has the correct behavior for signed integers" explains... nothing? Why can't the code do the widening correctly? If necessary, just cast to the signed type first...
Also while we are at it, fix the errors. For simd_masked_load/store, the errors talked about the "third argument" but they meant the first argument (the mask is the first argument there). They also used the wrong type for `expected_element`.
I have extremely low confidence in the GCC part of this PR.
See [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/257879-project-portable-simd/topic/On.20the.20sign.20of.20masks)
add next_index to Enumerate
Proposal: https://github.com/rust-lang/libs-team/issues/435
Tracking Issue: #130711
This basically just reopens#130682 but squashed and with the new function and the feature gate renamed to `next_index.`
There are two questions I have already:
- Shouldn't we add test coverage for that? I'm happy to provide some, but I might need a pointer to where these test would be.
- Maybe I could actually also add a doctest?
- For now, I just renamed the feature name in the unstable attribute to `next_index`, as well, so it matches the new name of the function. Is that okay? And can I just do that and use any string, or is there a sealed list of features defined somewhere where I also need to change the name?
Implement `pin!()` using `super let`
Tracking issue for super let: https://github.com/rust-lang/rust/issues/139076
This uses `super let` to implement `pin!()`.
This means we can remove [the hack](https://github.com/rust-lang/rust/pull/138717) we had to put in to fix https://github.com/rust-lang/rust/issues/138596.
It also means we can remove the original hack to make `pin!()` work, which used a questionable public-but-unstable field rather than a proper private field.
While `super let` is still unstable and subject to change, it seems safe to assume that future Rust will always have a way to express `pin!()` in a compatible way, considering `pin!()` is already stable.
It'd help [the experiment](https://github.com/rust-lang/rust/issues/139076) to have `pin!()` use `super let`, so we can get some more experience with it.
Fix drop handling in `hint::select_unpredictable`
This intrinsic doesn't drop the value that is not selected so this is manually done in the public function that wraps the intrinsic.
f*::NAN: guarantee that this is a quiet NaN
I think we should guarantee that this is a quiet NaN. This then implies that programs not using `f*::from_bits` (or unsafe type conversions) are guaranteed to only work with quiet NaNs. It would be awkward if people start to write `0.0 / 0.0` instead of using the constant just because they want to get a guaranteed-quiet NaN.
This is a `@rust-lang/libs-api` change. The definition of this constant currently is `0.0 / 0.0`, which is already guaranteed to be a quiet NaN. So all this does is forward that guarantee to our users.
Enable contracts for const functions
Use `const_eval_select!()` macro to enable contract checking only at runtime. The existing contract logic relies on closures, which are not supported in constant functions.
This commit also removes one level of indirection for ensures clauses since we no longer build a closure around the ensures predicate.
Resolves#136925
**Call-out:** This is still a draft PR since CI is broken due to a new warning message for unreachable code when the bottom of the function is indeed unreachable. It's not clear to me why the warning wasn't triggered before.
r? ```@compiler-errors```
Avoid unused clones in `Cloned<I>` and `Copied<I>`
Avoid cloning in `Cloned<I>` or copying in `Copied<I>` when elements are only needed by reference or not at all. There is already some precedent for this, given that `__iterator_get_unchecked` is implemented, which can skip elements. The reduced clones are technically observable by a user impl of `Clone`.
r? libs-api
Avoid cloning in `Cloned<I>` or copying in `Copied<I>` when elements are
only needed by reference or not at all. There is already some precedent
for this, given that `__iterator_get_unchecked` is implemented, which
can skip elements. The reduced clones are technically observable by a
user impl of `Clone`.
Initial `UnsafePinned` implementation [Part 1: Libs]
Initial libs changes necessary to unblock further work on [RFC 3467](https://rust-lang.github.io/rfcs/3467-unsafe-pinned.html).
Tracking issue: #125735
This PR is split off from #136964, and includes just the libs changes:
- `UnsafePinned` struct
- private `UnsafeUnpin` structural auto trait
- Lang items for both
- Compiler changes necessary to block niches on `UnsafePinned`
This PR does not change codegen, miri, the existing `!Unpin` hack, or anything else. That work is to be split into later PRs.
---
cc ``@RalfJung`` ``@Noratrieb``
``@rustbot`` label F-unsafe_pinned T-libs-api
Move `select_unpredictable` to the `hint` module
There has been considerable discussion in both the ACP (rust-lang/libs-team#468) and tracking issue (#133962) about whether the `bool::select_unpredictable` method should be in `core::hint` instead.
I believe this is the right move for the following reasons:
- The documentation explicitly says that it is a hint, not a codegen guarantee.
- `bool` doesn't have a corresponding `select` method, and I don't think we should be adding one.
- This shouldn't be something that people reach for with auto-completion unless they specifically understand the interactions with branch prediction. Using conditional moves can easily make code *slower* by preventing the CPU from speculating past the condition due to the data dependency.
- Although currently `core::hint` only contains no-ops, this isn't a hard rule (for example `unreachable_unchecked` is a bit of a gray area). The documentation only status that the module contains "hints to compiler that affects how code should be emitted or optimized". This is consistent with what `select_unpredictable` does.
Use the chaining methods on PartialOrd for slices too
#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.
docs: clarify uint exponent for `is_power_of_two`
This makes the documentation more explicit for that method. I know this might seem "nit-picky", but `k` could be interpreted as "any Real or Complex number". A trivial example would be $`3 = 2^{log_2(3)}`$ which "proves that three is a power of two" (according to that vague definition).
BTW, when I read the implementation, I was surprised to see that `1` is considered a power of 2 despite being odd (it does make sense in some contexts, but still not intuitive). So I wrote "positive int" before correcting it to "unsigned int"
Polymorphize `array::IntoIter`'s iterator impl
Today we emit all the iterator methods for every different array width. That's wasteful since the actual array length never even comes into it -- the indices used are from the separate `alive: IndexRange` field, not even the `N` const param.
This PR switches things so that an `array::IntoIter<T, N>` stores a `PolymorphicIter<[MaybeUninit<T>; N]>`, which we *unsize* to `PolymorphicIter<[MaybeUninit<T>]>` and call methods on that non-`Sized` type for all the iterator methods.
That also necessarily makes the layout consistent between the different lengths of arrays, because of the unsizing. Compare that to today <https://rust.godbolt.org/z/Prb4xMPrb>, where different widths can't even be deduped because the offset to the indices is different for different array widths.
add `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`
fixes https://github.com/rust-lang/rust/issues/137372
adds `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`, which contrary to their non-dyn counterparts allow a non-const index. Many platforms (but notably not x86_64 or aarch64) have dedicated instructions for this operation, which stdarch can emit with this change.
Future work is to also make the `Index` operation on the `Simd` type emit this operation, but the intrinsic can't be used directly. We'll need some MIR shenanigans for that.
r? `@ghost`