Commit Graph

1202 Commits

Author SHA1 Message Date
Trevor Gross
07932ad111
Rollup merge of #142507 - folkertdev:fn-align-align-attribute, r=jdonszelmann
use `#[align]` attribute for `fn_align`

Tracking issue: https://github.com/rust-lang/rust/issues/82232

https://github.com/rust-lang/rfcs/pull/3806 decides to add the `#[align]` attribute for alignment of various items. Right now it's used for functions with `fn_align`, in the future it will get more uses (statics, struct fields, etc.)

(the RFC finishes FCP today)

r? `@ghost`
2025-06-18 20:22:49 -04:00
Jakub Beránek
ff146112f6
Rollup merge of #140774 - workingjubilee:should-force-frame-pointers-favor-the-target-or-cli, r=jieyouxu
Affirm `-Cforce-frame-pointers=off` does not override

This PR exists to document that we (that is, the compiler reviewer) implicitly made a decision in rust-lang/rust#86652 that defies the expectations of some programmers. Some programmers believe `-Cforce-frame-pointers=false` should obey the programmer in all cases, forcing the compiler to avoid generating frame pointers, even if the target specification would indicate they must be generated. However, many targets rely on frame pointers for fast or sound unwinding.

T-compiler had a weekly triage meeting on 2025-05-22. This topic was put to discussion because some programmers may expect the target-overriding behavior. In that meeting we decided removing frame pointers, at least with regards to the contract of the `-Cforce-frame-pointers` option, is not required, even if `=off` is passed, and that we will not do so if the target would expect them. This follows from the documentation here: https://doc.rust-lang.org/rustc/codegen-options/index.html#force-frame-pointers

We may separately pursue trying to clarify the situation more emphatically in our documentation, or warn when people pass the option when it doesn't do anything.
2025-06-18 18:06:48 +02:00
Folkert de Vries
1fdf2b5620
add #[align] attribute
Right now it's used for functions with `fn_align`, in the future it will
get more uses (statics, struct fields, etc.)
2025-06-18 12:37:08 +02:00
bors
6f935a044d Auto merge of #141061 - dpaoliello:shimasfn, r=bjorn3
Change __rust_no_alloc_shim_is_unstable to be a function

This fixes a long sequence of issues:

1. A customer reported that building for Arm64EC was broken: #138541
2. This was caused by a bug in my original implementation of Arm64EC support, namely that only functions on Arm64EC need to be decorated with `#` but Rust was decorating statics as well.
3. Once I corrected Rust to only decorate functions, I started linking failures where the linker couldn't find statics exported by dylib dependencies. This was caused by the compiler not marking exported statics in the generated DEF file with `DATA`, thus they were being exported as functions not data.
4. Once I corrected the way that the DEF files were being emitted, the linker started failing saying that it couldn't find `__rust_no_alloc_shim_is_unstable`. This is because the MSVC linker requires the declarations of statics imported from other dylibs to be marked with `dllimport` (whereas it will happily link to functions imported from other dylibs whether they are marked `dllimport` or not).
5. I then made a change to ensure that `__rust_no_alloc_shim_is_unstable` was marked as `dllimport`, but the MSVC linker started emitting warnings that `__rust_no_alloc_shim_is_unstable` was marked as `dllimport` but was declared in an obj file. This is a harmless warning which is a performance hint: anything that's marked `dllimport` must be indirected via an `__imp` symbol so I added a linker arg in the target to suppress the warning.
6. A customer then reported a similar warning when using `lld-link` (<https://github.com/rust-lang/rust/pull/140176#issuecomment-2872448443>). I don't think it was an implementation difference between the two linkers but rather that, depending on the obj that the declaration versus uses of `__rust_no_alloc_shim_is_unstable` landed in we would get different warnings, so I suppressed that warning as well: #140954.
7. Another customer reported that they weren't using the Rust compiler to invoke the linker, thus these warnings were breaking their build: <https://github.com/rust-lang/rust/pull/140176#issuecomment-2881867433>. At that point, my original change was reverted (#141024) leaving Arm64EC broken yet again.

Taking a step back, a lot of these linker issues arise from the fact that `__rust_no_alloc_shim_is_unstable` is marked as `extern "Rust"` in the standard library and, therefore, assumed to be a foreign item from a different crate BUT the Rust compiler may choose to generate it either in the current crate, some other crate that will be statically linked in OR some other crate that will by dynamically imported.

Worse yet, it is impossible while building a given crate to know if `__rust_no_alloc_shim_is_unstable` will statically linked or dynamically imported: it might be that one of its dependent crates is the one with an allocator kind set and thus that crate (which is compiled later) will decide depending if it has any dylib dependencies or not to import `__rust_no_alloc_shim_is_unstable` or generate it. Thus, there is no way to know if the declaration of `__rust_no_alloc_shim_is_unstable` should be marked with `dllimport` or not.

There is a simple fix for all this: there is no reason `__rust_no_alloc_shim_is_unstable` must be a static. It needs to be some symbol that must be linked in; thus, it could easily be a function instead. As a function, there is no need to mark it as `dllimport` when dynamically imported which avoids the entire mess above.

There may be a perf hit for changing the `volatile load` to be a `tail call`, so I'm happy to change that part back (although I question what the codegen of a `volatile load` would look like, and if the backend is going to try to use load-acquire semantics).

Build with this change applied BEFORE #140176 was reverted to demonstrate that there are no linking issues with either MSVC or MinGW: <https://github.com/rust-lang/rust/actions/runs/15078657205>

Incidentally, I fixed `tests/run-make/no-alloc-shim` to work with MSVC as I needed it to be able to test locally (FYI for #128602)

r? `@bjorn3`
cc `@jieyouxu`
2025-06-18 09:24:40 +00:00
bors
27733d46d7 Auto merge of #130887 - Soveu:repeatn, r=scottmcm
Safer implementation of RepeatN

I've seen the "Use MaybeUninit for RepeatN" commit while reading This Week In Rust and immediately thought about something I've written some time ago - https://github.com/Soveu/repeat_finite/blob/master/src/lib.rs.

Using the fact, that `Option` will find niche in `(T, NonZeroUsize)`, we can construct something that has the same size as `(T, usize)` while completely getting rid of `MaybeUninit`.
This leaves only `unsafe` on `TrustedLen`, which is pretty neat.
2025-06-18 03:18:10 +00:00
bors
86d0aef804 Auto merge of #137944 - davidtwco:sized-hierarchy, r=oli-obk
Sized Hierarchy: Part I

This patch implements the non-const parts of rust-lang/rfcs#3729. It introduces two new traits to the standard library, `MetaSized` and `PointeeSized`. See the RFC for the rationale behind these traits and to discuss whether this change makes sense in the abstract.

These traits are unstable (as is their constness), so users cannot refer to them without opting-in to `feature(sized_hierarchy)`. These traits are not behind `cfg`s as this would make implementation unfeasible, there would simply be too many `cfg`s required to add the necessary bounds everywhere. So, like `Sized`, these traits are automatically implemented by the compiler.

RFC 3729 describes changes which are necessary to preserve backwards compatibility given the introduction of these traits, which are implemented and as follows:

- `?Sized` is rewritten as `MetaSized`
- `MetaSized` is added as a default supertrait for all traits w/out an explicit sizedness supertrait already.

There are no edition migrations implemented in this,  as these are primarily required for the constness parts of the RFC and prior to stabilisation of this (and so will come in follow-up PRs alongside the const parts). All diagnostic output should remain the same (showing `?Sized` even if the compiler sees `MetaSized`) unless the `sized_hierarchy` feature is enabled.

Due to the use of unstable extern types in the standard library and rustc, some bounds in both projects have had to be relaxed already - this is unfortunate but unavoidable so that these extern types can continue to be used where they were before. Performing these relaxations in the standard library and rustc are desirable longer-term anyway, but some bounds are not as relaxed as they ideally would be due to the inability to relax `Deref::Target` (this will be investigated separately).

It is hoped that this is implemented such that it could be merged and these traits could exist "under the hood" without that being observable to the user (other than in any performance impact this has on the compiler, etc). Some details might leak through due to the standard library relaxations, but this has not been observed in test output.

**Notes:**

- Any commits starting with "upstream:" can be ignored, as these correspond to other upstream PRs that this is based on which have yet to be merged.
- This best reviewed commit-by-commit. I've attempted to make the implementation easy to follow and keep similar changes and test output updates together.
  - Each commit has a short description describing its purpose.
  - This patch is large but it's primarily in the test suite.
- I've worked on the performance of this patch and a few optimisations are implemented so that the performance impact is neutral-to-minor.
- `PointeeSized` is a different name from the RFC just to make it more obvious that it is different from `std::ptr::Pointee` but all the names are yet to be bikeshed anyway.
- `@nikomatsakis` has confirmed [that this can proceed as an experiment from the t-lang side](https://rust-lang.zulipchat.com/#narrow/channel/435869-project-goals/topic/SVE.20and.20SME.20on.20AArch64.20.28goals.23270.29/near/506196491)
- FCP in https://github.com/rust-lang/rust/pull/137944#issuecomment-2912207485

Fixes rust-lang/rust#79409.

r? `@ghost` (I'll discuss this with relevant teams to find a reviewer)
2025-06-17 15:08:50 +00:00
Jubilee Young
5a449fb40b tests: remove define so dso_local attr does not disrupt test 2025-06-16 23:21:23 -07:00
David Wood
3c3ba37ba5
tests: PointeeSized bounds with extern types
These tests necessarily need to change now that `?Sized` is not
sufficient to accept extern types and `PointeeSized` is now necessary. In
addition, the `size_of_val`/`align_of_val` test can now be changed to
expect an error.
2025-06-16 23:04:35 +00:00
David Wood
322cc31504
tests: {Meta,Pointee}Sized in non-minicore tests
As before, add `MetaSized` and `PointeeSized` traits to all of the
non-minicore `no_core` tests so that they don't fail for lack of
language items.
2025-06-16 23:04:33 +00:00
Daniel Paoliello
6906b44e1c Change __rust_no_alloc_shim_is_unstable to be a function 2025-06-16 10:54:07 -07:00
beetrees
5723c9997c
Fix RISC-V C function ABI when passing/returning structs containing floats 2025-06-16 10:14:07 +01:00
León Orell Valerian Liehr
16152661ff
Rollup merge of #142389 - beetrees:cranelift-arg-ext, r=bjorn3
Apply ABI attributes on return types in `rustc_codegen_cranelift`

- The [x86-64 System V ABI standard](https://gitlab.com/x86-psABIs/x86-64-ABI/-/jobs/artifacts/master/raw/x86-64-ABI/abi.pdf?job=build) doesn't sign/zero-extend integer arguments or return types.
- But the de-facto standard as implemented by Clang and GCC is to sign/zero-extend arguments to 32 bits (but not return types).
- Additionally, Apple targets [sign/zero-extend both arguments and return values to 32 bits](https://developer.apple.com/documentation/xcode/writing-64-bit-intel-code-for-apple-platforms#Pass-arguments-to-functions-correctly).
- However, the `rustc_target` ABI adjustment code currently [unconditionally extends both arguments and return values to 32 bits](e703dff8fe/compiler/rustc_target/src/callconv/x86_64.rs (L240)) on all targets.
- This doesn't cause a miscompilation when compiling with LLVM as LLVM will ignore the `signext`/`zeroext` attribute when applied to return types on non-Apple x86-64 targets.
- Cranelift, however, does not have a similar special case, requiring `rustc` to set the argument extension attribute correctly.
- However, `rustc_codegen_cranelift` doesn't currently apply ABI attributes to return types at all, meaning `rustc_codegen_cranelift` will currently miscompile `i8`/`u8`/`i16`/`u16` returns on x86-64 Apple targets as those targets require sign/zero-extension of return types.

This PR fixes the bug(s) by making the `rustc_target` x86-64 System V ABI only mark return types as sign/zero-extended on Apple platforms, while also making `rustc_codegen_cranelift` apply ABI attributes to return types. The RISC-V and s390x C ABIs also require sign/zero extension of return types, so this will fix those targets when building with `rustc_codegen_cranelift` too.

r? `````@bjorn3`````
2025-06-15 23:51:56 +02:00
Matthias Krüger
db23a76217
Rollup merge of #141811 - mejrs:bye_locals, r=compiler-errors
Unimplement unsized_locals

Implements https://github.com/rust-lang/compiler-team/issues/630

Tracking issue here: https://github.com/rust-lang/rust/issues/111942

Note that this just removes the feature, not the implementation, and does not touch `unsized_fn_params`. This is because it is required to support `Box<dyn FnOnce()>: FnOnce()`.

There may be more that should be removed (possibly in follow up prs)
- the `forget_unsized` function and `forget` intrinsic.
- the `unsized_locals` test directory; I've just fixed up the tests for now
- various codegen support for unsized values and allocas

cc ``@JakobDegen`` ``@oli-obk`` ``@Noratrieb`` ``@programmerjake`` ``@bjorn3``

``@rustbot`` label F-unsized_locals

Fixes rust-lang/rust#79409
2025-06-14 11:27:10 +02:00
Jubilee Young
4658aee127 tests: Convert two handwritten minicores to add-core-stubs 2025-06-13 18:59:41 -07:00
Soveu
8f77681656 100% safe implementation of RepeatN 2025-06-13 22:32:15 +02:00
Matthias Krüger
9daf8ea811
Rollup merge of #142176 - workingjubilee:dont-shuffle-bswaps-per-arch, r=nikic
tests: Split dont-shuffle-bswaps along opt-levels and arches

This duplicates dont-shuffle-bswaps in order to make each opt level its own test. Then -opt3.rs gets split into a revision per arch we want to test, with certain architectures gaining new target-cpu minimums.
2025-06-13 05:19:15 +02:00
mejrs
c0e02e26b3 Unimplement unsized_locals 2025-06-13 01:16:36 +02:00
beetrees
eb472e77cd
Apply ABI attributes on return types in rustc_codegen_cranelift 2025-06-12 00:47:01 +01:00
Jubilee Young
6b0deb2161 tests: Revise dont-shuffle-bswaps-opt3 per tested arch
Some architectures gain target-cpu minimums in doing so.
2025-06-09 23:15:44 -07:00
Jubilee Young
59069986e7 tests: Copy dont-shuffle-bswaps per tested opt level 2025-06-09 17:40:05 -07:00
Andrew Zhogin
5601490c9d -Zretpoline and -Zretpoline-external-thunk flags (target modifiers) to enable retpoline-related target features 2025-06-09 21:29:59 +07:00
Jubilee Young
e57b4b19e8 encode compiler team acceptance of -Cforce-frame-pointers change 2025-06-05 13:28:46 -07:00
Jubilee Young
a6b62d893f codegen: modernize frame-pointer-cli-control.rs
Update this time-traveler on the changes in compiletest and target specs
that they missed over the pass ~3 years by being caught in a time rift.
The aarch64-apple rev splits into itself and aarch64-apple-on, because
rustc obtained support for non-leaf frame-pointers ever since 9b67cba
implemented them and used them in aarch64-apple-darwin's spec.

Note that the aarch64-apple-off revision fails, despite modernization.
This is because 9b67cba also changed the behavior of rustc to defer to
the spec over the command-line interface.
2025-06-05 13:28:42 -07:00
Jubilee Young
f27ed88053 codegen: test frame pointer attr prefers CLI opt
This test only makes sense if you send it back in time and run it with
a now-old Rust commit, e.g. 50e0cc59ff
However, if you do go back that far in time, you will see it pass.
2025-06-04 14:51:51 -07:00
Ralf Jung
321db85fb4 x86 (32/64): go back to passing SIMD vectors by-ptr 2025-06-04 08:38:49 +02:00
bors
15825b7161 Auto merge of #139385 - joboet:threadlocal_address, r=nikic
rustc_codegen_llvm: use `threadlocal.address` intrinsic to access TLS

Fixes #136044
r? `@nikic`
2025-05-30 15:39:56 +00:00
joboet
e4d9b06cc8
rustc_codegen_llvm: use threadlocal.address intrinsic to access TLS 2025-05-29 16:07:43 +02:00
Jacob Pratt
8951c74e2a
Rollup merge of #138285 - beetrees:repr128-stable, r=traviscross,bjorn3
Stabilize `repr128`

## Stabilisation report

The `repr128` feature ([tracking issue](https://github.com/rust-lang/rust/issues/56071)) allows the use of `#[repr(u128)]` and `#[repr(i128)]` on enums in the same way that other primitive representations such as `#[repr(u64)]` can be used. For example:

```rust
#[repr(u128)]
enum Foo {
    One = 1,
    Two,
    Big = u128::MAX,
}

#[repr(i128)]
enum Bar {
    HasThing(u16) = 42,
    HasSomethingElse(i64) = u64::MAX as i128 + 1,
    HasNothing,
}
```

This is the final part of adding 128-bit integers to Rust ([RFC 1504](https://rust-lang.github.io/rfcs/1504-int128.html)); all other parts of 128-bit integer support were stabilised in #49101 back in 2018.

From a design perspective, `#[repr(u128)]`/`#[repr(i128)]` function like `#[repr(u64)]`/`#[repr(i64)]` but for 128-bit integers instead of 64-bit integers. The only differences are:

- FFI safety: as `u128`/`i128` are not currently considered FFI safe, neither are `#[repr(u128)]`/`#[repr(i128)]` enums (I discovered this wasn't the case while drafting this stabilisation report, so I have submitted #138282 to fix this).
- Debug info: while none of the major debuggers currently support 128-bit integers, as of LLVM 20 `rustc` will emit valid debuginfo for both DWARF and PDB (PDB makes use of the same natvis that is also used for all enums with fields, whereas DWARF has native support).

Tests for `#[repr(u128)]`/`#[repr(i128)]` enums include:
- [ui/enum-discriminant/repr128.rs](385970f0c1/tests/ui/enum-discriminant/repr128.rs): checks that 128-bit enum discriminants have the correct values.
- [debuginfo/msvc-pretty-enums.rs](385970f0c1/tests/debuginfo/msvc-pretty-enums.rs): checks the PDB debuginfo is correct.
- [run-make/repr128-dwarf](385970f0c1/tests/run-make/repr128-dwarf/rmake.rs): checks the DWARF debuginfo is correct.

Stabilising this feature does not require any changes to the Rust Reference as [the documentation on primitive representations](https://doc.rust-lang.org/nightly/reference/type-layout.html#r-layout.repr.primitive.intro) already includes `u128` and `i128`.

Closes #56071
Closes https://github.com/rust-lang/reference/issues/1368

r? lang

```@rustbot``` label +I-lang-nominated +T-lang
2025-05-29 04:50:46 +02:00
Trevor Gross
7f5f29b663
Rollup merge of #140697 - Sa4dUs:split-autodiff, r=ZuseZ4
Split `autodiff` into `autodiff_forward` and `autodiff_reverse`

This PR splits `#[autodiff]` macro so `#[autodiff(df, Reverse, args)]` would become `#[autodiff_reverse(df, args)]` and `#[autodiff(df, Forward, args)]` would become `#[autodiff_forwad(df, args)]`.
2025-05-28 10:28:08 -04:00
beetrees
467eeabbb5
Stabilise repr128 2025-05-28 15:14:34 +01:00
Augie Fackler
a963e6fc38 tests: mark option-niche-eq as fixed on LLVM 21
Some combination of recent Rust changes (between 3d86494a0d and
aa57e46e24 from what I can tell) and changes in LLVM 21 (not recently,
as best I can tell) have caused this test to start showing the behavior
we want, so it's time to move this test to a proper place and mark it as
fixed on LLVM 21.
2025-05-27 11:20:52 -04:00
bors
46264e6dfd Auto merge of #138489 - tmiasko:call-tmps-lifetime, r=workingjubilee
Describe lifetime of call argument temporaries passed indirectly

Fixes #132014.
2025-05-26 01:16:52 +00:00
Jacob Pratt
3f91bbcd5f
Rollup merge of #140950 - clubby789:nonzero-ord-test, r=Mark-Simulacrum
More option optimization tests

I noticed that although adding a manual implementation for PartialOrd on Option in #122024, I didn't add a test so that we can easily check if this behavior has improved.

This also adds a couple of `should-fail` tests - this will allow us to remove these hacky implementations if upstream LLVM improves.
2025-05-25 04:00:55 +02:00
Matthias Krüger
d6a61daf60
Rollup merge of #140832 - workingjubilee:aarch64-linux-should-use-frame-pointers, r=compiler-errors
aarch64-linux: Default to FramePointer::NonLeaf

For aarch64-apple and aarch64-windows, platform docs state that code must use frame pointers correctly. This is because the AAPCS64 mandates that a platform specify its frame pointer conformance requirements:
- Apple: https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms#Respect-the-purpose-of-specific-CPU-registers
- Windows: https://learn.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=msvc-170#integer-registers
- AAPCS64: 4492d1570e/aapcs64/aapcs64.rst (the-frame-pointer)

Unwinding code either requires unwind tables or frame pointers, and on aarch64 the expectation is that one can use frame pointers for this. Most Linux targets represent a motley variety of possible distributions, so it is unclear who to defer to on conformance, other than perhaps Arm. In the absence of a specific edict for a given aarch64-linux target, Rust will assume aarch64-linux targets also use non-leaf frame pointers. This reflects what compilers like clang do.
2025-05-23 20:30:09 +02:00
Marcelo Domínguez
8917ff6024 Update generic tests 2025-05-21 07:24:43 +00:00
Marcelo Domínguez
2041de7083 Update codegen and pretty tests
UI tests are pending, will depend on error messages change.
2025-05-21 07:24:42 +00:00
Tomasz Miąsko
3b7ca287a7 Describe lifetime of call argument temporaries passed indirectly 2025-05-17 09:49:03 +02:00
Jubilee Young
0c157b51d3 aarch64-linux: Default to FramePointer::NonLeaf
For aarch64-apple and aarch64-windows, platform docs state that code
must use frame pointers correctly. This is because the AAPCS64 mandates
that a platform specify its frame pointer conformance requirements:
- Apple: https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms#Respect-the-purpose-of-specific-CPU-registers
- Windows: https://learn.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=msvc-170#integer-registers
- AAPCS64: 4492d1570e/aapcs64/aapcs64.rst (the-frame-pointer)

Unwinding code either requires unwind tables or frame pointers, and
on aarch64 the expectation is that one can use frame pointers for this.
Most Linux targets represent a motley variety of possible distributions,
so it is unclear who to defer to on conformance, other than perhaps Arm.
In the absence of a specific edict for a given aarch64-linux target,
Rust will assume aarch64-linux targets use non-leaf frame pointers.
This reflects what compilers like clang do.
2025-05-17 06:42:46 +02:00
clubby789
5eb47eeb85 Add failing tests for some Option optimizations 2025-05-12 17:33:58 +01:00
clubby789
bebcb9da21 Add test for Ord impl for Option::NonZero 2025-05-12 14:49:12 +00:00
HaeNoe
a504759afd
feat: add codegen test
Ensure that code for generic `d_primal::<T>` is generated even if `primal::<T>`
is never used.

- incorporate feedback from @ZuseZ4
2025-05-11 17:54:57 +02:00
Trevor Gross
82b792358c
Rollup merge of #140457 - fneddy:fix_s390x_codegen_const_vector, r=Mark-Simulacrum
Use target-cpu=z13 on s390x codegen const vector test

The default s390x cpu(z10) does not have vector support. Setting target-cpu at least to z13 enables vectorisation for s390x architecture and makes the test pass.
2025-05-04 18:11:48 -04:00
Stuart Cook
020d908b64
Rollup merge of #140456 - fneddy:fix_s390x_codegen_simd_ext_ins_dyn, r=wesleywiser
Fix test simd/extract-insert-dyn on s390x

Fix the test for s390x by enabling s390x vector extension via `target_feature(enable = "vector")`(#127506). As this is is still gated by `#![feature(s390x_target_feature)]` we need that attribute also.
2025-05-04 13:21:08 +10:00
Eduard Stefes
61488e5070 Fix test simd/extract-insert-dyn on s390x
Fix the test for s390x by enabling s390x vector extension via
`target_feature(enable = "vector")`(#127506). As this is is still
gated by `#![feature(s390x_target_feature)]` we need that attribute
also.
2025-05-03 10:15:32 +02:00
Amanieu d'Antras
72b110ada3 Stabilize select_unpredictable
FCP completed in tracking issue #133962.
2025-05-01 13:49:28 +01:00
Adrian Friedli
cf12e290fd enable msa feature for mips in codegen tests 2025-04-29 10:20:25 +02:00
Chris Denton
17495e0030
Rollup merge of #139656 - scottmcm:stabilize-slice-as-chunks, r=dtolnay
Stabilize `slice_as_chunks` library feature

~~Draft as this needs #139163 to land first.~~

FCP: https://github.com/rust-lang/rust/issues/74985#issuecomment-2769963395

Methods being stabilized are:
```rust
impl [T] {
    const fn as_chunks<const N: usize>(&self) -> (&[[T; N]], &[T]);
    const fn as_rchunks<const N: usize>(&self) -> (&[T], &[[T; N]]);
    const unsafe fn as_chunks_unchecked<const N: usize>(&self) -> &[[T; N]];
    const fn as_chunks_mut<const N: usize>(&mut self) -> (&mut [[T; N]], &mut [T]);
    const fn as_rchunks_mut<const N: usize>(&mut self) -> (&mut [T], &mut [[T; N]]);
    const unsafe fn as_chunks_unchecked_mut<const N: usize>(&mut self) -> &mut [[T; N]];
}
```

~~(FCP's not done quite yet, but will in another day if I'm counting right.)~~ FCP Complete: https://github.com/rust-lang/rust/issues/74985#issuecomment-2797951535
2025-04-28 23:29:15 +00:00
Eduard Stefes
f831670519 Use target-cpu=z13 on s390x codegen const vector test
The default s390x cpu(z10) does not have vector support. Setting
target-cpu at least to z13 enables vectorisation for s390x architecture
and makes the test pass.
2025-04-28 22:11:46 +02:00
bit-aloo
7018392337
remove noinline attribute and add alwaysinline after AD pass 2025-04-28 21:10:32 +05:30
Matthias Krüger
c3f811f02f
Rollup merge of #139700 - EnzymeAD:autodiff-flags, r=oli-obk
Autodiff flags

Interestingly, it seems that some other projects have conflicts with exactly the same LLVM optimization passes as autodiff.
At least `LLVMRustOptimize` has exactly the flags that we need to disable problematic opt passes.

This PR enables us to compile code where users differentiate two identical functions in the same module. This has been especially common in test cases, but it's not impossible to encounter in the wild.

It also enables two new flags for testing/debugging. I consider writing an MCP to upgrade PrintPasses to be a standalone -Z flag, since it is *not* the same as `-Z print-llvm-passes`, which IMHO gives less useful output. A discussion can be found here: [#t-compiler/llvm > Print llvm passes. @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/187780-t-compiler.2Fllvm/topic/Print.20llvm.20passes.2E/near/511533038)

Finally, it improves `PrintModBefore` and `PrintModAfter`. They used to work reliable, but now we just schedule enzyme as part of an existing ModulePassManager (MPM). Since Enzyme is last in the MPM scheduling, PrintModBefore became very inaccurate. It used to print the input module, which we gave to the Enzyme and was great to create llvm-ir reproducer. However, lately the MPM would run the whole `default<O3>` pipeline, which heavily modifies the llvm module, before we pass it to Enzyme. That made it impossible to use the flag to create llvm-ir reproducers for Enzyme bugs. We now schedule a PrintModule pass just before Enzyme, solving this problem.

Based on the PrintPass output, it also _seems_ like changing `registerEnzymeAndPassPipeline(PB, true);` to `registerEnzymeAndPassPipeline(PB, false);` has no effect. In theory, the bool should tell Enzyme to schedule some helpful passes in the PassBuilder. However, since it doesn't do anything and I'm not 100% sure anymore on whether we really need it, I'll just disable it for now and postpone investigations.

r? ``@oli-obk``

closes #139471

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-04-24 17:19:44 +02:00