Commit Graph

2738 Commits

Author SHA1 Message Date
bors
1a7f290a9a Auto merge of #140914 - Zalathar:asm-bindings, r=compiler-errors
cg_llvm: Clean up some inline assembly bindings

This PR combines a few loosely-related cleanups to LLVM bindings related to inline assembly. These include:
- Replacing `LLVMRustInlineAsm` with LLVM-C's `LLVMGetInlineAsm`
- Adjusting FFI declarations to avoid the need for explicit `as_c_char_ptr` conversions
- Flattening control flow in `inline_asm_call`

There should be no functional changes.
2025-05-12 17:39:21 +00:00
Pietro Albini
2ce08ca5d6
update cfg(bootstrap) 2025-05-12 15:33:37 +02:00
Zalathar
dbdbde2a72 Rename OperandBundleOwned to OperandBundleBox
As with `DIBuilderBox`, the "Box" suffix does a better job of communicating
that this is an owning pointer to some borrowable resource.

This also renames the `raw` method to `as_ref`, which is what it would have
been named originally if the `Deref` problem had been known at the time.
2025-05-11 21:21:38 +10:00
Zalathar
eccf0647d3 Flatten control-flow in inline_asm_call after verification 2025-05-11 14:38:42 +10:00
Zalathar
b6300294a8 Make LLVMRustInlineAsmVerify take *const c_uchar
This avoids the need for an explicit `as_c_char_ptr` conversion.
2025-05-11 14:38:42 +10:00
Zalathar
b1094f6a0a Add a safe wrapper for LLVMAppendModuleInlineAsm
This patch also changes the Rust-side declaration to take `*const c_uchar`
instead of `*const c_char`, to avoid the need for `AsCCharPtr`.
2025-05-11 14:38:42 +10:00
Zalathar
d1bb310a7a Use LLVMGetInlineAsm
This LLVM-C binding replaces the existing `LLVMRustInlineAsm` function.
2025-05-11 14:37:54 +10:00
Zalathar
8764ecd0c1 Add a searchable tag PTR_LEN_STR to explain *const c_uchar bindings
This module comment describes why it's OK for LLVM bindings to declare a
parameter type of `*const c_uchar` for pointer/length strings, even though the
corresponding parameter on the C/C++ side uses `const char *`.

Adding a searchable term to each such parameter should make it easier for
future maintainers to understand why `*const c_uchar` is being used instead of
`*const c_char`.
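
As a rough illustration of the pointer/length convention this comment refers to, the sketch below passes a Rust `&str` as a `*const c_uchar` plus a length. `count_nonspace_bytes` is a hypothetical stand-in for a real binding, written in Rust so that the example is self-contained; it is not taken from the LLVM bindings themselves.

```rust
use std::ffi::c_uchar;
use std::slice;

/// Hypothetical stand-in for a binding that takes a pointer/length string.
/// A real binding would be declared in an `extern "C"` block instead.
///
/// # Safety
/// `ptr` must point to `len` readable bytes.
unsafe fn count_nonspace_bytes(ptr: *const c_uchar, len: usize) -> usize {
    // SAFETY: the caller guarantees that `ptr` is valid for `len` bytes.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    bytes.iter().filter(|b| !b.is_ascii_whitespace()).count()
}

/// Safe wrapper. `str::as_ptr` already returns `*const u8`, and `c_uchar`
/// is `u8`, so no pointer cast is needed at the call site.
fn count_nonspace(s: &str) -> usize {
    unsafe { count_nonspace_bytes(s.as_ptr(), s.len()) }
}

fn main() {
    println!("{}", count_nonspace("nop ; nop"));
}
```

Because the length travels alongside the pointer, the callee never relies on a NUL terminator, which is why the signedness of the element type does not matter here.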
2025-05-11 14:26:14 +10:00
León Orell Valerian Liehr
3c8950c30d
Rollup merge of #140792 - Urgau:minimum-maximum-intrinsics, r=scottmcm,traviscross,tgross35
Use intrinsics for `{f16,f32,f64,f128}::{minimum,maximum}` operations

This PR creates intrinsics for `{f16,f32,f64,f128}::{minimum,maximum}` operations.

This wasn't done when those operations were added as the LLVM support was too weak but now that LLVM has libcalls for unsupported platforms we can finally use them.

Cranelift and GCC[^1] support is partial: Cranelift doesn't support `f16` and `f128`, while GCC doesn't support `f16`.

r? `@tgross35`

try-job: aarch64-gnu
try-job: dist-various-1
try-job: dist-various-2

[^1]: https://www.gnu.org/software///gnulib/manual/html_node/Functions-in-_003cmath_002eh_003e.html
2025-05-11 02:44:36 +02:00
Urgau
7f0ae5e3ad Use the fallback body for {minimum,maximum}f128 on LLVM as well. 2025-05-10 17:34:54 +02:00
Matthias Krüger
b8c55b438d
Rollup merge of #140660 - RalfJung:more-order, r=WaffleLapkin
remove 'unordered' atomic intrinsics

As their doc comment already indicates, these operations do not currently have a place in our memory model. The intrinsics were introduced to support a hack in compiler-builtins, but that hack recently got removed (see https://github.com/rust-lang/compiler-builtins/issues/788).
2025-05-10 16:26:02 +02:00
mejrs
684b7b70f4 don't depend on rustc_attr_parsing if rustc_data_structures will do 2025-05-09 23:16:55 +02:00
Ralf Jung
79dfd0a472 remove 'unordered' atomic intrinsics 2025-05-09 17:39:52 +02:00
Urgau
e7247df590 Use intrinsics for {f16,f32,f64,f128}::{minimum,maximum} operations 2025-05-09 17:11:23 +02:00
Zalathar
078144fdfa coverage: Detect unused local file IDs to avoid an LLVM assertion
This case can't actually happen yet (other than via a testing flag), because
currently all of a function's spans must belong to the same file and expansion.
But this will be an important edge case when adding expansion region support.
2025-05-10 00:24:03 +10:00
Zalathar
8cd8b23b9e coverage: Hoist counter_for_bcb out of its loop
Having this helper function in the loop was confusing, because it doesn't rely
on anything that changes between loop iterations.
2025-05-10 00:24:03 +10:00
Zalathar
339556eb02 coverage: Enlarge empty spans during MIR instrumentation, not codegen
This allows us to assume that coverage spans will only be discarded during
codegen in very unusual situations.
2025-05-10 00:24:01 +10:00
bors
c8b7f32434 Auto merge of #140176 - dpaoliello:arm64ecdec, r=wesleywiser
Fix linking statics on Arm64EC

Arm64EC builds recently started to fail due to the linker not finding a symbol:
```
symbols.o : error LNK2001: unresolved external symbol #_ZN3std9panicking11EMPTY_PANIC17hc8d2b903527827f1E (EC Symbol)
          C:\Code\hello-world\target\arm64ec-pc-windows-msvc\debug\deps\hello_world.exe : fatal error LNK1120: 1 unresolved externals
```

It turns out that `EMPTY_PANIC` is a new static variable that was being exported then imported from the standard library, but when exporting LLVM didn't prepend the name with `#` (as only functions are prefixed with this character), whereas Rust was prefixing with `#` when attempting to import it.

The fix is to have Rust not prefix statics with `#` when importing.
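
The rule described above can be sketched as follows. This is a minimal illustration, not the compiler's actual code: `SymbolKind` and `arm64ec_import_name` are hypothetical names introduced only for this example.

```rust
/// Hypothetical classification of an imported symbol.
#[derive(Clone, Copy, PartialEq, Eq)]
enum SymbolKind {
    Function,
    Static,
}

/// Hypothetical helper: on Arm64EC only functions are decorated with `#`,
/// so statics must be imported under their undecorated name.
fn arm64ec_import_name(name: &str, kind: SymbolKind) -> String {
    match kind {
        SymbolKind::Function => format!("#{name}"),
        SymbolKind::Static => name.to_owned(),
    }
}

fn main() {
    println!("{}", arm64ec_import_name("some_fn", SymbolKind::Function));
    println!("{}", arm64ec_import_name("EMPTY_PANIC", SymbolKind::Static));
}
```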

Adding tests discovered another issue: we need to correctly mark statics exported from dylibs with `DATA`, otherwise MSVC's linker assumes they are functions and complains that there is no exit thunk for them.

CI found another bug: we only apply `DllImport` to non-local statics that aren't foreign items (i.e., declared in an `extern` block); that is, we want to use `DllImport` for statics coming from other Rust crates. However, `__rust_no_alloc_shim_is_unstable` is a static generated by the Rust compiler if required, but downstream crates consider it a foreign item since it is declared in an `extern "Rust"` block. They therefore do not apply `DllImport` to it, and linking fails if the previous crate exports it as `DATA`. The fix is to apply `DllImport` to foreign items that are marked with the `rustc_std_internal_symbol` attribute (i.e., we assume they aren't actually foreign and will be in some Rust crate).

Fixes #138541

---
try-job: dist-aarch64-msvc
try-job: dist-x86_64-msvc
try-job: x86_64-msvc-1
try-job: x86_64-msvc-2
2025-05-09 00:43:28 +00:00
Daniel Paoliello
6dabf7ea3a [Arm64EC] Only decorate functions with # 2025-05-07 10:36:12 -07:00
Jacob Pratt
4a8dbe0537
Rollup merge of #139534 - madhav-madhusoodanan:apx-target-feature-addition, r=workingjubilee
Added support for `apxf` target feature
2025-05-07 00:29:21 +00:00
Madhav Madhusoodanan
43357b4a64 Added apxf target feature support, under flag apx_target_feature 2025-05-06 23:28:27 +05:30
Michael Goulet
833c212b81 Rename Instance::new to Instance::new_raw and add a note that it is raw 2025-05-05 13:17:35 +00:00
Madhav Madhusoodanan
e4272d12f2 feat: Added capability to add multiple dependencies for an LLVMFeature 2025-05-05 12:33:37 +05:30
Bryanskiy
14535312b5 Initial support for dynamically linked crates 2025-05-04 22:03:15 +03:00
Stuart Cook
ed7590f1a0
Rollup merge of #139675 - sayantn:avx10, r=Amanieu
Add the AVX10 target features

Parent #138843

Adds the `avx10_target_feature` feature gate, and `avx10.1` and `avx10.2` target features.

It is confirmed that Intel is dropping AVX10/256 (see [this comment](https://github.com/rust-lang/rust/issues/111137#issuecomment-2795442288)), so this should be safe to implement now.

The LLVM fix for llvm/llvm-project#135394 was merged, and has been backported to LLVM20, and the patch has also been propagated to rustc in #140502

`@rustbot` label O-x86_64 O-x86_32 A-target-feature A-SIMD
2025-05-04 13:21:07 +10:00
Guillaume Gomez
9d7d782e50
Rollup merge of #140460 - heiher:issue-140455, r=Urgau
Fix handling of LoongArch target features not supported by LLVM 19

Fixes #140455
2025-05-01 22:27:23 +02:00
Matthias Krüger
555df301f8
Rollup merge of #134232 - bjorn3:naked_asm_improvements, r=wesleywiser
Share the naked asm impl between cg_ssa and cg_clif

This was introduced in https://github.com/rust-lang/rust/pull/128004.
2025-04-30 17:27:57 +02:00
WANG Rui
a2b3f11700 Filter out LoongArch features not supported by the current LLVM version 2025-04-29 22:12:27 +08:00
Trevor Gross
19e82b43eb Enable target_has_reliable_f16_math on x86
This has been disabled due to an LLVM misoptimization with `powi.f16`
[1]. This was fixed upstream and the fix is included in LLVM20, so tests
no longer need to be disabled.

`f16` still remains disabled on MinGW due to the ABI issue.

[1]: https://github.com/llvm/llvm-project/issues/98665
2025-04-29 05:39:15 +00:00
Chris Denton
e082bf341f
Rollup merge of #140323 - tgross35:cfg-unstable-float, r=Urgau
Implement the internal feature `cfg_target_has_reliable_f16_f128`

Support for `f16` and `f128` is varied across targets, backends, and backend versions. Eventually we would like to reach a point where all backends support these approximately equally, but until then we have to work around some of these nuances of support being observable.

Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which provides the following new configuration gates:

* `cfg(target_has_reliable_f16)`
* `cfg(target_has_reliable_f16_math)`
* `cfg(target_has_reliable_f128)`
* `cfg(target_has_reliable_f128_math)`

`reliable_f16` and `reliable_f128` indicate that basic arithmetic for the type works correctly. The `_math` versions indicate that anything relying on `libm` works correctly, since sometimes this hits a separate class of codegen bugs.

These options match configuration set by the build script at [1]. The logic for LLVM support is duplicated as-is from the same script. There are a few possible updates that will come as a follow up.

The config introduced here is not planned to ever become stable, it is only intended to replace the build scripts for `std` tests and `compiler-builtins` that don't have any way to configure based on the codegen backend.

MCP: https://github.com/rust-lang/compiler-team/issues/866
Closes: https://github.com/rust-lang/compiler-team/issues/866

[1]: 555e1d0386/library/std/build.rs (L84-L186)

---

The second commit makes use of this config to replace `cfg_{f16,f128}{,_math}` in `library/`. I omitted providing a `cfg(bootstrap)` configuration to keep things simpler since the next beta branch is in two weeks.

try-job: aarch64-gnu
try-job: i686-msvc-1
try-job: test-various
try-job: x86_64-gnu
try-job: x86_64-msvc-ext2
2025-04-28 23:29:17 +00:00
Chris Denton
d4845e1b0b
Rollup merge of #139308 - Shourya742:2025-03-29-add-autodiff-inline, r=ZuseZ4
add autodiff inline

closes: #138920

r? `@ZuseZ4`

try-job: dist-aarch64-linux
2025-04-28 23:29:14 +00:00
bit-aloo
7018392337
remove noinline attribute and add alwaysinline after AD pass 2025-04-28 21:10:32 +05:30
Andrew Zhogin
c366756a85 AsyncDrop implementation using shim codegen of async_drop_in_place::{closure}, scoped async drop added. 2025-04-28 16:23:13 +07:00
Trevor Gross
6ceeb0849e Implement the internal feature cfg_target_has_reliable_f16_f128
Support for `f16` and `f128` is varied across targets, backends, and
backend versions. Eventually we would like to reach a point where all
backends support these approximately equally, but until then we have to
work around some of these nuances of support being observable.

Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which
provides the following new configuration gates:

* `cfg(target_has_reliable_f16)`
* `cfg(target_has_reliable_f16_math)`
* `cfg(target_has_reliable_f128)`
* `cfg(target_has_reliable_f128_math)`

`reliable_f16` and `reliable_f128` indicate that basic arithmetic for
the type works correctly. The `_math` versions indicate that anything
relying on `libm` works correctly, since sometimes this hits a separate
class of codegen bugs.

These options match configuration set by the build script at [1]. The
logic for LLVM support is duplicated as-is from the same script. There
are a few possible updates that will come as a follow up.

The config introduced here is not planned to ever become stable, it is
only intended to replace the build scripts for `std` tests and
`compiler-builtins` that don't have any way to configure based on the
codegen backend.

MCP: https://github.com/rust-lang/compiler-team/issues/866
Closes: https://github.com/rust-lang/compiler-team/issues/866

[1]: 555e1d0386/library/std/build.rs (L84-L186)
2025-04-27 19:58:44 +00:00
sayantn
163fb854a2
Add the avx10.1 and avx10.2 target features 2025-04-26 11:40:13 +05:30
Matthias Krüger
564e5ccb5c
Rollup merge of #140202 - est31:let_chains_feature_compiler, r=lcnr
Make #![feature(let_chains)] bootstrap conditional in compiler/

Let chains have been stabilized recently in #132833, so we can remove the gating from our uses in the compiler (as the compiler uses edition 2024).
2025-04-25 07:50:25 +02:00
bit-aloo
9bc04016e6
add custom enzyme markers to target methods 2025-04-25 11:09:52 +05:30
bit-aloo
f319dd909e
add llvm wrappers and corresponding methods in attribute 2025-04-25 11:09:52 +05:30
Matthias Krüger
c3f811f02f
Rollup merge of #139700 - EnzymeAD:autodiff-flags, r=oli-obk
Autodiff flags

Interestingly, it seems that some other projects have conflicts with exactly the same LLVM optimization passes as autodiff.
At least `LLVMRustOptimize` has exactly the flags that we need to disable problematic opt passes.

This PR enables us to compile code where users differentiate two identical functions in the same module. This has been especially common in test cases, but it's not impossible to encounter in the wild.

It also enables two new flags for testing/debugging. I am considering writing an MCP to upgrade PrintPasses to be a standalone -Z flag, since it is *not* the same as `-Z print-llvm-passes`, which IMHO gives less useful output. A discussion can be found here: [#t-compiler/llvm > Print llvm passes. @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/187780-t-compiler.2Fllvm/topic/Print.20llvm.20passes.2E/near/511533038)

Finally, it improves `PrintModBefore` and `PrintModAfter`. They used to work reliably, but now we just schedule Enzyme as part of an existing ModulePassManager (MPM). Since Enzyme is last in the MPM scheduling, PrintModBefore became very inaccurate. It used to print the input module that we gave to Enzyme, which was great for creating LLVM IR reproducers. However, lately the MPM would run the whole `default<O3>` pipeline, which heavily modifies the LLVM module, before we pass it to Enzyme. That made it impossible to use the flag to create LLVM IR reproducers for Enzyme bugs. We now schedule a PrintModule pass just before Enzyme, solving this problem.

Based on the PrintPass output, it also _seems_ like changing `registerEnzymeAndPassPipeline(PB, true);` to `registerEnzymeAndPassPipeline(PB, false);` has no effect. In theory, the bool should tell Enzyme to schedule some helpful passes in the PassBuilder. However, since it doesn't do anything and I'm not 100% sure anymore on whether we really need it, I'll just disable it for now and postpone investigations.

r? `@oli-obk`

closes #139471

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-04-24 17:19:44 +02:00
Matthias Krüger
a8ebfb256a
Rollup merge of #139261 - RalfJung:msvc-align-mitigation, r=oli-obk
mitigate MSVC alignment issue on x86-32

This implements mitigation for https://github.com/rust-lang/rust/issues/112480 by stopping to emit `align` attributes on loads and function arguments when building for a win32 MSVC target. MSVC is known to not properly align `u64` and similar types, and claiming to LLVM that everything is properly aligned increases the chance that this will cause problems.

Of course, the misalignment is still a bug, but we can't fix that bug, only MSVC can.

Also add an errata note to the platform support page warning users about this known problem.

try-job: `i686-msvc*`
2025-04-24 11:40:35 +02:00
est31
7493e1cdf6 Make #![feature(let_chains)] bootstrap conditional in compiler/ 2025-04-23 16:40:30 +02:00
Chris Denton
d15c603173
Rollup merge of #137953 - RalfJung:simd-intrinsic-masks, r=WaffleLapkin
simd intrinsics with mask: accept unsigned integer masks, and fix some of the errors

It's not clear at all why the mask would have to be signed, it is anyway interpreted bitwise. The backend should just make sure that works no matter the surface-level type; our LLVM backend already does this correctly. The note of "the mask may be widened, which only has the correct behavior for signed integers" explains... nothing? Why can't the code do the widening correctly? If necessary, just cast to the signed type first...
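
To make the widening remark concrete, here is a small sketch using plain integer casts. It only illustrates the difference between zero and sign extension for a single mask lane and is not how any backend implements the intrinsics; both function names are hypothetical.

```rust
/// Widen an 8-bit mask lane to 32 bits by zero extension.
/// An all-ones lane loses its "all bits set" property.
fn widen_zero_extend(lane: u8) -> u32 {
    lane as u32
}

/// Cast to the signed type first, then widen. The widening step is now a
/// sign extension, so an all-ones lane stays all-ones.
fn widen_via_signed(lane: u8) -> u32 {
    (lane as i8) as i32 as u32
}

fn main() {
    let all_ones: u8 = 0xFF;
    println!("{:#010x}", widen_zero_extend(all_ones));
    println!("{:#010x}", widen_via_signed(all_ones));
}
```

The surface-level type of the mask is unsigned in both functions; only the order of the casts decides whether the widened lane is still a valid all-ones mask.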

Also while we are at it, fix the errors. For simd_masked_load/store, the errors talked about the "third argument" but they meant the first argument (the mask is the first argument there). They also used the wrong type for `expected_element`.

I have extremely low confidence in the GCC part of this PR.

See [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/257879-project-portable-simd/topic/On.20the.20sign.20of.20masks)
2025-04-20 13:02:48 +00:00
Ralf Jung
566dfd1a0d simd intrinsics with mask: accept unsigned integer masks 2025-04-20 12:25:27 +02:00
Matthias Krüger
68b439c63b
Rollup merge of #138599 - adwinwhite:recursive-overflow, r=wesleywiser
avoid overflow when generating debuginfo for expanding recursive types

Fixes #135093
Fixes #121538
Fixes #107362
Fixes #100618
Fixes #115994

The overflow happens because expanding recursive types keep creating new nested types when recurring into sub fields.
I fixed that by returning an empty stub node when expanding recursion is detected.
2025-04-18 05:17:53 +02:00
Matthias Krüger
87a163523f
Rollup merge of #139351 - EnzymeAD:autodiff-batching2, r=oli-obk
Autodiff batching2

~I will rebase it once my first PR landed.~ done.
This autodiff batch mode is more similar to scalar autodiff, since it still only takes one shadow argument.
However, that argument is supposed to be `width` times larger.

r? `@oli-obk`

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-04-17 21:53:23 +02:00
Manuel Drehwald
a68ae0cbc1 working dupv and dupvonly for fwd mode 2025-04-16 17:13:31 -04:00
Vadim Petrochenkov
38f7060a73 Revert "Deduplicate template parameter creation"
This reverts commit 6adc2c1fd6.
2025-04-15 21:00:11 +03:00
Yotam Ofek
4b63362f3d Use newtype_index!-generated types more idiomatically 2025-04-14 16:17:06 +00:00
bjorn3
421f22e8bf Pass &mut self to codegen_global_asm 2025-04-14 09:38:04 +00:00
bjorn3
e2e96fa14e Pass MonoItemData to MonoItem::define 2025-04-14 09:38:03 +00:00
Manuel Drehwald
5ea9125f37 update documentation 2025-04-12 01:36:47 -04:00
Manuel Drehwald
31578dc587 fix "could not find source function" error by preventing function merging before AD 2025-04-12 01:36:47 -04:00
Manuel Drehwald
75f86e6e2e fix LooseTypes flag and PrintMod behaviour, add debug helper 2025-04-12 01:36:44 -04:00
Jacob Pratt
eea366c191
Rollup merge of #139664 - oli-obk:push-tkmurytmnsyw, r=RalfJung
Reuse address-space computation from global alloc

r? `@RalfJung`

just avoiding some minor duplication
2025-04-11 21:21:02 +02:00
bors
e1b06f7730 Auto merge of #139453 - compiler-errors:incr, r=jieyouxu
Prepend temp files with per-invocation random string to avoid temp filename conflicts

https://github.com/rust-lang/rust/issues/139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management.

Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name:

```
#[inline(never)]
#[cfg(any(rpass1, rpass3))]
fn a() -> i32 {
    0
}

#[cfg(any(cfail2))]
fn a() -> i32 {
    1
}

fn main() {
    evil::evil();
    assert_eq!(a(), 0);
}

mod evil {
    #[cfg(any(rpass1, rpass3))]
    pub fn evil() {
        unsafe {
            std::arch::asm!("/*  */");
        }
    }

    #[cfg(any(cfail2))]
    pub fn evil() {
        unsafe {
            std::arch::asm!("missing");
        }
    }
}
```

Session 1 (`rpass1`):
* Type-check, borrow-check, etc.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd.
* Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`.
* Save-temps option means we don't delete `XXX.rcgu.o`.
* Link the binary and stuff.
* Finalize[^2] the working incremental session by renaming `.../s-...-working` to `s-...-asjkdhsjakd` (some other finalized incr comp session dir name).

Session 2 (`cfail2`):
* Load artifacts from the previous *finalized* incremental session, namely the dep graph.
* Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o`. **HERE IS THE PROBLEM**: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory.
* Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the *previous* session is the last finalized session.

Session 3 (`rpass3`):
* Load artifacts from the previous *finalized* incremental session, namely the dep graph. NOTE that this is from session 1.
* All the dep graph nodes are green since we are basically replaying session 1.
* Codegen object file `XXX.o`, which is detected as *reused* from session 1 since dep nodes were green. That means we **reuse** `XXX.o` which had been dirtied from session 2.
* Link the binary and stuff.

This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1.

At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst **unsound**.
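
The core of the problem, two paths sharing one inode, can be reproduced outside the compiler. The following is a minimal sketch using only the standard library; the file names mirror the ones above, while `dirty_through_hard_link` and the scratch directory name are made up for this example. It assumes the temp directory lives on a filesystem that supports hard links.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write a "session 1" object file, hard-link it into the "cache",
/// overwrite the temp file as session 2 would, and return what the
/// cached path now contains.
fn dirty_through_hard_link(dir: &Path) -> io::Result<String> {
    let temp = dir.join("XXX.rcgu.o");
    let cached = dir.join("XXX.o");

    fs::write(&temp, "session 1")?;
    // Both paths now refer to the same inode.
    fs::hard_link(&temp, &cached)?;

    // `fs::write` truncates and rewrites the existing file in place,
    // so the write is visible through the other link as well.
    fs::write(&temp, "session 2")?;

    fs::read_to_string(&cached)
}

fn main() -> io::Result<()> {
    // Hypothetical scratch directory, unique per process.
    let dir = std::env::temp_dir().join(format!("hardlink-demo-{}", std::process::id()));
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir)?;

    let seen = dirty_through_hard_link(&dir)?;
    println!("cached copy now contains: {seen}");

    fs::remove_dir_all(&dir)
}
```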

This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example).

---

This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not *deterministic*, but temporary files are transient anyways, so I don't believe this is a problem.

That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc.
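
As a sketch of that naming scheme, assuming nothing about how rustc actually generates the string: `invocation_temp` and `temp_object_name` below are hypothetical helpers, and the randomness comes from the standard library's `RandomState` purely to keep the example dependency-free.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical helper: a short string that differs between invocations.
/// It is deliberately not deterministic.
fn invocation_temp() -> String {
    // `RandomState` is randomly seeded, so the hash of "nothing" varies.
    let hash = RandomState::new().build_hasher().finish();
    format!("{:08x}", hash as u32)
}

/// Build a temp object file name of the form
/// `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`.
fn temp_object_name(crate_name: &str, cgu: &str, invocation_temp: &str) -> String {
    format!("{crate_name}.{cgu}.{invocation_temp}.rcgu.o")
}

fn main() {
    let tag = invocation_temp();
    println!("{}", temp_object_name("hello_world", "cgu0", &tag));
}
```

With a fresh tag per invocation, a later session no longer writes to the path that an earlier session hard-linked into its finalized directory.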

Fixes https://github.com/rust-lang/rust/issues/139407

[^1]: 175dcc7773/compiler/rustc_fs_util/src/lib.rs (L60)
[^2]: 175dcc7773/compiler/rustc_incremental/src/persist/fs.rs (L1-L40)
2025-04-11 13:59:33 +00:00
Oli Scherer
cfa52e48ae Reuse address-space computation from global alloc 2025-04-11 09:28:47 +00:00
Stuart Cook
45ebc4060b
Rollup merge of #137447 - folkertdev:simd-extract-insert-dyn, r=scottmcm
add `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`

fixes https://github.com/rust-lang/rust/issues/137372

adds `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`, which contrary to their non-dyn counterparts allow a non-const index. Many platforms (but notably not x86_64 or aarch64) have dedicated instructions for this operation, which stdarch can emit with this change.

Future work is to also make the `Index` operation on the `Simd` type emit this operation, but the intrinsic can't be used directly. We'll need some MIR shenanigans for that.

r? `@ghost`
2025-04-11 13:31:43 +10:00
Folkert de Vries
59c55339af
add simd_insert_dyn and simd_extract_dyn 2025-04-10 21:22:07 +02:00
Ralf Jung
2678d04dd9 mitigate MSVC unsoundness by not emitting alignment attributes on win32-msvc targets
also mention the MSVC alignment issue in platform-support.md
2025-04-07 23:30:55 +02:00
Michael Goulet
9c372d8940 Prepend temp files with a string per invocation of rustc 2025-04-07 20:48:40 +00:00
Michael Goulet
effef88ac7 Simplify temp path creation a bit 2025-04-07 20:48:40 +00:00
Stuart Cook
5863b426b9
Rollup merge of #139465 - EnzymeAD:autodiff-sret, r=oli-obk
add sret handling for scalar autodiff

r? `@oli-obk`

Fixing one of the todo's which I left in my previous batching PR.
This one handles sret for scalar autodiff.  `sret` mostly shows up when we try to return a lot of scalar floats.
People often start testing autodiff with toy functions which just use a few scalars as inputs and outputs, and those were the most likely to be affected by this issue. So this fix should hopefully make learning/teaching a bit easier.

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-04-07 22:29:21 +10:00
Stuart Cook
ddf099ff4e
Rollup merge of #139397 - Zalathar:virtual, r=jieyouxu
coverage: Build the CGU's global file table as late as possible

Embedding coverage metadata in the output binary is a delicate dance, because per-function records need to embed references to the per-CGU filename table, but we only want to include files in that table if they are successfully used by at least one function.

The way that we build the file tables has changed a few times over the last few years. This particular change is motivated by experimental work on properly supporting macro-expansion regions, which adds some additional constraints that our previous implementation wasn't equipped to deal with.

LLVM is very strict about not allowing unused entries in local file tables. Currently that's not much of an issue, because we assume one source file per function, but to support expansion regions we need the flexibility to avoid committing to the use of a file until we're completely sure that we are able and willing to produce at least one coverage mapping region for it. In particular, when preparing a function's covfun record, we need the flexibility to decide at a late stage that a particular file isn't needed/usable after all.

(It's OK for the *global* file table to contain unused entries, but we would still prefer to avoid that if possible, and this implementation also achieves that.)
2025-04-07 22:29:20 +10:00
Manuel Drehwald
d6467d34ae handle sret for scalar autodiff 2025-04-07 07:07:16 -04:00
Zalathar
4322b6e97d coverage: Build the CGU's global file table as late as possible 2025-04-07 17:11:49 +10:00
bors
8fb32ab8e5 Auto merge of #139473 - Kobzol:rollup-ycksn9b, r=Kobzol
Rollup of 5 pull requests

Successful merges:

 - #138314 (fix usage of `autodiff` macro with inner functions)
 - #139426 (Make the UnifyKey and UnifyValue imports non-nightly)
 - #139431 (Remove LLVM 18 inline ASM span fallback)
 - #139456 (style guide: add let-chain rules)
 - #139467 (More trivial tweaks)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-04-07 06:27:35 +00:00
Zalathar
b3c40cf374 coverage: Deal with unused functions and their names in one place 2025-04-06 13:55:28 +10:00
Zalathar
75135aaf19 coverage: Extract module mapgen::unused for handling unused functions 2025-04-06 13:55:27 +10:00
beetrees
3aac9a37a5
Remove LLVM 18 inline ASM span fallback 2025-04-06 02:31:52 +01:00
Josh Stone
12167d7064 Update the minimum external LLVM to 19 2025-04-05 11:44:38 -07:00
Matthias Krüger
543160dd62
Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco
KCFI: Add KCFI arity indicator support

Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 10:18:03 +02:00
Ramon de C Valle
a98546b961 KCFI: Add KCFI arity indicator support
Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311,
https://github.com/llvm/llvm-project/pull/121070, and
https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).
2025-04-05 04:05:04 +00:00
Stuart Cook
c6bf3a01ef
Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk
Autodiff batching

Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is pass in a batch of 8/16/.. images and targets and compute the gradients for all of them at once, allowing better optimizations.

Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size,
and then each Dual/Duplicated argument has not one, but N shadow arguments.  So instead of
```rs
for i in 0..100 {
   df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in (0..100).step_by(4) {
   df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.

There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days.

I will also add more tests for both modes.

For anyone preferring a more interactive explanation, here's a video of Tim's LLVM dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:

- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
2025-04-05 13:18:13 +11:00
Manuel Drehwald
89d8948835 add new flag to print the module post-AD, before opts 2025-04-04 14:25:23 -04:00
Manuel Drehwald
b7c63a973f add autodiff batching backend 2025-04-04 14:24:23 -04:00
Matthias Krüger
66e61c78e7
Rollup merge of #138949 - madsmtm:rename-to-darwin, r=WaffleLapkin
Rename `is_like_osx` to `is_like_darwin`

Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.).

`@rustbot` label O-apple
r? compiler
2025-04-04 08:02:05 +02:00
Stuart Cook
5b0f658922
Rollup merge of #138003 - sayantn:new-amx, r=Amanieu
Add the new `amx` target features and the `movrs` target feature

Adds 5 new `amx` target features included in LLVM20. These are guarded under `x86_amx_intrinsics` (#126622)

 - `amx-avx512`
 - `amx-fp8`
 - `amx-movrs`
 - `amx-tf32`
 - `amx-transpose`

Adds the `movrs` target feature (from #137976).

`@rustbot` label O-x86_64 O-x86_32 T-compiler A-target-feature
r? `@Amanieu`
2025-04-02 13:10:36 +11:00
bors
85f518ec8e Auto merge of #138742 - taiki-e:riscv-vector, r=Amanieu
rustc_target: Add more RISC-V vector-related features and use zvl*b target features in vector ABI check

Currently, we have only the unstable `v` target feature, but RISC-V has more vector-related extensions. The first commit of this PR adds them to the unstable `riscv_target_feature`.

- `unaligned-vector-mem`: Has reasonably performant unaligned vector
  - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L1379)
  - Similar to the currently unstable `unaligned-scalar-mem` target feature, but for vector instructions.
- `zvfh`: Vector Extension for Half-Precision Floating-Point
  - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvfh-vector-extension-for-half-precision-floating-point)
  - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L668)
  - This implies `zvfhmin` and `zfhmin`
- `zvfhmin`: Vector Extension for Minimal Half-Precision Floating-Point
  - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvfhmin-vector-extension-for-minimal-half-precision-floating-point)
  - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L662)
  - This implies `zve32f`
- `zve32x`, `zve32f`, `zve64x`, `zve64f`, `zve64d`: Vector Extensions for Embedded Processors
  - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zve-vector-extensions-for-embedded-processors)
  - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L612-L641)
  - `zve32x` implies `zvl32b`
  - `zve32f` implies `zve32x` and `f`
  - `zve64x` implies `zve32x` and `zvl64b`
  - `zve64f` implies `zve32f` and `zve64x`
  - `zve64d` implies `zve64f` and `d`
  - `v` implies `zve64d`
- `zvl*b`: Minimum Vector Length Standard Extensions
  - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvl-minimum-vector-length-standard-extensions)
  - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L600-L610)
  - `zvl{N}b` implies `zvl{N>>1}b`
  - `v` implies `zvl128b`
- Vector Cryptography and Bit-manipulation Extensions
  - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/vector-crypto.adoc)
  - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L679-L807)
  - `zvkb`: Vector Bit-manipulation used in Cryptography
    - This implies `zve32x`
  - `zvbb`: Vector basic bit-manipulation instructions
    - This implies `zvkb`
  - `zvbc`: Vector Carryless Multiplication
    - This implies `zve64x`
  - `zvkg`: Vector GCM instructions for Cryptography
    - This implies `zve32x`
  - `zvkned`: Vector AES Encryption & Decryption (Single Round)
    - This implies `zve32x`
  - `zvknha`: Vector SHA-2 (SHA-256 only)
    - This implies `zve32x`
  - `zvknhb`: Vector SHA-2 (SHA-256 and SHA-512)
    - This implies `zve64x`
    - This is a superset of `zvknha`, but does not imply that feature, at least in LLVM
  - `zvksed`: SM4 Block Cipher Instructions
    - This implies `zve32x`
  - `zvksh`: SM3 Hash Function Instructions
    - This implies `zve32x`
  - `zvkt`: Vector Data-Independent Execution Latency
    - Similar to the already stabilized scalar cryptography extension `zkt`.
  - `zvkn`: Shorthand for 'Zvkned', 'Zvknhb', 'Zvkb', and 'Zvkt'
    - Similar to the already stabilized scalar cryptography extension `zkn`.
  - `zvknc`: Shorthand for 'Zvkn' and 'Zvbc'
  - `zvkng`: Shorthand for 'Zvkn' and 'Zvkg'
  - `zvks`: Shorthand for 'Zvksed', 'Zvksh', 'Zvkb', and 'Zvkt'
    - Similar to the already stabilized scalar cryptography extension `zks`.
  - `zvksc`: Shorthand for 'Zvks' and 'Zvbc'
  - `zvksg`: Shorthand for 'Zvks' and 'Zvkg'
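
The "implies" relations listed above are transitive, so enabling one feature can pull in several others. The following is a rough sketch of that closure, not the actual `rustc_target` data structure: the `IMPLIES` table and `implied_closure` are hypothetical names, and the table only copies the `zve*`/`zvl*b` edges from the list (with `zvl{N}b` implies `zvl{N>>1}b` written out for N = 128 and 64).

```rust
use std::collections::BTreeSet;

/// Direct implications copied from the list above (illustration only;
/// the crypto features and the larger `zvl*b` sizes are omitted).
const IMPLIES: &[(&str, &[&str])] = &[
    ("v", &["zve64d", "zvl128b"]),
    ("zve64d", &["zve64f", "d"]),
    ("zve64f", &["zve32f", "zve64x"]),
    ("zve64x", &["zve32x", "zvl64b"]),
    ("zve32f", &["zve32x", "f"]),
    ("zve32x", &["zvl32b"]),
    ("zvl128b", &["zvl64b"]),
    ("zvl64b", &["zvl32b"]),
];

/// Returns `feature` plus everything it transitively implies.
fn implied_closure(feature: &'static str) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![feature];
    while let Some(f) = stack.pop() {
        // Skip features that were already expanded.
        if !seen.insert(f) {
            continue;
        }
        for &(name, implied) in IMPLIES {
            if name == f {
                stack.extend(implied.iter().copied());
            }
        }
    }
    seen
}

fn main() {
    let all = implied_closure("v");
    // `v` reaches `zvl32b` via `zvl128b` -> `zvl64b` -> `zvl32b`.
    assert!(all.contains("zvl32b"));
    println!("{all:?}");
}
```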

Also, our vector ABI check wants the `zvl*b` target features, so the second commit of this PR updates the vector ABI check to use them.

4e2b096ed6/compiler/rustc_target/src/target_features.rs (L707-L708)
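
One way to picture a `zvl*b`-based check is as a lookup from a vector's size in bits to the smallest `zvl{N}b` feature that guarantees at least that vector length. The sketch below is an assumption about the general shape of such a check, not a copy of the code referenced above; `ZVL_TABLE` and `required_zvl` are hypothetical names, and the table is truncated at 1024 bits for brevity.

```rust
/// Hypothetical (size in bits, feature name) table, sorted by size.
const ZVL_TABLE: &[(u64, &str)] = &[
    (32, "zvl32b"),
    (64, "zvl64b"),
    (128, "zvl128b"),
    (256, "zvl256b"),
    (512, "zvl512b"),
    (1024, "zvl1024b"),
];

/// Smallest `zvl*b` feature whose minimum vector length covers `vector_bits`,
/// or `None` if the (truncated) table has no large enough entry.
fn required_zvl(vector_bits: u64) -> Option<&'static str> {
    ZVL_TABLE
        .iter()
        .find(|&&(bits, _)| bits >= vector_bits)
        .map(|&(_, name)| name)
}

fn main() {
    assert_eq!(required_zvl(128), Some("zvl128b"));
    // Sizes between table entries round up to the next entry.
    assert_eq!(required_zvl(96), Some("zvl128b"));
}
```

Because `zvl{N}b` implies `zvl{N>>1}b`, a target that enables a larger `zvl*b` feature would also satisfy every smaller requirement.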

---

r? `@Amanieu`

`@rustbot` label +O-riscv +A-target-feature
2025-03-30 02:21:56 +00:00
bors
2a06022951 Auto merge of #138503 - bjorn3:string_merging, r=tmiasko
Avoid wrapping constant allocations in packed structs when not necessary

This way LLVM will set the string merging flag if the alloc is a nul-terminated string, reducing binary sizes.
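
As a rough illustration of what "nul-terminated string" means for a constant allocation, the sketch below checks that the bytes end in exactly one nul with no nul earlier. It assumes this is the condition that matters for string merging; the function name `is_c_string` is hypothetical, and this is not the code from the PR.

```rust
/// True if `bytes` ends with a nul byte and contains no other nul.
fn is_c_string(bytes: &[u8]) -> bool {
    match bytes.split_last() {
        // The last byte is nul; the rest must not contain one.
        Some((&0, rest)) => !rest.contains(&0),
        _ => false,
    }
}

fn main() {
    assert!(is_c_string(b"hello\0"));
    assert!(!is_c_string(b"hello"));
    assert!(!is_c_string(b"he\0llo\0"));
}
```

The standard library's `CStr::from_bytes_with_nul` applies the same rule and could be used instead of the hand-written check.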

try-job: armhf-gnu
2025-03-28 10:18:32 +00:00
bjorn3
5c82a59bd3 Add test and comment 2025-03-28 09:19:57 +00:00
bjorn3
a5fa12b6b9 Avoid wrapping constant allocations in packed structs when not necessary
This way LLVM will set the string merging flag if the alloc is a
nul-terminated string, reducing binary sizes.
2025-03-28 09:19:57 +00:00
bors
65899c06f1 Auto merge of #138893 - klensy:thorin-0.9, r=Mark-Simulacrum
bump thorin to 0.9 to drop duped deps

Bumps `thorin`, removing duped deps.

This also changes features for hashbrown:
```
hashbrown v0.15.2
`-- indexmap v2.7.0
    |-- object v0.36.7
    |-- wasmparser v0.219.1
    |-- wasmparser v0.223.0
    `-- wit-component v0.223.0
    |-- indexmap feature "default"
    |-- indexmap feature "serde"
    `-- indexmap feature "std"
|-- hashbrown feature "default-hasher"
|   |-- object v0.36.7 (*)
|   `-- wasmparser v0.223.0 (*)
|-- hashbrown feature "nightly"
|   |-- rustc_data_structures v0.0.0
|   `-- rustc_query_system v0.0.0
`-- hashbrown feature "serde"
    `-- wasmparser feature "serde"
```
to
```
hashbrown v0.15.2
`-- indexmap v2.7.0
    |-- object v0.36.7
    |-- wasmparser v0.219.1
    |-- wasmparser v0.223.0
    `-- wit-component v0.223.0
    |-- indexmap feature "default"
    |-- indexmap feature "serde"
    `-- indexmap feature "std"
|-- hashbrown feature "allocator-api2"
|   `-- hashbrown feature "default"
|-- hashbrown feature "default" (*)
|-- hashbrown feature "default-hasher"
|   |-- object v0.36.7 (*)
|   `-- wasmparser v0.223.0 (*)
|   `-- hashbrown feature "default" (*)
|-- hashbrown feature "equivalent"
|   `-- hashbrown feature "default" (*)
|-- hashbrown feature "inline-more"
|   `-- hashbrown feature "default" (*)
|-- hashbrown feature "nightly"
|   |-- rustc_data_structures v0.0.0
|   `-- rustc_query_system v0.0.0
|-- hashbrown feature "raw-entry"
|   `-- hashbrown feature "default" (*)
`-- hashbrown feature "serde"
    `-- wasmparser feature "serde"
```

To be safe, as this can be perf-sensitive:
`@bors` rollup=never
2025-03-26 07:54:26 +00:00
Mads Marquart
328846c6eb Rename is_like_osx to is_like_darwin 2025-03-25 21:53:52 +01:00
Matthias Krüger
b66e9320c5
Rollup merge of #137247 - dpaoliello:cleanllvm, r=Zalathar
cg_llvm: Reduce the visibility of types, modules and using declarations in `rustc_codegen_llvm`.

Final part of #135502

Reduces the visibility of types, modules and using declarations in the `rustc_codegen_llvm` to private or `pub(crate)` where possible, and marks unused fields and enum entries with `#[expect(dead_code)]`.
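
For readers unfamiliar with the two tools mentioned here, the following is a small standalone sketch, not taken from `rustc_codegen_llvm`. The module and item names (`llvm_like`, `Handle`, `Mode`) are made up, and `#[expect(...)]` needs Rust 1.81 or newer.

```rust
mod llvm_like {
    /// Usable anywhere in this crate, but not exported from it.
    pub(crate) struct Handle {
        pub(crate) id: u32,
        // Written but never read; `expect` silences the lint and will warn
        // if the field ever starts being used.
        #[expect(dead_code)]
        reserved: u32,
    }

    impl Handle {
        pub(crate) fn new(id: u32) -> Self {
            Handle { id, reserved: 0 }
        }
    }

    pub(crate) enum Mode {
        Fast,
        // Kept for completeness even though nothing constructs it.
        #[expect(dead_code)]
        Unused,
    }
}

fn main() {
    let handle = llvm_like::Handle::new(7);
    assert_eq!(handle.id, 7);
    let mode = llvm_like::Mode::Fast;
    assert!(matches!(mode, llvm_like::Mode::Fast));
}
```

Unlike `#[allow(dead_code)]`, an `#[expect(dead_code)]` that no longer matches any warning produces a lint of its own, so stale annotations get noticed.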

r? Zalathar
2025-03-25 18:09:03 +01:00
Daniel Paoliello
79b9664091 Reduce visibility of most items in rustc_codegen_llvm 2025-03-25 16:36:47 +11:00
bors
1df5affaca Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm
Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics

Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics.

These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19.
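
At the source level, the operation being lowered is an ordinary three-way comparison. The sketch below uses only `Ord::cmp` from the standard library; the helper names are hypothetical. It shows the -1/0/1 result and why signed and unsigned comparisons need separate intrinsics.

```rust
/// Signed three-way comparison: -1 if a < b, 0 if equal, 1 if a > b.
fn three_way_signed(a: i32, b: i32) -> i8 {
    // `Ordering` is `repr(i8)` with Less = -1, Equal = 0, Greater = 1.
    a.cmp(&b) as i8
}

/// Unsigned three-way comparison with the same result encoding.
fn three_way_unsigned(a: u32, b: u32) -> i8 {
    a.cmp(&b) as i8
}

fn main() {
    assert_eq!(three_way_signed(-1, 1), -1);
    // The same bit pattern compares differently when treated as unsigned.
    assert_eq!(three_way_unsigned(-1i32 as u32, 1), 1);
    assert_eq!(three_way_signed(3, 3), 0);
}
```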

I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something.

r? `@scottmcm`
2025-03-24 22:53:12 +00:00
klensy
724a5a430b bump thorin to drop duped deps 2025-03-24 19:38:16 +03:00
Matthias Krüger
0c594da55f
Rollup merge of #138627 - EnzymeAD:autodiff-cleanups, r=oli-obk
Autodiff cleanups

Splitting out some cleanups to reduce the size of my batching PR and simplify `@haenoe`'s [PR](https://github.com/rust-lang/rust/pull/138314).

r? `@oli-obk`

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-03-21 15:48:55 +01:00
Taiki Endo
55add8fce3 rustc_target: Add more RISC-V vector-related features 2025-03-20 19:47:57 +09:00
Zalathar
2e36990881 coverage: Convert and check span coordinates without a local file ID
For expansion region support, we will want to be able to convert and check
spans before creating a corresponding local file ID.

If we create local file IDs eagerly, but some expansion turns out to have no
successfully-converted spans, LLVM will complain about that expansion's file ID
having no regions.
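
The ordering described above (convert and check spans first, allocate a local file ID only if something survived) can be sketched as follows. All types and names here (`Coords`, `LocalFileIds`, `convert`, `expansion_regions`) are hypothetical stand-ins, not the compiler's coverage types, and the validity rule in `convert` is invented for the example.

```rust
/// Hypothetical converted span coordinates (1-based lines).
#[derive(Debug, PartialEq)]
struct Coords {
    start_line: u32,
    end_line: u32,
}

/// Hypothetical conversion check: rejects line 0 and inverted ranges.
fn convert(span: (u32, u32)) -> Option<Coords> {
    let (start_line, end_line) = span;
    (start_line >= 1 && start_line <= end_line).then_some(Coords { start_line, end_line })
}

/// Hands out consecutive local file IDs.
#[derive(Default)]
struct LocalFileIds {
    next: u32,
}

impl LocalFileIds {
    fn allocate(&mut self) -> u32 {
        let id = self.next;
        self.next += 1;
        id
    }
}

/// Converts spans first and allocates a file ID only if at least one
/// span converted successfully, so no ID ends up without regions.
fn expansion_regions(spans: &[(u32, u32)], ids: &mut LocalFileIds) -> Option<(u32, Vec<Coords>)> {
    let coords: Vec<Coords> = spans.iter().filter_map(|&s| convert(s)).collect();
    if coords.is_empty() {
        return None;
    }
    Some((ids.allocate(), coords))
}

fn main() {
    let mut ids = LocalFileIds::default();
    // No span converts, so no ID is consumed.
    assert!(expansion_regions(&[(0, 3), (9, 2)], &mut ids).is_none());
    assert_eq!(ids.next, 0);
    let (id, coords) = expansion_regions(&[(4, 6), (0, 1)], &mut ids).unwrap();
    assert_eq!(id, 0);
    assert_eq!(coords, vec![Coords { start_line: 4, end_line: 6 }]);
}
```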
2025-03-20 13:29:32 +11:00
Zalathar
d07ef5b0e1 coverage: Add LLVM plumbing for expansion regions
This is currently unused, but paves the way for future work on expansion
regions without having to worry about the FFI parts.
2025-03-20 12:40:36 +11:00
Matthias Krüger
5661e98058
Rollup merge of #138674 - oli-obk:llvm-cleanups, r=compiler-errors
Various codegen_llvm cleanups

Mostly just adding safe wrappers and deduplicating code
2025-03-19 08:17:19 +01:00
Oli Scherer
f4b0984854 Create a safe wrapper around LLVMRustDIBuilderCreateMemberType 2025-03-18 17:15:02 +00:00
Oli Scherer
1f34b19596 Avoid splitting up a layout 2025-03-18 17:01:09 +00:00
Zalathar
cc8336b6c1 coverage: Don't store a body span in FunctionCoverageInfo 2025-03-18 23:18:24 +11:00
Zalathar
cd2b978433 coverage: Don't refer to the body span when enlarging empty spans
Given that we now only enlarge empty spans to "{" or "}", there shouldn't be
any danger of enlarging beyond a function body.
2025-03-18 23:18:23 +11:00
Manuel Drehwald
47c07ed963 [NFC] simplify matching 2025-03-17 19:13:09 -04:00
Manuel Drehwald
f4c297802f [NFC] extract autodiff call lowering in cg_llvm into own function 2025-03-17 18:58:51 -04:00
bors
493c38ba37 Auto merge of #127173 - bjorn3:mangle_rustc_std_internal_symbol, r=wesleywiser,jieyouxu
Mangle rustc_std_internal_symbols functions

This reduces the risk of issues when using a staticlib or Rust dylib compiled with a different rustc version in a Rust program. Currently this will either (in the case of a staticlib) cause a linker error due to duplicate symbol definitions, or (in the case of Rust dylibs) cause rustc_std_internal_symbols functions to be silently overridden. As Rust gets more commonly used inside the implementation of libraries consumed through a C interface (like SpiderMonkey, Ruby YJIT (which currently has to do partial linking of all Rust code to hide all symbols not part of the C API), and the Rusticl OpenCL implementation in Mesa), this is becoming much more of an issue. With this PR the only symbols remaining with an unmangled name are `rust_eh_personality` (LLVM doesn't allow renaming it) and `__rust_no_alloc_shim_is_unstable`.

Helps mitigate https://github.com/rust-lang/rust/issues/104707

try-job: aarch64-gnu-debug
try-job: aarch64-apple
try-job: x86_64-apple-1
try-job: x86_64-mingw-1
try-job: i686-mingw-1
try-job: x86_64-msvc-1
try-job: i686-msvc-1
try-job: test-various
try-job: armhf-gnu
2025-03-17 22:16:22 +00:00
Oli Scherer
018032c682 Create a safe wrapper around LLVMRustDIBuilderCreateBasicType 2025-03-17 16:58:44 +00:00