2605 Commits

Author SHA1 Message Date
Michael Goulet
a59a8f9e75 Revert "Auto merge of #135335 - oli-obk:push-zxwssomxxtnq, r=saethlin"
This reverts commit a7a6c64a65, reversing
changes made to ebbe63891f.
2025-03-02 18:52:48 +00:00
Matthias Krüger
3bf976542a
Rollup merge of #137804 - RalfJung:backend-repr-simd-vector, r=workingjubilee
rename BackendRepr::Vector → SimdVector

For many Rustaceans, "vector" does not imply "SIMD", so let's be more clear in this type that is used pervasively in the compiler.

r? `@workingjubilee`
2025-03-01 16:03:10 +01:00
bors
0c72c0d11a Auto merge of #133250 - DianQK:embed-bitcode-pgo, r=nikic
The embedded bitcode should always be prepared for LTO/ThinLTO

Fixes #115344. Fixes #117220.

There are currently two methods for generating bitcode that is used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`.

When using `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, and that bitcode is then used for LTO. As a result, we run the Call Graph Profile Pass twice on the same module.

This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`.

r? nikic
2025-03-01 08:22:18 +00:00
bors
30508faeb3 Auto merge of #137796 - jieyouxu:rollup-qt9yr1g, r=jieyouxu
Rollup of 10 pull requests

Successful merges:

 - #134943 (Add FileCheck annotations to mir-opt/issues)
 - #137017 (Don't error when adding a staticlib with bitcode files compiled by newer LLVM)
 - #137197 (Update some comparison codegen tests now that they pass in LLVM20)
 - #137540 (Fix (more) test directives that were accidentally ignored)
 - #137551 (import `simd_` intrinsics)
 - #137599 (tests: use minicore more)
 - #137673 (Fix Windows `Command` search path bug)
 - #137676 (linker: Fix escaping style for response files on Windows)
 - #137693 (Re-enable `--generate-link-to-defintion` for tools internal rustdoc)
 - #137770 (Fix sized constraint for unsafe binder)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-03-01 00:53:19 +00:00
Ralf Jung
aac65f562b rename BackendRepr::Vector → SimdVector 2025-02-28 17:17:45 +01:00
许杰友 Jieyou Xu (Joe)
61e90040db
Rollup merge of #137017 - bjorn3:ignore_invalid_bitcode, r=oli-obk
Don't error when adding a staticlib with bitcode files compiled by newer LLVM

cc https://github.com/rust-lang/rust/issues/128955#issuecomment-2657811196
2025-02-28 22:29:49 +08:00
许杰友 Jieyou Xu (Joe)
d65f568302
Rollup merge of #137713 - vayunbiyani:fix-enzyme-build-errors, r=oli-obk
Fix enzyme build errors

After [this PR](https://github.com/rust-lang/rust/pull/136428) was merged, I switched to master and tried building with `./x.py build --stage 1 library` using the config mentioned in the enzyme rustbook. That build produced some errors, although the build with config.example.toml succeeded.
The errors were:
### 1. Use of ref in match patterns
The errors were related to match ergonomics in Rust 2024, where `ref` is no longer needed (and is now rejected) when matching on references. Examples:
```
error: binding modifiers may only be written when the default binding mode is `move`
   --> compiler/rustc_builtin_macros/src/autodiff.rs:136:31
    |
136 |             Annotatable::Item(ref iitem) => {
    |                               ^^^ binding modifier not allowed under `ref` default binding mode
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html>
note: matching on a reference type with a non-reference pattern changes the default binding mode
   --> compiler/rustc_builtin_macros/src/autodiff.rs:136:13
    |
136 |             Annotatable::Item(ref iitem) => {
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_`
help: remove the unnecessary binding modifier
    |
136 -             Annotatable::Item(ref iitem) => {
136 +             Annotatable::Item(iitem) => {
    |

error: binding modifiers may only be written when the default binding mode is `move`
   --> compiler/rustc_builtin_macros/src/autodiff.rs:146:36
    |
146 |             Annotatable::AssocItem(ref assoc_item, _) => {
    |                                    ^^^ binding modifier not allowed under `ref` default binding mode
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html>
note: matching on a reference type with a non-reference pattern changes the default binding mode
   --> compiler/rustc_builtin_macros/src/autodiff.rs:146:13
    |
146 |             Annotatable::AssocItem(ref assoc_item, _) => {
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_`
help: remove the unnecessary binding modifier
    |
146 -             Annotatable::AssocItem(ref assoc_item, _) => {
146 +             Annotatable::AssocItem(assoc_item, _) => {
    |

error: binding modifiers may only be written when the default binding mode is `move`
   --> compiler/rustc_builtin_macros/src/autodiff.rs:174:31
    |
174 | ...       Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ide...
    |                             ^^^ binding modifier not allowed under `ref` default binding mode
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html>
note: matching on a reference type with a non-reference pattern changes the default binding mode
   --> compiler/rustc_builtin_macros/src/autodiff.rs:174:13
    |
174 | ...   Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ident.c...
    |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_`
help: remove the unnecessary binding modifier
    |
174 -             Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ident.clone()),
174 +             Annotatable::Item(iitem) => (iitem.vis.clone(), iitem.ident.clone()),
    |

error: binding modifiers may only be written when the default binding mode is `move`
   --> compiler/rustc_builtin_macros/src/autodiff.rs:175:36
    |
175 |             Annotatable::AssocItem(ref assoc_item, _) => {
    |                                    ^^^ binding modifier not allowed under `ref` default binding mode
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html>
note: matching on a reference type with a non-reference pattern changes the default binding mode
   --> compiler/rustc_builtin_macros/src/autodiff.rs:175:13
    |
175 |             Annotatable::AssocItem(ref assoc_item, _) => {
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_`
help: remove the unnecessary binding modifier
    |
175 -             Annotatable::AssocItem(ref assoc_item, _) => {
175 +             Annotatable::AssocItem(assoc_item, _) => {
    |

error: could not compile `rustc_builtin_macros` (lib) due to 4 previous errors
warning: build failed, waiting for other jobs to finish...
Build completed unsuccessfully in 0:19:39
```
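All four errors above have the same shape, so a small sketch may help. It uses a hypothetical stand-in enum named `Annotatable` (an assumption for illustration, not the real `rustc_ast` type) and shows the form the compiler's `help:` notes suggest: when the scrutinee is a reference, the bindings are already references, so no `ref` is written.

```rust
// Hypothetical stand-in for rustc's `Annotatable`; only the shape matters here.
#[allow(dead_code)]
enum Annotatable {
    Item(String),
    AssocItem(String, u32),
}

// Matching on `&Annotatable` with a non-reference pattern switches the default
// binding mode to `ref`, so `name` is already a `&String` without writing `ref`.
fn name_of(a: &Annotatable) -> &str {
    match a {
        // Under edition 2024, `Annotatable::Item(ref name)` is rejected here.
        Annotatable::Item(name) => name.as_str(),
        Annotatable::AssocItem(name, _) => name.as_str(),
    }
}

fn main() {
    let item = Annotatable::Item("f".to_string());
    let assoc = Annotatable::AssocItem("g".to_string(), 0);
    println!("{} {}", name_of(&item), name_of(&assoc));
}
```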
### 2. Use of `extern` blocks without `unsafe` in compiler/rustc_codegen_llvm/src/llvm/enzyme_ffi.rs (I don't have the error message handy)

The first commit fixes the errors above

---
## Additional Improvement:
`@ZuseZ4` suggested we consolidate the variants under `#[cfg(llvm_enzyme)]` and `#[cfg(not(llvm_enzyme))]` by conditionally checking for `cfg!(llvm_enzyme)` instead. This way, the autodiff code is always compiled but not executed, which avoids such regressions.
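To illustrate the difference between the attribute and the macro, here is a minimal sketch. It uses `debug_assertions` as a stand-in for `llvm_enzyme` (an assumption, since `llvm_enzyme` is only set inside the compiler's own build), and the function names are hypothetical.

```rust
// Attribute form: exactly one of these two items is compiled. The other one is
// never type-checked, so it can stop compiling without anyone noticing.
#[cfg(debug_assertions)]
fn backend_attr() -> &'static str {
    "enabled"
}
#[cfg(not(debug_assertions))]
fn backend_attr() -> &'static str {
    "disabled"
}

// Macro form: `cfg!` expands to a `bool` literal, so both branches are always
// type-checked and only the selected one runs.
fn backend_macro() -> &'static str {
    if cfg!(debug_assertions) { "enabled" } else { "disabled" }
}

fn main() {
    // Both forms agree on the result; they differ in what gets compiled.
    assert_eq!(backend_attr(), backend_macro());
    println!("{}", backend_macro());
}
```

The trade-off is that with `cfg!` every branch must type-check in every configuration, which is exactly the property wanted here.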

r? `@ZuseZ4`
cc: `@oli-obk`
2025-02-28 21:42:01 +08:00
Josh Stone
396c2a8659 Stop using hash_raw_entry in CodegenCx::const_str
That unstable feature completed fcp-close, so the compiler needs to be
migrated away to allow its removal. In this case, `cg_llvm` and `cg_gcc`
were using raw entries to optimize their `const_str_cache` lookup and
insertion. We can change that to separate `get` and (on miss) `insert`
calls, so we still have the fast path avoiding string allocation when
the cache hits.
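The lookup pattern described above can be sketched as follows. `StrCache` and `intern` are hypothetical names for illustration, and the value type is a plain index rather than whatever the real `const_str_cache` stores.

```rust
use std::collections::HashMap;

// Hypothetical cache from string contents to an id; not the real compiler type.
struct StrCache {
    map: HashMap<String, usize>,
}

impl StrCache {
    fn new() -> Self {
        StrCache { map: HashMap::new() }
    }

    fn intern(&mut self, s: &str) -> usize {
        // Fast path: looking up by `&str` allocates nothing.
        if let Some(&id) = self.map.get(s) {
            return id;
        }
        // Miss: allocate the owned key once and insert it. This hashes the
        // string a second time, but only on the slow path.
        let id = self.map.len();
        self.map.insert(s.to_owned(), id);
        id
    }
}

fn main() {
    let mut cache = StrCache::new();
    let a = cache.intern("hello");
    let b = cache.intern("world");
    let a_again = cache.intern("hello");
    println!("{a} {b} {a_again}");
}
```

Compared with the raw entry API, the cost is one extra hash computation on a miss; the hit path stays allocation-free.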
2025-02-27 09:09:52 -08:00
bjorn3
9f190d764f Restore usage of io::Error 2025-02-26 13:45:35 +00:00
León Orell Valerian Liehr
a579a23a73
Rollup merge of #137603 - davidtwco:extern-types-no-deref, r=lcnr
codegen_llvm: avoid `Deref` impls w/ extern type

`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was or contained an extern type - in my experimental implementation of rust-lang/rfcs#3729, this isn't possible as the `Target` associated type's `?Sized` bound cannot be relaxed backwards compatibly (unless we come up with some way of doing this).

In later pull requests with the rust-lang/rfcs#3729 implementation, breakage like this could only occur for nightly users relying on the `extern_types` feature.

Upstreaming this to avoid needing to keep carrying this patch locally, and I think it'll necessarily need to change eventually.
2025-02-26 04:15:06 +01:00
León Orell Valerian Liehr
1511ccd6f8
Rollup merge of #137595 - folkertdev:remove-simd-pow-powi, r=RalfJung
remove `simd_fpow` and `simd_fpowi`

Discussed in https://github.com/rust-lang/rust/issues/137555

These functions are not exposed from `std::intrinsics::simd`, and not used anywhere outside of the compiler. They also don't lower to particularly good code at least on the major ISAs (I checked x86_64, aarch64, s390x, powerpc), where the vector is just spilled to the stack and scalar functions are used for the actual logic.

r? `@RalfJung`
2025-02-25 13:07:40 +01:00
Vayun Biyani
cb53e97870 Fix enzyme build errors 2025-02-25 17:25:50 +05:30
Folkert de Vries
60a268998c
remove simd_fpow and simd_fpowi 2025-02-25 09:20:10 +01:00
Michael Goulet
6c1f959288
Rollup merge of #137556 - RalfJung:simd_shuffle_const_generic, r=oli-obk
rename simd_shuffle_generic → simd_shuffle_const_generic

I've been confused by this name one time too often. ;)

r? `@oli-obk`
2025-02-24 19:21:51 -05:00
Michael Goulet
828a3a41b3
Rollup merge of #137417 - taiki-e:riscv-atomic, r=Amanieu
rustc_target: Add more RISC-V atomic-related features

This is a continuation of https://github.com/rust-lang/rust/pull/130877 and adds a few target features, including `zacas`, which was experimental in LLVM 19 and marked non-experimental in LLVM 20.

This adds the following target features to unstable riscv_target_feature:

- `za64rs` (Za64rs Extension 1.0): Reservation Set Size of at Most 64 Bytes
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L227-L228), [available since LLVM 18](8649328060))
- `za128rs` (Za128rs Extension 1.0): Reservation Set Size of at Most 128 Bytes
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L230-L231), [available since LLVM 18](8649328060))
  - IIUC, `za*rs` can be referenced when implementing helpers to reduce contention in synchronization primitives, like [`crossbeam_utils::CachePadded`](https://docs.rs/crossbeam-utils/latest/crossbeam_utils/struct.CachePadded.html). (relevant discussion: https://github.com/riscv/riscv-profiles/issues/79)
- `zacas` (Zacas Extension 1.0): Atomic Compare-And-Swap Instructions (`amocas.{w,d,q}{,.aq,.rl,.aqrl}` and `amocas.{b,h}{,.aq,.rl,.aqrl}` when `zabha` is also enabled)
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L240-L243), [available as non-experimental since LLVM 20](614aeda93b))
  - This implies `zaamo`.
  - This is used to optimize CAS in existing atomics and/or implement 64-bit/128-bit atomics on riscv32/riscv64 (e.g., https://github.com/taiki-e/portable-atomic/pull/173).
  - Note that [LLVM does not automatically use this instruction for 64-bit/128-bit atomics on riscv32/riscv64 even if this feature is enabled, because doing it changes the ABI](876174ffd7/llvm/docs/RISCVUsage.rst (riscv-zacas-note)). (If the ability to do that is provided by LLVM in the future, it should probably be controlled by another ABI feature similar to `forced-atomics`.)
- `zama16b` (Zama16b Extension 1.0): Atomic 16-byte misaligned loads, stores and AMOs
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L255-L256), [available since LLVM 19](b090569685))
  - IIUC, unlike AArch64 FEAT_LSE2 which also makes 16-byte aligned ldp ({i,u}128 load) atomic, this extension only affects instructions that already considered atomic if they were naturally aligned. i.e., fld (f64 load) on riscv32 would not be atomic with or without this extension ([relevant QEMU code](b69801dd6b/target/riscv/insn_trans/trans_rvd.c.inc (L50-L62))).
- `zawrs` (Zawrs Extension 1.0): Wait on Reservation Set (`wrs.nto` and `wrs.sto`)
  ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L258), [available as non-experimental since LLVM 17](d41a73aa94))
  - This is used to optimize synchronization primitives (e.g., Linux uses this for spinlocks (b8ddb0df30)).

Btw, the question of whether `zaamo` is implied by `zabha` or not, which was discussed in https://github.com/rust-lang/rust/pull/130877, has been resolved in LLVM 20, since LLVM now treats `zaamo` as implied by `zabha`/`zacas` (https://github.com/llvm/llvm-project/pull/115694), just like GCC and rustc.

r? `@Amanieu`

`@rustbot` label +O-riscv +A-target-feature
2025-02-24 19:21:47 -05:00
Ralf Jung
0362775fb5 rename simd_shuffle_generic → simd_shuffle_const_generic 2025-02-24 19:13:23 +01:00
Oli Scherer
553828c6f4 Mark more LLVM FFI as safe 2025-02-24 15:11:29 +00:00
Oli Scherer
3565603d25 Use a safe wrapper around an LLVM FFI function 2025-02-24 15:11:29 +00:00
Oli Scherer
f16f64b15a Remove inherent function that has a trait method duplicate of a commonly imported trait 2025-02-24 15:11:29 +00:00
Oli Scherer
241c83f0c7 Deduplicate more functions between SimpleCx and CodegenCx 2025-02-24 15:11:29 +00:00
Oli Scherer
29440b84a9 Remove an unused lifetime param 2025-02-24 15:11:29 +00:00
Oli Scherer
396baa750e Make allocator shim creation mostly use safe code 2025-02-24 15:11:29 +00:00
Oli Scherer
840e31b29f Generalize BaseTypeCodegenMethods 2025-02-24 15:11:29 +00:00
Oli Scherer
75356b7437 Generalize BackendTypes over GenericCx 2025-02-24 15:11:29 +00:00
Oli Scherer
bfd88cead0 Avoid some duplication between SimpleCx and CodegenCx 2025-02-24 15:11:29 +00:00
Oli Scherer
d4379d2afd Remove an unnecessary lifetime 2025-02-24 15:05:56 +00:00
Oli Scherer
a54bfcf52b Use safe FFI for various functions in codegen_llvm 2025-02-24 15:05:56 +00:00
David Wood
a5615d3c62
codegen_llvm: avoid Deref impls w/ extern type
`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was
or contained an extern type - in my experimental implementation of
rust-lang/rfcs#3729, this isn't possible as the `Target` associated
type's `?Sized` bound cannot be relaxed backwards compatibly (unless we
come up with some way of doing this).

In later pull requests with the rust-lang/rfcs#3729 implementation,
breakage like this could only occur for nightly users relying on the
`extern_types` feature.

Upstreaming this to avoid needing to keep carrying this patch locally,
and I think it'll necessarily need to change eventually.
2025-02-24 08:08:55 +00:00
bors
e0be1a0262 Auto merge of #137271 - nikic:gep-nuw-2, r=scottmcm
Emit getelementptr inbounds nuw for pointer::add()

Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative.
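As a rough illustration of the precondition involved, here is a sketch of the kind of user code this affects. The helper `nth` is hypothetical; the point is only that `add` takes an unsigned element count and requires the result to stay inside the same allocation.

```rust
// Reads element `i` through raw-pointer arithmetic.
fn nth(xs: &[u32], i: usize) -> u32 {
    assert!(i < xs.len(), "index out of bounds");
    // SAFETY: `i < xs.len()`, so `as_ptr().add(i)` stays inside the slice's
    // allocation, and the byte offset `i * 4` therefore cannot wrap.
    unsafe { *xs.as_ptr().add(i) }
}

fn main() {
    let xs = [10, 20, 30];
    println!("{}", nth(&xs, 2));
}
```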

Fixes https://github.com/rust-lang/rust/issues/137217.
2025-02-24 03:06:16 +00:00
Trevor Gross
a2bb4d748d
Rollup merge of #136543 - RalfJung:round-ties-even, r=tgross35
intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic

LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that.
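For readers unfamiliar with the rounding mode in question, the following sketch shows it through the stable `f64::round_ties_even` method. That this method is backed by the unified intrinsic is an assumption on my part; the example only demonstrates the semantics.

```rust
// Rounds to the nearest integer; exact halves go to the even neighbour.
fn ties_even(x: f64) -> f64 {
    x.round_ties_even()
}

fn main() {
    // Halfway cases pick the even integer.
    assert_eq!(ties_even(2.5), 2.0);
    assert_eq!(ties_even(3.5), 4.0);
    assert_eq!(ties_even(-2.5), -2.0);
    // By contrast, `round` sends halfway cases away from zero.
    assert_eq!(2.5f64.round(), 3.0);
    println!("ok");
}
```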

Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35`

try-job: test-various
2025-02-23 14:30:25 -05:00
DianQK
da50297a6e
Save pre-link bitcode to ModuleCodegen 2025-02-23 21:23:38 +08:00
DianQK
9431427cc3
Add new_regular and new_allocator to ModuleCodegen 2025-02-23 21:23:38 +08:00
DianQK
1a99ca8da9
The embedded bitcode should always be prepared for LTO/ThinLTO 2025-02-23 21:23:36 +08:00
bors
15469f8f8a Auto merge of #137420 - matthiaskrgr:rollup-rr0q37f, r=matthiaskrgr
Rollup of 9 pull requests

Successful merges:

 - #136910 (Implement feature `isolate_most_least_significant_one` for integer types)
 - #137183 (Prune dead regionck code)
 - #137333 (Use `edition = "2024"` in the compiler (redux))
 - #137356 (Ferris 🦀 Identifier naming conventions)
 - #137362 (Add build step log for `run-make-support`)
 - #137377 (Always allow reusing cratenum in CrateLoader::load)
 - #137388 (Fix(lib/fs/tests): Disable rename POSIX semantics FS tests under Windows 7)
 - #137410 (Use StableHasher + Hash64 for dep_tracking_hash)
 - #137413 (jubilee cleared out the review queue)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-02-22 13:32:44 +00:00
Taiki Endo
a343dcb97f rustc_target: Add more RISC-V atomic-related features 2025-02-22 16:15:14 +09:00
Manuel Drehwald
e2d250c3f6 update autodiff flags 2025-02-21 21:51:20 -05:00
Manuel Drehwald
f4e2218b13 clean up autodiff code/comments 2025-02-21 21:47:48 -05:00
Michael Goulet
e1819a889a Fix overcapturing, unsafe extern blocks, and new unsafe ops 2025-02-22 00:01:48 +00:00
Michael Goulet
76d341fa09 Upgrade the compiler to edition 2024 2025-02-22 00:01:48 +00:00
Matthias Krüger
636f4f19d8
Rollup merge of #137313 - oli-obk:push-ywvuqkxuqyom, r=petrochenkov
Some codegen_llvm cleanups

Using some more safe wrappers and thus being able to remove a large unsafe block.

As a next step we should probably look into safe extern fns
2025-02-21 12:45:26 +01:00
Zachary S
7ba3d7b54e Remove BackendRepr::Uninhabited, replaced with an uninhabited: bool field in LayoutData.
Also update comments that referred to BackendRepr::Uninhabited.
2025-02-20 13:27:32 -06:00
Oli Scherer
ce7f58bd91 Merge two operations that were always performed together 2025-02-20 11:24:00 +00:00
Oli Scherer
ea7180813b Create safe helper for LLVMSetDLLStorageClass 2025-02-20 11:15:00 +00:00
Scott McMurray
6f9cfd694d Rework OperandRef::extract_field to stop calling to_immediate_scalar on things which are already immediates
That means it stops trying to truncate things that are already `i1`s.
2025-02-19 12:03:40 -08:00
Scott McMurray
642a705f71 PR feedback 2025-02-19 11:36:52 -08:00
Scott McMurray
511bf307f0 Emit trunc nuw for unchecked shifts and to_immediate_scalar
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
2025-02-19 11:36:52 -08:00
Nikita Popov
31cc4c074d Emit getelementptr inbounds nuw for pointer::add() 2025-02-19 11:32:32 +01:00
Nikita Popov
5e9d8a7d55 Switch to the LLVMBuildGEPWithNoWrapFlags API
This API allows us to set the nuw flag as well.
2025-02-19 11:32:32 +01:00
Matthias Krüger
2bd65ebede
Rollup merge of #137210 - workingjubilee:fixup-passmode-import, r=RalfJung
compiler: Stop reexporting stuff in cg_llvm::abi

The reexports confuse tooling like rustdoc into thinking cg_llvm is the source of key types that originate in rustc_target.
2025-02-19 01:30:12 +01:00
Jubilee Young
2d2de18166 compiler: Stop reexporting stuff in cg_llvm::abi
The reexports confuse tooling like rustdoc into thinking cg_llvm is
the source of key types that originate in rustc_target.
2025-02-18 00:31:29 -08:00
bors
3b022d8cee Auto merge of #133852 - x17jiri:cold_path, r=saethlin
improve cold_path()

#120370 added a new intrinsic `cold_path()` and used it to fix `likely` and `unlikely`.

However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits the usefulness of `cold_path` for idiomatic Rust. For example, code like this:

```
if let Some(x) = y { ... }
```

may generate a 3-target switch:

```
switch y.discriminator:
0 => true branch
1 => false branch
_ => unreachable
```

and therefore marking a branch as cold will have no effect.

This PR improves `cold_path()` to work with arbitrary switch instructions.

Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. Clang's weight calculation is more complex than the one in this PR, which I believe is mainly because `switch` in C/C++ can have multiple cases going to the same target.
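A stable-Rust approximation of the `if let` case discussed above is sketched below. It does not call the real `cold_path()` intrinsic; `cold_stand_in` is a hypothetical `#[cold]` function used as a stand-in, which is a common way to hint that a branch is unlikely on stable toolchains.

```rust
// Stand-in for the `cold_path()` intrinsic: calling a `#[cold]` function hints
// to the optimizer that the branch containing the call is rarely taken.
#[cold]
#[inline(never)]
fn cold_stand_in() {}

fn value_or_zero(y: Option<u32>) -> u32 {
    if let Some(x) = y {
        x
    } else {
        // The `None` arm is marked as the unlikely one.
        cold_stand_in();
        0
    }
}

fn main() {
    println!("{} {}", value_or_zero(Some(7)), value_or_zero(None));
}
```

Whether the hint survives to the final branch weights depends on how the match is lowered, which is the limitation this PR addresses for the intrinsic itself.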
2025-02-18 07:49:09 +00:00
Nicholas Nethercote
fd7b4bf4e1 Move methods from Map to TyCtxt, part 2.
Continuing the work started in #136466.

Every method gains a `hir_` prefix, though for the ones that already
have a `par_` or `try_par_` prefix I added the `hir_` after that.
2025-02-18 10:17:44 +11:00
Jiri Bobek
7bb5f4dd78 improve cold_path() 2025-02-17 06:39:58 +01:00
Matthias Krüger
fab38375bc
Rollup merge of #137095 - saethlin:use-hash64-for-hashes, r=workingjubilee
Replace some u64 hashes with Hash64

I introduced the Hash64 and Hash128 types in https://github.com/rust-lang/rust/pull/110083, essentially as a mechanism to prevent hashes from landing in our leb128 encoding paths. If you just have a u64 or u128 field in a struct then derive Encodable/Decodable, that number gets leb128 encoding. So if you need to store a hash or some other value which behaves very close to a hash, don't store it as a u64.
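One plausible reason to keep hashes out of leb128 paths is size: a variable-length encoding only pays off for small numbers, and hashes are rarely small. The sketch below is a hypothetical helper (not the compiler's encoder) that counts how many bytes unsigned LEB128 needs for a `u64`.

```rust
// Unsigned LEB128 stores 7 payload bits per byte; the high bit of each byte
// signals that more bytes follow.
fn leb128_len(mut v: u64) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

fn main() {
    // Small values are compact.
    assert_eq!(leb128_len(127), 1);
    assert_eq!(leb128_len(128), 2);
    // A value with its top bit set needs 10 bytes, more than a fixed 8-byte field.
    assert_eq!(leb128_len(1u64 << 63), 10);
    assert_eq!(leb128_len(u64::MAX), 10);
    println!("ok");
}
```

Since a well-distributed 64-bit hash is at least 2^56 in 255 out of 256 cases, it would almost always take 9 or 10 bytes this way.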

This reverts part of https://github.com/rust-lang/rust/pull/117603, which turned an encoded Hash64 into a u64.

Based on https://github.com/rust-lang/rust/pull/110083, I don't expect this to be perf-sensitive on its own, though I expect that it may help stabilize some of the small rmeta size fluctuations we currently see in perf reports.
2025-02-17 06:38:14 +01:00
Ben Kimock
4cf21866e8 Move hashes from rustc_data_structure to rustc_hashes so they can be shared with rust-analyzer 2025-02-16 16:18:30 -05:00
Jacob Pratt
d3556c6644
Rollup merge of #136545 - durin42:nvptx64-align, r=nikic
nvptx64: update default alignment to match LLVM 21

This changed in llvm/llvm-project@91cb8f5d32. The commit itself is mostly about some intrinsic instructions, but as an aside it also mentions something about addrspace for tensor memory, which I believe is what this string is telling us.

`@rustbot` label: +llvm-main
2025-02-16 00:51:24 -05:00
bors
bdc97d1046 Auto merge of #136575 - scottmcm:nsuw-math, r=nikic
Set both `nuw` and `nsw` in slice size calculation

There's an old note in the code to do this, and now that [LLVM-C has an API for it](f0b8ff1251/llvm/include/llvm-c/Core.h (L4403-L4408)), we might as well.  And it's been there since what looks like LLVM 17 de9b6aa341 so doesn't even need to be conditional.

(There's other places, like `RawVecInner` or `Layout`, that might want to do things like this too, but I'll leave those for a future PR.)
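To see why both flags are sound for this multiplication, recall that a Rust slice can never span more than `isize::MAX` bytes. The sketch below (with a hypothetical helper `slice_bytes`) checks that invariant against `size_of_val`; it illustrates the reasoning, not the codegen change itself.

```rust
use std::mem::{size_of, size_of_val};

// Computes the byte size of a slice the same way the compiler conceptually
// does: element count times element size.
fn slice_bytes<T>(xs: &[T]) -> usize {
    let n = xs.len() * size_of::<T>();
    // The product fits in `isize`, so it overflows neither as an unsigned
    // multiply (nuw) nor as a signed multiply (nsw).
    debug_assert!(n <= isize::MAX as usize);
    debug_assert_eq!(n, size_of_val(xs));
    n
}

fn main() {
    println!("{}", slice_bytes(&[0u32; 4]));
}
```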
2025-02-14 14:21:29 +00:00
bjorn3
736ef0a4ce Don't error when adding a staticlib with bitcode files compiled by newer LLVM 2025-02-14 10:54:21 +00:00
bors
905b1bf1cc Auto merge of #137010 - workingjubilee:rollup-g00c07v, r=workingjubilee
Rollup of 9 pull requests

Successful merges:

 - #135439 (Make `-O` mean `OptLevel::Aggressive`)
 - #136460 (Simplify `rustc_span` `analyze_source_file`)
 - #136904 (add `IntoBounds` trait)
 - #136908 ([AIX] expect `EINVAL` for `pthread_mutex_destroy`)
 - #136924 (Add profiling of bootstrap commands using Chrome events)
 - #136951 (Use the right binder for rebinding `PolyTraitRef`)
 - #136981 (ci: switch loongarch jobs to free runners)
 - #136992 (Update backtrace)
 - #136993 ([cg_llvm] Remove dead error message)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-02-14 06:13:42 +00:00
Jubilee
e8d0d00798
Rollup merge of #136993 - dpaoliello:cleanllvm4, r=workingjubilee
[cg_llvm] Remove dead error message

Part of #135502

Discovered a dead error message in rustc_codegen_llvm, so removing it.

r? `@Zalathar`
2025-02-13 21:37:54 -08:00
Scott McMurray
9ad6839f7a Set both nuw and nsw in slice size calculation
There's an old note in the code to do this, and now that LLVM-C has an API for it, we might as well.
2025-02-13 21:26:48 -08:00
Jubilee
864eba9fb1
Rollup merge of #136895 - maurer:fix-enum-discr, r=nikic
debuginfo: Set bitwidth appropriately in enum variant tags

Previously, we unconditionally set the bitwidth to 128-bits, the largest an enum would possibly be. Then, LLVM would cut down the constant by chopping off leading zeroes before emitting the DWARF. LLVM only supported 64-bit enumerators, so this would also have occasionally resulted in truncated data.

LLVM added support for 128-bit enumerators in llvm/llvm-project#125578

That patchset trusts the constant to describe how wide the variant tag is, so the high 64-bits of zeros are considered potentially load-bearing.

As a result, we went from emitting tags that looked like:
DW_AT_discr_value     (0xfe)

(because `dwarf::BestForm` selected `data1`)

to emitting tags that looked like:
DW_AT_discr_value	(<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )

This makes the `DW_AT_discr_value` encode at the bitwidth of the tag, which:
1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support which may not be supported by all debuggers / downstream tools.
3. Will result in smaller debug information.
2025-02-13 17:46:08 -08:00
Daniel Paoliello
bfdc96114c [cg_llvm] Remove dead error message 2025-02-13 15:04:39 -08:00
clubby789
2966256133 Make -O mean -C opt-level=3 2025-02-13 19:47:55 +00:00
Jacob Pratt
f7d5285062
Rollup merge of #136881 - dpaoliello:cleanllvm3, r=Zalathar
cg_llvm: Reduce visibility of all functions in the llvm module

Next part of #135502

This reduces the visibility of all functions in the `llvm` module to `pub(crate)` and marks the `enzyme_ffi` modules with `#![expect(dead_code)]` (as previously discussed: <https://github.com/rust-lang/rust/pull/135502#discussion_r1915608085>).

r? `@Zalathar`
2025-02-13 03:53:31 -05:00
Jacob Pratt
1f669fdc7d
Rollup merge of #136858 - safinaskar:parallel-cleanup-2025-02-11-07-54, r=SparrowLii
Parallel-compiler-related cleanup

I carefully split changes into commits. Commit messages are self-explanatory. Squashing is not recommended.

cc "Parallel Rustc Front-end" https://github.com/rust-lang/rust/issues/113349

r? SparrowLii

`@rustbot` label: +WG-compiler-parallel
2025-02-13 03:53:31 -05:00
Daniel Paoliello
e7cef26a3d cg_llvm: Reduce visibility of all functions in the llvm module 2025-02-13 12:36:25 +11:00
Zalathar
659e20fa75 Remove LLVMGetModuleContext
This was unused after the removal of `-Zprofile` in #131829.
2025-02-13 12:36:09 +11:00
Jacob Pratt
33c186baf7
Rollup merge of #136807 - workingjubilee:merge-gpus-to-get-the-arcradeongeforce, r=bjorn3
compiler: internally merge `PtxKernel` into `GpuKernel`

r? `@bjorn3` for review
2025-02-12 20:10:00 -05:00
Jacob Pratt
0de2341fef
Rollup merge of #136217 - taiki-e:csky-asm-flags, r=Amanieu
Mark condition/carry bit as clobbered in C-SKY inline assembly

C-SKY's compare and some arithmetic/logical instructions modify the condition/carry bit (C) in PSR, but there is currently no way to mark it as clobbered in `asm!`.

This PR marks it as clobbered except when [`options(preserves_flags)`](https://doc.rust-lang.org/reference/inline-assembly.html#r-asm.options.supported-options.preserves_flags) is used.

Refs:
- Section 1.3 "Programming model" and Section 1.3.5 "Condition/carry bit" in CSKY Architecture user_guide:
  9f7121f7d4/CSKY%20Architecture%20user_guide.pdf

  > Under user mode, condition/carry bit (C) is located in the lowest bit of PSR, and it can be
accessed and changed by common user instructions. It is the only data bit that can be visited
under user mode in PSR.

  > Condition or carry bit represents the result after one operation. Condition/carry bit can be
clearly set according to the results of compare instructions or unclearly set as some
high-precision arithmetic or logical instructions. In addition, special instructions such as
DEC[GT,LT,NE] and XTRB[0-3] will influence the value of condition/carry bit.

- Register definition in LLVM:
  https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0/llvm/lib/Target/CSKY/CSKYRegisterInfo.td#L88

cc `@Dirreke` ([target maintainer](aa6f5ab18e/src/doc/rustc/src/platform-support/csky-unknown-linux-gnuabiv2.md (target-maintainers)))

r? `@Amanieu`

`@rustbot` label +O-csky +A-inline-assembly
2025-02-12 20:09:58 -05:00
Jacob Pratt
a53cd3c979
Rollup merge of #135025 - Flakebi:alloca-addrspace, r=nikic
Cast allocas to default address space

Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if an `alloca` is created in a different address space, it is cast to the default address space before its value is used.

This is necessary for the amdgpu target and others where the default address space for `alloca`s is not 0.

For example the following code compiles incorrectly when not casting the address space to the default one:

```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```

results in

```llvm
    %local = alloca addrspace(5) i8
    %res = alloca addrspace(5) ptr

if:
    ; Store 64-bit flat pointer
    store ptr %p, ptr addrspace(5) %res

else:
    ; Store 32-bit scratch pointer
    store ptr addrspace(5) %local, ptr addrspace(5) %res

ret:
    ; Load and return 64-bit flat pointer
    %res.load = load ptr, ptr addrspace(5) %res
    ret ptr %res.load
```

For amdgpu, `addrspace(0)` are 64-bit pointers, `addrspace(5)` are 32-bit pointers.
The above code may store a 32-bit pointer and read it back as a 64-bit pointer, which is obviously wrong and cannot work. Instead, we need to `addrspacecast %local to ptr addrspace(0)`, then we store and load the correct type.

Tracking issue: #135024
2025-02-12 20:09:56 -05:00
Matthew Maurer
d82219a4fa debuginfo: Set bitwidth appropriately in enum variant tags
Previously, we unconditionally set the bitwidth to 128-bits, the largest
a discriminator would possibly be. Then, LLVM would cut down the constant by
chopping off leading zeroes before emitting the DWARF. LLVM only
supported 64-bit discriminators, so this would also have occasionally
resulted in truncated data (or an assert) if more than 64-bits were
used.

LLVM added support for 128-bit enumerators in llvm/llvm-project#125578

That patchset also trusts the constant to describe how wide the variant tag is.
As a result, we went from emitting tags that looked like:
DW_AT_discr_value     (0xfe)

(`form1`)

to emitting tags that looked like:
DW_AT_discr_value	(<0x10> fe ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 )

This makes the `DW_AT_discr_value` encode at the bitwidth of the tag,
which:
1. Is probably closer to our intentions in terms of describing the data.
2. Doesn't invoke the 128-bit support which may not be supported by all
   debuggers / downstream tools.
3. Will result in smaller debug information.
2025-02-12 18:01:42 +00:00
Matthias Krüger
9e89feefb9
Rollup merge of #135549 - oli-obk:push-tmxtpnrloyqu, r=compiler-errors
Document some safety constraints and use more safe wrappers

Lots of unsafe codegen_llvm code has safe wrappers already, so I used some of them and added some where applicable. I stopped here because this diff is large enough and should probably be reviewed independently of other changes.
2025-02-12 06:07:35 +01:00
Oli Scherer
dcf1e4d72b Document some safety constraints and use more safe wrappers 2025-02-11 09:47:13 +00:00
Oli Scherer
4b83038d63 Add a safe wrapper for WriteBitcodeToFile 2025-02-11 09:41:22 +00:00
Oli Scherer
b2cd1b8ead Remove an unsafe closure invariant by inlining the closure wrapper into the called function 2025-02-11 09:41:22 +00:00
Askar Safin
51f49d8464 compiler/rustc_codegen_llvm/src/lib.rs: remove "unsafe impl Send/Sync" 2025-02-11 09:58:53 +03:00
Jacob Pratt
c49ffaf7eb
Rollup merge of #136813 - mrkajetanp:aarch32-fp16-target-feature, r=davidtwco
rustc_target: Add the fp16 target feature for AArch32

As in the commit description. The feature is already available in rustc for AArch64.
2025-02-11 01:02:41 -05:00
Jacob Pratt
6153a8dcea
Rollup merge of #136721 - dpaoliello:cleanllvm2, r=Zalathar
cg_llvm: Reduce visibility of some items outside the `llvm` module

Next piece of #135502

This reduces the visibility of items (other than those in the `llvm` module) so that dead code analysis will correctly identify unused items.
2025-02-11 01:02:40 -05:00
Flakebi
cde7e805ad
Cast allocas to default address space
Pointers for variables all need to be in the same address space for
correct compilation. Therefore ensure that even if an `alloca` is
created in a different address space, it is cast to the default
address space before its value is used.

This is necessary for the amdgpu target and others where the default
address space for `alloca`s is not 0.

For example the following code compiles incorrectly when not casting the
address space to the default one:

```rust
fn f(p: *const i8 /* addrspace(0) */) -> *const i8 /* addrspace(0) */ {
    let local = 0i8; /* addrspace(5) */
    let res = if cond { p } else { &raw const local };
    res
}
```

results in

```llvm
    %local = alloca addrspace(5) i8
    %res = alloca addrspace(5) ptr

if:
    ; Store 64-bit flat pointer
    store ptr %p, ptr addrspace(5) %res

else:
    ; Store 32-bit scratch pointer
    store ptr addrspace(5) %local, ptr addrspace(5) %res

ret:
    ; Load and return 64-bit flat pointer
    %res.load = load ptr, ptr addrspace(5) %res
    ret ptr %res.load
```

For amdgpu, `addrspace(0)` pointers are 64-bit and `addrspace(5)` pointers
are 32-bit.
The above code may store a 32-bit pointer and read it back as a 64-bit
pointer, which cannot work. Instead, we need to
`addrspacecast %local to ptr addrspace(0)` so that the store and the load
use the same pointer type.
2025-02-10 21:38:44 +01:00
Daniel Paoliello
5f29273921 rustc_codegen_llvm: Mark items as pub(crate) outside of the llvm module 2025-02-10 10:17:25 -08:00
Matthias Krüger
78f5bddd57
Rollup merge of #136419 - EnzymeAD:autodiff-tests, r=onur-ozkan,jieyouxu
adding autodiff tests

I'd like to get started with upstreaming some tests, even though I'm still waiting for an answer on how to best integrate the enzyme pass. Can we therefore temporarily support the -Z llvm-plugins here without too much effort? And in that case, how would that work? I saw you can do remapping, e.g. `rust-src-base`, but I don't think that will give me the path to libEnzyme.so. Do you have another suggestion?

Other than that this test simply checks that the derivative of `x*x` is `2.0 * x`, which in this case is computed as
`%0 = fadd fast double %x.0.val, %x.0.val`
(I'll add a few more tests and move it to an autodiff folder if we can use the -Z flag)
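
The identity the test relies on can be checked without Enzyme. The following is a minimal sketch in plain Rust that does not use the autodiff feature. The names `square`, `d_square`, and `central_difference` are hypothetical and exist only for illustration. The hand-written derivative uses `x + x`, mirroring the `fadd` in the expected IR.

```rust
/// The function under test: f(x) = x * x.
fn square(x: f64) -> f64 {
    x * x
}

/// Hand-written derivative, written as `x + x` to mirror the `fadd`
/// shown above. This is not generated by Enzyme.
fn d_square(x: f64) -> f64 {
    x + x
}

/// Central finite difference, used only as an independent numerical check.
fn central_difference(f: impl Fn(f64) -> f64, x: f64, h: f64) -> f64 {
    (f(x + h) - f(x - h)) / (2.0 * h)
}
```

Comparing `d_square(x)` with `central_difference(square, x, 1e-4)` at a few sample points should agree up to floating-point rounding, since the central difference of a quadratic has no truncation error.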

r? `@jieyouxu`

Locally at least `-Zllvm-plugins=${PWD}/build/x86_64-unknown-linux-gnu/enzyme/build/Enzyme/libEnzyme-19.so` seems to work if I copy the command I get from x.py test and run it manually. However, running x.py test itself fails.

Tracking:

- https://github.com/rust-lang/rust/issues/124509

Zulip discussion: https://rust-lang.zulipchat.com/#narrow/channel/326414-t-infra.2Fbootstrap/topic/Enzyme.20build.20changes
2025-02-10 16:38:23 +01:00
Jubilee
7f8108afc8
Rollup merge of #136053 - Zalathar:defer-counters, r=saethlin
coverage: Defer part of counter-creation until codegen

Follow-up to #135481 and #135873.

One of the pleasant properties of the new counter-assignment algorithm is that we can stop partway through the process, store the intermediate state in MIR, and then resume the rest of the algorithm during codegen. This lets it take into account which parts of the control-flow graph were eliminated by MIR opts, resulting in fewer physical counters and simpler counter expressions.

Those improvements end up completely obsoleting much larger chunks of code that were previously responsible for cleaning up the coverage metadata after MIR opts, while also doing a more thorough cleanup job.

(That change also unlocks some further simplifications that I've kept out of this PR to limit its scope.)
2025-02-10 00:51:49 -08:00
Jubilee Young
e11e2b4d09 compiler: internally merge Conv::PtxKernel into GpuKernel
It is speculated that these two can be conceptually merged, and it can
start by ripping out rustc's notion of the PtxKernel call convention.
Leave the ExternAbi for now, but the nvptx target now should see it as
just a different way to spell Conv::GpuKernel.
2025-02-09 23:14:55 -08:00
Manuel Drehwald
061abbc369 remove outdated *First autodiff variants for higher-order ad 2025-02-10 01:35:53 -05:00
Manuel Drehwald
1221cff551 move second opt run to lto phase and cleanup code 2025-02-10 01:35:22 -05:00
bors
124cc92199 Auto merge of #136751 - bjorn3:update_rustfmt, r=Mark-Simulacrum
Update bootstrap compiler and rustfmt

The rustfmt version we previously used formats things differently from what the latest nightly rustfmt does. This causes issues for subtrees that get formatted both in-tree and in their own repo. Updating the rustfmt used in-tree solves those issues. Also bumped the bootstrap compiler as the stage0 update command always updates both at the same time.
2025-02-09 15:44:16 +00:00
bors
a26e97be88 Auto merge of #136754 - Urgau:rollup-qlkhjqr, r=Urgau
Rollup of 5 pull requests

Successful merges:

 - #134679 (Windows: remove readonly files)
 - #136213 (Allow Rust to use a number of libc filesystem calls)
 - #136530 (Implement `x perf` directly in bootstrap)
 - #136601 (Detect (non-raw) borrows of null ZST pointers in CheckNull)
 - #136659 (Pick the max DWARF version when LTO'ing modules with different versions)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-02-09 12:54:26 +00:00
Jubilee
5e4d6278af
Rollup merge of #136706 - workingjubilee:finish-up-rustc-abi-updates, r=compiler-errors
compiler: mostly-finish `rustc_abi` updates

This almost-finishes all the updates in the compiler to use `rustc_abi` and removes some of the reexports of `rustc_abi` items in `rustc_target` that were previously available.

r? `@compiler-errors`
2025-02-08 20:41:21 -08:00
Urgau
5ec56e5fbb
Rollup merge of #136659 - wesleywiser:dwarf_version_lto_merge_behavior, r=jieyouxu
Pick the max DWARF version when LTO'ing modules with different versions

Currently, when rustc compiles code with `-Clto` enabled that was built
with different choices for `-Zdwarf-version`, a warning will be
reported. It's very easy to observe this by compiling most anything (eg,
"hello world") and specifying `-Clto -Zdwarf-version=5` since the
standard library is distributed with `-Zdwarf-version=4`.

This behavior isn't actually useful for a few reasons:
- From observation, LLVM chooses to pick the highest DWARF version
  anyway after issuing the warning.
- Clang specifies that in this case, the max version should be picked
  without a warning and as a general principle, we want to support
  x-lang LTO with Clang which implies using the same module flag merge
  behaviors.
- Debuggers need to be able to handle a variety of versions within the
  same debugging session as you can easily have some parts of a binary
  (or some dynamic libraries within an application) all compiled with
  different DWARF versions.

This commit changes the module flag merge behavior to match Clang and
use the highest version of DWARF. It also adds a test to ensure this
behavior is respected in the case of two crates being LTO'd together and
adds a test to ensure no warning is printed.

Fixes #130041 which fails due to these warnings being printed

cc #103057
2025-02-09 00:37:28 +01:00
bjorn3
1fcae03369 Rustfmt 2025-02-08 22:12:13 +00:00
Wesley Wiser
bbc40e7822 Pick the max DWARF version when LTO'ing modules with different versions
Currently, when rustc compiles code with `-Clto` enabled that was built
with different choices for `-Zdwarf-version`, a warning will be
reported. It's very easy to observe this by compiling most anything (eg,
"hello world") and specifying `-Clto -Zdwarf-version=5` since the
standard library is distributed with `-Zdwarf-version=4`.

This behavior isn't actually useful for a few reasons:
- from observation, LLVM chooses to pick the highest DWARF version
  anyway after issuing the warning
- Clang specifies that in this case, the max version should be picked
  without a warning and as a general principle, we want to support
  x-lang LTO with Clang which implies using the same module flag merge
  behaviors
- Debuggers need to be able to handle a variety of versions within the
  same debugging session as you can easily have some parts of a binary
  (or some dynamic libraries within an application) all compiled with
  different DWARF versions

This commit changes the module flag merge behavior to match Clang and
use the highest version of DWARF. It also adds a test to ensure this
behavior is respected in the case of two crates being LTO'd together and
updates the test added in the previous commit to ensure no warning is
printed.
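
The difference between the two merge behaviors can be sketched as a toy model. `MergeBehavior` and `merge_dwarf_version` are hypothetical names used only for illustration; this is not the LLVM or rustc API. The `Warning` arm assumes the observation stated above, namely that the highest version was picked anyway after the warning.

```rust
/// Toy model of two module flag merge behaviors (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MergeBehavior {
    /// Old behavior: report a warning when the versions disagree.
    Warning,
    /// New behavior: silently take the highest version, matching Clang.
    Max,
}

/// Returns the merged DWARF version and whether a warning would be reported.
fn merge_dwarf_version(behavior: MergeBehavior, a: u32, b: u32) -> (u32, bool) {
    match behavior {
        // Assumption taken from the commit message: the highest version
        // wins even when the warning is issued.
        MergeBehavior::Warning => (a.max(b), a != b),
        MergeBehavior::Max => (a.max(b), false),
    }
}
```

In this model, merging versions 4 and 5 yields version 5 under both behaviors. Only `Warning` flags the mismatch, which is the noise this commit removes.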
2025-02-08 16:33:36 +00:00
Manuel Drehwald
21d096184e fix non-enzyme builds 2025-02-07 22:27:46 -05:00
Matthias Krüger
c9771e9590
Rollup merge of #136691 - bjorn3:linkage_cleanup, r=jieyouxu
Remove Linkage::Private and Linkage::Appending

Neither of them has any known or theoretical use case.
2025-02-08 03:58:48 +01:00
Matthias Krüger
93b194516a
Rollup merge of #136640 - Zalathar:debuginfo-align-bits, r=compiler-errors
Debuginfo for function ZSTs should have alignment of 8 bits, not 1 bit

In #116096, function ZSTs were made to have debuginfo that gives them an alignment of “1”. But because alignment in LLVM debuginfo is denoted in *bits*, not bytes, this resulted in an alignment specification of 1 bit instead of 1 byte.

I don't know whether this has any practical consequences, but I noticed that a test started failing when I accidentally fixed the mistake while working on #136632, so I extracted the fix (and the test adjustment) to this PR.
2025-02-08 03:58:45 +01:00
Jubilee Young
eddfe8f503 compiler: remove reexports from rustc_target::callconv 2025-02-07 11:25:18 -08:00
Kajetan Puchalski
53f9852224 rustc_target: Add the fp16 target feature for AArch32 2025-02-07 18:08:19 +00:00
bjorn3
f68cd90412 Remove Linkage::Appending
It can only be used for certain LLVM internal variables like
llvm.global_ctors which users are not allowed to define.
2025-02-07 16:02:19 +00:00
bjorn3
382e4031c2 Remove Linkage::Private
This is the same as Linkage::Internal except that it doesn't emit any
symbol. Some backends may not support it and it isn't all that useful
anyway.
2025-02-07 16:02:19 +00:00
Daniel Paoliello
2a6b27444a Remove dead code from rustc_codegen_llvm and the LLVM wrapper 2025-02-06 16:53:52 -08:00