mirror of https://github.com/rust-lang/rust.git synced 2025-06-19 02:57:33 +00:00

Author	SHA1	Message	Date
Chris Denton	e082bf341f	Rollup merge of #140323 - tgross35:cfg-unstable-float, r=Urgau Implement the internal feature `cfg_target_has_reliable_f16_f128` Support for `f16` and `f128` is varied across targets, backends, and backend versions. Eventually we would like to reach a point where all backends support these approximately equally, but until then we have to work around some of these nuances of support being observable. Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which provides the following new configuration gates: * `cfg(target_has_reliable_f16)` * `cfg(target_has_reliable_f16_math)` * `cfg(target_has_reliable_f128)` * `cfg(target_has_reliable_f128_math)` `reliable_f16` and `reliable_f128` indicate that basic arithmetic for the type works correctly. The `_math` versions indicate that anything relying on `libm` works correctly, since sometimes this hits a separate class of codegen bugs. These options match configuration set by the build script at [1]. The logic for LLVM support is duplicated as-is from the same script. There are a few possible updates that will come as a follow up. The config introduced here is not planned to ever become stable, it is only intended to replace the build scripts for `std` tests and `compiler-builtins` that don't have any way to configure based on the codegen backend. MCP: https://github.com/rust-lang/compiler-team/issues/866 Closes: https://github.com/rust-lang/compiler-team/issues/866 [1]: `555e1d0386/library/std/build.rs (L84-L186)` --- The second commit makes use of this config to replace `cfg_{f16,f128}{,_math}` in `library/`. I omitted providing a `cfg(bootstrap)` configuration to keep things simpler since the next beta branch is in two weeks. try-job: aarch64-gnu try-job: i686-msvc-1 try-job: test-various try-job: x86_64-gnu try-job: x86_64-msvc-ext2	2025-04-28 23:29:17 +00:00
Chris Denton	d4845e1b0b	Rollup merge of #139308 - Shourya742:2025-03-29-add-autodiff-inline, r=ZuseZ4 add autodiff inline closes: #138920 r? ```@ZuseZ4``` try-job: dist-aarch64-linux	2025-04-28 23:29:14 +00:00
bit-aloo	7018392337	remove noinline attribute and add alwaysinline after AD pass	2025-04-28 21:10:32 +05:30
Andrew Zhogin	c366756a85	AsyncDrop implementation using shim codegen of async_drop_in_place::{closure}, scoped async drop added.	2025-04-28 16:23:13 +07:00
Trevor Gross	6ceeb0849e	Implement the internal feature `cfg_target_has_reliable_f16_f128` Support for `f16` and `f128` is varied across targets, backends, and backend versions. Eventually we would like to reach a point where all backends support these approximately equally, but until then we have to work around some of these nuances of support being observable. Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which provides the following new configuration gates: * `cfg(target_has_reliable_f16)` * `cfg(target_has_reliable_f16_math)` * `cfg(target_has_reliable_f128)` * `cfg(target_has_reliable_f128_math)` `reliable_f16` and `reliable_f128` indicate that basic arithmetic for the type works correctly. The `_math` versions indicate that anything relying on `libm` works correctly, since sometimes this hits a separate class of codegen bugs. These options match configuration set by the build script at [1]. The logic for LLVM support is duplicated as-is from the same script. There are a few possible updates that will come as a follow up. The config introduced here is not planned to ever become stable, it is only intended to replace the build scripts for `std` tests and `compiler-builtins` that don't have any way to configure based on the codegen backend. MCP: https://github.com/rust-lang/compiler-team/issues/866 Closes: https://github.com/rust-lang/compiler-team/issues/866 [1]: `555e1d0386/library/std/build.rs (L84-L186)`	2025-04-27 19:58:44 +00:00
sayantn	163fb854a2	Add the `avx10.1` and `avx10.2` target features	2025-04-26 11:40:13 +05:30
Matthias Krüger	564e5ccb5c	Rollup merge of #140202 - est31:let_chains_feature_compiler, r=lcnr Make #![feature(let_chains)] bootstrap conditional in compiler/ Let chains have been stabilized recently in #132833, so we can remove the gating from our uses in the compiler (as the compiler uses edition 2024).	2025-04-25 07:50:25 +02:00
bit-aloo	9bc04016e6	add custom enzyme markers to target methods	2025-04-25 11:09:52 +05:30
bit-aloo	f319dd909e	add llvm wrappers and corresponding methods in attribute	2025-04-25 11:09:52 +05:30
Matthias Krüger	c3f811f02f	Rollup merge of #139700 - EnzymeAD:autodiff-flags, r=oli-obk Autodiff flags Interestingly, it seems that some other projects have conflicts with exactly the same LLVM optimization passes as autodiff. At least `LLVMRustOptimize` has exactly the flags that we need to disable problematic opt passes. This PR enables us to compile code where users differentiate two identical functions in the same module. This has been especially common in test cases, but it's not impossible to encounter in the wild. It also enables two new flags for testing/debugging. I am considering writing an MCP to upgrade PrintPasses to be a standalone -Z flag, since it is not the same as `-Z print-llvm-passes`, which IMHO gives less useful output. A discussion can be found here: [#t-compiler/llvm > Print llvm passes. @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/187780-t-compiler.2Fllvm/topic/Print.20llvm.20passes.2E/near/511533038) Finally, it improves `PrintModBefore` and `PrintModAfter`. They used to work reliably, but now we just schedule Enzyme as part of an existing ModulePassManager (MPM). Since Enzyme is last in the MPM scheduling, PrintModBefore became very inaccurate. It used to print the input module that we gave to Enzyme, which was great for creating llvm-ir reproducers. However, lately the MPM would run the whole `default<O3>` pipeline, which heavily modifies the llvm module, before we pass it to Enzyme. That made it impossible to use the flag to create llvm-ir reproducers for Enzyme bugs. We now schedule a PrintModule pass just before Enzyme, solving this problem. Based on the PrintPass output, it also _seems_ like changing `registerEnzymeAndPassPipeline(PB, true);` to `registerEnzymeAndPassPipeline(PB, false);` has no effect. In theory, the bool should tell Enzyme to schedule some helpful passes in the PassBuilder. 
However, since it doesn't do anything and I'm not 100% sure anymore on whether we really need it, I'll just disable it for now and postpone investigations. r? ``@oli-obk`` closes #139471 Tracking: - https://github.com/rust-lang/rust/issues/124509	2025-04-24 17:19:44 +02:00
Matthias Krüger	a8ebfb256a	Rollup merge of #139261 - RalfJung:msvc-align-mitigation, r=oli-obk mitigate MSVC alignment issue on x86-32 This implements a mitigation for https://github.com/rust-lang/rust/issues/112480 by no longer emitting `align` attributes on loads and function arguments when building for a win32 MSVC target. MSVC is known to not properly align `u64` and similar types, and claiming to LLVM that everything is properly aligned increases the chance that this will cause problems. Of course, the misalignment is still a bug, but we can't fix that bug, only MSVC can. Also add an errata note to the platform support page warning users about this known problem. try-job: `i686-msvc*`	2025-04-24 11:40:35 +02:00
est31	7493e1cdf6	Make #![feature(let_chains)] bootstrap conditional in compiler/	2025-04-23 16:40:30 +02:00
Chris Denton	d15c603173	Rollup merge of #137953 - RalfJung:simd-intrinsic-masks, r=WaffleLapkin simd intrinsics with mask: accept unsigned integer masks, and fix some of the errors It's not clear at all why the mask would have to be signed, it is anyway interpreted bitwise. The backend should just make sure that works no matter the surface-level type; our LLVM backend already does this correctly. The note of "the mask may be widened, which only has the correct behavior for signed integers" explains... nothing? Why can't the code do the widening correctly? If necessary, just cast to the signed type first... Also while we are at it, fix the errors. For simd_masked_load/store, the errors talked about the "third argument" but they meant the first argument (the mask is the first argument there). They also used the wrong type for `expected_element`. I have extremely low confidence in the GCC part of this PR. See [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/257879-project-portable-simd/topic/On.20the.20sign.20of.20masks)	2025-04-20 13:02:48 +00:00
Ralf Jung	566dfd1a0d	simd intrinsics with mask: accept unsigned integer masks	2025-04-20 12:25:27 +02:00
Matthias Krüger	68b439c63b	Rollup merge of #138599 - adwinwhite:recursive-overflow, r=wesleywiser avoid overflow when generating debuginfo for expanding recursive types Fixes #135093 Fixes #121538 Fixes #107362 Fixes #100618 Fixes #115994 The overflow happens because expanding recursive types keep creating new nested types when recurring into sub fields. I fixed that by returning an empty stub node when expanding recursion is detected.	2025-04-18 05:17:53 +02:00
Matthias Krüger	87a163523f	Rollup merge of #139351 - EnzymeAD:autodiff-batching2, r=oli-obk Autodiff batching2 ~I will rebase it once my first PR landed.~ done. This autodiff batch mode is more similar to scalar autodiff, since it still only takes one shadow argument. However, that argument is supposed to be `width` times larger. r? `@oli-obk` Tracking: - https://github.com/rust-lang/rust/issues/124509	2025-04-17 21:53:23 +02:00
Manuel Drehwald	a68ae0cbc1	working dupv and dupvonly for fwd mode	2025-04-16 17:13:31 -04:00
Vadim Petrochenkov	38f7060a73	Revert "Deduplicate template parameter creation" This reverts commit `6adc2c1fd6`.	2025-04-15 21:00:11 +03:00
Yotam Ofek	4b63362f3d	Use `newtype_index!`-generated types more idiomatically	2025-04-14 16:17:06 +00:00
bjorn3	421f22e8bf	Pass &mut self to codegen_global_asm	2025-04-14 09:38:04 +00:00
bjorn3	e2e96fa14e	Pass MonoItemData to MonoItem::define	2025-04-14 09:38:03 +00:00
Manuel Drehwald	5ea9125f37	update documentation	2025-04-12 01:36:47 -04:00
Manuel Drehwald	31578dc587	fix "could not find source function" error by preventing function merging before AD	2025-04-12 01:36:47 -04:00
Manuel Drehwald	75f86e6e2e	fix LooseTypes flag and PrintMod behaviour, add debug helper	2025-04-12 01:36:44 -04:00
Jacob Pratt	eea366c191	Rollup merge of #139664 - oli-obk:push-tkmurytmnsyw, r=RalfJung Reuse address-space computation from global alloc r? `@RalfJung` just avoiding some minor duplication	2025-04-11 21:21:02 +02:00
bors	e1b06f7730	Auto merge of #139453 - compiler-errors:incr, r=jieyouxu Prepend temp files with per-invocation random string to avoid temp filename conflicts https://github.com/rust-lang/rust/issues/139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management. Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name: ``` #[inline(never)] #[cfg(any(rpass1, rpass3))] fn a() -> i32 { 0 } #[cfg(any(cfail2))] fn a() -> i32 { 1 } fn main() { evil::evil(); assert_eq!(a(), 0); } mod evil { #[cfg(any(rpass1, rpass3))] pub fn evil() { unsafe { std::arch::asm!("/* /"); } } #[cfg(any(cfail2))] pub fn evil() { unsafe { std::arch::asm!("missing"); } } } ``` Session 1 (`rpass1`): Type-check, borrow-check, etc. * Serialize the dep graph to the incremental working directory `.../s-...-working/`. * Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd. * Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`. * Save-temps option means we don't delete `XXX.rcgu.o`. * Link the binary and stuff. * Finalize[^2] the working incremental session by renaming `.../s-...-working` to ` s-...-asjkdhsjakd` (some other finalized incr comp session dir name). Session 2 (`cfail2`): * Load artifacts from the previous finalized incremental session, namely the dep graph. * Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red. * Serialize the dep graph to the incremental working directory `.../s-...-working/`. * Codegen object file to a temp file `XXX.rcgu.o`. 
HERE IS THE PROBLEM: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory. * Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the previous session is the last finalized session. Session 3 (`rpass3`): * Load artifacts from the previous finalized incremental session, namely the dep graph. NOTE that this is from session 1. * All the dep graph nodes are green since we are basically replaying session 1. * codegen object file `XXX.o`, which is detected as reused from session 1 since dep nodes were green. That means we reuse `XXX.o` which had been dirtied from session 2. * Link the binary and stuff. This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1. At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst unsound. This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example). --- This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not deterministic, but temporary files are transient anyways, so I don't believe this is a problem. That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc. 
Fixes https://github.com/rust-lang/rust/issues/139407 [^1]: `175dcc7773/compiler/rustc_fs_util/src/lib.rs (L60)` [^2]: `175dcc7773/compiler/rustc_incremental/src/persist/fs.rs (L1-L40)`	2025-04-11 13:59:33 +00:00
Oli Scherer	cfa52e48ae	Reuse address-space computation from global alloc	2025-04-11 09:28:47 +00:00
Stuart Cook	45ebc4060b	Rollup merge of #137447 - folkertdev:simd-extract-insert-dyn, r=scottmcm add `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}` fixes https://github.com/rust-lang/rust/issues/137372 adds `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`, which contrary to their non-dyn counterparts allow a non-const index. Many platforms (but notably not x86_64 or aarch64) have dedicated instructions for this operation, which stdarch can emit with this change. Future work is to also make the `Index` operation on the `Simd` type emit this operation, but the intrinsic can't be used directly. We'll need some MIR shenanigans for that. r? `@ghost`	2025-04-11 13:31:43 +10:00
Folkert de Vries	59c55339af	add `simd_insert_dyn` and `simd_extract_dyn`	2025-04-10 21:22:07 +02:00
Ralf Jung	2678d04dd9	mitigate MSVC unsoundness by not emitting alignment attributes on win32-msvc targets also mention the MSVC alignment issue in platform-support.md	2025-04-07 23:30:55 +02:00
Michael Goulet	9c372d8940	Prepend temp files with a string per invocation of rustc	2025-04-07 20:48:40 +00:00
Michael Goulet	effef88ac7	Simplify temp path creation a bit	2025-04-07 20:48:40 +00:00
Stuart Cook	5863b426b9	Rollup merge of #139465 - EnzymeAD:autodiff-sret, r=oli-obk add sret handling for scalar autodiff r? `@oli-obk` Fixing one of the TODOs that I left in my previous batching PR. This one handles sret for scalar autodiff. `sret` mostly shows up when we try to return a lot of scalar floats. People often start testing autodiff with toy functions that just use a few scalars as inputs and outputs, and those were the most likely to be affected by this issue. So this fix should hopefully make learning and teaching a bit easier. Tracking: - https://github.com/rust-lang/rust/issues/124509	2025-04-07 22:29:21 +10:00
Stuart Cook	ddf099ff4e	Rollup merge of #139397 - Zalathar:virtual, r=jieyouxu coverage: Build the CGU's global file table as late as possible Embedding coverage metadata in the output binary is a delicate dance, because per-function records need to embed references to the per-CGU filename table, but we only want to include files in that table if they are successfully used by at least one function. The way that we build the file tables has changed a few times over the last few years. This particular change is motivated by experimental work on properly supporting macro-expansion regions, which adds some additional constraints that our previous implementation wasn't equipped to deal with. LLVM is very strict about not allowing unused entries in local file tables. Currently that's not much of an issue, because we assume one source file per function, but to support expansion regions we need the flexibility to avoid committing to the use of a file until we're completely sure that we are able and willing to produce at least one coverage mapping region for it. In particular, when preparing a function's covfun record, we need the flexibility to decide at a late stage that a particular file isn't needed/usable after all. (It's OK for the global file table to contain unused entries, but we would still prefer to avoid that if possible, and this implementation also achieves that.)	2025-04-07 22:29:20 +10:00
Manuel Drehwald	d6467d34ae	handle sret for scalar autodiff	2025-04-07 07:07:16 -04:00
Zalathar	4322b6e97d	coverage: Build the CGU's global file table as late as possible	2025-04-07 17:11:49 +10:00
bors	8fb32ab8e5	Auto merge of #139473 - Kobzol:rollup-ycksn9b, r=Kobzol Rollup of 5 pull requests Successful merges: - #138314 (fix usage of `autodiff` macro with inner functions) - #139426 (Make the UnifyKey and UnifyValue imports non-nightly) - #139431 (Remove LLVM 18 inline ASM span fallback) - #139456 (style guide: add let-chain rules) - #139467 (More trivial tweaks) r? `@ghost` `@rustbot` modify labels: rollup	2025-04-07 06:27:35 +00:00
Zalathar	b3c40cf374	coverage: Deal with unused functions and their names in one place	2025-04-06 13:55:28 +10:00
Zalathar	75135aaf19	coverage: Extract module `mapgen::unused` for handling unused functions	2025-04-06 13:55:27 +10:00
beetrees	3aac9a37a5	Remove LLVM 18 inline ASM span fallback	2025-04-06 02:31:52 +01:00
Josh Stone	12167d7064	Update the minimum external LLVM to 19	2025-04-05 11:44:38 -07:00
Matthias Krüger	543160dd62	Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).	2025-04-05 10:18:03 +02:00
Ramon de C Valle	a98546b961	KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).	2025-04-05 04:05:04 +00:00
Stuart Cook	c6bf3a01ef	Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk Autodiff batching Enzyme supports batching, which is especially known from the ML side when training neural networks. There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights. That's quite inefficient, so what you normally do is pass in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations. Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size, and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of ```rs for i in 0..100 { df(x[i], y[i], 1234); } ``` You can now do ```rs for i in (0..100).step_by(4) { df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234); } ``` which will give the same results, but allows better compiler optimizations. See the testcase for details. There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next few days. I will also add more tests for both modes. For anyone preferring a more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU I'll also add some other docs to the dev guide and user docs in another PR. r? ghost Tracking: - https://github.com/rust-lang/rust/issues/124509 - https://github.com/rust-lang/rust/issues/135283	2025-04-05 13:18:13 +11:00
Manuel Drehwald	89d8948835	add new flag to print the module post-AD, before opts	2025-04-04 14:25:23 -04:00
Manuel Drehwald	b7c63a973f	add autodiff batching backend	2025-04-04 14:24:23 -04:00
Matthias Krüger	66e61c78e7	Rollup merge of #138949 - madsmtm:rename-to-darwin, r=WaffleLapkin Rename `is_like_osx` to `is_like_darwin` Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.). ``@rustbot`` label O-apple r? compiler	2025-04-04 08:02:05 +02:00
Stuart Cook	5b0f658922	Rollup merge of #138003 - sayantn:new-amx, r=Amanieu Add the new `amx` target features and the `movrs` target feature Adds 5 new `amx` target features included in LLVM20. These are guarded under `x86_amx_intrinsics` (#126622) - `amx-avx512` - `amx-fp8` - `amx-movrs` - `amx-tf32` - `amx-transpose` Adds the `movrs` target feature (from #137976). `@rustbot` label O-x86_64 O-x86_32 T-compiler A-target-feature r? `@Amanieu`	2025-04-02 13:10:36 +11:00
bors	85f518ec8e	Auto merge of #138742 - taiki-e:riscv-vector, r=Amanieu rustc_target: Add more RISC-V vector-related features and use zvlb target features in vector ABI check Currently, we have only the unstable `v` target feature, but RISC-V has more vector-related extensions. The first commit of this PR adds them to unstable `riscv_target_feature`. - `unaligned-vector-mem`: Has reasonably performant unaligned vector - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L1379) - Similar to the currently unstable `unaligned-scalar-mem` target feature, but for vector instructions. - `zvfh`: Vector Extension for Half-Precision Floating-Point - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvfh-vector-extension-for-half-precision-floating-point) - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L668) - This implies `zvfhmin` and `zfhmin` - `zvfhmin`: Vector Extension for Minimal Half-Precision Floating-Point - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvfhmin-vector-extension-for-minimal-half-precision-floating-point) - [LLVM definition](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L662) - This implies `zve32f` - `zve32x`, `zve32f`, `zve64x`, `zve64f`, `zve64d`: Vector Extensions for Embedded Processors - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zve-vector-extensions-for-embedded-processors) - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L612-L641) - `zve32x` implies `zvl32b` - `zve32f` implies `zve32x` and `f` - `zve64x` implies `zve32x` and `zvl64b` - `zve64f` implies `zve32f` and `zve64x` - `zve64d` implies `zve64f` 
and `d` - `v` implies `zve64d` - `zvlb`: Minimum Vector Length Standard Extensions - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/v-st-ext.adoc#zvl-minimum-vector-length-standard-extensions) - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L600-L610) - `zvl{N}b` implies `zvl{N>>1}b` - `v` implies `zvl128b` - Vector Cryptography and Bit-manipulation Extensions - [ISA Manual](https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-2336fdc-2025-03-19/src/vector-crypto.adoc) - [LLVM definitions](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L679-L807) - `zvkb`: Vector Bit-manipulation used in Cryptography - This implies `zve32x` - `zvbb`: Vector basic bit-manipulation instructions - This implies `zvkb` - `zvbc`: Vector Carryless Multiplication - This implies `zve64x` - `zvkg`: Vector GCM instructions for Cryptography - This implies `zve32x` - `zvkned`: Vector AES Encryption & Decryption (Single Round) - This implies `zve32x` - `zvknha`: Vector SHA-2 (SHA-256 only) - This implies `zve32x` - `zvknhb`: Vector SHA-2 (SHA-256 and SHA-512) - This implies `zve64x` - This is a superset of `zvknha`, but doesn't imply that feature, at least in LLVM - `zvksed`: SM4 Block Cipher Instructions - This implies `zve32x` - `zvksh`: SM3 Hash Function Instructions - This implies `zve32x` - `zvkt`: Vector Data-Independent Execution Latency - Similar to already stabilized scalar cryptography extension `zkt`. - `zvkn`: Shorthand for 'Zvkned', 'Zvknhb', 'Zvkb', and 'Zvkt' - Similar to already stabilized scalar cryptography extension `zkn`. - `zvknc`: Shorthand for 'Zvkn' and 'Zvbc' - `zvkng`: shorthand for 'Zvkn' and 'Zvkg' - `zvks`: shorthand for 'Zvksed', 'Zvksh', 'Zvkb', and 'Zvkt' - Similar to already stabilized scalar cryptography extension `zks`. 
- `zvksc`: shorthand for 'Zvks' and 'Zvbc' - `zvksg`: shorthand for 'Zvks' and 'Zvkg' Also, our vector ABI check wants `zvl*b` target features, the second commit of this PR updates vector ABI check to use them. `4e2b096ed6/compiler/rustc_target/src/target_features.rs (L707-L708)` --- r? `@Amanieu` `@rustbot` label +O-riscv +A-target-feature	2025-03-30 02:21:56 +00:00
bors	2a06022951	Auto merge of #138503 - bjorn3:string_merging, r=tmiasko Avoid wrapping constant allocations in packed structs when not necessary This way LLVM will set the string merging flag if the alloc is a nul terminated string, reducing binary sizes. try-job: armhf-gnu	2025-03-28 10:18:32 +00:00
bjorn3	5c82a59bd3	Add test and comment	2025-03-28 09:19:57 +00:00
bjorn3	a5fa12b6b9	Avoid wrapping constant allocations in packed structs when not necessary This way LLVM will set the string merging flag if the alloc is a nul terminated string, reducing binary sizes.	2025-03-28 09:19:57 +00:00
bors	65899c06f1	Auto merge of #138893 - klensy:thorin-0.9, r=Mark-Simulacrum bump thorin to 0.9 to drop duped deps Bumps `thorin`, removing duped deps. This also changes features for hashbrown: ``` hashbrown v0.15.2 `-- indexmap v2.7.0 \|-- object v0.36.7 \|-- wasmparser v0.219.1 \|-- wasmparser v0.223.0 `-- wit-component v0.223.0 \|-- indexmap feature "default" \|-- indexmap feature "serde" `-- indexmap feature "std" \|-- hashbrown feature "default-hasher" \| \|-- object v0.36.7 () \| `-- wasmparser v0.223.0 () \|-- hashbrown feature "nightly" \| \|-- rustc_data_structures v0.0.0 \| `-- rustc_query_system v0.0.0 `-- hashbrown feature "serde" `-- wasmparser feature "serde" ``` to ``` hashbrown v0.15.2 `-- indexmap v2.7.0 \|-- object v0.36.7 \|-- wasmparser v0.219.1 \|-- wasmparser v0.223.0 `-- wit-component v0.223.0 \|-- indexmap feature "default" \|-- indexmap feature "serde" `-- indexmap feature "std" \|-- hashbrown feature "allocator-api2" \| `-- hashbrown feature "default" \|-- hashbrown feature "default" () \|-- hashbrown feature "default-hasher" \| \|-- object v0.36.7 () \| `-- wasmparser v0.223.0 () \| `-- hashbrown feature "default" () \|-- hashbrown feature "equivalent" \| `-- hashbrown feature "default" () \|-- hashbrown feature "inline-more" \| `-- hashbrown feature "default" () \|-- hashbrown feature "nightly" \| \|-- rustc_data_structures v0.0.0 \| `-- rustc_query_system v0.0.0 \|-- hashbrown feature "raw-entry" \| `-- hashbrown feature "default" (*) `-- hashbrown feature "serde" `-- wasmparser feature "serde" ``` To be safe, as this can be perf-sensitive: `@bors` rollup=never	2025-03-26 07:54:26 +00:00
Mads Marquart	328846c6eb	Rename `is_like_osx` to `is_like_darwin`	2025-03-25 21:53:52 +01:00
Matthias Krüger	b66e9320c5	Rollup merge of #137247 - dpaoliello:cleanllvm, r=Zalathar cg_llvm: Reduce the visibility of types, modules and using declarations in `rustc_codegen_llvm`. Final part of #135502 Reduces the visibility of types, modules and using declarations in the `rustc_codegen_llvm` to private or `pub(crate)` where possible, and marks unused fields and enum entries with `#[expect(dead_code)]`. r? Zalathar	2025-03-25 18:09:03 +01:00
Daniel Paoliello	79b9664091	Reduce visibility of most items in `rustc_codegen_llvm`	2025-03-25 16:36:47 +11:00
bors	1df5affaca	Auto merge of #133984 - DaniPopes:scmp-ucmp, r=scottmcm Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics. These are the intrinsics mentioned in https://github.com/rust-lang/rust/pull/118310, which are now available in LLVM 19. I couldn't find any follow-up PRs/discussions about this, please let me know if I missed something. r? `@scottmcm`	2025-03-24 22:53:12 +00:00
klensy	724a5a430b	bump thorin to drop duped deps	2025-03-24 19:38:16 +03:00
Matthias Krüger	0c594da55f	Rollup merge of #138627 - EnzymeAD:autodiff-cleanups, r=oli-obk Autodiff cleanups Splitting out some cleanups to reduce the size of my batching PR and simplify ``@haenoe`` 's [PR](https://github.com/rust-lang/rust/pull/138314). r? ``@oli-obk`` Tracking: - https://github.com/rust-lang/rust/issues/124509	2025-03-21 15:48:55 +01:00
Taiki Endo	55add8fce3	rustc_target: Add more RISC-V vector-related features	2025-03-20 19:47:57 +09:00
Zalathar	2e36990881	coverage: Convert and check span coordinates without a local file ID For expansion region support, we will want to be able to convert and check spans before creating a corresponding local file ID. If we create local file IDs eagerly, but some expansion turns out to have no successfully-converted spans, LLVM will complain about that expansion's file ID having no regions.	2025-03-20 13:29:32 +11:00
Zalathar	d07ef5b0e1	coverage: Add LLVM plumbing for expansion regions This is currently unused, but paves the way for future work on expansion regions without having to worry about the FFI parts.	2025-03-20 12:40:36 +11:00
Matthias Krüger	5661e98058	Rollup merge of #138674 - oli-obk:llvm-cleanups, r=compiler-errors Various codegen_llvm cleanups Mostly just adding safe wrappers and deduplicating code	2025-03-19 08:17:19 +01:00
Oli Scherer	f4b0984854	Create a safe wrapper around `LLVMRustDIBuilderCreateMemberType`	2025-03-18 17:15:02 +00:00
Oli Scherer	1f34b19596	Avoid splitting up a layout	2025-03-18 17:01:09 +00:00
Zalathar	cc8336b6c1	coverage: Don't store a body span in `FunctionCoverageInfo`	2025-03-18 23:18:24 +11:00
Zalathar	cd2b978433	coverage: Don't refer to the body span when enlarging empty spans Given that we now only enlarge empty spans to "{" or "}", there shouldn't be any danger of enlarging beyond a function body.	2025-03-18 23:18:23 +11:00
Manuel Drehwald	47c07ed963	[NFC] simplify matching	2025-03-17 19:13:09 -04:00
Manuel Drehwald	f4c297802f	[NFC] extract autodiff call lowering in cg_llvm into own function	2025-03-17 18:58:51 -04:00
bors	493c38ba37	Auto merge of #127173 - bjorn3:mangle_rustc_std_internal_symbol, r=wesleywiser,jieyouxu Mangle rustc_std_internal_symbols functions This reduces the risk of issues when using a staticlib or rust dylib compiled with a different rustc version in a rust program. Currently this will either (in the case of staticlib) cause a linker error due to duplicate symbol definitions, or (in the case of rust dylibs) cause rustc_std_internal_symbols functions to be silently overridden. As rust gets more commonly used inside the implementation of libraries consumed with a C interface (like Spidermonkey, Ruby YJIT (currently has to do partial linking of all rust code to hide all symbols not part of the C api), the Rusticl OpenCL implementation in mesa) this is becoming much more of an issue. With this PR the only symbols remaining with an unmangled name are rust_eh_personality (LLVM doesn't allow renaming it) and `__rust_no_alloc_shim_is_unstable`. Helps mitigate https://github.com/rust-lang/rust/issues/104707 try-job: aarch64-gnu-debug try-job: aarch64-apple try-job: x86_64-apple-1 try-job: x86_64-mingw-1 try-job: i686-mingw-1 try-job: x86_64-msvc-1 try-job: i686-msvc-1 try-job: test-various try-job: armhf-gnu	2025-03-17 22:16:22 +00:00
Oli Scherer	018032c682	Create a safe wrapper around `LLVMRustDIBuilderCreateBasicType`	2025-03-17 16:58:44 +00:00
Oli Scherer	cc41dd4fa1	Create a safe wrapper function around `LLVMRustDIBuilderCreateFile`	2025-03-17 16:58:21 +00:00
Oli Scherer	e19e4e3a4b	Create a safe wrapper around `LLVMRustDIBuilderCreateSubroutineType`	2025-03-17 16:39:52 +00:00
Oli Scherer	6adc2c1fd6	Deduplicate template parameter creation	2025-03-17 16:32:21 +00:00
Oli Scherer	b4acf7a51e	Immediately create an `Option` instead of reallocating for it later	2025-03-17 16:17:48 +00:00
Oli Scherer	eef70a9db5	Create a safe wrapper around LLVMRustDIBuilderCreateTemplateTypeParameter	2025-03-17 15:56:48 +00:00
Matthias Krüger	8f5c09b37c	Rollup merge of #138349 - 1c3t3a:external-weak-cfi, r=rcvalle Emit function declarations for functions with `#[linkage="extern_weak"]` Currently, when declaring an extern weak function in Rust, we use the following syntax: ```rust unsafe extern "C" { #[linkage = "extern_weak"] static FOO: Option<unsafe extern "C" fn() -> ()>; } ``` This allows runtime-checking the extern weak symbol through the Option. When emitting LLVM-IR, the Rust compiler currently emits this static as an i8, and a pointer that is initialized with the value of the global i8 and represents the nullability e.g. ``` @FOO = extern_weak global i8 @_rust_extern_with_linkage_FOO = internal global ptr @FOO ``` This approach does not work well with CFI, where we need to attach CFI metadata to a concrete function declaration, which was pointed out in https://github.com/rust-lang/rust/issues/115199. This change switches to emitting a proper function declaration instead of a global i8. This allows CFI to work for extern_weak functions. Example: ``` @_rust_extern_with_linkage_FOO = internal global ptr @FOO ... declare !type !61 !type !62 !type !63 !type !64 extern_weak void @FOO(double) unnamed_addr #6 ``` We keep initializing the Rust internal symbol with the function declaration, which preserves the correct behavior for runtime checking the Option. r? `@rcvalle` cc `@jakos-sec` try-job: test-various	2025-03-17 16:34:50 +01:00
bjorn3	b754ef727c	Remove implicit #[no_mangle] for #[rustc_std_internal_symbol]	2025-03-17 14:08:09 +00:00
Adwin White	8e235258f3	fix(debuginfo): avoid overflow when handling expanding recursive type	2025-03-17 18:33:40 +08:00
Bastian Kersting	b30cf11b96	Emit function declarations for functions with #[linkage="extern_weak"] Currently, when declaring an extern weak function in Rust, we use the following syntax: ```rust unsafe extern "C" { #[linkage = "extern_weak"] static FOO: Option<unsafe extern "C" fn() -> ()>; } ``` This allows runtime-checking the extern weak symbol through the Option. When emitting LLVM-IR, the Rust compiler currently emits this static as an i8, and a pointer that is initialized with the value of the global i8 and represents the nullability e.g. ``` @FOO = extern_weak global i8 @_rust_extern_with_linkage_FOO = internal global ptr @FOO ``` This approach does not work well with CFI, where we need to attach CFI metadata to a concrete function declaration, which was pointed out in https://github.com/rust-lang/rust/issues/115199. This change switches to emitting a proper function declaration instead of a global i8. This allows CFI to work for extern_weak functions. We keep initializing the Rust internal symbol with the function declaration, which preserves the correct behavior for runtime checking the Option. Co-authored-by: Jakob Koschel <jakobkoschel@google.com>	2025-03-17 08:27:53 +00:00
bors	227690a258	Auto merge of #137011 - LuuuXXX:promote-ohos-with-host-tools, r=Amanieu Promote ohos targets to tier2 with host tools. ### What does this PR try to resolve? Try to promote the following [Tier 2 without Host Tools](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-without-host-tools) targets to [Tier 2 with Host Tools](https://doc.rust-lang.org/rustc/platform-support.html#tier-2-with-host-tools): - `aarch64-unknown-linux-ohos` - `armv7-unknown-linux-ohos` - `x86_64-unknown-linux-ohos` ### More Information? see MCP: https://github.com/rust-lang/compiler-team/issues/811 ### Blockage to be solved? - [x] Submit an MCP - [x] Submit code of promote ohos targets - [x] Resolve related dependencies （`measureme`） The modified code of the measureme has been merged （see https://github.com/rust-lang/measureme/pull/238）. [done] The new version was released (https://github.com/rust-lang/measureme/pull/240). [done]	2025-03-16 18:42:18 +00:00
Matthias Krüger	d93ef397ce	Rollup merge of #138331 - nnethercote:use-RUSTC_LINT_FLAGS-more, r=onur-ozkan,jieyouxu Use `RUSTC_LINT_FLAGS` more An alternative to the failed #138084. Fixes #138106. r? ````@jieyouxu````	2025-03-12 17:59:08 +01:00
bors	ebf0cf75d3	Auto merge of #137586 - nnethercote:SetImpliedBits, r=bjorn3 Speed up target feature computation The LLVM backend calls `LLVMRustHasFeature` twice for every feature. In short-running rustc invocations, this accounts for a surprising amount of work. r? `@bjorn3`	2025-03-11 12:05:16 +00:00
Nicholas Nethercote	ff0a5fe975	Remove `#![warn(unreachable_pub)]` from all `compiler/` crates. It's no longer necessary now that `-Wunreachable_pub` is being passed.	2025-03-11 13:14:21 +11:00
许杰友 Jieyou Xu (Joe)	063ef18fdc	Revert "Use workspace lints for crates in `compiler/` #138084 " Revert <https://github.com/rust-lang/rust/pull/138084> to buy time to consider options that avoids breaking downstream usages of cargo on distributed `rustc-src` artifacts, where such cargo invocations fail due to inability to inherit `lints` from workspace root manifest's `workspace.lints` (this is only valid for the source rust-lang/rust workspace, but not really the distributed `rustc-src` artifacts). This breakage was reported in <https://github.com/rust-lang/rust/issues/138304>. This reverts commit `48caf81484`, reversing changes made to `c6662879b2`.	2025-03-10 18:12:47 +08:00
Matthias Krüger	827bb5e27b	Rollup merge of #122790 - Zoxc:dllimp-rev, r=ChrisDenton Apply dllimport in ThinLTO This partially reverts https://github.com/rust-lang/rust/pull/103353 by properly applying `dllimport` if `-Z dylib-lto` is passed. That PR should probably fully be reverted as it looks quite sketchy. We don't know locally if the entire crate graph would be statically linked. This should hopefully be sufficient to make ThinLTO work for rustc on Windows. r? ``@wesleywiser`` --- Edit: This PR is changed to just generally revert https://github.com/rust-lang/rust/pull/103353.	2025-03-09 16:41:48 +01:00
Matthias Krüger	48caf81484	Rollup merge of #138084 - nnethercote:workspace-lints, r=jieyouxu Use workspace lints for crates in `compiler/` This is nicer and hopefully less error prone than specifying lints via bootstrap. r? ``@jieyouxu``	2025-03-09 10:34:50 +01:00
Nicholas Nethercote	8a3e03392e	Remove `#![warn(unreachable_pub)]` from all `compiler/` crates. (Except for `rustc_codegen_cranelift`.) It's no longer necessary now that `unreachable_pub` is in the workspace lints.	2025-03-08 08:41:43 +11:00
Nicholas Nethercote	beba32cebb	Specify rust lints for `compiler/` crates via Cargo. By naming them in `[workspace.lints.rust]` in the top-level `Cargo.toml`, and then making all `compiler/` crates inherit them with `[lints] workspace = true`. (I omitted `rustc_codegen_{cranelift,gcc}`, because they're a bit different.) The advantages of this over the current approach: - It uses a standard Cargo feature, rather than special handling in bootstrap. So, easier to understand, and less likely to get accidentally broken in the future. - It works for proc macro crates. It's a shame it doesn't work for rustc-specific lints, as the comments explain.	2025-03-08 08:41:09 +11:00
Matthias Krüger	63c548d82c	Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwco Clean up various LLVM FFI things in codegen_llvm cc ```@ZuseZ4``` I touched some autodiff parts The major change of this PR is [`bfd88ce`](`bfd88cead0`) which makes `CodegenCx` generic just like `GenericBuilder` The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks. best reviewed commit-by-commit	2025-03-07 19:15:34 +01:00
DaniPopes	58c10c66c1	Lower BinOp::Cmp to llvm.{s,u}cmp.* intrinsics Lowers `mir::BinOp::Cmp` (`three_way_compare` intrinsic) to the corresponding LLVM `llvm.{s,u}cmp.i8.*` intrinsics, added in LLVM 19.	2025-03-06 22:29:05 +08:00
sayantn	7c2434c52c	Add the `movrs` target feature and `movrs_target_feature` feature gate	2025-03-05 05:34:37 +05:30
sayantn	0ec1d460bb	Add the new `amx` target features	2025-03-05 05:34:37 +05:30
Nicholas Nethercote	cee3114544	Remove out of date comment. No smallvecs here.	2025-03-05 09:52:28 +11:00
Nicholas Nethercote	35b7994ea8	Use `collect` to initialize `features`.	2025-03-05 09:52:26 +11:00
Nicholas Nethercote	936a8232df	Change signature of `target_features_cfg`. Currently it is called twice, once with `allow_unstable` set to true and once with it set to false. This results in some duplicated work. Most notably, for the LLVM backend, `LLVMRustHasFeature` is called twice for every feature, and it's moderately slow. For very short running compilations on platforms with many features (e.g. a `check` build of hello-world on x86) this is a significant fraction of runtime. This commit changes `target_features_cfg` so it is only called once, and it now returns a pair of feature sets. This halves the number of `LLVMRustHasFeature` calls.	2025-03-05 09:49:17 +11:00
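The single-pass idea in this commit can be sketched as follows. This is a hedged illustration under stated assumptions, not the compiler's code: `Feature`, `target_features_once`, and the `backend_has` callback are hypothetical names, with the callback standing in for the comparatively slow backend query.

```rust
use std::collections::BTreeSet;

/// Hypothetical description of one target feature.
pub struct Feature {
    pub name: &'static str,
    pub stable: bool,
}

/// Query the backend once per feature and build both sets in the same pass.
/// Returns `(stable_only, including_unstable)`.
pub fn target_features_once(
    known: &[Feature],
    mut backend_has: impl FnMut(&str) -> bool,
) -> (BTreeSet<&'static str>, BTreeSet<&'static str>) {
    let mut stable_only = BTreeSet::new();
    let mut all = BTreeSet::new();
    for f in known {
        // One backend query per feature, regardless of how many sets it feeds.
        if backend_has(f.name) {
            all.insert(f.name);
            if f.stable {
                stable_only.insert(f.name);
            }
        }
    }
    (stable_only, all)
}
```

Calling such a function twice with a boolean flag would issue two queries per feature; returning the pair from one call halves that.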
Nicholas Nethercote	2df8e657f2	Simplify `implied_target_features`. Currently its argument is an iterator, but in practice it's always a singleton.	2025-03-05 09:20:28 +11:00
Nicholas Nethercote	1df93fd6a7	Avoid double interning of feature names. Also improve some comments.	2025-03-05 09:20:27 +11:00
LuuuXXX	7279acf202	use measureme-12.0.1	2025-03-04 17:13:46 +08:00
LuuuXXX	6324b39873	promote ohos targets to tier 2 with host tools	2025-03-04 17:13:46 +08:00
bors	fd17deacce	Auto merge of #137959 - matthiaskrgr:rollup-62vjvwr, r=matthiaskrgr Rollup of 12 pull requests Successful merges: - #135767 (Future incompatibility warning `unsupported_fn_ptr_calling_conventions`: Also warn in dependencies) - #137852 (Remove layouting dead code for non-array SIMD types.) - #137863 (Fix pretty printing of unsafe binders) - #137882 (do not build additional stage on compiler paths) - #137894 (Revert "store ScalarPair via memset when one side is undef and the other side can be memset") - #137902 (Make `ast::TokenKind` more like `lexer::TokenKind`) - #137921 (Subtree update of `rust-analyzer`) - #137922 (A few cleanups after the removal of `cfg(not(parallel))`) - #137939 (fix order on shl impl) - #137946 (Fix docker run-local docs) - #137955 (Always allow rustdoc-json tests to contain long lines) - #137958 (triagebot.toml: Don't label `test/rustdoc-json` as A-rustdoc-search) r? `@ghost` `@rustbot` modify labels: rollup	2025-03-04 02:27:56 +00:00
Matthias Krüger	70b9968d1e	Rollup merge of #137894 - compiler-errors:no-scalar-pair-opt, r=oli-obk Revert "store ScalarPair via memset when one side is undef and the other side can be memset" cc #137892 reverts #135335 r? oli-obk	2025-03-03 20:47:12 +01:00
John Kåre Alsaker	cc39e5f266	Apply dllimport in ThinLTO	2025-03-03 13:44:53 +01:00
Matthias Krüger	fd4bf82264	Rollup merge of #137741 - cuviper:const_str-raw_entry, r=Mark-Simulacrum Stop using `hash_raw_entry` in `CodegenCx::const_str` That unstable feature (#56167) completed fcp-close, so the compiler needs to be migrated away to allow its removal. In this case, `cg_llvm` and `cg_gcc` were using raw entries to optimize their `const_str_cache` lookup and insertion. We can change that to separate `get` and (on miss) `insert` calls, so we still have the fast path avoiding string allocation when the cache hits.	2025-03-03 10:41:00 +01:00
Michael Goulet	a59a8f9e75	Revert "Auto merge of #135335 - oli-obk:push-zxwssomxxtnq, r=saethlin" This reverts commit `a7a6c64a65`, reversing changes made to `ebbe63891f`.	2025-03-02 18:52:48 +00:00
Matthias Krüger	3bf976542a	Rollup merge of #137804 - RalfJung:backend-repr-simd-vector, r=workingjubilee rename BackendRepr::Vector → SimdVector For many Rustaceans, "vector" does not imply "SIMD", so let's be more clear in this type that is used pervasively in the compiler. r? `@workingjubilee`	2025-03-01 16:03:10 +01:00
bors	0c72c0d11a	Auto merge of #133250 - DianQK:embed-bitcode-pgo, r=nikic The embedded bitcode should always be prepared for LTO/ThinLTO Fixes #115344. Fixes #117220. There are currently two methods for generating bitcode that used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`. When using with `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, then the bitcode is used for LTO. We run the Call Graph Profile Pass twice on the same module. This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`. r? nikic	2025-03-01 08:22:18 +00:00
bors	30508faeb3	Auto merge of #137796 - jieyouxu:rollup-qt9yr1g, r=jieyouxu Rollup of 10 pull requests Successful merges: - #134943 (Add FileCheck annotations to mir-opt/issues) - #137017 (Don't error when adding a staticlib with bitcode files compiled by newer LLVM) - #137197 (Update some comparison codegen tests now that they pass in LLVM20) - #137540 (Fix (more) test directives that were accidentally ignored) - #137551 (import `simd_` intrinsics) - #137599 (tests: use minicore more) - #137673 (Fix Windows `Command` search path bug) - #137676 (linker: Fix escaping style for response files on Windows) - #137693 (Re-enable `--generate-link-to-definition` for tools internal rustdoc) - #137770 (Fix sized constraint for unsafe binder) r? `@ghost` `@rustbot` modify labels: rollup	2025-03-01 00:53:19 +00:00
Ralf Jung	aac65f562b	rename BackendRepr::Vector → SimdVector	2025-02-28 17:17:45 +01:00
许杰友 Jieyou Xu (Joe)	61e90040db	Rollup merge of #137017 - bjorn3:ignore_invalid_bitcode, r=oli-obk Don't error when adding a staticlib with bitcode files compiled by newer LLVM cc https://github.com/rust-lang/rust/issues/128955#issuecomment-2657811196	2025-02-28 22:29:49 +08:00
许杰友 Jieyou Xu (Joe)	d65f568302	Rollup merge of #137713 - vayunbiyani:fix-enzyme-build-errors, r=oli-obk Fix enzyme build errors After [this PR](https://github.com/rust-lang/rust/pull/136428) was merged, I switched to master and attempted building `./x.py build --stage 1 library` with the config mentioned in the enzyme rustbook but it resulted in some errors tho the config.example.toml build succeeded The errors were re: ### 1. Use of ref in match patterns The errors were related to match ergonomics in Rust 2024, where ref is no longer needed when matching on references. Examples: ``` error: binding modifiers may only be written when the default binding mode is `move` --> compiler/rustc_builtin_macros/src/autodiff.rs:136:31 \| 136 \| Annotatable::Item(ref iitem) => { \| ^^^ binding modifier not allowed under `ref` default binding mode \| = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html> note: matching on a reference type with a non-reference pattern changes the default binding mode --> compiler/rustc_builtin_macros/src/autodiff.rs:136:13 \| 136 \| Annotatable::Item(ref iitem) => { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_` help: remove the unnecessary binding modifier \| 136 - Annotatable::Item(ref iitem) => { 136 + Annotatable::Item(iitem) => { \| error: binding modifiers may only be written when the default binding mode is `move` --> compiler/rustc_builtin_macros/src/autodiff.rs:146:36 \| 146 \| Annotatable::AssocItem(ref assoc_item, _) => { \| ^^^ binding modifier not allowed under `ref` default binding mode \| = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html> note: matching on a reference type with a non-reference pattern changes the default binding mode --> compiler/rustc_builtin_macros/src/autodiff.rs:146:13 \| 146 \| Annotatable::AssocItem(ref assoc_item, _) => { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this 
matches on type `&_` help: remove the unnecessary binding modifier \| 146 - Annotatable::AssocItem(ref assoc_item, _) => { 146 + Annotatable::AssocItem(assoc_item, _) => { \| error: binding modifiers may only be written when the default binding mode is `move` --> compiler/rustc_builtin_macros/src/autodiff.rs:174:31 \| 174 \| ... Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ide... \| ^^^ binding modifier not allowed under `ref` default binding mode \| = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html> note: matching on a reference type with a non-reference pattern changes the default binding mode --> compiler/rustc_builtin_macros/src/autodiff.rs:174:13 \| 174 \| ... Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ident.c... \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_` help: remove the unnecessary binding modifier \| 174 - Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ident.clone()), 174 + Annotatable::Item(iitem) => (iitem.vis.clone(), iitem.ident.clone()), \| error: binding modifiers may only be written when the default binding mode is `move` --> compiler/rustc_builtin_macros/src/autodiff.rs:175:36 \| 175 \| Annotatable::AssocItem(ref assoc_item, _) => { \| ^^^ binding modifier not allowed under `ref` default binding mode \| = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html> note: matching on a reference type with a non-reference pattern changes the default binding mode --> compiler/rustc_builtin_macros/src/autodiff.rs:175:13 \| 175 \| Annotatable::AssocItem(ref assoc_item, _) => { \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_` help: remove the unnecessary binding modifier \| 175 - Annotatable::AssocItem(ref assoc_item, _) => { 175 + Annotatable::AssocItem(assoc_item, _) => { \| error: could not compile `rustc_builtin_macros` (lib) due to 4 previous errors warning: 
build failed, waiting for other jobs to finish... Build completed unsuccessfully in 0:19:39 ``` ### 2. the use of external C blocks without unsafe in compiler/rustc_codegen_llvm/src/llvm/enzyme_ffi.rs (I don't have the error message handy) The first commit fixes the errors above --- ## Additional Improvement: `@ZuseZ4` suggested we consolidate the variants under `#[cfg(llvm_enzyme)]` and `#[cfg(not(llvm_enzyme))]` by conditionally checking for `cfg!(llvm_enzyme)` instead. This way, the autodiff code is compiled but not executed avoiding such regressions r? `@ZuseZ4` cc: `@oli-obk`	2025-02-28 21:42:01 +08:00
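The first class of error quoted above comes from writing `ref` in a pattern that already binds by reference. A minimal sketch of the accepted form follows; the `Annotatable` enum here is a simplified stand-in with made-up payload types, not the compiler's definition.

```rust
/// Simplified stand-in for the compiler's `Annotatable` (payloads are hypothetical).
pub enum Annotatable {
    Item(String),
    AssocItem(String, u32),
}

/// Matching on `&Annotatable` with non-reference patterns switches the default
/// binding mode to by-reference, so `iitem` and `assoc_item` are `&String`
/// without any `ref` keyword. Edition 2024 rejects an explicit `ref` here.
pub fn name_of(a: &Annotatable) -> &str {
    match a {
        Annotatable::Item(iitem) => iitem.as_str(),
        Annotatable::AssocItem(assoc_item, _) => assoc_item.as_str(),
    }
}
```

This form compiles in earlier editions as well, so dropping the modifier is a safe fix regardless of edition.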
Josh Stone	396c2a8659	Stop using `hash_raw_entry` in `CodegenCx::const_str` That unstable feature completed fcp-close, so the compiler needs to be migrated away to allow its removal. In this case, `cg_llvm` and `cg_gcc` were using raw entries to optimize their `const_str_cache` lookup and insertion. We can change that to separate `get` and (on miss) `insert` calls, so we still have the fast path avoiding string allocation when the cache hits.	2025-02-27 09:09:52 -08:00
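The `get` then `insert` pattern described in this commit can be sketched with a plain `HashMap`. This is an illustration under assumptions, not the `cg_llvm` code: `StrCache` and its `usize` ids are hypothetical stand-ins for the real cache and its backend values.

```rust
use std::collections::HashMap;

/// Hypothetical string cache that hands out a stable id per distinct string.
#[derive(Default)]
pub struct StrCache {
    map: HashMap<String, usize>,
}

impl StrCache {
    pub fn new() -> Self {
        Self::default()
    }

    /// Number of distinct strings stored.
    pub fn len(&self) -> usize {
        self.map.len()
    }

    /// Fast path: look up by `&str`, which needs no allocation on a hit.
    /// Slow path: allocate the owned key only when the lookup misses.
    pub fn intern(&mut self, s: &str) -> usize {
        if let Some(&id) = self.map.get(s) {
            return id;
        }
        let id = self.map.len();
        self.map.insert(s.to_owned(), id);
        id
    }
}
```

Compared with a raw-entry lookup, a miss hashes the key twice, but the hit path still avoids allocating a `String`.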
bjorn3	9f190d764f	Restore usage of io::Error	2025-02-26 13:45:35 +00:00
León Orell Valerian Liehr	a579a23a73	Rollup merge of #137603 - davidtwco:extern-types-no-deref, r=lcnr codegen_llvm: avoid `Deref` impls w/ extern type `rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was or contained an extern type - in my experimental implementation of rust-lang/rfcs#3729, this isn't possible as the `Target` associated type's `?Sized` bound cannot be relaxed backwards compatibly (unless we come up with some way of doing this). In later pull requests with the rust-lang/rfcs#3729 implementation, breakage like this could only occur for nightly users relying on the `extern_types` feature. Upstreaming this to avoid needing to keep carrying this patch locally, and I think it'll necessarily need to change eventually.	2025-02-26 04:15:06 +01:00
León Orell Valerian Liehr	1511ccd6f8	Rollup merge of #137595 - folkertdev:remove-simd-pow-powi, r=RalfJung remove `simd_fpow` and `simd_fpowi` Discussed in https://github.com/rust-lang/rust/issues/137555 These functions are not exposed from `std::intrinsics::simd`, and not used anywhere outside of the compiler. They also don't lower to particularly good code at least on the major ISAs (I checked x86_64, aarch64, s390x, powerpc), where the vector is just spilled to the stack and scalar functions are used for the actual logic. r? `@RalfJung`	2025-02-25 13:07:40 +01:00
Vayun Biyani	cb53e97870	Fix enzyme build errors	2025-02-25 17:25:50 +05:30
Folkert de Vries	60a268998c	remove `simd_fpow` and `simd_fpowi`	2025-02-25 09:20:10 +01:00
Michael Goulet	6c1f959288	Rollup merge of #137556 - RalfJung:simd_shuffle_const_generic, r=oli-obk rename simd_shuffle_generic → simd_shuffle_const_generic I've been confused by this name one time too often. ;) r? `@oli-obk`	2025-02-24 19:21:51 -05:00
Michael Goulet	828a3a41b3	Rollup merge of #137417 - taiki-e:riscv-atomic, r=Amanieu rustc_target: Add more RISC-V atomic-related features This is a continuation of https://github.com/rust-lang/rust/pull/130877 and adds a few target features, including `zacas`, which was experimental in LLVM 19 and marked non-experimental in LLVM 20. This adds the following target features to unstable riscv_target_feature: - `za64rs` (Za64rs Extension 1.0): Reservation Set Size of at Most 64 Bytes ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L227-L228), [available since LLVM 18](`8649328060`)) - `za128rs` (Za128rs Extension 1.0): Reservation Set Size of at Most 128 Bytes ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L230-L231), [available since LLVM 18](`8649328060`)) - IIUC, `za*rs` can be referenced when implementing helpers to reduce contention in synchronization primitives, like [`crossbeam_utils::CachePadded`](https://docs.rs/crossbeam-utils/latest/crossbeam_utils/struct.CachePadded.html). (relevant discussion: https://github.com/riscv/riscv-profiles/issues/79) - `zacas` (Zacas Extension 1.0): Atomic Compare-And-Swap Instructions (`amocas.{w,d,q}{,.aq,.rl,.aqrl}` and `amocas.{b,h}{,.aq,.rl,.aqrl}` when `zabha` is also enabled) ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L240-L243), [available as non-experimental since LLVM 20](`614aeda93b`)) - This implies `zaamo`. - This is used to optimize CAS in existing atomics and/or implement 64-bit/128-bit atomics on riscv32/riscv64 (e.g., https://github.com/taiki-e/portable-atomic/pull/173). 
- Note that [LLVM does not automatically use this instruction for 64-bit/128-bit atomics on riscv32/riscv64 even if this feature is enabled, because doing it changes the ABI](`876174ffd7/llvm/docs/RISCVUsage.rst (riscv-zacas-note)`). (If the ability to do that is provided by LLVM in the future, it should probably be controlled by another ABI feature similar to `forced-atomics`.) - `zama16b` (Zama16b Extension 1.0): Atomic 16-byte misaligned loads, stores and AMOs ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L255-L256), [available since LLVM 19](`b090569685`)) - IIUC, unlike AArch64 FEAT_LSE2 which also makes 16-byte aligned ldp ({i,u}128 load) atomic, this extension only affects instructions that already considered atomic if they were naturally aligned. i.e., fld (f64 load) on riscv32 would not be atomic with or without this extension ([relevant QEMU code](`b69801dd6b/target/riscv/insn_trans/trans_rvd.c.inc (L50-L62)`)). - `zawrs` (Zawrs Extension 1.0): Wait on Reservation Set (`wrs.nto` and `wrs.sto`) ([definition in LLVM](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/lib/Target/RISCV/RISCVFeatures.td#L258), [available as non-experimental since LLVM 17](`d41a73aa94`)) - This is used to optimize synchronization primitives (e.g., Linux uses this for spinlocks (`b8ddb0df30`)). Btw, the question of whether `zaamo` is implied by `zabha` or not, which was discussed in https://github.com/rust-lang/rust/pull/130877, has been resolved in LLVM 20, since LLVM now treats `zaamo` as implied by `zabha`/`zacas` (https://github.com/llvm/llvm-project/pull/115694), just like GCC and rustc. r? `@Amanieu` `@rustbot` label +O-riscv +A-target-feature	2025-02-24 19:21:47 -05:00
Ralf Jung	0362775fb5	rename simd_shuffle_generic → simd_shuffle_const_generic	2025-02-24 19:13:23 +01:00
Oli Scherer	553828c6f4	Mark more LLVM FFI as safe	2025-02-24 15:11:29 +00:00
Oli Scherer	3565603d25	Use a safe wrapper around an LLVM FFI function	2025-02-24 15:11:29 +00:00
Oli Scherer	f16f64b15a	Remove inherent function that has a trait method duplicate of a commonly imported trait	2025-02-24 15:11:29 +00:00
Oli Scherer	241c83f0c7	Deduplicate more functions between `SimpleCx` and `CodegenCx`	2025-02-24 15:11:29 +00:00
Oli Scherer	29440b84a9	Remove an unused lifetime param	2025-02-24 15:11:29 +00:00
Oli Scherer	396baa750e	Make allocator shim creation mostly use safe code	2025-02-24 15:11:29 +00:00
Oli Scherer	840e31b29f	Generalize BaseTypeCodegenMethods	2025-02-24 15:11:29 +00:00
Oli Scherer	75356b7437	Generalize `BackendTypes` over `GenericCx`	2025-02-24 15:11:29 +00:00
Oli Scherer	bfd88cead0	Avoid some duplication between SimpleCx and CodegenCx	2025-02-24 15:11:29 +00:00
Oli Scherer	d4379d2afd	Remove an unnecessary lifetime	2025-02-24 15:05:56 +00:00
Oli Scherer	a54bfcf52b	Use safe FFI for various functions in codegen_llvm	2025-02-24 15:05:56 +00:00
David Wood	a5615d3c62	codegen_llvm: avoid `Deref` impls w/ extern type `rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was or contained an extern type - in my experimental implementation of rust-lang/rfcs#3729, this isn't possible as the `Target` associated type's `?Sized` bound cannot be relaxed backwards compatibly (unless we come up with some way of doing this). In later pull requests with the rust-lang/rfcs#3729 implementation, breakage like this could only occur for nightly users relying on the `extern_types` feature. Upstreaming this to avoid needing to keep carrying this patch locally, and I think it'll necessarily need to change eventually.	2025-02-24 08:08:55 +00:00
bors	e0be1a0262	Auto merge of #137271 - nikic:gep-nuw-2, r=scottmcm Emit getelementptr inbounds nuw for pointer::add() Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative. Fixes https://github.com/rust-lang/rust/issues/137217.	2025-02-24 03:06:16 +00:00
Trevor Gross	a2bb4d748d	Rollup merge of #136543 - RalfJung:round-ties-even, r=tgross35 intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that. Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35` try-job: test-various	2025-02-23 14:30:25 -05:00
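For readers unfamiliar with the rounding mode named in this commit, here is a small sketch of the user-visible behavior via the standard library method `f64::round_ties_even` (assumed to be available, i.e. Rust 1.77 or newer). The wrapper names are illustrative only, and the link between this method and the intrinsic is not shown here.

```rust
/// Round to the nearest integer, breaking exact .5 ties toward the even neighbor.
pub fn ties_even(x: f64) -> f64 {
    x.round_ties_even()
}

/// For contrast: `f64::round` breaks exact .5 ties away from zero.
pub fn ties_away(x: f64) -> f64 {
    x.round()
}
```

The two functions disagree only on exact ties: 2.5 goes to 2.0 under ties-to-even but to 3.0 under ties-away-from-zero, while 3.5 goes to 4.0 under both.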
DianQK	da50297a6e	Save pre-link bitcode to `ModuleCodegen`	2025-02-23 21:23:38 +08:00
DianQK	9431427cc3	Add `new_regular` and `new_allocator` to `ModuleCodegen`	2025-02-23 21:23:38 +08:00
DianQK	1a99ca8da9	The embedded bitcode should always be prepared for LTO/ThinLTO	2025-02-23 21:23:36 +08:00
bors	15469f8f8a	Auto merge of #137420 - matthiaskrgr:rollup-rr0q37f, r=matthiaskrgr Rollup of 9 pull requests Successful merges: - #136910 (Implement feature `isolate_most_least_significant_one` for integer types) - #137183 (Prune dead regionck code) - #137333 (Use `edition = "2024"` in the compiler (redux)) - #137356 (Ferris 🦀 Identifier naming conventions) - #137362 (Add build step log for `run-make-support`) - #137377 (Always allow reusing cratenum in CrateLoader::load) - #137388 (Fix(lib/fs/tests): Disable rename POSIX semantics FS tests under Windows 7) - #137410 (Use StableHasher + Hash64 for dep_tracking_hash) - #137413 (jubilee cleared out the review queue) r? `@ghost` `@rustbot` modify labels: rollup	2025-02-22 13:32:44 +00:00
Taiki Endo	a343dcb97f	rustc_target: Add more RISC-V atomic-related features	2025-02-22 16:15:14 +09:00
Manuel Drehwald	e2d250c3f6	update autodiff flags	2025-02-21 21:51:20 -05:00
Manuel Drehwald	f4e2218b13	clean up autodiff code/comments	2025-02-21 21:47:48 -05:00
Michael Goulet	e1819a889a	Fix overcapturing, unsafe extern blocks, and new unsafe ops	2025-02-22 00:01:48 +00:00
Michael Goulet	76d341fa09	Upgrade the compiler to edition 2024	2025-02-22 00:01:48 +00:00
Matthias Krüger	636f4f19d8	Rollup merge of #137313 - oli-obk:push-ywvuqkxuqyom, r=petrochenkov Some codegen_llvm cleanups Using some more safe wrappers and thus being able to remove a large unsafe block. As a next step we should probably look into safe extern fns	2025-02-21 12:45:26 +01:00
Zachary S	7ba3d7b54e	Remove `BackendRepr::Uninhabited`, replaced with an `uninhabited: bool` field in `LayoutData`. Also update comments that referred to BackendRepr::Uninhabited.	2025-02-20 13:27:32 -06:00
Oli Scherer	ce7f58bd91	Merge two operations that were always performed together	2025-02-20 11:24:00 +00:00
Oli Scherer	ea7180813b	Create safe helper for LLVMSetDLLStorageClass	2025-02-20 11:15:00 +00:00
Scott McMurray	6f9cfd694d	Rework `OperandRef::extract_field` to stop calling `to_immediate_scalar` on things which are already immediates That means it stops trying to truncate things that are already `i1`s.	2025-02-19 12:03:40 -08:00
Scott McMurray	642a705f71	PR feedback	2025-02-19 11:36:52 -08:00
Scott McMurray	511bf307f0	Emit `trunc nuw` for unchecked shifts and `to_immediate_scalar` - For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information - Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information	2025-02-19 11:36:52 -08:00

2759 Commits