nordic-dev.net/rust - rust

mirror of https://github.com/rust-lang/rust.git synced 2025-05-12 01:47:38 +00:00

Author	SHA1	Message	Date
Guillaume Gomez	9d7d782e50	Rollup merge of #140460 - heiher:issue-140455, r=Urgau Fix handling of LoongArch target features not supported by LLVM 19 Fixes #140455	2025-05-01 22:27:23 +02:00
Matthias Krüger	555df301f8	Rollup merge of #134232 - bjorn3:naked_asm_improvements, r=wesleywiser Share the naked asm impl between cg_ssa and cg_clif This was introduced in https://github.com/rust-lang/rust/pull/128004.	2025-04-30 17:27:57 +02:00
WANG Rui	a2b3f11700	Filter out LoongArch features not supported by the current LLVM version	2025-04-29 22:12:27 +08:00
Trevor Gross	19e82b43eb	Enable `target_has_reliable_f16_math` on x86 This has been disabled due to an LLVM misoptimization with `powi.f16` [1]. This was fixed upstream and the fix is included in LLVM20, so tests no longer need to be disabled. `f16` still remains disabled on MinGW due to the ABI issue. [1]: https://github.com/llvm/llvm-project/issues/98665	2025-04-29 05:39:15 +00:00
Chris Denton	e082bf341f	Rollup merge of #140323 - tgross35:cfg-unstable-float, r=Urgau Implement the internal feature `cfg_target_has_reliable_f16_f128` Support for `f16` and `f128` is varied across targets, backends, and backend versions. Eventually we would like to reach a point where all backends support these approximately equally, but until then we have to work around some of these nuances of support being observable. Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which provides the following new configuration gates: * `cfg(target_has_reliable_f16)` * `cfg(target_has_reliable_f16_math)` * `cfg(target_has_reliable_f128)` * `cfg(target_has_reliable_f128_math)` `reliable_f16` and `reliable_f128` indicate that basic arithmetic for the type works correctly. The `_math` versions indicate that anything relying on `libm` works correctly, since sometimes this hits a separate class of codegen bugs. These options match configuration set by the build script at [1]. The logic for LLVM support is duplicated as-is from the same script. There are a few possible updates that will come as a follow up. The config introduced here is not planned to ever become stable, it is only intended to replace the build scripts for `std` tests and `compiler-builtins` that don't have any way to configure based on the codegen backend. MCP: https://github.com/rust-lang/compiler-team/issues/866 Closes: https://github.com/rust-lang/compiler-team/issues/866 [1]: `555e1d0386/library/std/build.rs (L84-L186)` --- The second commit makes use of this config to replace `cfg_{f16,f128}{,_math}` in `library/`. I omitted providing a `cfg(bootstrap)` configuration to keep things simpler since the next beta branch is in two weeks. try-job: aarch64-gnu try-job: i686-msvc-1 try-job: test-various try-job: x86_64-gnu try-job: x86_64-msvc-ext2	2025-04-28 23:29:17 +00:00
Chris Denton	d4845e1b0b	Rollup merge of #139308 - Shourya742:2025-03-29-add-autodiff-inline, r=ZuseZ4 add autodiff inline closes: #138920 r? ```@ZuseZ4``` try-job: dist-aarch64-linux	2025-04-28 23:29:14 +00:00
bit-aloo	7018392337	remove noinline attribute and add alwaysinline after AD pass	2025-04-28 21:10:32 +05:30
Andrew Zhogin	c366756a85	AsyncDrop implementation using shim codegen of async_drop_in_place::{closure}, scoped async drop added.	2025-04-28 16:23:13 +07:00
Trevor Gross	6ceeb0849e	Implement the internal feature `cfg_target_has_reliable_f16_f128` Support for `f16` and `f128` is varied across targets, backends, and backend versions. Eventually we would like to reach a point where all backends support these approximately equally, but until then we have to work around some of these nuances of support being observable. Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which provides the following new configuration gates: * `cfg(target_has_reliable_f16)` * `cfg(target_has_reliable_f16_math)` * `cfg(target_has_reliable_f128)` * `cfg(target_has_reliable_f128_math)` `reliable_f16` and `reliable_f128` indicate that basic arithmetic for the type works correctly. The `_math` versions indicate that anything relying on `libm` works correctly, since sometimes this hits a separate class of codegen bugs. These options match configuration set by the build script at [1]. The logic for LLVM support is duplicated as-is from the same script. There are a few possible updates that will come as a follow up. The config introduced here is not planned to ever become stable, it is only intended to replace the build scripts for `std` tests and `compiler-builtins` that don't have any way to configure based on the codegen backend. MCP: https://github.com/rust-lang/compiler-team/issues/866 Closes: https://github.com/rust-lang/compiler-team/issues/866 [1]: `555e1d0386/library/std/build.rs (L84-L186)`	2025-04-27 19:58:44 +00:00
Matthias Krüger	564e5ccb5c	Rollup merge of #140202 - est31:let_chains_feature_compiler, r=lcnr Make #![feature(let_chains)] bootstrap conditional in compiler/ Let chains have been stabilized recently in #132833, so we can remove the gating from our uses in the compiler (as the compiler uses edition 2024).	2025-04-25 07:50:25 +02:00
bit-aloo	9bc04016e6	add custom enzyme markers to target methods	2025-04-25 11:09:52 +05:30
bit-aloo	f319dd909e	add llvm wrappers and corresponding methods in attribute	2025-04-25 11:09:52 +05:30
Matthias Krüger	c3f811f02f	Rollup merge of #139700 - EnzymeAD:autodiff-flags, r=oli-obk Autodiff flags Interestingly, it seems that some other projects have conflicts with exactly the same LLVM optimization passes as autodiff. At least `LLVMRustOptimize` has exactly the flags that we need to disable problematic opt passes. This PR enables us to compile code where users differentiate two identical functions in the same module. This has been especially common in test cases, but it's not impossible to encounter in the wild. It also enables two new flags for testing/debugging. I consider writing an MCP to upgrade PrintPasses to be a standalone -Z flag, since it is not the same as `-Z print-llvm-passes`, which IMHO gives less useful output. A discussion can be found here: [#t-compiler/llvm > Print llvm passes. @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/187780-t-compiler.2Fllvm/topic/Print.20llvm.20passes.2E/near/511533038) Finally, it improves `PrintModBefore` and `PrintModAfter`. They used to work reliably, but now we just schedule enzyme as part of an existing ModulePassManager (MPM). Since Enzyme is last in the MPM scheduling, PrintModBefore became very inaccurate. It used to print the input module, which we gave to Enzyme and was great for creating llvm-ir reproducers. However, lately the MPM would run the whole `default<O3>` pipeline, which heavily modifies the llvm module, before we pass it to Enzyme. That made it impossible to use the flag to create llvm-ir reproducers for Enzyme bugs. We now schedule a PrintModule pass just before Enzyme, solving this problem. Based on the PrintPass output, it also _seems_ like changing `registerEnzymeAndPassPipeline(PB, true);` to `registerEnzymeAndPassPipeline(PB, false);` has no effect. In theory, the bool should tell Enzyme to schedule some helpful passes in the PassBuilder. However, since it doesn't do anything and I'm not 100% sure anymore on whether we really need it, I'll just disable it for now and postpone investigations. r? ``@oli-obk`` closes #139471 Tracking: - https://github.com/rust-lang/rust/issues/124509	2025-04-24 17:19:44 +02:00
Matthias Krüger	a8ebfb256a	Rollup merge of #139261 - RalfJung:msvc-align-mitigation, r=oli-obk mitigate MSVC alignment issue on x86-32 This implements mitigation for https://github.com/rust-lang/rust/issues/112480 by stopping to emit `align` attributes on loads and function arguments when building for a win32 MSVC target. MSVC is known to not properly align `u64` and similar types, and claiming to LLVM that everything is properly aligned increases the chance that this will cause problems. Of course, the misalignment is still a bug, but we can't fix that bug, only MSVC can. Also add an errata note to the platform support page warning users about this known problem. try-job: `i686-msvc*`	2025-04-24 11:40:35 +02:00
est31	7493e1cdf6	Make #![feature(let_chains)] bootstrap conditional in compiler/	2025-04-23 16:40:30 +02:00
Chris Denton	d15c603173	Rollup merge of #137953 - RalfJung:simd-intrinsic-masks, r=WaffleLapkin simd intrinsics with mask: accept unsigned integer masks, and fix some of the errors It's not clear at all why the mask would have to be signed, it is anyway interpreted bitwise. The backend should just make sure that works no matter the surface-level type; our LLVM backend already does this correctly. The note of "the mask may be widened, which only has the correct behavior for signed integers" explains... nothing? Why can't the code do the widening correctly? If necessary, just cast to the signed type first... Also while we are at it, fix the errors. For simd_masked_load/store, the errors talked about the "third argument" but they meant the first argument (the mask is the first argument there). They also used the wrong type for `expected_element`. I have extremely low confidence in the GCC part of this PR. See [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/257879-project-portable-simd/topic/On.20the.20sign.20of.20masks)	2025-04-20 13:02:48 +00:00
Ralf Jung	566dfd1a0d	simd intrinsics with mask: accept unsigned integer masks	2025-04-20 12:25:27 +02:00
Matthias Krüger	68b439c63b	Rollup merge of #138599 - adwinwhite:recursive-overflow, r=wesleywiser avoid overflow when generating debuginfo for expanding recursive types Fixes #135093 Fixes #121538 Fixes #107362 Fixes #100618 Fixes #115994 The overflow happens because expanding recursive types keep creating new nested types when recurring into sub fields. I fixed that by returning an empty stub node when expanding recursion is detected.	2025-04-18 05:17:53 +02:00
Matthias Krüger	87a163523f	Rollup merge of #139351 - EnzymeAD:autodiff-batching2, r=oli-obk Autodiff batching2 ~I will rebase it once my first PR landed.~ done. This autodiff batch mode is more similar to scalar autodiff, since it still only takes one shadow argument. However, that argument is supposed to be `width` times larger. r? `@oli-obk` Tracking: - https://github.com/rust-lang/rust/issues/124509	2025-04-17 21:53:23 +02:00
Manuel Drehwald	a68ae0cbc1	working dupv and dupvonly for fwd mode	2025-04-16 17:13:31 -04:00
Vadim Petrochenkov	38f7060a73	Revert "Deduplicate template parameter creation" This reverts commit `6adc2c1fd6`.	2025-04-15 21:00:11 +03:00
Yotam Ofek	4b63362f3d	Use `newtype_index!`-generated types more idiomatically	2025-04-14 16:17:06 +00:00
bjorn3	421f22e8bf	Pass &mut self to codegen_global_asm	2025-04-14 09:38:04 +00:00
bjorn3	e2e96fa14e	Pass MonoItemData to MonoItem::define	2025-04-14 09:38:03 +00:00
Manuel Drehwald	5ea9125f37	update documentation	2025-04-12 01:36:47 -04:00
Manuel Drehwald	31578dc587	fix "could not find source function" error by preventing function merging before AD	2025-04-12 01:36:47 -04:00
Manuel Drehwald	75f86e6e2e	fix LooseTypes flag and PrintMod behaviour, add debug helper	2025-04-12 01:36:44 -04:00
Jacob Pratt	eea366c191	Rollup merge of #139664 - oli-obk:push-tkmurytmnsyw, r=RalfJung Reuse address-space computation from global alloc r? `@RalfJung` just avoiding some minor duplication	2025-04-11 21:21:02 +02:00
bors	e1b06f7730	Auto merge of #139453 - compiler-errors:incr, r=jieyouxu Prepend temp files with per-invocation random string to avoid temp filename conflicts https://github.com/rust-lang/rust/issues/139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management. Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name: ``` #[inline(never)] #[cfg(any(rpass1, rpass3))] fn a() -> i32 { 0 } #[cfg(any(cfail2))] fn a() -> i32 { 1 } fn main() { evil::evil(); assert_eq!(a(), 0); } mod evil { #[cfg(any(rpass1, rpass3))] pub fn evil() { unsafe { std::arch::asm!("/* /"); } } #[cfg(any(cfail2))] pub fn evil() { unsafe { std::arch::asm!("missing"); } } } ``` Session 1 (`rpass1`): Type-check, borrow-check, etc. * Serialize the dep graph to the incremental working directory `.../s-...-working/`. * Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd. * Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`. * Save-temps option means we don't delete `XXX.rcgu.o`. * Link the binary and stuff. * Finalize[^2] the working incremental session by renaming `.../s-...-working` to ` s-...-asjkdhsjakd` (some other finalized incr comp session dir name). Session 2 (`cfail2`): * Load artifacts from the previous finalized incremental session, namely the dep graph. * Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red. * Serialize the dep graph to the incremental working directory `.../s-...-working/`. * Codegen object file to a temp file `XXX.rcgu.o`. HERE IS THE PROBLEM: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory. * Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the previous session is the last finalized session. Session 3 (`rpass3`): * Load artifacts from the previous finalized incremental session, namely the dep graph. NOTE that this is from session 1. * All the dep graph nodes are green since we are basically replaying session 1. * Codegen object file `XXX.o`, which is detected as reused from session 1 since dep nodes were green. That means we reuse `XXX.o` which had been dirtied from session 2. * Link the binary and stuff. This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1. At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst unsound. This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example). --- This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not deterministic, but temporary files are transient anyways, so I don't believe this is a problem. That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc. Fixes https://github.com/rust-lang/rust/issues/139407 [^1]: `175dcc7773/compiler/rustc_fs_util/src/lib.rs (L60)` [^2]: `175dcc7773/compiler/rustc_incremental/src/persist/fs.rs (L1-L40)`	2025-04-11 13:59:33 +00:00
Oli Scherer	cfa52e48ae	Reuse address-space computation from global alloc	2025-04-11 09:28:47 +00:00
Stuart Cook	45ebc4060b	Rollup merge of #137447 - folkertdev:simd-extract-insert-dyn, r=scottmcm add `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}` fixes https://github.com/rust-lang/rust/issues/137372 adds `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`, which contrary to their non-dyn counterparts allow a non-const index. Many platforms (but notably not x86_64 or aarch64) have dedicated instructions for this operation, which stdarch can emit with this change. Future work is to also make the `Index` operation on the `Simd` type emit this operation, but the intrinsic can't be used directly. We'll need some MIR shenanigans for that. r? `@ghost`	2025-04-11 13:31:43 +10:00
Folkert de Vries	59c55339af	add `simd_insert_dyn` and `simd_extract_dyn`	2025-04-10 21:22:07 +02:00
Ralf Jung	2678d04dd9	mitigate MSVC unsoundness by not emitting alignment attributes on win32-msvc targets also mention the MSVC alignment issue in platform-support.md	2025-04-07 23:30:55 +02:00
Michael Goulet	9c372d8940	Prepend temp files with a string per invocation of rustc	2025-04-07 20:48:40 +00:00
Michael Goulet	effef88ac7	Simplify temp path creation a bit	2025-04-07 20:48:40 +00:00
Stuart Cook	5863b426b9	Rollup merge of #139465 - EnzymeAD:autodiff-sret, r=oli-obk add sret handling for scalar autodiff r? `@oli-obk` Fixing one of the TODOs which I left in my previous batching PR. This one handles sret for scalar autodiff. `sret` mostly shows up when we try to return a lot of scalar floats. People often start testing autodiff with toy functions which just use a few scalars as inputs and outputs, and those were the most likely to be affected by this issue. So this fix should make learning/teaching hopefully a bit easier. Tracking: - https://github.com/rust-lang/rust/issues/124509	2025-04-07 22:29:21 +10:00
Stuart Cook	ddf099ff4e	Rollup merge of #139397 - Zalathar:virtual, r=jieyouxu coverage: Build the CGU's global file table as late as possible Embedding coverage metadata in the output binary is a delicate dance, because per-function records need to embed references to the per-CGU filename table, but we only want to include files in that table if they are successfully used by at least one function. The way that we build the file tables has changed a few times over the last few years. This particular change is motivated by experimental work on properly supporting macro-expansion regions, which adds some additional constraints that our previous implementation wasn't equipped to deal with. LLVM is very strict about not allowing unused entries in local file tables. Currently that's not much of an issue, because we assume one source file per function, but to support expansion regions we need the flexibility to avoid committing to the use of a file until we're completely sure that we are able and willing to produce at least one coverage mapping region for it. In particular, when preparing a function's covfun record, we need the flexibility to decide at a late stage that a particular file isn't needed/usable after all. (It's OK for the global file table to contain unused entries, but we would still prefer to avoid that if possible, and this implementation also achieves that.)	2025-04-07 22:29:20 +10:00
Manuel Drehwald	d6467d34ae	handle sret for scalar autodiff	2025-04-07 07:07:16 -04:00
Zalathar	4322b6e97d	coverage: Build the CGU's global file table as late as possible	2025-04-07 17:11:49 +10:00
bors	8fb32ab8e5	Auto merge of #139473 - Kobzol:rollup-ycksn9b, r=Kobzol Rollup of 5 pull requests Successful merges: - #138314 (fix usage of `autodiff` macro with inner functions) - #139426 (Make the UnifyKey and UnifyValue imports non-nightly) - #139431 (Remove LLVM 18 inline ASM span fallback) - #139456 (style guide: add let-chain rules) - #139467 (More trivial tweaks) r? `@ghost` `@rustbot` modify labels: rollup	2025-04-07 06:27:35 +00:00
Zalathar	b3c40cf374	coverage: Deal with unused functions and their names in one place	2025-04-06 13:55:28 +10:00
Zalathar	75135aaf19	coverage: Extract module `mapgen::unused` for handling unused functions	2025-04-06 13:55:27 +10:00
beetrees	3aac9a37a5	Remove LLVM 18 inline ASM span fallback	2025-04-06 02:31:52 +01:00
Josh Stone	12167d7064	Update the minimum external LLVM to 19	2025-04-05 11:44:38 -07:00
Matthias Krüger	543160dd62	Rollup merge of #138368 - rcvalle:rust-kcfi-arity, r=davidtwco KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see https://github.com/rust-lang/rust/issues/138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).	2025-04-05 10:18:03 +02:00
Ramon de C Valle	a98546b961	KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see rust-lang/rust#138311, https://github.com/llvm/llvm-project/pull/121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).	2025-04-05 04:05:04 +00:00
Stuart Cook	c6bf3a01ef	Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk Autodiff batching Enzyme supports batching, which is especially known from the ML side when training neural networks. There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights. That's quite inefficient, so what you normally do is pass in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations. Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size, and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of ```rs for i in 0..100 { df(x[i], y[i], 1234); } ``` You can now do ```rs for i in (0..100).step_by(4) { df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234); } ``` which will give the same results, but allows better compiler optimizations. See the testcase for details. There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days. I will also add more tests for both modes. For anyone preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU I'll also add some other docs to the dev guide and user docs in another PR. r? ghost Tracking: - https://github.com/rust-lang/rust/issues/124509 - https://github.com/rust-lang/rust/issues/135283	2025-04-05 13:18:13 +11:00
Manuel Drehwald	89d8948835	add new flag to print the module post-AD, before opts	2025-04-04 14:25:23 -04:00
Manuel Drehwald	b7c63a973f	add autodiff batching backend	2025-04-04 14:24:23 -04:00
Matthias Krüger	66e61c78e7	Rollup merge of #138949 - madsmtm:rename-to-darwin, r=WaffleLapkin Rename `is_like_osx` to `is_like_darwin` Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.). ``@rustbot`` label O-apple r? compiler	2025-04-04 08:02:05 +02:00
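
The hard-link problem described in the message for #139453 above can be reproduced outside the compiler. The following is a minimal sketch, not rustc's actual implementation: the helper names `temp_object_name` and `finalized_after_second_session`, the file name `finalized.o`, and the strings standing in for object code are all assumptions made for illustration. It assumes a filesystem that supports hard links and relies on `std::fs::write` truncating an existing file in place, so a second "session" that reuses the temp file name also changes the hard-linked copy.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: builds a temp object name in the
/// `{crate-name}.{cgu}.{invocation_temp}.rcgu.o` shape quoted in the commit message.
fn temp_object_name(crate_name: &str, cgu: &str, invocation_temp: &str) -> String {
    format!("{}.{}.{}.rcgu.o", crate_name, cgu, invocation_temp)
}

/// Simulates two sessions that share the directory `dir`.
/// Session 1 writes a temp object and hard-links it to a "finalized" path.
/// Session 2 writes its own temp object under `second_temp_name`.
/// Returns what the finalized file contains afterwards.
fn finalized_after_second_session(
    dir: &Path,
    first_temp_name: &str,
    second_temp_name: &str,
) -> io::Result<String> {
    fs::create_dir_all(dir)?;
    let finalized = dir.join("finalized.o");

    // Start from a clean slate so the helper can be called repeatedly.
    let _ = fs::remove_file(&finalized);
    let _ = fs::remove_file(dir.join(first_temp_name));
    let _ = fs::remove_file(dir.join(second_temp_name));

    // Session 1: write the temp file, then hard-link it into the finalized location.
    fs::write(dir.join(first_temp_name), "session 1")?;
    fs::hard_link(dir.join(first_temp_name), &finalized)?;

    // Session 2: `fs::write` truncates an existing file in place, so if the
    // name is reused, the data behind the hard link changes as well.
    fs::write(dir.join(second_temp_name), "session 2")?;

    fs::read_to_string(&finalized)
}
```

Calling `finalized_after_second_session` with the same name for both sessions returns the second session's content, while passing two names built by `temp_object_name` with different `invocation_temp` values leaves the finalized file untouched.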

2662 Commits