Commit Graph

368 Commits

Author SHA1 Message Date
bit-aloo
7018392337
remove noinline attribute and add alwaysinline after AD pass 2025-04-28 21:10:32 +05:30
Matthias Krüger
c3f811f02f
Rollup merge of #139700 - EnzymeAD:autodiff-flags, r=oli-obk
Autodiff flags

Interestingly, it seems that some other projects have conflicts with exactly the same LLVM optimization passes as autodiff.
At least `LLVMRustOptimize` has exactly the flags that we need to disable problematic opt passes.

This PR enables us to compile code where users differentiate two identical functions in the same module. This has been especially common in test cases, but it's not impossible to encounter in the wild.

It also enables two new flags for testing/debugging. I consider writing an MCP to upgrade PrintPasses to be a standalone -Z flag, since it is *not* the same as `-Z print-llvm-passes`, which IMHO gives less useful output. A discussion can be found here: [#t-compiler/llvm > Print llvm passes. @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/187780-t-compiler.2Fllvm/topic/Print.20llvm.20passes.2E/near/511533038)

Finally, it improves `PrintModBefore` and `PrintModAfter`. They used to work reliable, but now we just schedule enzyme as part of an existing ModulePassManager (MPM). Since Enzyme is last in the MPM scheduling, PrintModBefore became very inaccurate. It used to print the input module, which we gave to the Enzyme and was great to create llvm-ir reproducer. However, lately the MPM would run the whole `default<O3>` pipeline, which heavily modifies the llvm module, before we pass it to Enzyme. That made it impossible to use the flag to create llvm-ir reproducers for Enzyme bugs. We now schedule a PrintModule pass just before Enzyme, solving this problem.

Based on the PrintPass output, it also _seems_ like changing `registerEnzymeAndPassPipeline(PB, true);` to `registerEnzymeAndPassPipeline(PB, false);` has no effect. In theory, the bool should tell Enzyme to schedule some helpful passes in the PassBuilder. However, since it doesn't do anything and I'm not 100% sure anymore on whether we really need it, I'll just disable it for now and postpone investigations.

r? ``@oli-obk``

closes #139471

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-04-24 17:19:44 +02:00
Manuel Drehwald
5ea9125f37 update documentation 2025-04-12 01:36:47 -04:00
Manuel Drehwald
31578dc587 fix "could not find source function" error by preventing function merging before AD 2025-04-12 01:36:47 -04:00
Manuel Drehwald
75f86e6e2e fix LooseTypes flag and PrintMod behaviour, add debug helper 2025-04-12 01:36:44 -04:00
Michael Goulet
9c372d8940 Prepend temp files with a string per invocation of rustc 2025-04-07 20:48:40 +00:00
Michael Goulet
effef88ac7 Simplify temp path creation a bit 2025-04-07 20:48:40 +00:00
beetrees
3aac9a37a5
Remove LLVM 18 inline ASM span fallback 2025-04-06 02:31:52 +01:00
Stuart Cook
c6bf3a01ef
Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk
Autodiff batching

Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is passing in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations.

Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size,
and then each Dual/Duplicated argument has not one, but N shadow arguments.  So instead of
```rs
for i in 0..100 {
   df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in 0..100.step_by(4) {
   df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.

There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days.

I will also add more tests for both modes.

For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:

- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
2025-04-05 13:18:13 +11:00
Manuel Drehwald
89d8948835 add new flag to print the module post-AD, before opts 2025-04-04 14:25:23 -04:00
Matthias Krüger
66e61c78e7
Rollup merge of #138949 - madsmtm:rename-to-darwin, r=WaffleLapkin
Rename `is_like_osx` to `is_like_darwin`

Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.).

``@rustbot`` label O-apple
r? compiler
2025-04-04 08:02:05 +02:00
Mads Marquart
328846c6eb Rename is_like_osx to is_like_darwin 2025-03-25 21:53:52 +01:00
Daniel Paoliello
79b9664091 Reduce visibility of most items in rustc_codegen_llvm 2025-03-25 16:36:47 +11:00
bors
0c72c0d11a Auto merge of #133250 - DianQK:embed-bitcode-pgo, r=nikic
The embedded bitcode should always be prepared for LTO/ThinLTO

Fixes #115344. Fixes #117220.

There are currently two methods for generating bitcode that used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`.

When using with `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, then the bitcode is used for LTO. We run the Call Graph Profile Pass twice on the same module.

This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`.

r? nikic
2025-03-01 08:22:18 +00:00
许杰友 Jieyou Xu (Joe)
61e90040db
Rollup merge of #137017 - bjorn3:ignore_invalid_bitcode, r=oli-obk
Don't error when adding a staticlib with bitcode files compiled by newer LLVM

cc https://github.com/rust-lang/rust/issues/128955#issuecomment-2657811196
2025-02-28 22:29:49 +08:00
bjorn3
9f190d764f Restore usage of io::Error 2025-02-26 13:45:35 +00:00
David Wood
a5615d3c62
codegen_llvm: avoid Deref impls w/ extern type
`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was
or contained an extern type - in my experimental implementation of
rust-lang/rfcs#3729, this isn't possible as the `Target` associated
type's `?Sized` bound cannot be relaxed backwards compatibly (unless we
come up with some way of doing this).

In later pull requests with the rust-lang/rfcs#3729 implementation,
breakage like this could only occur for nightly users relying on the
`extern_types` feature.

Upstreaming this to avoid needing to keep carrying this patch locally,
and I think it'll necessarily need to change eventually.
2025-02-24 08:08:55 +00:00
DianQK
da50297a6e
Save pre-link bitcode to ModuleCodegen 2025-02-23 21:23:38 +08:00
DianQK
9431427cc3
Add new_regular and new_allocator to ModuleCodegen 2025-02-23 21:23:38 +08:00
DianQK
1a99ca8da9
The embedded bitcode should always be prepared for LTO/ThinLTO 2025-02-23 21:23:36 +08:00
Manuel Drehwald
e2d250c3f6 update autodiff flags 2025-02-21 21:51:20 -05:00
Manuel Drehwald
f4e2218b13 clean up autodiff code/comments 2025-02-21 21:47:48 -05:00
Oli Scherer
ce7f58bd91 Merge two operations that were always performed together 2025-02-20 11:24:00 +00:00
bjorn3
736ef0a4ce Don't error when adding a staticlib with bitcode files compiled by newer LLVM 2025-02-14 10:54:21 +00:00
clubby789
2966256133 Make -O mean -C opt-level=3 2025-02-13 19:47:55 +00:00
Matthias Krüger
9e89feefb9
Rollup merge of #135549 - oli-obk:push-tmxtpnrloyqu, r=compiler-errors
Document some safety constraints and use more safe wrappers

Lots of unsafe codegen_llvm code has safe wrappers already, so I used some of them and added some where applicable. I stopped here because this diff is large enough and should probably be reviewed independently of other changes.
2025-02-12 06:07:35 +01:00
Oli Scherer
dcf1e4d72b Document some safety constraints and use more safe wrappers 2025-02-11 09:47:13 +00:00
Oli Scherer
4b83038d63 Add a safe wrapper for WriteBitcodeToFile 2025-02-11 09:41:22 +00:00
Oli Scherer
b2cd1b8ead Remove an unsafe closure invariant by inlining the closure wrapper into the called function 2025-02-11 09:41:22 +00:00
Jacob Pratt
6153a8dcea
Rollup merge of #136721 - dpaoliello:cleanllvm2, r=Zalathar
cg_llvm: Reduce visibility of some items outside the `llvm` module

Next piece of #135502

This reduces the visibility of items (other than those in the `llvm` module) so that dead code analysis will correctly identify unused items.
2025-02-11 01:02:40 -05:00
Daniel Paoliello
5f29273921 rustc_codegen_llvm: Mark items as pub(crate) outside of the llvm module 2025-02-10 10:17:25 -08:00
Manuel Drehwald
1221cff551 move second opt run to lto phase and cleanup code 2025-02-10 01:35:22 -05:00
Manuel Drehwald
21d096184e fix non-enzyme builds 2025-02-07 22:27:46 -05:00
Ken Matsui
44e8c43976
rustc_codegen_llvm: remove outdated asm-to-obj codegen note
Remove comment about missing integrated assembler handling, which was
removed in commit 02840ca.
2025-01-22 17:58:50 -05:00
Matthias Krüger
448bad9eba
Rollup merge of #133752 - klensy:cp, r=davidtwco
replace copypasted ModuleLlvm::parse

replaced code same as in bd36e69d25/compiler/rustc_codegen_llvm/src/lib.rs (L426-L445)

except before error message was emitted via `write::llvm_err`, which returned other error kind, but it still ok?
2025-01-13 15:56:55 +01:00
Matthew Maurer
fc32dd49cb llvm: Ignore error value that is always false
See llvm/llvm-project#121851

For LLVM 20+, this function (`renameModuleForThinLTO`) has no return
value. For prior versions of LLVM, this never failed, but had a
signature which allowed an error value people were handling.
2025-01-07 01:02:22 +00:00
Manuel Drehwald
d753cbf779 upstream rustc_codegen_llvm changes for enzyme/autodiff 2025-01-01 21:42:45 +01:00
Ralf Jung
a0dbb37ebd add llvm_floatabi field to target spec that controls FloatABIType 2024-12-30 21:59:05 +01:00
Ralf Jung
fff026c8e5 rustc_llvm: expose FloatABIType target machine parameter 2024-12-30 18:10:59 +01:00
Ralf Jung
62bb35ab5d make -Csoft-float have an effect on all ARM targets 2024-12-29 11:10:36 +01:00
Nicholas Nethercote
2620eb42d7 Re-export more rustc_span::symbol things from rustc_span.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.

This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
2024-12-18 13:38:53 +11:00
bors
903d2976fd Auto merge of #129181 - beetrees:asm-spans, r=pnkfelix,compiler-errors
Pass end position of span through inline ASM cookie

Before this PR, only the start position of the span was passed though the inline ASM cookie to diagnostics. LLVM 19 has full support for 64-bit inline ASM cookies; this PR uses that to pass the end position of the span in the upper 32 bits, meaning inline ASM diagnostics now point at the entire line the error occurred on, not just the first character of it.
2024-12-12 02:34:06 +00:00
Kornel
eadea7764e
Use c"lit" for CStrings without unwrap 2024-12-02 18:16:36 +00:00
klensy
694950d73c replace copypasted ModuleLlvm::parse 2024-12-02 16:02:46 +03:00
Nikita Popov
d3ad000943 Respect verify-llvm-ir option in the backend
We are currently unconditionally verifying the LLVM IR in the
backend (twice), ignoring the value of the verify-llvm-ir option.
2024-11-26 15:26:03 +01:00
beetrees
68227a3777
Pass end position of span through inline ASM cookie 2024-11-26 13:00:08 +00:00
DianQK
3a23669787
embed-bitcode is no longer used in iOS 2024-11-24 15:51:47 +08:00
bjorn3
9e6d2da83d Reduce dependence on the target name
The target name can be anything with custom target specs. Matching on
fields inside the target spec is much more robust than matching on the
target name.
2024-11-03 18:29:01 +00:00
Noratrieb
a26450cf81 Rename target triple to target tuple in many places in the compiler
This changes the naming to the new naming, used by `--print
target-tuple`.
It does not change all locations, but many.
2024-11-02 21:29:59 +01:00
Matthias Krüger
bb544f863f
Rollup merge of #131037 - madsmtm:move-llvm-target-versioning, r=petrochenkov
Move versioned Apple LLVM targets from `rustc_target` to `rustc_codegen_ssa`

Fully specified LLVM targets contain the OS version on macOS/iOS/tvOS/watchOS/visionOS, and this version depends on the deployment target environment variables like `MACOSX_DEPLOYMENT_TARGET`, `IPHONEOS_DEPLOYMENT_TARGET` etc.

We would like to move this to later in the compilation pipeline, both because it feels impure to access environment variables when fetching target information, but mostly because we need access to more information from https://github.com/rust-lang/rust/pull/130883 to do https://github.com/rust-lang/rust/issues/118204. See also https://github.com/rust-lang/rust/pull/129342#issuecomment-2335156119 for some discussion.

The first and second commit does the actual refactor, it should be a non-functional change, the third commit adds diagnostics for invalid deployment targets, which are now possible to do because we have access to the session.

Tested with the same commands as in https://github.com/rust-lang/rust/pull/130435.

r? ``````@petrochenkov``````
2024-11-02 08:33:10 +01:00