adding autodiff tests
I'd like to get started with upstreaming some tests, even though I'm still waiting for an answer on how to best integrate the enzyme pass. Can we therefore temporarily support the -Z llvm-plugins here without too much effort? And in that case, how would that work? I saw you can do remapping, e.g. `rust-src-base`, but I don't think that will give me the path to libEnzyme.so. Do you have another suggestion?
Other than that this test simply checks that the derivative of `x*x` is `2.0 * x`, which in this case is computed as
`%0 = fadd fast double %x.0.val, %x.0.val`
(I'll add a few more tests and move it to an autodiff folder if we can use the -Z flag)
r? ``@jieyouxu``
Locally at least `-Zllvm-plugins=${PWD}/build/x86_64-unknown-linux-gnu/enzyme/build/Enzyme/libEnzyme-19.so` seems to work if I copy the command I get from x.py test and run it manually. However, running x.py test itself fails.
Tracking:
- https://github.com/rust-lang/rust/issues/124509
Zulip discussion: https://rust-lang.zulipchat.com/#narrow/channel/326414-t-infra.2Fbootstrap/topic/Enzyme.20build.20changes
cg_llvm: Replace some DIBuilder wrappers with LLVM-C API bindings (part 1)
Part of #134001, follow-up to #136326, extracted from #134009.
This PR performs an arbitrary subset of the LLVM-C binding migrations from #134009, which should make it less tedious to review. The remaining migrations can occur in one or more subsequent PRs.
Cast global variables to default address space
Pointers for variables all need to be in the same address space for correct compilation. Therefore ensure that even if a global variable is created in a different address space, it is casted to the default address space before its value is used.
This is necessary for the amdgpu target and others where the default address space for global variables is not 0.
For example `core` does not compile in debug mode when not casting the address space to the default one because it tries to emit the following (simplified) LLVM IR, containing a type mismatch:
```llvm
`@alloc_0` = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
`@alloc_1` = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) `@alloc_0` }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```
For this to compile, we need to insert a constant `addrspacecast` before we use a global variable:
```llvm
`@alloc_0` = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
`@alloc_1` = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) `@alloc_0` to ptr) }>, align 8
```
As vtables are global variables as well, they are also created with an `addrspacecast`. In the SSA backend, after a vtable global is created, metadata is added to it. To add metadata, we need the non-casted global variable. Therefore we strip away an addrspacecast if there is one, to get the underlying global.
Tracking issue: #135024
Make our `DIFlags` match `LLVMDIFlags` in the LLVM-C API
In order to be able to use a mixture of LLVM-C and C++ bindings for debuginfo, our Rust-side `DIFlags` needs to have the same layout as LLVM-C's `LLVMDIFlags`, and we also need to be able to convert it to the `DIFlags` accepted by LLVM's C++ API.
Internally, LLVM converts between the two types with a simple cast. We can't necessarily rely on that always being true, and LLVM doesn't expose a conversion function, so we have two potential options:
- Convert each bit/subvalue individually
- Statically assert that doing a cast is actually fine
As long as both types do remain the same under the hood (which seems likely), the static-assert-and-cast approach is easier and faster. If the static assertions ever start failing against some future version of LLVM, we'll have to switch over to the convert-each-subvalue approach, which is a bit more error-prone.
---
Extracted from #134009, though this PR ended up choosing the static-assert-and-cast approach over the convert-each-subvalue approach.
Add gpu-kernel calling convention
The amdgpu-kernel calling convention was reverted in commit f6b21e90d1 (#120495 and https://github.com/rust-lang/rust-analyzer/pull/16463) due to inactivity in the amdgpu target.
Introduce a `gpu-kernel` calling convention that translates to `ptx_kernel` or `amdgpu_kernel`, depending on the target that rust compiles for.
Tracking issue: #135467
amdgpu target tracking issue: #135024
The amdgpu-kernel calling convention was reverted in commit
f6b21e90d1 due to inactivity in the amdgpu
target.
Introduce a `gpu-kernel` calling convention that translates to
`ptx_kernel` or `amdgpu_kernel`, depending on the target that rust
compiles for.
See llvm/llvm-project#121851
For LLVM 20+, this function (`renameModuleForThinLTO`) has no return
value. For prior versions of LLVM, this never failed, but had a
signature which allowed an error value people were handling.
[Debuginfo] Force enum `DISCR_*` to `static const u64` to allow for inspection via LLDB
see [here](https://rust-lang.zulipchat.com/#narrow/channel/317568-t-compiler.2Fwg-debugging/topic/Revamping.20Debuginfo/near/486614878) for more info.
This change mainly helps `*-msvc` debugged with LLDB. Currently, LLDB cannot inspect `static` struct fields, so the intended visualization for enums is only borderline functional, and niche enums with ranges of discriminant cannot be determined at all .
LLDB *can* inspect `static const` values (though for whatever reason, non-enum/non-u64 consts don't work).
This change adds the `LLVMRustDIBuilderCreateQualifiedType` to the rust FFI layer to wrap the discr type with a `const` modifier, as well as forcing all generated integer enum `DISCR_*` values to be u64's. Those values will only ever be used by debugger visualizers anyway, so it shouldn't be a huge deal, but I left a fixme comment for it just in case.. The `tag` also still properly reflects the discriminant type, so no information is lost.
Pointers for variables all need to be in the same address space for
correct compilation. Therefore ensure that even if a global variable is
created in a different address space, it is casted to the default
address space before its value is used.
This is necessary for the amdgpu target and others where the default
address space for global variables is not 0.
For example `core` does not compile in debug mode when not casting the
address space to the default one because it tries to emit the following
(simplified) LLVM IR, containing a type mismatch:
```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspace(1) @alloc_0 }>, align 8
; ^ here a struct containing a `ptr` is needed, but it is created using a `ptr addrspace(1)`
```
For this to compile, we need to insert a constant `addrspacecast` before
we use a global variable:
```llvm
@alloc_0 = addrspace(1) constant <{ [6 x i8] }> <{ [6 x i8] c"bit.rs" }>, align 1
@alloc_1 = addrspace(1) constant <{ ptr }> <{ ptr addrspacecast (ptr addrspace(1) @alloc_0 to ptr) }>, align 8
```
As vtables are global variables as well, they are also created with an
`addrspacecast`. In the SSA backend, after a vtable global is created,
metadata is added to it. To add metadata, we need the non-casted global
variable. Therefore we strip away an addrspacecast if there is one, to
get the underlying global.
In the LLVM-C API, boolean values are passed as `typedef int LLVMBool`, but our
Rust-side typedef was using `c_uint` instead.
Signed and unsigned integers have the same ABI on most platforms, but that
isn't universally true, so we should prefer to be consistent with LLVM.
Pass end position of span through inline ASM cookie
Before this PR, only the start position of the span was passed though the inline ASM cookie to diagnostics. LLVM 19 has full support for 64-bit inline ASM cookies; this PR uses that to pass the end position of the span in the upper 32 bits, meaning inline ASM diagnostics now point at the entire line the error occurred on, not just the first character of it.
Allow disabling ASan instrumentation for globals
AddressSanitizer adds instrumentation to global variables unless the [`no_sanitize_address`](https://llvm.org/docs/LangRef.html#global-attributes) attribute is set on them.
This commit extends the existing `#[no_sanitize(address)]` attribute to set this; previously it only had the desired effect on functions.
(cc https://github.com/rust-lang/rust/issues/39699)
CFI: Append debug location to CFI blocks
Currently we're not appending debug locations to the inserted CFI blocks. This shows up in #132615 and #100783. This change fixes that by passing down the debug location to the CFI type-test generation and appending it to the blocks.
Credits also belong to `@jakos-sec` who worked with me on this.
LLVM does not expect to ever see multiple dbg_declares for the same variable at the same
location with different values. proc-macros make it possible for arbitrary code,
including multiple calls that get inlined, to happen at any given location in the source
code. Add discriminators when that happens so these locations are different to LLVM.
This may interfere with the AddDiscriminators pass in LLVM, which is added by the
unstable flag -Zdebug-info-for-profiling.
Fixes#131944
Trim and tidy includes in `rustc_llvm`
These includes tend to accumulate over time, and are usually only removed when something breaks in a new LLVM version, so it's nice to clean them up manually once in a while.
General strategy used for this PR:
- Remove all includes from `LLVMWrapper.h` that aren't needed by the header itself, transplanting them to individual source files as necessary.
- For each source file, temporarily remove each include if doing so doesn't cause a compile error.
- If a “required” include looks like it shouldn't be needed, try replacing it with its sub-includes, then trim that list.
- After doing all of the above, go back and re-add any removed include if the file does actually use things defined in that header, even if the header happens to also be included by something else.
Simplify FFI calls for `-Ztime-llvm-passes` and `-Zprint-codegen-stats`
The existing code for these unstable LLVM-infodump flags was jumping through hoops to pass an allocated C string across the FFI boundary, when it's much simpler to just write to a `&RustString` instead.
AddressSanitizer adds instrumentation to global variables unless the
[`no_sanitize_address`](https://llvm.org/docs/LangRef.html#global-attributes)
attribute is set on them.
This commit extends the existing `#[no_sanitize(address)]` attribute to
set this; previously it only had the desired effect on functions.
cg_llvm: Clean up FFI calls for operand bundles
All of these FFI functions have equivalents in the stable LLVM-C API, though `LLVMBuildCallBr` requires a temporary polyfill on LLVM 18.
This PR also creates a clear split between `OperandBundleOwned` and `OperandBundle`, and updates the internals of the owner to be a little less terrifying.