Try to write the panic message with a single `write_all` call
This writes the panic message to a buffer before writing it to stderr. This allows it to be printed with a single `write_all` call, which prevents it from being interleaved with other output. It also adds a newline before and after the message, so that the panic message always occupies its own lines.
Before:
```
thread 'thread 'thread 'thread 'thread '<unnamed>thread 'thread 'thread 'thread '<unnamed><unnamed>thread '<unnamed>' panicked at ' panicked at <unnamed><unnamed><unnamed><unnamed><unnamed>' panicked at <unnamed>' panicked at src\heap.rssrc\heap.rs'
panicked at ' panicked at ' panicked at ' panicked at ' panicked at src\heap.rs' panicked at src\heap.rs::src\heap.rssrc\heap.rssrc\heap.rssrc\heap.rssrc\heap.rs:src\heap.rs:455455:::::455:455::455455455455455:455:99:::::9:9:
:
999:
999:
assertion failed: size <= (*queue).block_size:
:
assertion failed: size <= (*queue).block_size:
assertion failed: size <= (*queue).block_size:
:
:
assertion failed: size <= (*queue).block_sizeassertion failed: size <= (*queue).block_sizeassertion failed: size <= (*queue).block_size
assertion failed: size <= (*queue).block_size
assertion failed: size <= (*queue).block_sizeassertion failed: size <= (*queue).block_sizeerror: process didn't exit successfully: `target\debug\direct_test.exe` (exit code: 0xc0000409, STATUS_STACK_BUFFER_OVERRUN)
```
After:
```
thread '<unnamed>' panicked at src\heap.rs:455:9:
assertion failed: size <= (*queue).block_size
thread '<unnamed>' panicked at src\heap.rs:455:9:
assertion failed: size <= (*queue).block_size
thread '<unnamed>' panicked at src\heap.rs:455:9:
assertion failed: size <= (*queue).block_size
error: process didn't exit successfully: `target\debug\direct_test.exe` (exit code: 0xc0000409, STATUS_STACK_BUFFER_OVERRUN)
```
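A minimal sketch of the buffering idea, not the actual standard library code: the helper names `render_message` and `emit` are made up for illustration, and a heap-allocated `Vec<u8>` is assumed as the buffer purely to keep the example short. A single `write_all` call makes interleaving much less likely, although the sink may still split one call into several underlying writes.

```rust
use std::io::{self, Write};

/// Builds the whole message in memory, with a newline before and after it.
/// (Hypothetical helper; the real panic hook is structured differently.)
fn render_message(thread: &str, location: &str, msg: &str) -> Vec<u8> {
    let mut buf = Vec::new();
    // Writing into a Vec<u8> cannot fail, so the unwrap never triggers.
    write!(buf, "\nthread '{thread}' panicked at {location}:\n{msg}\n").unwrap();
    buf
}

/// Hands the finished buffer to the sink with one `write_all` call,
/// instead of issuing one write per formatted fragment.
fn emit(out: &mut impl Write, thread: &str, location: &str, msg: &str) -> io::Result<()> {
    out.write_all(&render_message(thread, location, msg))
}
```

To print to stderr, pass `&mut io::stderr().lock()` as the sink.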
---
try-jobs: x86_64-gnu-llvm-18
When `-Cstrip` was changed to use the bundled rust-objcopy instead of
/usr/bin/strip on OSX, strip-like arguments were preserved.
But although strip and objcopy are the same binary, they behave
differently: the defaults depend on the name the binary is invoked as.
Notably, strip strips everything by default, and objcopy doesn't strip
anything by default.
Additionally, `-S` actually means `--strip-all`, so `-Cstrip=debuginfo`
stripped everything and `-Cstrip=symbols` didn't strip anything.
We now correctly pass `--strip-debug` and `--strip-all`.
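
The corrected mapping can be sketched as follows. The enum `StripLevel` and the function `objcopy_strip_arg` are illustrative names, not the compiler's own types; only the two objcopy flags come from the description above.

```rust
/// Illustrative stand-in for the `-Cstrip` setting (not rustc's real type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum StripLevel {
    None,
    DebugInfo,
    Symbols,
}

/// Picks the explicit objcopy flag for a strip level.
/// objcopy strips nothing by default, so each level needs its own flag.
fn objcopy_strip_arg(level: StripLevel) -> Option<&'static str> {
    match level {
        StripLevel::None => None,
        StripLevel::DebugInfo => Some("--strip-debug"),
        StripLevel::Symbols => Some("--strip-all"),
    }
}
```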
This ensures `std::intrinsics::transmute` is deemphasized
in the search engine and other UI, by cleaning it into a deprecation
without propagating it through reexports when the parent module
is stable.
Fix formatting command
The formatting command previously had two issues:
- if rustfmt failed, it would print the command invocation, which is unnecessarily noisy
- there was a race condition that led to orphaned rustfmt processes printing their output after bootstrap exited
We fix this by
- removing the printing, it's not really useful
- threading failure through properly instead of just yoloing exit(1)
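
A rough sketch of what "threading failure through" can look like, assuming a simplified model rather than bootstrap's actual code: `wait_for_all` and `child_succeeded` are hypothetical helpers. Every job is waited on, so no child outlives the caller, and the caller decides what to do with the failure count.

```rust
use std::process::Child;

/// Waits for every job, even after one has failed, so that no process is
/// left running when the caller returns. Reports how many jobs failed.
fn wait_for_all<J>(jobs: Vec<J>, mut wait: impl FnMut(J) -> bool) -> Result<(), usize> {
    let mut failed = 0;
    for job in jobs {
        if !wait(job) {
            failed += 1;
        }
    }
    if failed == 0 { Ok(()) } else { Err(failed) }
}

/// Adapter for real child processes: an I/O error while waiting counts as failure.
fn child_succeeded(mut child: Child) -> bool {
    child.wait().map(|status| status.success()).unwrap_or(false)
}
```

With spawned rustfmt processes collected in a `Vec<Child>`, the call would be `wait_for_all(children, child_succeeded)`, and the error is returned to the caller rather than exiting on the spot.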
Remove range-metadata amdgpu workaround
Range metadata was disabled for amdgpu due to a backend bug. I did not encounter any problems after removing the workaround (tried compiling `core` and `alloc`), so I assume the bug has been fixed in LLVM in recent years.
Remove the workaround to re-enable range metadata.
Tracking issue: #135024
bootstrap: Overhaul and simplify the `tool_extended!` macro
Similar to #134950, but for the macro that declares build steps for some tools.
The main changes are:
- Removing some functionality that isn't needed by any of the tools currently using the macro
- Moving some code out of the macro and into ordinary helper functions
- Switching to one macro invocation per tool, and struct-like syntax so that rustfmt will format them
There should be no functional change.
bootstrap: flip `compile::Rustc` vs `compile::Assemble`
The `PathSet` prefix matching unfortunately also has implications for `./x build compiler --stage 0`, because the path filter `"compiler"` gets consumed by `compile::Rustc` step first after `PathSet` prefix matching, whereas before `PathSet` prefix matching, `compile::Rustc` would not have consumed `"compiler"`.
This merely papers over #134970 to unblock contributors from using `./x build compiler --stage 0`.
The `PathSet` prefix matching behavior is tracked in #135022.
Closes #134970.
The PathSet prefix matching unfortunately also has implications for `./x
build compiler`, because the path filter `"compiler"` gets consumed by
`compile::Rustc` step first after PathSet prefix matching, whereas
before PathSet prefix matching, the later-registered `compile::Assemble`
step would've consumed the `"compiler"` path filter.
This merely papers over the issue with PathSet prefix handling to
unblock contributors from using `./x build compiler`.
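
The ordering effect can be illustrated with a toy model. This is not bootstrap's real `PathSet` logic: the `Step` struct, the `consumer` function, and the paths `compiler/rustc` and `compiler` are assumptions chosen only to show why registration order matters once prefix matching is in play.

```rust
use std::path::Path;

/// A build step and the path it registers (illustrative values only).
struct Step {
    name: &'static str,
    path: &'static str,
}

/// Returns the first registered step that claims `filter`.
/// With `prefix_matching`, a filter also matches any step path it is a
/// component-wise prefix of; without it, only exact paths match.
fn consumer(steps: &[Step], filter: &str, prefix_matching: bool) -> Option<&'static str> {
    steps
        .iter()
        .find(|s| {
            if prefix_matching {
                Path::new(s.path).starts_with(filter)
            } else {
                s.path == filter
            }
        })
        .map(|s| s.name)
}
```

In this model, with `compile::Rustc` registered first, the filter `"compiler"` goes to `compile::Assemble` under exact matching but to `compile::Rustc` under prefix matching; registering `compile::Assemble` first restores the old outcome.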
Using struct-like syntax allows rustfmt to format macro invocations, instead of
giving up and ignoring them.
Using a separate macro invocation per tool makes the macro slightly simpler,
and isolates syntax errors to individual invocations.
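
As a rough illustration of the struct-like style, here is a small stand-in macro. `declare_tool!`, its fields, and the values passed to it are hypothetical and much simpler than the real `tool_extended!`; the point is only that each invocation looks like a struct literal and stands alone.

```rust
/// Hypothetical, simplified stand-in for a tool-declaring macro.
/// Each invocation declares one tool using struct-like field syntax.
macro_rules! declare_tool {
    ($name:ident { path: $path:expr, tool_name: $tool_name:expr, stable: $stable:expr $(,)? }) => {
        pub struct $name;

        impl $name {
            pub const PATH: &'static str = $path;
            pub const TOOL_NAME: &'static str = $tool_name;
            pub const STABLE: bool = $stable;
        }
    };
}

// One invocation per tool: a typo in one does not affect the others.
// The field values below are illustrative, not bootstrap's real settings.
declare_tool!(Rustfmt { path: "src/tools/rustfmt", tool_name: "rustfmt", stable: true });
declare_tool!(Miri { path: "src/tools/miri", tool_name: "miri", stable: false });
```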
Turn rustc-dev-guide into a Josh subtree
Discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/channel/196385-t-compiler.2Fwg-rustc-dev-guide/topic/a.20move.20to.20main.20repo.20.28rust-lang.2Frust.29).
Accompanying rustc-dev-guide PR: https://github.com/rust-lang/rustc-dev-guide/pull/2183
I didn't create a bootstrap step for rustc-dev-guide yet, because the rustc-dev-guide version that we currently use in this repo doesn't have linkcheck enabled and that fails tests.
The subtree starts with commit [ad93c5f1c49f2aeb45f7a4954017b1e607df9f5e](ad93c5f1c4).
What I did:
```
export DIR=src/doc/rustc-dev-guide
# Remove submodule
git submodule status ${DIR}
git submodule deinit ${DIR}
git rm -r --cached ${DIR}
rm -rf ${DIR}
# Remove rustc-dev-guide from .gitmodules
git commit -m "Removed ${DIR} submodule"
# Import history with josh
git fetch https://github.com/rust-lang/rustc-dev-guide ad93c5f1c49f2aeb45f7a4954017b1e607df9f5e
josh-filter ':prefix=src/doc/rustc-dev-guide' FETCH_HEAD
git merge --allow-unrelated-histories FILTERED_HEAD
# A few follow-up cleanup commits
```
r? ehuss
Autodiff Upstreaming - rustc_codegen_llvm changes
Now that the autodiff/Enzyme backend is merged, this is an upstream PR for the `rustc_codegen_llvm` changes.
It also includes small changes to three files under `compiler/rustc_ast`, which overlap with my frontend PR (https://github.com/rust-lang/rust/pull/129458).
Here I only include minimal definitions of structs and enums to be able to build this backend code.
The same goes for the minimal changes to `compiler/rustc_codegen_ssa`; the majority of changes there will be in another PR, once either this or the frontend PR gets merged.
We currently have 68 files left to merge, 19 in the frontend PR, 21 (+3 from the frontend) in this PR, and then ~30 in the middle-end.
This PR is large because it includes two of my three large files (~800 loc each). I could also first only upstream enzyme_ffi.rs, but I think people might want to see some use of these bindings in the same PR?
To already highlight the things which reviewers might want to discuss:
1) `enzyme_ffi.rs`: I do have a fallback module to make sure that we don't link rustc against Enzyme when we build rustc without autodiff support.
2) `add_panic_msg_to_global` was a pain to write and I currently can't even use it. Enzyme writes gradients into shadow memory. Pass in one float scalar? We'll allocate and return an extra float telling you how this float affected the output. Pass in a slice of floats? We'll let you allocate the vector and pass in a mutable reference to a float slice; we'll then write the gradient into that slice. It should be at least as large as your original slice, so we check that and panic if not. Currently we panic silently, but I already generate a nicer panic message with this function. I just don't know how to print it to the user yet. I discussed this with a few rustc devs and the best we could come up with (for now) was to look for mangled panic calls in the IR and pick one, which works surprisingly reliably. If someone knows a good way to clean this up and print the panic message I'm all in; otherwise I can remove the code that writes the nicer panic message and keep the silent panic, since it's enough for soundness, especially since this PR is already rather large.
3) `SanitizeHWAddress`: When differentiating C++, Enzyme can use TBAA to "understand" enums/unions, but for Rust we don't have this information. LLVM might do speculative loads which (without TBAA) confuse Enzyme, so we disable those with this attribute. This attribute is only set during the first opt run before Enzyme differentiates code. We then remove it again once we are done with autodiff and run the opt pipeline a second time. Since enums are everywhere in Rust, support for them is crucial, but if this looks too cursed I can remove these ~100 lines and keep them in my fork for now; we can then discuss them separately to keep this PR simpler.
4) Duplicated llvm-opt runs: Differentiating already optimized code (and being able to do additional optimizations on the fly, e.g. for GPU code) is _the_ reason why Enzyme is so fast, so the compile time is acceptable for autodiff users: https://enzyme.mit.edu/talks/Publications/ (There are also algorithmic issues in Enzyme core which are more serious than running opt twice).
5) I assume that if we merge these minimal cg_ssa changes here already, I also need to fix the other backends (GCC and Cranelift) to have dummy implementations, correct?
6) *I'm happy to split this PR up further if reviewers have recommendations on how to.*
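
To make the shadow-memory convention from item 2 concrete, here is a hand-written sketch. It does not use Enzyme or the autodiff frontend: `d_sum_of_squares` is a manually derived gradient, and both the function names and the choice to accumulate into the shadow slice are assumptions made for illustration.

```rust
/// The function being differentiated: f(x) = sum of x_i^2.
fn sum_of_squares(x: &[f64]) -> f64 {
    x.iter().map(|v| v * v).sum()
}

/// Hand-written gradient following the shadow-slice convention described
/// above: the caller allocates `dx`, and the gradient is written into it.
/// The length check mirrors the "panic if the shadow is too small" rule.
fn d_sum_of_squares(x: &[f64], dx: &mut [f64]) -> f64 {
    assert!(
        dx.len() >= x.len(),
        "shadow slice too short: {} < {}",
        dx.len(),
        x.len()
    );
    for (g, v) in dx.iter_mut().zip(x) {
        // d/dx_i of x_i^2 is 2 * x_i; accumulate into the shadow slot.
        *g += 2.0 * v;
    }
    sum_of_squares(x)
}
```

Calling it with a shadow slice shorter than the input panics with a readable message, which is the behaviour the nicer panic message is meant to provide for generated code.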
For the full implementation, see: https://github.com/rust-lang/rust/pull/129175
Tracking:
- https://github.com/rust-lang/rust/issues/124509