Do not ignore uninhabited types for function-call ABI purposes. (Remove BackendRepr::Uninhabited)
Accepted MCP: https://github.com/rust-lang/compiler-team/issues/832Fixes#135802
Do not consider the inhabitedness of a type for function call ABI purposes.
* Remove the [`rustc_abi::BackendRepr::Uninhabited`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/enum.BackendRepr.html) variant
* Instead calculate the `BackendRepr` of uninhabited types "normally" (as though they were not uninhabited "at the top level", but still considering inhabitedness of variants to determine enum layout, etc)
* Add an `uninhabited: bool` field to [`rustc_abi::LayoutData`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_abi/struct.LayoutData.html) so inhabitedness of a `LayoutData` can still be queried when necessary (e.g. when determining if an enum variant needs a tag value allocated to it).
This should not affect type layouts (size/align/field offset); this should only affect function call ABI, and only of uninhabited types.
cc ``@RalfJung``
Pass through of target features to llvm-bitcode-linker and handling them
When using the llvm-bitcode-linker (`linker-flavor=llbc`) target-features are not passed through and are not handled by it.
The llvm-bitcode-linker is mainly used as a self contained linker to link llvm bitcode for the nvptx64 target. It uses `llvm-link`, `opt` and `llc` internally. To produce a `.ptx` file of a specific ptx-version it is necessary to pass the version to llc with the `--mattr` option. Without explicitly setting it, the emitted `.ptx`-version is the minimum supported version of the `--target-cpu`.
I would like to be able to explicitly set the ptx version as [some llvm problems only occur in earlier `.ptx`-versions](https://github.com/llvm/llvm-project/issues/112998).
Therefore this pull request adds support for passing target features to llvm-bitcode-linker and handling them.
I was not quite sure if adding these features to `rustc_target/src/target_features.rs` is necessary or not. If so I will gladly add these.
r? ``@kjetilkjeka``
Create a generic AVR target: avr-none
This commit removes the `avr-unknown-gnu-atmega328` target and replaces it with a more generic `avr-none` variant that must be specialized using `-C target-cpu` (e.g. `-C target-cpu=atmega328p`).
Seizing the day, I'm adding myself as the maintainer of this target - I've been already fixing the bugs anyway, might as well make it official 🙂
Related discussions:
- https://github.com/rust-lang/rust/pull/131171
- https://github.com/rust-lang/compiler-team/issues/800
try-job: x86_64-gnu-debug
Fix codegen of uninhabited PassMode::Indirect return types.
Add codegen test for uninhabited PassMode::Indirect return types.
Enable optimizations for uninhabited return type codegen test
- For shifts this shrinks the IR by no longer needing an `assume` while still providing the UB information
- Having this on the `i8`→`i1` truncations will hopefully help with some places that have to load `i8`s or pass those in LLVM structs without range information
This commit removes the `avr-unknown-gnu-atmega328` target and replaces
it with a more generic `avr-none` variant that must be specialized with
the `-C target-cpu` flag (e.g. `-C target-cpu=atmega328p`).
Remove `rustc_middle::mir::tcx` module.
This is a really weird module. For example, what does `tcx` in `rustc_middle::mir::tcx::PlaceTy` mean? The answer is "not much".
The top-level module comment says:
> Methods for the various MIR types. These are intended for use after
> building is complete.
Awfully broad for a module that has a handful of impl blocks for some MIR types, none of which really relates to `TyCtxt`. `git blame` indicates the comment is ancient, from 2015, and made sense then.
This module is now vestigial. This commit removes it and moves all the code within into `rustc_middle::mir::statement`. Some specifics:
- `Place`, `PlaceRef`, `Rvalue`, `Operand`, `BorrowKind`: they all have `impl` blocks in both the `tcx` and `statement` modules. The commit merges the former into the latter.
- `BinOp`, `UnOp`: they only have `impl` blocks in `tcx`. The commit moves these into `statement`.
- `PlaceTy`, `RvalueInitializationState`: they are defined in `tcx`. This commit moves them into `statement` *and* makes them available in `mir::*`, like many other MIR types.
r? `@tmandry`
This is a really weird module. For example, what does `tcx` in
`rustc_middle::mir::tcx::PlaceTy` mean? The answer is "not much".
The top-level module comment says:
> Methods for the various MIR types. These are intended for use after
> building is complete.
Awfully broad for a module that has a handful of impl blocks for some
MIR types, none of which really relates to `TyCtxt`. `git blame`
indicates the comment is ancient, from 2015, and made sense then.
This module is now vestigial. This commit removes it and moves all the
code within into `rustc_middle::mir::statement`. Some specifics:
- `Place`, `PlaceRef`, `Rvalue`, `Operand`, `BorrowKind`: they all have `impl`
blocks in both the `tcx` and `statement` modules. The commit merges
the former into the latter.
- `BinOp`, `UnOp`: they only have `impl` blocks in `tcx`. The commit
moves these into `statement`.
- `PlaceTy`, `RvalueInitializationState`: they are defined in `tcx`.
This commit moves them into `statement` *and* makes them available in
`mir::*`, like many other MIR types.
improve cold_path()
#120370 added a new instrinsic `cold_path()` and used it to fix `likely` and `unlikely`
However, in order to limit scope, the information about cold code paths is only used in 2-target switch instructions. This is sufficient for `likely` and `unlikely`, but limits usefulness of `cold_path` for idiomatic rust. For example, code like this:
```
if let Some(x) = y { ... }
```
may generate 3-target switch:
```
switch y.discriminator:
0 => true branch
1 = > false branch
_ => unreachable
```
and therefore marking a branch as cold will have no effect.
This PR improves `cold_path()` to work with arbitrary switch instructions.
Note that for 2-target switches, we can use `llvm.expect`, but for multiple targets we need to manually emit branch weights. I checked Clang and it also emits weights in this situation. The Clang's weight calculation is more complex that this PR, which I believe is mainly because `switch` in `C/C++` can have multiple cases going to the same target.
Rollup of 7 pull requests
Successful merges:
- #137095 (Replace some u64 hashes with Hash64)
- #137100 (HIR analysis: Remove unnecessary abstraction over list of clauses)
- #137105 (Restrict DerefPure for Cow<T> impl to T = impl Clone, [impl Clone], str.)
- #137120 (Enable `relative-path-include-bytes-132203` rustdoc-ui test on Windows)
- #137125 (Re-add missing empty lines in the releases notes)
- #137145 (use add-core-stubs / minicore for a few more tests)
- #137149 (Remove SSE ABI from i586-pc-windows-msvc)
r? `@ghost`
`@rustbot` modify labels: rollup
Replace some u64 hashes with Hash64
I introduced the Hash64 and Hash128 types in https://github.com/rust-lang/rust/pull/110083, essentially as a mechanism to prevent hashes from landing in our leb128 encoding paths. If you just have a u64 or u128 field in a struct then derive Encodable/Decodable, that number gets leb128 encoding. So if you need to store a hash or some other value which behaves very close to a hash, don't store it as a u64.
This reverts part of https://github.com/rust-lang/rust/pull/117603, which turned an encoded Hash64 into a u64.
Based on https://github.com/rust-lang/rust/pull/110083, I don't expect this to be perf-sensitive on its own, though I expect that it may help stabilize some of the small rmeta size fluctuations we currently see in perf reports.
The end goal is to eliminate `Map` altogether.
I added a `hir_` prefix to all of them, that seemed simplest. The
exceptions are `module_items` which became `hir_module_free_items` because
there was already a `hir_module_items`, and `items` which became
`hir_free_items` for consistency with `hir_module_free_items`.
The .ptx version produced by llc can be specified by passing it with --mattr. Currently it is not possible to specify the .ptx version with -Ctarget-feature because these are not passed through to llvm-bitcode-linker and handled by it. This commit adds both.
--target-feature and -mattr are passed with equals to mitigate issues when the value starts with a - (minus).
Bitcode linkers like llvm-bitcode-linker or bpf linker hand over the target features to llvm during link stage. During link stage the `TyCtxt` is already gone so it is not possible to create a query for the global backend features any longer. The features preserved in `Session.target_features` only incorporate target features known to rustc. This would contradict with the behaviour during codegen stage which also passes target features to llvm which are unknown to rustc.
This commit adds target features as a field to the `CrateInfo` struct and queries the target features in its new function. This way the target features are preserved beyond tcx and available at link stage.
To make sure the `global_backend_features` query is always registered even if the CodegenBackend does not register it, this registration is added to the `provide`function of the `rustc_codegen_ssa` crate.
Export kernel descriptor for amdgpu kernels
The host runtime (HIP or HSA) expects a kernel descriptor object for each kernel in the ELF file. The amdgpu LLVM backend generates the object. It is created as a symbol with the name of the kernel plus a `.kd` suffix.
Add it to the exported symbols in the linker script, so that it can be found.
For reference, the symbol is created here in LLVM: d5457e4c16/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp (L966)
I wrote [a test](6a9115b121) for this as well, I’ll add that once the target is merged and working.
With this, all PRs to get working code for amdgpu are open (this + the target + the two patches adding addrspacecasts for alloca and global variables).
Tracking issue: #135024
r? `@workingjubilee`
Set both `nuw` and `nsw` in slice size calculation
There's an old note in the code to do this, and now that [LLVM-C has an API for it](f0b8ff1251/llvm/include/llvm-c/Core.h (L4403-L4408)), we might as well. And it's been there since what looks like LLVM 17 de9b6aa341 so doesn't even need to be conditional.
(There's other places, like `RawVecInner` or `Layout`, that might want to do things like this too, but I'll leave those for a future PR.)
`transmute` should also assume non-null pointers
Previously it only did integer-ABI things, but this way it does data pointers too. That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
Previously it only did integer-ABI things, but this way it does data pointers too. That gives more information in general to the backend, and allows slightly simplifying one of the helpers in slice iterators.
Stabilize target_feature_11
# Stabilization report
This is an updated version of https://github.com/rust-lang/rust/pull/116114, which is itself a redo of https://github.com/rust-lang/rust/pull/99767. Most of this commit and report were copied from those PRs. Thanks ```@LeSeulArtichaut``` and ```@calebzulawski!```
## Summary
Allows for safe functions to be marked with `#[target_feature]` attributes.
Functions marked with `#[target_feature]` are generally considered as unsafe functions: they are unsafe to call, cannot *generally* be assigned to safe function pointers, and don't implement the `Fn*` traits.
However, calling them from other `#[target_feature]` functions with a superset of features is safe.
```rust
// Demonstration function
#[target_feature(enable = "avx2")]
fn avx2() {}
fn foo() {
// Calling `avx2` here is unsafe, as we must ensure
// that AVX is available first.
unsafe {
avx2();
}
}
#[target_feature(enable = "avx2")]
fn bar() {
// Calling `avx2` here is safe.
avx2();
}
```
Moreover, once https://github.com/rust-lang/rust/pull/135504 is merged, they can be converted to safe function pointers in a context in which calling them is safe:
```rust
// Demonstration function
#[target_feature(enable = "avx2")]
fn avx2() {}
fn foo() -> fn() {
// Converting `avx2` to fn() is a compilation error here.
avx2
}
#[target_feature(enable = "avx2")]
fn bar() -> fn() {
// `avx2` coerces to fn() here
avx2
}
```
See the section "Closures" below for justification of this behaviour.
## Test cases
Tests for this feature can be found in [`tests/ui/target_feature/`](f6cb952dc1/tests/ui/target-feature).
## Edge cases
### Closures
* [target-feature 1.1: should closures inherit target-feature annotations? #73631](https://github.com/rust-lang/rust/issues/73631)
Closures defined inside functions marked with #[target_feature] inherit the target features of their parent function. They can still be assigned to safe function pointers and implement the appropriate `Fn*` traits.
```rust
#[target_feature(enable = "avx2")]
fn qux() {
let my_closure = || avx2(); // this call to `avx2` is safe
let f: fn() = my_closure;
}
```
This means that in order to call a function with #[target_feature], you must guarantee that the target-feature is available while the function, any closures defined inside it, as well as any safe function pointers obtained from target-feature functions inside it, execute.
This is usually ensured because target features are assumed to never disappear, and:
- on any unsafe call to a `#[target_feature]` function, presence of the target feature is guaranteed by the programmer through the safety requirements of the unsafe call.
- on any safe call, this is guaranteed recursively by the caller.
If you work in an environment where target features can be disabled, it is your responsibility to ensure that no code inside a target feature function (including inside a closure) runs after this (until the feature is enabled again).
**Note:** this has an effect on existing code, as nowadays closures do not inherit features from the enclosing function, and thus this strengthens a safety requirement. It was originally proposed in #73631 to solve this by adding a new type of UB: “taking a target feature away from your process after having run code that uses that target feature is UB” .
This was motivated by userspace code already assuming in a few places that CPU features never disappear from a program during execution (see i.e. 2e29bdf908/crates/std_detect/src/detect/arch/x86.rs); however, concerns were raised in the context of the Linux kernel; thus, we propose to relax that requirement to "causing the set of usable features to be reduced is unsafe; when doing so, the programmer is required to ensure that no closures or safe fn pointers that use removed features are still in scope".
* [Fix #[inline(always)] on closures with target feature 1.1 #111836](https://github.com/rust-lang/rust/pull/111836)
Closures accept `#[inline(always)]`, even within functions marked with `#[target_feature]`. Since these attributes conflict, `#[inline(always)]` wins out to maintain compatibility.
### ABI concerns
* [The extern "C" ABI of SIMD vector types depends on target features #116558](https://github.com/rust-lang/rust/issues/116558)
The ABI of some types can change when compiling a function with different target features. This could have introduced unsoundness with target_feature_11, but recent fixes (#133102, #132173) either make those situations invalid or make the ABI no longer dependent on features. Thus, those issues should no longer occur.
### Special functions
The `#[target_feature]` attribute is forbidden from a variety of special functions, such as main, current and future lang items (e.g. `#[start]`, `#[panic_handler]`), safe default trait implementations and safe trait methods.
This was not disallowed at the time of the first stabilization PR for target_features_11, and resulted in the following issues/PRs:
* [`#[target_feature]` is allowed on `main` #108645](https://github.com/rust-lang/rust/issues/108645)
* [`#[target_feature]` is allowed on default implementations #108646](https://github.com/rust-lang/rust/issues/108646)
* [#[target_feature] is allowed on #[panic_handler] with target_feature 1.1 #109411](https://github.com/rust-lang/rust/issues/109411)
* [Prevent using `#[target_feature]` on lang item functions #115910](https://github.com/rust-lang/rust/pull/115910)
## Documentation
* Reference: [Document the `target_feature_11` feature reference#1181](https://github.com/rust-lang/reference/pull/1181)
---
cc tracking issue https://github.com/rust-lang/rust/issues/69098
cc ```@workingjubilee```
cc ```@RalfJung```
r? ```@rust-lang/lang```
Rename rustc_middle::Ty::is_unsafe_ptr to is_raw_ptr
The wording unsafe pointer is less common and not mentioned in a lot of places, instead this is usually called a "raw pointer". For the sake of uniformity, we rename this method.
This came up during the review of
https://github.com/rust-lang/rust/pull/134424.
r? `@Noratrieb`
The host runtime (HIP or HSA) expects a kernel descriptor object for
each kernel in the ELF file. The amdgpu LLVM backend generates the
object. It is created as a symbol with the name of the kernel plus a
`.kd` suffix.
Add it to the exported symbols in the linker script, so that it can be
found.
compiler: gate `extern "{abi}"` in ast_lowering
I don't believe low-level crates like `rustc_abi` should have to know or care about higher-level concerns like whether the ABI string is stable for users. These implementation details can be made less open to public inspection. This way the code that governs stability is near the code that enforces stability, and compiled together.
It also abstracts away certain error messages instead of constantly repeating them.
A few error messages are simply deleted outright, instead of made uniform, because they are either too dated to be useful or redundant with other diagnostic improvements we could make. These can be pursued in followups: my first concern was making sure there wasn't unnecessary diagnostics-related code in `rustc_abi`, which is not well-positioned to understand what kind of errors are going to be generated based on how it is used.
r? ``@ghost``
adding autodiff tests
I'd like to get started with upstreaming some tests, even though I'm still waiting for an answer on how to best integrate the enzyme pass. Can we therefore temporarily support the -Z llvm-plugins here without too much effort? And in that case, how would that work? I saw you can do remapping, e.g. `rust-src-base`, but I don't think that will give me the path to libEnzyme.so. Do you have another suggestion?
Other than that this test simply checks that the derivative of `x*x` is `2.0 * x`, which in this case is computed as
`%0 = fadd fast double %x.0.val, %x.0.val`
(I'll add a few more tests and move it to an autodiff folder if we can use the -Z flag)
r? ``@jieyouxu``
Locally at least `-Zllvm-plugins=${PWD}/build/x86_64-unknown-linux-gnu/enzyme/build/Enzyme/libEnzyme-19.so` seems to work if I copy the command I get from x.py test and run it manually. However, running x.py test itself fails.
Tracking:
- https://github.com/rust-lang/rust/issues/124509
Zulip discussion: https://rust-lang.zulipchat.com/#narrow/channel/326414-t-infra.2Fbootstrap/topic/Enzyme.20build.20changes
The wording unsafe pointer is less common and not mentioned in a lot of
places, instead this is usually called a "raw pointer". For the sake of
uniformity, we rename this method.
This came up during the review of
https://github.com/rust-lang/rust/pull/134424.
Update bootstrap compiler and rustfmt
The rustfmt version we previously used formats things differently from what the latest nightly rustfmt does. This causes issues for subtrees that get formatted both in-tree and in their own repo. Updating the rustfmt used in-tree solves those issues. Also bumped the bootstrap compiler as the stage0 update command always updates both at the same
time.
compiler: mostly-finish `rustc_abi` updates
This almost-finishes all the updates in the compiler to use `rustc_abi` and removes some of the reexports of `rustc_abi` items in `rustc_target` that were previously available.
r? ```@compiler-errors```
Generate correct terminate block under Wasm EH
This fixes failing LLVM assertions during insnsel.
Improves #135665.
r? bjorn3
^ you reviewed the PR bringing Wasm EH in, I assume this is within your area of expertise?
tree-wide: parallel: Fully removed all `Lrc`, replaced with `Arc`
tree-wide: parallel: Fully removed all `Lrc`, replaced with `Arc`
This is continuation of https://github.com/rust-lang/rust/pull/132282 .
I'm pretty sure I did everything right. In particular, I searched all occurrences of `Lrc` in submodules and made sure that they don't need replacement.
There are other possibilities, through.
We can define `enum Lrc<T> { Rc(Rc<T>), Arc(Arc<T>) }`. Or we can make `Lrc` a union and on every clone we can read from special thread-local variable. Or we can add a generic parameter to `Lrc` and, yes, this parameter will be everywhere across all codebase.
So, if you think we should take some alternative approach, then don't merge this PR. But if it is decided to stick with `Arc`, then, please, merge.
cc "Parallel Rustc Front-end" ( https://github.com/rust-lang/rust/issues/113349 )
r? SparrowLii
`@rustbot` label WG-compiler-parallel
#[contracts::requires(...)] + #[contracts::ensures(...)]
cc https://github.com/rust-lang/rust/issues/128044
Updated contract support: attribute syntax for preconditions and postconditions, implemented via a series of desugarings that culminates in:
1. a compile-time flag (`-Z contract-checks`) that, similar to `-Z ub-checks`, attempts to ensure that the decision of enabling/disabling contract checks is delayed until the end user program is compiled,
2. invocations of lang-items that handle invoking the precondition, building a checker for the post-condition, and invoking that post-condition checker at the return sites for the function, and
3. intrinsics for the actual evaluation of pre- and post-condition predicates that third-party verification tools can intercept and reinterpret for their own purposes (e.g. creating shims of behavior that abstract away the function body and replace it solely with the pre- and post-conditions).
Known issues:
* My original intent, as described in the MCP (https://github.com/rust-lang/compiler-team/issues/759) was to have a rustc-prefixed attribute namespace (like rustc_contracts::requires). But I could not get things working when I tried to do rewriting via a rustc-prefixed builtin attribute-macro. So for now it is called `contracts::requires`.
* Our attribute macro machinery does not provide direct support for attribute arguments that are parsed like rust expressions. I spent some time trying to add that (e.g. something that would parse the attribute arguments as an AST while treating the remainder of the items as a token-tree), but its too big a lift for me to undertake. So instead I hacked in something approximating that goal, by semi-trivially desugaring the token-tree attribute contents into internal AST constucts. This may be too fragile for the long-term.
* (In particular, it *definitely* breaks when you try to add a contract to a function like this: `fn foo1(x: i32) -> S<{ 23 }> { ... }`, because its token-tree based search for where to inject the internal AST constructs cannot immediately see that the `{ 23 }` is within a generics list. I think we can live for this for the short-term, i.e. land the work, and continue working on it while in parallel adding a new attribute variant that takes a token-tree attribute alongside an AST annotation, which would completely resolve the issue here.)
* the *intent* of `-Z contract-checks` is that it behaves like `-Z ub-checks`, in that we do not prematurely commit to including or excluding the contract evaluation in upstream crates (most notably, `core` and `std`). But the current test suite does not actually *check* that this is the case. Ideally the test suite would be extended with a multi-crate test that explores the matrix of enabling/disabling contracts on both the upstream lib and final ("leaf") bin crates.
Rename `tcx.ensure()` to `tcx.ensure_ok()`, and improve the associated docs
This is all based on my archaeology for https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/.60TyCtxtEnsure.60.
The main renamings are:
- `tcx.ensure()` → `tcx.ensure_ok()`
- `tcx.ensure_with_value()` → `tcx.ensure_done()`
- Query modifier `ensure_forwards_result_if_red` → `return_result_from_ensure_ok`
Hopefully these new names are a better fit for the *actual* function and purpose of these query call modes.
Implement MIR lowering for unsafe binders
This is the final bit of the unsafe binders puzzle. It implements MIR, CTFE, and codegen for unsafe binders, and enforces that (for now) they are `Copy`. Later on, I'll introduce a new trait that relaxes this requirement to being "is `Copy` or `ManuallyDrop<T>`" which more closely models how we treat union fields.
Namely, wrapping unsafe binders is now `Rvalue::WrapUnsafeBinder`, which acts much like an `Rvalue::Aggregate`. Unwrapping unsafe binders are implemented as a MIR projection `ProjectionElem::UnwrapUnsafeBinder`, which acts much like `ProjectionElem::Field`.
Tracking:
- https://github.com/rust-lang/rust/issues/130516
Insert null checks for pointer dereferences when debug assertions are enabled
Similar to how the alignment is already checked, this adds a check
for null pointer dereferences in debug mode. It is implemented similarly
to the alignment check as a `MirPass`.
This inserts checks in the same places as the `CheckAlignment` pass and additionally
also inserts checks for `Borrows`, so code like
```rust
let ptr: *const u32 = std::ptr::null();
let val: &u32 = unsafe { &*ptr };
```
will have a check inserted on dereference. This is done because null references
are UB. The alignment check doesn't cover these places, because in `&(*ptr).field`,
the exact requirement is that the final reference must be aligned. This is something to
consider further enhancements of the alignment check.
For now this is implemented as a separate `MirPass`, to make it easy to disable
this check if necessary.
This is related to a 2025H1 project goal for better UB checks in debug
mode: https://github.com/rust-lang/rust-project-goals/pull/177.
r? `@saethlin`
Similar to how the alignment is already checked, this adds a check
for null pointer dereferences in debug mode. It is implemented similarly
to the alignment check as a MirPass.
This is related to a 2025H1 project goal for better UB checks in debug
mode: https://github.com/rust-lang/rust-project-goals/pull/177.
Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle
This PR should not be merged until the rustc_codegen_llvm part is merged.
I will also alter it a little based on what get's shaved off from the cg_llvm PR,
and address some of the feedback I received in the other PR (including cleanups).
I am putting it already up to
1) Discuss with `@jieyouxu` if there is more work needed to add tests to this and
2) Pray that there is someone reviewing who can tell me why some of my autodiff invocations get lost.
Re 1: My test require fat-lto. I also modify the compilation pipeline. So if there are any other llvm-ir tests in the same compilation unit then I will likely break them. Luckily there are two groups who currently have the same fat-lto requirement for their GPU code which I have for my autodiff code and both groups have some plans to enable support for thin-lto. Once either that work pans out, I'll copy it over for this feature. I will also work on not changing the optimization pipeline for functions not differentiated, but that will require some thoughts and engineering, so I think it would be good to be able to run the autodiff tests isolated from the rest for now. Can you guide me here please?
For context, here are some of my tests in the samples folder: https://github.com/EnzymeAD/rustbook
Re 2: This is a pretty serious issue, since it effectively prevents publishing libraries making use of autodiff: https://github.com/EnzymeAD/rust/issues/173. For some reason my dummy code persists till the end, so the code which calls autodiff, deletes the dummy, and inserts the code to compute the derivative never gets executed. To me it looks like the rustc_autodiff attribute just get's dropped, but I don't know WHY? Any help would be super appreciated, as rustc queries look a bit voodoo to me.
Tracking:
- https://github.com/rust-lang/rust/issues/124509
r? `@jieyouxu`
Fix deduplication mismatches in vtables leading to upcasting unsoundness
We currently have two cases where subtleties in supertraits can trigger disagreements in the vtable layout, e.g. leading to a different vtable layout being accessed at a callsite compared to what was prepared during unsizing. Namely:
### #135315
In this example, we were not normalizing supertraits when preparing vtables. In the example,
```
trait Supertrait<T> {
fn _print_numbers(&self, mem: &[usize; 100]) {
println!("{mem:?}");
}
}
impl<T> Supertrait<T> for () {}
trait Identity {
type Selff;
}
impl<Selff> Identity for Selff {
type Selff = Selff;
}
trait Middle<T>: Supertrait<()> + Supertrait<T> {
fn say_hello(&self, _: &usize) {
println!("Hello!");
}
}
impl<T> Middle<T> for () {}
trait Trait: Middle<<() as Identity>::Selff> {}
impl Trait for () {}
fn main() {
(&() as &dyn Trait as &dyn Middle<()>).say_hello(&0);
}
```
When we prepare `dyn Trait`, we see a supertrait of `Middle<<() as Identity>::Selff>`, which itself has two supertraits `Supertrait<()>` and `Supertrait<<() as Identity>::Selff>`. These two supertraits are identical, but they are not duplicated because we were using structural equality and *not* considering normalization. This leads to a vtable layout with two trait pointers.
When we upcast to `dyn Middle<()>`, those two supertraits are now the same, leading to a vtable layout with only one trait pointer. This leads to an offset error, and we call the wrong method.
### #135316
This one is a bit more interesting, and is the bulk of the changes in this PR. It's a bit similar, except it uses binder equality instead of normalization to make the compiler get confused about two vtable layouts. In the example,
```
trait Supertrait<T> {
fn _print_numbers(&self, mem: &[usize; 100]) {
println!("{mem:?}");
}
}
impl<T> Supertrait<T> for () {}
trait Trait<T, U>: Supertrait<T> + Supertrait<U> {
fn say_hello(&self, _: &usize) {
println!("Hello!");
}
}
impl<T, U> Trait<T, U> for () {}
fn main() {
(&() as &'static dyn for<'a> Trait<&'static (), &'a ()>
as &'static dyn Trait<&'static (), &'static ()>)
.say_hello(&0);
}
```
When we prepare the vtable for `dyn for<'a> Trait<&'static (), &'a ()>`, we currently consider the PolyTraitRef of the vtable as the key for a supertrait. This leads two two supertraits -- `Supertrait<&'static ()>` and `for<'a> Supertrait<&'a ()>`.
However, we can upcast[^up] without offsetting the vtable from `dyn for<'a> Trait<&'static (), &'a ()>` to `dyn Trait<&'static (), &'static ()>`. This is just instantiating the principal trait ref for a specific `'a = 'static`. However, when considering those supertraits, we now have only one distinct supertrait -- `Supertrait<&'static ()>` (which is deduplicated since there are two supertraits with the same substitutions). This leads to similar offsetting issues, leading to the wrong method being called.
[^up]: I say upcast but this is a cast that is allowed on stable, since it's not changing the vtable at all, just instantiating the binder of the principal trait ref for some lifetime.
The solution here is to recognize that a vtable isn't really meaningfully higher ranked, and to just treat a vtable as corresponding to a `TraitRef` so we can do this deduplication more faithfully. That is to say, the vtable for `dyn for<'a> Tr<'a>` and `dyn Tr<'x>` are always identical, since they both would correspond to a set of free regions on an impl... Do note that `Tr<for<'a> fn(&'a ())>` and `Tr<fn(&'static ())>` are still distinct.
----
There's a bit more that can be cleaned up. In codegen, we can stop using `PolyExistentialTraitRef` basically everywhere. We can also fix SMIR to stop storing `PolyExistentialTraitRef` in its vtable allocations.
As for testing, it's difficult to actually turn this into something that can be tested with `rustc_dump_vtable`, since having multiple supertraits that are identical is a recipe for ambiguity errors. Maybe someone else is more creative with getting that attr to work, since the tests I added being run-pass tests is a bit unsatisfying. Miri also doesn't help here, since it doesn't really generate vtables that are offset by an index in the same way as codegen.
r? `@lcnr` for the vibe check? Or reassign, idk. Maybe let's talk about whether this makes sense.
<sup>(I guess an alternative would also be to not do any deduplication of vtable supertraits (or only a really conservative subset) rather than trying to normalize and deduplicate more faithfully here. Not sure if that works and is sufficient tho.)</sup>
cc `@steffahn` -- ty for the minimizations
cc `@WaffleLapkin` -- since you're overseeing the feature stabilization :3
Fixes#135315Fixes#135316
Target option to require explicit cpu
Some targets have many different CPUs and no generic CPU that can be used as a default. For these targets, the user needs to explicitly specify a CPU through `-C target-cpu=`.
Add an option for targets and an error message if no CPU is set.
This affects the proposed amdgpu and avr targets.
amdgpu tracking issue: #135024
AVR MCP: https://github.com/rust-lang/compiler-team/issues/800
Lower index bounds checking to `PtrMetadata`, this time with the right fake borrow semantics 😸
Change `Rvalue::RawRef` to take a `RawRefKind` instead of just a `Mutability`. Then introduce `RawRefKind::FakeForPtrMetadata` and use that for lowering index bounds checking to a `PtrMetadata`. This new `RawRefKind::FakeForPtrMetadata` acts like a shallow fake borrow in borrowck, which mimics the semantics of the old `Rvalue::Len` operation we're replacing.
We can then use this `RawRefKind` instead of using a span desugaring hack in CTFE.
cc ``@scottmcm`` ``@RalfJung``
- Don't show environment variables. Seeing PATH is almost never useful, and it can be extremely long.
- For .rlibs in the sysroot, replace crate hashes with a `"-*"` string. This will expand to the full crate name when pasted into the shell.
- Move `.rlib` to outside the glob.
- Abbreviate the sysroot path to `<sysroot>` wherever it appears in the arguments.
This also adds an example of the linker output as a run-make test. Currently it only runs on x86_64-unknown-linux-gnu, because each platform has its own linker arguments. So that it's stable across machines, pass BUILD_ROOT as an argument through compiletest through to run-make tests.
- Only use linker-flavor=gnu-cc if we're actually going to compare the output. It doesn't exist on MacOS.
Add `#[optimize(none)]`
cc #54882
This extends the `optimize` attribute to add `none`, which corresponds to the LLVM `OptimizeNone` attribute.
Not sure if an MCP is required for this, happy to file one if so.
Separate Builder methods from tcx
As part of the autodiff upstreaming we noticed, that it would be nice to have various builder methods available without the TypeContext, which prevents the normal CodegenCx to be passed around between threads.
We introduce a SimpleCx which just owns the llvm module and llvm context, to encapsulate them.
The previous CodegenCx now implements deref and forwards access to the llvm module or context to it's SimpleCx sub-struct. This gives us a bit more flexibility, because now we can pass (or construct) the SimpleCx in locations where we don't have enough information to construct a CodegenCx, or are not able to pass it around due to the tcx lifetimes (and it not implementing send/sync).
This also introduces an SBuilder, similar to the SimpleCx. The SBuilder uses a SimpleCx, whereas the existing Builder uses the larger CodegenCx. I will push updates to make implementations generic (where possible) to be implemented once and work for either of the two. I'll also clean up the leftover code.
`call` is a bit tricky, because it requires a tcx, I probably need to duplicate it after all.
Tracking:
- https://github.com/rust-lang/rust/issues/124509
These cannot be silenced with a CLI flag, and are not useful to warn
about. They can still be viewed for debugging purposes using
`RUSTC_LOG=rustc_codegen_ssa:🔗:back`.
support wasm inline assembly in `naked_asm!`
fixes https://github.com/rust-lang/rust/issues/135518
Webassembly was overlooked previously, but now `naked_asm!` and `#[naked]` functions work on the webassembly targets.
Or, they almost do right now. I guess this is no surprise, but the `wasm32-unknown-unknown` target causes me some trouble. I'll add some inline comments with more details.
r? ```````@bjorn3```````
cc ```````@daxpedda,``````` ```````@tgross35```````
Update our range `assume`s to the format that LLVM prefers
I found out in https://github.com/llvm/llvm-project/issues/123278#issuecomment-2597440158 that the way I started emitting the `assume`s in #109993 was suboptimal, and as seen in that LLVM issue the way we're doing it -- with two `assume`s sometimes -- can at times lead to CVP/SCCP not realize what's happening because one of them turns into a `ne` instead of conveying a range.
So this updates how it's emitted from
```
assume( x >= LOW );
assume( x <= HIGH );
```
or
```
// (for ranges that wrap the range)
assume( (x <= LOW) | (x >= HIGH) );
```
to
```
assume( (x - LOW) <= (HIGH - LOW) );
```
so that we don't need multiple `icmp`s nor multiple `assume`s for a single value, and both wrappping and non-wrapping ranges emit the same shape.
(And we don't bother emitting the subtraction if `LOW` is zero, since that's trivial for us to check too.)
bump compiler and tools to windows 0.59, bootstrap to 0.57
This bumps compiler and tools to windows 0.59 (temporary dupes version, as `sysinfo` still depend on <= 0.57).
Bootstrap bumps only to 0.57 (the same sysinfo dep).
This additionally resolves my comment https://github.com/rust-lang/rust/pull/130874#issuecomment-2393562071
Will work on it in follow up pr: There still some sus imports for `rustc_driver.dll` like ws2_32 or RoOriginateErrorW, but i will look at them later.
remove support for the (unstable) #[start] attribute
As explained by `@Noratrieb:`
`#[start]` should be deleted. It's nothing but an accidentally leaked implementation detail that's a not very useful mix between "portable" entrypoint logic and bad abstraction.
I think the way the stable user-facing entrypoint should work (and works today on stable) is pretty simple:
- `std`-using cross-platform programs should use `fn main()`. the compiler, together with `std`, will then ensure that code ends up at `main` (by having a platform-specific entrypoint that gets directed through `lang_start` in `std` to `main` - but that's just an implementation detail)
- `no_std` platform-specific programs should use `#![no_main]` and define their own platform-specific entrypoint symbol with `#[no_mangle]`, like `main`, `_start`, `WinMain` or `my_embedded_platform_wants_to_start_here`. most of them only support a single platform anyways, and need cfg for the different platform's ways of passing arguments or other things *anyways*
`#[start]` is in a super weird position of being neither of those two. It tries to pretend that it's cross-platform, but its signature is a total lie. Those arguments are just stubbed out to zero on ~~Windows~~ wasm, for example. It also only handles the platform-specific entrypoints for a few platforms that are supported by `std`, like Windows or Unix-likes. `my_embedded_platform_wants_to_start_here` can't use it, and neither could a libc-less Linux program.
So we have an attribute that only works in some cases anyways, that has a signature that's a total lie (and a signature that, as I might want to add, has changed recently, and that I definitely would not be comfortable giving *any* stability guarantees on), and where there's a pretty easy way to get things working without it in the first place.
Note that this feature has **not** been RFCed in the first place.
*This comment was posted [in May](https://github.com/rust-lang/rust/issues/29633#issuecomment-2088596042) and so far nobody spoke up in that issue with a usecase that would require keeping the attribute.*
Closes https://github.com/rust-lang/rust/issues/29633
try-job: x86_64-gnu-nopt
try-job: x86_64-msvc-1
try-job: x86_64-msvc-2
try-job: test-various
Revert most of #133194 (except the test and the comment fixes). Then refix
not emitting locations at all when the correct location discriminator value
exceeds LLVM's capacity.
Some targets have many different CPUs and no generic CPU that can be
used as a default. For these targets, the user needs to explicitly
specify a CPU through `-C target-cpu=`.
Add an option for targets and an error message if no CPU is set.
This affects the proposed amdgpu and avr targets.
If we build the standard library with wasm-eh then we need to link
with `-fwasm-exceptions` even if we compile with `panic=abort`
Without this change, linking a `panic=abort` crate fails with:
`undefined symbol: __cpp_exception`.
Followup to #131830.
Rename `BitSet` to `DenseBitSet`
r? `@Mark-Simulacrum` as you requested this in https://github.com/rust-lang/rust/pull/134438#discussion_r1890659739 after such a confusion.
This PR renames `BitSet` to `DenseBitSet` to make it less obvious as the go-to solution for bitmap needs, as well as make its representation (and positives/negatives) clearer. It also expands the comments there to hopefully make it clearer when it's not a good fit, with some alternative bitsets types.
(This migrates the subtrees cg_gcc and clippy to use the new name in separate commits, for easier review by their respective owners, but they can obvs be squashed)
add `-Zmin-function-alignment`
tracking issue: https://github.com/rust-lang/rust/issues/82232
This PR adds the `-Zmin-function-alignment=<align>` flag, that specifies a minimum alignment for all* functions.
### Motivation
This feature is requested by RfL [here](https://github.com/rust-lang/rust/issues/128830):
> i.e. the equivalents of `-fmin-function-alignment` ([GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fmin-function-alignment_003dn), Clang does not support it) / `-falign-functions` ([GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-falign-functions), [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang1-falign-functions)).
>
> For the Linux kernel, the behavior wanted is that of GCC's `-fmin-function-alignment` and Clang's `-falign-functions`, i.e. align all functions, including cold functions.
>
> There is [`feature(fn_align)`](https://github.com/rust-lang/rust/issues/82232), but we need to do it globally.
### Behavior
The `fn_align` feature does not have an RFC. It was decided at the time that it would not be necessary, but maybe we feel differently about that now? In any case, here are the semantics of this flag:
- `-Zmin-function-alignment=<align>` specifies the minimum alignment of all* functions
- the `#[repr(align(<align>))]` attribute can be used to override the function alignment on a per-function basis: when `-Zmin-function-alignment` is specified, the attribute's value is only used when it is higher than the value passed to `-Zmin-function-alignment`.
- the target may decide to use a higher value (e.g. on x86_64 the minimum that LLVM generates is 16)
- The highest supported alignment in rust is `2^29`: I checked a bunch of targets, and they all emit the `.p2align 29` directive for targets that align functions at all (some GPU stuff does not have function alignment).
*: Only with `build-std` would the minimum alignment also be applied to `std` functions.
---
cc `@ojeda`
r? `@workingjubilee` you were active on the tracking issue
Use llvm.memset.p0i8.* to initialize all same-bytes arrays
Similar to #43488
debug builds can now handle `0x0101_u16` and other multi-byte scalars that have all the same bytes (instead of special casing just `0`)
Adds `#[rustc_force_inline]` which is similar to always inlining but
reports an error if the inlining was not possible, and which always
attempts to inline annotated items, regardless of optimisation levels.
It can only be applied to free functions to guarantee that the MIR
inliner will be able to resolve calls.
Add support for wasm exception handling to Emscripten target
This is a draft because we need some additional setting for the Emscripten target to select between the old exception handling and the new exception handling. I don't know how to add a setting like that, would appreciate advice from Rust folks. We could maybe choose to use the new exception handling if `Ctarget-feature=+exception-handling` is passed? I tried this but I get errors from llvm so I'm not doing it right.
Add a notion of "some ABIs require certain target features"
I think I finally found the right shape for the data and checks that I recently added in https://github.com/rust-lang/rust/pull/133099, https://github.com/rust-lang/rust/pull/133417, https://github.com/rust-lang/rust/pull/134337: we have a notion of "this ABI requires the following list of target features, and it is incompatible with the following list of target features". Both `-Ctarget-feature` and `#[target_feature]` are updated to ensure we follow the rules of the ABI. This removes all the "toggleability" stuff introduced before, though we do keep the notion of a fully "forbidden" target feature -- this is needed to deal with target features that are actual ABI switches, and hence are needed to even compute the list of required target features.
We always explicitly (un)set all required and in-conflict features, just to avoid potential trouble caused by the default features of whatever the base CPU is. We do this *before* applying `-Ctarget-feature` to maintain backward compatibility; this poses a slight risk of missing some implicit feature dependencies in LLVM but has the advantage of not breaking users that deliberately toggle ABI-relevant target features. They get a warning but the feature does get toggled the way they requested.
For now, our logic supports x86, ARM, and RISC-V (just like the previous logic did). Unsurprisingly, RISC-V is the nicest. ;)
As a side-effect this also (unstably) allows *enabling* `x87` when that is harmless. I used the opportunity to mark SSE2 as required on x86-64, to better match the actual logic in LLVM and because all x86-64 chips do have SSE2. This infrastructure also prepares us for requiring SSE on x86-32 when we want to use that for our ABI (and for float semantics sanity), see https://github.com/rust-lang/rust/issues/133611, but no such change is happening in this PR.
r? `@workingjubilee`
Pass the arch rather than full target name to windows_registry::find_tool
The full target name can be anything with custom target specs. Passing just the arch wasn't possible before cc 1.2, but is now thanks to https://github.com/rust-lang/cc-rs/pull/1285.
try-job: i686-msvc
When `-Cstrip` was changed to use the bundled rust-objcopy instead of
/usr/bin/strip on OSX, strip-like arguments were preserved.
But strip and objcopy are, while being the same binary, different, they
have different defaults depending on which binary they are.
Notably, strip strips everything by default, and objcopy doesn't strip
anything by default.
Additionally, `-S` actually means `--strip-all`, so debuginfo stripped
everything and symbols didn't strip anything.
We now correctly pass `--strip-debug` and `--strip-all`.
stabilize const_swap
libs-api FCP passed in https://github.com/rust-lang/rust/issues/83163.
However, I only just realized that this actually involves an intrinsic. The intrinsic could be implemented entirely with existing stable const functionality, but we choose to make it a primitive to be able to detect more UB. So nominating for `@rust-lang/lang` to make sure they are aware; I leave it up to them whether they want to FCP this.
While at it I also renamed the intrinsic to make the "nonoverlapping" constraint more clear.
Fixes#83163
rustc_codegen_ssa: Buffer file writes in link_rlib
This makes this step take ~25ms on my machine (M3 Max 64GB) for Zed repo instead of ~150ms (on editor crate). Additionally it takes down the time needed for a clean cargo build of ripgrep from ~6.1s to 5.9s.
This change is mostly relevant for dev builds of crates with multiple large CGUs.
I imagine it could be quite relevant for dev scenarios on Windows, but sadly I have no way to measure that myself.
This makes this step take ~25ms on my machine (M3 Max 64GB) for Zed repo instead of ~150ms. Additionally it takes down the time needed for a clean cargo build of ripgrep from ~6.1s to 5.9s.
This change is mostly relevant for crates with multiple large CGUs.
Begin to implement type system layer of unsafe binders
Mostly TODOs, but there's a lot of match arms that are basically just noops so I wanted to split these out before I put up the MIR lowering/projection part of this logic.
r? oli-obk
Tracking:
- https://github.com/rust-lang/rust/issues/130516
Reduce the amount of explicit FatalError.raise()
Instead use dcx.abort_if_error() or guar.raise_fatal() instead. These guarantee that an error actually happened previously and thus we don't silently abort.
Instead use dcx.abort_if_error() or guar.raise_fatal() instead. These
guarantee that an error actually happened previously and thus we don't
silently abort.
Variants::Single: do not use invalid VariantIdx for uninhabited enums
~~Stacked on top of https://github.com/rust-lang/rust/pull/133681, only the last commit is new.~~
Currently, `Variants::Single` for an empty enum contains a `VariantIdx` of 0; looking that up in the enum variant list will ICE. That's quite confusing. So let's fix that by adding a new `Variants::Empty` case for types that have 0 variants.
try-job: i686-msvc
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.
This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
Hir attributes
This PR needs some explanation, it's somewhat large.
- This is step one as described in https://github.com/rust-lang/compiler-team/issues/796. I've added a new `hir::Attribute` which is a lowered version of `ast::Attribute`. Right now, this has few concrete effects, however every place that after this PR parses a `hir::Attribute` should later get a pre-parsed attribute as described in https://github.com/rust-lang/compiler-team/issues/796 and transitively https://github.com/rust-lang/rust/issues/131229.
- an extension trait `AttributeExt` is added, which is implemented for both `ast::Attribute` and `hir::Atribute`. This makes `hir::Attributes` mostly compatible with code that used to parse `ast::Attribute`. All its methods are also added as inherent methods to avoid having to import the trait everywhere in the compiler.
- Incremental can not not hash `ast::Attribute` at all.
reject aarch64 target feature toggling that would change the float ABI
~~Stacked on top of https://github.com/rust-lang/rust/pull/133099. Only the last two commits are new.~~
The first new commit lays the groundwork for separately controlling whether a feature may be enabled or disabled. The second commit uses that to make it illegal to *disable* the `neon` feature (which is only possible via `-Ctarget-feature`, and so the new check just adds a warning). Enabling the `neon` feature remains allowed on targets that don't disable `neon` or `fp-armv8`, which is all our built-in targets. This way, the entire PR is not a breaking change.
Fixes https://github.com/rust-lang/rust/issues/131058 for hardfloat targets (together with https://github.com/rust-lang/rust/pull/133102 which fixed it for softfloat targets).
Part of https://github.com/rust-lang/rust/issues/116344.
Modifies the index instruction from `gep [0 x %Type]` to `gep %Type`
Fixes#133979.
This PR modifies the index instruction from `gep [0 x %Type]` to `gep %Type`, which is the same with pointer offset calculation.
This will help LLVM calculate various formats of GEP instructions. According to [[RFC] Replacing getelementptr with ptradd](https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699), we ultimately aim to canonicalize everything to `gep i8`. Based on the results from https://github.com/rust-lang/rust/pull/134117#issuecomment-2531717076, I think we still need to investigate some missing optimizations, so this PR is just a small step forward.
r? compiler
Add some convenience helper methods on `hir::Safety`
Makes a lot of call sites simpler and should make any refactorings needed for https://github.com/rust-lang/rust/pull/134090#issuecomment-2541332415 simpler, as fewer sites have to be touched in case we end up storing some information in the variants of `hir::Safety`
don't show the full linker args unless `--verbose` is passed
the linker arguments can be *very* long, especially for crates with many dependencies. often they are not useful. omit them unless the user specifically requests them.
split out from https://github.com/rust-lang/rust/pull/119286. fixes https://github.com/rust-lang/rust/issues/109979.
r? `@bjorn3`
try-build: i686-mingw
the linker arguments can be *very* long, especially for crates with many dependencies. some parts of them are not very useful. unless specifically requested:
- omit object files specific to the current invocation
- fold rlib files into a single braced argument (in shell expansion format)
this shortens the output significantly without removing too much information.
A bunch of cleanups (part 2)
Just like https://github.com/rust-lang/rust/pull/133567 these were all found while looking at the respective code, but are not blocking any other changes I want to make in the short term.
forbid toggling x87 and fpregs on hard-float targets
Part of https://github.com/rust-lang/rust/issues/116344, follow-up to https://github.com/rust-lang/rust/pull/129884:
The `x87` target feature on x86 and the `fpregs` target feature on ARM must not be disabled on a hardfloat target, as that would change the float ABI. However, *enabling* `fpregs` on ARM is [explicitly requested](https://github.com/rust-lang/rust/issues/130988) as it seems to be useful. Therefore, we need to refine the distinction of "forbidden" target features and "allowed" target features: all (un)stable target features can determine on a per-target basis whether they should be allowed to be toggled or not. `fpregs` then checks whether the current target has the `soft-float` feature, and if yes, `fpregs` is permitted -- otherwise, it is not. (Same for `x87` on x86).
Also fixes https://github.com/rust-lang/rust/issues/132351. Since `fpregs` and `x87` can be enabled on some builds and disabled on others, it would make sense that one can query it via `cfg`. Therefore, I made them behave in `cfg` like any other unstable target feature.
The first commit prepares the infrastructure, but does not change behavior. The second commit then wires up `fpregs` and `x87` with that new infrastructure.
r? `@workingjubilee`
It is treated as a map already. This is using FxIndexMap rather than
UnordMap because the latter doesn't provide an api to pick a single
value iff all values are equal, which each_linked_rlib depends on.
Pass end position of span through inline ASM cookie
Before this PR, only the start position of the span was passed though the inline ASM cookie to diagnostics. LLVM 19 has full support for 64-bit inline ASM cookies; this PR uses that to pass the end position of the span in the upper 32 bits, meaning inline ASM diagnostics now point at the entire line the error occurred on, not just the first character of it.
codegen `#[naked]` functions using global asm
tracking issue: https://github.com/rust-lang/rust/issues/90957Fixes#124375
This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of `#[naked]` functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc).
I discussed this approach with `@Amanieu` and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about.
Combined with https://github.com/rust-lang/rust/pull/127853, if both accepted, I think that resolves all steps from the tracking issue.
r? `@Amanieu`
[AIX] keep profile-rt symbol alive
Clang passes `-u __llvm_profile_runtime` on AIX. https://reviews.llvm.org/D136192
We want to preserve the symbol in the case there are no instrumented object files.
[AIX] Pass -bnoipath when adding rust upstream dynamic crates
Unlike ELF linkers, AIX doesn't feature `DT_SONAME` to override
the dependency name when outputing a shared library, which is something
we rely on for dylib crates.
See for reference:
bc145cec45/compiler/rustc_codegen_ssa/src/back/linker.rs (L464))
Thus, `ld` on AIX will use the full path to shared libraries as the dependency if passed it
by default unless `noipath` is passed, so pass it here so we don't end up with full path dependencies
for dylib crates.
Lint on combining `#[no_mangle]` and `#[export_name]`
This is my very first contribution to the compiler, even though I read the [chapter about lints](https://rustc-dev-guide.rust-lang.org/diagnostics.html) I'm not very certain that this ~~new lint is done right as a builtin lint~~ PR is right. I appreciate any guidance on how to improve the code.
- Add test for issue #47446
- ~~Implement the new lint `mixed_export_name_and_no_mangle` as a builtin lint (not sure if that is the right way to go)~~ Extend `unused_attributes` lint
- Add suggestion how to fix it
<details>
<summary>Old proposed new lint</summary>
> The `mixed_export_name_and_no_mangle` lint detects usage of both `#[export_name]` and `#[no_mangle]` on the same item which results on `#[no_mangle]` being ignored.
>
> *warn-by-default*
>
> ### Example
>
> ```rust
> #[no_mangle] // ignored
> #[export_name = "foo"] // takes precedences
> pub fn bar() {}
> ```
>
> ### Explanation
>
> The compiler will not respect the `#[no_mangle]` attribute when generating the symbol name for the function, as the `#[export_name]` attribute takes precedence. This can lead to confusion and is unnecessary.
</details>
A bunch of cleanups
These are all extracted from a branch I have to get rid of driver queries. Most of the commits are not directly necessary for this, but were found in the process of implementing the removal of driver queries.
Previous PR: https://github.com/rust-lang/rust/pull/132410
It was inconsistently done (sometimes even within a single function) and
most of the rest of the compiler uses fatal errors instead, which need
to be caught using catch_with_exit_code anyway. Using fatal errors
instead of ErrorGuaranteed everywhere in the driver simplifies things a
bit.
Print name of env var in `--print=deployment-target`
The deployment target environment variable is OS-specific, and if you're in a place where you're asking `rustc` for the deployment target, you're likely to also wanna know the name of the environment variable. I myself wanted this for some code I'm working on in bootstrap, for example.
Behaviour before this PR:
```console
$ rustc --print=deployment-target --target=aarch64-apple-darwin
deployment_target=11.0
$ rustc --print=deployment-target --target=aarch64-apple-visionos
deployment_target=1.0
```
Behaviour after this PR:
```console
$ rustc --print=deployment-target --target=aarch64-apple-darwin
MACOSX_DEPLOYMENT_TARGET=11.0
$ rustc --print=deployment-target --target=aarch64-apple-visionos
XROS_DEPLOYMENT_TARGET=1.0
```
My _belief_ is that this option is extremely rarely used in general, and a GitHub search for "rustc print deployment-target" seems to confirm this, it revealed only the following actual pieces of code using this:
- b292ef6934/src/build_context.rs (L1199-L1220)
- daab9244b0/src/lib.rs (L3422-L3426)
`maturin` does `.split('=').last()`, so it will continue to work after this change, but `cc v1.0.84` did `.strip_prefix("deployment_target=")` since [this PR](https://github.com/rust-lang/cc-rs/pull/848), so it would break. That's _probably_ fine though, it was broken in a lot of scenarios anyway, and [got](https://github.com/rust-lang/cc-rs/pull/901) [reverted](https://github.com/rust-lang/cc-rs/pull/943) in `v1.0.85`.
So while this is _technically_ a breaking change, I really doubt that anyone is going to observe it, so it's probably fine.
``@BlackHoleFox`` wdyt?
``@rustbot`` label O-apple
r? compiler
Properly pass linker arguments that contain commas
When linking with the system C compiler, we sometimes want to forward certain arguments unchanged to the linker. This can be done with `-Wl,arg1,arg2` or `-Xlinker arg1 -Xlinker arg2`. `-Wl` is used when possible, since it is more compact, but it does not support commas in the argument itself - in those cases, we need to use `-Xlinker`, and that is what this PR implements.
This also fixes using sanitizers on macOS with `-Clinker-flavor=ld`, as those were previously manually using `-Wl`/`-Xlinker` (probably since the support wasn't present in the `link_args` function).
Note that there has been [a previous PR for this](https://github.com/rust-lang/rust/pull/38798), but it only implemented this in certain cases when passing `-rpath`.
r? compiler
use stores of the correct size to set discriminants
Resolves an old HACK /FIXME.
Note that I haven't worked much with codegen so I'm not sure if I'm using the functions correctly and I was surprised seeing out-of-range values being fed into `const_uint_big` but apparently they're wrapped implicitly? By making it explicit we can pass in-range values instead.
Enable -Zshare-generics for inline(never) functions
This avoids inlining cross-crate generic items when possible that are
already marked inline(never), implying that the author is not intending
for the function to be inlined by callers. As such, having a local copy
may make it easier for LLVM to optimize but mostly just adds to binary
bloat and codegen time. In practice our benchmarks indicate this is
indeed a win for larger compilations, where the extra cost in dynamic
linking to these symbols is diminished compared to the advantages in
fewer copies that need optimizing in each binary.
It might also make sense it expand this with other heuristics (e.g.,
`#[cold]`) in the future, but this seems like a good starting point.
FWIW, I expect that doing cleanup in where we make the decision
what should/shouldn't be shared is also a good idea. Way too
much code needed to be tweaked to check this. But I'm hoping
to leave that for a follow-up PR rather than blocking this on it.
This reduces code sizes and better respects programmer intent when
marking inline(never). Previously such a marking was essentially ignored
for generic functions, as we'd still inline them in remote crates.
add a test for target-feature-ABI warnings in closures and when calling extern fn
Also update the comment regarding the inheritance of target features into closures, to make it more clear that we really shouldn't do this right now.
Using `cc_args` panics when using `-Clinker-flavor=ld`, because the
arguments are in a form tailored for `-Clinker-flavor=gcc`.
So instead, we use `link_args` and let that wrap the arguments with the
appropriate `-Wl` or `-Xlinker` when needed.
[AIX] Add option -X32_64 to the "strip" command
The AIX `strip` utility requires option `-X` to specify the object mode. This patch adds the `-X32_64` option to the `strip` command so that it can handle both 32-bit and 64-bit objects. The parameter `option` of function `strip_symbols_with_external_utility`, previously a single string, has been changed to `options`, an array of string slices, to accommodate multiple `strip` options.
The deployment target environment variable is OS-specific, and if you're
in a place where you're asking `rustc` for the deployment target, you're
likely to also wanna know the environment variable.
[AIX] change system dynamic library format
Historically on AIX, almost all dynamic libraries are distributed in `.a` Big Archive Format which can consists of both static and shared objects in the same archive (e.g. `libc++abi.a(libc++abi.so.1)`). During the initial porting process, the dynamic libraries are kept as `.a` to simplify the migration, but semantically having an XCOFF object under the archive extension is wrong. For crate type `cdylib` we want to be able to distribute the libraries as archives as well.
We are migrating to archives with the following format:
```
$ ar -t lib<name>.a
lib<name>.so
```
where each archive contains a single member that is a shared XCOFF object that can be loaded.
Rollup of 6 pull requests
Successful merges:
- #129838 (uefi: process: Add args support)
- #130800 (Mark `get_mut` and `set_position` in `std::io::Cursor` as const.)
- #132708 (Point at `const` definition when used instead of a binding in a `let` statement)
- #133226 (Make `PointerLike` opt-in instead of built-in)
- #133244 (Account for `wasm32v1-none` when exporting TLS symbols)
- #133257 (Add `UnordMap::clear` method)
r? `@ghost`
`@rustbot` modify labels: rollup
take 2
open up coroutines
tweak the wordings
the lint works up until 2021
We were missing one case, for ADTs, which was
causing `Result` to yield incorrect results.
only include field spans with significant types
deduplicate and eliminate field spans
switch to emit spans to impl Drops
Co-authored-by: Niko Matsakis <nikomat@amazon.com>
collect drops instead of taking liveness diff
apply some suggestions and add explantory notes
small fix on the cache
let the query recurse through coroutine
new suggestion format with extracted variable name
fine-tune the drop span and messages
bugfix on runtime borrows
tweak message wording
filter out ecosystem types earlier
apply suggestions
clippy
check lint level at session level
further restrict applicability of the lint
translate bid into nop for stable mir
detect cycle in type structure
The maximum discriminator value LLVM can currently encode is 2^12. If macro use
results in more than 2^12 calls to the same function attributed to the same
callsite, and those calls are MIR-inlined, we will require more than the maximum
discriminator value to completely represent the debug information. Once we reach
that point drop the debug info instead.
the behavior of the type system not only depends on the current
assumptions, but also the currentnphase of the compiler. This is
mostly necessary as we need to decide whether and how to reveal
opaque types. We track this via the `TypingMode`.
Over in Zed we've noticed that loading crates for a large-ish workspace can take almost 200ms. We've pinned it down to how rustc searches for paths, as it performs a linear search over the list of candidate paths. In our case the candidate list had about 20k entries which we had to iterate over for each dependency being loaded.
This commit introduces a simple FilesIndex that's just a sorted Vec under the hood. Since crates are looked up by both prefix and suffix, we perform a range search on said Vec (which constraints the search space based on prefix) and follow up with a linear scan of entries with matching suffixes.
FilesIndex is also pre-filtered before any queries are performed using available target information; query prefixes/sufixes are based on the target we are compiling for, so we can remove entries that can never match up front.
Overall, this commit brings down build time for us in dev scenarios by about 6%.
100ms might not seem like much, but this is a constant cost that each of our workspace crates has to pay, even when said crate is miniscule.
CFI: Append debug location to CFI blocks
Currently we're not appending debug locations to the inserted CFI blocks. This shows up in #132615 and #100783. This change fixes that by passing down the debug location to the CFI type-test generation and appending it to the blocks.
Credits also belong to `@jakos-sec` who worked with me on this.
Add a default implementation for CodegenBackend::link
As a side effect this should add raw-dylib support to cg_gcc as the default ArchiveBuilderBuilder that is used implements create_dll_import_lib. I haven't tested if the raw-dylib support actually works however.
After link_binary the temporary files referenced by CodegenResults are
deleted, so calling link_binary again with the same CodegenResults
should not be allowed.
As a side effect this should add raw-dylib support to cg_gcc as the
default ArchiveBuilderBuilder that is used implements
create_dll_import_lib. I haven't tested if the raw-dylib support
actually works however.
Use lld with non-LLVM backends
On arm64, Cranelift used to produce object files that don't work with lld. This has since been fixed. The GCC backend should always produce object files that work with lld unless lld for whatever reason drops GCC support. Most of the other more niche backends don't use cg_ssa's linker code at all. If they do and don't work with lld, they can always disable lld usage using a cli argument.
Without this commit using cg_clif is by default in a non-trivial amount of cases a perf regression on Linux due to ld.bfd being a fair bit slower than lld. It is possible to explicitly enable it without this commit, but most users are unlikely to do this.
On arm64, Cranelift used to produce object files that don't work with
lld. This has since been fixed. The GCC backend should always produce
object files that work with lld unless lld for whatever reason drops GCC
support. Most of the other more niche backends don't use cg_ssa's linker
code at all. If they do and don't work with lld, they can always disable
lld usage using a cli argument.
Without this commit using cg_clif is by default in a non-trivial amount
of cases a perf regression on Linux due to ld.bfd being a fair bit
slower than lld. It is possible to explicitly enable it without this
commit, but most users are unlikely to do this.
Set "symbol name" in raw-dylib import libraries to the decorated name
`windows-rs` received a bug report that mixing raw-dylib generated and the Windows SDK import libraries was causing linker failures: <https://github.com/microsoft/windows-rs/issues/3285>
The root cause turned out to be #124958, that is we are not including the decorated name in the import library and so the import name type is also not being correctly set.
This change modifies the generation of import libraries to set the "symbol name" to the fully decorated name and correctly marks the import as being data vs function.
Note that this also required some changes to how the symbol is named within Rust: for MSVC we now need to use the decorated name but for MinGW we still need to use partially decorated (or undecorated) name.
Fixes#124958
Passing i686 MSVC and MinGW build: <https://github.com/rust-lang/rust/actions/runs/11000433888?pr=130586>
r? `@ChrisDenton`
bootstrap/codegen_ssa: ship llvm-strip and use it for -Cstrip
Fixes#131206.
- Includes `llvm-strip` (a symlink to `llvm-objcopy`) in the compiler dist artifact so that it can be used for `-Cstrip` instead of the system tooling.
- Uses `llvm-strip` instead of `/usr/bin/strip` for macOS. macOS needs a specific linker and the system one is preferred, hence #130781 but that doesn't work when cross-compiling, so use the `llvm-strip` utility instead.
cc #123151
mark some target features as 'forbidden' so they cannot be (un)set with -Ctarget-feature
The context for this is https://github.com/rust-lang/rust/issues/116344: some target features change the way floats are passed between functions. Changing those target features is unsound as code compiled for the same target may now use different ABIs.
So this introduces a new concept of "forbidden" target features (on top of the existing "stable " and "unstable" categories), and makes it a hard error to (un)set such a target feature. For now, the x86 and ARM feature `soft-float` is on that list. We'll have to make some effort to collect more relevant features, and similar features from other targets, but that can happen after the basic infrastructure for this landed. (These features are being collected in https://github.com/rust-lang/rust/issues/131799.)
I've made this a warning for now to give people some time to speak up if this would break something.
MCP: https://github.com/rust-lang/compiler-team/issues/780
Remove unnecessary pub enum glob-imports from `rustc_middle::ty`
We used to have an idiom in the compiler where we'd prefix or suffix all the variants of an enum, for example `BoundRegionKind`, with something like `Br`, and then *glob-import* that enum variant directly.
`@noratrieb` brought this up, and I think that it's easier to read when we just use the normal style `EnumName::Variant`.
This PR is a bit large, but it's just naming.
The only somewhat opinionated change that this PR does is rename `BorrowKind::Imm` to `BorrowKind::Immutable` and same for the other variants. I think these enums are used sparingly enough that the extra length is fine.
r? `@noratrieb` or reassign
Reduce dependence on the target name
The target name can be anything with custom target specs. Matching on fields inside the target spec is much more robust than matching on the target name.
Also remove the unused is_builtin target spec field.
Generate correct symbols.o for sparc-unknown-none-elf
This fixes#130172 by selecting the correct ELF Machine type for sparc-unknown-none-elf (which has a baseline of SPARC V7).
The target name can be anything with custom target specs. Matching on
fields inside the target spec is much more robust than matching on the
target name.
Move versioned Apple LLVM targets from `rustc_target` to `rustc_codegen_ssa`
Fully specified LLVM targets contain the OS version on macOS/iOS/tvOS/watchOS/visionOS, and this version depends on the deployment target environment variables like `MACOSX_DEPLOYMENT_TARGET`, `IPHONEOS_DEPLOYMENT_TARGET` etc.
We would like to move this to later in the compilation pipeline, both because it feels impure to access environment variables when fetching target information, but mostly because we need access to more information from https://github.com/rust-lang/rust/pull/130883 to do https://github.com/rust-lang/rust/issues/118204. See also https://github.com/rust-lang/rust/pull/129342#issuecomment-2335156119 for some discussion.
The first and second commit does the actual refactor, it should be a non-functional change, the third commit adds diagnostics for invalid deployment targets, which are now possible to do because we have access to the session.
Tested with the same commands as in https://github.com/rust-lang/rust/pull/130435.
r? ``````@petrochenkov``````
Remove support for `-Zprofile` (gcov-style coverage instrumentation)
Tracking issue: #42524
MCP: https://github.com/rust-lang/compiler-team/issues/798
---
This PR removes the unstable `-Zprofile` flag, which enables ”gcov-style” coverage instrumentation, along with its associated `-Zprofile-emit` configuration flag.
(The profile flag predates and is almost entirely separate from the stable `-Cinstrument-coverage` flag.)
Notably, the `-Zprofile` flag:
- Is largely untested in-tree, having only one run-make test that does not check whether its output is correct or useful.
- Has no known maintainer.
- Has seen no push towards stabilization.
- Has at least one severe regression reported in 2022 that apparently remains unaddressed.
- #100125
- Is confusingly named, since it appears to be more about coverage than performance profiling, and has nothing to do with PGO.
- Is fundamentally limited by relying on counters auto-inserted by LLVM, with no knowledge of Rust beyond debuginfo.