rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/To.20expose.20or.20not.20to.20expose/near/427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
This is just one part of the MCP, but it's the one that IMHO removes the most noise from the standard library code.
Seems net simpler this way, since MIR already supported heterogeneous shifts anyway, and thus it's not more work for backends than before.
compiler: fix few unused_peekable and needless_pass_by_ref_mut clippy lints
This fixes few instances of `unused_peekable` and `needless_pass_by_ref_mut`. While i expected to fix more warnings, `needless_pass_by_ref_mut` produced too much for one PR, so i stopped here.
Better reviewed commit by commit, as fixes splitted by chunks.
Codegen const panic messages as function calls
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting shared library (see [perf]).
A sample improvement from nightly:
```
leaq str.0(%rip), %rdi
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdx
movl $25, %esi
callq *_ZN4core9panicking5panic17h17cabb89c5bcc999E@GOTPCREL(%rip)
```
to this PR:
```
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdi
callq *_RNvNtNtCsduqIKoij8JB_4core9panicking11panic_const23panic_const_div_by_zero@GOTPCREL(%rip)
```
[perf]: https://perf.rust-lang.org/compare.html?start=a7e4de13c1785819f4d61da41f6704ed69d5f203&end=64fbb4f0b2d621ff46d559d1e9f5ad89a8d7789b&stat=instructions:u
warning: this argument is a mutable reference, but not used mutably
--> compiler\rustc_codegen_ssa\src\back\rpath.rs:80:41
|
80 | fn get_rpath_relative_to_output(config: &mut RPathConfig<'_>, lib: &Path) -> OsString {
| ^^^^^^^^^^^^^^^^^^^^ help: consider changing to: `&RPathConfig<'_>`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut
warning: this argument is a mutable reference, but not used mutably
--> compiler\rustc_codegen_ssa\src\back\rpath.rs:76:42
|
76 | fn get_rpaths_relative_to_output(config: &mut RPathConfig<'_>) -> Vec<OsString> {
| ^^^^^^^^^^^^^^^^^^^^ help: consider changing to: `&RPathConfig<'_>`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut
warning: this argument is a mutable reference, but not used mutably
--> compiler\rustc_codegen_ssa\src\back\rpath.rs:55:23
|
55 | fn get_rpaths(config: &mut RPathConfig<'_>) -> Vec<OsString> {
| ^^^^^^^^^^^^^^^^^^^^ help: consider changing to: `&RPathConfig<'_>`
|
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut
warning: this argument is a mutable reference, but not used mutably
--> compiler\rustc_codegen_ssa\src\back\rpath.rs:15:32
|
15 | pub fn get_rpath_flags(config: &mut RPathConfig<'_>) -> Vec<OsString> {
| ^^^^^^^^^^^^^^^^^^^^ help: consider changing to: `&RPathConfig<'_>`
|
= warning: changing this function will impact semver compatibility
= help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_ref_mut
Don't emit an error about failing to produce a file with a specific name if user never gave an explicit name
Fixes#122509
You can ask `rustc` to produce some intermediate results with `--emit foo`, this operation comes in two flavors: `--emit asm` and `--emit asm=foo.s`. First one produces one or more `.s` files without any name guarantees, second one renames it into `foo.s`. Second version only works when compiler produces a single file - for asm files this means using a single compilation unit for example.
In case compilation produced more than a single file `rustc` runs following check to emit some warnings:
```rust
if crate_output.outputs.contains_key(&output_type) {
// 2) Multiple codegen units, with `--emit foo=some_name`. We have
// no good solution for this case, so warn the user.
sess.dcx().emit_warn(errors::IgnoringEmitPath { extension });
} else if crate_output.single_output_file.is_some() {
// 3) Multiple codegen units, with `-o some_name`. We have
// no good solution for this case, so warn the user.
sess.dcx().emit_warn(errors::IgnoringOutput { extension });
} else {
// 4) Multiple codegen units, but no explicit name. We
// just leave the `foo.0.x` files in place.
// (We don't have to do any work in this case.)
}
```
Comment in the final `else` branch implies that if user didn't ask for a specific name - there's no need to emit warnings. However because of the internal representation of `crate_output.outputs` - this doesn't work as expected: if user asked to produce an asm file without giving it an implicit name it will contain `Some(None)`.
To fix the problem new code actually checks if user gave an explicit name. I think this was an original intentional behavior, at least comments imply that.
Unbox and unwrap the contents of `StatementKind::Coverage`
The payload of coverage statements was historically a structure with several fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace `Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid accidentally bloating it in the future.
``@rustbot`` label +A-code-coverage
CFI: Strip auto traits off Virtual calls
We already use `Instance` at declaration sites when available to glean additional information about possible abstractions of the type in use. This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it generates type information for indirect calls through `Virtual` instances.
This is needed for the "separate machinery" version of my approach to the vtable issues (#122573), because we need to respond differently to a `Virtual` call to the same type as a non-virtual call, specifically [stripping auto traits off the receiver's `Self`](54b15b0c36) because there isn't a separate vtable for `Foo` vs `Foo + Send`.
This would also make a more general underlying mechanism that could be used by rcvalle's [proposed drop detection / encoding](edcd1e20a1) if we end up using his approach, as we could condition out on the `def_id` in the CFI code rather than requiring the generating code to explicitly note whether it was calling drop.
We already use `Instance` at declaration sites when available to glean
additional information about possible abstractions of the type in use.
This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it
generates type information for indirect calls through `Virtual`
instances.
Let codegen decide when to `mem::swap` with immediates
Making `libcore` decide this is silly; the backend has so much better information about when it's a good idea.
Thus this PR introduces a new `typed_swap` intrinsic with a fallback body, and replaces that fallback implementation when swapping immediates or scalar pairs.
r? oli-obk
Replaces #111744, and means we'll never need more libs PRs like #111803 or #107140
The payload of coverage statements was historically a structure with several
fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace
`Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid
accidentally bloating it in the future.
Remove `TypeAndMut` from `ty::RawPtr` variant, make it take `Ty` and `Mutability`
Pretty much mechanically converting `ty::RawPtr(ty::TypeAndMut { ty, mutbl })` to `ty::RawPtr(ty, mutbl)` and its fallout.
r? lcnr
cc rust-lang/types-team#124
"Handle" calls to upstream monomorphizations in compiler_builtins
This is pretty cooked, but I think it works.
compiler-builtins has a long-standing problem that at link time, its rlib cannot contain any calls to `core`. And yet, in codegen we _love_ inserting calls to symbols in `core`, generally from various panic entrypoints.
I intend this PR to attack that problem as completely as possible. When we generate a function call, we now check if we are generating a function call from `compiler_builtins` and whether the callee is a function which was not lowered in the current crate, meaning we will have to link to it.
If those conditions are met, actually generating the call is asking for a linker error. So we don't. If the callee diverges, we lower to an abort with the same behavior as `core::intrinsics::abort`. If the callee does not diverge, we produce an error. This means that compiler-builtins can contain panics, but they'll SIGILL instead of panicking. I made non-diverging calls a compile error because I'm guessing that they'd mostly get into compiler-builtins by someone making a mistake while working on the crate, and compile errors are better than linker errors. We could turn such calls into aborts as well if that's preferred.
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting binary.
Backend and target selection is a mess: the target can override the
backend (via `Target::default_codegen_backend`), *and* the backend can
override the target (via `CodegenBackend::target_override`).
The code that handles this is ugly. It calls `build_target_config`
twice, once before getting the backend and once again afterward. It also
must check that both overrides aren't triggering at the same time.
This commit removes the latter override. It's used in rust-gpu but
@eddyb said via Zulip that removing it would be ok. This simplifies the
code greatly, and will allow some nice follow-up refactorings.
LLD parses @ files like the command arguments on the platform it's on,
so on windows it needs to follow the MSVC style to work correctly.
Otherwise builds can fail if the linker command gets too long and the
build path contains spaces.
Fix ICE: `global_asm!()` Don't Panic When Unable to Evaluate Constant
Fixes#121099
A bit of an inelegant fix but given that the error is created only
after call to `const_eval_poly()` and that the calling function
cannot propagate the error anywhere else, the error has to be
explicitly handled inside `mono_item.rs`.
r? `@Amanieu`
Stabilize associated type bounds (RFC 2289)
This PR stabilizes associated type bounds, which were laid out in [RFC 2289]. This gives us a shorthand to express nested type bounds that would otherwise need to be expressed with nested `impl Trait` or broken into several `where` clauses.
### What are we stabilizing?
We're stabilizing the associated item bounds syntax, which allows us to put bounds in associated type position within other bounds, i.e. `T: Trait<Assoc: Bounds...>`. See [RFC 2289] for motivation.
In all position, the associated type bound syntax expands into a set of two (or more) bounds, and never anything else (see "How does this differ[...]" section for more info).
Associated type bounds are stabilized in four positions:
* **`where` clauses (and APIT)** - This is equivalent to breaking up the bound into two (or more) `where` clauses. For example, `where T: Trait<Assoc: Bound>` is equivalent to `where T: Trait, <T as Trait>::Assoc: Bound`.
* **Supertraits** - Similar to above, `trait CopyIterator: Iterator<Item: Copy> {}`. This is almost equivalent to breaking up the bound into two (or more) `where` clauses; however, the bound on the associated item is implied whenever the trait is used. See #112573/#112629.
* **Associated type item bounds** - This allows constraining the *nested* rigid projections that are associated with a trait's associated types. e.g. `trait Trait { type Assoc: Trait2<Assoc2: Copy>; }`.
* **opaque item bounds (RPIT, TAIT)** - This allows constraining associated types that are associated with the opaque without having to *name* the opaque. For example, `impl Iterator<Item: Copy>` defines an iterator whose item is `Copy` without having to actually name that item bound.
The latter three are not expressible in surface Rust (though for associated type item bounds, this will change in #120752, which I don't believe should block this PR), so this does represent a slight expansion of what can be expressed in trait bounds.
### How does this differ from the RFC?
Compared to the RFC, the current implementation *always* desugars associated type bounds to sets of `ty::Clause`s internally. Specifically, it does *not* introduce a position-dependent desugaring as laid out in [RFC 2289], and in particular:
* It does *not* desugar to anonymous associated items in associated type item bounds.
* It does *not* desugar to nested RPITs in RPIT bounds, nor nested TAITs in TAIT bounds.
This position-dependent desugaring laid out in the RFC existed simply to side-step limitations of the trait solver, which have mostly been fixed in #120584. The desugaring laid out in the RFC also added unnecessary complication to the design of the feature, and introduces its own limitations to, for example:
* Conditionally lowering to nested `impl Trait` in certain positions such as RPIT and TAIT means that we inherit the limitations of RPIT/TAIT, namely lack of support for higher-ranked opaque inference. See this code example: https://github.com/rust-lang/rust/pull/120752#issuecomment-1979412531.
* Introducing anonymous associated types makes traits no longer object safe, since anonymous associated types are not nameable, and all associated types must be named in `dyn` types.
This last point motivates why this PR is *not* stabilizing support for associated type bounds in `dyn` types, e.g, `dyn Assoc<Item: Bound>`. Why? Because `dyn` types need to have *concrete* types for all associated items, this would necessitate a distinct lowering for associated type bounds, which seems both complicated and unnecessary compared to just requiring the user to write `impl Trait` themselves. See #120719.
### Implementation history:
Limited to the significant behavioral changes and fixes and relevant PRs, ping me if I left something out--
* #57428
* #108063
* #110512
* #112629
* #120719
* #120584Closes#52662
[RFC 2289]: https://rust-lang.github.io/rfcs/2289-associated-type-bounds.html
A bit of an inelegant fix but given that the error is created only
after call to `const_eval_poly()` and that the calling function
cannot propagate the error anywhere else, the error has to be
explicitly handled inside `mono_item.rs`.
Remove fixme about LLVM basic block naming
~This may be a small perf win.~
Originally, this PR implemented the fixme, but it didn't have any measurable perf improvement.
r? ``@ghost``
Making `libcore` decide this is silly; the backend has so much better information about when it's a good idea.
So introduce a new `typed_swap` intrinsic with a fallback body, but replace that implementation for immediates and scalar pairs.
link.exe: Don't embed full path to PDB file in binary.
This PR makes `rustc` unconditionally pass `/PDBALTPATH:%_PDB%` to MSVC-style linkers, causing the linker to only embed the filename of the PDB in the binary instead of the full path. This will help implement the [trim-paths RFC](https://github.com/rust-lang/rust/issues/111540) for `*-msvc` targets.
Passing `/PDBALTPATH:%_PDB%` to the linker is already done by many projects that need reproducible builds and [debugger's should still be able to find the PDB](https://learn.microsoft.com/cpp/build/reference/pdbpath) if it is in the same directory as the binary.
r? `@ghost`
Fixes https://github.com/rust-lang/rust/issues/87825
Add `-Z external-clangrt`
This adds the unstable `-Z external-clangrt` flag that will prevent rustc from emitting linker paths for the in-tree LLVM sanitizer runtime library.
[AIX] Remove AixLinker's debuginfo() implementation
AIX ld's `-s` option doesn't perfectly fit` debuginfo()`'s semantics and may unexpectedly remove metadata in shared libraries. Remove the implementation of `AixLinker` and suggest user to use `strip` utility instead.
add test ensuring simd codegen checks don't run when a static assertion failed
stdarch relies on this to ensure that SIMD indices are in bounds.
I would love to know why this works, but I can't figure out where codegen decides to not codegen a function if a required-const does not evaluate. `@oli-obk` `@bjorn3` do you have any idea?
This adds the unstable `-Z external-sanitizer-runtime` flag that will
prevent rustc from emitting linker paths for the in-tree LLVM sanitizer
runtime library.