Supertraits of `BuilderMethods` are all called `XyzBuilderMethods`.
Supertraits of `CodegenMethods` are all called `XyzMethods`. This commit
changes the latter to `XyzCodegenMethods`, for consistency.
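A minimal sketch of the convention before and after (`Const` is just an example prefix; the real supertrait sets are larger):

```rust
mod before {
    pub trait ConstBuilderMethods {}                 // supertrait of BuilderMethods
    pub trait BuilderMethods: ConstBuilderMethods {}
    pub trait ConstMethods {}                        // supertrait of CodegenMethods
    pub trait CodegenMethods: ConstMethods {}
}

mod after {
    pub trait ConstBuilderMethods {}
    pub trait BuilderMethods: ConstBuilderMethods {}
    pub trait ConstCodegenMethods {}                 // renamed, for consistency
    pub trait CodegenMethods: ConstCodegenMethods {}
}
```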
It has `Backend` and `Deref` bounds, plus an associated type
`CodegenCx`, and it has a single use. This commit "inlines" it into
`BuilderMethods`, which makes the complicated backend trait situation a
little simpler.
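A sketch of the shape of that change (the helper trait's name and details here are stand-ins inferred from the text, not the real rustc definitions):

```rust
use std::ops::Deref;

pub trait Backend {}

// before: a one-use helper trait bundles the bounds and the associated type
pub trait Helper: Backend + Deref<Target = <Self as Helper>::CodegenCx> {
    type CodegenCx;
}
pub trait BuilderMethodsBefore: Helper {}

// after: the bounds and associated type live directly on BuilderMethods
pub trait BuilderMethods: Backend + Deref<Target = <Self as BuilderMethods>::CodegenCx> {
    type CodegenCx;
}
```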
Don't leave debug locations for constants sitting on the builder indefinitely
Because constants are currently emitted *before* the prologue, leaving the debug location on the IRBuilder lets it spill onto other instructions in the prologue, messing up both the line numbers and the point LLVM chooses as the prologue end.
Example LLVM IR (irrelevant IR elided):
Before:
```
define internal { i64, i64 } @_ZN3tmp3Foo18var_return_opt_try17he02116165b0fc08cE(ptr align 8 %self) !dbg !347 {
start:
%self.dbg.spill = alloca [8 x i8], align 8
%_0 = alloca [16 x i8], align 8
%residual.dbg.spill = alloca [0 x i8], align 1
#dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357)
store ptr %self, ptr %self.dbg.spill, align 8, !dbg !357
#dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358)
```
After:
```
define internal { i64, i64 } @_ZN3tmp3Foo18var_return_opt_try17h00b17d08874ddd90E(ptr align 8 %self) !dbg !347 {
start:
%self.dbg.spill = alloca [8 x i8], align 8
%_0 = alloca [16 x i8], align 8
%residual.dbg.spill = alloca [0 x i8], align 1
#dbg_declare(ptr %residual.dbg.spill, !353, !DIExpression(), !357)
store ptr %self, ptr %self.dbg.spill, align 8
#dbg_declare(ptr %self.dbg.spill, !350, !DIExpression(), !358)
```
Note in particular how !357 from %residual.dbg.spill's dbg_declare no longer falls through onto the store to %self.dbg.spill. This fixes argument values at entry when the constant is a ZST (e.g. `<Option as Try>::Residual`). This fixes #130003 (but note that it does *not* fix issues with argument values and non-ZST constants, which emit their own stores that have debug info on them, like #128945).
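A minimal sketch of the fix, with made-up stand-ins for the builder API (the real method and type names in rustc differ):

```rust
struct DebugLoc;
struct Value;

#[derive(Default)]
struct Builder {
    dbg_loc: Option<DebugLoc>,
}

impl Builder {
    fn set_dbg_loc(&mut self, loc: DebugLoc) { self.dbg_loc = Some(loc); }
    fn clear_dbg_loc(&mut self) { self.dbg_loc = None; }
    fn emit_constant(&mut self) -> Value { Value }
}

// Scope the debug location to the constant itself, then clear it so it
// cannot fall through onto the prologue instructions emitted afterwards.
fn codegen_constant_operand(bx: &mut Builder, loc: DebugLoc) -> Value {
    bx.set_dbg_loc(loc);
    let value = bx.emit_constant();
    bx.clear_dbg_loc();
    value
}
```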
r? `@michaelwoerister`
This shrinks `compiler/rustc_codegen_gcc/Cargo.lock` quite a bit. The
only remaining dependencies in that lockfile are `gccjit`,
`lang_tester`, and `boml`, none of which is used in any other compiler
crate.
The commit also reorders and adds comments to the `extern crate` items
so they match those in miri.
simd_shuffle intrinsic: allow argument to be passed as vector
See https://github.com/rust-lang/rust/issues/128738 for context.
I'd like to get rid of [this hack](6c0b89dfac/compiler/rustc_codegen_ssa/src/mir/block.rs#L922-L935). https://github.com/rust-lang/rust/pull/128537 almost lets us do that, since constant SIMD vectors will then be passed as immediate arguments. However, simd_shuffle for some reason actually takes an *array* as its argument, not a vector, so the hack is still required to ensure that the array becomes an immediate (which later stages of codegen then convert into a vector, as that's what LLVM needs).
This PR prepares simd_shuffle to also support a vector as the `idx` argument. Once this lands, stdarch can hopefully be updated to pass `idx` as a vector, and then support for arrays can be removed, which finally lets us get rid of that hack.
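A nightly-only sketch of the two forms of `idx` (intrinsic path and signature as of current nightlies; treat the details as assumptions):

```rust
#![feature(repr_simd, core_intrinsics)]
use std::intrinsics::simd::simd_shuffle;

#[repr(simd)]
#[derive(Copy, Clone)]
struct U32x4([u32; 4]);

fn interleave_lo(a: U32x4, b: U32x4) -> U32x4 {
    // today: `idx` must be a const *array*
    const IDX: [u32; 4] = [0, 4, 1, 5];
    unsafe { simd_shuffle::<U32x4, [u32; 4], U32x4>(a, b, IDX) }
    // after this PR, a const vector such as `U32x4([0, 4, 1, 5])`
    // would also be accepted as `idx`
}
```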
Shrink `TyKind::FnPtr`.
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.
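Illustrative stand-ins for the layout change (the field contents are assumptions, not rustc's real definitions; sizes are for a typical 64-bit target):

```rust
struct FnSigBig([u64; 3]); // stand-in for the old inline FnSig payload
struct FnSigTys([u64; 3]); // stand-in for the interned type list
struct FnHeader(u64);      // small, packable header

enum TyKindBefore {
    FnPtr(FnSigBig),
    Other([u64; 2]),
}

enum TyKindAfter<'a> {
    FnPtr(&'a FnSigTys, FnHeader), // payload shrinks from 24 to 16 bytes
    Other([u64; 2]),
}

fn main() {
    println!("{}", std::mem::size_of::<TyKindBefore>()); // 32
    println!("{}", std::mem::size_of::<TyKindAfter>());  // 24
}
```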
r? `@compiler-errors`
const vector passed through to codegen
This allows constant vectors using a `repr(simd)` type to be propagated
through to the backend, by reusing the functionality used to do a
similar thing for the `simd_shuffle` intrinsic.
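A nightly-only sketch of the kind of constant this affects (illustrative; the PR changes how such constants reach the backend, not the surface syntax):

```rust
#![feature(repr_simd)]

#[repr(simd)]
#[derive(Copy, Clone)]
struct F32x4([f32; 4]);

const ONES: F32x4 = F32x4([1.0, 1.0, 1.0, 1.0]);

// with this change, `ONES` can be handed to the backend as an immediate
// vector constant instead of a load from memory
fn splat_ones() -> F32x4 {
    ONES
}
```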
#118209
r? RalfJung
nontemporal_store: make sure that the intrinsic is truly just a hint
The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](https://github.com/llvm/llvm-project/issues/64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores.
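A toy model of the policy described above (the target list is an example drawn from the text, and this is not rustc's real codegen code):

```rust
/// Decide how to lower Rust's nontemporal_store intrinsic per target.
fn lower_nontemporal_store(target_arch: &str) -> &'static str {
    match target_arch {
        // on targets like ARM, the flag is genuinely just a hint
        "aarch64" | "arm" => "store with !nontemporal hint",
        // elsewhere (e.g. x86, where movnt stores would need explicit
        // fencing to preserve the Rust memory model), emit a plain store
        _ => "regular store",
    }
}

fn main() {
    assert_eq!(lower_nontemporal_store("aarch64"), "store with !nontemporal hint");
    assert_eq!(lower_nontemporal_store("x86_64"), "regular store");
}
```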
~~Blocked on https://github.com/rust-lang/stdarch/pull/1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~
Fixes https://github.com/rust-lang/rust/issues/114582
Cc `@Amanieu` `@workingjubilee`
Sync ar_archive_writer to LLVM 18.1.3
This updates `ar_archive_writer` from LLVM 15.0.0-rc3 to LLVM 18.1.3. The update adds support for COFF archives containing Arm64EC object files and brings various fixes for AIX big archive files.
Miri function identity hack: account for possible inlining
Having a non-lifetime generic is not the only reason a function can be duplicated. Another possibility is that the function may be eligible for cross-crate inlining. So also take into account the inlining attribute in this Miri hack for function pointer identity.
That said, `cross_crate_inlinable` will still sometimes return true even for `inline(never)` functions:
- when they are `DefKind::Ctor(..) | DefKind::Closure` -- I assume those cannot be `InlineAttr::Never` anyway?
- when `cross_crate_inline_threshold == InliningThreshold::Always`
so maybe this is still not quite the right criterion to use for function pointer identity.
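In sketch form (stand-in names, not Miri's real internals), the adjusted criterion is:

```rust
// A function pointer is only guaranteed a unique address if the function
// cannot be duplicated across crates / codegen units.
fn may_have_duplicate_addr(
    has_non_lifetime_generics: bool,
    cross_crate_inlinable: bool,
) -> bool {
    // before this change, only the first condition was checked
    has_non_lifetime_generics || cross_crate_inlinable
}
```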
rustc_codegen_llvm: add support for writing summary bitcode
Typical uses of ThinLTO don't have any use for this as a standalone file, but distributed ThinLTO uses this to make the linker phase more efficient. With clang you'd do something like `clang -flto=thin -fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o (full of bitcode) and foo.indexing.o (just the summary or index part of the bitcode). That's then usable by a two-stage linking process that's more friendly to distributed build systems like bazel, which is why I'm working on this area.
I talked some to `@teresajohnson` about naming in this area, as things seem to be a little confused between various blog posts and build systems. "bitcode index" and "bitcode summary" tend to be a little too ambiguous, and she tends to use "thin link bitcode" and "minimized bitcode" (which matches the descriptions in LLVM). Since the clang option is thin-link-bitcode, I went with that to try and not add a new spelling in the world.
Per `@dtolnay`, you can work around the lack of this by using `lld --thinlto-index-only` to do the indexing on regular .o files of bitcode, but that is a bit wasteful on actions when we already have all the information in rustc and could just write out the matching minimized bitcode. I didn't test that at all in our infrastructure, because by the time I learned that I already had this patch largely written.
Rollup of 5 pull requests
Successful merges:
- #124615 (coverage: Further simplify extraction of mapping info from MIR)
- #124778 (Fix parse error message for meta items)
- #124797 (Refactor float `Primitive`s to a separate `Float` type)
- #124888 (Migrate `run-make/rustdoc-output-path` to rmake)
- #124957 (Make `Ty::builtin_deref` just return a `Ty`)
r? `@ghost`
`@rustbot` modify labels: rollup
Refactor float `Primitive`s to a separate `Float` type
Now that there are 4 of them, it makes sense to refactor `F16`, `F32`, `F64` and `F128` out of `Primitive` and into a separate `Float` type (like integers already are). This allows patterns like `F16 | F32 | F64 | F128` to be simplified into `Float(_)`, and is consistent with `ty::FloatTy`.
As a side effect, this PR also makes the `Ty::primitive_size` method work with `f16` and `f128`.
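A sketch of the refactor's shape (variant sets abridged and payloads elided; only the `Float` split is from the PR text):

```rust
enum Float { F16, F32, F64, F128 }

enum Primitive {
    Int,          // really Int(Integer, /* signed */ bool)
    Float(Float), // was four separate F16 / F32 / F64 / F128 variants
    Pointer,      // really Pointer(AddressSpace)
}

// patterns like `F16 | F32 | F64 | F128` collapse to:
fn is_float(p: &Primitive) -> bool {
    matches!(p, Primitive::Float(_))
}
```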
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128