Currently all generators are named with a `generator$N` suffix,
regardless of where they come from. This means an `async fn` shows up as
a generator in stack traces, which can be surprising to async
programmers since they should not need to know that async functions are
implementated using generators.
This change generators a different name depending on the generator kind,
allowing us to tell whether the generator is the result of an async
block, an async closure, an async fn, or a plain generator.
Make rlib metadata strip works with MIPSr6 architecture
Because MIPSr6 has many differences with previous MIPSr2 arch, the previous rlib metadata stripping code in `rustc_codegen_ssa` is only for MIPSr2/r3/r5 (which share the same elf e_flags).
This commit fixed this problem. It makes `rustc_codegen_ssa` happy when compiling rustc for MIPSr6 target or hosts.
e_flags REF: e356027016/llvm/include/llvm/BinaryFormat/ELF.h (L562)
The field is also renamed from `ident` to `name. In most cases,
we don't actually need the `Span`. A new `ident` method is added
to `VariantDef` and `FieldDef`, which constructs the full `Ident`
using `tcx.def_ident_span()`. This method is used in the cases
where we actually need an `Ident`.
This makes incremental compilation properly track changes
to the `Span`, without all of the invalidations caused by storing
a `Span` directly via an `Ident`.
Consolidate checking for msvc when generating debuginfo
If the target we're generating code for is msvc, then we do two main
things differently: we generate type names in a C++ style instead of a
Rust style and we generate debuginfo for enums differently.
I've refactored the code so that there is one function
(`cpp_like_debuginfo`) which determines if we should use the C++ style
of naming types and other debuginfo generation or the regular Rust one.
r? ``@michaelwoerister``
This PR is not urgent so please don't let it interrupt your holidays! 🎄🎁
If the target we're generating code for is msvc, then we do two main
things differently: we generate type names in a C++ style instead of a
Rust style and we generate debuginfo for enums differently.
I've refactored the code so that there is one function
(`cpp_like_debuginfo`) which determines if we should use the C++ style
of naming types and other debuginfo generation or the regular Rust one.
Because MIPSr6 has many differences with previous MIPSr2 arch, the previous rlib metadata stripping code in `rustc_codegen_ssa` is only for MIPSr2/r3/r5 (which share the same elf e_flags).
This commit fixed this problem. It makes `rustc_codegen_ssa` happy when compiling rustc for MIPSr6 target or hosts.
`thorin` is a Rust implementation of a DWARF packaging utility that
supports reading DWARF objects from archive files (i.e. rlibs) and
therefore is better suited for integration into rustc.
Signed-off-by: David Wood <david.wood@huawei.com>
In #79570, `-Z split-dwarf-kind={none,single,split}` was replaced by `-C
split-debuginfo={off,packed,unpacked}`. `-C split-debuginfo`'s packed
and unpacked aren't exact parallels to single and split, respectively.
On Unix, `-C split-debuginfo=packed` will put debuginfo into object
files and package debuginfo into a DWARF package file (`.dwp`) and
`-C split-debuginfo=unpacked` will put debuginfo into dwarf object files
and won't package it.
In the initial implementation of Split DWARF, split mode wrote sections
which did not require relocation into a DWARF object (`.dwo`) file which
was ignored by the linker and then packaged those DWARF objects into
DWARF packages (`.dwp`). In single mode, sections which did not require
relocation were written into object files but ignored by the linker and
were not packaged. However, both split and single modes could be
packaged or not, the primary difference in behaviour was where the
debuginfo sections that did not require link-time relocation were
written (in a DWARF object or the object file).
This commit re-introduces a `-Z split-dwarf-kind` flag, which can be
used to pick between split and single modes when `-C split-debuginfo` is
used to enable Split DWARF (either packed or unpacked).
Signed-off-by: David Wood <david.wood@huawei.com>
Actually set IMAGE_SCN_LNK_REMOVE for .rmeta
The code intended to set the IMAGE_SCN_LNK_REMOVE flag for the
.rmeta section, however the value of this flag was set to zero.
Instead use the actual value provided by the object crate.
This dates back to the original introduction of this code in
PR #84449, so we were never setting this flag. As I'm not on
Windows, I'm not sure whether that means we were embedding .rmeta
into executables, or whether the section ended up getting stripped
for some other reason.
Remove `NullOp::Box`
Follow up of #89030 and MCP rust-lang/compiler-team#460.
~1 month later nothing seems to be broken, apart from a small regression that #89332 (1aac85bb716c09304b313d69d30d74fe7e8e1a8e) shows could be regained by remvoing the diverging path, so it shall be safe to continue and remove `NullOp::Box` completely.
r? `@jonas-schievink`
`@rustbot` label T-compiler
Continue supporting -Z instrument-coverage for compatibility for now,
but show a deprecation warning for it.
Update uses and documentation to use the -C option.
Move the documentation from the unstable book to stable rustc
documentation.
Mark drop calls in landing pads `cold` instead of `noinline`
Now that deferred inlining has been disabled in LLVM (#92110), this shouldn't cause catastrophic size blowup.
I confirmed that the test cases from https://github.com/rust-lang/rust/issues/41696#issuecomment-298696944 still compile quickly (<1s) after this change. ~Although note that I wasn't able to reproduce the original issue using a recent rustc/llvm with deferred inlining enabled, so those tests may no longer be representative. I was also unable to create a modified test case that reproduced the original issue.~ (edit: I reproduced it on CI by accident--the first commit timed out on the LLVM 12 builder, because I forgot to make it conditional on LLVM version)
r? `@nagisa`
cc `@arielb1` (this effectively reverts #42771 "mark calls in the unwind path as !noinline")
cc `@RalfJung` (fixes#46515)
edit: also fixes#87055
Allow loading LLVM plugins with both legacy and new pass manager
Opening a draft PR to get feedback and start discussion on this feature. There is already a codegen option `passes` which allow giving a list of LLVM pass names, however we currently can't use a LLVM pass plugin (as described here : https://llvm.org/docs/WritingAnLLVMPass.html), the only available passes are the LLVM built-in ones.
The proposed modification would be to add another codegen option `pass-plugins`, which can be set with a list of paths to shared library files. These libraries are loaded using the LLVM function `PassPlugin::Load`, which calls the expected symbol `lvmGetPassPluginInfo`, and register the pipeline parsing and optimization callbacks.
An example usage with a single plugin and 3 passes would look like this in the `.cargo/config`:
```toml
rustflags = [
"-C", "pass-plugins=/tmp/libLLVMPassPlugin",
"-C", "passes=pass1 pass2 pass3",
]
```
This would give the same functionality as the opt LLVM tool directly integrated in rust build system.
Additionally, we can also not specify the `passes` option, and use a plugin which inserts passes in the optimization pipeline, as one could do using clang.
The `AggregateKind` enum ends up in the final mir `Body`. Currently,
any changes to `AdtDef` (regardless of how significant they are)
will legitimately cause the overall result of `optimized_mir` to change,
invalidating any codegen re-use involving that mir.
This will get worse once we start hashing the `Span` inside `FieldDef`
(which is itself contained in `AdtDef`).
To try to reduce these kinds of invalidations, this commit changes
`AggregateKind::Adt` to store just the `DefId`, instead of the full
`AdtDef`. This allows the result of `optimized_mir` to be unchanged
if the `AdtDef` changes in a way that doesn't actually affect any
of the MIR we build.
The code intended to set the IMAGE_SCN_LNK_REMOVE flag for the
.rmeta section, however the value of this flag was set to zero.
Instead use the actual value provided by the object crate.
This dates back to the original introduction of this code in
PR #84449, so we were never setting this flag. As I'm not on
Windows, I'm not sure whether that means we were embedding .rmeta
into executables, or whether the section ended up getting stripped
for some other reason.
Explicitly set no ELF flags for .rustc section
For a data section, the object crate will set the SHF_ALLOC by default, which is exactly what we don't want. Explicitly set sh_flags to zero to avoid this.
I checked with `objdump -h` that this produces the right flags for ELF.
Fixes#92013.
Remove `SymbolStr`
This was originally proposed in https://github.com/rust-lang/rust/pull/74554#discussion_r466203544. As well as removing the icky `SymbolStr` type, it allows the removal of a lot of `&` and `*` occurrences.
Best reviewed one commit at a time.
r? `@oli-obk`
Rollup of 7 pull requests
Successful merges:
- #91566 (Apply path remapping to DW_AT_GNU_dwo_name when producing split DWARF)
- #91926 (Remove `in_band_lifetimes` from `rustc_metadata`)
- #91931 (Remove `in_band_lifetimes` from `rustc_codegen_llvm`)
- #92024 (rustc_codegen_llvm: Give each codegen unit a unique DWARF name on all platforms, not just Apple ones.)
- #92037 (Use a const ParamEnv when in default_method_body_is_const)
- #92047 (Set `RUST_BACKTRACE=0` when running location-detail tests)
- #92050 (Add a space and 2 grave accents )
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
For a data section, the object crate will set the SHF_ALLOC by
default, which is exactly what we don't want. Explicitly set
sh_flags to zero to avoid this.
Apply path remapping to DW_AT_GNU_dwo_name when producing split DWARF
`--remap-path-prefix` doesn't apply to paths to `.o` (in case of packed) or `.dwo` (in case of unpacked) files in `DW_AT_GNU_dwo_name`. GCC also has this bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91888
DF_ORIGIN flag signifies that the object being loaded may make reference to the $ORIGIN substitution string.
Some implementations are just ignoring DF_ORIGIN and do substitution for $ORIGIN if present (whatever DF_ORIGIN pr
Set the flag inconditionally if rpath is wanted.
Remove `in_band_lifetimes` from `rustc_codegen_ssa`
See #91867 for more information.
In `compiler/rustc_codegen_ssa/src/coverageinfo/map.rs`, there are several functions with an explicit `'a` lifetime but only a single `&'a self` parameter. These lifetimes should be redundant given lifetime elision, unless the existential `impl Iterator` has weird issues regarding that. Should the redundant lifetimes be removed?
The resulting profile will include the crate name and will be stored in
the `--out-dir` directory.
This implementation makes it convenient to use LLVM time trace together
with cargo, in the contrast to the previous implementation which would
overwrite profiles or store them in `.cargo/registry/..`.
asm: Allow using r9 (ARM) and x18 (AArch64) if they are not reserved by the current target
This supersedes https://github.com/rust-lang/rust/pull/88879.
cc `@Skirmisher`
r? `@joshtriplett`
Remove the reg_thumb register class for asm! on ARM
Also restricts r8-r14 from being used on Thumb1 targets as per #90736.
cc ``@Lokathor``
r? ``@joshtriplett``
Use object crate for .rustc metadata generation
We already use the object crate for generating uncompressed .rmeta
metadata object files. This switches the generation of compressed
.rustc object files to use the object crate as well. These have
slightly different requirements in that .rmeta should be completely
excluded from any final compilation artifacts, while .rustc should
be part of shared objects, but not loaded into memory.
The primary motivation for this change is #90326: In LLVM 14, the
current way of setting section flags (and in particular, preventing
the setting of SHF_ALLOC) will no longer work. There are other ways
we could work around this, but switching to the object crate seems
like the most elegant, as we already use it for .rmeta, and as it
makes this independent of the codegen backend. In particular, we
don't need separate handling in codegen_llvm and codegen_gcc.
codegen_cranelift should be able to reuse the implementation as
well, though I have omitted that here, as it is not based on
codegen_ssa.
This change mostly extracts the existing code for .rmeta handling
to allow using it for .rustc as well, and adjusts the codegen
infrastructure to handle the metadata object file separately: We
no longer create a backend-specific module for it, and directly
produce the compiled module instead.
This does not `fix` #90326 by itself yet, as .llvmbc will need to be
handled separately.
r? `@nagisa`
We already use the object crate for generating uncompressed .rmeta
metadata object files. This switches the generation of compressed
.rustc object files to use the object crate as well. These have
slightly different requirements in that .rmeta should be completely
excluded from any final compilation artifacts, while .rustc should
be part of shared objects, but not loaded into memory.
The primary motivation for this change is #90326: In LLVM 14, the
current way of setting section flags (and in particular, preventing
the setting of SHF_ALLOC) will no longer work. There are other ways
we could work around this, but switching to the object crate seems
like the most elegant, as we already use it for .rmeta, and as it
makes this independent of the codegen backend. In particular, we
don't need separate handling in codegen_llvm and codegen_gcc.
codegen_cranelift should be able to reuse the implementation as
well, though I have omitted that here, as it is not based on
codegen_ssa.
This change mostly extracts the existing code for .rmeta handling
to allow using it for .rustc as well, and adjust the codegen
infrastructure to handle the metadata object file separately: We
no longer create a backend-specific module for it, and directly
produce the compiled module instead.
This does not fix#90326 by itself yet, as .llvmbc will need to be
handled separately.
...because alignment is always nonzero.
This helps eliminate redundant runtime alignment checks, when a DST
is a field of a struct whose remaining fields have alignment 1.
Add support for LLVM coverage mapping format versions 5 and 6
This PR cherry-pick's Swatinem's initial commit in unsubmitted PR #90047.
My additional commit augments Swatinem's great starting point, but adds full support for LLVM
Coverage Mapping Format version 6, conditionally, if compiling with LLVM 13.
Version 6 requires adding the compilation directory when file paths are
relative, and since Rustc coverage maps use relative paths, we should
add the expected compilation directory entry.
Note, however, that with the compilation directory, coverage reports
from `llvm-cov show` can now report file names (when the report includes
more than one file) with the full absolute path to the file.
This would be a problem for test results, but the workaround (for the
rust coverage tests) is to include an additional `llvm-cov show`
parameter: `--compilation-dir=.`
Remove workaround for the forward progress handling in LLVM
this workaround was only needed for LLVM < 12 and the minimum LLVM version was updated to 12 in #90175
Fix ld64 flags
- The `-exported_symbols_list` argument appears to be malformed for `ld64` (if you are not going through `clang`).
- The `-dynamiclib` argument isn't support for `ld64`. It should be guarded behind a compiler flag.
These problems are fixed by these changes. I have also refactored the way linker arguments are generated to be ld/compiler agnostic and therefore less error prone.
These changes are necessary to support cross-compilation to darwin targets.
Leave -Z strip available temporarily as an alias, to avoid breaking
cargo until cargo transitions to using -C strip. (If the user passes
both, the -C version wins.)
This commit refactors linker argument generation to leverage a helper
function that abstracts away details governing how these arguments are
transformed and provided to the linker.
This fixes the misuse of the `-exported_symbols_list` when an ld-like
linker is used rather than a compiler. A compiler would expect
`-Wl,-exported_symbols_list,path` but ld would expect
`-exported_symbols_list` and `path` as two seperate arguments. Prior
to this change, an ld-like linker was given
`-exported_symbols_list,path`.
Linker arguments must transformed when Rust is interacting with the
linker through a compiler. This commit introduces a helper function
that abstracts away details of this transformation.
Record more artifact sizes during self-profiling.
This PR adds artifact size recording for
- "linked artifacts" (executables, RLIBs, dylibs, static libs)
- object files
- dwo files
- assembly files
- crate metadata
- LLVM bitcode files
- LLVM IR files
- codegen unit size estimates
Currently the identifiers emitted for these are hard-coded as string literals. Is it worth adding constants to https://github.com/rust-lang/measureme/blob/master/measureme/src/rustc.rs instead? We don't do that for query names and the like -- but artifact kinds might be more stable than query names.
enable `dotprod` target feature in arm
To implement `vdot` neon insturction in stdarch, we need to enable `dotprod` target feature in arm in rustc.
r? `@Amanieu`
Don't abort compilation after giving a lint error
The only reason to use `abort_if_errors` is when the program is so broken that either:
1. later passes get confused and ICE
2. any diagnostics from later passes would be noise
This is never the case for lints, because the compiler has to be able to deal with `allow`-ed lints.
So it can continue to lint and compile even if there are lint errors.
Closes https://github.com/rust-lang/rust/issues/82761. This is a WIP because I have a feeling it will exit with 0 even if there were lint errors; I don't have a computer that can build rustc locally at the moment.
The only reason to use `abort_if_errors` is when the program is so broken that either:
1. later passes get confused and ICE
2. any diagnostics from later passes would be noise
This is never the case for lints, because the compiler has to be able to deal with `allow`-ed lints.
So it can continue to lint and compile even if there are lint errors.
In https://reviews.llvm.org/D71059 LLVM 11, the time trace profiler was
extended to support multiple threads.
`timeTraceProfilerInitialize` creates a thread local profiler instance.
When a thread finishes `timeTraceProfilerFinishThread` moves a thread
local instance into a global collection of instances. Finally when all
codegen work is complete `timeTraceProfilerWrite` writes data from the
current thread local instance and the instances in global collection
of instances.
Previously, the profiler was intialized on a single thread only. Since
this thread performs no code generation on its own, the resulting
profile was empty.
Update LLVM codegen to initialize & finish time trace profiler on each
code generation thread.
Add LLVM CFI support to the Rust compiler
This PR adds LLVM Control Flow Integrity (CFI) support to the Rust compiler. It initially provides forward-edge control flow protection for Rust-compiled code only by aggregating function pointers in groups identified by their number of arguments.
Forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space) will be provided in later work as part of this project by defining and using compatible type identifiers (see Type metadata in the design document in the tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e., -Clto).
Thank you, `@eddyb` and `@pcc,` for all the help!
Properly check `target_features` not to trigger an assertion
Fixes#89875
I think it should be a condition instead of an assertion to check if it's a register as it's possible that `reg` is a register class.
Also, this isn't related to the issue directly, but `is_target_supported` doesn't check `target_features` attributes. Is there any way to check it on rustc_codegen_llvm?
r? `@Amanieu`
Avoid a branch on key being local for queries that use the same local and extern providers
Currently based on https://github.com/rust-lang/rust/pull/85810 as it slightly conflicts with it. Only the last two commits are new.
This commit adds LLVM Control Flow Integrity (CFI) support to the Rust
compiler. It initially provides forward-edge control flow protection for
Rust-compiled code only by aggregating function pointers in groups
identified by their number of arguments.
Forward-edge control flow protection for C or C++ and Rust -compiled
code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code
share the same virtual address space) will be provided in later work as
part of this project by defining and using compatible type identifiers
(see Type metadata in the design document in the tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e.,
-Clto).
Erase late-bound regions before computing vtable debuginfo name.
Fixes#90019.
The `msvc_enum_fallback()` for computing enum type names needs to access the memory layout of niche enums in order to determine the type name. `compute_debuginfo_vtable_name()` did not properly erase regions before computing type names which made memory layout computation ICE when encountering un-erased regions.
r? `@wesleywiser`
This performs a substitution of code following the pattern:
let <id> = if let <pat> = ... { identity } else { ... : ! };
To simplify it to:
let <pat> = ... { identity } else { ... : ! };
By adopting the let_else feature.
Create more accurate debuginfo for vtables.
Before this PR all vtables would have the same name (`"vtable"`) in debuginfo. Now they get an unambiguous name that identifies the implementing type and the trait that is being implemented.
This is only one of several possible improvements:
- This PR describes vtables as arrays of `*const u8` pointers. It would nice to describe them as structs where function pointer is represented by a field with a name indicative of the method it maps to. However, this requires coming up with a naming scheme that avoids clashes between methods with the same name (which is possible if the vtable contains multiple traits).
- The PR does not update the debuginfo we generate for the vtable-pointer field in a fat `dyn` pointer. Right now there does not seem to be an easy way of getting ahold of a vtable-layout without also knowing the concrete self-type of a trait object.
r? `@wesleywiser`
rustc_driver: Enable the `WARN` log level by default
This commit changes the `tracing_subscriber` initialization in
`rustc_driver` so that the `WARN` verbosity level is enabled by default
when the `RUSTC_LOG` env variable is empty. If the `RUSTC_LOG` env
variable is set, the filter string in the environment variable is
honored, instead.
Fixes#76824Closes#89623
cc ``@eddyb,`` ``@oli-obk``
Turn vtable_allocation() into a query
This PR removes the untracked vtable-const-allocation cache from the `tcx` and turns the `vtable_allocation()` method into a query.
The change is pretty straightforward and should be backportable without too much effort.
Fixes https://github.com/rust-lang/rust/issues/89598.
Before this commit all vtables would have the same name "vtable" in
debuginfo. Now they get a name that identifies the implementing type
and the trait that is being implemented.
On macOS, make strip="symbols" not pass any options to strip
This makes the output with `strip="symbols"` match the result of just
calling `strip` on the output binary, minimizing the size of the binary.
This largely involves implementing the options debug-info-for-profiling
and profile-sample-use and forwarding them on to LLVM.
AutoFDO can be used on x86-64 Linux like this:
rustc -O -Cdebug-info-for-profiling main.rs -o main
perf record -b ./main
create_llvm_prof --binary=main --out=code.prof
rustc -O -Cprofile-sample-use=code.prof main.rs -o main2
Now `main2` will have feedback directed optimization applied to it.
The create_llvm_prof tool can be obtained from this github repository:
https://github.com/google/autofdoFixes#64892.
Introduce `Rvalue::ShallowInitBox`
Polished version of #88700.
Implements MCP rust-lang/compiler-team#460, and should allow #43596 to go forward.
In short, creating an empty box is split from a nullary-op `NullOp::Box` into two steps, first a call to `exchange_malloc`, then a `Rvalue::ShallowInitBox` which transmutes `*mut u8` to a shallow-initialized `Box<T>`. This allows the `exchange_malloc` call to unwind. Details can be found in the MCP.
`NullOp::Box` is not yet removed, purely to make reverting easier in case anything goes wrong as the result of this PR. If revert is needed a reversion of "Use Rvalue::ShallowInitBox for box expression" commit followed by a test bless should be sufficient.
Experiments in #88700 showed a very slight compile-time perf regression due to (supposedly) slightly more time spent in LLVM. We could omit unwind edge generation (in non-`oom=panic` case) in box expression MIR construction to restore perf; but I don't think it's necessary since runtime perf isn't affected and perf difference is rather small.
This PR allows applying a `#[track_caller]` attribute to a
closure/generator expression. The attribute as interpreted as applying
to the compiler-generated implementation of the corresponding trait
method (`FnOnce::call_once`, `FnMut::call_mut`, `Fn::call`, or
`Generator::resume`).
This feature does not have its own feature gate - however, it requires
`#![feature(stmt_expr_attributes)]` in order to actually apply
an attribute to a closure or generator.
This is implemented in the same way as for functions - an extra
location argument is appended to the end of the ABI. For closures,
this argument is *not* part of the 'tupled' argument storing the
parameters - the final closure argument for `#[track_caller]` closures
is no longer a tuple.
For direct (monomorphized) calls, the necessary support was already
implemented - we just needeed to adjust some assertions around checking
the ABI and argument count to take closures into account.
For calls through a trait object, more work was needed.
When creating a `ReifyShim`, we need to create a shim
for the trait method (e.g. `FnOnce::call_mut`) - unlike normal
functions, closures are never invoked directly, and always go through a
trait method.
Additional handling was needed for `InstanceDef::ClosureOnceShim`. In
order to pass location information throgh a direct (monomorphized) call
to `FnOnce::call_once` on an `FnMut` closure, we need to make
`ClosureOnceShim` aware of `#[tracked_caller]`. A new field
`track_caller` is added to `ClosureOnceShim` - this is used by
`InstanceDef::requires_caller` location, allowing codegen to
pass through the extra location argument.
Since `ClosureOnceShim.track_caller` is only used by codegen,
we end up generating two identical MIR shims - one for
`track_caller == true`, and one for `track_caller == false`. However,
these two shims are used by the entire crate (i.e. it's two shims total,
not two shims per unique closure), so this shouldn't a big deal.
Fix debuginfo for parameters passed via the ScalarPair abi on Windows
Mark all of these as locals so the debugger does not try to interpret
them as being a pointer to the value. This extends the approach used
in #81898.
Fixes#88625
Querify `FnAbi::of_{fn_ptr,instance}` as `fn_abi_of_{fn_ptr,instance}`.
*Note: opening this PR as draft because it's based on #88499*
This more or less replicates the `LayoutOf::layout_of` setup from #88499, to replace `FnAbi::of_{fn_ptr,instance}` with `FnAbiOf::fn_abi_of_{fn_ptr,instance}`, and also route them through queries (which `layout_of` has used for a while).
The two changes at the use sites (other than the names) are:
* return type is now wrapped in `&'tcx`
* the value *is* interned, which may affect performance
* the `extra_args` list is now an interned `&'tcx ty::List<Ty<'tcx>>`
* should be cheap (it's empty for anything other than C variadics)
Theoretically, a `FnAbiOfHelpers` implementer could choose to keep the `Result<...>` instead of eagerly erroring, but the only existing users of these APIs are codegen backends, so they don't (want to) take advantage of this.
At least miri could make use of this, since it prefers propagating errors (it "just" doesn't use `FnAbi` yet - cc `@RalfJung).`
The way this is done is probably less efficient than what is possible, because the queries handle the correctness-oriented API (i.e. the split into `fn` pointers vs instances), whereas a lower-level query could end up with more reuse between different instances with identical signatures.
r? `@nagisa` cc `@oli-obk` `@bjorn3`
Couple of changes to FileSearch and SearchPath
* Turn a couple of regular comments into doc comments
* Move `get_tools_search_paths` from `FileSearch` to `Session`
* Use Lrc instead of Option to avoid duplication of a `SearchPath`
Mark all of these as locals so the debugger does not try to interpret
them as being a pointer to the value. This extends the approach used in
PR #81898.
Introduce NullOp::AlignOf
This PR introduces `Rvalue::NullaryOp(NullOp::AlignOf, ty)`, which will be lowered from `align_of`, similar to `size_of` lowering to `Rvalue::NullaryOp(NullOp::SizeOf, ty)`.
The changes are originally part of #88700 but since it's not dependent on other changes and could have performance impact on its own, it's separated into its own PR.
Move *_max methods back to util
change to inline instead of inline(always)
Remove valid_range_exclusive from scalar
Use WrappingRange instead
implement always_valid_for in a safer way
Fix accidental edit
Fix handling of +whole-archive native link modifier.
This PR fixes a bug in `add_upstream_native_libraries` that led to the `+whole-archive` modifier being ignored when linking in native libs.
~~Note that the PR does not address the situation when `+whole-archive` is combined with `+bundle`.~~
`@wesleywiser's` commit adds validation code that turns combining `+whole-archive` with `+bundle` into an error.
Fixes https://github.com/rust-lang/rust/issues/88085.
r? `@petrochenkov`
cc `@wesleywiser` `@gcoakes`
Provide `layout_of` automatically (given tcx + param_env + error handling).
After #88337, there's no longer any uses of `LayoutOf` within `rustc_target` itself, so I realized I could move the trait to `rustc_middle::ty::layout` and redesign it a bit.
This is similar to #88338 (and supersedes it), but at no ergonomic loss, since there's no funky `C: LayoutOf<Ty = Ty>` -> `Ty: TyAbiInterface<C>` generic `impl` chain, and each `LayoutOf` still corresponds to one `impl` (of `LayoutOfHelpers`) for the specific context.
After this PR, this is what's needed to get `trait LayoutOf` (with the `layout_of` method) implemented on some context type:
* `TyCtxt`, via `HasTyCtxt`
* `ParamEnv`, via `HasParamEnv`
* a way to transform `LayoutError`s into the desired error type
* an error type of `!` can be paired with having `cx.layout_of(...)` return `TyAndLayout` *without* `Result<...>` around it, such as used by codegen
* this is done through a new `LayoutOfHelpers` trait (and so is specifying the type of `cx.layout_of(...)`)
When going through this path (and not bypassing it with a manual `impl` of `LayoutOf`), the end result is that only the error case can be customized, the query itself and the success paths are guaranteed to be uniform.
(**EDIT**: just noticed that because of the supertrait relationship, you cannot actually implement `LayoutOf` yourself, the blanket `impl` fully covers all possible context types that could ever implement it)
Part of the motivation for this shape of API is that I've been working on querifying `FnAbi::of_*`, and what I want/need to introduce for that looks a lot like the setup in this PR - in particular, it's harder to express the `FnAbi` methods in `rustc_target`, since they're much more tied to `rustc` concepts.
r? `@nagisa` cc `@oli-obk` `@bjorn3`
Include debug info for the allocator shim
Issue Details:
In some cases it is necessary to generate an "allocator shim" to forward various Rust allocation functions (e.g., `__rust_alloc`) to an underlying function (e.g., `malloc`). However, since this allocator shim is a manually created LLVM module it is not processed via the normal module processing code and so no debug info is generated for it (if debugging info is enabled).
Fix Details:
* Modify the `debuginfo` code to allow creating debug info for a module without a `CodegenCx` (since it is difficult, and expensive, to create one just to emit some debug info).
* After creating the allocator shim add in basic debug info.
Issue Details:
In some cases it is necessary to generate an "allocator shim" to forward various Rust allocation functions (e.g., `__rust_alloc`) to an underlying function (e.g., `malloc`). However, since this allocator shim is a manually created LLVM module it is not processed via the normal module processing code and so no debug info is generated for it (if debugging info is enabled).
Fix Details:
* Modify the `debuginfo` code to allow creating debug info for a module without a `CodegenCx` (since it is difficult, and expensive, to create one just to emit some debug info).
* After creating the allocator shim add in basic debug info.
rustc_target: `TyAndLayout::field` should never error.
This refactor (making `TyAndLayout::field` return `TyAndLayout` without any `Result` around it) is based on a simple observation, regarding `TyAndLayout::field`:
If `cx.layout_of(ty)` succeeds (for some `cx` and `ty`), then `.field(cx, i)` on the resulting `TyAndLayout` should *always* succeed in computing `cx.layout_of(field_ty)` (where `field_ty` is the type of the `i`th field of `ty`).
The reason for this is that no matter which field is chosen, `cx.layout_of(field_ty)` *will have already been computed*, as part of computing `cx.layout_of(ty)`, as we cannot determine the layout of *any* type without considering the layouts of *all* of its fields.
And so it should be fine to turn any errors into ICEs, since they likely indicate a `cx` mismatch, or some other edge case that is due to a compiler bug (as opposed to ever being an user-facing error).
<hr/>
Each commit should probably be reviewed separately, though note that there's some `where` clauses (in `rustc_target::abi::call::*`) that change in most commits.
cc `@nagisa` `@oli-obk`
Make `-Z gcc-ld=lld` work for Apple targets
`-Z gcc-ld=lld` was introduced in #85961. It does not work on Macos because lld needs be either named `ld64` or passed `-flavor darwin` as the first two arguments in order to select the Mach-O flavor. Rust invokes cc (=clang) on Macos for linking which calls `ld` as linker binary and not `ld64`, so just creating an `ld64` binary and modifying the search path with `-B` does not work.
In order to solve this patch does:
* Set the `lld_flavor` for all Apple-derived targets to `LldFlavor::Ld64`. As far as I can see this actually works towards fixing `-Xlinker=rust-lld` as all those targets use the Mach-O object format.
* Copy/hardlink rust-lld to the gcc-ld subdirectory as ld64 next to ld.
* If `-Z gcc-ld=lld` is used and the target lld flavor is Ld64 add `-fuse-ld=/path/to/ld64` to the linker invocation.
Fixes#86945.
Adjust linking order of static nobundle libraries
Link the static libraries with "-bundle" modifier from upstream rust crate right after linking this rust crate.
Some linker such as GNU linker `ld.bdf` treat order of linking as order of dependency.
After this change, static libraries with "-bundle" modifier is linked in the same order as "+bundle" modifier.
So we can change the value of "bundle" modifier without causing linking error.
fix: https://github.com/rust-lang/rust/issues/87541
r? `@petrochenkov`
Link the static libraries with "-bundle" modifier from upstream rust crate
right after linking this rust crate. Some linker such as GNU linker
`ld.bdf` treat order of linking as order of dependency. After this change,
static libraries with "-bundle" modifier is linked in the same order as
"+bundle" modifier. So we can change the value of "bundle" modifier without
causing linking error.
Use custom wrap-around type instead of RangeInclusive
Two reasons:
1. More memory is allocated than necessary for `valid_range` in `Scalar`. The range is not used as an iterator and `exhausted` is never used.
2. `contains`, `count` etc. methods in `RangeInclusive` are doing very unhelpful(and dangerous!) things when used as a wrap-around range. - In general this PR wants to limit potentially confusing methods, that have a low probability of working.
Doing a local perf run, every metric shows improvement except for instructions.
Max-rss seem to have a very consistent improvement.
Sorry - newbie here, probably doing something wrong.
Trait upcasting coercion (part 3)
By using separate candidates for each possible choice, this fixes type-checking issues in previous commits.
r? `@nikomatsakis`
Upgrade to LLVM 13
Work in progress update to LLVM 13. Main changes:
* InlineAsm diagnostics reported using SrcMgr diagnostic kind are now handled. Previously these used a separate diag handler.
* Codegen tests are updated for additional attributes.
* Some data layouts have changed.
* Switch `#[used]` attribute from `llvm.used` to `llvm.compiler.used` to avoid SHF_GNU_RETAIN flag introduced in https://reviews.llvm.org/D97448, which appears to trigger a bug in older versions of gold.
* Set `LLVM_INCLUDE_TESTS=OFF` to avoid Python 3.6 requirement.
Upstream issues:
* ~~https://bugs.llvm.org/show_bug.cgi?id=51210 (InlineAsm diagnostic reporting for module asm)~~ Fixed by 1558bb80c0.
* ~~https://bugs.llvm.org/show_bug.cgi?id=51476 (Miscompile on AArch64 due to incorrect comparison elimination)~~ Fixed by 81b106584f.
* https://bugs.llvm.org/show_bug.cgi?id=51207 (Can't set custom section flags anymore). Problematic change reverted in our fork, https://reviews.llvm.org/D107216 posted for upstream revert.
* https://bugs.llvm.org/show_bug.cgi?id=51211 (Regression in codegen for #83623). This is an optimization regression that we may likely have to eat for this release. The fix for #83623 was based on an incorrect premise, and this needs to be properly addressed in the MergeICmps pass.
The [compile-time impact](https://perf.rust-lang.org/compare.html?start=ef9549b6c0efb7525c9b012148689c8d070f9bc0&end=0983094463497eec22d550dad25576a894687002) is mixed, but quite positive as LLVM upgrades go.
The LLVM 13 final release is scheduled for Sep 21st. The current nightly is scheduled for stable release on Oct 21st.
r? `@ghost`
This commit updates the backtrace crate in libstd now that dependencies
have been updated to use `memchr` from the standard library as well.
This is mostly just making sure deps are up-to-date and have all the
latest-and-greatest fixes and such.
Closesrust-lang/backtrace-rs#432
Rather than relying on `getPointerElementType()` from LLVM function
pointers, we now pass the function type explicitly when building `call`
or `invoke` instructions.
Allow more "unknown argument" strings from linker
Some toolchains emit slightly different errors, e.g.
ppc-vle-gcc: error: unrecognized option '-no-pie'
Remove the aarch64 `crypto` target_feature
The subfeatures `aes` or `sha2` should be used instead.
This can't yet be done for ARM targets as some LLVM intrinsics still require `crypto`.
Also update the runtime feature detection tests in `library/std` to mirror the updates in `stdarch`. This also helps https://github.com/rust-lang/rust/issues/86941
r? ``@Amanieu``
Trait upcasting coercion (part2)
This is the second part of trait upcasting coercion implementation.
Currently this is blocked on #86264 .
The third part might be implemented using unsafety checking
r? `@bjorn3`
Since RFC 3052 soft deprecated the authors field anyway, hiding it from
crates.io, docs.rs, and making Cargo not add it by default, and it is
not generally up to date/useful information, we should remove it from
crates in this repo.
[debuginfo] Emit associated type bindings in trait object type names.
This PR updates debuginfo type name generation for trait objects to include associated type bindings and auto trait bounds -- so that, for example, the debuginfo type name of `&dyn Iterator<Item=Foo>` and `&dyn Iterator<Item=Bar>` don't both map to just `&dyn Iterator` anymore.
The following table shows examples of debuginfo type names before and after the PR:
| type | before | after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>>` | `&dyn Iterator` | `&dyn Iterator<Item=u32>` |
| `&(dyn Iterator<Item=u32>> + Sync)` | `&dyn Iterator` | `&(dyn Iterator<Item=u32> + Sync)` |
| `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` | `&dyn SomeTrait<bool, i8>` | `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` |
For targets that need C++-like type names, we use `assoc$<Item,u32>` instead of `Item=u32`:
| type | before | after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>>` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> > > >` |
| `&(dyn Iterator<Item=u32>> + Sync)` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> >,Sync> >` |
| `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` | `ref$<dyn$<SomeTrait<bool, i8> > >` | `ref$<dyn$<SomeTrait<bool,i8,assoc$<Bar,u32> > >,Send> >` |
The PR also adds self-profiling measurements for debuginfo type name generation (re. https://github.com/rust-lang/rust/issues/86431). It looks like the compiler spends up to 0.5% of its time in that task, so the potential for optimizing it via caching seems limited.
However, the perf run also shows [the biggest regression](https://perf.rust-lang.org/detailed-query.html?commit=585e91c718b0b2c5319e1fffd0ff1e62aaf7ccc2&base_commit=b9197978a90be6f7570741eabe2da175fec75375&benchmark=tokio-webpush-simple-debug&run_name=incr-unchanged) in a test case that does not even invoke the code in question. This suggests that the length of the names we generate here can affect performance by influencing how much data the linker has to copy around.
Fixes https://github.com/rust-lang/rust/issues/86134.
Don't use gc-sections with profile-generate.
When building with profile-generate don't call gc_sections as this can
can sometimes strip out profile data. This missing information in the
prof files can then result in missing functions when using the profile
information.
#78226
r? `@Mark-Simulacrum`
Remove nondeterminism in multiple-definitions test
Compare all fields in `DllImport` when sorting to avoid nondeterminism in the error for multiple inconsistent definitions of an extern function. Restore the multiple-definitions test.
Resolves#87084.
CTFE/Miri engine Pointer type overhaul
This fixes the long-standing problem that we are using `Scalar` as a type to represent pointers that might be integer values (since they point to a ZST). The main problem is that with int-to-ptr casts, there are multiple ways to represent the same pointer as a `Scalar` and it is unclear if "normalization" (i.e., the cast) already happened or not. This leads to ugly methods like `force_mplace_ptr` and `force_op_ptr`.
Another problem this solves is that in Miri, it would make a lot more sense to have the `Pointer::offset` field represent the full absolute address (instead of being relative to the `AllocId`). This means we can do ptr-to-int casts without access to any machine state, and it means that the overflow checks on pointer arithmetic are (finally!) accurate.
To solve this, the `Pointer` type is made entirely parametric over the provenance, so that we can use `Pointer<AllocId>` inside `Scalar` but use `Pointer<Option<AllocId>>` when accessing memory (where `None` represents the case that we could not figure out an `AllocId`; in that case the `offset` is an absolute address). Moreover, the `Provenance` trait determines if a pointer with a given provenance can be cast to an integer by simply dropping the provenance.
I hope this can be read commit-by-commit, but the first commit does the bulk of the work. It introduces some FIXMEs that are resolved later.
Fixes https://github.com/rust-lang/miri/issues/841
Miri PR: https://github.com/rust-lang/miri/pull/1851
r? `@oli-obk`
Handle non-integer const generic parameters in debuginfo type names.
This PR fixes an ICE introduced by https://github.com/rust-lang/rust/pull/85269 which started emitting const generic arguments for debuginfo names but did not cover the case where such an argument could not be evaluated to a flat string of bits.
The fix implemented in this PR is very basic: If `try_eval_bits()` fails for the constant in question, we fall back to generating a stable hash of the constant and emit that instead. This way we get a (virtually) unique name and side step the problem of generating a string representation of a potentially complex value.
The downside is that the generated name will be rather opaque. E.g. the regression test adds a function `const_generic_fn_non_int<()>` which is then rendered as `const_generic_fn_non_int<{CONST#fe3cfa0214ac55c7}>`. I think it's an open question how to deal with this more gracefully.
I'd be interested in ideas on how to do this better.
r? `@wesleywiser`
cc `@dpaoliello` (do you see any problems with this approach?)
cc `@Mark-Simulacrum` & `@nagisa` (who I've seen comment on debuginfo issues recently -- anyone else?)
Fixes https://github.com/rust-lang/rust/issues/86893
When building with profile-generate request that metadata is kept
during the gc_sections call, as this can sometimes strip out profile
data.
This missing information in the prof files can then result in missing
functions when using the profile information.
Improve opaque pointers support
Opaque pointers are coming, and rustc is not ready.
This adds partial support by passing an explicit load type to LLVM. Two issues I've encountered:
* The necessary type was not available at the point where non-temporal copies were generated. I've pushed the code for that upwards out of the memcpy implementation and moved the position of a cast to make do with the types we have available. (I'm not sure that cast is needed at all, but have retained it in the interest of conservativeness.)
* The `PlaceRef::project_deref()` function used during debuginfo generation seems to be buggy in some way -- though I haven't figured out specifically what it does wrong. Replacing it with `load_operand().deref()` did the trick, but I don't really know what I'm doing here.
Simply shift the bitcast from the store to the load, so that
we can use the destination type. I'm not sure the bitcast is
really necessary, but keeping it for now.
I'm not really sure what is wrong here, but I was getting load
type mismatches in the debuginfo code (which is the only place
using this function).
Replacing the project_deref() implementation with a generic
load_operand + deref did the trick.
This makes load generation compatible with opaque pointers.
The generation of nontemporal copies still accesses the pointer
element type, as fixing this requires more movement.
Refactor linker code
This merges `LinkerInfo` into `CrateInfo` as there is no reason to keep them separate. `LinkerInfo::to_linker` is merged into `get_linker` as both have different logic for each linker type and `to_linker` is directly called after `get_linker`. Also contains a couple of small cleanups.
See the individual commits for all changes.