not need `Option` for `dbg_scope`
This PR fixes a few FIXME about not using `Option` in `dbg_scope` field of `DebugScope`, during `create_function_debug_context` func in codegen parts.
Added a `BitSet<SourceScope>` parameter to `make_mir_scope` to indicate whether the `DebugScope` has been instantiated.
cc ````@eddyb````
Generate synthetic object file to ensure all exported and used symbols participate in the linking
Fix#50007 and #47384
This is the synthetic object file approach that I described in https://github.com/rust-lang/rust/pull/95363#issuecomment-1079932354, allowing all exported and used symbols to be linked while still allowing them to be GCed.
Related #93791, #95363
r? `@petrochenkov`
cc `@carbotaniuman`
Drop support for legacy PM with LLVM 15
LLVM 15 already removes some of the legacy PM APIs we're using. This patch forces use of NewPM with LLVM 15 (with `-Z new-llvm-pass-manager=no` throwing a warning) and stubs out various FFI methods with a report_fatal_error on LLVM 15.
For LLVMPassManagerBuilderPopulateLTOPassManager() I went with adding our own wrapper, as the alternative would be to muck about with weak symbols, which seems to be non-trivial as far as cross-platform support is concerned (std has `weak!` for this purpose, but only as an internal utility.)
Fixes#96072.
Fixes#96362.
asm: Add a kreg0 register class on x86 which includes k0
Previously we only exposed a kreg register class which excludes the k0
register since it can't be used in many instructions. However k0 is a
valid register and we need to have a way of marking it as clobbered for
clobber_abi.
Fixes#94977
Previously we only exposed a kreg register class which excludes the k0
register since it can't be used in many instructions. However k0 is a
valid register and we need to have a way of marking it as clobbered for
clobber_abi.
Fixes#94977
Allow self-profiler to only record potentially costly arguments when argument recording is turned on
As discussed [on zulip](https://rust-lang.zulipchat.com/#narrow/stream/247081-t-compiler.2Fperformance/topic/Identifying.20proc-macro.20slowdowns/near/277304909) with `@wesleywiser,` I'd like to record proc-macro expansions in the self-profiler, with some detailed data (per-expansion spans for example, to follow #95473).
At the same time, I'd also like to avoid doing expensive things when tracking a generic activity's arguments, if they were not specifically opted into the event filter mask, to allow the self-profiler to be used in hotter contexts.
This PR tries to offer:
- a way to ensure a closure to record arguments will only be called in that situation, so that potentially costly arguments can still be recorded when needed. With the additional requirement that, if possible, it would offer a way to record non-owned data without adding many `generic_activity_with_arg_{...}`-style methods. This lead to the `generic_activity_with_arg_recorder` single entry-point, and the closure parameter would offer the new methods, able to be executed in a context where costly argument could be created without disturbing the profiled piece of code.
- some facilities/patterns allowing to record more rustc specific data in this situation, without making `rustc_data_structures` where the self-profiler is defined, depend on other rustc crates (causing circular dependencies): in particular, spans. They are quite tricky to turn into strings (if the default `Debug` impl output does not match the context one needs them for), and since I'd also like to avoid the allocation there when arg recording is turned off today, that has turned into another flexibility requirement for the API in this PR (separating the span-specific recording into an extension trait). **edit**: I've removed this from the PR so that it's easier to review, and opened https://github.com/rust-lang/rust/pull/95739.
- allow for extensibility in the future: other ways to record arguments, or additional data attached to them could be added in the future (e.g. recording the argument's name as well as its data).
Some areas where I'd love feedback:
- the API and names: the `EventArgRecorder` and its method for example. As well as the verbosity that comes from the increased flexibility.
- if I should convert the existing `generic_activity_with_arg{s}` to just forward to `generic_activity_with_arg_recorder` + `recorder.record_arg` (or remove them altogether ? Probably not): I've used the new API in the simple case I could find of allocating for an arg that may not be recorded, and the rest don't seem costly.
- [x] whether this API should panic if no arguments were recorded by the user-provided closure (like this PR currently does: it seems like an error to use an API dedicated to record arguments but not call the methods to then do so) or if this should just record a generic activity without arguments ?
- whether the `record_arg` function should be `#[inline(always)]`, like the `generic_activity_*` functions ?
As mentioned, r? `@wesleywiser` following our recent discussion.
Rollup of 9 pull requests
Successful merges:
- #93969 (Only add codegen backend to dep info if -Zbinary-dep-depinfo is used)
- #94605 (Add missing links in platform support docs)
- #95372 (make unaligned_references lint deny-by-default)
- #95859 (Improve diagnostics for unterminated nested block comment)
- #95961 (implement SIMD gather/scatter via vector getelementptr)
- #96004 (Consider lifetimes when comparing types for equality in MIR validator)
- #96050 (Remove some now-dead code that was only relevant before deaggregation.)
- #96070 ([test] Add test cases for untested functions for BTreeMap)
- #96099 (MaybeUninit array cleanup)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
implement SIMD gather/scatter via vector getelementptr
Fixes https://github.com/rust-lang/portable-simd/issues/271
However, I don't *really* know what I am doing here... Cc ``@workingjubilee`` ``@calebzulawski``
I didn't do anything for cranelift -- ``@bjorn3`` not sure if it's okay for that backend to temporarily break. I'm happy to cherry-pick a patch that adds cranelift support. :)
We may sometimes emit an `invoke` instead of a `call` for inline
assembly during the MIR -> LLVM IR lowering. But we failed to update
the IR builder's current basic block before writing the results to the
outputs. This would result in invalid IR because the basic block would
end in a `store` instruction, which isn't a valid terminator.
Skip needless bitset for debuginfo
Found this while digging around looking at the inlining logic.
Seemed obvious enough so I decided to try to take care of it.
Is this what you had in mind, `@eddyb?`
Before this fix, the debuginfo for the fields was generated from the
struct defintion of Box<T>, but (at least at the moment) the compiler
pretends that Box<T> is just a (fat) pointer, so the fields need to be
`pointer` and `vtable` instead of `__0: Unique<T>` and `__1: Allocator`.
This is meant as a temporary mitigation until we can make sure that
simply treating Box as a regular struct in debuginfo does not cause too
much breakage in the ecosystem.
Fold aarch64 feature +fp into +neon
Arm's FEAT_FP and Feat_AdvSIMD describe the same thing on AArch64:
The Neon unit, which handles both floating point and SIMD instructions.
Moreover, a configuration for AArch64 must include both or neither.
Arm says "entirely proprietary" toolchains may omit floating point:
https://developer.arm.com/documentation/102374/0101/Data-processing---floating-point
In the Programmer's Guide for Armv8-A, Arm says AArch64 can have
both FP and Neon or neither in custom implementations:
https://developer.arm.com/documentation/den0024/a/AArch64-Floating-point-and-NEON
In "Bare metal boot code for Armv8-A", enabling Neon and FP
is just disabling the same trap flag:
https://developer.arm.com/documentation/dai0527/a
In an unlikely future where "Neon and FP" become unrelated,
we can add "[+-]fp" as its own feature flag.
Until then, we can simplify programming with Rust on AArch64 by
folding both into "[+-]neon", which is valid as it supersets both.
"[+-]neon" is retained for niche uses such as firmware, kernels,
"I just hate floats", and so on.
I am... pretty sure no one is relying on this.
An argument could be made that, as we are not an "entirely proprietary" toolchain, we should not support AArch64 without floats at all. I think that's a bit excessive. However, I want to recognize the intent: programming for AArch64 should be simplified where possible. For x86-64, programmers regularly set up illegal feature configurations because it's hard to understand them, see https://github.com/rust-lang/rust/issues/89586. And per the above notes, plus the discussion in https://github.com/rust-lang/rust/issues/86941, there should be no real use cases for leaving these features split: the two should in fact always go together.
- Fixesrust-lang/rust#95002.
- Fixesrust-lang/rust#95064.
- Fixesrust-lang/rust#95122.
Arm's FEAT_FP and Feat_AdvSIMD describe the same thing on AArch64:
The Neon unit, which handles both floating point and SIMD instructions.
Moreover, a configuration for AArch64 must include both or neither.
Arm says "entirely proprietary" toolchains may omit floating point:
https://developer.arm.com/documentation/102374/0101/Data-processing---floating-point
In the Programmer's Guide for Armv8-A, Arm says AArch64 can have
both FP and Neon or neither in custom implementations:
https://developer.arm.com/documentation/den0024/a/AArch64-Floating-point-and-NEON
In "Bare metal boot code for Armv8-A", enabling Neon and FP
is just disabling the same trap flag:
https://developer.arm.com/documentation/dai0527/a
In an unlikely future where "Neon and FP" become unrelated,
we can add "[+-]fp" as its own feature flag.
Until then, we can simplify programming with Rust on AArch64 by
folding both into "[+-]neon", which is valid as it supersets both.
"[+-]neon" is retained for niche uses such as firmware, kernels,
"I just hate floats", and so on.
Implement -Z oom=panic
This PR removes the `#[rustc_allocator_nounwind]` attribute on `alloc_error_handler` which allows it to unwind with a panic instead of always aborting. This is then used to implement `-Z oom=panic` as per RFC 2116 (tracking issue #43596).
Perf and binary size tests show negligible impact.
debuginfo: Refactor debuginfo generation for types
This PR implements the refactoring of the `rustc_codegen_llvm::debuginfo::metadata` module as described in MCP https://github.com/rust-lang/compiler-team/issues/482.
In particular it
- changes names to use `di_node` instead of `metadata`
- uniformly names all functions that build new debuginfo nodes `build_xyz_di_node`
- renames `CrateDebugContext` to `CodegenUnitDebugContext` (which is more accurate)
- removes outdated parts from `compiler/rustc_codegen_llvm/src/debuginfo/doc.md`
- moves `TypeMap` and functions that work directly work with it to a new `type_map` module
- moves enum related builder functions to a new `enums` module
- splits enum debuginfo building for the native and cpp-like cases, since they are mostly separate
- uses `SmallVec` instead of `Vec` in many places
- removes the old infrastructure for dealing with recursion cycles (`create_and_register_recursive_type_forward_declaration()`, `RecursiveTypeDescription`, `set_members_of_composite_type()`, `MemberDescription`, `MemberDescriptionFactory`, `prepare_xyz_metadata()`, etc)
- adds `type_map::build_type_with_children()` as a replacement for dealing with recursion cycles
- adds many (doc-)comments explaining what's going on
- changes cpp-like naming for C-Style enums so they don't get a `enum$<...>` name (because the NatVis visualizer does not apply to them)
- fixes detection of what is a C-style enum because some enums where classified as C-style even though they have fields
- changes cpp-like naming for generator enums so that NatVis works for them
- changes the position of discriminant debuginfo node so it is consistently nested inside the top-level union instead of, sometimes, next to it
The following could be done in subsequent PRs:
- add caching for `closure_saved_names_of_captured_variables`
- add caching for `generator_layout_and_saved_local_names`
- fix inconsistent handling of what is considered a C-style enum wrt to debuginfo
- rename `metadata` module to `types`
- move common generator fields to front instead of appending them
This PR is based on https://github.com/rust-lang/rust/pull/93644 which is not merged yet.
Right now, the changes are all done in one big commit. They could be split into smaller commits but hopefully the list of changes above makes it tractable to review them as a single commit too.
For now: r? `@ghost` (let's see if this affects compile times)