The field is also renamed from `ident` to `name. In most cases,
we don't actually need the `Span`. A new `ident` method is added
to `VariantDef` and `FieldDef`, which constructs the full `Ident`
using `tcx.def_ident_span()`. This method is used in the cases
where we actually need an `Ident`.
This makes incremental compilation properly track changes
to the `Span`, without all of the invalidations caused by storing
a `Span` directly via an `Ident`.
Consolidate checking for msvc when generating debuginfo
If the target we're generating code for is msvc, then we do two main
things differently: we generate type names in a C++ style instead of a
Rust style and we generate debuginfo for enums differently.
I've refactored the code so that there is one function
(`cpp_like_debuginfo`) which determines if we should use the C++ style
of naming types and other debuginfo generation or the regular Rust one.
r? ``@michaelwoerister``
This PR is not urgent so please don't let it interrupt your holidays! 🎄🎁
If the target we're generating code for is msvc, then we do two main
things differently: we generate type names in a C++ style instead of a
Rust style and we generate debuginfo for enums differently.
I've refactored the code so that there is one function
(`cpp_like_debuginfo`) which determines if we should use the C++ style
of naming types and other debuginfo generation or the regular Rust one.
In #79570, `-Z split-dwarf-kind={none,single,split}` was replaced by `-C
split-debuginfo={off,packed,unpacked}`. `-C split-debuginfo`'s packed
and unpacked aren't exact parallels to single and split, respectively.
On Unix, `-C split-debuginfo=packed` will put debuginfo into object
files and package debuginfo into a DWARF package file (`.dwp`) and
`-C split-debuginfo=unpacked` will put debuginfo into dwarf object files
and won't package it.
In the initial implementation of Split DWARF, split mode wrote sections
which did not require relocation into a DWARF object (`.dwo`) file which
was ignored by the linker and then packaged those DWARF objects into
DWARF packages (`.dwp`). In single mode, sections which did not require
relocation were written into object files but ignored by the linker and
were not packaged. However, both split and single modes could be
packaged or not, the primary difference in behaviour was where the
debuginfo sections that did not require link-time relocation were
written (in a DWARF object or the object file).
This commit re-introduces a `-Z split-dwarf-kind` flag, which can be
used to pick between split and single modes when `-C split-debuginfo` is
used to enable Split DWARF (either packed or unpacked).
Signed-off-by: David Wood <david.wood@huawei.com>
No functional changes intended.
The LLVM commit
ec501f15a8
removed the signed version of `createExpression`. This adapts the Rust
LLVM wrappers accordingly.
Remove `SymbolStr`
This was originally proposed in https://github.com/rust-lang/rust/pull/74554#discussion_r466203544. As well as removing the icky `SymbolStr` type, it allows the removal of a lot of `&` and `*` occurrences.
Best reviewed one commit at a time.
r? `@oli-obk`
rustc_codegen_llvm: Give each codegen unit a unique DWARF name on all platforms, not just Apple ones.
To avoid breaking split DWARF, we need to ensure that each codegen unit has a
unique `DW_AT_name`. This is because there's a remote chance that different
codegen units for the same module will have entirely identical DWARF entries
for the purpose of the DWO ID, which would violate Appendix F ("Split Dwarf
Object Files") of the DWARF 5 specification. LLVM uses the algorithm specified
in section 7.32 "Type Signature Computation" to compute the DWO ID, which does
not include any fields that would distinguish compilation units. So we must
embed the codegen unit name into the `DW_AT_name`.
Closes#88521.
Remove `in_band_lifetimes` from `rustc_codegen_llvm`
See #91867 for more information.
This one took a while. This crate has dozens of functions not associated with any type, and most of them were using in-band lifetimes for `'ll` and `'tcx`.
Apply path remapping to DW_AT_GNU_dwo_name when producing split DWARF
`--remap-path-prefix` doesn't apply to paths to `.o` (in case of packed) or `.dwo` (in case of unpacked) files in `DW_AT_GNU_dwo_name`. GCC also has this bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91888
platforms, not just Apple ones.
To avoid breaking split DWARF, we need to ensure that each codegen unit has a
unique `DW_AT_name`. This is because there's a remote chance that different
codegen units for the same module will have entirely identical DWARF entries
for the purpose of the DWO ID, which would violate Appendix F ("Split Dwarf
Object Files") of the DWARF 5 specification. LLVM uses the algorithm specified
in section 7.32 "Type Signature Computation" to compute the DWO ID, which does
not include any fields that would distinguish compilation units. So we must
embed the codegen unit name into the `DW_AT_name`.
Closes#88521.
By changing `as_str()` to take `&self` instead of `self`, we can just
return `&str`. We're still lying about lifetimes, but it's a smaller lie
than before, where `SymbolStr` contained a (fake) `&'static str`!
Create more accurate debuginfo for vtables.
Before this PR all vtables would have the same name (`"vtable"`) in debuginfo. Now they get an unambiguous name that identifies the implementing type and the trait that is being implemented.
This is only one of several possible improvements:
- This PR describes vtables as arrays of `*const u8` pointers. It would nice to describe them as structs where function pointer is represented by a field with a name indicative of the method it maps to. However, this requires coming up with a naming scheme that avoids clashes between methods with the same name (which is possible if the vtable contains multiple traits).
- The PR does not update the debuginfo we generate for the vtable-pointer field in a fat `dyn` pointer. Right now there does not seem to be an easy way of getting ahold of a vtable-layout without also knowing the concrete self-type of a trait object.
r? `@wesleywiser`
Before this commit all vtables would have the same name "vtable" in
debuginfo. Now they get a name that identifies the implementing type
and the trait that is being implemented.
Using symbol::Interner makes it very easy to mixup UniqueTypeId symbols
with the global interner. In fact the Debug implementation of
UniqueTypeId did exactly this.
Using a separate interner type also avoids prefilling the interner with
unused symbols and allow for optimizing the symbol interner for parallel
access without negatively affecting the single threaded module codegen.
Provide `layout_of` automatically (given tcx + param_env + error handling).
After #88337, there's no longer any uses of `LayoutOf` within `rustc_target` itself, so I realized I could move the trait to `rustc_middle::ty::layout` and redesign it a bit.
This is similar to #88338 (and supersedes it), but at no ergonomic loss, since there's no funky `C: LayoutOf<Ty = Ty>` -> `Ty: TyAbiInterface<C>` generic `impl` chain, and each `LayoutOf` still corresponds to one `impl` (of `LayoutOfHelpers`) for the specific context.
After this PR, this is what's needed to get `trait LayoutOf` (with the `layout_of` method) implemented on some context type:
* `TyCtxt`, via `HasTyCtxt`
* `ParamEnv`, via `HasParamEnv`
* a way to transform `LayoutError`s into the desired error type
* an error type of `!` can be paired with having `cx.layout_of(...)` return `TyAndLayout` *without* `Result<...>` around it, such as used by codegen
* this is done through a new `LayoutOfHelpers` trait (and so is specifying the type of `cx.layout_of(...)`)
When going through this path (and not bypassing it with a manual `impl` of `LayoutOf`), the end result is that only the error case can be customized, the query itself and the success paths are guaranteed to be uniform.
(**EDIT**: just noticed that because of the supertrait relationship, you cannot actually implement `LayoutOf` yourself, the blanket `impl` fully covers all possible context types that could ever implement it)
Part of the motivation for this shape of API is that I've been working on querifying `FnAbi::of_*`, and what I want/need to introduce for that looks a lot like the setup in this PR - in particular, it's harder to express the `FnAbi` methods in `rustc_target`, since they're much more tied to `rustc` concepts.
r? `@nagisa` cc `@oli-obk` `@bjorn3`
Include debug info for the allocator shim
Issue Details:
In some cases it is necessary to generate an "allocator shim" to forward various Rust allocation functions (e.g., `__rust_alloc`) to an underlying function (e.g., `malloc`). However, since this allocator shim is a manually created LLVM module it is not processed via the normal module processing code and so no debug info is generated for it (if debugging info is enabled).
Fix Details:
* Modify the `debuginfo` code to allow creating debug info for a module without a `CodegenCx` (since it is difficult, and expensive, to create one just to emit some debug info).
* After creating the allocator shim add in basic debug info.
Issue Details:
In some cases it is necessary to generate an "allocator shim" to forward various Rust allocation functions (e.g., `__rust_alloc`) to an underlying function (e.g., `malloc`). However, since this allocator shim is a manually created LLVM module it is not processed via the normal module processing code and so no debug info is generated for it (if debugging info is enabled).
Fix Details:
* Modify the `debuginfo` code to allow creating debug info for a module without a `CodegenCx` (since it is difficult, and expensive, to create one just to emit some debug info).
* After creating the allocator shim add in basic debug info.
Fixes#85019
A `SourceFile` created during compilation may have a relative
path (e.g. if rustc itself is invoked with a relative path).
When we write out crate metadata, we convert all relative paths
to absolute paths using the current working direction.
However, the working directory is not included in the crate hash.
This means that the crate metadata can change while the crate
hash remains the same. Among other problems, this can cause a
fingerprint mismatch ICE, since incremental compilation uses
the crate metadata hash to determine if a foreign query is green.
This commit moves the field holding the working directory from
`Session` to `Options`, including it as part of the crate hash.
Name the captured upvars for closures/generators in debuginfo
Previously, debuggers print closures as something like
```
y::main::closure-0 (0x7fffffffdd34)
```
The pointer actually references to an upvar. It is not very obvious, especially for beginners.
It's because upvars don't have names before, as they are packed into a tuple. This PR names the upvars, so we can expect to see something like
```
y::main::closure-0 {_captured_ref__b: 0x[...]}
```
r? `@tmandry`
Discussed at https://github.com/rust-lang/rust/pull/84752#issuecomment-831639489 .
This makes load generation compatible with opaque pointers.
The generation of nontemporal copies still accesses the pointer
element type, as fixing this requires more movement.
- Closures in external crates may get compiled in because of
monomorphization. We should store names of captured variables
in `optimized_mir`, so that they are written into the metadata
file and we can use them to generate debuginfo.
- If there are breakpoints inside closures, the names of captured
variables stored in `optimized_mir` can be used to print them.
Now the name is more precise when disjoint fields are captured.
Previously, debuggers print closures as something like
```
y::main::closure-0 (0x7fffffffdd34)
```
The pointer actually references to an upvar. It is not
very obvious, especially for beginners.
It's because upvars don't have names before, as they
are packed into a tuple. This commit names the upvars,
so we can expect to see something like
```
y::main::closure-0 {_captured_ref__b: 0x[...]}
```
Previously, only the fields would be displayed with no indication of the
variant name. If you already knew the enum was univariant, this was ok
but if the enum was univariant because of layout, for example, a
`Result<T, !>` then it could be very confusing which variant was the
active one.
Previously, directly tagged enums had a `variant$` field which would
show the name of the active variant. We now show the variant using a
`[variant]` synthetic item just like we do for niche-layout enums.
There are several cases where names of types and functions in the debug info are either ambiguous, or not helpful, such as including ambiguous placeholders (e.g., `{{impl}}`, `{{closure}}` or `dyn _'`) or dropping qualifications (e.g., for dynamic types).
Instead, each debug symbol name should be unique and useful:
* Include disambiguators for anonymous `DefPathDataName` (closures and generators), and unify their formatting when used as a path-qualifier vs item being qualified.
* Qualify the principal trait for dynamic types.
* If there is no principal trait for a dynamic type, emit all other traits instead.
* Respect the `qualified` argument when emitting ref and pointer types.
* For implementations, emit the disambiguator.
* Print const generics when emitting generic parameters or arguments.
Additionally, when targeting MSVC, its debugger treats many command arguments as C++ expressions, even when the argument is defined to be a symbol name. As such names in the debug info need to be more C++-like to be parsed correctly:
* Avoid characters with special meaning (`#`, `[`, `"`, `+`).
* Never start a name with `<` or `{` as this is treated as an operator.
* `>>` is always treated as a right-shift, even when parsing generic arguments (so add a space to avoid this).
* Emit function declarations using C/C++ style syntax (e.g., leading return type).
* Emit arrays as a synthetic `array$<type, size>` type.
* Include a `$` in all synthetic types as this is a legal character for C++, but not Rust (thus we avoid collisions with user types).
Previously, we would generate a single struct with the layout of the
dataful variant plus an extra field whose name contained the value of
the niche (this would only really work for things like `Option<&_>`
where we can determine that the `None` case maps to `0` but for enums
that have multiple tag only variants, this doesn't work).
Now, we generate a union of two structs, one which is the layout of the
dataful variant and one which just has a way of reading the
discriminant. We also generate an enum which maps the discriminant value
to the tag only variants.
We also encode information about the range of values which correspond to
the dataful variant in the type name and then use natvis to determine
which union field we should display to the user.
As a result of this change, all niche-layout enums render correctly in
WinDbg and Visual Studio!