dladdr cannot leave dli_fname to be null
There are two places in the repo calling `dladdr`, and they are inconsistent wrt their assumption of whether the `dli_fname` field can be null. Let's make them consistent. I see nothing in the docs that allows it to be null, but just to be on the safe side let's make this an assertion so hopefully we get a report if that ever happens.
coverage: Detect unused local file IDs to avoid an LLVM assertion
Each function's coverage metadata contains a *local file table* that maps local file IDs (used by the function's mapping regions) to global file IDs (shared by all functions in the same CGU).
LLVM requires all local file IDs to have at least one mapping region, and has an assertion that will fail if it detects a local file ID with no regions. To make sure that assertion doesn't fire, we need to detect and skip functions whose metadata would trigger it.
(This can't actually happen yet, because currently all of a function's spans must belong to the same file and expansion. But this will be an important edge case when adding expansion region support.)
Stage0 bootstrap update
This PR [follows the release process](https://forge.rust-lang.org/release/process.html#master-bootstrap-update-tuesday) to update the stage0 compiler.
The only thing of note is 58651d1b31, which was flagged by clippy as a correctness fix. I think allowing that lint in our case makes sense, but it's worth to have a second pair of eyes on it.
r? `@Mark-Simulacrum`
This case can't actually happen yet (other than via a testing flag), because
currently all of a function's spans must belong to the same file and expansion.
But this will be an important edge case when adding expansion region support.
Previously `-Zprint-mono-items` would override the mono item collection
strategy. When debugging one doesn't want to change the behaviour, so
this was counter productive. Additionally, the produced behaviour was
artificial and might never arise without using the option in the first
place (`-Zprint-mono-items=eager` without `-Clink-dead-code`). Finally,
the option was incorrectly marked as `UNTRACKED`.
Resolve those issues, by turning `-Zprint-mono-items` into a boolean
flag that prints results of mono item collection without changing the
behaviour of mono item collection.
For codegen-units test incorporate `-Zprint-mono-items` flag directly
into compiletest tool.
Test changes are mechanical. `-Zprint-mono-items=lazy` was removed
without additional changes, and `-Zprint-mono-items=eager` was turned
into `-Clink-dead-code`. Linking dead code disables internalization, so
tests have been updated accordingly.
Fix `-Zremap-path-scope` rmeta handling
This PR fixes the conditional remapping (`-Zremap-path-scope`) of rmeta file paths ~~by using the `debuginfo` scope~~ by conditionally embedding the local path in addition to the remapped path.
Fixes https://github.com/rust-lang/rust/issues/139217
Make `-Zfixed-x18` into a target modifier
As part of #136966, the `-Zfixed-x18` flag should be turned into a target modifier. This is a blocker to stabilization of the flag. The flag was originally added in #124655 and the MCP for its addition is [MCP#748](https://github.com/rust-lang/compiler-team/issues/748).
On some aarch64 targets, the x18 register is used as a temporary caller-saved register by default. When the `-Zfixed-x18` flag is passed, this is turned off so that the compiler doesn't use the x18 register. This allows end-users to use the x18 register for other purposes. For example, by accessing it with inline asm you can use the register as a very efficient thread-local variable. Another common use-case is to store the stack pointer needed by the shadow-call-stack sanitizer. There are also some aarch64 targets where not using x18 is the default – in those cases the flag is a no-op.
Note that this flag does not *on its own* cause an ABI mismatch. What actually causes an ABI mismatch is when you have different compilation units that *disagree* on what it should be used for. But having a CU that uses it and another CU that doesn't normally isn't enough to trigger an ABI problem. However, we still consider the flag to be a target modifier in all cases, since it is assumed that you are passing the flag because you intend to assign some other meaning to the register. Rejecting all flag mismatches even if not all are unsound is consistent with [RFC#3716](https://rust-lang.github.io/rfcs/3716-target-modifiers.html). See the headings "not all mismatches are unsound" and "cases that are not caught" for additional discussion of this.
On aarch64 targets where `-Zfixed-x18` is not a no-op, it is an error to pass `-Zsanitizer=shadow-call-stack` without also passing `-Zfixed-x18`.
Implement the internal feature `cfg_target_has_reliable_f16_f128`
Support for `f16` and `f128` is varied across targets, backends, and backend versions. Eventually we would like to reach a point where all backends support these approximately equally, but until then we have to work around some of these nuances of support being observable.
Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which provides the following new configuration gates:
* `cfg(target_has_reliable_f16)`
* `cfg(target_has_reliable_f16_math)`
* `cfg(target_has_reliable_f128)`
* `cfg(target_has_reliable_f128_math)`
`reliable_f16` and `reliable_f128` indicate that basic arithmetic for the type works correctly. The `_math` versions indicate that anything relying on `libm` works correctly, since sometimes this hits a separate class of codegen bugs.
These options match configuration set by the build script at [1]. The logic for LLVM support is duplicated as-is from the same script. There are a few possible updates that will come as a follow up.
The config introduced here is not planned to ever become stable, it is only intended to replace the build scripts for `std` tests and `compiler-builtins` that don't have any way to configure based on the codegen backend.
MCP: https://github.com/rust-lang/compiler-team/issues/866
Closes: https://github.com/rust-lang/compiler-team/issues/866
[1]: 555e1d0386/library/std/build.rs (L84-L186)
---
The second commit makes use of this config to replace `cfg_{f16,f128}{,_math}` in `library/`. I omitted providing a `cfg(bootstrap)` configuration to keep things simpler since the next beta branch is in two weeks.
try-job: aarch64-gnu
try-job: i686-msvc-1
try-job: test-various
try-job: x86_64-gnu
try-job: x86_64-msvc-ext2
Support for `f16` and `f128` is varied across targets, backends, and
backend versions. Eventually we would like to reach a point where all
backends support these approximately equally, but until then we have to
work around some of these nuances of support being observable.
Introduce the `cfg_target_has_reliable_f16_f128` internal feature, which
provides the following new configuration gates:
* `cfg(target_has_reliable_f16)`
* `cfg(target_has_reliable_f16_math)`
* `cfg(target_has_reliable_f128)`
* `cfg(target_has_reliable_f128_math)`
`reliable_f16` and `reliable_f128` indicate that basic arithmetic for
the type works correctly. The `_math` versions indicate that anything
relying on `libm` works correctly, since sometimes this hits a separate
class of codegen bugs.
These options match configuration set by the build script at [1]. The
logic for LLVM support is duplicated as-is from the same script. There
are a few possible updates that will come as a follow up.
The config introduced here is not planned to ever become stable, it is
only intended to replace the build scripts for `std` tests and
`compiler-builtins` that don't have any way to configure based on the
codegen backend.
MCP: https://github.com/rust-lang/compiler-team/issues/866
Closes: https://github.com/rust-lang/compiler-team/issues/866
[1]: 555e1d0386/library/std/build.rs (L84-L186)
Rollup of 8 pull requests
Successful merges:
- #137683 (Add a tidy check for GCC submodule version)
- #138968 (Update the index of Result to make the summary more comprehensive)
- #139572 (docs(std): mention const blocks in const keyword doc page)
- #140152 (Unify the format of rustc cli flags)
- #140193 (fix ICE in `#[naked]` attribute validation)
- #140205 (Tidying up UI tests [2/N])
- #140284 (remove expect() in `unnecessary_transmutes`)
- #140290 (rustdoc: fix typo change from equivelent to equivalent)
r? `@ghost`
`@rustbot` modify labels: rollup
Unify the format of rustc cli flags
As mentioned in #140102, I unified the format of rustc CLI flags.
I use the following rules:
1. `<param>`: Indicates a required parameter
2. `[param]`: Indicates an optional parameter
3. `|`: Indicates a mutually exclusive option
4. `*`: a list element with description
Current output:
```bash
Usage: rustc [OPTIONS] INPUT
Options:
-h, --help Display this message
--cfg <SPEC> Configure the compilation environment.
SPEC supports the syntax `<NAME>[="<VALUE>"]`.
--check-cfg <SPEC>
Provide list of expected cfgs for checking
-L [<KIND>=]<PATH> Add a directory to the library search path. The
optional KIND can be one of
<dependency|crate|native|framework|all> (default:
all).
-l [<KIND>[:<MODIFIERS>]=]<NAME>[:<RENAME>]
Link the generated crate(s) to the specified native
library NAME. The optional KIND can be one of
<static|framework|dylib> (default: dylib).
Optional comma separated MODIFIERS
<bundle|verbatim|whole-archive|as-needed>
may be specified each with a prefix of either '+' to
enable or '-' to disable.
--crate-type <bin|lib|rlib|dylib|cdylib|staticlib|proc-macro>
Comma separated list of types of crates
for the compiler to emit
--crate-name <NAME>
Specify the name of the crate being built
--edition <2015|2018|2021|2024|future>
Specify which edition of the compiler to use when
compiling code. The default is 2015 and the latest
stable edition is 2024.
--emit <TYPE>[=<FILE>]
Comma separated list of types of output for the
compiler to emit.
Each TYPE has the default FILE name:
* asm - CRATE_NAME.s
* llvm-bc - CRATE_NAME.bc
* dep-info - CRATE_NAME.d
* link - (platform and crate-type dependent)
* llvm-ir - CRATE_NAME.ll
* metadata - libCRATE_NAME.rmeta
* mir - CRATE_NAME.mir
* obj - CRATE_NAME.o
* thin-link-bitcode - CRATE_NAME.indexing.o
--print <INFO>[=<FILE>]
Compiler information to print on stdout (or to a file)
INFO may be one of
<all-target-specs-json|calling-conventions|cfg|check-cfg|code-models|crate-name|crate-root-lint-levels|deployment-target|file-names|host-tuple|link-args|native-static-libs|relocation-models|split-debuginfo|stack-protector-strategies|supported-crate-types|sysroot|target-cpus|target-features|target-libdir|target-list|target-spec-json|tls-models>.
-g Equivalent to -C debuginfo=2
-O Equivalent to -C opt-level=3
-o <FILENAME> Write output to FILENAME
--out-dir <DIR> Write output to compiler-chosen filename in DIR
--explain <OPT> Provide a detailed explanation of an error message
--test Build a test harness
--target <TARGET>
Target triple for which the code is compiled
-A, --allow <LINT> Set lint allowed
-W, --warn <LINT> Set lint warnings
--force-warn <LINT>
Set lint force-warn
-D, --deny <LINT> Set lint denied
-F, --forbid <LINT> Set lint forbidden
--cap-lints <LEVEL>
Set the most restrictive lint level. More restrictive
lints are capped at this level
-C, --codegen <OPT>[=<VALUE>]
Set a codegen option
-V, --version Print version info and exit
-v, --verbose Use verbose output
Additional help:
-C help Print codegen options
-W help Print 'lint' options and default settings
-Z help Print unstable compiler options
--help -v Print the full set of options rustc accepts
```
Make #![feature(let_chains)] bootstrap conditional in compiler/
Let chains have been stabilized recently in #132833, so we can remove the gating from our uses in the compiler (as the compiler uses edition 2024).
Autodiff flags
Interestingly, it seems that some other projects have conflicts with exactly the same LLVM optimization passes as autodiff.
At least `LLVMRustOptimize` has exactly the flags that we need to disable problematic opt passes.
This PR enables us to compile code where users differentiate two identical functions in the same module. This has been especially common in test cases, but it's not impossible to encounter in the wild.
It also enables two new flags for testing/debugging. I consider writing an MCP to upgrade PrintPasses to be a standalone -Z flag, since it is *not* the same as `-Z print-llvm-passes`, which IMHO gives less useful output. A discussion can be found here: [#t-compiler/llvm > Print llvm passes. @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/187780-t-compiler.2Fllvm/topic/Print.20llvm.20passes.2E/near/511533038)
Finally, it improves `PrintModBefore` and `PrintModAfter`. They used to work reliable, but now we just schedule enzyme as part of an existing ModulePassManager (MPM). Since Enzyme is last in the MPM scheduling, PrintModBefore became very inaccurate. It used to print the input module, which we gave to the Enzyme and was great to create llvm-ir reproducer. However, lately the MPM would run the whole `default<O3>` pipeline, which heavily modifies the llvm module, before we pass it to Enzyme. That made it impossible to use the flag to create llvm-ir reproducers for Enzyme bugs. We now schedule a PrintModule pass just before Enzyme, solving this problem.
Based on the PrintPass output, it also _seems_ like changing `registerEnzymeAndPassPipeline(PB, true);` to `registerEnzymeAndPassPipeline(PB, false);` has no effect. In theory, the bool should tell Enzyme to schedule some helpful passes in the PassBuilder. However, since it doesn't do anything and I'm not 100% sure anymore on whether we really need it, I'll just disable it for now and postpone investigations.
r? ``@oli-obk``
closes#139471
Tracking:
- https://github.com/rust-lang/rust/issues/124509
Construct OutputType using macro and print [=FILENAME] help info
Closes#139805
Use define_output_types to define variants of OutputType, as well as refactor all of its methods for clarity. This way no variant is missed when pattern matching or output help messages.
On top of that, I optimized for `emit` help messages.
r? ```@jieyouxu```
Do not remove trivial `SwitchInt` in analysis MIR
This PR ensures that we don't prematurely remove trivial `SwitchInt` terminators which affects both the borrow-checking and runtime semantics (i.e. UB) of the code. Previously the `SimplifyCfg` optimization was removing `SwitchInt` terminators when they was "trivial", i.e. when all arms branched to the same basic block, even if that `SwitchInt` terminator had the side-effect of reading an operand which (for example) may not be initialized or may point to an invalid place in memory.
This behavior is unlike all other optimizations, which are only applied after "analysis" (i.e. borrow-checking) is finished, and which Miri disables to make sure the compiler doesn't silently remove UB.
Fixing this code "breaks" (i.e. unmasks) code that used to borrow-check but no longer does, like:
```rust
fn foo() {
let x;
let (0 | _) = x;
}
```
This match expression should perform a read because `_` does not shadow the `0` literal pattern, and the compiler should have to read the match scrutinee to compare it to 0. I've checked that this behavior does not actually manifest in practice via a crater run which came back clean: https://github.com/rust-lang/rust/pull/139042#issuecomment-2767436367
As a side-note, it may be tempting to suggest that this is actually a good thing or that we should preserve this behavior. If we wanted to make this work (i.e. trivially optimize out reads from matches that are redundant like `0 | _`), then we should be enabling this behavior *after* fixing this. However, I think it's kinda unprincipled, and for example other variations of the code don't even work today, e.g.:
```rust
fn foo() {
let x;
let (0.. | _) = x;
}
```
Hide unstable print kinds within emit_unknown_print_request_help in stable channel
Fixes#138698
We need to get the channel from `matches`. However, since `matches`(Line 1169) is constructed after `rustc_optgroups` (Line1165, where `RustcOptGroup::value_hint` is generated, i.e. what `rustc --print print` prints), I've left it unchanged here for now.
2da29dbe8f/compiler/rustc_driver_impl/src/lib.rs (L1161-L1169)
There is actually a way to manually parse the `--crate-name` parameter, but I'm afraid that's an unorthodox practice. So I conservatively just modified `emit_unknown_print_request_help` to print different parameters depending on whether they are nightly or not when passing the error parameter.
r? ```@jieyouxu```
Make CodeStats' type_sizes public
Add another way to get type sizes in CodeStats. I find it weird that the only way to get this information in block for all types is via printing directly to stdout. So this PR adds that flexibility.
Add unstable parsing of `--extern foo::bar=libbar.rlib` command line options
This is a tiny step towards implementing the rustc side of support for implementing packages as optional namespaces (#122349). We add support for parsing command line options like `--extern foo::bar=libbar.rlib` when the `-Z namespaced-crates` option is present.
We don't do anything further with them. The next step is to plumb this down to the name resolver.
This PR also generally refactors the extern argument parsing code and adds some unit tests to make it clear what forms should be accepted with and without the flag.
cc ```@epage``` ```@ehuss```
Stabilize `-Zdwarf-version` as `-Cdwarf-version`
I propose stabilizing `-Zdwarf-version` as `-Cdwarf-version`. This PR adds a new `-Cdwarf-version` flag, leaving the unstable `-Z` flag as is to ease the transition period. The `-Z` flag will be removed in the future.
# `-Zdwarf-version` stabilization report
## What is the RFC for this feature and what changes have occurred to the user-facing design since the RFC was finalized?
No RFC/MCP, this flag was added in https://github.com/rust-lang/rust/pull/98350 and was not deemed large enough to require additional process.
The tracking issue for this feature is #103057.
## What behavior are we committing to that has been controversial? Summarize the major arguments pro/con.
None that has been extensively debated but there are a few questions that could have been chosen differently:
1. What should the flag name be?
The current flag name is very specific to DWARF. Other debuginfo formats exist (msvc's CodeView format or https://en.wikipedia.org/wiki/Stabs) so we could have chosen to generalize the flag name (`-{C,Z} debuginfo-version=dwarf-5` for example). While this would extend cleanly to support formats other than DWARF, there are some downsides to this design. Neither CodeView nor Stabs have specification or format versions so it's not clear what values would be supported beyond `dwarf-{2,3,4,5}` or `codeview`. We would also need to take care to ensure the name does not lead users to think they can pick a format other than one supported by the target. For instance, what would `--target x86_64-pc-windows-msvc -Cdebuginfo-version=dwarf-5` do?
2. What is the behavior when flag is used on targets that do not support DWARF?
Currently, passing `-{C,Z} dwarf-version` on targets like `*-windows-msvc` does not do anything. It may be preferable to emit a warning alerting the user that the flag has no effect on the target platform. Alternatively, we could emit an error but this could be annoying since it would require the use of target specific RUSTFLAGS to use the flag correctly (and there isn't a way to target "any platform that uses DWARF" using cfgs).
3. Does the precompiled standard library potentially using a different version of DWARF a problem?
I don't believe this is an issue as debuggers (and other such tools) already must deal with the possibility that an application uses different DWARF versions across its statically or dynamically linked libraries.
## Are there extensions to this feature that remain unstable? How do we know that we are not accidentally committing to those.
No extensions per se, although future DWARF versions could be considered as such. At present, we validate the requested DWARF version is between 2 and 5 (inclusive) so new DWARF versions will not automatically be supported until the validation logic is adjusted.
## Summarize the major parts of the implementation and provide links into the code (or to PRs)
- Targets define their preferred or default DWARF version: 34a5ea911c/compiler/rustc_target/src/spec/mod.rs (L2369)
- We use the target default but this can be overriden by `-{C,Z} dwarf-version` 34a5ea911c/compiler/rustc_session/src/session.rs (L738)
- The flag is validated 34a5ea911c/compiler/rustc_session/src/session.rs (L1253-L1258)
- When debuginfo is generated, we tell LLVM to use the requested value or the target default 34a5ea911c/compiler/rustc_codegen_llvm/src/debuginfo/mod.rs (L106)
## Summarize existing test coverage of this feature
- Test that we actually generate the appropriate DWARF version
- https://github.com/rust-lang/rust/blob/master/tests/assembly/dwarf5.rs
- https://github.com/rust-lang/rust/blob/master/tests/assembly/dwarf4.rs
- Test that LTO with different DWARF versions picks the highest version
- https://github.com/rust-lang/rust/blob/master/tests/assembly/dwarf-mixed-versions-lto.rs
- Test DWARF versions 2-5 are valid while 0, 1 and 6 report an error
- https://github.com/rust-lang/rust/blob/master/tests/ui/debuginfo/dwarf-versions.rs
- Ensure LLVM does not report a warning when LTO'ing different DWARF versions together
- https://github.com/rust-lang/rust/blob/master/tests/ui/lto/dwarf-mixed-versions-lto.rs
## Has a call-for-testing period been conducted? If so, what feedback was received?
No call-for-testing has been conducted but Rust for Linux has been using this flag without issue.
## What outstanding bugs in the issue tracker involve this feature? Are they stabilization-blocking?
All reported bugs have been resolved.
## Summarize contributors to the feature by name for recognition and assuredness that people involved in the feature agree with stabilization
- Initial implementation in https://github.com/rust-lang/rust/pull/98350 by `@pcwalton`
- Stop emitting `.debug_pubnames` and `.debug_pubtypes` when using DWARF 5 in https://github.com/rust-lang/rust/pull/117962 by `@weihanglo.`
- Refactoring & cleanups (#135739), fix LLVM warning on LTO with different DWARF versions (#136659) and argument validation (#136746) by `@wesleywiser`
## What FIXMEs are still in the code for that feature and why is it ok to leave them there?
No FIXMEs related to this feature.
## What static checks are done that are needed to prevent undefined behavior?
This feature cannot cause undefined behavior.
We ensure the DWARF version is one of the supported values [here](34a5ea911c/compiler/rustc_session/src/session.rs (L1255-L1257)).
## In what way does this feature interact with the reference/specification, and are those edits prepared?
No changes to reference/spec, unstable rustc docs are moved to the stable book as part of the stabilization PR.
## Does this feature introduce new expressions and can they produce temporaries? What are the lifetimes of those temporaries?
No.
## What other unstable features may be exposed by this feature?
`-Zembed-source` requires use of DWARF 5 extensions but has its own feature gate.
## What is tooling support like for this feature, w.r.t rustdoc, clippy, rust-analzyer, rustfmt, etc.?
No support needed for rustdoc, clippy, rust-analyzer, rustfmt or rustup.
Cargo could expose this as an option in build profiles but I would expect the decision as to what version should be used would be made for the entire crate graph at build time rather than by individual package authors.
cc-rs has support for detecting the presence of `-{C,Z} dwarf-version` in `RUSTFLAGS` and providing the corresponding flag to Clang/gcc (https://github.com/rust-lang/cc-rs/pull/1395).
---
Closes#103057
Reject test executables when not supported by target
Currently, compiling tests for SOLID produces an ICE, because SOLID does not support executables.
See https://github.com/rust-lang/rust/issues/138047
Refactor Apple version handling in the compiler
Move various Apple version handling code in the compiler out `rustc_codegen_ssa` and into a place where it can be accessed by `rustc_attr_parsing`, which I found to be necessary when doing https://github.com/rust-lang/rust/pull/136867. Thought I'd split it out to make it easier to land, and to make further changes like https://github.com/rust-lang/rust/pull/131477 have fewer conflicts / PR dependencies.
There should be no functional changes in this PR.
`@rustbot` label O-apple
r? rust-lang/compiler
Autodiff batching
Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is passing in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations.
Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size,
and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of
```rs
for i in 0..100 {
df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in 0..100.step_by(4) {
df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.
There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days.
I will also add more tests for both modes.
For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.
r? ghost
Tracking:
- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283