Use conditional synchronization for Lock
This changes `Lock` to use synchronization only if `mode::is_dyn_thread_safe` could be true. This reduces overhead for the parallel compiler running with 1 thread.
The emitters are changed to use `DynSend` instead of `Send` so they can still use `Lock`.
A Rayon thread pool is not used with 1 thread anymore, as session globals contains `Lock`s which are no longer `Sync`.
Performance improvement with 1 thread and `cfg(parallel_compiler)`:
<table><tr><td rowspan="2">Benchmark</td><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th></tr><tr><td align="right">Time</td><td align="right">Time</td><td align="right">%</th></tr><tr><td>🟣 <b>clap</b>:check</td><td align="right">1.7665s</td><td align="right">1.7336s</td><td align="right">💚 -1.86%</td></tr><tr><td>🟣 <b>hyper</b>:check</td><td align="right">0.2780s</td><td align="right">0.2736s</td><td align="right">💚 -1.61%</td></tr><tr><td>🟣 <b>regex</b>:check</td><td align="right">0.9994s</td><td align="right">0.9824s</td><td align="right">💚 -1.70%</td></tr><tr><td>🟣 <b>syn</b>:check</td><td align="right">1.5875s</td><td align="right">1.5656s</td><td align="right">💚 -1.38%</td></tr><tr><td>🟣 <b>syntex_syntax</b>:check</td><td align="right">6.0682s</td><td align="right">5.9532s</td><td align="right">💚 -1.90%</td></tr><tr><td>Total</td><td align="right">10.6997s</td><td align="right">10.5083s</td><td align="right">💚 -1.79%</td></tr><tr><td>Summary</td><td align="right">1.0000s</td><td align="right">0.9831s</td><td align="right">💚 -1.69%</td></tr></table>
cc `@SparrowLii`
Add an (perma-)unstable option to disable vtable vptr
This flag is intended for evaluation of trait upcasting space cost for embedded use cases.
Compared to the approach in #112355, this option provides a way to evaluate end-to-end cost of trait upcasting. Rationale: https://github.com/rust-lang/rust/issues/112355#issuecomment-1658207769
## How this flag should be used (after merge)
Build your project with and without `-Zno-trait-vptr` flag. If you are using cargo, set `RUSTFLAGS="-Zno-trait-vptr"` in the environment variable. You probably also want to use `-Zbuild-std` or the binary built may be broken. Save both binaries somewhere.
### Evaluate the space cost
The option has a direct and indirect impact on vtable space usage. Directly, it gets rid of the trait vptr entry needed to store a pointer to a vtable of a supertrait. (IMO) this is a small saving usually. The larger saving usually comes with the indirect saving by eliminating the vtable of the supertrait (and its parent).
Both impacts only affects vtables (notably the number of functions monomorphized should , however where vtable reside can depend on your relocation model. If the relocation model is static, then vtable is rodata (usually stored in Flash/ROM together with text in embedded scenario). If the binary is relocatable, however, the vtable will live in `.data` (more specifically, `.data.rel.ro`), and this will need to reside in RAM (which may be a more scarce resource in some cases), together with dynamic relocation info living in readonly segment.
For evaluation, you should run `size` on both binaries, with and without the flag. `size` would output three columns, `text`, `data`, `bss` and the sum `dec` (and it's hex version). As explained above, both `text` and `data` may change. `bss` shouldn't usually change. It'll be useful to see:
* Percentage change in text + data (indicating required flash/ROM size)
* Percentage change in data + bss (indicating required RAM size)
This reverts commit 4410868798, reversing
changes made to 249595b752.
This causes linker failures with the binutils version used by
cross (#115239), as well as miscompilations when using the mold
linker.
This option tells LLVM to emit relaxable relocation types
R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX/R_386_GOT32X in applicable cases. True
matches Clang's CMake default since 2020-08 [1] and latest LLVM default[2].
This also works around a GNU ld<2.41 issue[3] when using
general-dynamic/local-dynamic TLS models in `-Z plt=no` mode with latest LLVM.
[1]: c41a18cf61
[2]: 2aedfdd9b8
[3]: https://sourceware.org/bugzilla/show_bug.cgi?id=24784
All of them are not exported from rustc_interface and used only during
global_ctxt(). Inlining them makes it easier to follow the order of
queries and slightly reduces line count.
new unstable option: -Zwrite-long-types-to-disk
This option guards the logic of writing long type names in files and instead using short forms in error messages in rustc_middle/ty/error behind a flag. The main motivation for this change is to disable this behaviour when running ui tests.
This logic can be triggered by running tests in a directory that has a long enough path, e.g. /my/very-long-path/where/rust-codebase/exists/
This means ui tests can fail depending on how long the path to their file is.
Some ui tests actually rely on this behaviour for their assertions, so for those we enable the flag manually.
This option guards the logic of writing long type names in files and
instead using short forms in error messages in rustc_middle/ty/error
behind a flag. The main motivation for this change is to disable this
behaviour when running ui tests.
This logic can be triggered by running tests in a directory that has a
long enough path, e.g. /my/very-long-path/where/rust-codebase/exists/
This means ui tests can fail depending on how long the path to their
file is.
Some ui tests actually rely on this behaviour for their assertions,
so for those we enable the flag manually.
Prototype: Add unstable `-Z reference-niches` option
MCP: rust-lang/compiler-team#641
Relevant RFC: rust-lang/rfcs#3204
This prototype adds a new `-Z reference-niches` option, controlling the range of valid bit-patterns for reference types (`&T` and `&mut T`), thereby enabling new enum niching opportunities. Like `-Z randomize-layout`, this setting is crate-local; as such, references to built-in types (primitives, tuples, ...) are not affected.
The possible settings are (here, `MAX` denotes the all-1 bit-pattern):
| `-Z reference-niches=` | Valid range |
|:---:|:---:|
| `null` (the default) | `1..=MAX` |
| `size` | `1..=(MAX- size)` |
| `align` | `align..=MAX.align_down_to(align)` |
| `size,align` | `align..=(MAX-size).align_down_to(align)` |
------
This is very WIP, and I'm not sure the approach I've taken here is the best one, but stage 1 tests pass locally; I believe this is in a good enough state to unleash this upon unsuspecting 3rd-party code, and see what breaks.
Resurrect: rustc_llvm: Add a -Z `print-codegen-stats` option to expose LLVM statistics.
This resurrects PR https://github.com/rust-lang/rust/pull/104000, which has sat idle for a while. And I want to see the effect of stack-move optimizations on LLVM (like https://reviews.llvm.org/D153453) :).
I have applied the changes requested by `@oli-obk` and `@nagisa` https://github.com/rust-lang/rust/pull/104000#discussion_r1014625377 and https://github.com/rust-lang/rust/pull/104000#discussion_r1014642482 in the latest commits.
r? `@oli-obk`
-----
LLVM has a neat [statistics](https://llvm.org/docs/ProgrammersManual.html#the-statistic-class-stats-option) feature that tracks how often optimizations kick in. It's very handy for optimization work. Since we expose the LLVM pass timings, I thought it made sense to expose the LLVM statistics too.
-----
(Edit: fix broken link
(Edit2: fix segmentation fault and use malloc
If `rustc` is built with
```toml
[llvm]
assertions = true
```
Then you can see like
```
rustc +stage1 -Z print-codegen-stats -C opt-level=3 tmp.rs
===-------------------------------------------------------------------------===
... Statistics Collected ...
===-------------------------------------------------------------------------===
3 aa - Number of MayAlias results
193 aa - Number of MustAlias results
531 aa - Number of NoAlias results
...
```
And the current default build emits only
```
$ rustc +stage1 -Z print-codegen-stats -C opt-level=3 tmp.rs
===-------------------------------------------------------------------------===
... Statistics Collected ...
===-------------------------------------------------------------------------===
$
```
This might be better to emit the message to tell assertion flag necessity, but now I can't find how to do that...
Implement rust-lang/compiler-team#578.
When an ICE is encountered on nightly releases, the new rustc panic
handler will also write the contents of the backtrace to disk. If any
`delay_span_bug`s are encountered, their backtrace is also added to the
file. The platform and rustc version will also be collected.
LLVM has a neat [statistics] feature that tracks how often optimizations kick
in. It's very handy for optimization work. Since we expose the LLVM pass
timings, I thought it made sense to expose the LLVM statistics too.
[statistics]: https://llvm.org/docs/ProgrammersManual.html#the-statistic-class-stats-option
Don't require each rustc_interface tool to opt-in to parallel_compiler
Previously, forgetting to call `interface::set_thread_safe_mode` would cause the following ICE:
```
thread 'rustc' panicked at 'uninitialized dyn_thread_safe mode!', /rustc/dfe0683138de0959b6ab6a039b54d9347f6a6355/compiler/rustc_data_structures/src/sync.rs:74:18
```
This calls `set_thread_safe_mode` in `interface::run_compiler` to avoid requiring it in the caller.
Fixes `tests/run-make-fulldeps/issue-19371` when parallel-compiler is enabled.
r? `@SparrowLii` cc https://github.com/rust-lang/rust/issues/75760
Previously, forgetting to call `interface::set_thread_safe_mode` would cause the following ICE:
```
thread 'rustc' panicked at 'uninitialized dyn_thread_safe mode!', /rustc/dfe0683138de0959b6ab6a039b54d9347f6a6355/compiler/rustc_data_structures/src/sync.rs:74:18
```
This calls `set_thread_safe_mode` in `interface::run_compiler` to avoid requiring it in the caller.
Fixes `tests/run-make-fulldeps/issue-19371` when parallel-compiler is enabled.
Because `Lrc<Box<T>>` is silly. (Clippy warns about `Rc<Box<T>>` and
`Arc<Box<T>>`, and it would warn here if (a) we used Clippy with rustc,
and (b) Clippy knew about `Lrc`.)
There's no need to store it in `Queries`. We can just use a local
variable, because it's always used shortly after it's produced.
The commit also removes the `tcx.analysis()` call in `ongoing_codegen`,
because it's easy to ensure that's done beforehand.
All this makes the dataflow within `run_compiler` easier to follow, at
the cost of making one test slightly more verbose, which I think is a
good tradeoff.
Write to stdout if `-` is given as output file
With this PR, if `-o -` or `--emit KIND=-` is provided, output will be written to stdout instead. Binary output (those of type `obj`, `llvm-bc`, `link` and `metadata`) being written this way will result in an error unless stdout is not a tty. Multiple output types going to stdout will trigger an error too, as they will all be mixded together.
This implements https://github.com/rust-lang/compiler-team/issues/431
The idea behind the changes is to introduce an `OutFileName` enum that represents the output - be it a real path or stdout - and to use this enum along the code paths that handle different output types.
If `-o -` or `--emit KIND=-` is provided, output will be written
to stdout instead. Binary output (`obj`, `llvm-bc`, `link` and
`metadata`) being written this way will result in an error unless
stdout is not a tty. Multiple output types going to stdout will
trigger an error too, as they will all be mixded together.