Remove `IdFunctor` trait.
It's defined in `rustc_data_structures` but is only used in
`rustc_type_ir`. The code is shorter and easier to read if we remove
this layer of abstraction and just do the things directly where they are
needed.
r? `@BoxyUwU`
It's defined in `rustc_data_structures` but is only used in
`rustc_type_ir`. The code is shorter and easier to read if we remove
this layer of abstraction and just do the things directly where they are
needed.
Extract parallel operations in `rustc_data_structures::sync` into a new `parallel` submodule
This extracts parallel operations in `rustc_data_structures::sync` into a new `parallel` submodule. This cuts down on the size of the large `cfg_if!` in `sync` and makes it easier to compare between serial and parallel variants.
Add optimized lock methods for `Sharded` and refactor `Lock`
This adds methods to `Sharded` which pick a shard and also locks it. These branch on parallelism just once instead of twice, improving performance.
Benchmark for `cfg(parallel_compiler)` and 1 thread:
<table><tr><td rowspan="2">Benchmark</td><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th></tr><tr><td align="right">Time</td><td align="right">Time</td><td align="right">%</th></tr><tr><td>🟣 <b>clap</b>:check</td><td align="right">1.6461s</td><td align="right">1.6345s</td><td align="right"> -0.70%</td></tr><tr><td>🟣 <b>hyper</b>:check</td><td align="right">0.2414s</td><td align="right">0.2394s</td><td align="right"> -0.83%</td></tr><tr><td>🟣 <b>regex</b>:check</td><td align="right">0.9205s</td><td align="right">0.9143s</td><td align="right"> -0.67%</td></tr><tr><td>🟣 <b>syn</b>:check</td><td align="right">1.4981s</td><td align="right">1.4869s</td><td align="right"> -0.75%</td></tr><tr><td>🟣 <b>syntex_syntax</b>:check</td><td align="right">5.7629s</td><td align="right">5.7256s</td><td align="right"> -0.65%</td></tr><tr><td>Total</td><td align="right">10.0690s</td><td align="right">10.0008s</td><td align="right"> -0.68%</td></tr><tr><td>Summary</td><td align="right">1.0000s</td><td align="right">0.9928s</td><td align="right"> -0.72%</td></tr></table>
cc `@SparrowLii`
Remove conditional use of `Sharded` from query state
`Sharded` is already a zero cost abstraction, so it shouldn't affect the performance of the single thread compiler if LLVM does its job.
r? `@cjgillot`
elaborate a bit on the (lack of) safety in 'Mmap::map'
Sadly none of the callers of this function even consider it worth mentioning in their unsafe block that what they are doing is completely unsound.
Fix races conditions with `SyntaxContext` decoding
This changes `SyntaxContext` decoding to work with concurrent decoding. The `remapped_ctxts` field now only stores `SyntaxContext` which have completed decoding, while the new `decoding` and `local_in_progress` keeps track of `SyntaxContext`s which are in process of being decoding and on which threads.
This fixes 2 issues with the current implementation. It can return an `SyntaxContext` which contains dummy data if another thread starts decoding before the first one has completely finished. Multiple threads could also allocate multiple `SyntaxContext`s for the same `raw_id`.
[rustc_data_structures][base_n][perf] Remove unnecessary utf8 check.
Since all output characters taken from `BASE_64` are valid UTF8 chars there is no need to waste cycles on validation.
Even though it's obviously a perf win, I've also used a [benchmark](https://gist.github.com/ttsugriy/e1e63c07927d8f31e71695a9c617bbf3) on M1 MacBook Air with following results:
```
Running benches/base_n_benchmark.rs (target/release/deps/base_n_benchmark-825fe5895b5c2693)
push_str/old time: [14.670 µs 14.852 µs 15.074 µs]
Found 11 outliers among 100 measurements (11.00%)
4 (4.00%) high mild
7 (7.00%) high severe
push_str/new time: [12.573 µs 12.674 µs 12.801 µs]
Found 11 outliers among 100 measurements (11.00%)
7 (7.00%) high mild
4 (4.00%) high severe
```