Commit Graph

695 Commits

Author SHA1 Message Date
Tomasz Miąsko
b61a6d59e4 Remove unused dominator iterator 2023-10-10 21:39:59 +02:00
Tomasz Miąsko
0528d378b6 Optimize dominators for small path graphs
Generalizes the small dominators approach from #107449.
2023-10-05 23:45:59 +02:00
Tomasz Miąsko
ba694e301c Remove redundant Dominators::start_node field 2023-10-05 23:45:58 +02:00
Tomasz Miąsko
a8ec7ddf0e Test immediate dominators using public API 2023-10-05 23:45:58 +02:00
John Kåre Alsaker
2c507cae36 Rename cold_path to outline 2023-09-25 22:54:07 +02:00
Florian Schmiderer
3409ca65d8 Add OwnedTargetMachine to manage llvm:TargetMachine. Uses pointers
instead of &'static mut and provides safe interface to create/dispose
it.
2023-09-24 21:11:37 +02:00
bors
f73d376fb6 Auto merge of #115230 - Vtewari2311:mod-hurd-latest, r=b-naber
added support for GNU/Hurd

adding support for i686-unknown-hurd-gnu
2023-09-21 19:24:01 +00:00
Samuel Thibault
dcea7709f2 added support for GNU/Hurd 2023-09-21 17:31:25 +02:00
Ralf Jung
57444cf9f3 use pretty_print_const_value from MIR constant 'extra' printing 2023-09-19 11:06:32 +02:00
Zalathar
01b67f4b26 coverage: Simplify sorting of coverage spans extracted from MIR
Switching to `Ordering::then_with` makes control-flow less complicated, and
there is no need to use `partial_cmp` here.
2023-09-18 23:15:25 +10:00
Matthias Krüger
f3cc59b741
Rollup merge of #115548 - Zoxc:parallel-extract, r=wesleywiser
Extract parallel operations in `rustc_data_structures::sync` into a new `parallel` submodule

This extracts parallel operations in `rustc_data_structures::sync` into a new `parallel` submodule. This cuts down on the size of the large `cfg_if!` in `sync` and makes it easier to compare between serial and parallel variants.
2023-09-11 21:16:20 +02:00
bors
9b72cc9abf Auto merge of #115388 - Zoxc:sharded-lock, r=SparrowLii
Add optimized lock methods for `Sharded` and refactor `Lock`

This adds methods to `Sharded` which pick a shard and also locks it. These branch on parallelism just once instead of twice, improving performance.

Benchmark for `cfg(parallel_compiler)` and 1 thread:
<table><tr><td rowspan="2">Benchmark</td><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th></tr><tr><td align="right">Time</td><td align="right">Time</td><td align="right">%</th></tr><tr><td>🟣 <b>clap</b>:check</td><td align="right">1.6461s</td><td align="right">1.6345s</td><td align="right"> -0.70%</td></tr><tr><td>🟣 <b>hyper</b>:check</td><td align="right">0.2414s</td><td align="right">0.2394s</td><td align="right"> -0.83%</td></tr><tr><td>🟣 <b>regex</b>:check</td><td align="right">0.9205s</td><td align="right">0.9143s</td><td align="right"> -0.67%</td></tr><tr><td>🟣 <b>syn</b>:check</td><td align="right">1.4981s</td><td align="right">1.4869s</td><td align="right"> -0.75%</td></tr><tr><td>🟣 <b>syntex_syntax</b>:check</td><td align="right">5.7629s</td><td align="right">5.7256s</td><td align="right"> -0.65%</td></tr><tr><td>Total</td><td align="right">10.0690s</td><td align="right">10.0008s</td><td align="right"> -0.68%</td></tr><tr><td>Summary</td><td align="right">1.0000s</td><td align="right">0.9928s</td><td align="right"> -0.72%</td></tr></table>

cc `@SparrowLii`
2023-09-11 01:43:29 +00:00
John Kåre Alsaker
1d8fdc0332 Use FreezeLock for CStore 2023-09-09 16:02:11 +02:00
John Kåre Alsaker
c83eba9251 Add Freeze::clone 2023-09-08 14:14:57 +02:00
John Kåre Alsaker
9690142bef Remove the LockMode enum and dispatch 2023-09-08 10:15:12 +02:00
John Kåre Alsaker
61cc00d238 Refactor Lock implementation 2023-09-08 08:48:44 +02:00
John Kåre Alsaker
8fc160b742 Add optimized lock methods for Sharded 2023-09-08 08:48:44 +02:00
John Kåre Alsaker
f49382c050 Use Freeze for SourceFile.lines 2023-09-07 13:05:05 +02:00
John Kåre Alsaker
c5996b80be Use Freeze for SourceFile.external_src 2023-09-07 13:04:23 +02:00
John Kåre Alsaker
35e8b67fc2 Use a reference to the lock in the guards 2023-09-06 11:44:06 +02:00
John Kåre Alsaker
fe614712dd Extract parallel operations in rustc_data_structures::sync into a new parallel submodule 2023-09-06 09:47:34 +02:00
John Kåre Alsaker
a3ad045ea4 Rename Freeze to FreezeLock 2023-09-02 08:14:06 +02:00
John Kåre Alsaker
50f0d666d3 Add some comments 2023-09-02 08:13:07 +02:00
John Kåre Alsaker
25884ec9be Use RwLock for Freeze 2023-09-02 08:13:07 +02:00
John Kåre Alsaker
0c96a9260b Add Freeze type and use it to store Definitions 2023-09-02 08:13:03 +02:00
John Kåre Alsaker
90f5f94699 Use OnceLock for SingleCache 2023-09-01 03:11:51 +02:00
John Kåre Alsaker
c303c8abdd Use a parallel_guard function to handle the parallel guard 2023-08-30 18:17:38 +02:00
John Kåre Alsaker
d36393b839 Use Mutex to avoid issue with conditional locks 2023-08-30 18:17:38 +02:00
John Kåre Alsaker
b56acac41d Add ParallelGuard type to handle unwinding in parallel sections 2023-08-30 18:13:06 +02:00
John Kåre Alsaker
51eb8427bf Make parallel! an expression 2023-08-30 18:12:18 +02:00
John Kåre Alsaker
5739349e96 Use conditional synchronization for Lock 2023-08-30 06:10:02 +02:00
bors
6d32b298ed Auto merge of #114894 - Zoxc:sharded-cfg-cleanup2, r=cjgillot
Remove conditional use of `Sharded` from query state

`Sharded` is already a zero cost abstraction, so it shouldn't affect the performance of the single thread compiler if LLVM does its job.

r? `@cjgillot`
2023-08-29 12:04:37 +00:00
klensy
3b26b3d1d2 don't use SnapshotVec in Graph implementation, as it looks unused; use Vec instead 2023-08-28 18:59:55 +03:00
Weihang Lo
832fb9c072
Rollup merge of #114987 - RalfJung:unsound-mmap, r=cjgillot
elaborate a bit on the (lack of) safety in 'Mmap::map'

Sadly none of the callers of this function even consider it worth mentioning in their unsafe block that what they are doing is completely unsound.
2023-08-24 22:53:57 +01:00
John Kåre Alsaker
f458b112f8 Optimize lock_shards 2023-08-24 23:29:48 +02:00
bors
b60e31b673 Auto merge of #115082 - Zoxc:syntax-context-decode-race, r=cjgillot
Fix races conditions with `SyntaxContext` decoding

This changes `SyntaxContext` decoding to work with concurrent decoding. The `remapped_ctxts` field now only stores `SyntaxContext` which have completed decoding, while the new `decoding` and `local_in_progress` keeps track of `SyntaxContext`s which are in process of being decoding and on which threads.

This fixes 2 issues with the current implementation. It can return an `SyntaxContext` which contains dummy data if another thread starts decoding before the first one has completely finished. Multiple threads could also allocate multiple `SyntaxContext`s for the same `raw_id`.
2023-08-24 17:43:02 +00:00
bors
8a6b67f988 Auto merge of #115094 - Mark-Simulacrum:bootstrap-update, r=ozkanonur
Update bootstrap compiler to 1.73.0 beta
2023-08-24 11:10:52 +00:00
Mark Rousskov
0a916062aa Bump cfg(bootstrap) 2023-08-23 20:05:14 -04:00
John Kåre Alsaker
e41240e45b Fix races conditions with SyntaxContext decoding 2023-08-22 02:36:41 +02:00
Ralf Jung
132a2c6cf2 elaborate a bit on the (lack of) safety in 'Mmap::map' 2023-08-19 12:50:26 +02:00
John Kåre Alsaker
0823f0c32b Remove count 2023-08-16 10:44:32 +02:00
John Kåre Alsaker
81220c0ace Keep SHARDS fixed instead of a function of cfg!(parallel_compiler) 2023-08-16 10:00:25 +02:00
John Kåre Alsaker
c737c62e70 Make Sharded an enum and specialize it for the single thread case 2023-08-15 17:22:48 +02:00
bors
617821ab32 Auto merge of #114339 - ttsugriy:unsafe-utf8, r=davidtwco
[rustc_data_structures][base_n][perf] Remove unnecessary utf8 check.

Since all output characters taken from `BASE_64` are valid UTF8 chars there is no need to waste cycles on validation.

Even though it's obviously a perf win, I've also used a [benchmark](https://gist.github.com/ttsugriy/e1e63c07927d8f31e71695a9c617bbf3) on M1 MacBook Air with following results:
```
Running benches/base_n_benchmark.rs (target/release/deps/base_n_benchmark-825fe5895b5c2693)
push_str/old            time:   [14.670 µs 14.852 µs 15.074 µs]
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
push_str/new            time:   [12.573 µs 12.674 µs 12.801 µs]
Found 11 outliers among 100 measurements (11.00%)
  7 (7.00%) high mild
  4 (4.00%) high severe
```
2023-08-08 10:25:37 +00:00
Matthias Krüger
4669905ee6
Rollup merge of #114418 - klensy:parking_lot, r=oli-obk
bump parking_lot to 0.12

Bumps parking_lot to 0.12, replaces few explicit uses of parking_lot with rustc_data_structures::sync ones.

<strike>cc `@oli-obk` this touches recent https://github.com/rust-lang/rust/pull/114283</strike>
cc `@SparrowLii` i've checked that this builds with parallel-compiler

measureme's bump https://github.com/rust-lang/measureme/pull/209
fcf3006e01/compiler/rustc_data_structures/src/sync.rs (L18-L34)
2023-08-04 21:31:56 +02:00
klensy
383b715163 bump parking_lot 0.11 to 0.12 2023-08-03 16:05:26 +03:00
Nilstrieb
5830ca216d Add internal_features lint
It lints against features that are inteded to be internal to the
compiler and standard library. Implements MCP #596.

We allow `internal_features` in the standard library and compiler as those
use many features and this _is_ the standard library from the "internal to the compiler and
standard library" after all.

Marking some features as internal wasn't exactly the most scientific approach, I just marked some
mostly obvious features. While there is a categorization in the macro,
it's not very well upheld (should probably be fixed in another PR).

We always pass `-Ainternal_features` in the testsuite
About 400 UI tests and several other tests use internal features.
Instead of throwing the attribute on each one, just always allow them.
There's nothing wrong with testing internal features^^
2023-08-03 14:50:50 +02:00
Taras Tsugrii
64dfd1090d [rustc_data_structures][base_n][perf] Remove unnecessary utf8 check.
Since all output characters taken from `BASE_64` are valid UTF8 chars
there is no need to waste cycles on validation.

Even though it's obviously a perf win, I've also used a [benchmark](https://gist.github.com/ttsugriy/e1e63c07927d8f31e71695a9c617bbf3)
on M1 MacBook Air with following results:
```
Running benches/base_n_benchmark.rs (target/release/deps/base_n_benchmark-825fe5895b5c2693)
push_str/old            time:   [14.670 µs 14.852 µs 15.074 µs]
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
push_str/new            time:   [12.573 µs 12.674 µs 12.801 µs]
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  7 (7.00%) high mild
  4 (4.00%) high severe
```
2023-08-01 11:10:17 -07:00
Matthias Krüger
00ad3ccae6
Rollup merge of #114306 - ttsugriy:push_str, r=wesleywiser
[rustc_data_structures][perf] Simplify base_n::push_str.

This minor change removes the need to reverse resulting digits. Since reverse is O(|digit_num|) but bounded by 128, it's unlikely to be a noticeable in practice. At the same time, this code is also a 1 line shorter, so combined with tiny perf win, why not?

I ran https://gist.github.com/ttsugriy/ed14860ef597ab315d4129d5f8adb191 on M1 macbook air and got a small improvement
```
Running benches/base_n_benchmark.rs (target/release/deps/base_n_benchmark-825fe5895b5c2693)
push_str/old            time:   [14.180 µs 14.313 µs 14.462 µs]
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe
push_str/new            time:   [13.741 µs 13.839 µs 13.973 µs]
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
```
2023-08-01 17:39:12 +02:00
Matthias Krüger
b38718090e
Rollup merge of #114283 - oli-obk:parkin_lot_rwlock, r=SparrowLii
Use parking lot's rwlock even without parallel-rustc

Considering that this doesn't affect perf, I think we should use the simplest solution.
2023-08-01 17:39:10 +02:00