Change memory ordering in System wrapper example
Currently, the `SeqCst` ordering is used, which seems unnecessary:
+ Even `Relaxed` ordering guarantees that all updates are atomic and are executed in total order
+ User code only reads atomic for monitoring purposes, no "happens-before" relationships with actual allocations and deallocations are needed for this
If argumentation above is correct, I propose changing ordering to `Relaxed` to clarify that no synchronization is required here, and improve performance (if somebody copy-pastes this example into their code).
Correct `std::prelude` comment
(Read the changed file first for context.)
First, `alloc` has no prelude.
Second, the docs for `v1` don't matter since the [prelude module] already has all the doc links. The `rust_2021` module for instance also doesnt have a convenient doc page. However as I understand glob imports still cant be used because the items dont have the same stabilisation versions.
[prelude module]: https://doc.rust-lang.org/std/prelude/index.html
docs(std): clarify remove_dir_all errors
When using `remove_dir_all`, I assumed that the function was idempotent and that I could always call it to remove a directory if it existed. That's not the case and it bit me in production, so I figured I'd submit this to clarify the docs.
Fixes https://github.com/rust-lang/rust/issues/110912
Checking `flavor == RlibFlavor::Normal` was accidentally lost in
601fc8b36bhttps://github.com/rust-lang/rust/pull/105601
That caused combining +whole-archive and +bundle link modifiers on
non-rlib crates to fail with a confusing error message saying that
combination is unstable for rlibs. In particular, this caused the
build to fail when +whole-archive was used on staticlib crates, even
though +whole-archive effectively does nothing on non-bin crates because
the final linker invocation is left to an external build system.
Move the WorkerLocal type from the rustc-rayon fork into rustc_data_structures
This PR moves the definition of the `WorkerLocal` type from `rustc-rayon` into `rustc_data_structures`. This is enabled by the introduction of the `Registry` type which allows you to group up threads to be used by `WorkerLocal` which is basically just an array with an per thread index. The `Registry` type mirrors the one in Rayon and each Rayon worker thread is also registered with the new `Registry`. Safety for `WorkerLocal` is ensured by having it keep a reference to the registry and checking on each access that we're still on the group of threads associated with the registry used to construct it.
Accessing a `WorkerLocal` is micro-optimized due to it being hot since it's used for most arena allocations.
Performance is slightly improved for the parallel compiler:
<table><tr><td rowspan="2">Benchmark</td><td colspan="1"><b>Before</b></th><td colspan="2"><b>After</b></th></tr><tr><td align="right">Time</td><td align="right">Time</td><td align="right">%</th></tr><tr><td>🟣 <b>clap</b>:check</td><td align="right">1.9992s</td><td align="right">1.9949s</td><td align="right"> -0.21%</td></tr><tr><td>🟣 <b>hyper</b>:check</td><td align="right">0.2977s</td><td align="right">0.2970s</td><td align="right"> -0.22%</td></tr><tr><td>🟣 <b>regex</b>:check</td><td align="right">1.1335s</td><td align="right">1.1315s</td><td align="right"> -0.18%</td></tr><tr><td>🟣 <b>syn</b>:check</td><td align="right">1.8235s</td><td align="right">1.8171s</td><td align="right"> -0.35%</td></tr><tr><td>🟣 <b>syntex_syntax</b>:check</td><td align="right">6.9047s</td><td align="right">6.8930s</td><td align="right"> -0.17%</td></tr><tr><td>Total</td><td align="right">12.1586s</td><td align="right">12.1336s</td><td align="right"> -0.21%</td></tr><tr><td>Summary</td><td align="right">1.0000s</td><td align="right">0.9977s</td><td align="right"> -0.23%</td></tr></table>
cc `@SparrowLii`