Utilize PGO for rustc linux dist builds
This implements support for applying PGO to the rustc compilation step (not
standard library or any tooling, including rustdoc). Expanding PGO to more tools
is not terribly difficult but will involve more work and greater CI time
commitment.
For the same reason of avoiding greater implementation time commitment,
implementing for platforms outside of x86_64-unknown-linux-gnu is skipped.
In practice it should be quite simple to extend over time to more platforms. The
initial implementation is intentionally minimal here to avoid too much work
investment before we start seeing wins for a subset of Rust users.
The choice of workloads to profile here is somewhat arbitrary, but the general
rationale was to aim for a small set that largely avoided time regressions on
perf.rust-lang.org's full suite of crates. The set chosen is libcore, cargo (and
its dependencies), and a few ad-hoc stress tests from perf.rlo. The stress tests
are arguably the most controversial, but they benefit those cases (avoiding
regressions) and do not really remove wins from other benchmarks.
The primary next step after this PR lands is to implement support for PGO in
LLVM. It is unclear whether we can afford a full LLVM rebuild in CI, though, so
the approach taken there may need to be more staggered. rustc-only PGO seems
well affordable on linux at least, giving us up to 20% wall time wins on some
crates for 15 minutes of extra CI time (1 hour with this PR, up from 45 minutes).
The PGO data is uploaded to allow others to reuse it if attempting to reproduce
the CI build or potentially, in the future, on other platforms where an
off-by-one strategy is used for dist builds at minimal performance cost.
r? `@michaelwoerister` (but tell me if you don't want to / don't feel comfortable approving and we can find others)
Deprecate atomic compare_and_swap method
Finish implementing [RFC 1443](https://github.com/rust-lang/rfcs/blob/master/text/1443-extended-compare-and-swap.md) (https://github.com/rust-lang/rfcs/pull/1443).
It was decided to deprecate `compare_and_swap` [back in Rust 1.12 already](https://github.com/rust-lang/rust/issues/31767#issuecomment-215903038). I can't find any info about that decision being reverted. My understanding is just that it has been forgotten. If there has been a decision on keeping `compare_and_swap` then it's hard to find, and even if this PR does not go through it can act as a place where people can find out about the decision being reverted.
Atomic operations are hard to understand, very hard. And it does not help that there are multiple similar methods to do compare and swap with. They are so similar that for a reader it might be hard to understand the difference. This PR aims to make that simpler by finally deprecating `compare_and_swap` which is essentially just a more limited version of `compare_exchange`. The documentation is also updated (according to the RFC text) to explain the differences a bit better.
Even if we decide to not deprecate `compare_and_swap`. I still think the documentation for the atomic operations should be improved to better describe their differences and similarities. And the documentation can be written nicer than the PR currently proposes, but I wanted to start somewhere. Most of it is just copied from the RFC.
The documentation for `compare_exchange` and `compare_exchange_weak` indeed describe how they work! The problem is that they are more complex and harder to understand than `compare_and_swap`. So for someone who does not fully grasp this they might fall back to using `compare_and_swap`. Making the documentation outline the similarities and differences might build a bridge for people so they can cross over to the more powerful and sometimes more efficient operations.
The conversions I do to avoid the `std` internal deprecation errors are very straight forward `compare_and_swap -> compare_exchange` changes where the orderings are just using the mapping in the new documentation. Only in one place did I use `compare_exchange_weak`. This can probably be improved further. But the goal here was not for those operations to be perfect. Just to not get worse and to allow the deprecation to happen.
Remove `DefPath` from `Visibility` and calculate it on demand
Depends on #80090 and should not be merged before. Helps with https://github.com/rust-lang/rust/issues/79103 and https://github.com/rust-lang/rust/issues/76382.
cc https://github.com/rust-lang/rust/pull/80014#issuecomment-746810284 - `@nnethercote` I figured it out! It was simpler than I expected :)
This brings the size of `clean::Visibility` down from 40 bytes to 8.
Note that this does *not* remove `clean::Visibility`, even though it's now basically the same as `ty::Visibility`, because the `Invsible` variant means something different from `Inherited` and I thought it would be be confusing to merge the two. See the new comments on `impl Clean for ty::Visibility` for details.
Turn helper method into a closure
`replace_prefix` is currently implemented as a method but has no real relation
to the struct it is implemented on. Turn it into a closure and move it into the
only method from which it is called.
`@rustbot` modify labels +C-cleanup +T-compiler
r? `@lcnr`
Update books
## nomicon
2 commits in d8383b65f7948c2ca19191b3b4bd709b403aaf45..a5a48441d411f61556b57d762b03d6874afe575d
2020-11-22 10:24:42 -0500 to 2020-12-06 10:39:41 +0900
- Update atomics.md (rust-lang/nomicon#249)
- Rename `AllocRef` to `Allocator` and `(de)alloc` to `(de)allocate` (rust-lang/nomicon#248)
## reference
2 commits in a8afdca5d0715b2257b6f8b9a032fd4dd7dae855..b278478b766178491a8b6f67afa4bcd6b64d977a
2020-11-30 06:44:46 -0800 to 2020-12-21 18:18:03 -0800
- Update unions for safe ManuallyDrop assignment. (rust-lang/reference#912)
- Removing ambiguity in type-layout.md (rust-lang/reference#911)
## book
25 commits in a190438d77d28041f24da4f6592e287fab073a61..5bb44f8b5b0aa105c8b22602e9b18800484afa21
2020-11-16 10:44:08 -0600 to 2020-12-18 20:07:31 -0500
- Make some further edits to rust-lang/book#2447
- Merge remote-tracking branch 'origin/pr/2447'
- Remove copied and dangling link brackets
- Merge remote-tracking branch 'origin/pr/2359'
- Override toolchain to nightly for run lints action. (rust-lang/book#2528)
- Remove an uneeded 'static lifetime (rust-lang/book#1752)
- Fixesrust-lang/book#2330. Clarify why the lock is held too long
- Update paragraph about rustfmt in Chapter 1.2 (rust-lang/book#2304)
- Clarify language around further from rust-lang/book#2418
- Merge remote-tracking branch 'origin/pr/2418'
- Merge remote-tracking branch 'origin/pr/2475'
- Add some further edits to rust-lang/book#2433
- Merge remote-tracking branch 'origin/pr/2433'
- Note all the method families to handle integer overflow
- Merge remote-tracking branch 'origin/pr/2405'
- Fixrust-lang/book#1855 - incorporate new reference cycle diagram
- Make some further edits to the changes in rust-lang/book#1886
- Merge remote-tracking branch 'origin/pr/1886'
- Make some further edits to rust-lang/book#1998
- Merge remote-tracking branch 'origin/pr/1998'
- Update Rust version and output (rust-lang/book#2518)
- Fix typo, regarding privileged ports being up to 1023 instead of 1024 (rust-lang/book#2509)
- Change "appendixes" to "appendices" in intro. (rust-lang/book#2498)
- Update 16-11 to use method call expression for `clone` (rust-lang/book#2511)
- Correct chapter 20 final listing (rust-lang/book#2516)
## rust-by-example
7 commits in 236c734a2cb323541b3394f98682cb981b9ec086..1cce0737d6a7d3ceafb139b4a206861fb1dcb2ab
2020-11-30 14:05:49 -0300 to 2020-12-21 17:36:29 -0300
- Add book.description in book.toml (rust-lang/rust-by-example#1397)
- Simplify the call of filter_map (rust-lang/rust-by-example#1396)
- Update README.md (rust-lang/rust-by-example#1382)
- Add missing main function in static life time example. (rust-lang/rust-by-example#1383)
- Clarify first matching arm and all possible values (rust-lang/rust-by-example#1395)
- Clarify distinction between for iter and into_iter (rust-lang/rust-by-example#1394)
- Drop extern crate (rust-lang/rust-by-example#1393)
rustc_span: Provide a reserved identifier check for a specific edition
while keeping edition evaluation lazy because it may be expensive.
Needed for https://github.com/rust-lang/rust/pull/80226.
Remove redundant test
Remove ignored test. This test can also be found at src/test/rustdoc-ui/intra-doc/double-anchor.rs and the second version isn't ignored.
r? ``@jyn514``
Remove `I-prioritize` from Zulip topic
It doesn't add anything since every topic in
`t-compiler/wg-prioritization/alerts` is about prioritization.
And it makes it harder to see the issue title, which is what the topic
is actually about.
cc ``@rust-lang/wg-prioritization``
Add module-level docs to rustc_middle::ty
I thought it would be nice to point out `Ty` and `TyCtxt` on the module page, and link out to the [rustc-dev-guide chapter](https://rustc-dev-guide.rust-lang.org/ty.html).
Fix labels for 'Library Tracking Issue' template
Each label needs to be separated by a comma (see the ICE issue template
for an example of correct usage).
r? `````@m-ou-se`````
docs: Edit rustc_middle::ty::query::on_disk_cache
Expand abbreviations for "incremental compliation".
Also added the word "to" to the description of CacheEncoder.
Clarify constructor splitting in exhaustiveness checking
I reworked the explanation of the algorithm completely to make it properly account for the various extensions we've added. This includes constructor splitting, which was previously not clearly included in the algorithm. This makes wildcards less magical; I added some detailed examples; and this distinguishes clearly between constructors that only make sense in patterns (like ranges) and those that make sense for values (like `Some`). This reformulation had been floating around in my mind for a while, and I'm quite happy with how it turned out. Let me know how you feel about it.
I also factored out all three cases of splitting (wildcards, ranges and slices) into dedicated structs to encapsulate the complicated bits.
I measured no perf impact but I don't trust my local measurements for refactors since https://github.com/rust-lang/rust/pull/79284.
r? `@varkor`
`@rustbot` modify labels: +A-exhaustiveness-checking
rustc_query_system: explicitly register reused dep nodes
Register nodes that we've reused from the previous session explicitly
with `OnDiskCache`. Previously, we relied on this happening as a side
effect of accessing the nodes in the `PreviousDepGraph`. For the sake of
performance and avoiding unintended side effects, register explictily.
This implements support for applying PGO to the rustc compilation step (not
standard library or any tooling, including rustdoc). Expanding PGO to more tools
is not terribly difficult but will involve more work and greater CI time
commitment.
For the same reason of avoiding greater time commitment, this currently avoids
implementing for platforms outside of x86_64-unknown-linux-gnu, though in
practice it should be quite simple to extend over time to more platforms. The
initial implementation is intentionally minimal here to avoid too much work
investment before we start seeing wins for a subset of Rust users.
The choice of workloads to profile here is somewhat arbitrary, but the general
rationale was to aim for a small set that largely avoided time regressions on
perf.rust-lang.org's full suite of crates. The set chosen is libcore, cargo (and
its dependencies), and a few ad-hoc stress tests from perf.rlo. The stress tests
are arguably the most controversial, but they benefit those cases (avoiding
regressions) and do not really remove wins from other benchmarks.
The primary next step after this PR lands is to implement support for PGO in
LLVM. It is unclear whether we can afford a full LLVM rebuild in CI, though, so
the approach taken there may need to be more staggered. rustc-only PGO seems
well affordable on linux at least, giving us up to 20% wall time wins on some
crates for 15 minutes of extra CI time (1 hour up from 45 minutes).
The PGO data is uploaded to allow others to reuse it if attempting to reproduce
the CI build or potentially, in the future, on other platforms where an
off-by-one strategy is used for dist builds at minimal performance cost.