rustc_query_system: reduce dependency graph memory usage
This change implements, at a high level, two space optimizations to the dependency graph.
The first optimization is sharing graph data with the previous dependency graph. Whenever we intern a node, we know whether that node is new (not in the previous graph) or not, and if not, the color of the node in the previous graph.
Red and green nodes have their `DepNode` present in the previous graph, so for that piece of node data, we can just store the index of the node in the previous graph rather than duplicate the `DepNode`. Green nodes additionally have the the same result `Fingerprint`, so we can avoid duplicating that too. Finally, we distinguish between "light" and "dark" green nodes, where the latter are nodes that were marked green because all of their dependencies were marked green. These nodes can additionally share edges with the previous graph, because we know that their set of dependencies is the same (technically, light green and red nodes can have the same dependencies too, but we don't try to figure out whether or not that's the case).
Also, some effort is made to pack data tightly, and to avoid storing `DepNode`s as map keys more than once.
The second optimization is storing edges in a more compact representation, as in the `SerializedDepGraph`, that is, in a single vector, rather than one `EdgesVec` per node. An `EdgesVec` is a `SmallVec` with an inline buffer for 8 elements. Each `EdgesVec` is, at minimum, 40 bytes, and has a per-node overhead of up to 40 bytes. In the ideal case of exactly 8 edges, then 32 bytes are used for edges, and the overhead is 8 bytes. But most of the time, the overhead is higher.
In contrast, using a single vector to store all edges, and having each node specify its start and end elements as 4 byte indices into the vector has a constant overhead of 8 bytes--the best case scenario for the per-node `EdgesVec` approach.
The downside of this approach is that `EdgesVec`s built up during query execution have to be copied into the vector, whereas before, we could just take ownership over them. However, we mostly make up for this because the single vector representation enables a more efficient implementation of `DepGraph::serialize`.
BTreeMap: relax the explicit borrow rule to make code shorter and safer
Expressions like `.reborrow_mut().into_len_mut()` are annoyingly long, and kind of dangerous for the reason `reborrow_mut()` is unsafe. By relaxing the single rule, we no longer have to make an exception for functions with a `borrow` name and functions like `as_leaf_mut`. This is largely restoring the declaration style of the btree::node API about a year ago, but with more explanation and consistency.
r? `@Mark-Simulacrum`
Refactor dist tarballs generation
Before this PR, each tarball we ship as part of a release was generated by manually creating the directory structure and invoking `rust-installer generate`. This means each tarball was slightly different, adding new ones meant copy-pasting the code generating another tarball and removing the useless parts, and more importantly refactoring how tarballs are generated is extremely time-consuming.
This PR introduces a new abstraction in rustbuild, `Tarball`. The `Tarball` struct provides a trivial API to generate simple tarballs, and can get out of the way when more complex tarballs have to generate. For example, the whole code to generate the `build-manifest` tarball is now the following:
```rust
let tarball = Tarball::new(builder, "build-manifest", &self.target.triple);
tarball.add_file(&build_manifest, "bin", 0o755);
tarball.generate()
```
One notable change between the old tarballs and the new ones is that the "overlay" (README.md, COPYRIGHT, LICENSE-APACHE and LICENSE-MIT) is now available in every produced tarball, while before each tarball inconsistently had or didn't have those files. Tarballs that need a different overlay have a way to change which files to include (with the `set_overlay` method):
```rust
let mut tarball = Tarball::new(builder, "rustfmt", &target.triple);
tarball.set_overlay(OverlayKind::Rustfmt);
tarball.add_file(rustfmt, "bin", 0o755);
tarball.add_file(cargofmt, "bin", 0o755);
tarball.add_legal_and_readme_to("share/doc/rustfmt");
Some(tarball.generate())
```
The PR should be reviewed commit-by-commit, as each commit migrated a separate tarball to use `Tarball`. During development i made sure every tarball can still be generated, and for the most compex tarballs I manually ensured the list of files between the old and new tarballs did not have unexpected changes.
r? `@Mark-Simulacrum`
Clippy: Revert change from last sync
r? `@Manishearth`
cc `@ebroto`
Note that the commit e898015 doesn't exist like this in the Clippy repo. I didn't want to do a full sync, because this would've included at least one new lint, which I wanted to avoid a week before beta is branched. This just reverts one commit from the last sync.
Utilize PGO for rustc linux dist builds
This implements support for applying PGO to the rustc compilation step (not
standard library or any tooling, including rustdoc). Expanding PGO to more tools
is not terribly difficult but will involve more work and greater CI time
commitment.
For the same reason of avoiding greater implementation time commitment,
implementing for platforms outside of x86_64-unknown-linux-gnu is skipped.
In practice it should be quite simple to extend over time to more platforms. The
initial implementation is intentionally minimal here to avoid too much work
investment before we start seeing wins for a subset of Rust users.
The choice of workloads to profile here is somewhat arbitrary, but the general
rationale was to aim for a small set that largely avoided time regressions on
perf.rust-lang.org's full suite of crates. The set chosen is libcore, cargo (and
its dependencies), and a few ad-hoc stress tests from perf.rlo. The stress tests
are arguably the most controversial, but they benefit those cases (avoiding
regressions) and do not really remove wins from other benchmarks.
The primary next step after this PR lands is to implement support for PGO in
LLVM. It is unclear whether we can afford a full LLVM rebuild in CI, though, so
the approach taken there may need to be more staggered. rustc-only PGO seems
well affordable on linux at least, giving us up to 20% wall time wins on some
crates for 15 minutes of extra CI time (1 hour with this PR, up from 45 minutes).
The PGO data is uploaded to allow others to reuse it if attempting to reproduce
the CI build or potentially, in the future, on other platforms where an
off-by-one strategy is used for dist builds at minimal performance cost.
r? `@michaelwoerister` (but tell me if you don't want to / don't feel comfortable approving and we can find others)
Deprecate atomic compare_and_swap method
Finish implementing [RFC 1443](https://github.com/rust-lang/rfcs/blob/master/text/1443-extended-compare-and-swap.md) (https://github.com/rust-lang/rfcs/pull/1443).
It was decided to deprecate `compare_and_swap` [back in Rust 1.12 already](https://github.com/rust-lang/rust/issues/31767#issuecomment-215903038). I can't find any info about that decision being reverted. My understanding is just that it has been forgotten. If there has been a decision on keeping `compare_and_swap` then it's hard to find, and even if this PR does not go through it can act as a place where people can find out about the decision being reverted.
Atomic operations are hard to understand, very hard. And it does not help that there are multiple similar methods to do compare and swap with. They are so similar that for a reader it might be hard to understand the difference. This PR aims to make that simpler by finally deprecating `compare_and_swap` which is essentially just a more limited version of `compare_exchange`. The documentation is also updated (according to the RFC text) to explain the differences a bit better.
Even if we decide to not deprecate `compare_and_swap`. I still think the documentation for the atomic operations should be improved to better describe their differences and similarities. And the documentation can be written nicer than the PR currently proposes, but I wanted to start somewhere. Most of it is just copied from the RFC.
The documentation for `compare_exchange` and `compare_exchange_weak` indeed describe how they work! The problem is that they are more complex and harder to understand than `compare_and_swap`. So for someone who does not fully grasp this they might fall back to using `compare_and_swap`. Making the documentation outline the similarities and differences might build a bridge for people so they can cross over to the more powerful and sometimes more efficient operations.
The conversions I do to avoid the `std` internal deprecation errors are very straight forward `compare_and_swap -> compare_exchange` changes where the orderings are just using the mapping in the new documentation. Only in one place did I use `compare_exchange_weak`. This can probably be improved further. But the goal here was not for those operations to be perfect. Just to not get worse and to allow the deprecation to happen.
Remove `DefPath` from `Visibility` and calculate it on demand
Depends on #80090 and should not be merged before. Helps with https://github.com/rust-lang/rust/issues/79103 and https://github.com/rust-lang/rust/issues/76382.
cc https://github.com/rust-lang/rust/pull/80014#issuecomment-746810284 - `@nnethercote` I figured it out! It was simpler than I expected :)
This brings the size of `clean::Visibility` down from 40 bytes to 8.
Note that this does *not* remove `clean::Visibility`, even though it's now basically the same as `ty::Visibility`, because the `Invsible` variant means something different from `Inherited` and I thought it would be be confusing to merge the two. See the new comments on `impl Clean for ty::Visibility` for details.
Turn helper method into a closure
`replace_prefix` is currently implemented as a method but has no real relation
to the struct it is implemented on. Turn it into a closure and move it into the
only method from which it is called.
`@rustbot` modify labels +C-cleanup +T-compiler
r? `@lcnr`
Update books
## nomicon
2 commits in d8383b65f7948c2ca19191b3b4bd709b403aaf45..a5a48441d411f61556b57d762b03d6874afe575d
2020-11-22 10:24:42 -0500 to 2020-12-06 10:39:41 +0900
- Update atomics.md (rust-lang/nomicon#249)
- Rename `AllocRef` to `Allocator` and `(de)alloc` to `(de)allocate` (rust-lang/nomicon#248)
## reference
2 commits in a8afdca5d0715b2257b6f8b9a032fd4dd7dae855..b278478b766178491a8b6f67afa4bcd6b64d977a
2020-11-30 06:44:46 -0800 to 2020-12-21 18:18:03 -0800
- Update unions for safe ManuallyDrop assignment. (rust-lang/reference#912)
- Removing ambiguity in type-layout.md (rust-lang/reference#911)
## book
25 commits in a190438d77d28041f24da4f6592e287fab073a61..5bb44f8b5b0aa105c8b22602e9b18800484afa21
2020-11-16 10:44:08 -0600 to 2020-12-18 20:07:31 -0500
- Make some further edits to rust-lang/book#2447
- Merge remote-tracking branch 'origin/pr/2447'
- Remove copied and dangling link brackets
- Merge remote-tracking branch 'origin/pr/2359'
- Override toolchain to nightly for run lints action. (rust-lang/book#2528)
- Remove an uneeded 'static lifetime (rust-lang/book#1752)
- Fixesrust-lang/book#2330. Clarify why the lock is held too long
- Update paragraph about rustfmt in Chapter 1.2 (rust-lang/book#2304)
- Clarify language around further from rust-lang/book#2418
- Merge remote-tracking branch 'origin/pr/2418'
- Merge remote-tracking branch 'origin/pr/2475'
- Add some further edits to rust-lang/book#2433
- Merge remote-tracking branch 'origin/pr/2433'
- Note all the method families to handle integer overflow
- Merge remote-tracking branch 'origin/pr/2405'
- Fixrust-lang/book#1855 - incorporate new reference cycle diagram
- Make some further edits to the changes in rust-lang/book#1886
- Merge remote-tracking branch 'origin/pr/1886'
- Make some further edits to rust-lang/book#1998
- Merge remote-tracking branch 'origin/pr/1998'
- Update Rust version and output (rust-lang/book#2518)
- Fix typo, regarding privileged ports being up to 1023 instead of 1024 (rust-lang/book#2509)
- Change "appendixes" to "appendices" in intro. (rust-lang/book#2498)
- Update 16-11 to use method call expression for `clone` (rust-lang/book#2511)
- Correct chapter 20 final listing (rust-lang/book#2516)
## rust-by-example
7 commits in 236c734a2cb323541b3394f98682cb981b9ec086..1cce0737d6a7d3ceafb139b4a206861fb1dcb2ab
2020-11-30 14:05:49 -0300 to 2020-12-21 17:36:29 -0300
- Add book.description in book.toml (rust-lang/rust-by-example#1397)
- Simplify the call of filter_map (rust-lang/rust-by-example#1396)
- Update README.md (rust-lang/rust-by-example#1382)
- Add missing main function in static life time example. (rust-lang/rust-by-example#1383)
- Clarify first matching arm and all possible values (rust-lang/rust-by-example#1395)
- Clarify distinction between for iter and into_iter (rust-lang/rust-by-example#1394)
- Drop extern crate (rust-lang/rust-by-example#1393)
rustc_span: Provide a reserved identifier check for a specific edition
while keeping edition evaluation lazy because it may be expensive.
Needed for https://github.com/rust-lang/rust/pull/80226.
Remove redundant test
Remove ignored test. This test can also be found at src/test/rustdoc-ui/intra-doc/double-anchor.rs and the second version isn't ignored.
r? ``@jyn514``
Remove `I-prioritize` from Zulip topic
It doesn't add anything since every topic in
`t-compiler/wg-prioritization/alerts` is about prioritization.
And it makes it harder to see the issue title, which is what the topic
is actually about.
cc ``@rust-lang/wg-prioritization``
Add module-level docs to rustc_middle::ty
I thought it would be nice to point out `Ty` and `TyCtxt` on the module page, and link out to the [rustc-dev-guide chapter](https://rustc-dev-guide.rust-lang.org/ty.html).