Use llvm::computeLTOCacheKey to determine post-ThinLTO CGU reuse
During incremental ThinLTO compilation, we attempt to re-use the
optimized (post-ThinLTO) bitcode file for a module if it is 'safe' to do
so.
Up until now, 'safe' has meant that the set of modules that our current
modules imports from/exports to is unchanged from the previous
compilation session. See PR #67020 and PR #71131 for more details.
However, this turns out be insufficient to guarantee that it's safe
to reuse the post-LTO module (i.e. that optimizing the pre-LTO module
would produce the same result). When LLVM optimizes a module during
ThinLTO, it may look at other information from the 'module index', such
as whether a (non-imported!) global variable is used. If this
information changes between compilation runs, we may end up re-using an
optimized module that (for example) had dead-code elimination run on a
function that is now used by another module.
Fortunately, LLVM implements its own ThinLTO module cache, which is used
when ThinLTO is performed by a linker plugin (e.g. when clang is used to
compile a C proect). Using this cache directly would require extensive
refactoring of our code - but fortunately for us, LLVM provides a
function that does exactly what we need.
The function `llvm::computeLTOCacheKey` is used to compute a SHA-1 hash
from all data that might influence the result of ThinLTO on a module.
In addition to the module imports/exports that we manually track, it
also hashes information about global variables (e.g. their liveness)
which might be used during optimization. By using this function, we
shouldn't have to worry about new LLVM passes breaking our module re-use
behavior.
In LLVM, the output of this function forms part of the filename used to
store the post-ThinLTO module. To keep our current filename structure
intact, this PR just writes out the mapping 'CGU name -> Hash' to a
file. To determine if a post-LTO module should be reused, we compare
hashes from the previous session.
This should unblock PR #75199 - by sheer chance, it seems to have hit
this issue due to the particular CGU partitioning and optimization
decisions that end up getting made.
Recognize discriminant reads as no-ops in RemoveNoopLandingPads
The cleanup blocks often contain read of discriminants. Teach
RemoveNoopLandingPads to recognize them as no-ops to remove
additional no-op landing pads.
Avoid SeqCst or static mut in mach_timebase_info and QueryPerformanceFrequency caches
This patch went through a couple iterations but the end result is replacing a pattern where an `AtomicUsize` (updated with many SeqCst ops) guards a `static mut` with a single `AtomicU64` that is known to use 0 as a value indicating that it is not initialized.
The code in both places exists to cache values used in the conversion of Instants to Durations on macOS, iOS, and Windows.
I have no numbers to prove that this improves performance (It seems a little futile to benchmark something like this), but it's much simpler, safer, and in practice we'd expect it to be faster everywhere where Relaxed operations on AtomicU64 are cheaper than SeqCst operations on AtomicUsize, which is a lot of places.
Anyway, it also removes a bunch of unsafe code and greatly simplifies the logic, so IMO that alone would be worth it unless it was a regression.
If you want to take a look at the assembly output though, see https://godbolt.org/z/rbr6vn for x86_64, https://godbolt.org/z/cqcbqv for aarch64 (Note that this just the output of the mac side, but i'd expect the windows part to be the same and don't feel like doing another godbolt for it). There are several versions of this function in the godbolt:
- `info_new`: version in the current patch
- `info_less_new`: version in initial PR
- `info_original`: version currently in the tree
- `info_orig_but_better_orderings`: a version that just tries to change the original code's orderings from SeqCst to the (probably) minimal orderings required for soundness/correctness.
The biggest concern I have here is if we can use AtomicU64, or if there are targets that dont have it that this code supports. AFAICT: no. (If that changes in the future, it's easy enough to do something different for them)
r? `@Amanieu` because he caught a couple issues last time I tried to do a patch reducing orderings 😅
---
<details>
<summary>I rewrote this whole message so the original is inside here</summary>
I happened to notice the code we use for caching the result of mach_timebase_info uses SeqCst exclusively.
However, thinking a little more, it's actually pretty easy to avoid the static mut by packing the timebase info into an AtomicU64.
This entirely avoids needing to do the compare_exchange. The AtomicU64 can be read/written using Relaxed ops, which on current macos/ios platforms (x86_64/aarch64) have no overhead compared to direct loads/stores. This simplifies the code and makes it a lot safer too.
I have no numbers to prove that this improves performance (It seems a little futile to benchmark something like this), although it should do that on both targets it applies to.
That said, it also removes a bunch of unsafe code and simplifies the logic (arguably at least — there are only two states now, initialized or not), so I think it's a net win even without concrete numbers.
If you want to take a look at the assembly output though, see below. It has the new version, the original, and a version of the original with lower Orderings (which is still worse than the version in this PR)
- godbolt.org/z/obfqf9 x86_64-apple-darwin
- godbolt.org/z/Wz5cWc aarch64-unknown-linux-gnu (godbolt can't do aarch64-apple-ios but that doesn't matter here)
A different (and more efficient) option than this would be to just use the AtomicU64 and use the knowledge that after initialization the denominator should be nonzero... That felt like it's relying on too many things I'm not confident in, so I didn't want to do that.
</details>
Clean up in intra-doc link collector
This PR makes the following changes in intra-doc links:
- clean up hard to follow closure-based logic in `check_full_res`
- fix a FIXME comment by figuring out that `true` and `false` need to be resolved separately
- refactor path resolution by extracting common code to a helper method and trying to reduce the number of unnecessary early returns
- primitive types are now defined by their symbols, not their name strings
- re-enables a commented-out test case
Closes#77267 (cc `@Stupremee)`
r? `@jyn514`
Add -Z codegen-backend dylib to deps
When the codegen-backend dylib changes, the program should be rebuilt.
---
Unfortunately I was unable to test this works locally due to running into a TLS issue when running the custom backend, `thread 'rustc' panicked at 'no ImplicitCtxt stored in tls', compiler/rustc_middle/src/ty/context.rs:1750:54`, which seems similar to https://github.com/rust-lang/rust/issues/62717 but has a completely different cause and backtrace.
`@eddyb` said to ping `@Mark-Simulacrum` about what they think about this, so, ping!
Provide structured suggestions when finding structs when expecting a trait
When finding an ADT in a trait object definition provide some solutions. Fix#45817.
Given `<Param as Trait>::Assoc: Ty` suggest `Param: Trait<Assoc = Ty>`. Fix#75829.
Allow generic parameters in intra-doc links
Fixes#62834.
---
The contents of the generics will be mostly ignored (except for warning
if fully-qualified syntax is used, which is currently unsupported in
intra-doc links - see issue #74563).
* Allow links like `Vec<T>`, `Result<T, E>`, and `Option<Box<T>>`
* Allow links like `Vec::<T>::new()`
* Warn on
* Unbalanced angle brackets (e.g. `Vec<T` or `Vec<T>>`)
* Missing type to apply generics to (`<T>` or `<Box<T>>`)
* Use of fully-qualified syntax (`<Vec as IntoIterator>::into_iter`)
* Invalid path separator (`Vec:<T>:new`)
* Too many angle brackets (`Vec<<T>>`)
* Empty angle brackets (`Vec<>`)
Note that this implementation *does* allow some constructs that aren't
valid in the actual Rust syntax, for example `Box::<T>new()`. That may
not be supported in rustdoc in the future; it is an implementation
detail.
They were not formatted correctly, so rustdoc was interpreting some
parts as code. Also cleaned up some other query docs that weren't
causing issues, but were formatted incorrectly.
Update `changelog-seen` in config.toml.example
This got out of sync when the version was bumped last time in #77133
Long-term we may want to find an easier way to maintain this that
doesn't require bumping the version in three different places. Off the
top of my head I can't think of anything, though. It _is_ documented in src/bootstrap/README.md, although I don't know how many people read that.
r? @Mark-Simulacrum
cc @spastorino
doc: disambiguate stat in MetadataExt::as_raw_stat
A few architectures in `os::linux::raw` import `libc::stat`, rather than
defining that type directly. However, that also imports the _function_
called `stat`, which makes this doc link ambiguous:
error: `crate::os::linux::raw::stat` is both a struct and a function
--> library/std/src/os/linux/fs.rs:21:19
|
21 | /// [`stat`]: crate::os::linux::raw::stat
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ambiguous link
|
= note: `-D broken-intra-doc-links` implied by `-D warnings`
help: to link to the struct, prefix with the item type
|
21 | /// [`stat`]: struct@crate::os::linux::raw::stat
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to link to the function, add parentheses
|
21 | /// [`stat`]: crate::os::linux::raw::stat()
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We want the `struct`, so it's now prefixed accordingly.
Clarify the debug-related values should take boolean
#76588 tweaked their placeholders but these values should take boolean and the current placeholders are confusing, at least for me.
Add TraitDef::find_map_relevant_impl
This PR adds a method to `TraitDef`. While `for_each_relevant_impl` covers the general use case, sometimes it's not necessary to scan through all the relevant implementations, so this PR introduces a new method, `find_map_relevant_impl`. I've also replaced the `for_each_relevant_impl` calls where possible.
I'm hoping for a tiny bit of efficiency gain here and there.
fix __rust_alloc_error_handler comment
`__rust_alloc_error_handler` was added in the same `extern` block as the allocator functions, but the comment there was not actually correct for `__rust_alloc_error_handler`. So move it down to the rest of the default allocator handling with a fixed comment. At least the comment reflects my understanding of what happens, please check carefully. :)
r? @Amanieu Cc @haraldh
Cleanup of `eat_while()` in lexer
The size of a lexer Token was inflated by the largest `TokenKind` variants `LiteralKind::RawStr` and `RawByteStr`, because
* it used `usize` although `u32` is sufficient in rustc, since crates must be smaller than 4GB,
* and it stored the 20 bytes big `RawStrError` enum for error reporting.
If a raw string is invalid, it now needs to be reparsed to get the `RawStrError` data, but that is a very cold code path.
Technically this breaks other tools that depend on rustc_lexer because they are now also restricted to a max file size of 4GB. But this shouldn't matter in practice, and rustc_lexer isn't stable anyway.
Can I also get a perf run?
Edit: This makes no difference in performance. The PR now only contains a small cleanup.
Link to documentation-specific guidelines.
Changed contribution information URL because it's not obvious how to get from the current URL to the documentation-specific content.
The current URL points to this "Getting Started" page, which contains nothing specific about documentation[*] and instead launches into how to *build* `rustc` which is not a strict prerequisite for contributing documentation fixes:
* https://rustc-dev-guide.rust-lang.org/getting-started.html
[*] The most specific content is a "Writing documentation" bullet point which is not itself a link to anything (I guess a patch for that might be helpful too).
### Why?
Making this change will make it easier for people who wish to make small "drive by" documentation fixes (and read contribution guidelines ;) ) which I find are often how I start contributing to a project. (Exhibit A: https://github.com/rust-lang/rust/pull/77050 :) )
### Background
My impression is the change of content linked is an unintentional change due to a couple of other changes:
* Originally, the link pointed to `contributing.md` which started with a "table of contents" linking to each section. But the content in `contributing.md` was removed and replaced with a link to the "Getting Started" section here:
* 3f6928f1f6 (diff-6a3371457528722a734f3c51d9238c13L1)
But the changed link doesn't actually point to the equivalent content, which is now located here:
* https://rustc-dev-guide.rust-lang.org/contributing.html
(If the "Guide to Rustc Development" is now considered the canonical location of "How to Contribute" content it might be a good idea to merge some of the "Contributing" Introduction section into the "Getting Started" section.)
* This was then compounded by changing the link from `contributing.md` to `contributing.html` here:
* https://github.com/rust-lang/rust/pull/74037/files#diff-242481015141f373dcb178e93cffa850L88
In order to even find the new location of the previous `contributing.md` content I ended up needing to do a GitHub search of the `rust-lang` org for the phrase "Documentation improvements are very welcome". :D
Add asm! support for mips64
- [x] Updated `src/doc/unstable-book/src/library-features/asm.md`.
- [ ] No vector type support. I don't know much about those types.
cc #76839
This got out of sync when the version was bumped last time.
Long-term we may want to find an easier way to maintain this that
doesn't require bumping the version in three different places. Off the
top of my head I can't think of anything, though.
Fix error checking in posix_spawn implementation of Command
* Check for errors returned from posix_spawn*_init functions
* Check for non-zero return value from posix_spawn functions
rustc_target: Refactor away `TargetResult`
Follow-up to https://github.com/rust-lang/rust/pull/77202.
Construction of a built-in target is always infallible now, so `TargetResult` is no longer necessary.
The second commit contains some further cleanup based on built-in target construction being infallible.
Always use the Rust version in package names
The format of the tarballs produced by CI is roughly the following:
{component}-{release}-{target}.{ext}
While on the beta and nightly channels `{release}` is just the channel name, on the stable channel is either the Rust version or the version of the component we're shipping:
cargo-0.47.0-x86_64-unknown-linux-gnu.tar.xz
clippy-0.0.212-x86_64-unknown-linux-gnu.tar.xz
llvm-tools-1.46.0-x86_64-unknown-linux-gnu.tar.xz
miri-0.1.0-x86_64-unknown-linux-gnu.tar.xz
rls-1.41.0-x86_64-unknown-linux-gnu.tar.xz
rust-1.46.0-x86_64-unknown-linux-gnu.tar.xz
...
This makes it really hard to get the package URL without having access to the manifest (and there is no manifest on ci-artifacts.rlo), as there is no consistent version number to use.
This PR addresses the problem by always using the Rust version number as `{release}` for the stable channel, regardless of the version number of the component we're shipping. I chose that instead of "stable" to avoid breaking the URL scheme *that* much.
Rustup should not be affected by this change, as it fetches the URLs from the manifest. Unfortunately we don't have a way to test other clients before making a stable release, as this change only affects the stable channel.
r? `@Mark-Simulacrum`
A few architectures in `os::linux::raw` import `libc::stat`, rather than
defining that type directly. However, that also imports the _function_
called `stat`, which makes this doc link ambiguous:
error: `crate::os::linux::raw::stat` is both a struct and a function
--> library/std/src/os/linux/fs.rs:21:19
|
21 | /// [`stat`]: crate::os::linux::raw::stat
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ambiguous link
|
= note: `-D broken-intra-doc-links` implied by `-D warnings`
help: to link to the struct, prefix with the item type
|
21 | /// [`stat`]: struct@crate::os::linux::raw::stat
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to link to the function, add parentheses
|
21 | /// [`stat`]: crate::os::linux::raw::stat()
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We want the `struct`, so it's now prefixed accordingly.
The cleanup blocks often contain read of discriminants. Teach
RemoveNoopLandingPads to recognize them as no-ops to remove
additional no-op landing pads.