Use more slice patterns inside the compiler
Nothing super noteworthy: just replacing the common, fragile pattern of "length check followed by indexing or unwrap" with slice patterns, for legibility and robustness.
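As a hedged illustration (hypothetical code, not taken from the PR), the shape of the change is:

```rust
// The fragile shape: a length check followed by manual indexing. The
// check and the indices can silently drift apart under refactoring.
fn first_two_indexed(xs: &[u32]) -> Option<(u32, u32)> {
    if xs.len() >= 2 { Some((xs[0], xs[1])) } else { None }
}

// The slice-pattern shape: the length check and the bindings are one
// construct, so out-of-bounds access is impossible by construction.
fn first_two_matched(xs: &[u32]) -> Option<(u32, u32)> {
    match xs {
        [a, b, ..] => Some((*a, *b)),
        _ => None,
    }
}
```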
r? ghost
Optimize SipHash by reordering compress instructions
This PR optimizes hashing by changing the order of instructions in the sip.rs `compress` macro so the CPU can parallelize it better. The new order is taken directly from Fig 2.1 in [the SipHash paper](https://eprint.iacr.org/2012/351.pdf) (but with the xors moved, which makes it a little faster). I attempted to optimize it further after this, but I think this might be the optimal instruction order. Note that this shouldn't change the behavior of hashing at all: only statements that don't depend on each other were reordered.
It appears that the current order hasn't changed since its [original implementation from 2012](fada46c421 (diff-b751133c229259d7099bbbc7835324e5504b91ab1aded9464f0c48cd22e5e420R35)), which doesn't look like it was written with data dependencies in mind.
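For reference, here is a sketch of one fully interleaved round (illustrative Rust, not a verbatim copy of the `compress` macro): the `(v0, v1)` and `(v2, v3)` chains have no data dependency on each other until the cross additions, so a superscalar CPU can execute the two halves in parallel.

```rust
// Illustrative SipRound with independent operations interleaved.
fn sip_round(v0: &mut u64, v1: &mut u64, v2: &mut u64, v3: &mut u64) {
    *v0 = v0.wrapping_add(*v1);
    *v2 = v2.wrapping_add(*v3);
    *v1 = v1.rotate_left(13);
    *v3 = v3.rotate_left(16);
    *v1 ^= *v0;
    *v3 ^= *v2;
    *v0 = v0.rotate_left(32);
    *v2 = v2.wrapping_add(*v1);
    *v0 = v0.wrapping_add(*v3);
    *v1 = v1.rotate_left(17);
    *v3 = v3.rotate_left(21);
    *v1 ^= *v2;
    *v3 ^= *v0;
    *v2 = v2.rotate_left(32);
}
```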
Running `./x bench library/core --stage 0 --test-args hash` before and after this change shows the following results:
Before:
```
benchmarks:
hash::sip::bench_bytes_4 7.20/iter +/- 0.70
hash::sip::bench_bytes_7 9.01/iter +/- 0.35
hash::sip::bench_bytes_8 8.12/iter +/- 0.10
hash::sip::bench_bytes_a_16 10.07/iter +/- 0.44
hash::sip::bench_bytes_b_32 13.46/iter +/- 0.71
hash::sip::bench_bytes_c_128 37.75/iter +/- 0.48
hash::sip::bench_long_str 121.18/iter +/- 3.01
hash::sip::bench_str_of_8_bytes 11.20/iter +/- 0.25
hash::sip::bench_str_over_8_bytes 11.20/iter +/- 0.26
hash::sip::bench_str_under_8_bytes 9.89/iter +/- 0.59
hash::sip::bench_u32 9.57/iter +/- 0.44
hash::sip::bench_u32_keyed 6.97/iter +/- 0.10
hash::sip::bench_u64 8.63/iter +/- 0.07
```
After:
```
benchmarks:
hash::sip::bench_bytes_4 6.64/iter +/- 0.14
hash::sip::bench_bytes_7 8.19/iter +/- 0.07
hash::sip::bench_bytes_8 8.59/iter +/- 0.68
hash::sip::bench_bytes_a_16 9.73/iter +/- 0.49
hash::sip::bench_bytes_b_32 12.70/iter +/- 0.06
hash::sip::bench_bytes_c_128 32.38/iter +/- 0.20
hash::sip::bench_long_str 102.99/iter +/- 0.82
hash::sip::bench_str_of_8_bytes 10.71/iter +/- 0.21
hash::sip::bench_str_over_8_bytes 11.73/iter +/- 0.17
hash::sip::bench_str_under_8_bytes 10.33/iter +/- 0.41
hash::sip::bench_u32 10.41/iter +/- 0.29
hash::sip::bench_u32_keyed 9.50/iter +/- 0.30
hash::sip::bench_u64 8.44/iter +/- 1.09
```
I ran this on my own machine, so there's some noise, but `bench_long_str`, at least, is clearly faster (~18%).
Also, I noticed that the compiler uses a copy of the same compress function from the library, so I took the liberty of applying this change there as well.
Thanks `@semisol` for porting SipHash for another project which led me to notice this issue in Rust, and for helping investigate. <3
Instead of keeping a list of architectures which have native support for 64-bit atomics, just use `#[cfg(target_has_atomic = "64")]` and its inverted counterpart to determine whether we need to use portable `AtomicU64` on the target architecture.
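A minimal sketch of the pattern, assuming a portable fallback type (the `portable-atomic` crate here is only an example stand-in, not necessarily what the compiler uses):

```rust
// Native 64-bit atomics available: use the std type directly.
#[cfg(target_has_atomic = "64")]
pub use std::sync::atomic::AtomicU64;

// No native 64-bit atomics (e.g. 32-bit SPARC): fall back to a
// portable implementation, here illustrated with `portable-atomic`.
#[cfg(not(target_has_atomic = "64"))]
pub use portable_atomic::AtomicU64;
```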
Fixes for 32-bit SPARC on Linux
This PR fixes a number of issues which previously prevented `rustc` from being built
successfully for 32-bit SPARC using the `sparc-unknown-linux-gnu` triplet.
In particular, it adds linking against `libatomic` where necessary, uses portable `AtomicU64`
in `rustc_data_structures`, rewrites the spec for `sparc_unknown_linux_gnu` to use
`TargetOptions`, and replaces the previously used `-mv8plus` with the more portable
`-mcpu=v9 -m32`.
To make `rustc` build successfully, support for 32-bit SPARC also needs to be added to the
`object` and `nix` crates; I will be sending out those changes separately.
r? nagisa
This is possible now that inline const blocks are stable; the idea was
even mentioned as an alternative when `uninit_array()` was added:
<https://github.com/rust-lang/rust/pull/65580#issuecomment-544200681>
> if it’s stabilized soon enough maybe it’s not worth having a
> standard library method that will be replaceable with
> `let buffer = [MaybeUninit::<T>::uninit(); $N];`
Const array repetition and inline const blocks are now stable (in the
next release), so that circumstance has come to pass, and we no longer
have reason to want `uninit_array()` other than convenience. Therefore,
let’s evaluate the inconvenience by not using `uninit_array()` in
the standard library, before potentially deleting it entirely.
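A minimal sketch of the replacement, assuming the now-stable inline const blocks; the const block makes the repeat operand a constant, so this compiles even when `T` is not `Copy`:

```rust
use std::mem::MaybeUninit;

// Create an uninitialized buffer without uninit_array(); the inline
// const block lets array repetition work for non-Copy element types.
fn uninit_buffer<T, const N: usize>() -> [MaybeUninit<T>; N] {
    [const { MaybeUninit::uninit() }; N]
}
```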
Added an associated `const THIS_IMPLEMENTATION_HAS_BEEN_TRIPLE_CHECKED`
to the `StableOrd` trait to ensure that implementors carefully consider
whether the trait's contract is upheld, as incorrect implementations can
cause miscompilations.
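A sketch of the pattern (trait shape simplified, not the exact rustc definition): because the associated const has no default, every `impl` must spell it out, which nudges the author to re-read the trait's contract first.

```rust
/// Sketch: values whose ordering is stable across compilation sessions.
/// The oddly named const is a speed bump, not a proof: writing it out
/// is the implementor's assertion that the contract was checked.
trait StableOrd: Ord {
    const THIS_IMPLEMENTATION_HAS_BEEN_TRIPLE_CHECKED: ();
}

impl StableOrd for u32 {
    // u32's Ord does not depend on any session state, so this is sound.
    const THIS_IMPLEMENTATION_HAS_BEEN_TRIPLE_CHECKED: () = ();
}
```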
This makes their intent and expected location clearer. There were some cases where these comments were not clearly separated from `use` declarations, which made it hard to tell what a comment was describing.
This version shaves off about 2% of the cycles in my experiments
and makes the control flow easier to follow for me and hopefully
others, including the compiler.
Someone gave me a working profiler and by God I'm using it.
This patch has been extracted from #123720. It specifically enhances
`Sccs` to allow tracking arbitrary commutative properties of SCCs, including
- reachable values (max/min)
- SCC-internal values (max/min)
This helps with, among other things, universe computation: we can now identify
SCC universes as a straightforward "find max/min" operation during SCC construction.
It's also more or less zero-cost: if you don't use the new features, you don't pay for them.
This commit also vastly extends the documentation of the SCCs module, which I had a very hard time following.
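A hedged sketch of the idea (names are illustrative, not the exact rustc API): an annotation is a value that merges commutatively, so SCC construction can fold it over the members of an SCC and over everything the SCC reaches.

```rust
// Illustrative trait for a commutative SCC annotation.
trait Annotation: Copy {
    /// Fold in the annotation of another node in the same SCC.
    fn merge_scc(self, other: Self) -> Self;
    /// Fold in the annotation of a reachable successor SCC.
    fn merge_reached(self, other: Self) -> Self;
}

/// Example: track the largest universe reachable from each SCC, making
/// universe computation a by-product of SCC construction.
#[derive(Copy, Clone)]
struct MaxUniverse(u32);

impl Annotation for MaxUniverse {
    fn merge_scc(self, other: Self) -> Self {
        MaxUniverse(self.0.max(other.0))
    }
    fn merge_reached(self, other: Self) -> Self {
        MaxUniverse(self.0.max(other.0))
    }
}
```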
Update ena to 0.14.3
Includes https://github.com/rust-lang/ena/pull/53, which removes a trivial `Self: Sized` bound that prevents `ena` from building with the new solver.
It's a macro that just creates an enum with a `from_u32` method. It has
two arms. One is unused and the other has a single use.
This commit inlines that single use and removes the whole macro. This
increases readability because we don't have two different macros
interacting (`enum_from_u32` and `language_item_table`).
It provides a way to effectively embed a linked list within an
`IndexVec` and also iterate over that list. It's written in a very
generic way, involving two traits `Links` and `LinkElem`. But the
`Links` trait is only impl'd for `IndexVec` and `&IndexVec`, and the
whole thing is only used in one module within `rustc_borrowck`. So I
think it's over-engineered and hard to read. Plus it has no comments.
This commit removes it, and adds a (non-generic) local iterator for the
use within `rustc_borrowck`. Much simpler.
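A hypothetical sketch of the simpler replacement, using a plain slice instead of `IndexVec` to stay self-contained:

```rust
// Nodes of a linked list stored by index inside a vector.
struct Node {
    value: u32,
    next: Option<usize>, // index of the next node, if any
}

// A small, non-generic iterator that walks the embedded list.
struct ListIter<'a> {
    nodes: &'a [Node],
    cur: Option<usize>,
}

impl<'a> Iterator for ListIter<'a> {
    type Item = &'a Node;
    fn next(&mut self) -> Option<&'a Node> {
        let node = &self.nodes[self.cur?];
        self.cur = node.next;
        Some(node)
    }
}
```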
It is optimized for lists with a single element, avoiding the need for
an allocation in that case. But `SmallVec<[T; 1]>` also avoids that
allocation, and is better in general: it's more standard, it needs only a
logarithmic number of allocations when a list exceeds one item, and it has
a much more capable API.
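A minimal sketch of the behavior being relied on, using the `smallvec` crate's public API:

```rust
use smallvec::{smallvec, SmallVec};

fn main() {
    // One element fits inline: no heap allocation.
    let mut list: SmallVec<[u32; 1]> = smallvec![42];
    assert!(!list.spilled());

    // A second element spills to the heap; after that, growth doubles
    // the capacity, i.e. O(log n) allocations for n elements.
    list.push(7);
    assert!(list.spilled());
}
```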
This commit removes `TinyList` and converts the two uses to
`SmallVec<[T; 1]>`. It also reorders the `use` items in the relevant
file so they are in just two sections (`pub` and non-`pub`), ordered
alphabetically, instead of many sections. (This is a relevant part of
the change because I had to decide where to add a `use` item for
`SmallVec`.)
And move the `repr` line after the `derive` line, where it's harder to
overlook. (I overlooked it initially, and didn't understand how this
type worked.)
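For illustration, a hypothetical type with the attributes in the resulting order:

```rust
// With `#[repr]` directly above the item, after the often-long
// `#[derive]` list, it is harder to overlook.
#[derive(Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum Kind {
    A = 0,
    B = 1,
}
```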
Stabilize the size of incr comp object file names
The current implementation does not produce stable-length paths, and we create the paths in a way that makes our allocation behavior nondeterministic. I think `@eddyb` fixed a number of other cases like this in the past, and this PR fixes another one. Whether that actually matters I have no idea, but we still have bimodal behavior in rustc-perf, and the non-uniformity in `find` and `ls` was bothering me.
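As a hedged sketch of the property (not the actual encoding rustc uses), "stable-length" just means rendering the hash with a fixed width, so every object file name comes out the same size:

```rust
// Hypothetical: a 128-bit CGU hash printed as exactly 32 hex digits
// always yields a file name of the same length.
fn object_file_name(cgu_hash: u128) -> String {
    format!("{:032x}.o", cgu_hash)
}
```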
I've also removed the truncation of the mangled CGU names. Before this PR incr comp paths look like this:
```
target/debug/incremental/scratch-38izrrq90cex7/s-gux6gz0ow8-1ph76gg-ewe1xj434l26w9up5bedsojpd/261xgo1oqnd90ry5.o
```
And after, they look like this:
```
target/debug/incremental/scratch-035omutqbfkbw/s-gux6borni0-16r3v1j-6n64tmwqzchtgqzwwim5amuga/55v2re42sztc8je9bva6g8ft3.o
```
On the one hand, I'm sure this will break some people's builds because they're on Windows and only a few bytes from the path length limit. But if we're that seriously worried about the length of our file names, I have some other ideas on how to make them smaller. And last time I deleted some hash truncations from the compiler, there was a huge drop in the number of incremental compilation ICEs that were reported: https://github.com/rust-lang/rust/pull/110367
---
Upon further reading, this PR actually fixes a bug. This comment says the CGU names are supposed to be a fixed-length hash, and before this PR they weren't: ca7d34efa9/compiler/rustc_monomorphize/src/partitioning.rs (L445-L448)