Optimize large array creation in const-eval
This changes repeated memcpy's to a memset for the case that we're propagating a single byte into a region of memory. It also optimizes the element-by-element copies to have a tighter loop; I'm pretty sure the old code was actually doing a multiply within each loop iteration.
For an 8GB array (`static SLICE: [u8; SIZE] = [0u8; 1 << 33];`) this takes us from ~23 seconds to ~6 seconds locally, which is spent roughly 50/50 in (a) memset to zero and (b) memcpy of the original place into a new place, when popping stack frame. The latter seems hard to avoid but is a big memcpy (since we're copying the type rather than initializing a region, so it's pretty fast), and the first is as good as it's going to get without special casing constant-valued arrays.
Closes https://github.com/rust-lang/rust/issues/55795. (That issue's references to lint checking don't appear true anymore, but I think this closes that case as something that is slow due to *time* pretty fully. An 8GB array taking only 6 seconds feels reasonable enough to not merit further tracking).
Get rid of the hir_owner query.
This query was meant as a firewall between `hir_owner_nodes` which is supposed to change often, and the queries that only depend on the item signature. That firewall was inefficient, leaking the contents of the HIR body through `HirId`s.
`hir_owner` incurs a significant cost, as we need to hash HIR twice in multiple modes. This PR proposes to remove it, and simplify the hashing scheme.
For the future, `def_kind`, `def_span`... are much more efficient for incremental decoupling, and should be preferred.
change `.unwrap()` to `?` on write where `fmt::Result` is returned
Fixes#120090 which points out that some of the `.unwrap()`s in `rustc_middle/src/mir/pretty.rs` are likely meant to be `?`s
Remove `next_root_ty_var`
Uhh we seem to not have any test that relies on this anymore. Maybe due to the way we changed alias-relate or whatever.
Removing this hack helper fn because #119106 is the general solution.
r? lcnr
Improved collapse_debuginfo attribute, added command-line flag
Improved attribute collapse_debuginfo with variants: `#[collapse_debuginfo=(no|external|yes)]`.
Added command-line flag for default behaviour.
Work-in-progress: will add more tests.
cc https://github.com/rust-lang/rust/issues/100758
bootstrap: handle vendored sources when remapping crate paths
#115872 introduced a feature to add path remapping for crate dependencies, but only when they came from Cargo's registry cache, not a vendor directory.
This caused builds that used remapped debuginfo and vendor directories to fail with:
```
std::fs::read_dir(registry_src) failed with No such file or directory (os error 2)
```
or (if the `registry/src` directory exists but is empty)
```
error: --remap-path-prefix must contain '=' between FROM and TO
```
Fixes#117885 by explicitly supporting the `vendor` directory and adding it to `RUSTC_CARGO_REGISTRY_SRC_TO_REMAP`.
Note that `bootstrap.py` already assumes that `./vendor` within the rust repo is the only supported vendoring location.
r? `@pietroalbini`
[rustc_data_structures] Use partition_point to find slice range end.
This PR uses approach introduced in https://github.com/rust-lang/rust/pull/114152 to find
the end of the range. It's much easier to understand and reason about invariants of such
implementation.
Technically it's possible to make it even shorter by returning `&[start..end]` unconditionally
because even if searched item is not present in the slice, `start` and `end` would point at
the same index, so the range would be empty. The reason I decided not to use this shorter
implementation is because it would involve more comparisons in case there are no elements
in the slice with key equal to `key`.
Also, not that it matters much, but this implementation also improves perf according to the
benchmark below:
https://gist.github.com/ttsugriy/63c0ed39ae132b131931fa1f8a3dea55
The results on my M1 macbook air are:
```
Running benches/bin_search_slice_benchmark.rs (target/release/deps/bin_search_slice_benchmark-90fa6d68c3bd1298)
Benchmarking multiply add/binary_search_slice: Collecting 100 samples in estimated 5.0002 s (1
multiply add/binary_search_slice
time: [44.719 ns 44.918 ns 45.158 ns]
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
Benchmarking multiply add/binary_search_slice_new: Collecting 100 samples in estimated 5.0001
multiply add/binary_search_slice_new
time: [36.955 ns 37.060 ns 37.221 ns]
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
```
Rollup of 8 pull requests
Successful merges:
- #119172 (Detect `NulInCStr` error earlier.)
- #119833 (Make tcx optional from StableMIR run macro and extend it to accept closures)
- #119967 (Add `PatKind::Err` to AST/HIR)
- #119978 (Move async closure parameters into the resultant closure's future eagerly)
- #120021 (don't store const var origins for known vars)
- #120038 (Don't create a separate "basename" when naming and opening a MIR dump file)
- #120057 (Don't ICE when deducing future output if other errors already occurred)
- #120073 (Remove spastorino from users_on_vacation)
r? `@ghost`
`@rustbot` modify labels: rollup
error on incorrect implied bounds in wfcheck except for Bevy dependents
Rebase of #109763
Additionally, special cases Bevy `ParamSet` types to not trigger the lint. This is tracked in #119956.
Fixes#109628
Don't ICE when deducing future output if other errors already occurred
The situation can't really happen outside of erroneous code. What was interesting is that it ICEd before emitting any other diagnostics. This was because the other errors were silenced due to cycle_delay_bug cycle errors.
r? ```@compiler-errors```
fixes#119890
Don't create a separate "basename" when naming and opening a MIR dump file
These functions were split up by #77080, in order to support passing the dump file's “basename” (filename without extension) to the implementation of `-Zdump-mir-spanview`, so that it could be used as a page title.
That flag has since been removed (#119566), so now there's no particular reason for this code to handle the basename separately from the filename or full path.
This PR therefore restores things to (roughly) how they were before #77080.
Move async closure parameters into the resultant closure's future eagerly
Move async closure parameters into the closure's resultant future eagerly.
Before, we used to desugar `async |p1, p2, ..| { body }` as `|p1, p2, ..| { || async { body } }`. Now, we desugar the above like `|p1, p2, ..| { async move { let p1 = p1; let p2 = p2; ... body } }`. This mirrors the same desugaring that `async fn` does with its parameter types, and the compiler literally uses the same code via a shared helper function.
This removes the necessity for E0708, since now expressions like `async |x: i32| { x }` will not give you confusing borrow errors.
This does *not* fix the case where async closures have self-borrows. This will come with a general implementation of async closures, which is still in the works.
r? oli-obk
Make tcx optional from StableMIR run macro and extend it to accept closures
Change `run` macro to avoid sometimes unnecessary dependency on `TyCtxt`, and introduce `run_with_tcx` to capture use cases where `tcx` is required. Additionally, extend both macros to accept closures that may capture variables.
I've also modified the `internal()` method to make it safer, by accepting the type context to force the `'tcx` lifetime to match the context lifetime.
These are non-backward compatible changes, but they only affect internal APIs which are provided today as helper functions until we have a stable API to start the compiler.
Detect `NulInCStr` error earlier.
By making it an `EscapeError` instead of a `LitError`. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars.
NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together.
One nice thing about this: the old approach had some code in `report_lit_error` to calculate the span of the nul char from a range. This code used a hardwired `+2` to account for the `c"` at the start of a C string literal, but this should have changed to a `+3` for raw C string literals to account for the `cr"`, which meant that the caret in `cr"` nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error.
r? ```@fee1-dead```
Update cargo
2 commits in 1cff2ee6b92e0ad3f87c44b70b28f788b2528b3c..1ae631085f01c1a72d05df1ec81f3759a8360042
2024-01-16 16:56:57 +0000 to 2024-01-17 17:26:41 +0000
- fix(json-msg): use pkgid spec in in JSON messages (rust-lang/cargo#13311)
- doc(features): Highlight the non-blocking feature gating technique (rust-lang/cargo#13307)
r? oli-obk
Could you check if this fixes miri build?
pat_analysis: Don't rely on contiguous `VariantId`s outside of rustc
Today's pattern_analysis uses `BitSet` and `IndexVec` on the provided enum variant ids, which only makes sense if these ids count the variants from 0. In rust-analyzer, the variant ids are global interning ids, which would make `BitSet` and `IndexVec` ridiculously wasteful. In this PR I add some shims to use `FxHashSet`/`FxHashMap` instead outside of rustc.
r? ```@compiler-errors```
Fix `rustc_abi` build on stable
https://github.com/rust-lang/rust/pull/119446 broke the ability of `rustc_abi` to build on stable, which is required by rust-analyzer. This fixes it.
Construct closure type eagerly
Construct the returned closure type *before* checking the body, in the same match as we were previously deducing the coroutine types based off of the closure kind.
This simplifies some changes I'm doing in the async closure PR, and imo just seems easier to read (since we only need one match on closure kind, instead of two). There's no reason I can tell that we needed to create the closure type *after* the body was checked.
~~This also has the side-effect of making it so that the universe of the closure synthetic infer vars are lower than any infer vars that come from checking the body. We can also get rid of `next_root_ty_var` hack from closure checking (though in general we still need this, #119106). cc ```@lcnr``` since you may care about this hack 😆~~
r? ```@oli-obk```
Gracefully handle missing typeck information if typeck errored
fixes#116893
I created some logs and the typeck of `fn main` is exactly the same, no matter whether the constant's body is what it is, or if it is replaced with `panic!()`. The latter will cause the ICE not to be emitted though. The reason for that is that we abort compilation if *errors* were emitted, but not if *lint errors* were emitted. This took me way too long to debug, and is another reason why I would have liked https://github.com/rust-lang/compiler-team/issues/633
Consistently unset RUSTC_BOOTSTRAP when compiling bootstrap
Since https://github.com/rust-lang/rust/pull/113906, all x.py invocations performed by rust-analyzer have RUSTC_BOOTSTRAP=1 set. This was to fix https://github.com/rust-lang/rust/issues/112391#issuecomment-1597224941 — rust-analyzer uses some default cargo from the system when fetching workspace layout, and the standard library uses some unstable cargo feature, so x.py would previously fail if the system toolchain wasn't nightly.
82e1608dfa/library/std/Cargo.toml (L1)
This PR changes x.py to cancel out this RUSTC_BOOTSTRAP=1 when compiling bootstrap. It will only remain set when running the compiled bootstrap executable. This fixes spurious bootstrap rebuilds in the event that a rust-analyzer x.py invocation is alternated with someone running x.py themself on the command line, if any dependency of bootstrap looks at `option_env!("RUSTC_BOOTSTRAP")`, which is the case since https://github.com/rust-lang/rust/pull/119654.
**Before:**
```console
$ RUSTC_BOOTSTRAP=1 ./x.py check library/core
Building bootstrap
Compiling proc-macro2 v1.0.76
Compiling quote v1.0.35
Compiling syn v2.0.48
Compiling clap_derive v4.4.7
Compiling serde_derive v1.0.195
Compiling clap v4.4.13
Compiling clap_complete v4.4.6
Compiling build_helper v0.1.0
Compiling bootstrap v0.0.0
Finished dev [unoptimized] target(s) in 6.31s
Checking stage0 library artifacts {core} (x86_64-unknown-linux-gnu)
Finished release [optimized] target(s) in 0.23s
Build completed successfully in 0:00:07
$ ./x.py check library/core
Building bootstrap
Compiling proc-macro2 v1.0.76
Compiling quote v1.0.35
Compiling syn v2.0.48
Compiling clap_derive v4.4.7
Compiling serde_derive v1.0.195
Compiling clap v4.4.13
Compiling clap_complete v4.4.6
Compiling build_helper v0.1.0
Compiling bootstrap v0.0.0
Finished dev [unoptimized] target(s) in 5.30s
Checking stage0 library artifacts {core} (x86_64-unknown-linux-gnu)
Finished release [optimized] target(s) in 0.25s
Build completed successfully in 0:00:06
```
**After:**
```console
$ RUSTC_BOOTSTRAP=1 ./x.py check library/core
Building bootstrap
Finished dev [unoptimized] target(s) in 0.06s
Checking stage0 library artifacts {core} (x86_64-unknown-linux-gnu)
Finished release [optimized] target(s) in 0.14s
Build completed successfully in 0:00:01
$ ./x.py check library/core
Building bootstrap
Finished dev [unoptimized] target(s) in 0.04s
Checking stage0 library artifacts {core} (x86_64-unknown-linux-gnu)
Finished release [optimized] target(s) in 0.13s
Build completed successfully in 0:00:01
```
Don't ICE if TAIT-defining fn contains a closure with `_` in return type
The `delay_span_bug` got added in 0e82aaeb67 to reduce the amount of errors emitted for functions that have `_` in their return type, because inference doesn't apply to function items. But this logic shouldn't apply to closures, because their return types *can* be inferred.
Fixes https://github.com/rust-lang/rust/issues/119916.
Save liveness results for DestinationPropagation
`DestinationPropagation` needs to verify that merge candidates do not conflict with each other. This is done by verifying that a local is not live when its counterpart is written to.
To get the liveness information, the pass runs `MaybeLiveLocals` dataflow analysis repeatedly, once for each propagation round. This is quite costly, and the main driver for the perf impact on `ucd` and `diesel`. (See https://github.com/rust-lang/rust/pull/115105#issuecomment-1689205908)
In order to mitigate this cost, this PR proposes to save the result of the analysis into a `SparseIntervalMatrix`, and mirror merges of locals into that matrix: `liveness(destination) := liveness(destination) union liveness(source)`.
<details>
<summary>Proof</summary>
We denote by `'` all the quantities of the transformed program. Let $\varphi$ be a mapping of locals, which maps `source` to `destination`, and is identity otherwise. The exact liveness set after a statement is $out'(statement)$, and the proposed liveness set is $\varphi(out(statement))$.
Consider a statement. Suppose that the output state verifies $out' \subset phi(out)$. We want to prove that $in' \subset \varphi(in)$ where $in = (out - kill) \cup gen$, and conclude by induction.
We have 2 cases: either that statement is kept with locals renumbered by $\varphi$, or it is a tautological assignment and it removed.
1. If the statement is kept: the gen-set and the kill-set of $statement' = \varphi(statement)$ are $gen' = \varphi(gen)$ and $kill' = \varphi(kill)$ exactly.
From soundness requirement 3, $\varphi(in)$ is disjoint from $\varphi(kill)$.
This implies that $\varphi(out - kill)$ is disjoint from $\varphi(kill)$, and so $\varphi(out - kill) = \varphi(out) - \varphi(kill)$. Then $\varphi(in) = (\varphi(out) - \varphi(kill)) \cup \varphi(gen) = (\varphi(out) - kill') \cup gen'$.
We can conclude that $out' \subset \varphi(out) \implies in' \subset \varphi(in)$.
2. If the statement is removed. As $\varphi(statement)$ is a tautological assignment, we know that $\varphi(gen) = \varphi(kill) = \\{ destination \\}$, while $gen' = kill' = \emptyset$. So $\varphi(in) = \varphi(out) \cup \\{ destination \\}$. Then $in' = out' \subset out \subset \varphi(in)$.
By recursion, we can conclude by that $in' \subset \varphi(in)$ everywhere.
</details>
This approximate liveness results is only suboptimal if there are locals that fully disappear from the CFG due to an assignment cycle. These cases are quite unlikely, so we do not bother with them.
This change allows to reduce the perf impact of DestinationPropagation by half on diesel and ucd (https://github.com/rust-lang/rust/pull/115105#issuecomment-1694701904).
cc ````@JakobDegen````