Track local frames incrementally during execution
https://github.com/rust-lang/miri/pull/2646 currently introduces a performance regression. This change removes that regression, and provides a minor perf improvement.
The existing lazy strategy for tracking the span we want to display is as efficient as it is only because we often create a `CurrentSpan` then never call `.get()`. Most of the calls to the `before_memory_read` and `before_memory_write` hooks do not create any event that we store in `AllocHistory`. But data races are totally different, any memory read or write may race, so every call to those hooks needs to access to the current local span.
So this changes to a strategy where we update some state in a `Thread` and `FrameExtra` incrementally, upon entering and existing each function call.
Before:
```
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/backtraces/Cargo.toml
Time (mean ± σ): 5.532 s ± 0.022 s [User: 5.444 s, System: 0.073 s]
Range (min … max): 5.516 s … 5.569 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/mse/Cargo.toml
Time (mean ± σ): 831.4 ms ± 3.0 ms [User: 783.8 ms, System: 46.7 ms]
Range (min … max): 828.7 ms … 836.1 ms 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde1/Cargo.toml
Time (mean ± σ): 1.975 s ± 0.021 s [User: 1.914 s, System: 0.059 s]
Range (min … max): 1.939 s … 1.990 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde2/Cargo.toml
Time (mean ± σ): 4.060 s ± 0.051 s [User: 3.983 s, System: 0.071 s]
Range (min … max): 3.972 s … 4.100 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/slice-get-unchecked/Cargo.toml
Time (mean ± σ): 784.9 ms ± 8.2 ms [User: 746.5 ms, System: 37.7 ms]
Range (min … max): 772.9 ms … 793.3 ms 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/unicode/Cargo.toml
Time (mean ± σ): 1.679 s ± 0.006 s [User: 1.623 s, System: 0.055 s]
Range (min … max): 1.673 s … 1.687 s 5 runs
```
After:
```
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/backtraces/Cargo.toml
Time (mean ± σ): 5.330 s ± 0.037 s [User: 5.232 s, System: 0.084 s]
Range (min … max): 5.280 s … 5.383 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/mse/Cargo.toml
Time (mean ± σ): 818.9 ms ± 3.7 ms [User: 776.8 ms, System: 41.3 ms]
Range (min … max): 813.5 ms … 822.5 ms 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde1/Cargo.toml
Time (mean ± σ): 1.927 s ± 0.011 s [User: 1.864 s, System: 0.061 s]
Range (min … max): 1.917 s … 1.945 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/serde2/Cargo.toml
Time (mean ± σ): 3.974 s ± 0.020 s [User: 3.893 s, System: 0.076 s]
Range (min … max): 3.956 s … 4.004 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/slice-get-unchecked/Cargo.toml
Time (mean ± σ): 780.0 ms ± 5.3 ms [User: 740.3 ms, System: 39.0 ms]
Range (min … max): 771.2 ms … 784.5 ms 5 runs
Benchmark 1: cargo +miri miri run --manifest-path /home/ben/miri/bench-cargo-miri/unicode/Cargo.toml
Time (mean ± σ): 1.643 s ± 0.007 s [User: 1.584 s, System: 0.058 s]
Range (min … max): 1.635 s … 1.654 s 5 runs
```
(This change is marginal, but the point is that it avoids a much more significant regression)
Remove AscribeUserTypeCx
r? ``@compiler-errors``
This basically inlines `AscribeUserTypeCx::relate_mir_and_user_ty` into `type_op_ascribe_user_type_with_span` which is the only place where it's used and makes direct use of `ObligationCtxt` API.
Unsupported query error now specifies if its unsupported for local or external crate
Fixes#101666.
I had to move `keys.rs` from `rustc_query_impl` to `rustc_middle`. I don't know if that is problematic. I couldn't think of any other way to get the needed information inside `rustc_middle`.
r? ```@jyn514```
Refine `instruction_set` MIR inline rules
Previously an exact match of the `instruction_set` attribute was required for an MIR inline to be considered. This change checks for an exact match *only* if the callee sets an `instruction_set` in the first place. When the callee does not declare an instruction set then it is considered to be platform agnostic code and it's allowed to be inline'd into the caller.
cc ``@oli-obk``
[Edit] Zulip Context: https://rust-lang.zulipchat.com/#narrow/stream/189540-t-compiler.2Fwg-mir-opt/topic/What.20exactly.20does.20the.20MIR.20optimizer.20do.3F
Manually implement PartialEq for Option<T> and specialize non-nullable types
This PR manually implements `PartialEq` and `StructuralPartialEq` for `Option`, which seems to produce slightly better codegen than the automatically derived implementation.
It also allows specializing on the `core::num::NonZero*` and `core::ptr::NonNull` types, taking advantage of the niche optimization by transmuting the `Option<T>` to `T` to be compared directly, which can be done in just two instructions.
A comparison of the original, new and specialized code generation is available [here](https://godbolt.org/z/dE4jxdYsa).
Enable profiler in dist-riscv64-linux
Build the profiler runtime to allow using -C profile-generate and -C instrument-coverage on riscv64-linux.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Add `ConstKind::Expr`
Starting to implement `ty::ConstKind::Abstract`, most of the match cases are stubbed out, some I was unsure what to add, others I didn't want to add until a more complete implementation was ready.
r? `@lcnr`
Previously an exact match of the `instruction_set` attribute was required for an MIR inline to be considered. This change checks for an exact match *only* if the callee sets an `instruction_set` in the first place. When the callee does not declare an instruction set then it is considered to be platform agnostic code and it's allowed to be inline'd into the caller.
rustdoc: fix broken tooltip CSS
text `#ffffff` on background `#fdffd3` fails the [WCAG color contrast checker], and seems like a mistake in 16b55903ee.
Making the cursor a pointer is misleading, since clicking it doesn't do anything.
[WCAG color contrast checker]: https://accessibleweb.com/color-contrast-checker/
rustbuild: Don't build doc::SharedAssets when building JSON docs.
Previously, running `./x doc library/core/ --json` on a plain build would panic bootstrap.
```
$ ./x doc library/core/ --json
Building rustbuild
Blocking waiting for file lock on package cache
Compiling bootstrap v0.0.0 (/home/nixon/dev/rust/rust/src/bootstrap)
Finished dev [unoptimized] target(s) in 4.47s
thread 'main' panicked at 'fs::write(&version_info, &info) failed with No such file or directory (os error 2) ("/home/nixon/dev/rust/rust/build/x86_64-unknown-linux-gnu/doc/version_info.html")', doc.rs:410:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Build completed unsuccessfully in 0:00:04
```
Becuase the `SharedAssets` step assumes that the HTML out dir has been created. This isn't true for JSON.
The fix is to not build shared assets when doing a JSON doc build, as it doesn't need them.
r? ``@jyn514``
``@rustbot`` modify labels: +A-rustdoc-json
rustc_codegen_ssa: write `.dwp` in a streaming fashion
When writing a `.dwp` file, rustc writes to a Vec first then to a BufWriter-wrapped file. It seems very likely that we can write in a streaming fashion to avoid double buffering in an intermediate Vec.
On my Linux machine, `.dwp` from the latest rust-lang/cargo is 113MiB. It may worth a stream writer, though I didn't do any benchmark 🙇🏾♂️.