Save/restore more items in cache with incremental compilation
Right now they don't play very well together, consider a simple example:
```
$ export RUSTFLAGS="--emit asm"
$ cargo new --lib foo
Created library `foo` package
$ cargo build -q
$ touch src/lib.rs
$ cargo build
error: could not copy
"/path/to/foo/target/debug/deps/foo-e307cc7fa7b6d64f.4qbzn9k8mosu50a5.rcgu.s"
to "/path/to/foo/target/debug/deps/foo-e307cc7fa7b6d64f.s":
No such file or directory (os error 2)
```
Touch triggers the rebuild, incremental compilation detects no changes (yay) and everything explodes while trying to copy files were they should go.
This pull request fixes it by copying and restoring more files in the incremental compilation cache
Fixes https://github.com/rust-lang/rust/issues/89149
Fixes https://github.com/rust-lang/rust/issues/88829
Related: https://internals.rust-lang.org/t/interaction-between-incremental-compilation-and-emit/20551
rename ptr::from_exposed_addr -> ptr::with_exposed_provenance
As discussed on [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/To.20expose.20or.20not.20to.20expose/near/427757066).
The old name, `from_exposed_addr`, makes little sense as it's not the address that is exposed, it's the provenance. (`ptr.expose_addr()` stays unchanged as we haven't found a better option yet. The intended interpretation is "expose the provenance and return the address".)
The new name nicely matches `ptr::without_provenance`.
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
Simplify trim-paths feature by merging all debuginfo options together
This PR simplifies the trim-paths feature by merging all debuginfo options together, as described in https://github.com/rust-lang/rust/issues/111540#issuecomment-1994010274.
And also do some correctness fixes found during the review.
cc `@weihanglo`
r? `@michaelwoerister`
Codegen const panic messages as function calls
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting shared library (see [perf]).
A sample improvement from nightly:
```
leaq str.0(%rip), %rdi
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdx
movl $25, %esi
callq *_ZN4core9panicking5panic17h17cabb89c5bcc999E@GOTPCREL(%rip)
```
to this PR:
```
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdi
callq *_RNvNtNtCsduqIKoij8JB_4core9panicking11panic_const23panic_const_div_by_zero@GOTPCREL(%rip)
```
[perf]: https://perf.rust-lang.org/compare.html?start=a7e4de13c1785819f4d61da41f6704ed69d5f203&end=64fbb4f0b2d621ff46d559d1e9f5ad89a8d7789b&stat=instructions:u
Remove `TypeAndMut` from `ty::RawPtr` variant, make it take `Ty` and `Mutability`
Pretty much mechanically converting `ty::RawPtr(ty::TypeAndMut { ty, mutbl })` to `ty::RawPtr(ty, mutbl)` and its fallout.
r? lcnr
cc rust-lang/types-team#124
"Handle" calls to upstream monomorphizations in compiler_builtins
This is pretty cooked, but I think it works.
compiler-builtins has a long-standing problem that at link time, its rlib cannot contain any calls to `core`. And yet, in codegen we _love_ inserting calls to symbols in `core`, generally from various panic entrypoints.
I intend this PR to attack that problem as completely as possible. When we generate a function call, we now check if we are generating a function call from `compiler_builtins` and whether the callee is a function which was not lowered in the current crate, meaning we will have to link to it.
If those conditions are met, actually generating the call is asking for a linker error. So we don't. If the callee diverges, we lower to an abort with the same behavior as `core::intrinsics::abort`. If the callee does not diverge, we produce an error. This means that compiler-builtins can contain panics, but they'll SIGILL instead of panicking. I made non-diverging calls a compile error because I'm guessing that they'd mostly get into compiler-builtins by someone making a mistake while working on the crate, and compile errors are better than linker errors. We could turn such calls into aborts as well if that's preferred.
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting binary.
add test ensuring simd codegen checks don't run when a static assertion failed
stdarch relies on this to ensure that SIMD indices are in bounds.
I would love to know why this works, but I can't figure out where codegen decides to not codegen a function if a required-const does not evaluate. `@oli-obk` `@bjorn3` do you have any idea?
Expose the Freeze trait again (unstably) and forbid implementing it manually
non-emoji version of https://github.com/rust-lang/rust/pull/121501
cc #60715
This trait is useful for generic constants (associated consts of generic traits). See the test (`tests/ui/associated-consts/freeze.rs`) added in this PR for a usage example. The builtin `Freeze` trait is the only way to do it, users cannot work around this issue.
It's also a useful trait for building some very specific abstrations, as shown by the usage by the `zerocopy` crate: https://github.com/google/zerocopy/issues/941
cc ```@RalfJung```
T-lang signed off on reexposing this unstably: https://github.com/rust-lang/rust/pull/121501#issuecomment-1969827742
Distinguish between library and lang UB in assert_unsafe_precondition
As described in https://github.com/rust-lang/rust/pull/121583#issuecomment-1963168186, `assert_unsafe_precondition` now explicitly distinguishes between language UB (conditions we explicitly optimize on) and library UB (things we document you shouldn't do, and maybe some library internals assume you don't do).
`debug_assert_nounwind` was originally added to avoid the "only at runtime" aspect of `assert_unsafe_precondition`. Since then the difference between the macros has gotten muddied. This totally revamps the situation.
Now _all_ preconditions shall be checked with `assert_unsafe_precondition`. If you have a precondition that's only checkable at runtime, do a `const_eval_select` hack, as done in this PR.
r? RalfJung
Add asm goto support to `asm!`
Tracking issue: #119364
This PR implements asm-goto support, using the syntax described in "future possibilities" section of [RFC2873](https://rust-lang.github.io/rfcs/2873-inline-asm.html#asm-goto).
Currently I have only implemented the `label` part, not the `fallthrough` part (i.e. fallthrough is implicit). This doesn't reduce the expressive though, since you can use label-break to get arbitrary control flow or simply set a value and rely on jump threading optimisation to get the desired control flow. I can add that later if deemed necessary.
r? ``@Amanieu``
cc ``@ojeda``
Remove useless lifetime of ArchiveBuilder
`trait ArchiveBuilder<'a>` has a seemingly useless lifetime a, so I remove it. If this is intentional, please reject this PR.
```rust
pub trait ArchiveBuilder<'a> {
fn add_file(&mut self, path: &Path);
fn add_archive(
&mut self,
archive: &Path,
skip: Box<dyn FnMut(&str) -> bool + 'static>,
) -> io::Result<()>;
fn build(self: Box<Self>, output: &Path) -> bool;
}
```
rename 'try' intrinsic to 'catch_unwind'
The intrinsic has nothing to do with `try` blocks, and corresponds to the stable `catch_unwind` function, so this makes a lot more sense IMO.
Also rename Miri's special function while we are at it, to reflect the level of abstraction it works on: it's an unwinding mechanism, on which Rust implements panics.
Add "algebraic" fast-math intrinsics, based on fast-math ops that cannot return poison
Setting all of LLVM's fast-math flags makes our fast-math intrinsics very dangerous, because some inputs are UB. This set of flags permits common algebraic transformations, but according to the [LangRef](https://llvm.org/docs/LangRef.html#fastmath), only the flags `nnan` (no nans) and `ninf` (no infs) can produce poison.
And this uses the algebraic float ops to fix https://github.com/rust-lang/rust/issues/120720
cc `@orlp`
Invert diagnostic lints.
That is, change `diagnostic_outside_of_impl` and `untranslatable_diagnostic` from `allow` to `deny`, because more than half of the compiler has been converted to use translated diagnostics.
This commit removes more `deny` attributes than it adds `allow` attributes, which proves that this change is warranted.
r? ````@davidtwco````
That is, change `diagnostic_outside_of_impl` and
`untranslatable_diagnostic` from `allow` to `deny`, because more than
half of the compiler has be converted to use translated diagnostics.
This commit removes more `deny` attributes than it adds `allow`
attributes, which proves that this change is warranted.
remove StructuralEq trait
The documentation given for the trait is outdated: *all* function pointers implement `PartialEq` and `Eq` these days. So the `StructuralEq` trait doesn't really seem to have any reason to exist any more.
One side-effect of this PR is that we allow matching on some consts that do not implement `Eq`. However, we already allowed matching on floats and consts containing floats, so this is not new, it is just allowed in more cases now. IMO it makes no sense at all to allow float matching but also sometimes require an `Eq` instance. If we want to require `Eq` we should adjust https://github.com/rust-lang/rust/pull/115893 to check for `Eq`, and rule out float matching for good.
Fixes https://github.com/rust-lang/rust/issues/115881
Replacement of #114390: Add new intrinsic `is_var_statically_known` and optimize pow for powers of two
This adds a new intrinsic `is_val_statically_known` that lowers to [``@llvm.is.constant.*`](https://llvm.org/docs/LangRef.html#llvm-is-constant-intrinsic).` It also applies the intrinsic in the int_pow methods to recognize and optimize the idiom `2isize.pow(x)`. See #114390 for more discussion.
While I have extended the scope of the power of two optimization from #114390, I haven't added any new uses for the intrinsic. That can be done in later pull requests.
Note: When testing or using the library, be sure to use `--stage 1` or higher. Otherwise, the intrinsic will be a noop and the doctests will be skipped. If you are trying out edits, you may be interested in [`--keep-stage 0`](https://rustc-dev-guide.rust-lang.org/building/suggested.html#faster-builds-with---keep-stage).
Fixes#47234Resolves#114390
`@Centri3`
Improved support of collapse_debuginfo attribute for macros.
Added walk_chain_collapsed function to consider collapse_debuginfo attribute in parent macros in call chain.
Fixed collapse_debuginfo attribute processing for cranelift (there was if/else branches error swap).
cc https://github.com/rust-lang/rust/issues/100758
Remove `DiagCtxt` API duplication
`DiagCtxt` defines the internal API for creating and emitting diagnostics: methods like `struct_err`, `struct_span_warn`, `note`, `create_fatal`, `emit_bug`. There are over 50 methods.
Some of these methods are then duplicated across several other types: `Session`, `ParseSess`, `Parser`, `ExtCtxt`, and `MirBorrowckCtxt`. `Session` duplicates the most, though half the ones it does are unused. Each duplicated method just calls forward to the corresponding method in `DiagCtxt`. So this duplication exists to (in the best case) shorten chains like `ecx.tcx.sess.parse_sess.dcx.emit_err()` to `ecx.emit_err()`.
This API duplication is ugly and has been bugging me for a while. And it's inconsistent: there's no real logic about which methods are duplicated, and the use of `#[rustc_lint_diagnostic]` and `#[track_caller]` attributes vary across the duplicates.
This PR removes the duplicated API methods and makes all diagnostic creation and emission go through `DiagCtxt`. It also adds `dcx` getter methods to several types to shorten chains. This approach scales *much* better than API duplication; indeed, the PR adds `dcx()` to numerous types that didn't have API duplication: `TyCtxt`, `LoweringCtxt`, `ConstCx`, `FnCtxt`, `TypeErrCtxt`, `InferCtxt`, `CrateLoader`, `CheckAttrVisitor`, and `Resolver`. These result in a lot of changes from `foo.tcx.sess.emit_err()` to `foo.dcx().emit_err()`. (You could do this with more types, but it gets into diminishing returns territory for types that don't emit many diagnostics.)
After all these changes, some call sites are more verbose, some are less verbose, and many are the same. The total number of lines is reduced, mostly because of the removed API duplication. And consistency is increased, because calls to `emit_err` and friends are always preceded with `.dcx()` or `.dcx`.
r? `@compiler-errors`