Remove nondeterminism in multiple-definitions test
Compare all fields in `DllImport` when sorting to avoid nondeterminism in the error for multiple inconsistent definitions of an extern function. Restore the multiple-definitions test.
Resolves#87084.
Make expansions stable for incr. comp.
This PR aims to make expansions stable for incr. comp. by using the same architecture as definitions:
- the interned identifier `ExpnId` contains a `CrateNum` and a crate-local id;
- bidirectional maps `ExpnHash <-> ExpnId` are setup;
- incr. comp. on-disk cache saves and reconstructs expansions using their `ExpnHash`.
I tried to use as many `LocalExpnId` as I could in the resolver code, but I may have missed a few opportunities.
All this will allow to use an `ExpnId` as a query key, and to force this query without recomputing caller queries. For instance, this will be used to implement #85999.
r? `@petrochenkov`
CTFE/Miri engine Pointer type overhaul
This fixes the long-standing problem that we are using `Scalar` as a type to represent pointers that might be integer values (since they point to a ZST). The main problem is that with int-to-ptr casts, there are multiple ways to represent the same pointer as a `Scalar` and it is unclear if "normalization" (i.e., the cast) already happened or not. This leads to ugly methods like `force_mplace_ptr` and `force_op_ptr`.
Another problem this solves is that in Miri, it would make a lot more sense to have the `Pointer::offset` field represent the full absolute address (instead of being relative to the `AllocId`). This means we can do ptr-to-int casts without access to any machine state, and it means that the overflow checks on pointer arithmetic are (finally!) accurate.
To solve this, the `Pointer` type is made entirely parametric over the provenance, so that we can use `Pointer<AllocId>` inside `Scalar` but use `Pointer<Option<AllocId>>` when accessing memory (where `None` represents the case that we could not figure out an `AllocId`; in that case the `offset` is an absolute address). Moreover, the `Provenance` trait determines if a pointer with a given provenance can be cast to an integer by simply dropping the provenance.
I hope this can be read commit-by-commit, but the first commit does the bulk of the work. It introduces some FIXMEs that are resolved later.
Fixes https://github.com/rust-lang/miri/issues/841
Miri PR: https://github.com/rust-lang/miri/pull/1851
r? `@oli-obk`
Update Rust Float-Parsing Algorithms to use the Eisel-Lemire algorithm.
# Summary
Rust, although it implements a correct float parser, has major performance issues in float parsing. Even for common floats, the performance can be 3-10x [slower](https://arxiv.org/pdf/2101.11408.pdf) than external libraries such as [lexical](https://github.com/Alexhuszagh/rust-lexical) and [fast-float-rust](https://github.com/aldanor/fast-float-rust).
Recently, major advances in float-parsing algorithms have been developed by Daniel Lemire, along with others, and implement a fast, performant, and correct float parser, with speeds up to 1200 MiB/s on Apple's M1 architecture for the [canada](0e2b5d163d/data/canada.txt) dataset, 10x faster than Rust's 130 MiB/s.
In addition, [edge-cases](https://github.com/rust-lang/rust/issues/85234) in Rust's [dec2flt](868c702d0c/library/core/src/num/dec2flt) algorithm can lead to over a 1600x slowdown relative to efficient algorithms. This is due to the use of Clinger's correct, but slow [AlgorithmM and Bellepheron](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4152&rep=rep1&type=pdf), which have been improved by faster big-integer algorithms and the Eisel-Lemire algorithm, respectively.
Finally, this algorithm provides substantial improvements in the number of floats the Rust core library can parse. Denormal floats with a large number of digits cannot be parsed, due to use of the `Big32x40`, which simply does not have enough digits to round a float correctly. Using a custom decimal class, with much simpler logic, we can parse all valid decimal strings of any digit count.
```rust
// Issue in Rust's dec2fly.
"2.47032822920623272088284396434110686182e-324".parse::<f64>(); // Err(ParseFloatError { kind: Invalid })
```
# Solution
This pull request implements the Eisel-Lemire algorithm, modified from [fast-float-rust](https://github.com/aldanor/fast-float-rust) (which is licensed under Apache 2.0/MIT), along with numerous modifications to make it more amenable to inclusion in the Rust core library. The following describes both features in fast-float-rust and improvements in fast-float-rust for inclusion in core.
**Documentation**
Extensive documentation has been added to ensure the code base may be maintained by others, which explains the algorithms as well as various associated constants and routines. For example, two seemingly magical constants include documentation to describe how they were derived as follows:
```rust
// Round-to-even only happens for negative values of q
// when q ≥ −4 in the 64-bit case and when q ≥ −17 in
// the 32-bitcase.
//
// When q ≥ 0,we have that 5^q ≤ 2m+1. In the 64-bit case,we
// have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case,we have
// 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
//
// When q < 0, we have w ≥ (2m+1)×5^−q. We must have that w < 2^64
// so (2m+1)×5^−q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
// or 2m+1 > 2^24 (32-bit case). Hence,we must have 2^53×5^−q < 2^64
// (64-bit) and 2^24×5^−q < 2^64 (32-bit). Hence we have 5^−q < 2^11
// or q ≥ −4 (64-bit case) and 5^−q < 2^40 or q ≥ −17 (32-bitcase).
//
// Thus we have that we only need to round ties to even when
// we have that q ∈ [−4,23](in the 64-bit case) or q∈[−17,10]
// (in the 32-bit case). In both cases,the power of five(5^|q|)
// fits in a 64-bit word.
const MIN_EXPONENT_ROUND_TO_EVEN: i32;
const MAX_EXPONENT_ROUND_TO_EVEN: i32;
```
This ensures maintainability of the code base.
**Improvements for Disguised Fast-Path Cases**
The fast path in float parsing algorithms attempts to use native, machine floats to represent both the significant digits and the exponent, which is only possible if both can be exactly represented without rounding. In practice, this means that the significant digits must be 53-bits or less and the then exponent must be in the range `[-22, 22]` (for an f64). This is similar to the existing dec2flt implementation.
However, disguised fast-path cases exist, where there are few significant digits and an exponent above the valid range, such as `1.23e25`. In this case, powers-of-10 may be shifted from the exponent to the significant digits, discussed at length in https://github.com/rust-lang/rust/issues/85198.
**Digit Parsing Improvements**
Typically, integers are parsed from string 1-at-a-time, requiring unnecessary multiplications which can slow down parsing. An approach to parse 8 digits at a time using only 3 multiplications is described in length [here](https://johnnylee-sde.github.io/Fast-numeric-string-to-int/). This leads to significant performance improvements, and is implemented for both big and little-endian systems.
**Unsafe Changes**
Relative to fast-float-rust, this library makes less use of unsafe functionality and clearly documents it. This includes the refactoring and documentation of numerous unsafe methods undesirably marked as safe. The original code would look something like this, which is deceptively marked as safe for unsafe functionality.
```rust
impl AsciiStr {
#[inline]
pub fn step_by(&mut self, n: usize) -> &mut Self {
unsafe { self.ptr = self.ptr.add(n) };
self
}
}
...
#[inline]
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
// the first character is 'e'/'E' and scientific mode is enabled
let start = *s;
s.step();
...
}
```
The new code clearly documents safety concerns, and does not mark unsafe functionality as safe, leading to better safety guarantees.
```rust
impl AsciiStr {
/// Advance the view by n, advancing it in-place to (n..).
pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
// SAFETY: same as step_by, safe as long n is less than the buffer length
self.ptr = unsafe { self.ptr.add(n) };
self
}
}
...
/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
let start = *s;
// SAFETY: the first character is 'e'/'E' and scientific mode is enabled
unsafe {
s.step();
}
...
}
```
This allows us to trivially demonstrate the new implementation of dec2flt is safe.
**Inline Annotations Have Been Removed**
In the previous implementation of dec2flt, inline annotations exist practically nowhere in the entire module. Therefore, these annotations have been removed, which mostly does not impact [performance](https://github.com/aldanor/fast-float-rust/issues/15#issuecomment-864485157).
**Fixed Correctness Tests**
Numerous compile errors in `src/etc/test-float-parse` were present, due to deprecation of `time.clock()`, as well as the crate dependencies with `rand`. The tests have therefore been reworked as a [crate](https://github.com/Alexhuszagh/rust/tree/master/src/etc/test-float-parse), and any errors in `runtests.py` have been patched.
**Undefined Behavior**
An implementation of `check_len` which relied on undefined behavior (in fast-float-rust) has been refactored, to ensure that the behavior is well-defined. The original code is as follows:
```rust
#[inline]
pub fn check_len(&self, n: usize) -> bool {
unsafe { self.ptr.add(n) <= self.end }
}
```
And the new implementation is as follows:
```rust
/// Check if the slice at least `n` length.
fn check_len(&self, n: usize) -> bool {
n <= self.as_ref().len()
}
```
Note that this has since been fixed in [fast-float-rust](https://github.com/aldanor/fast-float-rust/pull/29).
**Inferring Binary Exponents**
Rather than explicitly store binary exponents, this new implementation infers them from the decimal exponent, reducing the amount of static storage required. This removes the requirement to store [611 i16s](868c702d0c/library/core/src/num/dec2flt/table.rs (L8)).
# Code Size
The code size, for all optimizations, does not considerably change relative to before for stripped builds, however it is **significantly** smaller prior to stripping the resulting binaries. These binary sizes were calculated on x86_64-unknown-linux-gnu.
**new**
Using rustc version 1.55.0-dev.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|400k|300K
1|396k|292K
2|392k|292K
3|392k|296K
s|396k|292K
z|396k|292K
**old**
Using rustc version 1.53.0-nightly.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|3.2M|304K
1|3.2M|292K
2|3.1M|284K
3|3.1M|284K
s|3.1M|284K
z|3.1M|284K
# Correctness
The dec2flt implementation passes all of Rust's unittests and comprehensive float parsing tests, along with numerous other tests such as Nigel Toa's comprehensive float [tests](https://github.com/nigeltao/parse-number-fxx-test-data) and Hrvoje Abraham [strtod_tests](https://github.com/ahrvoje/numerics/blob/master/strtod/strtod_tests.toml). Therefore, it is unlikely that this algorithm will incorrectly round parsed floats.
# Issues Addressed
This will fix and close the following issues:
- resolves#85198
- resolves#85214
- resolves#85234
- fixes#31407
- fixes#31109
- fixes#53015
- resolves#68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Implementation is based off fast-float-rust, with a few notable changes.
- Some unsafe methods have been removed.
- Safe methods with inherently unsafe functionality have been removed.
- All unsafe functionality is documented and provably safe.
- Extensive documentation has been added for simpler maintenance.
- Inline annotations on internal routines has been removed.
- Fixed Python errors in src/etc/test-float-parse/runtests.py.
- Updated test-float-parse to be a library, to avoid missing rand dependency.
- Added regression tests for #31109 and #31407 in core tests.
- Added regression tests for #31109 and #31407 in ui tests.
- Use the existing slice primitive to simplify shared dec2flt methods
- Remove Miri ignores from dec2flt, due to faster parsing times.
- resolves#85198
- resolves#85214
- resolves#85234
- fixes#31407
- fixes#31109
- fixes#53015
- resolves#68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Add initial implementation of HIR-based WF checking for diagnostics
During well-formed checking, we walk through all types 'nested' in
generic arguments. For example, WF-checking `Option<MyStruct<u8>>`
will cause us to check `MyStruct<u8>` and `u8`. However, this is done
on a `rustc_middle::ty::Ty`, which has no span information. As a result,
any errors that occur will have a very general span (e.g. the
definintion of an associated item).
This becomes a problem when macros are involved. In general, an
associated type like `type MyType = Option<MyStruct<u8>>;` may
have completely different spans for each nested type in the HIR. Using
the span of the entire associated item might end up pointing to a macro
invocation, even though a user-provided span is available in one of the
nested types.
This PR adds a framework for HIR-based well formed checking. This check
is only run during error reporting, and is used to obtain a more precise
span for an existing error. This is accomplished by individually
checking each 'nested' type in the HIR for the type, allowing us to
find the most-specific type (and span) that produces a given error.
The majority of the changes are to the error-reporting code. However,
some of the general trait code is modified to pass through more
information.
Since this has no soundness implications, I've implemented a minimal
version to begin with, which can be extended over time. In particular,
this only works for HIR items with a corresponding `DefId` (e.g. it will
not work for WF-checking performed within function bodies).
During well-formed checking, we walk through all types 'nested' in
generic arguments. For example, WF-checking `Option<MyStruct<u8>>`
will cause us to check `MyStruct<u8>` and `u8`. However, this is done
on a `rustc_middle::ty::Ty`, which has no span information. As a result,
any errors that occur will have a very general span (e.g. the
definintion of an associated item).
This becomes a problem when macros are involved. In general, an
associated type like `type MyType = Option<MyStruct<u8>>;` may
have completely different spans for each nested type in the HIR. Using
the span of the entire associated item might end up pointing to a macro
invocation, even though a user-provided span is available in one of the
nested types.
This PR adds a framework for HIR-based well formed checking. This check
is only run during error reporting, and is used to obtain a more precise
span for an existing error. This is accomplished by individually
checking each 'nested' type in the HIR for the type, allowing us to
find the most-specific type (and span) that produces a given error.
The majority of the changes are to the error-reporting code. However,
some of the general trait code is modified to pass through more
information.
Since this has no soundness implications, I've implemented a minimal
version to begin with, which can be extended over time. In particular,
this only works for HIR items with a corresponding `DefId` (e.g. it will
not work for WF-checking performed within function bodies).
TAIT: Infer all inference variables in opaque type substitutions via InferCx
The previous algorithm was correct for the example given in its
documentation, but when the TAIT was declared as a free item
instead of an associated item, the generic parameters were the
wrong ones.
cc `@spastorino`
r? `@nikomatsakis`
The previous algorithm was correct for the example given in its
documentation, but when the TAIT was declared as a free item
instead of an associated item, the generic parameters were the
wrong ones.
Replace associated item bound vars with placeholders when projecting
Fixes#76407Fixes#76826
Similar, but more limited, to #85499. This allows us to handle things like `for<'a> <T as Trait>::Assoc<'a>` but not `for<'a> <T as Trait<'a>>::Assoc`, unblocking GATs.
r? `@nikomatsakis`
Add -Zfuture-incompat-test to assist with testing future-incompat reports.
This adds a `-Zfuture-incompat-test` cli flag to assist with testing future-incompatible reports. This flag causes all lints to be treated as a future-incompatible lint, and will emit a report for them. This is being added so that Cargo's testsuite can reliably test the reporting infrastructure. Right now, Cargo relies on using array_into_iter as a test subject. Since the breaking "future incompatible" lints are never intended to last forever, this means Cargo's testsuite would always need to keep changing to choose different lints (for example, #86330 proposed dropping that moniker for array_into_iter). With this flag, Cargo's tests can trigger any lint and check for the report.
This resolves all the problems we had around "normalizing" the representation of a Scalar in case it carries a Pointer value: we can just use Pointer if we want to have a value taht we are sure is already normalized.
CTFE engine: small cleanups
I noticed these while preparing a large PR, and figured I'd better send them ahead to not muddy the diff unnecessarily.
- remove remaining use of Pointer in Allocation API (I missed those in https://github.com/rust-lang/rust/pull/85472)
- remove unnecessary deallocate_local hack (this logic does not seem necessary any more)
r? `@oli-obk`
Simplify future incompatible reporting.
This simplifies the implementation of the future incompatible reporting system. Instead of having a separate field in the future_incompatible definition, this reuses the `FutureIncompatibilityReason` enum. It also drops the "date" field. Cargo does not use the date field, and there isn't much of a need for this to be structured, and I am skeptical that the date can be predicted reliably. The date or release version can be listed in the lint text if desired.
Revert the revert of renaming traits::VTable to ImplSource
As #72114 and #73055 were merged so closely together I think this
accidentally happened while rebasing
Support forwarding caller location through trait object method call
Since PR #69251, the `#[track_caller]` attribute has been supported on
traits. However, it only has an effect on direct (monomorphized) method
calls. Calling a `#[track_caller]` method on a trait object will *not*
propagate caller location information - instead, `Location::caller()` will
return the location of the method definition.
This PR forwards caller location information when `#[track_caller]` is
present on the method definition in the trait. This is possible because
`#[track_caller]` in this position is 'inherited' by any impls of that
trait, so all implementations will have the same ABI.
This PR does *not* change the behavior in the case where
`#[track_caller]` is present only on the impl of a trait.
While all implementations of the method might have an explicit
`#[track_caller]`, we cannot know this at codegen time, since other
crates may have impls of the trait. Therefore, we keep the current
behavior of not forwarding the caller location, ensuring that all
implementations of the trait will have the correct ABI.
See the modified test for examples of how this works
Add support for raw-dylib with stdcall, fastcall functions
Next stage of work for #58713: allow `extern "stdcall"` and `extern "fastcall"` with `#[link(kind = "raw-dylib")]`.
I've deliberately omitted support for vectorcall, as that doesn't currently work, and I wanted to get this out for review. (I haven't really investigated the vectorcall failure much yet, but at first (very cursory) glance it appears that the problem is elsewhere.)
- Closures in external crates may get compiled in because of
monomorphization. We should store names of captured variables
in `optimized_mir`, so that they are written into the metadata
file and we can use them to generate debuginfo.
- If there are breakpoints inside closures, the names of captured
variables stored in `optimized_mir` can be used to print them.
Now the name is more precise when disjoint fields are captured.
Previously, debuggers print closures as something like
```
y::main::closure-0 (0x7fffffffdd34)
```
The pointer actually references to an upvar. It is not
very obvious, especially for beginners.
It's because upvars don't have names before, as they
are packed into a tuple. This commit names the upvars,
so we can expect to see something like
```
y::main::closure-0 {_captured_ref__b: 0x[...]}
```
The `large_assignments` lints detects moves over specified limit. The
limit is configured through `move_size_limit = "N"` attribute placed at
the root of a crate. When attribute is absent, the lint is disabled.
Make it possible to enable the lint without making any changes to the
source code, through a new flag `-Zmove-size-limit=N`. For example, to
detect moves exceeding 1023 bytes in a cargo crate, including all
dependencies one could use:
```
$ env RUSTFLAGS=-Zmove-size-limit=1024 cargo build -vv
```
Query-ify global limit attribute handling
Currently, we read various 'global limits' from inner attributes the crate root (`recursion_limit`, `move_size_limit`, `type_length_limit`, `const_eval_limit`). These limits are then stored in `Sessions`, allowing them to be access from a `TyCtxt` without registering a dependency on the crate root attributes.
This PR moves the calculation of these global limits behind queries, so that we properly track dependencies on crate root attributes. During the setup of macro expansion (before we've created a `TyCtxt`), we need to access the recursion limit, which is now done by directly calling into the code shared by the normal query implementations.
Hack: Ignore inference variables in certain queries
Fixes#84841Fixes#86753
Some queries are not built to accept types with inference variables, which can lead to ICEs. These queries probably ought to be converted to canonical form, but as a quick workaround, we can return conservative results in the case that inference variables are found.
We should file a follow-up issue (and update the FIXMEs...) to do the proper refactoring.
cc `@arora-aman`
r? `@oli-obk`
Support allocation failures when interpreting MIR
This closes#79601 by handling the case where memory allocation fails during MIR interpretation, and translates that failure into an `InterpError`. The error message is "tried to allocate more memory than available to compiler" to make it clear that the memory shortage is happening at compile-time by the compiler itself, and that it is not a runtime issue.
Now that memory allocation can fail, it would be neat if Miri could simulate low-memory devices to make it easy to see how much memory a Rust program needs.
Note that this breaks Miri because it assumes that allocation can never fail.
Fix ICE when `main` is declared in an `extern` block
Changes in #84401 to implement `imported_main` changed how the crate entry point is found, and a declared `main` in an `extern` block was detected erroneously. This was causing the ICE described in #86110.
This PR adds a check for this case and emits an error instead. Previously a `main` declaration in an `extern` block was not detected as an entry point at all, so emitting an error shouldn't break anything that worked previously. In 1.52.1 stable this is demonstrated, with a `` `main` function not found`` error.
Fixes#86110
Remove unused dependencies from compiler crates
Various compiler crates have dependencies that they don't appear to use. I used some scripting to detect such dependencies, filtered them based on some manual review, and removed those that do indeed appear to be entirely unused.
Introduce -Zprofile-closures to evaluate the impact of 2229
This creates a CSV with name "closure_profile_XXXXX.csv", where the
variable part is the process id of the compiler.
To profile a cargo project you can run one of the following depending on
if you're compiling a library or a binary:
```
cargo +nightly rustc --lib -- -Zprofile-closures
cargo +nightly rustc --bin {binary_name} -- -Zprofile-closures
```
r? `@nikomatsakis`
Change vtable memory representation to use tcx allocated allocations.
This fixes https://github.com/rust-lang/rust/issues/86324. However i suspect there's more to change before it can land.
r? `@bjorn3`
cc `@rust-lang/miri`
Turn non_fmt_panic into a future_incompatible edition lint.
This turns the `non_fmt_panic` lint into a future_incompatible edition lint, so it becomes part of the `rust_2021_compatibility` group. See https://github.com/rust-lang/rust/issues/85894.
This lint produces both warnings about semantical changes (e.g. `panic!("{{")`) and things that will become hard errors (e.g. `panic!("{")`). So I added a `explain_reason: false` that supresses the default "this will become a hard error" or "the semantics will change" message, and instead added a note depending on the situation. (cc `@rylev)`
r? `@nikomatsakis`
This creates a CSV with name "closure_profile_XXXXX.csv", where the
variable part is the process id of the compiler.
To profile a cargo project you can run one of the following depending on
if you're compiling a library or a binary:
```
cargo +stage1 rustc --lib -- -Zprofile-closures
cargo +stage1 rustc --bin -- -Zprofile-closures
```
Use HTTPS links where possible
While looking at #86583, I wondered how many other (insecure) HTTP links were in `rustc`. This changes most other `http` links to `https`. While most of the links are in comments or documentation, there are a few other HTTP links that are used by CI that are changed to HTTPS.
Notes:
- I didn't change any to or in licences
- Some links don't support HTTPS :(
- Some `http` links were dead, in those cases I upgraded them to their new places (all of which used HTTPS)
Disambiguate between SourceFiles from different crates even if they have the same path
This PR fixes an ICE that can occur when the compiler encounters a source file that is part of both the local crate and an upstream crate:
1. While importing source files from an upstream crate the compiler creates a `SourceFile` entry for `foo.rs` in the `SourceMap`. Since this is an imported source file its `src` field is `None`.
2. At a later point the parser encounters `foo.rs` again. It tells the `SourceMap` to load the file but because we already have an entry for `foo.rs` the `SourceMap` will return the existing version with `src == None`.
3. The parser proceeds under the assumption that `src.is_some()` and panics when actually trying to use the file's contents.
This PR fixes the issue by adding the source file's associated `CrateNum` to the `SourceMap`'s interning key. As a consequence the two instances of the file will each have a separate entry in the `SourceMap`. They just happen to share the same file path. This approach seemed less problematic to me than trying to mutate the `SourceFile` after it had already been created.
Another, more involved, approach might be to merge the `src` and the `external_src` field.
Fixes#85955
Fix `unused_unsafe` around `await`
Enables `unused_unsafe` lint for `unsafe { future.await }`.
The existing test for this is `unsafe { println!() }`, so I assume that `println!` used to contain compiler-generated unsafe but this is no longer true, and so the existing test is broken. I replaced the test with `unsafe { ...await }`. I believe `await` is currently the only instance of compiler-generated unsafe.
Reverts some parts of #85421, but the issue predates that PR.