StructField -> FieldDef ("field definition")
Field -> ExprField ("expression field", not "field expression")
FieldPat -> PatField ("pattern field", not "field pattern")
Also rename the visitor methods and other methods that work on them.
Avoid sorting predicates by `DefId`
Fixes issue #82920
Even if an item does not change between compilation sessions, it may end
up with a different `DefId`, since inserting/deleting an item affects
the `DefId`s of all subsequent items. Therefore, we use a `DefPathHash`
in the incremental compilation system, which is stable in the face of
changes to unrelated items.
In particular, the query system will consider the inputs to a query to
be unchanged if all `DefId`s in the inputs have their `DefPathHash`es
unchanged. Queries are pure functions, so the query result should be
unchanged if the query inputs are unchanged.
Unfortunately, it's possible to inadvertently make a query result
incorrectly change across compilations, by relying on the specific value
of a `DefId`. Specifically, if the query result is a slice that gets
sorted by `DefId`, the precise order will depend on how the `DefId`s got
assigned in a particular compilation session. If some definitions end up
with different `DefId`s (but the same `DefPathHash`es) in a subsequent
compilation session, we will end up re-computing a *different* value for
the query, even though the query system expects the result to be
unchanged due to the unchanged inputs.
It turns out that we have been sorting the predicates computed during
`astconv` by their `DefId`. These predicates make their way into
`super_predicates_that_define_assoc_type`, which ends up getting used to
compute the vtables of trait objects. Thus, re-ordering these predicates
between compilation sessions can lead to undefined behavior at runtime -
the query system will re-use code built with a *differently ordered*
vtable, resulting in the wrong method being invoked at runtime.
This PR avoids sorting by `DefId` in `astconv`, fixing the
miscompilation. However, it's possible that other instances of this
issue exist - they could also be easily introduced in the future.
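As an illustration, here is a minimal, self-contained sketch of the difference (the `Def` struct is a hypothetical stand-in for definitions carrying a `DefId` and a `DefPathHash`; this is not the actual `astconv` code):

```rust
#[derive(Debug)]
struct Def {
    id: u32,        // analogous to `DefId`: depends on item insertion order
    path_hash: u64, // analogous to `DefPathHash`: stable across sessions
}

fn sort_unstable_across_sessions(defs: &mut [Def]) {
    // Order changes whenever ids are reassigned between sessions.
    defs.sort_by_key(|d| d.id);
}

fn sort_stable_across_sessions(defs: &mut [Def]) {
    // Order survives id reassignment, matching the query system's
    // expectation that unchanged inputs produce unchanged results.
    defs.sort_by_key(|d| d.path_hash);
}

fn main() {
    let mut defs = vec![
        Def { id: 2, path_hash: 0xAAAA },
        Def { id: 1, path_hash: 0xBBBB },
    ];
    sort_unstable_across_sessions(&mut defs);
    sort_stable_across_sessions(&mut defs);
    println!("{defs:?}");
}
```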
To fully fix this issue, we should
1. Turn on `-Z incremental-verify-ich` by default. This will cause the
compiler to ICE whenever an 'unchanged' query result changes between
compilation sessions, instead of causing a miscompilation.
2. Remove the `Ord` impls for `CrateNum` and `DefId`. This will make it
difficult to introduce this kind of issue in the first place.
or-patterns: disallow in `let` bindings
~~Blocked on https://github.com/rust-lang/rust/pull/81869~~
Disallows top-level or-patterns before type ascription. We want to reserve this syntactic space for possible future generalized type ascription.
r? ``@petrochenkov``
Suggest character encoding is incorrect when encountering random null bytes
This adds a note whenever null bytes are seen at the start of a token unexpectedly, since those tend to come from UTF-16 encoded files without a [BOM](https://en.wikipedia.org/wiki/Byte_order_mark) (if a UTF-16 BOM appears it won't be valid UTF-8, but if there is no BOM it can be both valid UTF-16 and valid but garbled UTF-8). This approach was suggested in https://github.com/rust-lang/rust/issues/73979#issuecomment-653976451.
Closes #73979.
This adds recovery for when, in array type syntax, the user writes
[X; Y<Z, ...>]
instead of
[X; Y::<Z, ...>]
Fixes #82566
Note that whenever we parse an expression and know that the next token
cannot be `,`, we should be calling
`check_mistyped_turbofish_with_multiple_type_params` for this recovery.
Previously we only did this for statement parsing (e.g. `let x = f<a,
b>;`). We now also do it when parsing the length field in array type
syntax.
`check_mistyped_turbofish_with_multiple_type_params` was previously
expecting type arguments between angle brackets, which is not right, as
we can also see const expressions. We now use the generic argument
parser instead of the type parser.
Tests with one, two, and three generic arguments were added to check
consistency between
1. `check_no_chained_comparison`: Called after parsing a nested binop
application like `x < A > ...` where angle brackets are interpreted as
binary operators and `A` is an expression.
2. `check_mistyped_turbofish_with_multiple_type_params`: Called by
`parse_full_stmt` when we expect to see a semicolon after parsing an
expression but don't see it.
(In `T2<1, 2>::C;`, the expression is `T2 < 1`)
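For illustration, here is a hypothetical input exercising both recovery sites; `T2` and its associated const `C` mirror the shape of the tests described above:

```rust
struct T2<const A: usize, const B: usize>;

impl<const A: usize, const B: usize> T2<A, B> {
    const C: usize = A + B;
}

fn main() {
    // Statement position (recovered previously):
    //     let n = T2<1, 2>::C;      // suggests `T2::<1, 2>::C`
    // Array-type position (recovered by this change):
    //     let a: [u8; T2<1, 2>::C]; // suggests `T2::<1, 2>::C`
    let a: [u8; T2::<1, 2>::C] = [0; 3]; // the corrected form compiles
    let _ = a;
}
```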
When token-based attribute handling is implemented in #80689,
we will need to access tokens from `HasAttrs` (to perform
cfg-stripping), and we will need to access attributes from `HasTokens`
(to construct a `PreexpTokenStream`).
This PR merges the `HasAttrs` and `HasTokens` traits into a new
`AstLike` trait. The previous `HasAttrs` impls for `Vec<Attribute>` and
`AttrVec` are removed - they aren't attribute targets, so the impls
never really made sense.
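Here is a rough, self-contained sketch of the merged trait, using illustrative stand-in types (the real `rustc_ast` signatures differ):

```rust
struct Attribute;        // illustrative stand-ins for the real AST types
struct LazyTokenStream;

// Sketch of the merged trait: attribute access (formerly `HasAttrs`) and
// token access (formerly `HasTokens`) live on a single trait, so code
// like cfg-stripping can reach both through one bound.
trait AstLike {
    fn attrs(&self) -> &[Attribute];
    fn tokens_mut(&mut self) -> Option<&mut Option<LazyTokenStream>>;
}

struct Item {
    attrs: Vec<Attribute>,
    tokens: Option<LazyTokenStream>,
}

impl AstLike for Item {
    fn attrs(&self) -> &[Attribute] {
        &self.attrs
    }
    fn tokens_mut(&mut self) -> Option<&mut Option<LazyTokenStream>> {
        Some(&mut self.tokens)
    }
}

fn main() {
    let mut item = Item { attrs: vec![Attribute], tokens: None };
    let _ = item.attrs().len();
    let _ = item.tokens_mut();
}
```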
Improve suggestion for tuple struct pattern matching errors.
Closes #80174
This change allows numbers to be parsed as field names when pattern matching on structs, which allows us to provide better error messages when tuple structs are matched using a struct pattern.
r? ``@estebank``
ast: Keep expansion status for out-of-line module items
I.e. whether a module `mod foo;` is already loaded from a file or not.
This is a pre-requisite to correctly treating inner attributes on such modules (https://github.com/rust-lang/rust/issues/81661).
With this change, the AST structure for `mod` items diverges even more from the AST structure for the crate root, which previously used `ast::Mod`.
Therefore, the first commit of this PR removes `ast::Mod` from `ast::Crate`; the two are sufficiently different from each other, at least at the syntactic level.
Customization points for visiting a "`mod` item or crate root" were also removed from AST visitors (`fn visit_mod`).
`ast::Mod` itself was refactored away in the second commit in favor of `ItemKind::Mod(Unsafe, ModKind)`.
Along the way, we also implement a handful of diagnostics improvements
and fixes, particularly with respect to the special handling of `||` in
place of `|` and when there are leading verts in function params, which
don't allow top-level or-patterns anyway.
Rollup of 11 pull requests
Successful merges:
- #80523 (#[doc(inline)] sym_generated)
- #80920 (Visit more targets when validating attributes)
- #81720 (Updated smallvec version due to RUSTSEC-2021-0003)
- #81891 ([rustdoc-json] Make `header` a vec of modifiers, and FunctionPointer consistent)
- #81912 (Implement the precise analysis pass for lint `disjoint_capture_drop_reorder`)
- #81914 (Fixing bad suggestion for `_` in `const` type when a function #81885)
- #81919 (BTreeMap: fix internal comments)
- #81927 (Add a regression test for #32498)
- #81965 (Fix MIR pretty printer for non-local DefIds)
- #82029 (Use debug log level for developer oriented logs)
- #82056 (fix ice (#82032))
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
This is a pure refactoring split out from #80689.
It represents the most invasive part of that PR, requiring changes in
every caller of `parse_outer_attributes`.
In order to eagerly expand `#[cfg]` attributes while preserving the
original `TokenStream`, we need to know the range of tokens that
corresponds to every attribute target. This is accomplished by making
`parse_outer_attributes` return an opaque `AttrWrapper` struct. An
`AttrWrapper` must be converted to a plain `AttrVec` by passing it to
`collect_tokens_trailing_token`. This makes it difficult to accidentally
construct an AST node with attributes without calling `collect_tokens_trailing_token`,
since AST nodes store an `AttrVec`, not an `AttrWrapper`.
As a result, we now call `collect_tokens_trailing_token` for attribute
targets which only support inert attributes, such as generic arguments
and struct fields. Currently, the constructed `LazyTokenStream` is
simply discarded. Future PRs will record the token range corresponding
to the attribute target, allowing those tokens to be removed from an
enclosing `collect_tokens_trailing_token` call if necessary.
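Here is a self-contained analogue of that pattern (illustrative only; the actual parser types and signatures differ):

```rust
// The wrapper's payload can only be extracted through the
// token-collecting entry point, so callers cannot accidentally build an
// AST node without going through collection.
mod parser {
    pub struct AttrWrapper(Vec<String>); // field is private to this module

    impl AttrWrapper {
        pub fn new(attrs: Vec<String>) -> Self {
            AttrWrapper(attrs)
        }
    }

    // The only way to get the attributes back out of an `AttrWrapper`.
    pub fn collect_tokens_trailing_token<R>(
        wrapper: AttrWrapper,
        f: impl FnOnce(Vec<String>) -> R,
    ) -> R {
        // ...token range bookkeeping would happen around this call...
        f(wrapper.0)
    }
}

fn main() {
    let attrs = parser::AttrWrapper::new(vec!["#[inline]".to_string()]);
    let node = parser::collect_tokens_trailing_token(attrs, |attrs| {
        format!("node with {} attribute(s)", attrs.len())
    });
    println!("{node}");
}
```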
The panic happens when, during recovery, parsing a full `impl`
(`parse_item_impl`) fails and we drop the `DiagnosticBuilder` for the
recovery suggestion and return the `parse_item_impl` error.
We now raise the original error "expected identifier found `impl`" when
parsing the `impl` fails.
Note that the regression test is a slightly simplified version of the
original repro in #81806, to make the error output smaller and more
resilient to unrelated changes in parser error messages.
Fixes #81806
Box the biggest ast::ItemKind variants
This PR is a different approach on https://github.com/rust-lang/rust/pull/81400, aiming to save memory in humongous ASTs.
The three affected item kind enums are:
- `ast::ItemKind` (208 -> 112 bytes)
- `ast::AssocItemKind` (176 -> 72 bytes)
- `ast::ForeignItemKind` (176 -> 72 bytes)
Improve handling of spans around macro result parse errors
Fixes #81543
After we expand a macro, we try to parse the resulting tokens as an AST
node. This commit makes several improvements to how we handle spans when
an error occurs:
* Only overwrite the original `Span` if it's a dummy span. This preserves
a more-specific span if one is available.
* Use `self.prev_token` instead of `self.token` when emitting an error
message after encountering EOF, since an EOF token always has a dummy
span
* Make `SourceMap::next_point` leave dummy spans unchanged. A dummy span
does not have a logical 'next point', since it's a zero-length span.
Re-using the span preserves its 'dummy-ness' for other checks.
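A self-contained analogue of the dummy-span guard described in the last bullet (illustrative; not the real `rustc_span` types):

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    lo: u32,
    hi: u32,
}

const DUMMY_SP: Span = Span { lo: 0, hi: 0 };

impl Span {
    fn is_dummy(self) -> bool {
        self == DUMMY_SP
    }
}

fn next_point(sp: Span) -> Span {
    // A zero-length dummy span has no logical 'next point'; returning it
    // unchanged preserves its dummy-ness for later checks.
    if sp.is_dummy() {
        return sp;
    }
    Span { lo: sp.hi, hi: sp.hi + 1 }
}

fn main() {
    assert_eq!(next_point(DUMMY_SP), DUMMY_SP);
    assert_eq!(next_point(Span { lo: 3, hi: 5 }), Span { lo: 5, hi: 6 });
}
```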
Clone entire `TokenCursor` when collecting tokens
Reverts PR #80830. Fixes taiki-e/pin-project#312.
We can have an arbitrary number of `None`-delimited group frames pushed
on the stack due to proc-macro invocations, which can legally be exited.
Attempting to account for this would add a lot of complexity for a tiny
performance gain, so let's just use the original strategy.
Improve diagnostics when parsing angle args
https://github.com/rust-lang/rust/pull/79266 introduced parsing of generic arguments in associated type constraints; however, this resulted in possibly very confusing error messages in cases where closing angle brackets were missing, such as in `Vec<(u32, _, _) = vec![]`, which output an incorrectly parsed equality constraint error, as noted by `@cynecx`.
This PR tries to provide better error messages in such cases.
r? `@petrochenkov`
Refactor token collection to capture trailing token immediately
Split out from https://github.com/rust-lang/rust/pull/80689 - when we start capturing more information about attribute targets, we'll need to know in advance if we're capturing a trailing token or not.
r? `@ghost`
Currently, when a user uses a struct pattern to pattern match on
a tuple struct, the errors we emit generally suggest adding fields
using their field names, which are numbers. However, numbers are
not valid identifiers, so the suggestions, which use the shorthand
notation, are not valid syntax. This commit changes those errors
to suggest using the actual tuple struct pattern syntax instead,
which is a more actionable suggestion.
Fixes #81007
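A sketch of the scenario, using a hypothetical `Pair` tuple struct:

```rust
struct Pair(u32, u32);

fn main() {
    let p = Pair(1, 2);
    // let Pair { first, second } = p; // error: `Pair` has no fields
    //                                 // named `first` / `second`; the
    //                                 // old suggestion added fields `0`
    //                                 // and `1` in shorthand form, which
    //                                 // is not valid syntax.
    let Pair(first, second) = p; // the newly suggested, valid form
    println!("{first} {second}");
}
```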
Previously, we would fail to collect tokens in the proper place when
only builtin attributes were present. As a result, we would end up with
attribute tokens in the collected `TokenStream`, leading to duplication
when we attempted to prepend the attributes from the AST node.
We now explicitly track when token collection must be performed due to
nonterminal parsing.
Set tokens on AST node in `collect_tokens`
A new `HasTokens` trait is introduced, which is used to move logic from
the callers of `collect_tokens` into the body of `collect_tokens`.
In addition to reducing duplication, this paves the way for PR #80689,
which needs to perform additional logic during token collection.
Rework diagnostics for wrong number of generic args (fixes #66228 and #71924)
This PR reworks the `wrong number of {} arguments` message, so that it provides more details and contextual hints.
Suggest async {} for async || {}
Fixes #76011
This adds support for attaching help diagnostics to the feature gating
checks and then uses it for the `async_closure` gate to add the extra
bit of help information described in the issue.
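An illustrative example of the gate and the suggested alternative (the diagnostic text in the comments is paraphrased, not the exact compiler output):

```rust
fn main() {
    // let c = async || {};   // error[E0658]: async closures are unstable
    //                        // help: to use an async block, remove the `||`
    let block = async {};     // accepted: an async block
    let _ = block;
}
```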
We will never need to pop past our starting frame during token
capturing. Using an empty stack allows us to avoid pointless heap
allocations/deallocations.
- Adds optional default values to const generic parameters in the AST
and HIR
- Parses these optional default values
- Adds a `const_generics_defaults` feature gate
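An illustrative example of the new syntax (at the time of this change it required the `const_generics_defaults` feature gate):

```rust
// `CAP` defaults to 4 when not given at the use site.
struct SmallBuf<T, const CAP: usize = 4> {
    len: usize,
    data: [Option<T>; CAP],
}

fn main() {
    // `SmallBuf<u8>` uses the default, so this is `SmallBuf<u8, 4>`.
    let buf: SmallBuf<u8> = SmallBuf { len: 0, data: [None; 4] };
    let _ = buf.len;
}
```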
Implemented a compiler diagnostic for move async mistake
Fixes #79694
First time contributing, so I hope I'm doing everything right.
(If not, please correct me!)
This code performs a check when a move capture clause is parsed. The check detects whether the user has reversed the `async move` keywords and provides a diagnostic with a suggestion to fix it.
Checked code:
```rust
fn main() {
    move async { };
}
```
Previous output:
```txt
PS C:\Repos\move_async_test> cargo build
   Compiling move_async_test v0.1.0 (C:\Repos\move_async_test)
error: expected one of `|` or `||`, found keyword `async`
 --> src\main.rs:2:10
  |
2 |     move async { };
  |          ^^^^^ expected one of `|` or `||`

error: aborting due to previous error
error: could not compile `move_async_test`
```
New output:
```txt
PS C:\Repos\move_async_test> cargo +dev build
   Compiling move_async_test v0.1.0 (C:\Repos\move_async_test)
error: the order of `move` and `async` is incorrect
 --> src\main.rs:2:13
  |
2 |     let _ = move async { };
  |             ^^^^^^^^^^
  |
help: try switching the order
  |
2 |     let _ = async move { };
  |             ^^^^^^^^^^

error: aborting due to previous error
error: could not compile `move_async_test`
```
Is there a file/module where these kinds of things are tested?
Would love some feedback 😄
Ran the tidy check
Following the diagnostic guide better
Diagnostic generation is now relegated to its own function in the diagnostics module.
Added tests
Fixed the ui test
Properly capture trailing 'unglued' token
If we try to capture the `Vec<u8>` in `Option<Vec<u8>>`, we'll
need to capture a `>` token which was 'unglued' from a `>>` token.
The process of unglueing a token for parsing purposes bypasses the
usual capturing infrastructure, so we currently lose the trailing `>`.
As a result, we fall back to the reparsed `TokenStream`, causing us to
lose spans.
This commit makes token capturing keep track of a trailing 'unglued'
token. Note that we don't need to care about unglueing except at the end
of the captured tokens - if we capture both the first and second unglued
tokens, then we'll end up capturing the full 'glued' token, which
already works correctly.
Recover on `const impl<> X for Y`
`@leonardo-m` mentioned that `const impl Foo for Bar` could be recovered from in #79287.
I'm not sure about the error strings as they are; I think it should probably be something like the error that `expected_one_of_not_found` makes plus the suggestion to flip the keywords, but I'm not sure how exactly to do that. Also, I decided not to try to handle `const unsafe impl` or `unsafe const impl` because I figured that `unsafe impl const` would be pretty rare anyway (if it's even valid?), and it wouldn't be worth making the code more messy.
Update error to reflect that integer literals can have float suffixes
For example, `1` is parsed as an integer literal, but it can be turned
into a float with the suffix `f32`. Now the error calls them "numeric
literals" and notes that you can add a float suffix since they can be
either integers or floats.
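Illustrative examples (the diagnostic wording in the comments is paraphrased):

```rust
fn main() {
    let a = 1f32;     // integer-looking literal with a float suffix: a float
    let b = 1u8;      // the same literal with an integer suffix: an integer
    // let c = 1f17;  // error: invalid suffix `f17` for numeric literal
    //                // (previously the error said "integer literal")
    println!("{a} {b}");
}
```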
rustc_parse: fix ConstBlock expr span
The span for a `ConstBlock` expression should presumably run through the end of the block it contains and not stop at the keyword, just as is done with similar block-containing expression kinds, such as a `TryBlock`.
Properly handle attributes on statements
We now collect tokens for the underlying node wrapped by `StmtKind`
instead of storing tokens directly in `Stmt`.
`LazyTokenStream` now supports capturing a trailing semicolon after it
is initially constructed. This allows us to avoid refactoring statement
parsing to wrap the parsing of the semicolon in `parse_tokens`.
Attributes on item statements
(e.g. `fn foo() { #[bar] struct MyStruct; }`) are now treated as
item attributes, not statement attributes, which is consistent with how
we handle attributes on other kinds of statements. The feature-gating
code is adjusted so that proc-macro attributes are still allowed on item
statements on stable.
Two built-in macros (`#[global_allocator]` and `#[test]`) needed to be
adjusted to support being passed `Annotatable::Stmt`.
When parsing a statement (e.g. inside a function body),
we now consider `struct Foo {};` and `$stmt;` to each consist
of two statements: `struct Foo {}` and `;`, and `$stmt` and `;`.
As a result, an attribute macro invoked as
`fn foo() { #[attr] struct Bar{}; }` will see `struct Bar{}` as its
input. Additionally, the 'unused semicolon' lint now fires in more
places.
Cache pretty-print/retokenize result to avoid compile time blowup
Fixes #79242
If a `macro_rules!` recursively builds up a nested nonterminal
(passing it to a proc-macro at each step), we will end up repeatedly
pretty-printing/retokenizing the same nonterminals. Unfortunately, the
'probable equality' check we do has a non-trivial cost, which leads to a
blowup in compilation time.
As a workaround, we cache the result of the 'probable equality' check,
which eliminates the compilation time blowup for the linked issue. This
commit only touches a single file (other than adding tests), so it
should be easy to backport.
The proper solution is to remove the pretty-print/retokenize hack
entirely. However, this will almost certainly break a large number of
crates that were relying on hygiene bugs created by using the reparsed
`TokenStream`. As a result, we will definitely not want to backport
such a change.
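A self-contained sketch of the memoization workaround (illustrative; not the compiler's actual data structures):

```rust
use std::collections::HashMap;

// Memoize the expensive 'probable equality' check so that a repeatedly
// nested nonterminal is only compared once.
fn probably_equal_cached(
    cache: &mut HashMap<u64, bool>,
    nonterminal_id: u64, // stands in for the nonterminal's identity
    expensive_check: impl FnOnce() -> bool,
) -> bool {
    *cache
        .entry(nonterminal_id)
        .or_insert_with(expensive_check)
}

fn main() {
    let mut cache = HashMap::new();
    // The first query runs the pretty-print/retokenize comparison...
    assert!(probably_equal_cached(&mut cache, 42, || true));
    // ...repeat queries for the same nonterminal hit the cache.
    assert!(probably_equal_cached(&mut cache, 42, || unreachable!()));
}
```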
Make `_` an expression, to discard values in destructuring assignments
This is the third and final step towards implementing destructuring assignment (RFC: rust-lang/rfcs#2909, tracking issue: #71126). This PR is the third and final part of #71156, which was split up to allow for easier review.
With this PR, an underscore `_` is parsed as an expression but is allowed *only* on the left-hand side of a destructuring assignment. There it simply discards a value, similarly to the wildcard `_` in patterns. For instance,
```rust
(a, _) = (1, 2)
```
will simply assign 1 to `a` and discard the 2. Note that for consistency,
```
_ = foo
```
is also allowed and equivalent to just `foo`.
Thanks to ````@varkor```` who helped with the implementation, particularly around pre-expansion gating.
r? ````@petrochenkov````
rustc_parse: Remove optimization for 0-length streams in `collect_tokens`
The optimization conflates empty token streams with unknown token stream, which is at least suspicious, and doesn't affect performance because 0-length token streams are very rare.
r? `@Aaron1011`
Implement destructuring assignment for structs and slices
This is the second step towards implementing destructuring assignment (RFC: rust-lang/rfcs#2909, tracking issue: #71126). This PR is the second part of #71156, which was split up to allow for easier review.
Note that the first PR (#78748) is not merged yet, so it is included as the first commit in this one. I thought this would allow the review to start earlier because I have some time this weekend to respond to reviews. If ``@petrochenkov`` prefers to wait until the first PR is merged, I totally understand, of course.
This PR implements destructuring assignment for (tuple) structs and slices. In order to do this, the following *parser change* was necessary: struct expressions are not required to have a base expression, i.e. `Struct { a: 1, .. }` becomes legal (in order to act like a struct pattern).
Unfortunately, this PR slightly regresses the diagnostics implemented in #77283. However, it is only a missing help message in `src/test/ui/issues/issue-77218.rs`. Other instances of this diagnostic are not affected. Since I don't exactly understand how this help message works and how to fix it yet, I was hoping it's OK to regress this temporarily and fix it in a follow-up PR.
Thanks to ``@varkor`` who helped with the implementation, particularly around the struct rest changes.
r? ``@petrochenkov``
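Illustrative examples of what this enables, written in the eventual surface syntax (destructuring assignment was still feature-gated at the time):

```rust
struct Point { x: i32, y: i32 }

fn main() {
    let (mut x, mut y);
    Point { x, y } = Point { x: 1, y: 2 }; // struct destructuring assignment
    [x, y] = [3, 4];                       // slice destructuring assignment
    // The parser change: a struct expression (and hence the left-hand
    // side here) may end in `..` with no base expression after it.
    Point { x, .. } = Point { x: 5, y: 6 };
    println!("{x} {y}");
}
```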
Do not collect tokens for doc comments
A doc comment is a single token, and the AST has all the information needed to re-create it precisely.
Doc comments are also responsible for the majority of calls to `collect_tokens` (with `num_calls == 1` and `num_calls == 0`, cc https://github.com/rust-lang/rust/pull/78736).
(I also moved token collection into `fn parse_attribute` to deduplicate code a bit.)
r? `@Aaron1011`
rustc_ast: Do not panic by default when visiting macro calls
Panicking by default made sense when we didn't have HIR or MIR and everything worked on the AST, but now all AST visitors run early, and the majority of them have to deal with macro calls, often by ignoring them.
The second commit renames `visit_mac` to `visit_mac_call`, the corresponding structures were renamed earlier in https://github.com/rust-lang/rust/pull/69589.
Fixes #78675
We now bail out of `prepend_attrs` if we ended up capturing any inner
attributes (which can happen in several places, due to token capturing
for `macro_rules!` arguments).
Adjust turbofish help message for const generics
Types are no longer special. (This message arguably only makes sense with `min_const_generics` or more, but we'll be there soon.)
r? @lcnr
Tweak invalid `fn` header and body parsing
* Rely on regular "expected"/"found" parser error for `fn`, fix #77115
* Recover empty `fn` bodies when encountering `}`
* Recover trailing `>` in return types
* Recover from non-type in array type `[<BAD TOKEN>; LEN]`
Suggest that expressions that look like const generic arguments should be enclosed in brackets
I pulled out the changes for const expressions from https://github.com/rust-lang/rust/pull/71592 (without the trait object diagnostic changes) and made some small changes; the implementation is `@estebank`'s.
We're also going to want to make some changes separately to account for trait objects (they result in poor diagnostics, as is evident from one of the test cases here), such as an adaption of https://github.com/rust-lang/rust/pull/72273.
Fixes https://github.com/rust-lang/rust/issues/70753.
r? `@petrochenkov`
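An illustrative example of the input the suggestion targets (diagnostic text paraphrased):

```rust
struct Buf<const N: usize>;

fn main() {
    // let _: Buf<1 + 1>;    // error: expressions must be enclosed in
    //                       // braces to be used as const generic arguments
    let _: Buf<{ 1 + 1 }>;   // the suggested, accepted form
}
```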