Set tokens on AST node in `collect_tokens`
A new `HasTokens` trait is introduced, which is used to move logic from
the callers of `collect_tokens` into the body of `collect_tokens`.
In addition to reducing duplication, this paves the way for PR #80689,
which needs to perform additional logic during token collection.
A new `HasTokens` trait is introduced, which is used to move logic from
the callers of `collect_tokens` into the body of `collect_tokens`.
In addition to reducing duplication, this paves the way for PR #80689,
which needs to perform additional logic during token collection.
rustc_parse: Better spans for synthesized token streams
I think using the nonterminal span for synthesizing its tokens is a better approximation than using `DUMMY_SP` or the attribute span like #79472 did in `expand.rs`.
r? `@Aaron1011`
- Adds optional default values to const generic parameters in the AST
and HIR
- Parses these optional default values
- Adds a `const_generics_defaults` feature gate
Properly handle attributes on statements
We now collect tokens for the underlying node wrapped by `StmtKind`
nstead of storing tokens directly in `Stmt`.
`LazyTokenStream` now supports capturing a trailing semicolon after it
is initially constructed. This allows us to avoid refactoring statement
parsing to wrap the parsing of the semicolon in `parse_tokens`.
Attributes on item statements
(e.g. `fn foo() { #[bar] struct MyStruct; }`) are now treated as
item attributes, not statement attributes, which is consistent with how
we handle attributes on other kinds of statements. The feature-gating
code is adjusted so that proc-macro attributes are still allowed on item
statements on stable.
Two built-in macros (`#[global_allocator]` and `#[test]`) needed to be
adjusted to support being passed `Annotatable::Stmt`.
We now collect tokens for the underlying node wrapped by `StmtKind`
instead of storing tokens directly in `Stmt`.
`LazyTokenStream` now supports capturing a trailing semicolon after it
is initially constructed. This allows us to avoid refactoring statement
parsing to wrap the parsing of the semicolon in `parse_tokens`.
Attributes on item statements
(e.g. `fn foo() { #[bar] struct MyStruct; }`) are now treated as
item attributes, not statement attributes, which is consistent with how
we handle attributes on other kinds of statements. The feature-gating
code is adjusted so that proc-macro attributes are still allowed on item
statements on stable.
Two built-in macros (`#[global_allocator]` and `#[test]`) needed to be
adjusted to support being passed `Annotatable::Stmt`.
Move lev_distance to rustc_ast, make non-generic
rustc_ast currently has a few dependencies on rustc_lexer. Ideally, an AST
would not have any dependency its lexer, for minimizing
design-time dependencies. Breaking this dependency would also have practical
benefits, since modifying rustc_lexer would not trigger a rebuild of rustc_ast.
This commit does not remove the rustc_ast --> rustc_lexer dependency,
but it does remove one of the sources of this dependency, which is the
code that handles fuzzy matching between symbol names for making suggestions
in diagnostics. Since that code depends only on Symbol, it is easy to move
it to rustc_span. It might even be best to move it to a separate crate,
since other tools such as Cargo use the same algorithm, and have simply
contain a duplicate of the code.
This changes the signature of find_best_match_for_name so that it is no
longer generic over its input. I checked the optimized binaries, and this
function was duplicated for nearly every call site, because most call sites
used short-lived iterator chains, generic over Map and such. But there's
no good reason for a function like this to be generic, since all it does
is immediately convert the generic input (the Iterator impl) to a concrete
Vec<Symbol>. This has all of the costs of generics (duplicated method bodies)
with no benefit.
Changing find_best_match_for_name to be non-generic removed about 10KB of
code from the optimized binary. I know it's a drop in the bucket, but we have
to start reducing binary size, and beginning to tame over-use of generics
is part of that.
rustc_ast currently has a few dependencies on rustc_lexer. Ideally, an AST
would not have any dependency its lexer, for minimizing unnecessarily
design-time dependencies. Breaking this dependency would also have practical
benefits, since modifying rustc_lexer would not trigger a rebuild of rustc_ast.
This commit does not remove the rustc_ast --> rustc_lexer dependency,
but it does remove one of the sources of this dependency, which is the
code that handles fuzzy matching between symbol names for making suggestions
in diagnostics. Since that code depends only on Symbol, it is easy to move
it to rustc_span. It might even be best to move it to a separate crate,
since other tools such as Cargo use the same algorithm, and have simply
contain a duplicate of the code.
This changes the signature of find_best_match_for_name so that it is no
longer generic over its input. I checked the optimized binaries, and this
function was duplicated at nearly every call site, because most call sites
used short-lived iterator chains, generic over Map and such. But there's
no good reason for a function like this to be generic, since all it does
is immediately convert the generic input (the Iterator impl) to a concrete
Vec<Symbol>. This has all of the costs of generics (duplicated method bodies)
with no benefit.
Changing find_best_match_for_name to be non-generic removed about 10KB of
code from the optimized binary. I know it's a drop in the bucket, but we have
to start reducing binary size, and beginning to tame over-use of generics
is part of that.
Make `_` an expression, to discard values in destructuring assignments
This is the third and final step towards implementing destructuring assignment (RFC: rust-lang/rfcs#2909, tracking issue: #71126). This PR is the third and final part of #71156, which was split up to allow for easier review.
With this PR, an underscore `_` is parsed as an expression but is allowed *only* on the left-hand side of a destructuring assignment. There it simply discards a value, similarly to the wildcard `_` in patterns. For instance,
```rust
(a, _) = (1, 2)
```
will simply assign 1 to `a` and discard the 2. Note that for consistency,
```
_ = foo
```
is also allowed and equivalent to just `foo`.
Thanks to ````@varkor```` who helped with the implementation, particularly around pre-expansion gating.
r? ````@petrochenkov````
Implement destructuring assignment for structs and slices
This is the second step towards implementing destructuring assignment (RFC: rust-lang/rfcs#2909, tracking issue: #71126). This PR is the second part of #71156, which was split up to allow for easier review.
Note that the first PR (#78748) is not merged yet, so it is included as the first commit in this one. I thought this would allow the review to start earlier because I have some time this weekend to respond to reviews. If ``@petrochenkov`` prefers to wait until the first PR is merged, I totally understand, of course.
This PR implements destructuring assignment for (tuple) structs and slices. In order to do this, the following *parser change* was necessary: struct expressions are not required to have a base expression, i.e. `Struct { a: 1, .. }` becomes legal (in order to act like a struct pattern).
Unfortunately, this PR slightly regresses the diagnostics implemented in #77283. However, it is only a missing help message in `src/test/ui/issues/issue-77218.rs`. Other instances of this diagnostic are not affected. Since I don't exactly understand how this help message works and how to fix it yet, I was hoping it's OK to regress this temporarily and fix it in a follow-up PR.
Thanks to ``@varkor`` who helped with the implementation, particularly around the struct rest changes.
r? ``@petrochenkov``
Do not collect tokens for doc comments
Doc comment is a single token and AST has all the information to re-create it precisely.
Doc comments are also responsible for majority of calls to `collect_tokens` (with `num_calls == 1` and `num_calls == 0`, cc https://github.com/rust-lang/rust/pull/78736).
(I also moved token collection into `fn parse_attribute` to deduplicate code a bit.)
r? `@Aaron1011`
The discussion seems to have resolved that this lint is a bit "noisy" in
that applying it in all places would result in a reduction in
readability.
A few of the trivial functions (like `Path::new`) are fine to leave
outside of closures.
The general rule seems to be that anything that is obviously an
allocation (`Box`, `Vec`, `vec![]`) should be in a closure, even if it
is a 0-sized allocation.
rustc_ast: Do not panic by default when visiting macro calls
Panicking by default made sense when we didn't have HIR or MIR and everything worked on AST, but now all AST visitors run early and majority of them have to deal with macro calls, often by ignoring them.
The second commit renames `visit_mac` to `visit_mac_call`, the corresponding structures were renamed earlier in https://github.com/rust-lang/rust/pull/69589.
rustc_ast: Visit tokens stored in AST nodes in mutable visitor
After #77271 token visiting is enabled only for one visitor in `rustc_expand\src\mbe\transcribe.rs` which applies hygiene marks to tokens produced by declarative macros (`macro_rules` or `macro`), so this change doesn't affect anything else.
When a macro has some interpolated token from an outer macro in its output
```rust
macro inner() {
$interpolated
}
```
we can use the usual interpretation of interpolated tokens in token-based model - a None-delimited group - to write this macro in an equivalent form
```rust
macro inner() {
⟪ a b c d ⟫
}
```
When we are expanding the macro `inner` we need to apply hygiene marks to all tokens produced by it, including the tokens inside the group.
Before this PR we did this by visiting the AST piece inside the interpolated token and applying marks to all spans in it.
I'm not sure this is 100% correct (ideally we should apply the marks to tokens and then re-parse the AST from tokens), but it's a very good approximation at least.
We didn't however apply the marks to actual tokens stored in the nonterminal, so if we used the nonterminal as a token rather than as an AST piece (e.g. passed it to a proc macro), then we got hygiene bugs.
This PR applies the marks to tokens in addition to the AST pieces thus fixing the issue.
r? `@Aaron1011`
Originally, there has been a dedicated pass for renumbering
AST NodeIds to have actual values. This pass had been added by
commit a5ad4c3794.
Then, later, this step was moved to where it resides now,
macro expansion. See commit c86c8d41a2
or PR #36438.
The comment snippet, added by the original commit, has
survived the times without any change, becoming outdated
at removal of the dedicated pass.
Nowadays, grepping for the next_node_id function will show up
multiple places in the compiler that call it, but the main
rewriting that the comment talks about is still done in the
expansion step, inside an innocious looking visit_id function
that's called during macro invocation collection.