Commit Graph

351 Commits

Author SHA1 Message Date
Nicholas Nethercote
d59b17c5cd Remove Token::uninterpolated_span.
In favour of the similar method on `Parser`, which works on things
other than identifiers and lifetimes.
2025-04-02 06:21:16 +11:00
Nicholas Nethercote
49ed25b5d2 Remove NtExpr and NtLiteral.
Notes about tests:
- tests/ui/rfcs/rfc-2294-if-let-guard/feature-gate.rs: some messages are
  now duplicated due to repeated parsing.

- tests/ui/rfcs/rfc-2497-if-let-chains/disallowed-positions.rs: ditto.

- `tests/ui/proc-macro/macro-rules-derive-cfg.rs`: the diff looks large
  but the only difference is the insertion of a single
  invisible-delimited group around a metavar.

- `tests/ui/attributes/nonterminal-expansion.rs`: a slight span
  degradation, somehow related to the recent massive attr parsing
  rewrite (#135726). I couldn't work out exactly what is going wrong,
  but I don't think it's worth holding things up for a single slightly
  suboptimal error message.
2025-04-02 06:20:35 +11:00
lcnr
d4b8fa9e4c remove feature(inline_const_pat) 2025-03-21 09:35:31 +01:00
Zachary S
f478853f42 If a label is placed on the block of a loop instead of the header, suggest moving it to the header. 2025-03-17 01:59:37 -05:00
bors
aaa2d47dae Auto merge of #138083 - nnethercote:rm-NtItem-NtStmt, r=petrochenkov
Remove `NtItem` and `NtStmt`

Another piece of #124141.

r? `@petrochenkov`
2025-03-12 14:18:36 +00:00
Nicholas Nethercote
141719f68a Remove NtItem and NtStmt.
This involves replacing `nt_pretty_printing_compatibility_hack` with
`stream_pretty_printing_compatibility_hack`.

The handling of statements in `transcribe` is slightly different to
other nonterminal kinds, due to the lack of `from_ast` implementation
for empty statements.

Notable test changes:
- `tests/ui/proc-macro/expand-to-derive.rs`: the diff looks large but
  the only difference is the insertion of a single invisible-delimited
  group around a metavar.
2025-03-07 14:51:07 +11:00
Santiago Pastorino
65d65e5e7e
Parse and allow const use closures 2025-03-06 17:58:34 -03:00
Nicholas Nethercote
2a1e2e9632 Replace ast::TokenKind::BinOp{,Eq} and remove BinOpToken.
`BinOpToken` is badly named, because it only covers the assignable
binary ops and excludes comparisons and `&&`/`||`. Its use in
`ast::TokenKind` does allow a small amount of code sharing, but it's a
clumsy factoring.

This commit removes `ast::TokenKind::BinOp{,Eq}`, replacing each one
with 10 individual variants. This makes `ast::TokenKind` more similar to
`rustc_lexer::TokenKind`, which has individual variants for all
operators.

Although the number of lines of code increases, the number of chars
decreases due to the frequent use of shorter names like `token::Plus`
instead of `token::BinOp(BinOpToken::Plus)`.
2025-03-03 09:26:11 +11:00
Nicholas Nethercote
50076cdeb9 Remove NtPath. 2025-02-28 08:42:14 +11:00
Nicholas Nethercote
7ea59e053b Remove NtMeta.
Note: there was an existing code path involving `Interpolated` in
`MetaItem::from_tokens` that was dead. This commit transfers that to the
new form, but puts an `unreachable!` call inside it.
2025-02-28 08:42:06 +11:00
Nicholas Nethercote
ef1114a964 Remove NtPat.
The one notable test change is `tests/ui/macros/trace_faulty_macros.rs`.
This commit removes the complicated `Interpolated` handling in
`expected_expression_found` that results in a longer error message. But
I think the new, shorter message is actually an improvement.

The original complaint was in #71039, when the error message started
with "error: expected expression, found `1 + 1`". That was confusing
because `1 + 1` is an expression. Other than that, the reporter said
"the whole error message is not too bad if you ignore the first line".

Subsequently, extra complexity and wording was added to the error
message. But I don't think the extra wording actually helps all that
much. In particular, it still says of the `1+1` that "this is expected
to be expression". This repeats the problem from the original complaint!

This commit removes the extra complexity, reverting to a simpler error
message. This is primarily because the traversal is a pain without
`Interpolated` tokens. Nonetheless, I think the error message is
*improved*. It now starts with "expected expression, found `pat`
metavariable", which is much clearer and the real problem. It also
doesn't say anything specific about `1+1`, which is good, because the
`1+1` isn't really relevant to the error -- it's the `$e:pat` that's
important.
2025-02-28 08:36:12 +11:00
Michael Goulet
12e3911d81 Greatly simplify lifetime captures in edition 2024 2025-02-22 22:24:52 +00:00
Nicholas Nethercote
0f490b040a Avoid snapshotting the parser in parse_path_inner. 2025-02-21 16:48:01 +11:00
Nicholas Nethercote
76b04437be Remove NtTy.
Notes about tests:

- tests/ui/parser/macro/trait-object-macro-matcher.rs: the syntax error
  is duplicated, because it occurs now when parsing the decl macro
  input, and also when parsing the expanded decl macro. But this won't
  show up for normal users due to error de-duplication.

- tests/ui/associated-consts/issue-93835.rs: similar, plus there are
  some additional errors about this very broken code.

- The changes to metavariable descriptions in #132629 are now visible in
  error message for several tests.
2025-02-21 15:49:46 +11:00
Nicholas Nethercote
c7981d6411 Remove NtVis.
We now use invisible delimiters for expanded `vis` fragments, instead of
`Token::Interpolated`.
2025-02-21 15:49:44 +11:00
Noratrieb
8a02724b9d Fix const items not being allowed to be called r#move or r#static
Because of an ambiguity with const closures, the parser needs to ensure
that for a const item, the `const` keyword isn't followed by a `move` or
`static` keyword, as that would indicate a const closure:

```rust
fn main() {
  const move // ...
}
```

This check did not take raw identifiers into account, therefore being
unable to distinguish between `const move` and `const r#move`. The
latter is obviously not a const closure, so it should be allowed as a
const item.

This fixes the check in the parser to only treat `const ...` as a const
closure if it's followed by the *proper keyword*, and not a raw
identifier.

Additionally, this adds a large test that tests for all raw identifiers in
all kinds of positions, including `const`, to prevent issues like this
one from occurring again.
2025-02-16 18:21:40 +01:00
Askar Safin
0a21f1d0a2 tree-wide: parallel: Fully removed all Lrc, replaced with Arc 2025-02-03 13:25:57 +03:00
Matthias Krüger
78ded09912
Rollup merge of #135882 - hkBst:master, r=estebank
simplify `similar_tokens` from `Option<Vec<_>>` to `&[_]`

All uses immediately invoke contains, so maybe a further simplification is possible.
2025-01-30 12:45:27 +01:00
Yotam Ofek
8b57fd9e43 use fmt::from_fn in more places, instead of using structs that impl formatting traits 2025-01-24 14:45:56 +00:00
Marijn Schouten
ccb967438d simplify similar_tokens from Option<Vec<_>> to Vec<_> 2025-01-23 11:45:42 +01:00
Josh Stone
aef640a613 Only assert the Parser size on specific arches
The size of this struct depends on the alignment of `u128`, for example
powerpc64le and s390x have align-8 and end up with only 280 bytes. Our
64-bit tier-1 arches are the same though, so let's just assert on those.
2025-01-21 17:24:29 -08:00
Mark Rousskov
6f72f13436 Remove allocations from case-insensitive comparison to keywords 2025-01-11 12:39:44 -05:00
Nicholas Nethercote
0f7dccf784 Fix Parser size assertion on s390x.
For some reason the memory layout is different on s390x.
2024-12-19 20:06:44 +11:00
Nicholas Nethercote
b9bf0b4b10 Speed up Parser::expected_token_types.
The parser pushes a `TokenType` to `Parser::expected_token_types` on
every call to the various `check`/`eat` methods, and clears it on every
call to `bump`. Some of those `TokenType` values are full tokens that
require cloning and dropping. This is a *lot* of work for something
that is only used in error messages and it accounts for a significant
fraction of parsing execution time.

This commit overhauls `TokenType` so that `Parser::expected_token_types`
can be implemented as a bitset. This requires changing `TokenType` to a
C-style parameterless enum, and adding `TokenTypeSet` which uses a
`u128` for the bits. (The new `TokenType` has 105 variants.)

The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to
the `check`/`eat` methods. This is for maximum speed. The elements in
the pairs are always statically known; e.g. a
`token::BinOp(token::Star)` is always paired with a `TokenType::Star`.
So we now compute `TokenType`s in advance and pass them in to
`check`/`eat` rather than the current approach of constructing them on
insertion into `expected_token_types`.

Values of these pair types can be produced by the new `exp!` macro,
which is used at every `check`/`eat` call site. The macro is for
convenience, allowing any pair to be generated from a single identifier.

The ident/keyword filtering in `expected_one_of_not_found` is no longer
necessary. It was there to account for some sloppiness in
`TokenKind`/`TokenType` comparisons.

The existing `TokenType` is moved to a new file `token_type.rs`, and all
its new infrastructure is added to that file. There is more boilerplate
code than I would like, but I can't see how to make it shorter.
2024-12-19 16:05:41 +11:00
Nicholas Nethercote
d5370d981f Remove bra/ket naming.
This is a naming convention used in a handful of spots in the parser for
delimiters. It confused me when I first saw it a long time ago, and I've
never liked it. A web search says "Bra-ket notation" exists in linear
algebra but the terminology has zero prior use in a programming context,
as far as I can tell.

This commit changes it to `open`/`close`, which is consistent with the
rest of the compiler.
2024-12-19 16:05:41 +11:00
Nicholas Nethercote
fb5ba8a6d4 Tweak some parser check/eat methods.
The most significant is `check_keyword`: it now only pushes to
`expected_token_types` if the keyword check fails, which matches how all
the other `check` methods work.

The remainder are just tweaks to make these methods more consistent with
each other.
2024-12-19 16:05:41 +11:00
Nicholas Nethercote
48f7714819 Rename Parser::expected_tokens as Parser::expected_token_types.
Because the `Token` type is similar to but different to the `TokenType`
type, and the difference is important, so we want to avoid confusion.
2024-12-19 16:05:41 +11:00
许杰友 Jieyou Xu (Joe)
477f222b02
Rollup merge of #134161 - nnethercote:overhaul-token-cursors, r=spastorino
Overhaul token cursors

Some nice cleanups here.

r? `````@davidtwco`````
2024-12-18 22:56:53 +08:00
Nicholas Nethercote
2620eb42d7 Re-export more rustc_span::symbol things from rustc_span.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.

This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
2024-12-18 13:38:53 +11:00
Nicholas Nethercote
2903356b2e Overhaul TokenTreeCursor.
- Move it to `rustc_parse`, which is the only crate that uses it. This
  lets us remove all the `pub` markers from it.

- Change `next_ref` and `look_ahead` to `get` and `bump`, which work
  better for the `rustc_parse` uses.

- This requires adding a `TokenStream::get` method, which is simple.

- In `TokenCursor`, we currently duplicate the
  `DelimSpan`/`DelimSpacing`/`Delimiter` from the surrounding
  `TokenTree::Delimited` in the stack. This isn't necessary so long as
  we don't prematurely move past the `Delimited`, and is a small perf
  win on a very hot code path.

- In `parse_token_tree`, we clone the relevant `TokenTree::Delimited`
  instead of constructing an identical one from pieces.
2024-12-18 12:50:22 +11:00
Jonathan Dönszelmann
d50c0a5480
Add hir::Attribute 2024-12-15 19:18:46 +01:00
Oli Scherer
53b2c7cc95 Rename value field to expr to simplify later commits' diffs 2024-12-15 18:47:45 +01:00
Oli Scherer
778321d155 Change AttrArgs::Eq into a struct variant 2024-12-02 10:28:58 +00:00
Nicholas Nethercote
cfafa9380b Add metavariables to TokenDescription.
Pasted metavariables are wrapped in invisible delimiters, which
pretty-print as empty strings, and changing that can break some proc
macros. But error messages saying "expected identifer, found ``" are
bad. So this commit adds support for metavariables in `TokenDescription`
so they print as "metavariable" in error messages, instead of "``".

It's not used meaningfully yet, but will be needed to get rid of
interpolated tokens.
2024-11-21 08:16:55 +11:00
Nicholas Nethercote
afe238f66f Introduce InvisibleOrigin on invisible delimiters.
It's not used meaningfully yet, but will be needed to get rid of
interpolated tokens.
2024-11-21 08:16:54 +11:00
Nicholas Nethercote
99d02fb40f Optimize check_keyword_case.
`to_lowercase` allocates, but `eq_ignore_ascii_case` doesn't. This path
is hot enough that this makes a small but noticeable difference in
benchmarking.
2024-11-13 08:43:47 +11:00
Nicholas Nethercote
a201fab208 Tweak expand_incomplete_parse warning.
By using `token_descr`, as is done for many other errors, we can get
slightly better descriptions in error messages, e.g.
"macro expansion ignores token `let` and any following" becomes
"macro expansion ignores keyword `let` and any tokens following".

This will be more important once invisible delimiters start being
mentioned in error messages -- without this commit, that leads to error
messages such as "error at ``" because invisible delimiters are
pretty printed as an empty string.
2024-10-28 14:12:45 +11:00
Jubilee
515bdcda01
Rollup merge of #130551 - nnethercote:fix-break-last-token, r=petrochenkov
Fix `break_last_token`.

It currently doesn't handle the three-char tokens `>>=` and `<<=` correctly. These can be broken twice, resulting in three individual tokens. This is a latent bug that currently doesn't cause any problems, but does cause problems for #124141, because that PR increases the usage of lazy token streams.

r? `@petrochenkov`
2024-09-23 07:54:44 -07:00
Nicholas Nethercote
73cc575177 Fix break_last_token.
It currently doesn't handle the three-char tokens `>>=` and `<<=`
correctly. These can be broken twice, resulting in three individual
tokens. This is a latent bug that currently doesn't cause any problems,
but does cause problems for #124141, because that PR increases the usage
of lazy token streams.
2024-09-23 09:14:30 +10:00
Michael Goulet
c682aa162b Reformat using the new identifier sorting from rustfmt 2024-09-22 19:11:29 -04:00
Pavel Grigorenko
e90e2593ea Parser: recover from ::: to :: 2024-09-21 20:07:52 +03:00
Michael Goulet
af8d911d63 Also fix if in else 2024-09-11 17:24:01 -04:00
bors
6d05f12170 Auto merge of #129346 - nnethercote:fix-double-handling-in-collect_tokens, r=petrochenkov
Fix double handling in `collect_tokens`

Double handling of AST nodes can occur in `collect_tokens`. This is when an inner call to `collect_tokens` produces an AST node, and then an outer call to `collect_tokens` produces the same AST node. This can happen in a few places, e.g. expression statements where the statement delegates `HasTokens` and `HasAttrs` to the expression. It will also happen more after #124141.

This PR fixes some double handling cases that cause problems, including #129166.

r? `@petrochenkov`
2024-09-08 05:35:23 +00:00
Michael Goulet
97910580aa Add initial support for raw lifetimes 2024-09-06 10:32:48 -04:00
Nicholas Nethercote
1fdabfbebb Avoid double-handling of attributes in collect_tokens.
By keeping track of attributes that have been previously processed.

This fixes the `macro-rules-derive-cfg.stdout` test, and is necessary
for #124141 which removes nonterminals.

Also shrink the `SmallVec` inline size used in `IntervalSet`. 2 gives
slightly better perf than 4 now that there's an `IntervalSet` in
`Parser`, which is cloned reasonably often.
2024-08-24 06:57:47 +10:00
Nicholas Nethercote
39b38a94e3 Split the assertion in NodeRange::new. 2024-08-23 14:40:08 +10:00
Nicholas Nethercote
9d31f86f0d Overhaul token collection.
This commit does the following.

- Renames `collect_tokens_trailing_token` as `collect_tokens`, because
  (a) it's annoying long, and (b) the `_trailing_token` bit is less
  accurate now that its types have changed.

- In `collect_tokens`, adds a `Option<CollectPos>` argument and a
  `UsePreAttrPos` in the return type of `f`. These are used in
  `parse_expr_force_collect` (for vanilla expressions) and in
  `parse_stmt_without_recovery` (for two different cases of expression
  statements). Together these ensure are enough to fix all the problems
  with token collection and assoc expressions. The changes to the
  `stringify.rs` test demonstrate some of these.

- Adds a new test. The code in this test was causing an assertion
  failure prior to this commit, due to an invalid `NodeRange`.

The extra complexity is annoying, but necessary to fix the existing
problems.
2024-08-16 09:07:55 +10:00
Nicholas Nethercote
5aaa2f92ee Add an assertion to NodeRange::new. 2024-08-16 09:07:31 +10:00
Nicholas Nethercote
c8098be41f Convert a bool to Trailing.
This pre-existing type is suitable for use with the return value of the
`f` parameter in `collect_tokens_trailing_token`. The more descriptive
name will be useful because the next commit will add another boolean
value to the return value of `f`.
2024-08-16 09:07:29 +10:00
Nicholas Nethercote
7923b20dd9 Use impl PartialEq<TokenKind> for Token more.
This lets us compare a `Token` with a `TokenKind`. It's used a lot, but
can be used even more, avoiding the need for some `.kind` uses.
2024-08-14 16:37:09 +10:00