Inline ExprPrecedence::order into Expr::precedence
The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from #119105 and #119427).
Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps:
1. Convert `Expr` to `ExprPrecedence` using `.precedence()`
2. Convert `ExprPrecedence` to `i8` using `.order()`
3. Compare using `<`
As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once.
The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible:
```rust
match self.kind {
ExprKind::Closure(..) => ExprPrecedence::Closure,
...
}
```
although there were exceptions of both many-to-one, and one-to-many:
```rust
ExprKind::Underscore => ExprPrecedence::Path,
ExprKind::Path(..) => ExprPrecedence::Path,
...
ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match,
ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch,
```
Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`.
You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways:
- There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls.
- We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree.
- There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum.
This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
There is a not-very-useful layering in the lexer, where
`TokenTreesReader` contains a `StringReader`. This commit combines them
and names the result `Lexer`, which is a more obvious name for it.
The methods of `Lexer` are now split across `mod.rs` and `tokentrees.rs`
which isn't ideal, but it doesn't seem worth moving a bunch of code to
avoid it.
With the insertion of a new chapter 17 on async and await to _The Rust
Programming Language_, references in compiler output to later chapters
need to be updated to avoid confusing users. Redirects exist so that
users who click old links will end up in the right place anyway, but
this way users will be directed to the right URL in the first place.
Current places where `Interpolated` is used are going to change to
instead use invisible delimiters. This prepares for that.
- It adds invisible delimiter cases to the `can_begin_*`/`may_be_*`
methods and the `failed_to_match_macro` that are equivalent to the
existing `Interpolated` cases.
- It adds panics/asserts in some places where invisible delimiters
should never occur.
- In `Parser::parse_struct_fields` it excludes an ident + invisible
delimiter from special consideration in an error message, because
that's quite different to an ident + paren/brace/bracket.
Pasted metavariables are wrapped in invisible delimiters, which
pretty-print as empty strings, and changing that can break some proc
macros. But error messages saying "expected identifer, found ``" are
bad. So this commit adds support for metavariables in `TokenDescription`
so they print as "metavariable" in error messages, instead of "``".
It's not used meaningfully yet, but will be needed to get rid of
interpolated tokens.
It was added in #123752 to handle some cases involving emoji, but it
isn't necessary because it's always treated the same as
`TokenKind::InvalidIdent`. This commit removes it, which makes things a
little simpler.
Trim whitespace in RemoveLet primary span
Separate `RemoveLet` span into primary span for `let` and removal suggestion span for `let `, so that primary span does not include whitespace.
Fixes: #133031
If a macro statement has been parsed after `else`, suggest a missing `if`:
```
error: expected `{`, found `falsy`
--> $DIR/else-no-if.rs:47:12
|
LL | } else falsy! {} {
| ---- ^^^^^
| |
| expected an `if` or a block after this `else`
|
help: add an `if` if this is the condition of a chained `else if` statement
|
LL | } else if falsy! {} {
| ++
```
Look at the expression that was parsed when trying to recover from a bad `if` condition to determine what was likely intended by the user beyond "maybe this was meant to be an `else` body".
```
error: expected `{`, found `map`
--> $DIR/missing-dot-on-if-condition-expression-fixable.rs:4:30
|
LL | for _ in [1, 2, 3].iter()map(|x| x) {}
| ^^^ expected `{`
|
help: you might have meant to write a method call
|
LL | for _ in [1, 2, 3].iter().map(|x| x) {}
| +
```
Separate `RemoveLet` span into primary span for `let` and removal
suggestion span for `let `, so that primary span does not include
whitespace.
Fixes: #133031
Signed-off-by: Tyrone Wu <wudevelops@gmail.com>
`to_lowercase` allocates, but `eq_ignore_ascii_case` doesn't. This path
is hot enough that this makes a small but noticeable difference in
benchmarking.
Enforce that raw lifetimes must be valid raw identifiers
Make sure that the identifier part of a raw lifetime is a valid raw identifier. This precludes `'r#_` and all module segment paths for now.
I don't believe this is compelling to support. This was raised by `@ehuss` in https://github.com/rust-lang/reference/pull/1603#discussion_r1822726753 (well, specifically the `'r#_` case), but I don't see why we shouldn't just make it consistent with raw identifiers.
Use `token_descr` more in error messages
This is the first two commits from #124141, put into their own PR to get things rolling. Commit messages have the details.
r? ``@estebank``
cc ``@petrochenkov``
By using `token_descr`, as is done for many other errors, we can get
slightly better descriptions in error messages, e.g.
"macro expansion ignores token `let` and any following" becomes
"macro expansion ignores keyword `let` and any tokens following".
This will be more important once invisible delimiters start being
mentioned in error messages -- without this commit, that leads to error
messages such as "error at ``" because invisible delimiters are
pretty printed as an empty string.
Rollup of 9 pull requests
Successful merges:
- #122670 (Fix bug where `option_env!` would return `None` when env var is present but not valid Unicode)
- #131095 (Use environment variables instead of command line arguments for merged doctests)
- #131339 (Expand set_ptr_value / with_metadata_of docs)
- #131652 (Move polarity into `PolyTraitRef` rather than storing it on the side)
- #131675 (Update lint message for ABI not supported)
- #131681 (Fix up-to-date checking for run-make tests)
- #131702 (Suppress import errors for traits that couldve applied for method lookup error)
- #131703 (Resolved python deprecation warning in publish_toolstate.py)
- #131710 (Remove `'apostrophes'` from `rustc_parse_format`)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `&pin (mut|const) T` type position sugar
This adds parser support for `&pin mut T` and `&pin const T` references. These are desugared to `Pin<&mut T>` and `Pin<&T>` in the AST lowering phases.
This PR currently includes #130526 since that one is in the commit queue. Only the most recent commits (bd450027eb4a94b814a7dd9c0fa29102e6361149 and following) are new.
Tracking:
- #130494
r? `@compiler-errors`
Add suggestion for removing invalid path sep `::` in fn def
Add suggestion for removing invalid path separator `::` in function definition.
for example: `fn invalid_path_separator::<T>() {}`
fixes#130791
Retire the `unnamed_fields` feature for now
`#![feature(unnamed_fields)]` was implemented in part in #115131 and #115367, however work on that feature has (afaict) stalled and in the mean time there have been some concerns raised (e.g.[^1][^2]) about whether `unnamed_fields` is worthwhile to have in the language, especially in its current desugaring. Because it represents a compiler implementation burden including a new kind of anonymous ADT and additional complication to field selection, and is quite prone to bugs today, I'm choosing to remove the feature.
However, since I'm not one to really write a bunch of words, I'm specifically *not* going to de-RFC this feature. This PR essentially *rolls back* the state of this feature to "RFC accepted but not yet implemented"; however if anyone wants to formally unapprove the RFC from the t-lang side, then please be my guest. I'm just not totally willing to summarize the various language-facing reasons for why this feature is or is not worthwhile, since I'm coming from the compiler side mostly.
Fixes#117942Fixes#121161Fixes#121263Fixes#121299Fixes#121722Fixes#121799Fixes#126969Fixes#131041
Tracking:
* https://github.com/rust-lang/rust/issues/49804
[^1]: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Unnamed.20struct.2Funion.20fields
[^2]: https://github.com/rust-lang/rust/issues/49804#issuecomment-1972619108
```
error: expected a pattern, found an expression
--> f889.rs:3:13
|
3 | let (x, y.drop()) = (1, 2); //~ ERROR
| ^^^^^^^^ not a pattern
|
= note: arbitrary expressions are not allowed in patterns: <https://doc.rust-lang.org/book/ch18-00-patterns.html>
error[E0532]: expected a pattern, found a function call
--> f889.rs:2:13
|
2 | let (x, drop(y)) = (1, 2); //~ ERROR
| ^^^^ not a tuple struct or tuple variant
|
= note: function calls are not allowed in patterns: <https://doc.rust-lang.org/book/ch18-00-patterns.html>
```
Fix#97200.
Implement RFC3695 Allow boolean literals as cfg predicates
This PR implements https://github.com/rust-lang/rfcs/pull/3695: allow boolean literals as cfg predicates, i.e. `cfg(true)` and `cfg(false)`.
r? `@nnethercote` *(or anyone with parser knowledge)*
cc `@clubby789`