Point at invalid utf-8 span on user's source code
```
error: couldn't read `$DIR/not-utf8-bin-file.rs`: stream did not contain valid UTF-8
--> $DIR/not-utf8-2.rs:6:5
|
LL | include!("not-utf8-bin-file.rs");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
note: byte `193` is not valid utf-8
--> $DIR/not-utf8-bin-file.rs:2:14
|
LL | let _ = "�|�␂!5�cc␕␂��";
| ^
= note: this error originates in the macro `include` (in Nightly builds, run with -Z macro-backtrace for more info)
```
When we attempt to load a Rust source code file, if there is a OS file failure we try reading the file as bytes. If that succeeds we try to turn it into UTF-8. If *that* fails, we provide additional context about *where* the file has the first invalid UTF-8 character.
Fix#76869.
Properly record metavar spans for other expansions other than TT
This properly records metavar spans for nonterminals other than tokentree. This means that we operations like `span.to(other_span)` work correctly for macros. As you can see, other diagnostics involving metavars have improved as a result.
Fixes#132908
Alternative to #133270
cc `@ehuss`
cc `@petrochenkov`
The size of this struct depends on the alignment of `u128`, for example
powerpc64le and s390x have align-8 and end up with only 280 bytes. Our
64-bit tier-1 arches are the same though, so let's just assert on those.
```
error: couldn't read `$DIR/not-utf8-bin-file.rs`: stream did not contain valid UTF-8
--> $DIR/not-utf8-2.rs:6:5
|
LL | include!("not-utf8-bin-file.rs");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
note: `[193]` is not valid utf-8
--> $DIR/not-utf8-bin-file.rs:2:14
|
LL | let _ = "�|�␂!5�cc␕␂��";
| ^
= note: this error originates in the macro `include` (in Nightly builds, run with -Z macro-backtrace for more info)
```
When we attempt to load a Rust source code file, if there is a OS file failure we try reading the file as bytes. If that succeeds we try to turn it into UTF-8. If *that* fails, we provide additional context about *where* the file has the first invalid UTF-8 character.
Fix#76869.
Fix parenthesization of chained comparisons by pretty-printer
Example:
```rust
macro_rules! repro {
() => {
1 < 2
};
}
fn main() {
let _ = repro!() == false;
}
```
Previously `-Zunpretty=expanded` would pretty-print this syntactically invalid output: `fn main() { let _ = 1 < 2 == false; }`
```console
error: comparison operators cannot be chained
--> <anon>:8:23
|
8 | fn main() { let _ = 1 < 2 == false; }
| ^ ^^
|
help: parenthesize the comparison
|
8 | fn main() { let _ = (1 < 2) == false; }
| + +
```
With the fix, it will print `fn main() { let _ = (1 < 2) == false; }`.
Making `-Zunpretty=expanded` consistently produce syntactically valid Rust output is important because that is what makes it possible for `cargo expand` to format and perform filtering on the expanded code.
## Review notes
According to `rg '\.fixity\(\)' compiler/` the `fixity` function is called only 3 places:
- 13170cd787/compiler/rustc_ast_pretty/src/pprust/state/expr.rs (L283-L287)
- 13170cd787/compiler/rustc_hir_pretty/src/lib.rs (L1295-L1299)
- 13170cd787/compiler/rustc_parse/src/parser/expr.rs (L282-L289)
The 2 pretty printers definitely want to treat comparisons using `Fixity::None`. That's the whole bug being fixed. Meanwhile, the parser's `Fixity::None` codepath is previously unreachable as indicated by the comment, so as long as `Fixity::None` here behaves exactly the way that `Fixity::Left` used to behave, you can tell that this PR definitely does not constitute any behavior change for the parser.
My guess for why comparison operators were set to `Fixity::Left` instead of `Fixity::None` is that it's a very old workaround for giving a good chained comparisons diagnostic (like what I pasted above). Nowadays that is handled by a different dedicated codepath.
Detect missing `.` in method chain in `let` bindings and statements
On parse errors where an ident is found where one wasn't expected, see if the next elements might have been meant as method call or field access.
```
error: expected one of `.`, `;`, `?`, `else`, or an operator, found `map`
--> $DIR/missing-dot-on-statement-expression.rs:7:29
|
LL | let _ = [1, 2, 3].iter()map(|x| x);
| ^^^ expected one of `.`, `;`, `?`, `else`, or an operator
|
help: you might have meant to write a method call
|
LL | let _ = [1, 2, 3].iter().map(|x| x);
| +
```
On parse errors where an ident is found where one wasn't expected, see if the next elements might have been meant as method call or field access.
```
error: expected one of `.`, `;`, `?`, `else`, or an operator, found `map`
--> $DIR/missing-dot-on-statement-expression.rs:7:29
|
LL | let _ = [1, 2, 3].iter()map(|x| x);
| ^^^ expected one of `.`, `;`, `?`, `else`, or an operator
|
help: you might have meant to write a method call
|
LL | let _ = [1, 2, 3].iter().map(|x| x);
| +
```
Instead use dcx.abort_if_error() or guar.raise_fatal() instead. These
guarantee that an error actually happened previously and thus we don't
silently abort.
Currently it relies on having the right integer for every variant, and
if you add a variant you need to adjust the integers for all subsequent
variants, which is a pain.
This commit introduces a match guard formulation that takes advantage of
the enum-to-integer conversion to avoid specifying the integer for each
variant. And it does this via a macro to avoid lots of boilerplate.
The parser pushes a `TokenType` to `Parser::expected_token_types` on
every call to the various `check`/`eat` methods, and clears it on every
call to `bump`. Some of those `TokenType` values are full tokens that
require cloning and dropping. This is a *lot* of work for something
that is only used in error messages and it accounts for a significant
fraction of parsing execution time.
This commit overhauls `TokenType` so that `Parser::expected_token_types`
can be implemented as a bitset. This requires changing `TokenType` to a
C-style parameterless enum, and adding `TokenTypeSet` which uses a
`u128` for the bits. (The new `TokenType` has 105 variants.)
The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to
the `check`/`eat` methods. This is for maximum speed. The elements in
the pairs are always statically known; e.g. a
`token::BinOp(token::Star)` is always paired with a `TokenType::Star`.
So we now compute `TokenType`s in advance and pass them in to
`check`/`eat` rather than the current approach of constructing them on
insertion into `expected_token_types`.
Values of these pair types can be produced by the new `exp!` macro,
which is used at every `check`/`eat` call site. The macro is for
convenience, allowing any pair to be generated from a single identifier.
The ident/keyword filtering in `expected_one_of_not_found` is no longer
necessary. It was there to account for some sloppiness in
`TokenKind`/`TokenType` comparisons.
The existing `TokenType` is moved to a new file `token_type.rs`, and all
its new infrastructure is added to that file. There is more boilerplate
code than I would like, but I can't see how to make it shorter.
This is a naming convention used in a handful of spots in the parser for
delimiters. It confused me when I first saw it a long time ago, and I've
never liked it. A web search says "Bra-ket notation" exists in linear
algebra but the terminology has zero prior use in a programming context,
as far as I can tell.
This commit changes it to `open`/`close`, which is consistent with the
rest of the compiler.
The most significant is `check_keyword`: it now only pushes to
`expected_token_types` if the keyword check fails, which matches how all
the other `check` methods work.
The remainder are just tweaks to make these methods more consistent with
each other.
Use field init shorthand where possible
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
EDIT: this PR also updates `rustfmt.toml` to set
`use_field_init_shorthand = true`.
Overhaul keyword handling
The compiler's list of keywords has some problems.
- It contains several items that aren't keywords.
- The order isn't quite right in a couple of places.
- Some of the names of predicates relating to keywords are confusing.
- rustdoc and rustfmt have their own (incorrect) versions of the keyword list.
- `AllKeywords` is unnecessarily complex.
r? ```@jieyouxu```
`rustc_symbol` is the source of truth for keywords.
rustdoc has its own implicit definition of keywords, via the
`is_doc_keyword`. It (presumably) intends to include all keywords, but
it omits `yeet`.
rustfmt has its own explicit list of Rust keywords. It also (presumably)
intends to include all keywords, but it omits `await`, `builtin`, `gen`,
`macro_rules`, `raw`, `reuse`, `safe`, and `yeet`. Also, it does linear
searches through this list, which is inefficient.
This commit fixes all of the above problems by introducing a new
predicate `is_any_keyword` in rustc and using it in rustdoc and rustfmt.
It documents that it's not the right predicate in most cases.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.
This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
- Move it to `rustc_parse`, which is the only crate that uses it. This
lets us remove all the `pub` markers from it.
- Change `next_ref` and `look_ahead` to `get` and `bump`, which work
better for the `rustc_parse` uses.
- This requires adding a `TokenStream::get` method, which is simple.
- In `TokenCursor`, we currently duplicate the
`DelimSpan`/`DelimSpacing`/`Delimiter` from the surrounding
`TokenTree::Delimited` in the stack. This isn't necessary so long as
we don't prematurely move past the `Delimited`, and is a small perf
win on a very hot code path.
- In `parse_token_tree`, we clone the relevant `TokenTree::Delimited`
instead of constructing an identical one from pieces.
Because `TokenStreamIter` is a much better name for a `TokenStream`
iterator. Also rename the `TokenStream::trees` method as
`TokenStream::iter`, and some local variables.
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
Keep track of patterns that could have introduced a binding, but didn't
When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error.
```
error[E0425]: cannot find value `title` in this scope
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:18:30
|
LL | if let Website { url, .. } = website {
| ------------------- this pattern doesn't include `title`, which is available in `Website`
LL | println!("[{}]({})", title, url);
| ^^^^^ not found in this scope
```
Fix#74863.
Remove `Lexer`'s dependency on `Parser`.
Lexing precedes parsing, as you'd expect: `Lexer` creates a `TokenStream` and `Parser` then parses that `TokenStream`.
But, in a horrendous violation of layering abstractions and common sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err` method does some error recovery that relies on creating a `Parser` to do some post-processing of the `TokenStream` that the `Lexer` just created.
This commit just removes `unclosed_delim_err`. This change removes `Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s return value can have a more typical form.
The cost is slightly worse error messages in two obscure cases, as shown in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff marker detection is no longer supported (because that detection is implemented in the parser).
In my opinion this cost is outweighed by the magnitude of the code cleanup.
r? ```````@chenyukang```````
When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error.
```
error[E0425]: cannot find value `title` in this scope
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:19:30
|
LL | println!("[{}]({})", title, url);
| ^^^^^ not found in this scope
|
note: `Website` has a field `title` which could have been included in this pattern, but it wasn't
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:17:12
|
LL | / struct Website {
LL | | url: String,
LL | | title: Option<String> ,
| | ----- defined here
LL | | }
| |_-
...
LL | if let Website { url, .. } = website {
| ^^^^^^^^^^^^^^^^^^^ this pattern doesn't include `title`, which is available in `Website`
```
Fix#74863.
Add AST support for unsafe binders
I'm splitting up #130514 into pieces. It's impossible for me to keep up with a huge PR like that. I'll land type system support for this next, probably w/o MIR lowering, which will come later.
r? `@oli-obk`
cc `@BoxyUwU` and `@lcnr` who also may want to look at this, though this PR doesn't do too much yet
Keep track of parse errors in `mod`s and don't emit resolve errors for paths involving them
When we expand a `mod foo;` and parse `foo.rs`, we now track whether that file had an unrecovered parse error that reached the end of the file. If so, we keep that information around in the HIR and mark its `DefId` in the `Resolver`. When resolving a path like `foo::bar`, we do not emit any errors for "`bar` not found in `foo`", as we know that the parse error might have caused `bar` to not be parsed and accounted for.
When this happens in an existing project, every path referencing `foo` would be an irrelevant compile error. Instead, we now skip emitting anything until `foo.rs` is fixed. Tellingly enough, we didn't have any test for errors caused by expansion of `mod`s with parse errors.
Fix https://github.com/rust-lang/rust/issues/97734.
Lexing precedes parsing, as you'd expect: `Lexer` creates a
`TokenStream` and `Parser` then parses that `TokenStream`.
But, in a horrendous violation of layering abstractions and common
sense, `Lexer` depends on `Parser`! The `Lexer::unclosed_delim_err`
method does some error recovery that relies on creating a `Parser` to do
some post-processing of the `TokenStream` that the `Lexer` just created.
This commit just removes `unclosed_delim_err`. This change removes
`Lexer`'s dependency on `Parser`, and also means that `lex_token_tree`'s
return value can have a more typical form.
The cost is slightly worse error messages in two obscure cases, as shown
in these tests:
- tests/ui/parser/brace-in-let-chain.rs: there is slightly less
explanation in this case involving an extra `{`.
- tests/ui/parser/diff-markers/unclosed-delims{,-in-macro}.rs: the diff
marker detection is no longer supported (because that detection is
implemented in the parser).
In my opinion this cost is outweighed by the magnitude of the code
cleanup.
allow `symbol_intern_string_literal` lint in test modules
Since #133545, `x check compiler --stage 1` no longer works because compiler test modules trigger `symbol_intern_string_literal` lint errors. Bootstrap shouldn't control when to ignore or enable this lint in the compiler tree (using `Kind != Test` was ineffective for obvious reasons).
Also, conditionally adding this rustflag invalidates the build cache between `x test` and other commands.
This PR removes the `Kind` check from bootstrap and handles it directly in the compiler tree in a more natural way.
When we expand a `mod foo;` and parse `foo.rs`, we now track whether that file had an unrecovered parse error that reached the end of the file. If so, we keep that information around. When resolving a path like `foo::bar`, we do not emit any errors for "`bar` not found in `foo`", as we know that the parse error might have caused `bar` to not be parsed and accounted for.
When this happens in an existing project, every path referencing `foo` would be an irrelevant compile error. Instead, we now skip emitting anything until `foo.rs` is fixed. Tellingly enough, we didn't have any test for errors caused by `mod` expansion.
Fix#97734.
Add test to check unicode identifier version
This adds a test to verify which version of Unicode is used for identifiers. This is part of the language, documented at https://doc.rust-lang.org/nightly/reference/identifiers.html#r-ident.unicode. The version here often changes implicitly due to dependency updates pulling in new versions, and thus we often don't notice it has changed leaving the documentation out of date. The intent here is to have a canary to give us a notification when it changes so that we can update the documentation.
Initial implementation of `#[feature(default_field_values]`, proposed in https://github.com/rust-lang/rfcs/pull/3681.
Support default fields in enum struct variant
Allow default values in an enum struct variant definition:
```rust
pub enum Bar {
Foo {
bar: S = S,
baz: i32 = 42 + 3,
}
}
```
Allow using `..` without a base on an enum struct variant
```rust
Bar::Foo { .. }
```
`#[derive(Default)]` doesn't account for these as it is still gating `#[default]` only being allowed on unit variants.
Support `#[derive(Default)]` on enum struct variants with all defaulted fields
```rust
pub enum Bar {
#[default]
Foo {
bar: S = S,
baz: i32 = 42 + 3,
}
}
```
Check for missing fields in typeck instead of mir_build.
Expand test with `const` param case (needs `generic_const_exprs` enabled).
Properly instantiate MIR const
The following works:
```rust
struct S<A> {
a: Vec<A> = Vec::new(),
}
S::<i32> { .. }
```
Add lint for default fields that will always fail const-eval
We *allow* this to happen for API writers that might want to rely on users'
getting a compile error when using the default field, different to the error
that they would get when the field isn't default. We could change this to
*always* error instead of being a lint, if we wanted.
This will *not* catch errors for partially evaluated consts, like when the
expression relies on a const parameter.
Suggestions when encountering `Foo { .. }` without `#[feature(default_field_values)]`:
- Suggest adding a base expression if there are missing fields.
- Suggest enabling the feature if all the missing fields have optional values.
- Suggest removing `..` if there are no missing fields.
Gate async fn trait bound modifier on `async_trait_bounds`
This PR moves `async Fn()` trait bounds into a new feature gate: `feature(async_trait_bounds)`. The general vibe is that we will most likely stabilize the `feature(async_closure)` *without* the `async Fn()` trait bound modifier, so we need to gate that separately.
We're trying to work on the general vision of `async` trait bound modifier general in: https://github.com/rust-lang/rfcs/pull/3710, however that RFC still needs more time for consensus to converge, and we've decided that the value that users get from calling the bound `async Fn()` is *not really* worth blocking landing async closures in general.
Change `AttrArgs::Eq` to a struct variant
Cleanups for simplifying https://github.com/rust-lang/rust/pull/131808
Basically changes `AttrArgs::Eq` to a struct variant and then avoids several matches on `AttrArgsEq` in favor of methods on it. This will make future refactorings simpler, as they can either keep methods or switch to field accesses without having to restructure code
Eliminate magic numbers from expression precedence
Context: see https://github.com/rust-lang/rust/pull/133140.
This PR continues on backporting Syn's expression precedence design into rustc. Rustc's design used mysterious integer quantities represented variously as `i8` or `usize` (e.g. `PREC_CLOSURE = -40i8`), a special significance around `0` that is never named, and an extra `PREC_FORCE_PAREN` precedence level that does not correspond to any expression. Syn's design uses a C-like enum with variants that clearly correspond to specific sets of expression kinds.
This PR is a refactoring that has no intended behavior change on its own, but it unblocks other precedence work that rustc's precedence design was poorly suited to accommodate.
- Asymmetrical precedence, so that a pretty-printer can tell `(return 1) + 1` needs parens but `1 + return 1` does not.
- Squashing the `Closure` and `Jump` cases into a single precedence level.
- Numerous remaining false positives and false negatives in rustc pretty-printer's parenthesization of macro metavariables, for example in `$e < rhs` where $e is `lhs as Thing<T>`.
FYI `@fmease` — you don't need to review if rustbot picks someone else, but you mentioned being interested in the followup PRs.
Improve span handling in `parse_expr_bottom`.
`parse_expr_bottom` stores `this.token.span` in `lo`, but then fails to use it in many places where it could. This commit fixes that, and likewise (to a smaller extent) in `parse_ty_common`.
r? ``@spastorino``
`parse_expr_bottom` stores `this.token.span` in `lo`, but then fails to
use it in many places where it could. This commit fixes that, and
likewise (to a smaller extent) in `parse_ty_common`.
Inline ExprPrecedence::order into Expr::precedence
The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from #119105 and #119427).
Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps:
1. Convert `Expr` to `ExprPrecedence` using `.precedence()`
2. Convert `ExprPrecedence` to `i8` using `.order()`
3. Compare using `<`
As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once.
The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible:
```rust
match self.kind {
ExprKind::Closure(..) => ExprPrecedence::Closure,
...
}
```
although there were exceptions of both many-to-one, and one-to-many:
```rust
ExprKind::Underscore => ExprPrecedence::Path,
ExprKind::Path(..) => ExprPrecedence::Path,
...
ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match,
ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch,
```
Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`.
You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways:
- There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls.
- We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree.
- There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum.
This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
There is a not-very-useful layering in the lexer, where
`TokenTreesReader` contains a `StringReader`. This commit combines them
and names the result `Lexer`, which is a more obvious name for it.
The methods of `Lexer` are now split across `mod.rs` and `tokentrees.rs`
which isn't ideal, but it doesn't seem worth moving a bunch of code to
avoid it.
With the insertion of a new chapter 17 on async and await to _The Rust
Programming Language_, references in compiler output to later chapters
need to be updated to avoid confusing users. Redirects exist so that
users who click old links will end up in the right place anyway, but
this way users will be directed to the right URL in the first place.
Current places where `Interpolated` is used are going to change to
instead use invisible delimiters. This prepares for that.
- It adds invisible delimiter cases to the `can_begin_*`/`may_be_*`
methods and the `failed_to_match_macro` that are equivalent to the
existing `Interpolated` cases.
- It adds panics/asserts in some places where invisible delimiters
should never occur.
- In `Parser::parse_struct_fields` it excludes an ident + invisible
delimiter from special consideration in an error message, because
that's quite different to an ident + paren/brace/bracket.
Pasted metavariables are wrapped in invisible delimiters, which
pretty-print as empty strings, and changing that can break some proc
macros. But error messages saying "expected identifer, found ``" are
bad. So this commit adds support for metavariables in `TokenDescription`
so they print as "metavariable" in error messages, instead of "``".
It's not used meaningfully yet, but will be needed to get rid of
interpolated tokens.
It was added in #123752 to handle some cases involving emoji, but it
isn't necessary because it's always treated the same as
`TokenKind::InvalidIdent`. This commit removes it, which makes things a
little simpler.
Trim whitespace in RemoveLet primary span
Separate `RemoveLet` span into primary span for `let` and removal suggestion span for `let `, so that primary span does not include whitespace.
Fixes: #133031
If a macro statement has been parsed after `else`, suggest a missing `if`:
```
error: expected `{`, found `falsy`
--> $DIR/else-no-if.rs:47:12
|
LL | } else falsy! {} {
| ---- ^^^^^
| |
| expected an `if` or a block after this `else`
|
help: add an `if` if this is the condition of a chained `else if` statement
|
LL | } else if falsy! {} {
| ++
```
Look at the expression that was parsed when trying to recover from a bad `if` condition to determine what was likely intended by the user beyond "maybe this was meant to be an `else` body".
```
error: expected `{`, found `map`
--> $DIR/missing-dot-on-if-condition-expression-fixable.rs:4:30
|
LL | for _ in [1, 2, 3].iter()map(|x| x) {}
| ^^^ expected `{`
|
help: you might have meant to write a method call
|
LL | for _ in [1, 2, 3].iter().map(|x| x) {}
| +
```
Separate `RemoveLet` span into primary span for `let` and removal
suggestion span for `let `, so that primary span does not include
whitespace.
Fixes: #133031
Signed-off-by: Tyrone Wu <wudevelops@gmail.com>
`to_lowercase` allocates, but `eq_ignore_ascii_case` doesn't. This path
is hot enough that this makes a small but noticeable difference in
benchmarking.
Enforce that raw lifetimes must be valid raw identifiers
Make sure that the identifier part of a raw lifetime is a valid raw identifier. This precludes `'r#_` and all module segment paths for now.
I don't believe this is compelling to support. This was raised by `@ehuss` in https://github.com/rust-lang/reference/pull/1603#discussion_r1822726753 (well, specifically the `'r#_` case), but I don't see why we shouldn't just make it consistent with raw identifiers.
Use `token_descr` more in error messages
This is the first two commits from #124141, put into their own PR to get things rolling. Commit messages have the details.
r? ``@estebank``
cc ``@petrochenkov``
By using `token_descr`, as is done for many other errors, we can get
slightly better descriptions in error messages, e.g.
"macro expansion ignores token `let` and any following" becomes
"macro expansion ignores keyword `let` and any tokens following".
This will be more important once invisible delimiters start being
mentioned in error messages -- without this commit, that leads to error
messages such as "error at ``" because invisible delimiters are
pretty printed as an empty string.
Rollup of 9 pull requests
Successful merges:
- #122670 (Fix bug where `option_env!` would return `None` when env var is present but not valid Unicode)
- #131095 (Use environment variables instead of command line arguments for merged doctests)
- #131339 (Expand set_ptr_value / with_metadata_of docs)
- #131652 (Move polarity into `PolyTraitRef` rather than storing it on the side)
- #131675 (Update lint message for ABI not supported)
- #131681 (Fix up-to-date checking for run-make tests)
- #131702 (Suppress import errors for traits that couldve applied for method lookup error)
- #131703 (Resolved python deprecation warning in publish_toolstate.py)
- #131710 (Remove `'apostrophes'` from `rustc_parse_format`)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `&pin (mut|const) T` type position sugar
This adds parser support for `&pin mut T` and `&pin const T` references. These are desugared to `Pin<&mut T>` and `Pin<&T>` in the AST lowering phases.
This PR currently includes #130526 since that one is in the commit queue. Only the most recent commits (bd450027eb4a94b814a7dd9c0fa29102e6361149 and following) are new.
Tracking:
- #130494
r? `@compiler-errors`
Add suggestion for removing invalid path sep `::` in fn def
Add suggestion for removing invalid path separator `::` in function definition.
for example: `fn invalid_path_separator::<T>() {}`
fixes#130791
Retire the `unnamed_fields` feature for now
`#![feature(unnamed_fields)]` was implemented in part in #115131 and #115367, however work on that feature has (afaict) stalled and in the mean time there have been some concerns raised (e.g.[^1][^2]) about whether `unnamed_fields` is worthwhile to have in the language, especially in its current desugaring. Because it represents a compiler implementation burden including a new kind of anonymous ADT and additional complication to field selection, and is quite prone to bugs today, I'm choosing to remove the feature.
However, since I'm not one to really write a bunch of words, I'm specifically *not* going to de-RFC this feature. This PR essentially *rolls back* the state of this feature to "RFC accepted but not yet implemented"; however if anyone wants to formally unapprove the RFC from the t-lang side, then please be my guest. I'm just not totally willing to summarize the various language-facing reasons for why this feature is or is not worthwhile, since I'm coming from the compiler side mostly.
Fixes#117942Fixes#121161Fixes#121263Fixes#121299Fixes#121722Fixes#121799Fixes#126969Fixes#131041
Tracking:
* https://github.com/rust-lang/rust/issues/49804
[^1]: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Unnamed.20struct.2Funion.20fields
[^2]: https://github.com/rust-lang/rust/issues/49804#issuecomment-1972619108
```
error: expected a pattern, found an expression
--> f889.rs:3:13
|
3 | let (x, y.drop()) = (1, 2); //~ ERROR
| ^^^^^^^^ not a pattern
|
= note: arbitrary expressions are not allowed in patterns: <https://doc.rust-lang.org/book/ch18-00-patterns.html>
error[E0532]: expected a pattern, found a function call
--> f889.rs:2:13
|
2 | let (x, drop(y)) = (1, 2); //~ ERROR
| ^^^^ not a tuple struct or tuple variant
|
= note: function calls are not allowed in patterns: <https://doc.rust-lang.org/book/ch18-00-patterns.html>
```
Fix#97200.
Implement RFC3695 Allow boolean literals as cfg predicates
This PR implements https://github.com/rust-lang/rfcs/pull/3695: allow boolean literals as cfg predicates, i.e. `cfg(true)` and `cfg(false)`.
r? `@nnethercote` *(or anyone with parser knowledge)*
cc `@clubby789`
Fix `break_last_token`.
It currently doesn't handle the three-char tokens `>>=` and `<<=` correctly. These can be broken twice, resulting in three individual tokens. This is a latent bug that currently doesn't cause any problems, but does cause problems for #124141, because that PR increases the usage of lazy token streams.
r? `@petrochenkov`
It currently doesn't handle the three-char tokens `>>=` and `<<=`
correctly. These can be broken twice, resulting in three individual
tokens. This is a latent bug that currently doesn't cause any problems,
but does cause problems for #124141, because that PR increases the usage
of lazy token streams.
Implement a Method to Seal `DiagInner`'s Suggestions
This PR adds a method on `DiagInner` called `.seal_suggestions()` to prevent new suggestions from being added while preserving existing suggestions.
This is useful because currently there is no way to prevent new suggestions from being added to a diagnostic. `.disable_suggestions()` is the closest but it gets rid of all suggestions before and after the call.
Therefore, `.seal_suggestions()` can be used when, for example, misspelled keyword is detected and reported. In such cases, we may want to prevent other suggestions from being added to the diagnostic, as they would likely be meaningless once the misspelled keyword is identified. For context: https://github.com/rust-lang/rust/pull/129899#discussion_r1741307132
To store an additional state, the type of the `suggestions` field in `DiagInner` was changed into a three variant enum. While this change affects files across different crates, care was taken to preserve the existing code's semantics. This is validated by the fact that all UI tests pass without any modifications.
r? chenyukang
stabilize `const_extern_fn`
closes https://github.com/rust-lang/rust/issues/64926
tracking issue: https://github.com/rust-lang/rust/issues/64926
reference PR: https://github.com/rust-lang/reference/pull/1596
## Stabilizaton Report
### Summary
Using `const extern "Rust"` and `const extern "C"` was already stabilized (since version 1.62.0, see https://github.com/rust-lang/rust/pull/95346). This PR stabilizes the other calling conventions: it is now possible to write `const unsafe extern "calling-convention" fn` and `const extern "calling-convention" fn` for any supported calling convention:
```rust
const extern "C-unwind" fn foo1(val: u8) -> u8 { val + 1}
const extern "stdcall" fn foo2(val: u8) -> u8 { val + 1}
const unsafe extern "C-unwind" fn bar1(val: bool) -> bool { !val }
const unsafe extern "stdcall" fn bar2(val: bool) -> bool { !val }
```
This can be used to const-ify an `extern fn`, or conversely, to make a `const fn` callable from external code.
r? T-lang
cc `@RalfJung`