allow `symbol_intern_string_literal` lint in test modules
Since #133545, `x check compiler --stage 1` no longer works because compiler test modules trigger `symbol_intern_string_literal` lint errors. Bootstrap shouldn't control when to ignore or enable this lint in the compiler tree (using `Kind != Test` was ineffective for obvious reasons).
Also, conditionally adding this rustflag invalidates the build cache between `x test` and other commands.
This PR removes the `Kind` check from bootstrap and handles it directly in the compiler tree in a more natural way.
When we expand a `mod foo;` and parse `foo.rs`, we now track whether that file had an unrecovered parse error that reached the end of the file. If so, we keep that information around. When resolving a path like `foo::bar`, we do not emit any errors for "`bar` not found in `foo`", as we know that the parse error might have caused `bar` to not be parsed and accounted for.
When this happens in an existing project, every path referencing `foo` would be an irrelevant compile error. Instead, we now skip emitting anything until `foo.rs` is fixed. Tellingly enough, we didn't have any test for errors caused by `mod` expansion.
Fix#97734.
Add test to check unicode identifier version
This adds a test to verify which version of Unicode is used for identifiers. This is part of the language, documented at https://doc.rust-lang.org/nightly/reference/identifiers.html#r-ident.unicode. The version here often changes implicitly due to dependency updates pulling in new versions, and thus we often don't notice it has changed leaving the documentation out of date. The intent here is to have a canary to give us a notification when it changes so that we can update the documentation.
Initial implementation of `#[feature(default_field_values]`, proposed in https://github.com/rust-lang/rfcs/pull/3681.
Support default fields in enum struct variant
Allow default values in an enum struct variant definition:
```rust
pub enum Bar {
Foo {
bar: S = S,
baz: i32 = 42 + 3,
}
}
```
Allow using `..` without a base on an enum struct variant
```rust
Bar::Foo { .. }
```
`#[derive(Default)]` doesn't account for these as it is still gating `#[default]` only being allowed on unit variants.
Support `#[derive(Default)]` on enum struct variants with all defaulted fields
```rust
pub enum Bar {
#[default]
Foo {
bar: S = S,
baz: i32 = 42 + 3,
}
}
```
Check for missing fields in typeck instead of mir_build.
Expand test with `const` param case (needs `generic_const_exprs` enabled).
Properly instantiate MIR const
The following works:
```rust
struct S<A> {
a: Vec<A> = Vec::new(),
}
S::<i32> { .. }
```
Add lint for default fields that will always fail const-eval
We *allow* this to happen for API writers that might want to rely on users'
getting a compile error when using the default field, different to the error
that they would get when the field isn't default. We could change this to
*always* error instead of being a lint, if we wanted.
This will *not* catch errors for partially evaluated consts, like when the
expression relies on a const parameter.
Suggestions when encountering `Foo { .. }` without `#[feature(default_field_values)]`:
- Suggest adding a base expression if there are missing fields.
- Suggest enabling the feature if all the missing fields have optional values.
- Suggest removing `..` if there are no missing fields.
Gate async fn trait bound modifier on `async_trait_bounds`
This PR moves `async Fn()` trait bounds into a new feature gate: `feature(async_trait_bounds)`. The general vibe is that we will most likely stabilize the `feature(async_closure)` *without* the `async Fn()` trait bound modifier, so we need to gate that separately.
We're trying to work on the general vision of `async` trait bound modifier general in: https://github.com/rust-lang/rfcs/pull/3710, however that RFC still needs more time for consensus to converge, and we've decided that the value that users get from calling the bound `async Fn()` is *not really* worth blocking landing async closures in general.
Change `AttrArgs::Eq` to a struct variant
Cleanups for simplifying https://github.com/rust-lang/rust/pull/131808
Basically changes `AttrArgs::Eq` to a struct variant and then avoids several matches on `AttrArgsEq` in favor of methods on it. This will make future refactorings simpler, as they can either keep methods or switch to field accesses without having to restructure code
Eliminate magic numbers from expression precedence
Context: see https://github.com/rust-lang/rust/pull/133140.
This PR continues on backporting Syn's expression precedence design into rustc. Rustc's design used mysterious integer quantities represented variously as `i8` or `usize` (e.g. `PREC_CLOSURE = -40i8`), a special significance around `0` that is never named, and an extra `PREC_FORCE_PAREN` precedence level that does not correspond to any expression. Syn's design uses a C-like enum with variants that clearly correspond to specific sets of expression kinds.
This PR is a refactoring that has no intended behavior change on its own, but it unblocks other precedence work that rustc's precedence design was poorly suited to accommodate.
- Asymmetrical precedence, so that a pretty-printer can tell `(return 1) + 1` needs parens but `1 + return 1` does not.
- Squashing the `Closure` and `Jump` cases into a single precedence level.
- Numerous remaining false positives and false negatives in rustc pretty-printer's parenthesization of macro metavariables, for example in `$e < rhs` where $e is `lhs as Thing<T>`.
FYI `@fmease` — you don't need to review if rustbot picks someone else, but you mentioned being interested in the followup PRs.
Improve span handling in `parse_expr_bottom`.
`parse_expr_bottom` stores `this.token.span` in `lo`, but then fails to use it in many places where it could. This commit fixes that, and likewise (to a smaller extent) in `parse_ty_common`.
r? ``@spastorino``
`parse_expr_bottom` stores `this.token.span` in `lo`, but then fails to
use it in many places where it could. This commit fixes that, and
likewise (to a smaller extent) in `parse_ty_common`.
Inline ExprPrecedence::order into Expr::precedence
The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from #119105 and #119427).
Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps:
1. Convert `Expr` to `ExprPrecedence` using `.precedence()`
2. Convert `ExprPrecedence` to `i8` using `.order()`
3. Compare using `<`
As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once.
The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible:
```rust
match self.kind {
ExprKind::Closure(..) => ExprPrecedence::Closure,
...
}
```
although there were exceptions of both many-to-one, and one-to-many:
```rust
ExprKind::Underscore => ExprPrecedence::Path,
ExprKind::Path(..) => ExprPrecedence::Path,
...
ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match,
ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch,
```
Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`.
You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways:
- There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls.
- We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree.
- There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum.
This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
There is a not-very-useful layering in the lexer, where
`TokenTreesReader` contains a `StringReader`. This commit combines them
and names the result `Lexer`, which is a more obvious name for it.
The methods of `Lexer` are now split across `mod.rs` and `tokentrees.rs`
which isn't ideal, but it doesn't seem worth moving a bunch of code to
avoid it.
With the insertion of a new chapter 17 on async and await to _The Rust
Programming Language_, references in compiler output to later chapters
need to be updated to avoid confusing users. Redirects exist so that
users who click old links will end up in the right place anyway, but
this way users will be directed to the right URL in the first place.
Current places where `Interpolated` is used are going to change to
instead use invisible delimiters. This prepares for that.
- It adds invisible delimiter cases to the `can_begin_*`/`may_be_*`
methods and the `failed_to_match_macro` that are equivalent to the
existing `Interpolated` cases.
- It adds panics/asserts in some places where invisible delimiters
should never occur.
- In `Parser::parse_struct_fields` it excludes an ident + invisible
delimiter from special consideration in an error message, because
that's quite different to an ident + paren/brace/bracket.
Pasted metavariables are wrapped in invisible delimiters, which
pretty-print as empty strings, and changing that can break some proc
macros. But error messages saying "expected identifer, found ``" are
bad. So this commit adds support for metavariables in `TokenDescription`
so they print as "metavariable" in error messages, instead of "``".
It's not used meaningfully yet, but will be needed to get rid of
interpolated tokens.
It was added in #123752 to handle some cases involving emoji, but it
isn't necessary because it's always treated the same as
`TokenKind::InvalidIdent`. This commit removes it, which makes things a
little simpler.
Trim whitespace in RemoveLet primary span
Separate `RemoveLet` span into primary span for `let` and removal suggestion span for `let `, so that primary span does not include whitespace.
Fixes: #133031
If a macro statement has been parsed after `else`, suggest a missing `if`:
```
error: expected `{`, found `falsy`
--> $DIR/else-no-if.rs:47:12
|
LL | } else falsy! {} {
| ---- ^^^^^
| |
| expected an `if` or a block after this `else`
|
help: add an `if` if this is the condition of a chained `else if` statement
|
LL | } else if falsy! {} {
| ++
```
Look at the expression that was parsed when trying to recover from a bad `if` condition to determine what was likely intended by the user beyond "maybe this was meant to be an `else` body".
```
error: expected `{`, found `map`
--> $DIR/missing-dot-on-if-condition-expression-fixable.rs:4:30
|
LL | for _ in [1, 2, 3].iter()map(|x| x) {}
| ^^^ expected `{`
|
help: you might have meant to write a method call
|
LL | for _ in [1, 2, 3].iter().map(|x| x) {}
| +
```
Separate `RemoveLet` span into primary span for `let` and removal
suggestion span for `let `, so that primary span does not include
whitespace.
Fixes: #133031
Signed-off-by: Tyrone Wu <wudevelops@gmail.com>
`to_lowercase` allocates, but `eq_ignore_ascii_case` doesn't. This path
is hot enough that this makes a small but noticeable difference in
benchmarking.
Enforce that raw lifetimes must be valid raw identifiers
Make sure that the identifier part of a raw lifetime is a valid raw identifier. This precludes `'r#_` and all module segment paths for now.
I don't believe this is compelling to support. This was raised by `@ehuss` in https://github.com/rust-lang/reference/pull/1603#discussion_r1822726753 (well, specifically the `'r#_` case), but I don't see why we shouldn't just make it consistent with raw identifiers.
Use `token_descr` more in error messages
This is the first two commits from #124141, put into their own PR to get things rolling. Commit messages have the details.
r? ``@estebank``
cc ``@petrochenkov``
By using `token_descr`, as is done for many other errors, we can get
slightly better descriptions in error messages, e.g.
"macro expansion ignores token `let` and any following" becomes
"macro expansion ignores keyword `let` and any tokens following".
This will be more important once invisible delimiters start being
mentioned in error messages -- without this commit, that leads to error
messages such as "error at ``" because invisible delimiters are
pretty printed as an empty string.
Rollup of 9 pull requests
Successful merges:
- #122670 (Fix bug where `option_env!` would return `None` when env var is present but not valid Unicode)
- #131095 (Use environment variables instead of command line arguments for merged doctests)
- #131339 (Expand set_ptr_value / with_metadata_of docs)
- #131652 (Move polarity into `PolyTraitRef` rather than storing it on the side)
- #131675 (Update lint message for ABI not supported)
- #131681 (Fix up-to-date checking for run-make tests)
- #131702 (Suppress import errors for traits that couldve applied for method lookup error)
- #131703 (Resolved python deprecation warning in publish_toolstate.py)
- #131710 (Remove `'apostrophes'` from `rustc_parse_format`)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `&pin (mut|const) T` type position sugar
This adds parser support for `&pin mut T` and `&pin const T` references. These are desugared to `Pin<&mut T>` and `Pin<&T>` in the AST lowering phases.
This PR currently includes #130526 since that one is in the commit queue. Only the most recent commits (bd450027eb4a94b814a7dd9c0fa29102e6361149 and following) are new.
Tracking:
- #130494
r? `@compiler-errors`
Add suggestion for removing invalid path sep `::` in fn def
Add suggestion for removing invalid path separator `::` in function definition.
for example: `fn invalid_path_separator::<T>() {}`
fixes#130791
Retire the `unnamed_fields` feature for now
`#![feature(unnamed_fields)]` was implemented in part in #115131 and #115367, however work on that feature has (afaict) stalled and in the mean time there have been some concerns raised (e.g.[^1][^2]) about whether `unnamed_fields` is worthwhile to have in the language, especially in its current desugaring. Because it represents a compiler implementation burden including a new kind of anonymous ADT and additional complication to field selection, and is quite prone to bugs today, I'm choosing to remove the feature.
However, since I'm not one to really write a bunch of words, I'm specifically *not* going to de-RFC this feature. This PR essentially *rolls back* the state of this feature to "RFC accepted but not yet implemented"; however if anyone wants to formally unapprove the RFC from the t-lang side, then please be my guest. I'm just not totally willing to summarize the various language-facing reasons for why this feature is or is not worthwhile, since I'm coming from the compiler side mostly.
Fixes#117942Fixes#121161Fixes#121263Fixes#121299Fixes#121722Fixes#121799Fixes#126969Fixes#131041
Tracking:
* https://github.com/rust-lang/rust/issues/49804
[^1]: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Unnamed.20struct.2Funion.20fields
[^2]: https://github.com/rust-lang/rust/issues/49804#issuecomment-1972619108
```
error: expected a pattern, found an expression
--> f889.rs:3:13
|
3 | let (x, y.drop()) = (1, 2); //~ ERROR
| ^^^^^^^^ not a pattern
|
= note: arbitrary expressions are not allowed in patterns: <https://doc.rust-lang.org/book/ch18-00-patterns.html>
error[E0532]: expected a pattern, found a function call
--> f889.rs:2:13
|
2 | let (x, drop(y)) = (1, 2); //~ ERROR
| ^^^^ not a tuple struct or tuple variant
|
= note: function calls are not allowed in patterns: <https://doc.rust-lang.org/book/ch18-00-patterns.html>
```
Fix#97200.
Implement RFC3695 Allow boolean literals as cfg predicates
This PR implements https://github.com/rust-lang/rfcs/pull/3695: allow boolean literals as cfg predicates, i.e. `cfg(true)` and `cfg(false)`.
r? `@nnethercote` *(or anyone with parser knowledge)*
cc `@clubby789`
Fix `break_last_token`.
It currently doesn't handle the three-char tokens `>>=` and `<<=` correctly. These can be broken twice, resulting in three individual tokens. This is a latent bug that currently doesn't cause any problems, but does cause problems for #124141, because that PR increases the usage of lazy token streams.
r? `@petrochenkov`
It currently doesn't handle the three-char tokens `>>=` and `<<=`
correctly. These can be broken twice, resulting in three individual
tokens. This is a latent bug that currently doesn't cause any problems,
but does cause problems for #124141, because that PR increases the usage
of lazy token streams.
Implement a Method to Seal `DiagInner`'s Suggestions
This PR adds a method on `DiagInner` called `.seal_suggestions()` to prevent new suggestions from being added while preserving existing suggestions.
This is useful because currently there is no way to prevent new suggestions from being added to a diagnostic. `.disable_suggestions()` is the closest but it gets rid of all suggestions before and after the call.
Therefore, `.seal_suggestions()` can be used when, for example, misspelled keyword is detected and reported. In such cases, we may want to prevent other suggestions from being added to the diagnostic, as they would likely be meaningless once the misspelled keyword is identified. For context: https://github.com/rust-lang/rust/pull/129899#discussion_r1741307132
To store an additional state, the type of the `suggestions` field in `DiagInner` was changed into a three variant enum. While this change affects files across different crates, care was taken to preserve the existing code's semantics. This is validated by the fact that all UI tests pass without any modifications.
r? chenyukang
stabilize `const_extern_fn`
closes https://github.com/rust-lang/rust/issues/64926
tracking issue: https://github.com/rust-lang/rust/issues/64926
reference PR: https://github.com/rust-lang/reference/pull/1596
## Stabilizaton Report
### Summary
Using `const extern "Rust"` and `const extern "C"` was already stabilized (since version 1.62.0, see https://github.com/rust-lang/rust/pull/95346). This PR stabilizes the other calling conventions: it is now possible to write `const unsafe extern "calling-convention" fn` and `const extern "calling-convention" fn` for any supported calling convention:
```rust
const extern "C-unwind" fn foo1(val: u8) -> u8 { val + 1}
const extern "stdcall" fn foo2(val: u8) -> u8 { val + 1}
const unsafe extern "C-unwind" fn bar1(val: bool) -> bool { !val }
const unsafe extern "stdcall" fn bar2(val: bool) -> bool { !val }
```
This can be used to const-ify an `extern fn`, or conversely, to make a `const fn` callable from external code.
r? T-lang
cc `@RalfJung`
Fix `clippy::useless_conversion`
Self-explanatory. Probably the last clippy change I'll actually put up since this is the only other one I've actually seen in the wild.
Simplify some nested `if` statements
Applies some but not all instances of `clippy::collapsible_if`. Some ended up looking worse afterwards, though, so I left those out. Also applies instances of `clippy::collapsible_else_if`
Review with whitespace disabled please.
Fix double handling in `collect_tokens`
Double handling of AST nodes can occur in `collect_tokens`. This is when an inner call to `collect_tokens` produces an AST node, and then an outer call to `collect_tokens` produces the same AST node. This can happen in a few places, e.g. expression statements where the statement delegates `HasTokens` and `HasAttrs` to the expression. It will also happen more after #124141.
This PR fixes some double handling cases that cause problems, including #129166.
r? `@petrochenkov`
Implement raw lifetimes and labels (`'r#ident`)
This PR does two things:
1. Reserve lifetime prefixes, e.g. `'prefix#lt` in edition 2021.
2. Implements raw lifetimes, e.g. `'r#async` in edition 2021.
This PR additionally extends the `keyword_idents_2024` lint to also check lifetimes.
cc `@traviscross`
r? parser
Add Suggestions for Misspelled Keywords
Fixes#97793
This PR detects misspelled keywords using two heuristics:
1. Lowercasing the unexpected identifier.
2. Using edit distance to find a keyword similar to the unexpected identifier.
However, it does not detect each and every misspelled keyword to
minimize false positives and ambiguities. More details about the
implementation can be found in the comments.
This PR detects misspelled keywords using two heuristics:
1. Lowercasing the unexpected identifier.
2. Using edit distance to find a keyword similar to the unexpected identifier.
However, it does not detect each and every misspelled keyword to
minimize false positives and ambiguities. More details about the
implementation can be found in the comments.
Don't make statement nonterminals match pattern nonterminals
Right now, the heuristic we use to check if a token may begin a pattern nonterminal falls back to `may_be_ident`:
ef71f1047e/compiler/rustc_parse/src/parser/nonterminal.rs (L21-L37)
This has the unfortunate side effect that a `stmt` nonterminal eagerly matches against a `pat` nonterminal, leading to a parse error:
```rust
macro_rules! m {
($pat:pat) => {};
($stmt:stmt) => {};
}
macro_rules! m2 {
($stmt:stmt) => {
m! { $stmt }
};
}
m2! { let x = 1 }
```
This PR fixes it by more accurately reflecting the set of nonterminals that may begin a pattern nonterminal.
As a side-effect, I modified `Token::can_begin_pattern` to work correctly and used that in `Parser::nonterminal_may_begin_with`.
By keeping track of attributes that have been previously processed.
This fixes the `macro-rules-derive-cfg.stdout` test, and is necessary
for #124141 which removes nonterminals.
Also shrink the `SmallVec` inline size used in `IntervalSet`. 2 gives
slightly better perf than 4 now that there's an `IntervalSet` in
`Parser`, which is cloned reasonably often.
In a case like this:
```
mod a {
mod b {
#[cfg_attr(unix, inline)]
fn f() {
#[cfg_attr(linux, inline)]
fn g1() {}
#[cfg_attr(linux, inline)]
fn g2() {}
}
}
}
```
We currently end up with the following replacement ranges.
- The lazy tokens for `f` has replacement ranges for `g1` and `g2`.
- The lazy tokens for `a` has replacement ranges for `f`, `g1`, and
`g2`.
I.e. the replacement ranges for `g1` and `g2` are duplicated. In
general, replacement ranges for inner AST nodes are duplicated up the
chain for each nested `collect_tokens` call. And the code that processes
the replacements is careful about the ordering in which the replacements
are applied, to ensure that inner replacements are applied before outer
replacements.
But all of this is unnecessary. If you apply an inner replacement and
then an outer replacement, the outer replacement completely overwrites
the inner replacement.
This commit avoids the duplication by removing replacements from
`self.capture_state.parser_replacements` when they are used. (The effect
on the example above is that the lazy tokesn for `a` no longer include
replacement ranges for `g1` and `g2`.) This eliminates the possibility
of nested replacements on individual AST nodes, which avoids the need
for careful ordering of replacements.
This example triggers an assertion failure:
```
fn f() -> u32 {
#[cfg_eval] #[cfg(not(FALSE))] 0
}
```
The sequence of events:
- `configure_annotatable` calls `parse_expr_force_collect`, which calls
`collect_tokens`.
- Within that, we end up in `parse_expr_dot_or_call`, which again calls
`collect_tokens`.
- The return value of the `f` call is the expression `0`.
- This inner call collects tokens for `0` (parser range 10..11) and
creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range
0..11).
- We return to the outer `collect_tokens` call. The return value of the
`f` call is *again* the expression `0`, again with the range 10..11,
but the replacement from earlier covers the range 0..11. The code
mistakenly assumes that any attributes from an inner `collect_tokens`
call fit entirely within the body of the result of an outer
`collect_tokens` call. So it adjusts the replacement parser range
0..11 to a node range by subtracting 10, resulting in -10..1. This is
an invalid range and triggers an assertion failure.
It's tricky to follow, but basically things get complicated when an AST
node is returned from an inner `collect_tokens` call and then returned
again from an outer `collect_token` node without being wrapped in any
kind of additional layer.
This commit changes `collect_tokens` to return early in some extra cases,
avoiding the construction of lazy tokens. In the example above, the
outer `collect_tokens` returns earlier because the `0` token already has
tokens and `self.capture_state.capturing` is `Capturing::No`. This early
return avoids the creation of the invalid range and the assertion
failure.
Fixes#129166. Note: these invalid ranges have been happening for a long
time. #128725 looks like it's at fault only because it introduced the
assertion that catches the invalid ranges.
Stabilize opaque type precise capturing (RFC 3617)
This PR partially stabilizes opaque type *precise capturing*, which was specified in [RFC 3617](https://github.com/rust-lang/rfcs/pull/3617), and whose syntax was amended by FCP in [#125836](https://github.com/rust-lang/rust/issues/125836).
This feature, as stabilized here, gives us a way to explicitly specify the generic lifetime parameters that an RPIT-like opaque type captures. This solves the problem of overcapturing, for lifetime parameters in these opaque types, and will allow the Lifetime Capture Rules 2024 ([RFC 3498](https://github.com/rust-lang/rfcs/pull/3498)) to be fully stabilized for RPIT in Rust 2024.
### What are we stabilizing?
This PR stabilizes the use of a `use<'a, T>` bound in return-position impl Trait opaque types. Such a bound fully specifies the set of generic parameters captured by the RPIT opaque type, entirely overriding the implicit default behavior. E.g.:
```rust
fn does_not_capture<'a, 'b>() -> impl Sized + use<'a> {}
// ~~~~~~~~~~~~~~~~~~~~
// This RPIT opaque type does not capture `'b`.
```
The way we would suggest thinking of `impl Trait` types *without* an explicit `use<..>` bound is that the `use<..>` bound has been *elided*, and that the bound is filled in automatically by the compiler according to the edition-specific capture rules.
All non-`'static` lifetime parameters, named (i.e. non-APIT) type parameters, and const parameters in scope are valid to name, including an elided lifetime if such a lifetime would also be valid in an outlives bound, e.g.:
```rust
fn elided(x: &u8) -> impl Sized + use<'_> { x }
```
Lifetimes must be listed before type and const parameters, but otherwise the ordering is not relevant to the `use<..>` bound. Captured parameters may not be duplicated. For now, only one `use<..>` bound may appear in a bounds list. It may appear anywhere within the bounds list.
### How does this differ from the RFC?
This stabilization differs from the RFC in one respect: the RFC originally specified `use<'a, T>` as syntactically part of the RPIT type itself, e.g.:
```rust
fn capture<'a>() -> impl use<'a> Sized {}
```
However, settling on the final syntax was left as an open question. T-lang later decided via FCP in [#125836](https://github.com/rust-lang/rust/issues/125836) to treat `use<..>` as a syntactic bound instead, e.g.:
```rust
fn capture<'a>() -> impl Sized + use<'a> {}
```
### What aren't we stabilizing?
The key goal of this PR is to stabilize the parts of *precise capturing* that are needed to enable the migration to Rust 2024.
There are some capabilities of *precise capturing* that the RFC specifies but that we're not stabilizing here, as these require further work on the type system. We hope to lift these limitations later.
The limitations that are part of this PR were specified in the [RFC's stabilization strategy](https://rust-lang.github.io/rfcs/3617-precise-capturing.html#stabilization-strategy).
#### Not capturing type or const parameters
The RFC addresses the overcapturing of type and const parameters; that is, it allows for them to not be captured in opaque types. We're not stabilizing that in this PR. Since all in scope generic type and const parameters are implicitly captured in all editions, this is not needed for the migration to Rust 2024.
For now, when using `use<..>`, all in scope type and const parameters must be nameable (i.e., APIT cannot be used) and included as arguments. For example, this is an error because `T` is in scope and not included as an argument:
```rust
fn test<T>() -> impl Sized + use<> {}
//~^ ERROR `impl Trait` must mention all type parameters in scope in `use<...>`
```
This is due to certain current limitations in the type system related to how generic parameters are represented as captured (i.e. bivariance) and how inference operates.
We hope to relax this in the future, and this stabilization is forward compatible with doing so.
#### Precise capturing for return-position impl Trait **in trait** (RPITIT)
The RFC specifies precise capturing for RPITIT. We're not stabilizing that in this PR. Since RPITIT already adheres to the Lifetime Capture Rules 2024, this isn't needed for the migration to Rust 2024.
The effect of this is that the anonymous associated types created by RPITITs must continue to capture all of the lifetime parameters in scope, e.g.:
```rust
trait Foo<'a> {
fn test() -> impl Sized + use<Self>;
//~^ ERROR `use<...>` precise capturing syntax is currently not allowed in return-position `impl Trait` in traits
}
```
To allow this involves a meaningful amount of type system work related to adding variance to GATs or reworking how generics are represented in RPITITs. We plan to do this work separately from the stabilization. See:
- https://github.com/rust-lang/rust/pull/124029
Supporting precise capturing for RPITIT will also require us to implement a new algorithm for detecting refining capture behavior. This may involve looking through type parameters to detect cases where the impl Trait type in an implementation captures fewer lifetimes than the corresponding RPITIT in the trait definition, e.g.:
```rust
trait Foo {
fn rpit() -> impl Sized + use<Self>;
}
impl<'a> Foo for &'a () {
// This is "refining" due to not capturing `'a` which
// is implied by the trait's `use<Self>`.
fn rpit() -> impl Sized + use<>;
// This is not "refining".
fn rpit() -> impl Sized + use<'a>;
}
```
This stabilization is forward compatible with adding support for this later.
### The technical details
This bound is purely syntactical and does not lower to a [`Clause`](https://doc.rust-lang.org/1.79.0/nightly-rustc/rustc_middle/ty/type.ClauseKind.html) in the type system. For the purposes of the type system (and for the types team's curiosity regarding this stabilization), we have no current need to represent this as a `ClauseKind`.
Since opaques already capture a variable set of lifetimes depending on edition and their syntactical position (e.g. RPIT vs RPITIT), a `use<..>` bound is just a way to explicitly rather than implicitly specify that set of lifetimes, and this only affects opaque type lowering from AST to HIR.
### FCP plan
While there's much discussion of the type system here, the feature in this PR is implemented internally as a transformation that happens before lowering to the type system layer. We already support impl Trait types partially capturing the in scope lifetimes; we just currently only expose that implicitly.
So, in my (errs's) view as a types team member, there's nothing for types to weigh in on here with respect to the implementation being stabilized, and I'd suggest a lang-only proposed FCP (though we'll of course CC the team below).
### Authorship and acknowledgments
This stabilization report was coauthored by compiler-errors and TC.
TC would like to acknowledge the outstanding and speedy work that compiler-errors has done to make this feature happen.
compiler-errors thanks TC for authoring the RFC, for all of his involvement in this feature's development, and pushing the Rust 2024 edition forward.
### Open items
We're doing some things in parallel here. In signaling the intention to stabilize, we want to uncover any latent issues so we can be sure they get addressed. We want to give the maximum time for discussion here to happen by starting it while other remaining miscellaneous work proceeds. That work includes:
- [x] Look into `syn` support.
- https://github.com/dtolnay/syn/issues/1677
- https://github.com/dtolnay/syn/pull/1707
- [x] Look into `rustfmt` support.
- https://github.com/rust-lang/rust/pull/126754
- [x] Look into `rust-analyzer` support.
- https://github.com/rust-lang/rust-analyzer/issues/17598
- https://github.com/rust-lang/rust-analyzer/pull/17676
- [x] Look into `rustdoc` support.
- https://github.com/rust-lang/rust/issues/127228
- https://github.com/rust-lang/rust/pull/127632
- https://github.com/rust-lang/rust/pull/127658
- [x] Suggest this feature to RfL (a known nightly user).
- [x] Add a chapter to the edition guide.
- https://github.com/rust-lang/edition-guide/pull/316
- [x] Update the Reference.
- https://github.com/rust-lang/reference/pull/1577
### (Selected) implementation history
* https://github.com/rust-lang/rfcs/pull/3498
* https://github.com/rust-lang/rfcs/pull/3617
* https://github.com/rust-lang/rust/pull/123468
* https://github.com/rust-lang/rust/issues/125836
* https://github.com/rust-lang/rust/pull/126049
* https://github.com/rust-lang/rust/pull/126753Closes#123432.
cc `@rust-lang/lang` `@rust-lang/types`
`@rustbot` labels +T-lang +I-lang-nominated +A-impl-trait +F-precise_capturing
Tracking:
- https://github.com/rust-lang/rust/issues/123432
----
For the compiler reviewer, I'll leave some inline comments about diagnostics fallout :^)
r? compiler
Stabilize `unsafe_attributes`
# Stabilization report
## Summary
This is a tracking issue for the RFC 3325: unsafe attributes
We are stabilizing `#![feature(unsafe_attributes)]`, which makes certain attributes considered 'unsafe', meaning that they must be surrounded by an `unsafe(...)`, as in `#[unsafe(no_mangle)]`.
RFC: rust-lang/rfcs#3325
Tracking issue: #123757
## What is stabilized
### Summary of stabilization
Certain attributes will now be designated as unsafe attributes, namely, `no_mangle`, `export_name`, and `link_section` (stable only), and these attributes will need to be called by surrounding them in `unsafe(...)` syntax. On editions prior to 2024, this is simply an edition lint, but it will become a hard error in 2024. This also works in `cfg_attr`, but `unsafe` is not allowed for any other attributes, including proc-macros ones.
```rust
#[unsafe(no_mangle)]
fn a() {}
#[cfg_attr(any(), unsafe(export_name = "c"))]
fn b() {}
```
For a table showing the attributes that were considered to be included in the list to require unsafe, and subsequent reasoning about why each such attribute was or was not included, see [this comment here](https://github.com/rust-lang/rust/pull/124214#issuecomment-2124753464)
## Tests
The relevant tests are in `tests/ui/rust-2024/unsafe-attributes` and `tests/ui/attributes/unsafe`.
This commit does the following.
- Renames `collect_tokens_trailing_token` as `collect_tokens`, because
(a) it's annoying long, and (b) the `_trailing_token` bit is less
accurate now that its types have changed.
- In `collect_tokens`, adds a `Option<CollectPos>` argument and a
`UsePreAttrPos` in the return type of `f`. These are used in
`parse_expr_force_collect` (for vanilla expressions) and in
`parse_stmt_without_recovery` (for two different cases of expression
statements). Together these ensure are enough to fix all the problems
with token collection and assoc expressions. The changes to the
`stringify.rs` test demonstrate some of these.
- Adds a new test. The code in this test was causing an assertion
failure prior to this commit, due to an invalid `NodeRange`.
The extra complexity is annoying, but necessary to fix the existing
problems.
This pre-existing type is suitable for use with the return value of the
`f` parameter in `collect_tokens_trailing_token`. The more descriptive
name will be useful because the next commit will add another boolean
value to the return value of `f`.
Fix bug in `Parser::look_ahead`.
The special case was failing to handle invisible delimiters on one path.
Fixes (but doesn't close until beta backported) #128895.
r? `@davidtwco`
Use more slice patterns inside the compiler
Nothing super noteworthy. Just replacing the common 'fragile' pattern of "length check followed by indexing or unwrap" with slice patterns for legibility and 'robustness'.
r? ghost
Previously we would try to issue a suggestion for `let x <op>= 1`, i.e.
a compound assignment within a `let` binding, to remove the `<op>`. The
suggestion code unfortunately incorrectly assumed that the `<op>` is an
exactly-1-byte ASCII character, but this assumption is incorrect because
we also recover Unicode-confusables like `➖=` as `-=`. In this example,
the suggestion code used a `+ BytePos(1)` to calculate the span of the
`<op>` codepoint that looks like `-` but the mult-byte Unicode
look-alike would cause the suggested removal span to be inside a
multi-byte codepoint boundary, triggering a codepoint boundary
assertion.
Issue: <https://github.com/rust-lang/rust/issues/128845>
More unsafe attr verification
This code denies unsafe on attributes such as `#[test]` and `#[ignore]`, while also changing the `MetaItem` parsing so `unsafe` in args like `#[allow(unsafe(dead_code))]` is not accidentally allowed.
Tracking:
- https://github.com/rust-lang/rust/issues/123757
When collecting tokens there are two kinds of range:
- a range relative to the parser's full token stream (which we get when
we are parsing);
- a range relative to a single AST node's token stream (which we use
within `LazyAttrTokenStreamImpl` when replacing tokens).
These are currently both represented with `Range<u32>` and it's easy to
mix them up -- until now I hadn't properly understood the difference.
This commit introduces `ParserRange` and `NodeRange` to distinguish
them. This also requires splitting `ReplaceRange` in two, giving the new
types `ParserReplacement` and `NodeReplacement`. (These latter two names
reduce the overloading of the word "range".)
The commit also rewrites some comments to be clearer.
The end result is a little more verbose, but much clearer.
`parse_expr_assoc_with` has an awkward structure -- sometimes the lhs is
already parsed. This commit splits the post-lhs part into a new method
`parse_expr_assoc_rest_with`, which makes everything shorter and
simpler.
It has a single use. This makes the `let` handling case in
`parse_stmt_without_recovery` more similar to the statement path and
statement expression cases.
This makes it possible for the `unsafe(...)` syntax to only be
valid at the top level, and the `NestedMetaItem`s will automatically
reject `unsafe(...)`.
Mark `Parser::eat`/`check` methods as `#[must_use]`
These methods return a `bool`, but we probably should either use these values or explicitly throw them away (e.g. when we just want to unconditionally eat a token if it exists).
I changed a few places from `eat` to `expect`, but otherwise I tried to leave a comment explaining why the `eat` was okay.
This also adds a test for the `pattern_type!` macro, which used to silently accept a missing `is` token.
Add limit for unclosed delimiters in lexer diagnostic
Fixes#127868
The first commit shows the original diagnostic, and the second commit shows the changes.
improve error message when `global_asm!` uses `asm!` options
specifically, what was
error: expected one of `)`, `att_syntax`, or `raw`, found `preserves_flags`
--> $DIR/bad-options.rs:45:25
|
LL | global_asm!("", options(preserves_flags));
| ^^^^^^^^^^^^^^^ expected one of `)`, `att_syntax`, or `raw`
is now
error: the `preserves_flags` option cannot be used with `global_asm!`
--> $DIR/bad-options.rs:45:25
|
LL | global_asm!("", options(preserves_flags));
| ^^^^^^^^^^^^^^^ the `preserves_flags` option is not meaningful for global-scoped inline assembly
mirroring the phrasing of the [reference](https://doc.rust-lang.org/reference/inline-assembly.html#options).
This is also a bit of a refactor for a future `naked_asm!` macro (for use in `#[naked]` functions). Currently this sort of error can come up when switching from inline to global asm, or when a user just isn't that experienced with assembly. With `naked_asm!` added to the mix hitting this error is more likely.
Improve `extern "<abi>" unsafe fn()` error message
These errors were already reported in #87217, and fixed by #87235 but missed the case of an explicit ABI.
This PR does not cover multiple keywords like `extern "C" pub const unsafe fn()`, but I don't know what a good way to cover this would be. It also seems rarer than `extern "C" unsafe` which I saw happen a few times in workshops.
Remove unnecessary range replacements
This PR removes an unnecessary range replacement in `collect_tokens_trailing_token`, and does a couple of other small cleanups.
r? ````@petrochenkov````
A fully imperative style is easier to read than a half-iterator,
half-imperative style. Also, rename `inner_attr` as `attr` because it
might be an outer attribute.
Imagine you have replace ranges (2..20,X) and (5..15,Y), and these tokens:
```
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x
```
If we replace (5..15,Y) first, then (2..20,X) we get this sequence
```
a,b,c,d,e,Y,_,_,_,_,_,_,_,_,_,p,q,r,s,t,u,v,w,x
a,b,X,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
```
which is what we want.
If we do it in the other order, we get this:
```
a,b,X,_,_,_,_,_,_,_,_,_,_,_,_,p,q,r,s,t,u,v,w,x
a,b,X,_,_,Y,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
```
which is wrong. So it's true that we need the `.rev()` but the comment
is wrong about why.
The current code is this:
```
self.capture_state.replace_ranges.push((start_pos..end_pos, Some(target)));
self.capture_state.replace_ranges.extend(inner_attr_replace_ranges);
```
What's not obvious is that every range in `inner_attr_replace_ranges`
must be a strict sub-range of `start_pos..end_pos`. Which means, in
`LazyAttrTokenStreamImpl::to_attr_token_stream`, they will be done
first, and then the `start_pos..end_pos` replacement will just overwrite
them. So they aren't needed.
This has been bugging me for a while. I find complex "if any of these
are true" conditions easier to think about than complex "if all of these
are true" conditions, because you can stop as soon as one is true.
Reorder trait bound modifiers *after* `for<...>` binder in trait bounds
This PR suggests changing the grammar of trait bounds from:
```
[CONSTNESS] [ASYNCNESS] [?] [BINDER] [TRAIT_PATH]
const async ? for<'a> Sized
```
to
```
([BINDER] [CONSTNESS] [ASYNCNESS] | [?]) [TRAIT_PATH]
```
i.e., either
```
? Sized
```
or
```
for<'a> const async Sized
```
(but not both)
### Why?
I think it's strange that the binder applies "more tightly" than the `?` trait polarity. This becomes even weirder when considering that we (or at least, I) want to have `async` trait bounds expressed like:
```
where T: for<'a> async Fn(&'a ()) -> i32,
```
and not:
```
where T: async for<'a> Fn(&'a ()) -> i32,
```
### Fallout
No crates on crater use this syntax, presumably because it's literally useless. This will require modifying the reference grammar, though.
### Alternatives
If this is not desirable, then we can alternatively keep parsing `for<'a>` after the `?` but deprecate it with either an FCW (or an immediate hard error), and begin parsing `for<'a>` *before* the `?`.
Adding details, clarifying lots of little things, etc. In particular,
the commit adds details of an example. I find this very helpful, because
it's taken me a long time to understand how this code works.
Currently in `collect_tokens_trailing_token`, `start_pos` and `end_pos`
are 1-indexed by `replace_ranges` is 0-indexed, which is really
confusing. Making them both 0-indexed makes debugging much easier.