Trim whitespace in RemoveLet primary span
Separate `RemoveLet` span into primary span for `let` and removal suggestion span for `let `, so that primary span does not include whitespace.
Fixes: #133031
If a macro statement has been parsed after `else`, suggest a missing `if`:
```
error: expected `{`, found `falsy`
--> $DIR/else-no-if.rs:47:12
|
LL | } else falsy! {} {
| ---- ^^^^^
| |
| expected an `if` or a block after this `else`
|
help: add an `if` if this is the condition of a chained `else if` statement
|
LL | } else if falsy! {} {
| ++
```
Look at the expression that was parsed when trying to recover from a bad `if` condition to determine what was likely intended by the user beyond "maybe this was meant to be an `else` body".
```
error: expected `{`, found `map`
--> $DIR/missing-dot-on-if-condition-expression-fixable.rs:4:30
|
LL | for _ in [1, 2, 3].iter()map(|x| x) {}
| ^^^ expected `{`
|
help: you might have meant to write a method call
|
LL | for _ in [1, 2, 3].iter().map(|x| x) {}
| +
```
Separate `RemoveLet` span into primary span for `let` and removal
suggestion span for `let `, so that primary span does not include
whitespace.
Fixes: #133031
Signed-off-by: Tyrone Wu <wudevelops@gmail.com>
`to_lowercase` allocates, but `eq_ignore_ascii_case` doesn't. This path
is hot enough that this makes a small but noticeable difference in
benchmarking.
Enforce that raw lifetimes must be valid raw identifiers
Make sure that the identifier part of a raw lifetime is a valid raw identifier. This precludes `'r#_` and all module segment paths for now.
I don't believe this is compelling to support. This was raised by `@ehuss` in https://github.com/rust-lang/reference/pull/1603#discussion_r1822726753 (well, specifically the `'r#_` case), but I don't see why we shouldn't just make it consistent with raw identifiers.
Use `token_descr` more in error messages
This is the first two commits from #124141, put into their own PR to get things rolling. Commit messages have the details.
r? ``@estebank``
cc ``@petrochenkov``
By using `token_descr`, as is done for many other errors, we can get
slightly better descriptions in error messages, e.g.
"macro expansion ignores token `let` and any following" becomes
"macro expansion ignores keyword `let` and any tokens following".
This will be more important once invisible delimiters start being
mentioned in error messages -- without this commit, that leads to error
messages such as "error at ``" because invisible delimiters are
pretty printed as an empty string.
Rollup of 9 pull requests
Successful merges:
- #122670 (Fix bug where `option_env!` would return `None` when env var is present but not valid Unicode)
- #131095 (Use environment variables instead of command line arguments for merged doctests)
- #131339 (Expand set_ptr_value / with_metadata_of docs)
- #131652 (Move polarity into `PolyTraitRef` rather than storing it on the side)
- #131675 (Update lint message for ABI not supported)
- #131681 (Fix up-to-date checking for run-make tests)
- #131702 (Suppress import errors for traits that couldve applied for method lookup error)
- #131703 (Resolved python deprecation warning in publish_toolstate.py)
- #131710 (Remove `'apostrophes'` from `rustc_parse_format`)
r? `@ghost`
`@rustbot` modify labels: rollup
Add `&pin (mut|const) T` type position sugar
This adds parser support for `&pin mut T` and `&pin const T` references. These are desugared to `Pin<&mut T>` and `Pin<&T>` in the AST lowering phases.
This PR currently includes #130526 since that one is in the commit queue. Only the most recent commits (bd450027eb4a94b814a7dd9c0fa29102e6361149 and following) are new.
Tracking:
- #130494
r? `@compiler-errors`
Add suggestion for removing invalid path sep `::` in fn def
Add suggestion for removing invalid path separator `::` in function definition.
for example: `fn invalid_path_separator::<T>() {}`
fixes#130791
Retire the `unnamed_fields` feature for now
`#![feature(unnamed_fields)]` was implemented in part in #115131 and #115367, however work on that feature has (afaict) stalled and in the mean time there have been some concerns raised (e.g.[^1][^2]) about whether `unnamed_fields` is worthwhile to have in the language, especially in its current desugaring. Because it represents a compiler implementation burden including a new kind of anonymous ADT and additional complication to field selection, and is quite prone to bugs today, I'm choosing to remove the feature.
However, since I'm not one to really write a bunch of words, I'm specifically *not* going to de-RFC this feature. This PR essentially *rolls back* the state of this feature to "RFC accepted but not yet implemented"; however if anyone wants to formally unapprove the RFC from the t-lang side, then please be my guest. I'm just not totally willing to summarize the various language-facing reasons for why this feature is or is not worthwhile, since I'm coming from the compiler side mostly.
Fixes#117942Fixes#121161Fixes#121263Fixes#121299Fixes#121722Fixes#121799Fixes#126969Fixes#131041
Tracking:
* https://github.com/rust-lang/rust/issues/49804
[^1]: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Unnamed.20struct.2Funion.20fields
[^2]: https://github.com/rust-lang/rust/issues/49804#issuecomment-1972619108
```
error: expected a pattern, found an expression
--> f889.rs:3:13
|
3 | let (x, y.drop()) = (1, 2); //~ ERROR
| ^^^^^^^^ not a pattern
|
= note: arbitrary expressions are not allowed in patterns: <https://doc.rust-lang.org/book/ch18-00-patterns.html>
error[E0532]: expected a pattern, found a function call
--> f889.rs:2:13
|
2 | let (x, drop(y)) = (1, 2); //~ ERROR
| ^^^^ not a tuple struct or tuple variant
|
= note: function calls are not allowed in patterns: <https://doc.rust-lang.org/book/ch18-00-patterns.html>
```
Fix#97200.
Implement RFC3695 Allow boolean literals as cfg predicates
This PR implements https://github.com/rust-lang/rfcs/pull/3695: allow boolean literals as cfg predicates, i.e. `cfg(true)` and `cfg(false)`.
r? `@nnethercote` *(or anyone with parser knowledge)*
cc `@clubby789`
Fix `break_last_token`.
It currently doesn't handle the three-char tokens `>>=` and `<<=` correctly. These can be broken twice, resulting in three individual tokens. This is a latent bug that currently doesn't cause any problems, but does cause problems for #124141, because that PR increases the usage of lazy token streams.
r? `@petrochenkov`
It currently doesn't handle the three-char tokens `>>=` and `<<=`
correctly. These can be broken twice, resulting in three individual
tokens. This is a latent bug that currently doesn't cause any problems,
but does cause problems for #124141, because that PR increases the usage
of lazy token streams.
Implement a Method to Seal `DiagInner`'s Suggestions
This PR adds a method on `DiagInner` called `.seal_suggestions()` to prevent new suggestions from being added while preserving existing suggestions.
This is useful because currently there is no way to prevent new suggestions from being added to a diagnostic. `.disable_suggestions()` is the closest but it gets rid of all suggestions before and after the call.
Therefore, `.seal_suggestions()` can be used when, for example, misspelled keyword is detected and reported. In such cases, we may want to prevent other suggestions from being added to the diagnostic, as they would likely be meaningless once the misspelled keyword is identified. For context: https://github.com/rust-lang/rust/pull/129899#discussion_r1741307132
To store an additional state, the type of the `suggestions` field in `DiagInner` was changed into a three variant enum. While this change affects files across different crates, care was taken to preserve the existing code's semantics. This is validated by the fact that all UI tests pass without any modifications.
r? chenyukang
stabilize `const_extern_fn`
closes https://github.com/rust-lang/rust/issues/64926
tracking issue: https://github.com/rust-lang/rust/issues/64926
reference PR: https://github.com/rust-lang/reference/pull/1596
## Stabilizaton Report
### Summary
Using `const extern "Rust"` and `const extern "C"` was already stabilized (since version 1.62.0, see https://github.com/rust-lang/rust/pull/95346). This PR stabilizes the other calling conventions: it is now possible to write `const unsafe extern "calling-convention" fn` and `const extern "calling-convention" fn` for any supported calling convention:
```rust
const extern "C-unwind" fn foo1(val: u8) -> u8 { val + 1}
const extern "stdcall" fn foo2(val: u8) -> u8 { val + 1}
const unsafe extern "C-unwind" fn bar1(val: bool) -> bool { !val }
const unsafe extern "stdcall" fn bar2(val: bool) -> bool { !val }
```
This can be used to const-ify an `extern fn`, or conversely, to make a `const fn` callable from external code.
r? T-lang
cc `@RalfJung`
Fix `clippy::useless_conversion`
Self-explanatory. Probably the last clippy change I'll actually put up since this is the only other one I've actually seen in the wild.
Simplify some nested `if` statements
Applies some but not all instances of `clippy::collapsible_if`. Some ended up looking worse afterwards, though, so I left those out. Also applies instances of `clippy::collapsible_else_if`
Review with whitespace disabled please.
Fix double handling in `collect_tokens`
Double handling of AST nodes can occur in `collect_tokens`. This is when an inner call to `collect_tokens` produces an AST node, and then an outer call to `collect_tokens` produces the same AST node. This can happen in a few places, e.g. expression statements where the statement delegates `HasTokens` and `HasAttrs` to the expression. It will also happen more after #124141.
This PR fixes some double handling cases that cause problems, including #129166.
r? `@petrochenkov`
Implement raw lifetimes and labels (`'r#ident`)
This PR does two things:
1. Reserve lifetime prefixes, e.g. `'prefix#lt` in edition 2021.
2. Implements raw lifetimes, e.g. `'r#async` in edition 2021.
This PR additionally extends the `keyword_idents_2024` lint to also check lifetimes.
cc `@traviscross`
r? parser
Add Suggestions for Misspelled Keywords
Fixes#97793
This PR detects misspelled keywords using two heuristics:
1. Lowercasing the unexpected identifier.
2. Using edit distance to find a keyword similar to the unexpected identifier.
However, it does not detect each and every misspelled keyword to
minimize false positives and ambiguities. More details about the
implementation can be found in the comments.
This PR detects misspelled keywords using two heuristics:
1. Lowercasing the unexpected identifier.
2. Using edit distance to find a keyword similar to the unexpected identifier.
However, it does not detect each and every misspelled keyword to
minimize false positives and ambiguities. More details about the
implementation can be found in the comments.
Don't make statement nonterminals match pattern nonterminals
Right now, the heuristic we use to check if a token may begin a pattern nonterminal falls back to `may_be_ident`:
ef71f1047e/compiler/rustc_parse/src/parser/nonterminal.rs (L21-L37)
This has the unfortunate side effect that a `stmt` nonterminal eagerly matches against a `pat` nonterminal, leading to a parse error:
```rust
macro_rules! m {
($pat:pat) => {};
($stmt:stmt) => {};
}
macro_rules! m2 {
($stmt:stmt) => {
m! { $stmt }
};
}
m2! { let x = 1 }
```
This PR fixes it by more accurately reflecting the set of nonterminals that may begin a pattern nonterminal.
As a side-effect, I modified `Token::can_begin_pattern` to work correctly and used that in `Parser::nonterminal_may_begin_with`.
By keeping track of attributes that have been previously processed.
This fixes the `macro-rules-derive-cfg.stdout` test, and is necessary
for #124141 which removes nonterminals.
Also shrink the `SmallVec` inline size used in `IntervalSet`. 2 gives
slightly better perf than 4 now that there's an `IntervalSet` in
`Parser`, which is cloned reasonably often.
In a case like this:
```
mod a {
mod b {
#[cfg_attr(unix, inline)]
fn f() {
#[cfg_attr(linux, inline)]
fn g1() {}
#[cfg_attr(linux, inline)]
fn g2() {}
}
}
}
```
We currently end up with the following replacement ranges.
- The lazy tokens for `f` has replacement ranges for `g1` and `g2`.
- The lazy tokens for `a` has replacement ranges for `f`, `g1`, and
`g2`.
I.e. the replacement ranges for `g1` and `g2` are duplicated. In
general, replacement ranges for inner AST nodes are duplicated up the
chain for each nested `collect_tokens` call. And the code that processes
the replacements is careful about the ordering in which the replacements
are applied, to ensure that inner replacements are applied before outer
replacements.
But all of this is unnecessary. If you apply an inner replacement and
then an outer replacement, the outer replacement completely overwrites
the inner replacement.
This commit avoids the duplication by removing replacements from
`self.capture_state.parser_replacements` when they are used. (The effect
on the example above is that the lazy tokesn for `a` no longer include
replacement ranges for `g1` and `g2`.) This eliminates the possibility
of nested replacements on individual AST nodes, which avoids the need
for careful ordering of replacements.
This example triggers an assertion failure:
```
fn f() -> u32 {
#[cfg_eval] #[cfg(not(FALSE))] 0
}
```
The sequence of events:
- `configure_annotatable` calls `parse_expr_force_collect`, which calls
`collect_tokens`.
- Within that, we end up in `parse_expr_dot_or_call`, which again calls
`collect_tokens`.
- The return value of the `f` call is the expression `0`.
- This inner call collects tokens for `0` (parser range 10..11) and
creates a replacement covering `#[cfg(not(FALSE))] 0` (parser range
0..11).
- We return to the outer `collect_tokens` call. The return value of the
`f` call is *again* the expression `0`, again with the range 10..11,
but the replacement from earlier covers the range 0..11. The code
mistakenly assumes that any attributes from an inner `collect_tokens`
call fit entirely within the body of the result of an outer
`collect_tokens` call. So it adjusts the replacement parser range
0..11 to a node range by subtracting 10, resulting in -10..1. This is
an invalid range and triggers an assertion failure.
It's tricky to follow, but basically things get complicated when an AST
node is returned from an inner `collect_tokens` call and then returned
again from an outer `collect_token` node without being wrapped in any
kind of additional layer.
This commit changes `collect_tokens` to return early in some extra cases,
avoiding the construction of lazy tokens. In the example above, the
outer `collect_tokens` returns earlier because the `0` token already has
tokens and `self.capture_state.capturing` is `Capturing::No`. This early
return avoids the creation of the invalid range and the assertion
failure.
Fixes#129166. Note: these invalid ranges have been happening for a long
time. #128725 looks like it's at fault only because it introduced the
assertion that catches the invalid ranges.
Stabilize opaque type precise capturing (RFC 3617)
This PR partially stabilizes opaque type *precise capturing*, which was specified in [RFC 3617](https://github.com/rust-lang/rfcs/pull/3617), and whose syntax was amended by FCP in [#125836](https://github.com/rust-lang/rust/issues/125836).
This feature, as stabilized here, gives us a way to explicitly specify the generic lifetime parameters that an RPIT-like opaque type captures. This solves the problem of overcapturing, for lifetime parameters in these opaque types, and will allow the Lifetime Capture Rules 2024 ([RFC 3498](https://github.com/rust-lang/rfcs/pull/3498)) to be fully stabilized for RPIT in Rust 2024.
### What are we stabilizing?
This PR stabilizes the use of a `use<'a, T>` bound in return-position impl Trait opaque types. Such a bound fully specifies the set of generic parameters captured by the RPIT opaque type, entirely overriding the implicit default behavior. E.g.:
```rust
fn does_not_capture<'a, 'b>() -> impl Sized + use<'a> {}
// ~~~~~~~~~~~~~~~~~~~~
// This RPIT opaque type does not capture `'b`.
```
The way we would suggest thinking of `impl Trait` types *without* an explicit `use<..>` bound is that the `use<..>` bound has been *elided*, and that the bound is filled in automatically by the compiler according to the edition-specific capture rules.
All non-`'static` lifetime parameters, named (i.e. non-APIT) type parameters, and const parameters in scope are valid to name, including an elided lifetime if such a lifetime would also be valid in an outlives bound, e.g.:
```rust
fn elided(x: &u8) -> impl Sized + use<'_> { x }
```
Lifetimes must be listed before type and const parameters, but otherwise the ordering is not relevant to the `use<..>` bound. Captured parameters may not be duplicated. For now, only one `use<..>` bound may appear in a bounds list. It may appear anywhere within the bounds list.
### How does this differ from the RFC?
This stabilization differs from the RFC in one respect: the RFC originally specified `use<'a, T>` as syntactically part of the RPIT type itself, e.g.:
```rust
fn capture<'a>() -> impl use<'a> Sized {}
```
However, settling on the final syntax was left as an open question. T-lang later decided via FCP in [#125836](https://github.com/rust-lang/rust/issues/125836) to treat `use<..>` as a syntactic bound instead, e.g.:
```rust
fn capture<'a>() -> impl Sized + use<'a> {}
```
### What aren't we stabilizing?
The key goal of this PR is to stabilize the parts of *precise capturing* that are needed to enable the migration to Rust 2024.
There are some capabilities of *precise capturing* that the RFC specifies but that we're not stabilizing here, as these require further work on the type system. We hope to lift these limitations later.
The limitations that are part of this PR were specified in the [RFC's stabilization strategy](https://rust-lang.github.io/rfcs/3617-precise-capturing.html#stabilization-strategy).
#### Not capturing type or const parameters
The RFC addresses the overcapturing of type and const parameters; that is, it allows for them to not be captured in opaque types. We're not stabilizing that in this PR. Since all in scope generic type and const parameters are implicitly captured in all editions, this is not needed for the migration to Rust 2024.
For now, when using `use<..>`, all in scope type and const parameters must be nameable (i.e., APIT cannot be used) and included as arguments. For example, this is an error because `T` is in scope and not included as an argument:
```rust
fn test<T>() -> impl Sized + use<> {}
//~^ ERROR `impl Trait` must mention all type parameters in scope in `use<...>`
```
This is due to certain current limitations in the type system related to how generic parameters are represented as captured (i.e. bivariance) and how inference operates.
We hope to relax this in the future, and this stabilization is forward compatible with doing so.
#### Precise capturing for return-position impl Trait **in trait** (RPITIT)
The RFC specifies precise capturing for RPITIT. We're not stabilizing that in this PR. Since RPITIT already adheres to the Lifetime Capture Rules 2024, this isn't needed for the migration to Rust 2024.
The effect of this is that the anonymous associated types created by RPITITs must continue to capture all of the lifetime parameters in scope, e.g.:
```rust
trait Foo<'a> {
fn test() -> impl Sized + use<Self>;
//~^ ERROR `use<...>` precise capturing syntax is currently not allowed in return-position `impl Trait` in traits
}
```
To allow this involves a meaningful amount of type system work related to adding variance to GATs or reworking how generics are represented in RPITITs. We plan to do this work separately from the stabilization. See:
- https://github.com/rust-lang/rust/pull/124029
Supporting precise capturing for RPITIT will also require us to implement a new algorithm for detecting refining capture behavior. This may involve looking through type parameters to detect cases where the impl Trait type in an implementation captures fewer lifetimes than the corresponding RPITIT in the trait definition, e.g.:
```rust
trait Foo {
fn rpit() -> impl Sized + use<Self>;
}
impl<'a> Foo for &'a () {
// This is "refining" due to not capturing `'a` which
// is implied by the trait's `use<Self>`.
fn rpit() -> impl Sized + use<>;
// This is not "refining".
fn rpit() -> impl Sized + use<'a>;
}
```
This stabilization is forward compatible with adding support for this later.
### The technical details
This bound is purely syntactical and does not lower to a [`Clause`](https://doc.rust-lang.org/1.79.0/nightly-rustc/rustc_middle/ty/type.ClauseKind.html) in the type system. For the purposes of the type system (and for the types team's curiosity regarding this stabilization), we have no current need to represent this as a `ClauseKind`.
Since opaques already capture a variable set of lifetimes depending on edition and their syntactical position (e.g. RPIT vs RPITIT), a `use<..>` bound is just a way to explicitly rather than implicitly specify that set of lifetimes, and this only affects opaque type lowering from AST to HIR.
### FCP plan
While there's much discussion of the type system here, the feature in this PR is implemented internally as a transformation that happens before lowering to the type system layer. We already support impl Trait types partially capturing the in scope lifetimes; we just currently only expose that implicitly.
So, in my (errs's) view as a types team member, there's nothing for types to weigh in on here with respect to the implementation being stabilized, and I'd suggest a lang-only proposed FCP (though we'll of course CC the team below).
### Authorship and acknowledgments
This stabilization report was coauthored by compiler-errors and TC.
TC would like to acknowledge the outstanding and speedy work that compiler-errors has done to make this feature happen.
compiler-errors thanks TC for authoring the RFC, for all of his involvement in this feature's development, and pushing the Rust 2024 edition forward.
### Open items
We're doing some things in parallel here. In signaling the intention to stabilize, we want to uncover any latent issues so we can be sure they get addressed. We want to give the maximum time for discussion here to happen by starting it while other remaining miscellaneous work proceeds. That work includes:
- [x] Look into `syn` support.
- https://github.com/dtolnay/syn/issues/1677
- https://github.com/dtolnay/syn/pull/1707
- [x] Look into `rustfmt` support.
- https://github.com/rust-lang/rust/pull/126754
- [x] Look into `rust-analyzer` support.
- https://github.com/rust-lang/rust-analyzer/issues/17598
- https://github.com/rust-lang/rust-analyzer/pull/17676
- [x] Look into `rustdoc` support.
- https://github.com/rust-lang/rust/issues/127228
- https://github.com/rust-lang/rust/pull/127632
- https://github.com/rust-lang/rust/pull/127658
- [x] Suggest this feature to RfL (a known nightly user).
- [x] Add a chapter to the edition guide.
- https://github.com/rust-lang/edition-guide/pull/316
- [x] Update the Reference.
- https://github.com/rust-lang/reference/pull/1577
### (Selected) implementation history
* https://github.com/rust-lang/rfcs/pull/3498
* https://github.com/rust-lang/rfcs/pull/3617
* https://github.com/rust-lang/rust/pull/123468
* https://github.com/rust-lang/rust/issues/125836
* https://github.com/rust-lang/rust/pull/126049
* https://github.com/rust-lang/rust/pull/126753Closes#123432.
cc `@rust-lang/lang` `@rust-lang/types`
`@rustbot` labels +T-lang +I-lang-nominated +A-impl-trait +F-precise_capturing
Tracking:
- https://github.com/rust-lang/rust/issues/123432
----
For the compiler reviewer, I'll leave some inline comments about diagnostics fallout :^)
r? compiler
Stabilize `unsafe_attributes`
# Stabilization report
## Summary
This is a tracking issue for the RFC 3325: unsafe attributes
We are stabilizing `#![feature(unsafe_attributes)]`, which makes certain attributes considered 'unsafe', meaning that they must be surrounded by an `unsafe(...)`, as in `#[unsafe(no_mangle)]`.
RFC: rust-lang/rfcs#3325
Tracking issue: #123757
## What is stabilized
### Summary of stabilization
Certain attributes will now be designated as unsafe attributes, namely, `no_mangle`, `export_name`, and `link_section` (stable only), and these attributes will need to be called by surrounding them in `unsafe(...)` syntax. On editions prior to 2024, this is simply an edition lint, but it will become a hard error in 2024. This also works in `cfg_attr`, but `unsafe` is not allowed for any other attributes, including proc-macros ones.
```rust
#[unsafe(no_mangle)]
fn a() {}
#[cfg_attr(any(), unsafe(export_name = "c"))]
fn b() {}
```
For a table showing the attributes that were considered to be included in the list to require unsafe, and subsequent reasoning about why each such attribute was or was not included, see [this comment here](https://github.com/rust-lang/rust/pull/124214#issuecomment-2124753464)
## Tests
The relevant tests are in `tests/ui/rust-2024/unsafe-attributes` and `tests/ui/attributes/unsafe`.
This commit does the following.
- Renames `collect_tokens_trailing_token` as `collect_tokens`, because
(a) it's annoying long, and (b) the `_trailing_token` bit is less
accurate now that its types have changed.
- In `collect_tokens`, adds a `Option<CollectPos>` argument and a
`UsePreAttrPos` in the return type of `f`. These are used in
`parse_expr_force_collect` (for vanilla expressions) and in
`parse_stmt_without_recovery` (for two different cases of expression
statements). Together these ensure are enough to fix all the problems
with token collection and assoc expressions. The changes to the
`stringify.rs` test demonstrate some of these.
- Adds a new test. The code in this test was causing an assertion
failure prior to this commit, due to an invalid `NodeRange`.
The extra complexity is annoying, but necessary to fix the existing
problems.
This pre-existing type is suitable for use with the return value of the
`f` parameter in `collect_tokens_trailing_token`. The more descriptive
name will be useful because the next commit will add another boolean
value to the return value of `f`.
Fix bug in `Parser::look_ahead`.
The special case was failing to handle invisible delimiters on one path.
Fixes (but doesn't close until beta backported) #128895.
r? `@davidtwco`
Use more slice patterns inside the compiler
Nothing super noteworthy. Just replacing the common 'fragile' pattern of "length check followed by indexing or unwrap" with slice patterns for legibility and 'robustness'.
r? ghost
Previously we would try to issue a suggestion for `let x <op>= 1`, i.e.
a compound assignment within a `let` binding, to remove the `<op>`. The
suggestion code unfortunately incorrectly assumed that the `<op>` is an
exactly-1-byte ASCII character, but this assumption is incorrect because
we also recover Unicode-confusables like `➖=` as `-=`. In this example,
the suggestion code used a `+ BytePos(1)` to calculate the span of the
`<op>` codepoint that looks like `-` but the mult-byte Unicode
look-alike would cause the suggested removal span to be inside a
multi-byte codepoint boundary, triggering a codepoint boundary
assertion.
Issue: <https://github.com/rust-lang/rust/issues/128845>
More unsafe attr verification
This code denies unsafe on attributes such as `#[test]` and `#[ignore]`, while also changing the `MetaItem` parsing so `unsafe` in args like `#[allow(unsafe(dead_code))]` is not accidentally allowed.
Tracking:
- https://github.com/rust-lang/rust/issues/123757
When collecting tokens there are two kinds of range:
- a range relative to the parser's full token stream (which we get when
we are parsing);
- a range relative to a single AST node's token stream (which we use
within `LazyAttrTokenStreamImpl` when replacing tokens).
These are currently both represented with `Range<u32>` and it's easy to
mix them up -- until now I hadn't properly understood the difference.
This commit introduces `ParserRange` and `NodeRange` to distinguish
them. This also requires splitting `ReplaceRange` in two, giving the new
types `ParserReplacement` and `NodeReplacement`. (These latter two names
reduce the overloading of the word "range".)
The commit also rewrites some comments to be clearer.
The end result is a little more verbose, but much clearer.
`parse_expr_assoc_with` has an awkward structure -- sometimes the lhs is
already parsed. This commit splits the post-lhs part into a new method
`parse_expr_assoc_rest_with`, which makes everything shorter and
simpler.
It has a single use. This makes the `let` handling case in
`parse_stmt_without_recovery` more similar to the statement path and
statement expression cases.
This makes it possible for the `unsafe(...)` syntax to only be
valid at the top level, and the `NestedMetaItem`s will automatically
reject `unsafe(...)`.
Mark `Parser::eat`/`check` methods as `#[must_use]`
These methods return a `bool`, but we probably should either use these values or explicitly throw them away (e.g. when we just want to unconditionally eat a token if it exists).
I changed a few places from `eat` to `expect`, but otherwise I tried to leave a comment explaining why the `eat` was okay.
This also adds a test for the `pattern_type!` macro, which used to silently accept a missing `is` token.
Add limit for unclosed delimiters in lexer diagnostic
Fixes#127868
The first commit shows the original diagnostic, and the second commit shows the changes.
improve error message when `global_asm!` uses `asm!` options
specifically, what was
error: expected one of `)`, `att_syntax`, or `raw`, found `preserves_flags`
--> $DIR/bad-options.rs:45:25
|
LL | global_asm!("", options(preserves_flags));
| ^^^^^^^^^^^^^^^ expected one of `)`, `att_syntax`, or `raw`
is now
error: the `preserves_flags` option cannot be used with `global_asm!`
--> $DIR/bad-options.rs:45:25
|
LL | global_asm!("", options(preserves_flags));
| ^^^^^^^^^^^^^^^ the `preserves_flags` option is not meaningful for global-scoped inline assembly
mirroring the phrasing of the [reference](https://doc.rust-lang.org/reference/inline-assembly.html#options).
This is also a bit of a refactor for a future `naked_asm!` macro (for use in `#[naked]` functions). Currently this sort of error can come up when switching from inline to global asm, or when a user just isn't that experienced with assembly. With `naked_asm!` added to the mix hitting this error is more likely.
Improve `extern "<abi>" unsafe fn()` error message
These errors were already reported in #87217, and fixed by #87235 but missed the case of an explicit ABI.
This PR does not cover multiple keywords like `extern "C" pub const unsafe fn()`, but I don't know what a good way to cover this would be. It also seems rarer than `extern "C" unsafe` which I saw happen a few times in workshops.
Remove unnecessary range replacements
This PR removes an unnecessary range replacement in `collect_tokens_trailing_token`, and does a couple of other small cleanups.
r? ````@petrochenkov````
A fully imperative style is easier to read than a half-iterator,
half-imperative style. Also, rename `inner_attr` as `attr` because it
might be an outer attribute.
Imagine you have replace ranges (2..20,X) and (5..15,Y), and these tokens:
```
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x
```
If we replace (5..15,Y) first, then (2..20,X) we get this sequence
```
a,b,c,d,e,Y,_,_,_,_,_,_,_,_,_,p,q,r,s,t,u,v,w,x
a,b,X,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
```
which is what we want.
If we do it in the other order, we get this:
```
a,b,X,_,_,_,_,_,_,_,_,_,_,_,_,p,q,r,s,t,u,v,w,x
a,b,X,_,_,Y,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
```
which is wrong. So it's true that we need the `.rev()` but the comment
is wrong about why.
The current code is this:
```
self.capture_state.replace_ranges.push((start_pos..end_pos, Some(target)));
self.capture_state.replace_ranges.extend(inner_attr_replace_ranges);
```
What's not obvious is that every range in `inner_attr_replace_ranges`
must be a strict sub-range of `start_pos..end_pos`. Which means, in
`LazyAttrTokenStreamImpl::to_attr_token_stream`, they will be done
first, and then the `start_pos..end_pos` replacement will just overwrite
them. So they aren't needed.
This has been bugging me for a while. I find complex "if any of these
are true" conditions easier to think about than complex "if all of these
are true" conditions, because you can stop as soon as one is true.
Reorder trait bound modifiers *after* `for<...>` binder in trait bounds
This PR suggests changing the grammar of trait bounds from:
```
[CONSTNESS] [ASYNCNESS] [?] [BINDER] [TRAIT_PATH]
const async ? for<'a> Sized
```
to
```
([BINDER] [CONSTNESS] [ASYNCNESS] | [?]) [TRAIT_PATH]
```
i.e., either
```
? Sized
```
or
```
for<'a> const async Sized
```
(but not both)
### Why?
I think it's strange that the binder applies "more tightly" than the `?` trait polarity. This becomes even weirder when considering that we (or at least, I) want to have `async` trait bounds expressed like:
```
where T: for<'a> async Fn(&'a ()) -> i32,
```
and not:
```
where T: async for<'a> Fn(&'a ()) -> i32,
```
### Fallout
No crates on crater use this syntax, presumably because it's literally useless. This will require modifying the reference grammar, though.
### Alternatives
If this is not desirable, then we can alternatively keep parsing `for<'a>` after the `?` but deprecate it with either an FCW (or an immediate hard error), and begin parsing `for<'a>` *before* the `?`.
Adding details, clarifying lots of little things, etc. In particular,
the commit adds details of an example. I find this very helpful, because
it's taken me a long time to understand how this code works.
Currently in `collect_tokens_trailing_token`, `start_pos` and `end_pos`
are 1-indexed by `replace_ranges` is 0-indexed, which is really
confusing. Making them both 0-indexed makes debugging much easier.
The `Option`s within the `ReplaceRange`s within the hashmap are always
`None`. This PR omits them and inserts them when they are extracted from
the hashmap.
There are three places where we currently check `force_collect` and call
`collect_tokens_no_attrs` for `ForceCollect::Yes` and a vanilla parsing
function for `ForceCollect::No`.
But we can instead just pass in `force_collect` and let
`collect_tokens_trailing_token` do the appropriate thing.
Fix ICE in suggestion caused by `⩵` being recovered as `==`
The second suggestion shown here would previously incorrectly assume that the span corresponding to `⩵` was 2 bytes wide composed by 2 1 byte wide chars, so a span pointing at `==` could point only at one of the `=` to remove it. Instead, we now replace the whole thing (as we should have the whole time):
```
error: unknown start of token: \u{2a75}
--> $DIR/unicode-double-equals-recovery.rs:1:16
|
LL | const A: usize ⩵ 2;
| ^
|
help: Unicode character '⩵' (Two Consecutive Equals Signs) looks like '==' (Double Equals Sign), but it is not
|
LL | const A: usize == 2;
| ~~
error: unexpected `==`
--> $DIR/unicode-double-equals-recovery.rs:1:16
|
LL | const A: usize ⩵ 2;
| ^
|
help: try using `=` instead
|
LL | const A: usize = 2;
| ~
```
Fix#127823.
The second suggestion shown here would previously incorrectly assume that the span corresponding to `⩵` was 2 bytes wide composed by 2 1 byte wide chars, so a span pointing at `==` could point only at one of the `=` to remove it. Instead, we now replace the whole thing (as we should have the whole time):
```
error: unknown start of token: \u{2a75}
--> $DIR/unicode-double-equals-recovery.rs:1:16
|
LL | const A: usize ⩵ 2;
| ^
|
help: Unicode character '⩵' (Two Consecutive Equals Signs) looks like '==' (Double Equals Sign), but it is not
|
LL | const A: usize == 2;
| ~~
error: unexpected `==`
--> $DIR/unicode-double-equals-recovery.rs:1:16
|
LL | const A: usize ⩵ 2;
| ^
|
help: try using `=` instead
|
LL | const A: usize = 2;
| ~
```
Remove `TrailingToken`.
It's used in `Parser::collect_tokens_trailing_token` to decide whether to capture a trailing token. But the callers actually know whether to capture a trailing token, so it's simpler for them to just pass in a bool.
Also, the `TrailingToken::Gt` case was weird, because it didn't result in a trailing token being captured. It could have been subsumed by the `TrailingToken::MaybeComma` case, and it effectively is in the new code.
r? `@petrochenkov`
It's used in `Parser::collect_tokens_trailing_token` to decide whether
to capture a trailing token. But the callers actually know whether to
capture a trailing token, so it's simpler for them to just pass in a
bool.
Also, the `TrailingToken::Gt` case was weird, because it didn't result
in a trailing token being captured. It could have been subsumed by the
`TrailingToken::MaybeComma` case, and it effectively is in the new code.
More accurate span for anonymous argument suggestion
Use smaller span for suggesting adding `_:` ahead of a type:
```
error: expected one of `(`, `...`, `..=`, `..`, `::`, `:`, `{`, or `|`, found `)`
--> $DIR/anon-params-denied-2018.rs:12:47
|
LL | fn foo_with_qualified_path(<Bar as T>::Baz);
| ^ expected one of 8 possible tokens
|
= note: anonymous parameters are removed in the 2018 edition (see RFC 1685)
help: explicitly ignore the parameter name
|
LL | fn foo_with_qualified_path(_: <Bar as T>::Baz);
| ++
```
Some parser improvements
I was looking closely at attribute handling in the parser while debugging some issues relating to #124141, and found a few small improvements.
``@spastorino``
Use smaller span for suggesting adding `_:` ahead of a type:
```
error: expected one of `(`, `...`, `..=`, `..`, `::`, `:`, `{`, or `|`, found `)`
--> $DIR/anon-params-denied-2018.rs:12:47
|
LL | fn foo_with_qualified_path(<Bar as T>::Baz);
| ^ expected one of 8 possible tokens
|
= note: anonymous parameters are removed in the 2018 edition (see RFC 1685)
help: explicitly ignore the parameter name
|
LL | fn foo_with_qualified_path(_: <Bar as T>::Baz);
| ++
```
It only has two call sites, and it extremely similar to
`Parser::parse_expr_dot_or_call_with`, in both name and behaviour. The
only difference is the latter has an `attrs` argument and an
`ensure_sufficient_stack` call. We can pass in an empty `attrs` as
necessary, as is already done at some `parse_expr_dot_or_call_with` call
sites.
Make parse error suggestions verbose and fix spans
Go over all structured parser suggestions and make them verbose style.
When suggesting to add or remove delimiters, turn them into multiple suggestion parts.
Fix `DebugParser`.
I tried using this and it didn't work at all. `prev_token` is never eof, so the accumulator is always false, which means the `then_some` always returns `None`, which means `scan` always returns `None`, and `tokens` always ends up an empty vec. I'm not sure how this code was supposed to work.
(An aside: I find `Iterator::scan` to be a pretty wretched function, that produces code which is very hard to understand. Probably why this is just one of two uses of it in the entire compiler.)
This commit changes it to a simpler imperative style that produces a valid `tokens` vec.
r? `@workingjubilee`
Clear `inner_attr_ranges` regularly.
There's a comment saying we don't do it for performance reasons, but it doesn't actually affect performance.
The commit also tweaks the control flow, to make clearer that two code paths are mutually exclusive.
r? ````@petrochenkov````
It currently doesn't work at all. This commit changes it to a simpler
imperative style that produces a valid `tokens` vec.
(An aside: I find `Iterator::scan` to be a pretty wretched function,
that produces code which is very hard to understand. Probably why this
is just one of two uses of it in the entire compiler.)
That method is currently badly broken, and the test output reflects
this. The obtained tokens list is always empty, except in the case where
we go two `bump`s past the final token, whereupon it will produce as
many `Eof` tokens as asked for.
Fix `Parser::look_ahead`
`Parser::look_ahead` has a slow but simple general case, and a fast special case that is hit most of the time. But the special case is buggy and behaves differently to the general case. There are also no unit tests. This PR fixes all of this, resulting in a `Parser::look_ahead` that is equally fast, slightly simpler, more correct, and better tested.
r? `@davidtwco`
This new special case is simpler than the old special case because it
only is used when `dist == 1`. But that's still enough to cover ~98% of
cases. This results in equivalent performance to the old special case,
and identical behaviour as the general case.
The general case at the bottom of `look_ahead` is slow, because it
clones the token cursor. Above it there is a special case for
performance that is hit most of the time and avoids the cloning.
Unfortunately, its behaviour differs from the general case in two ways.
- When within a pair of delimiters, if you look any distance past the
closing delimiter you get the closing delimiter instead of what comes
after the closing delimiter.
- It uses `tree_cursor.look_ahead(dist - 1)` which totally confuses
tokens with token trees. This means that only the first token in a
token tree will be seen. E.g. in a sequence like `{ a }` the `a` and
`}` will be skipped over. Bad!
It's likely that these differences weren't noticed before now because
the use of `look_ahead` in the parser is limited to small distances and
relatively few contexts.
Removing the special case causes slowdowns up of to 2% on a range of
benchmarks. The next commit will add a new, correct special case to
regain that lost performance.
Go over all structured parser suggestions and make them verbose style.
When suggesting to add or remove delimiters, turn them into multiple suggestion parts.
The new condition is equivalent in practice, but it's much more obvious
that it would result in an empty range, because the condition lines up
with the contents of the iterator.
There's a comment saying we don't do it for performance reasons, but it
doesn't actually affect performance.
The commit also tweaks the control flow, to make clearer that two code
paths are mutually exclusive.
Currently the second element is a `Vec<(FlatToken, Spacing)>`. But the
vector always has zero or one elements, and the `FlatToken` is always
`FlatToken::AttrTarget` (which contains an `AttributesData`), and the
spacing is always `Alone`. So we can simplify it to
`Option<AttributesData>`.
An assertion in `to_attr_token_stream` can can also be removed, because
`new_tokens.len()` was always 0 or 1, which means than `range.len()`
is always greater than or equal to it, because `range.is_empty()` is
always false (as per the earlier assertion).
And update the comment. Clearly the return type of this function was
changed at some point in the past, but its name and comment weren't
updated to match.
The number of source code bytes can't exceed a `u32`'s range, so a token
position also can't. This reduces the size of `Parser` and
`LazyAttrTokenStreamImpl` by eight bytes each.
Move binder and polarity parsing into `parse_generic_ty_bound`
Let's pull out the parts of #127054 which just:
1. Make the parsing code less confusing
2. Fix `?use<>` (to correctly be denied)
3. Improve `T: for<'a> 'a` diagnostics
This should have no user-facing effects on stable parsing.
r? fmease
It currently goes one token too far.
Example: line 259 of `tests/ui/abi/compatibility.rs`:
```
test_abi_compatible!(fn_fn, fn(), fn(i32) -> i32);
```
This commit changes the span for the second element from `fn(),` to
`fn()`, i.e. removes the extraneous comma.
coverage: Overhaul validation of the `#[coverage(..)]` attribute
This PR makes sweeping changes to how the (currently-unstable) coverage attribute is validated:
- Multiple coverage attributes on the same item/expression are now treated as an error.
- The attribute must always be `#[coverage(off)]` or `#[coverage(on)]`, and the error messages for this are more consistent.
- A trailing comma is still allowed after off/on, since that's part of the normal attribute syntax.
- Some places that silently ignored a coverage attribute now produce an error instead.
- These cases were all clearly bugs.
- Some places that ignored a coverage attribute (with a warning) now produce an error instead.
- These were originally added as lints, but I don't think it makes much sense to knowingly allow new attributes to be used in meaningless places.
- Some of these errors might soon disappear, if it's easy to extend recursive coverage attributes to things like modules and impl blocks.
---
One of the goals of this PR is to lay a more solid foundation for making the coverage attribute recursive, so that it applies to all nested functions/closures instead of just the one it is directly attached to.
Fixes#126658.
This PR incorporates #126659, which adds more tests for validation of the coverage attribute.
`@rustbot` label +A-code-coverage
Special case when a code line only has multiline span starts
Minimize multline span overlap when there are multiple of them starting on the same line:
```
3 | X0 Y0 Z0
| _____^ - -
| | _______| |
| || _________|
4 | ||| X1 Y1 Z1
5 | ||| X2 Y2 Z2
| |||____^__-__- `Z` label
| ||_____|__|
| |______| `Y` is a good letter too
| `X` is a good letter
```
Add hard error and migration lint for unsafe attrs
More implementation work for https://github.com/rust-lang/rust/issues/123757
This adds the migration lint for unsafe attributes, as well as making it a hard error in Rust 2024.
Merge `PatParam`/`PatWithOr`, and `Expr`/`Expr2021`, for a few reasons.
- It's conceptually nice, because the two pattern kinds and the two
expression kinds are very similar.
- With expressions in particular, there are several places where both
expression kinds get the same treatment.
- It removes one unreachable match arm.
- Most importantly, for #124141 I will need to introduce a new type
`MetaVarKind` that is very similar to `NonterminalKind`, but records a
couple of extra fields for expression metavars. It's nicer to have a
single `MetaVarKind::Expr` expression variant to hold those extra
fields instead of duplicating them across two variants
`MetaVarKind::{Expr,Expr2021}`. And then it makes sense for patterns
to be treated the same way, and for `NonterminalKind` to also be
treated the same way.
I also clarified the comments, because I have long found them a little
hard to understand.
`StaticForeignItem` and `StaticItem` are the same
The struct `StaticItem` and `StaticForeignItem` are the same, so remove `StaticForeignItem`. Having them be separate is unique to `static` items -- unlike `ForeignItemKind::{Fn,TyAlias}`, which use the normal AST item.
r? ``@spastorino`` or ``@oli-obk``
Make edition dependent `:expr` macro fragment act like the edition-dependent `:pat` fragment does
Parse the `:expr` fragment as `:expr_2021` in editions <=2021, and as `:expr` in edition 2024. This is similar to how we parse `:pat` as `:pat_param` in edition <=2018 and `:pat_with_or` in >=2021, and means we can get rid of a span dependency from `nonterminal_may_begin_with`.
Specifically, this fixes a theoretical regression since the `expr_2021` macro fragment previously would allow `const {}` if the *caller* is edition 2024. This is inconsistent with the way that the `pat` macro fragment was upgraded, and also leads to surprising behavior when a macro *caller* crate upgrades to edtion 2024, since they may have parsing changes that they never asked for (with no way of opting out of it).
This PR also allows using `expr_2021` in all editions. Why was this was disallowed in the first place? It's purely additive, and also it's still feature gated?
r? ```@fmease``` ```@eholk``` cc ```@vincenzopalazzo```
cc #123865
Tracking:
- https://github.com/rust-lang/rust/issues/123742
Improve conflict marker recovery
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.
This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using
r? <reviewer name>
-->
closes#113826
r? ```@estebank``` since you reviewed #115413
cc: ```@rben01``` since you opened up the issue in the first place
Properly gate `safe` keyword in pre-expansion
This PR gates `safe` keyword in pre-expansion contexts. Should mitigate the fallout of https://github.com/rust-lang/rust/issues/126755, which is that `safe` is now usable on beta lol.
r? `@spastorino` or `@oli-obk`
cc #124482 tracking #123743
Clean up some comments near `use` declarations
#125443 will reformat all `use` declarations in the repository. There are a few edge cases involving comments on `use` declarations that require care. This PR cleans up some clumsy comment cases, taking us a step closer to #125443 being able to merge.
r? ``@lqd``
We currently use `can_begin_literal_maybe_minus` in a couple of places
where only string literals are allowed. This commit introduces a
more specific function, which makes things clearer. It doesn't change
behaviour because the two functions affected (`is_unsafe_foreign_mod`
and `check_keyword_case`) are always followed by a call to `parse_abi`,
which checks again for a string literal.
It's clearer this way, because the `Interpolated` cases in
`can_begin_const_arg` and `is_pat_range_end_start` are more permissive
than the `Interpolated` cases in `can_begin_literal_maybe_minus`.
Fix duplicated attributes on nonterminal expressions
This PR fixes a long-standing bug (#86055) whereby expression attributes can be duplicated when expanded through declarative macros.
First, consider how items are parsed in declarative macros:
```
Items:
- parse_nonterminal
- parse_item(ForceCollect::Yes)
- parse_item_
- attrs = parse_outer_attributes
- parse_item_common(attrs)
- maybe_whole!
- collect_tokens_trailing_token
```
The important thing is that the parsing of outer attributes is outside token collection, so the item's tokens don't include the attributes. This is how it's supposed to be.
Now consider how expression are parsed in declarative macros:
```
Exprs:
- parse_nonterminal
- parse_expr_force_collect
- collect_tokens_no_attrs
- collect_tokens_trailing_token
- parse_expr
- parse_expr_res(None)
- parse_expr_assoc_with
- parse_expr_prefix
- parse_or_use_outer_attributes
- parse_expr_dot_or_call
```
The important thing is that the parsing of outer attributes is inside token collection, so the the expr's tokens do include the attributes, i.e. in `AttributesData::tokens`.
This PR fixes the bug by rearranging expression parsing to that outer attribute parsing happens outside of token collection. This requires a number of small refactorings because expression parsing is somewhat complicated. While doing so the PR makes the code a bit cleaner and simpler, by eliminating `parse_or_use_outer_attributes` and `Option<AttrWrapper>` arguments (in favour of the simpler `parse_outer_attributes` and `AttrWrapper` arguments), and simplifying `LhsExpr`.
r? `@petrochenkov`
It now parses outer attributes before collecting tokens. This avoids the
problem where the outer attribute tokens were being stored twice -- for
the attribute tokesn, and also for the expression tokens.
Fixes#86055.