Use more slice patterns inside the compiler
Nothing super noteworthy. Just replacing the common 'fragile' pattern of "length check followed by indexing or unwrap" with slice patterns for legibility and 'robustness'.
r? ghost
Previously we would try to issue a suggestion for `let x <op>= 1`, i.e.
a compound assignment within a `let` binding, to remove the `<op>`. The
suggestion code unfortunately incorrectly assumed that the `<op>` is an
exactly-1-byte ASCII character, but this assumption is incorrect because
we also recover Unicode-confusables like `➖=` as `-=`. In this example,
the suggestion code used a `+ BytePos(1)` to calculate the span of the
`<op>` codepoint that looks like `-` but the mult-byte Unicode
look-alike would cause the suggested removal span to be inside a
multi-byte codepoint boundary, triggering a codepoint boundary
assertion.
Issue: <https://github.com/rust-lang/rust/issues/128845>
More unsafe attr verification
This code denies unsafe on attributes such as `#[test]` and `#[ignore]`, while also changing the `MetaItem` parsing so `unsafe` in args like `#[allow(unsafe(dead_code))]` is not accidentally allowed.
Tracking:
- https://github.com/rust-lang/rust/issues/123757
When collecting tokens there are two kinds of range:
- a range relative to the parser's full token stream (which we get when
we are parsing);
- a range relative to a single AST node's token stream (which we use
within `LazyAttrTokenStreamImpl` when replacing tokens).
These are currently both represented with `Range<u32>` and it's easy to
mix them up -- until now I hadn't properly understood the difference.
This commit introduces `ParserRange` and `NodeRange` to distinguish
them. This also requires splitting `ReplaceRange` in two, giving the new
types `ParserReplacement` and `NodeReplacement`. (These latter two names
reduce the overloading of the word "range".)
The commit also rewrites some comments to be clearer.
The end result is a little more verbose, but much clearer.
`parse_expr_assoc_with` has an awkward structure -- sometimes the lhs is
already parsed. This commit splits the post-lhs part into a new method
`parse_expr_assoc_rest_with`, which makes everything shorter and
simpler.
It has a single use. This makes the `let` handling case in
`parse_stmt_without_recovery` more similar to the statement path and
statement expression cases.
This makes it possible for the `unsafe(...)` syntax to only be
valid at the top level, and the `NestedMetaItem`s will automatically
reject `unsafe(...)`.
Mark `Parser::eat`/`check` methods as `#[must_use]`
These methods return a `bool`, but we probably should either use these values or explicitly throw them away (e.g. when we just want to unconditionally eat a token if it exists).
I changed a few places from `eat` to `expect`, but otherwise I tried to leave a comment explaining why the `eat` was okay.
This also adds a test for the `pattern_type!` macro, which used to silently accept a missing `is` token.
Add limit for unclosed delimiters in lexer diagnostic
Fixes#127868
The first commit shows the original diagnostic, and the second commit shows the changes.
improve error message when `global_asm!` uses `asm!` options
specifically, what was
error: expected one of `)`, `att_syntax`, or `raw`, found `preserves_flags`
--> $DIR/bad-options.rs:45:25
|
LL | global_asm!("", options(preserves_flags));
| ^^^^^^^^^^^^^^^ expected one of `)`, `att_syntax`, or `raw`
is now
error: the `preserves_flags` option cannot be used with `global_asm!`
--> $DIR/bad-options.rs:45:25
|
LL | global_asm!("", options(preserves_flags));
| ^^^^^^^^^^^^^^^ the `preserves_flags` option is not meaningful for global-scoped inline assembly
mirroring the phrasing of the [reference](https://doc.rust-lang.org/reference/inline-assembly.html#options).
This is also a bit of a refactor for a future `naked_asm!` macro (for use in `#[naked]` functions). Currently this sort of error can come up when switching from inline to global asm, or when a user just isn't that experienced with assembly. With `naked_asm!` added to the mix hitting this error is more likely.
Improve `extern "<abi>" unsafe fn()` error message
These errors were already reported in #87217, and fixed by #87235 but missed the case of an explicit ABI.
This PR does not cover multiple keywords like `extern "C" pub const unsafe fn()`, but I don't know what a good way to cover this would be. It also seems rarer than `extern "C" unsafe` which I saw happen a few times in workshops.
Remove unnecessary range replacements
This PR removes an unnecessary range replacement in `collect_tokens_trailing_token`, and does a couple of other small cleanups.
r? ````@petrochenkov````
A fully imperative style is easier to read than a half-iterator,
half-imperative style. Also, rename `inner_attr` as `attr` because it
might be an outer attribute.
Imagine you have replace ranges (2..20,X) and (5..15,Y), and these tokens:
```
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x
```
If we replace (5..15,Y) first, then (2..20,X) we get this sequence
```
a,b,c,d,e,Y,_,_,_,_,_,_,_,_,_,p,q,r,s,t,u,v,w,x
a,b,X,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
```
which is what we want.
If we do it in the other order, we get this:
```
a,b,X,_,_,_,_,_,_,_,_,_,_,_,_,p,q,r,s,t,u,v,w,x
a,b,X,_,_,Y,_,_,_,_,_,_,_,_,_,_,_,_,_,_,u,v,w,x
```
which is wrong. So it's true that we need the `.rev()` but the comment
is wrong about why.
The current code is this:
```
self.capture_state.replace_ranges.push((start_pos..end_pos, Some(target)));
self.capture_state.replace_ranges.extend(inner_attr_replace_ranges);
```
What's not obvious is that every range in `inner_attr_replace_ranges`
must be a strict sub-range of `start_pos..end_pos`. Which means, in
`LazyAttrTokenStreamImpl::to_attr_token_stream`, they will be done
first, and then the `start_pos..end_pos` replacement will just overwrite
them. So they aren't needed.
This has been bugging me for a while. I find complex "if any of these
are true" conditions easier to think about than complex "if all of these
are true" conditions, because you can stop as soon as one is true.
Reorder trait bound modifiers *after* `for<...>` binder in trait bounds
This PR suggests changing the grammar of trait bounds from:
```
[CONSTNESS] [ASYNCNESS] [?] [BINDER] [TRAIT_PATH]
const async ? for<'a> Sized
```
to
```
([BINDER] [CONSTNESS] [ASYNCNESS] | [?]) [TRAIT_PATH]
```
i.e., either
```
? Sized
```
or
```
for<'a> const async Sized
```
(but not both)
### Why?
I think it's strange that the binder applies "more tightly" than the `?` trait polarity. This becomes even weirder when considering that we (or at least, I) want to have `async` trait bounds expressed like:
```
where T: for<'a> async Fn(&'a ()) -> i32,
```
and not:
```
where T: async for<'a> Fn(&'a ()) -> i32,
```
### Fallout
No crates on crater use this syntax, presumably because it's literally useless. This will require modifying the reference grammar, though.
### Alternatives
If this is not desirable, then we can alternatively keep parsing `for<'a>` after the `?` but deprecate it with either an FCW (or an immediate hard error), and begin parsing `for<'a>` *before* the `?`.
Adding details, clarifying lots of little things, etc. In particular,
the commit adds details of an example. I find this very helpful, because
it's taken me a long time to understand how this code works.
Currently in `collect_tokens_trailing_token`, `start_pos` and `end_pos`
are 1-indexed by `replace_ranges` is 0-indexed, which is really
confusing. Making them both 0-indexed makes debugging much easier.