Subdiagnostics don't need to be lazily translated, they can always be
eagerly translated. Eager translation is slightly more complex as we need
to have a `DiagCtxt` available to perform the translation, which involves
slightly more threading of that context.
This slight increase in complexity should enable later simplifications -
like passing `DiagCtxt` into `AddToDiagnostic` and moving Fluent messages
into the diagnostic structs rather than having them in separate files
(working on that was what led to this change).
Signed-off-by: David Wood <david@davidtw.co>
Be more careful about interpreting a label/lifetime as a mistyped char literal.
Currently the parser interprets any label/lifetime in certain positions as a mistyped char literal, on the assumption that the trailing single quote was accidentally omitted. In such cases it gives an error with a suggestion to add the trailing single quote, and then puts the appropriate char literal into the AST. This behaviour was introduced in #101293.
This is reasonable for a case like this:
```
let c = 'a;
```
because `'a'` is a valid char literal. It's less reasonable for a case like this:
```
let c = 'abc;
```
because `'abc'` is not a valid char literal.
Prior to #120329 this could result in some sub-optimal suggestions in error messages, but nothing else. But #120329 changed `LitKind::from_token_lit` to assume that the char/byte/string literals it receives are valid, and to assert if not. This is reasonable because the lexer does not produce invalid char/byte/string literals in general. But in this "interpret label/lifetime as unclosed char literal" case the parser can produce an invalid char literal with contents such as `abc`, which triggers an assertion failure.
This PR changes the parser so it's more cautious about interpreting labels/lifetimes as unclosed char literals.
Fixes#120397.
r? `@compiler-errors`
Currently the parser will interpret any label/lifetime in certain
positions as a mistyped char literal, on the assumption that the
trailing single quote was accidentally omitted. This is reasonable for a
something like 'a (because 'a' would be valid) but not reasonable for a
something like 'abc (because 'abc' is not valid).
This commit restricts this behaviour only to labels/lifetimes that would
be valid char literals, via the new `could_be_unclosed_char_literal`
function. The commit also augments the `label-is-actually-char.rs` test
in a couple of ways:
- Adds testing of labels/lifetimes with identifiers longer than one
char, e.g. 'abc.
- Adds a new match with simpler patterns, because the
`recover_unclosed_char` call in `parse_pat_with_range_pat` was not
being exercised (in this test or any other ui tests).
Fixes#120397, an assertion failure, which was caused by this behaviour
in the parser interacting with some new stricter char literal checking
added in #120329.
In #119606 I added them and used a `_mv` suffix, but that wasn't great.
A `with_` prefix has three different existing uses.
- Constructors, e.g. `Vec::with_capacity`.
- Wrappers that provide an environment to execute some code, e.g.
`with_session_globals`.
- Consuming chaining methods, e.g. `Span::with_{lo,hi,ctxt}`.
The third case is exactly what we want, so this commit changes
`DiagnosticBuilder::foo_mv` to `DiagnosticBuilder::with_foo`.
Thanks to @compiler-errors for the suggestion.
The existing uses are replaced in one of three ways.
- In a function that also has calls to `emit`, just rearrange the code
so that exactly one of `delay_as_bug` or `emit` is called on every
path.
- In a function returning a `DiagnosticBuilder`, use
`downgrade_to_delayed_bug`. That's good enough because it will get
emitted later anyway.
- In `unclosed_delim_err`, one set of errors is being replaced with
another set, so just cancel the original errors.
This works for most of its call sites. This is nice, because `emit` very
much makes sense as a consuming operation -- indeed,
`DiagnosticBuilderState` exists to ensure no diagnostic is emitted
twice, but it uses runtime checks.
For the small number of call sites where a consuming emit doesn't work,
the commit adds `DiagnosticBuilder::emit_without_consuming`. (This will
be removed in subsequent commits.)
Likewise, `emit_unless` becomes consuming. And `delay_as_bug` becomes
consuming, while `delay_as_bug_without_consuming` is added (which will
also be removed in subsequent commits.)
All this requires significant changes to `DiagnosticBuilder`'s chaining
methods. Currently `DiagnosticBuilder` method chaining uses a
non-consuming `&mut self -> &mut Self` style, which allows chaining to
be used when the chain ends in `emit()`, like so:
```
struct_err(msg).span(span).emit();
```
But it doesn't work when producing a `DiagnosticBuilder` value,
requiring this:
```
let mut err = self.struct_err(msg);
err.span(span);
err
```
This style of chaining won't work with consuming `emit` though. For
that, we need to use to a `self -> Self` style. That also would allow
`DiagnosticBuilder` production to be chained, e.g.:
```
self.struct_err(msg).span(span)
```
However, removing the `&mut self -> &mut Self` style would require that
individual modifications of a `DiagnosticBuilder` go from this:
```
err.span(span);
```
to this:
```
err = err.span(span);
```
There are *many* such places. I have a high tolerance for tedious
refactorings, but even I gave up after a long time trying to convert
them all.
Instead, this commit has it both ways: the existing `&mut self -> Self`
chaining methods are kept, and new `self -> Self` chaining methods are
added, all of which have a `_mv` suffix (short for "move"). Changes to
the existing `forward!` macro lets this happen with very little
additional boilerplate code. I chose to add the suffix to the new
chaining methods rather than the existing ones, because the number of
changes required is much smaller that way.
This doubled chainging is a bit clumsy, but I think it is worthwhile
because it allows a *lot* of good things to subsequently happen. In this
commit, there are many `mut` qualifiers removed in places where
diagnostics are emitted without being modified. In subsequent commits:
- chaining can be used more, making the code more concise;
- more use of chaining also permits the removal of redundant diagnostic
APIs like `struct_err_with_code`, which can be replaced easily with
`struct_err` + `code_mv`;
- `emit_without_diagnostic` can be removed, which simplifies a lot of
machinery, removing the need for `DiagnosticBuilderState`.
Clairify `ast::PatKind::Struct` presese of `..` by using an enum instead of a bool
The bool is mainly used for when a `..` is present, but it is also set on recovery to avoid errors. The doc comment not describes both of these cases.
See cee794ee98/compiler/rustc_parse/src/parser/pat.rs (L890-L897) for the only place this is constructed.
r? ``@compiler-errors``
`IntoDiagnostic` defaults to `ErrorGuaranteed`, because errors are the
most common diagnostic level. It makes sense to do likewise for the
closely-related (and much more widely used) `DiagnosticBuilder` type,
letting us write `DiagnosticBuilder<'a, ErrorGuaranteed>` as just
`DiagnosticBuilder<'a>`. This cuts over 200 lines of code due to many
multi-line things becoming single line things.
This commit replaces this pattern:
```
err.into_diagnostic(dcx)
```
with this pattern:
```
dcx.create_err(err)
```
in a lot of places.
It's a little shorter, makes the error level explicit, avoids some
`IntoDiagnostic` imports, and is a necessary prerequisite for the next
commit which will add a `level` arg to `into_diagnostic`.
This requires adding `track_caller` on `create_err` to avoid mucking up
the output of `tests/ui/track-diagnostics/track4.rs`. It probably should
have been there already.
never_patterns: Parse match arms with no body
Never patterns are meant to signal unreachable cases, and thus don't take bodies:
```rust
let ptr: *const Option<!> = ...;
match *ptr {
None => { foo(); }
Some(!),
}
```
This PR makes rustc accept the above, and enforces that an arm has a body xor is a never pattern. This affects parsing of match arms even with the feature off, so this is delicate. (Plus this is my first non-trivial change to the parser).
~~The last commit is optional; it introduces a bit of churn to allow the new suggestions to be machine-applicable. There may be a better solution? I'm not sure.~~ EDIT: I removed that commit
r? `@compiler-errors`
Because a macro invocation can expand to a never pattern, we can't rule
out a `arm!(),` arm at parse time. Instead we detect that case at
expansion time, if the macro tries to output a pattern followed by `=>`.
Add `never_patterns` feature gate
This PR adds the feature gate and most basic parsing for the experimental `never_patterns` feature. See the tracking issue (https://github.com/rust-lang/rust/issues/118155) for details on the experiment.
`@scottmcm` has agreed to be my lang-team liaison for this experiment.
More detail when expecting expression but encountering bad macro argument
On nested macro invocations where the same macro fragment changes fragment type from one to the next, point at the chain of invocations and at the macro fragment definition place, explaining that the change has occurred.
Fix#71039.
```
error: expected expression, found pattern `1 + 1`
--> $DIR/trace_faulty_macros.rs:49:37
|
LL | (let $p:pat = $e:expr) => {test!(($p,$e))};
| ------- -- this is interpreted as expression, but it is expected to be pattern
| |
| this macro fragment matcher is expression
...
LL | (($p:pat, $e:pat)) => {let $p = $e;};
| ------ ^^ expected expression
| |
| this macro fragment matcher is pattern
...
LL | test!(let x = 1+1);
| ------------------
| | |
| | this is expected to be expression
| in this macro invocation
|
= note: when forwarding a matched fragment to another macro-by-example, matchers in the second macro will see an opaque AST of the fragment type, not the underlying tokens
= note: this error originates in the macro `test` (in Nightly builds, run with -Z macro-backtrace for more info)
```
Most notably, this commit changes the `pub use crate::*;` in that file
to `use crate::*;`. This requires a lot of `use` items in other crates
to be adjusted, because everything defined within `rustc_span::*` was
also available via `rustc_span::source_map::*`, which is bizarre.
The commit also removes `SourceMap::span_to_relative_line_string`, which
is unused.
Currently a `{D,Subd}iagnosticMessage` can be created from any type that
impls `Into<String>`. That includes `&str`, `String`, and `Cow<'static,
str>`, which are reasonable. It also includes `&String`, which is pretty
weird, and results in many places making unnecessary allocations for
patterns like this:
```
self.fatal(&format!(...))
```
This creates a string with `format!`, takes a reference, passes the
reference to `fatal`, which does an `into()`, which clones the
reference, doing a second allocation. Two allocations for a single
string, bleh.
This commit changes the `From` impls so that you can only create a
`{D,Subd}iagnosticMessage` from `&str`, `String`, or `Cow<'static,
str>`. This requires changing all the places that currently create one
from a `&String`. Most of these are of the `&format!(...)` form
described above; each one removes an unnecessary static `&`, plus an
allocation when executed. There are also a few places where the existing
use of `&String` was more reasonable; these now just use `clone()` at
the call site.
As well as making the code nicer and more efficient, this is a step
towards possibly using `Cow<'static, str>` in
`{D,Subd}iagnosticMessage::{Str,Eager}`. That would require changing
the `From<&'a str>` impls to `From<&'static str>`, which is doable, but
I'm not yet sure if it's worthwhile.
My type ascription
Oh rip it out
Ah
If you think we live too much then
You can sacrifice diagnostics
Don't mix your garbage
Into my syntax
So many weird hacks keep diagnostics alive
Yet I don't even step outside
So many bad diagnostics keep tyasc alive
Yet tyasc doesn't even bother to survive!
Instead of loading the Fluent resources for every crate in
`rustc_error_messages`, each crate generates typed identifiers for its
own diagnostics and creates a static which are pulled together in the
`rustc_driver` crate and provided to the diagnostic emitter.
Signed-off-by: David Wood <david.wood@huawei.com>
Improve diagnostic for missing space in range pattern
Improves the diagnostic in #107425 by turning it into a note explaining the parsing issue.
r? `@compiler-errors`
Revert "review comment: Remove AST AnonTy"
This reverts commit 020cca8d36.
Revert "Ensure macros are not affected"
This reverts commit 12d18e4031.
Revert "Emit fewer errors on patterns with possible type ascription"
This reverts commit c847a01a3b.
Revert "Teach parser to understand fake anonymous enum syntax"
This reverts commit 2d82420665.
These two methods both produce a `MetaItemLit`, and then some of the
call sites convert the `MetaItemLit` to a `token::Lit` with
`as_token_lit`.
This commit parameterises these two methods with a `mk_lit_char`
closure, which can be used to produce either `MetaItemLit` or
`token::Lit` directly as necessary.
`token::Lit` contains a `kind` field that indicates what kind of literal
it is. `ast::MetaItemLit` currently wraps a `token::Lit` but also has
its own `kind` field. This means that `ast::MetaItemLit` encodes the
literal kind in two different ways.
This commit changes `ast::MetaItemLit` so it no longer wraps
`token::Lit`. It now contains the `symbol` and `suffix` fields from
`token::Lit`, but not the `kind` field, eliminating the redundancy.
`MacArgs` is an enum with three variants: `Empty`, `Delimited`, and `Eq`. It's
used in two ways:
- For representing attribute macro arguments (e.g. in `AttrItem`), where all
three variants are used.
- For representing function-like macros (e.g. in `MacCall` and `MacroDef`),
where only the `Delimited` variant is used.
In other words, `MacArgs` is used in two quite different places due to them
having partial overlap. I find this makes the code hard to read. It also leads
to various unreachable code paths, and allows invalid values (such as
accidentally using `MacArgs::Empty` in a `MacCall`).
This commit splits `MacArgs` in two:
- `DelimArgs` is a new struct just for the "delimited arguments" case. It is
now used in `MacCall` and `MacroDef`.
- `AttrArgs` is a renaming of the old `MacArgs` enum for the attribute macro
case. Its `Delimited` variant now contains a `DelimArgs`.
Various other related things are renamed as well.
These changes make the code clearer, avoids several unreachable paths, and
disallows the invalid values.
Instead of `ast::Lit`.
Literal lowering now happens at two different times. Expression literals
are lowered when HIR is crated. Attribute literals are lowered during
parsing.
This commit changes the language very slightly. Some programs that used
to not compile now will compile. This is because some invalid literals
that are removed by `cfg` or attribute macros will no longer trigger
errors. See this comment for more details:
https://github.com/rust-lang/rust/pull/102944#issuecomment-1277476773
Suggest removing unnecessary prefix let in patterns
Helps with #101291, though I think `@estebank` probably wants this:
> Finally, I think it'd be nice if we could detect that we don't know for sure and "just" swallow the rest of the expression (find the next ; accounting for nested braces) or the end of the item (easier).
... to be implemented before we close that issue out completely.
In some places we use `Vec<Attribute>` and some places we use
`ThinVec<Attribute>` (a.k.a. `AttrVec`). This results in various points
where we have to convert between `Vec` and `ThinVec`.
This commit changes the places that use `Vec<Attribute>` to use
`AttrVec`. A lot of this is mechanical and boring, but there are
some interesting parts:
- It adds a few new methods to `ThinVec`.
- It implements `MapInPlace` for `ThinVec`, and introduces a macro to
avoid the repetition of this trait for `Vec`, `SmallVec`, and
`ThinVec`.
Overall, it makes the code a little nicer, and has little effect on
performance. But it is a precursor to removing
`rustc_data_structures::thin_vec::ThinVec` and replacing it with
`thin_vec::ThinVec`, which is implemented more efficiently.
`Parser::parse_bottom_expr` currently constructs an empty `attrs` and
then passes it to a large number of other functions. This makes the code
harder to read than it should be, because it's not clear that many
`attrs` arguments are always empty.
This commit removes `attrs` and the passing, simplifying a lot of
functions. The commit also renames `Parser::mk_expr` (which takes an
`attrs` argument) as `mk_expr_with_attrs`, and introduces a new
`mk_expr` which creates an expression with no attributes, which is the
more common case.
Change `span_suggestion` (and variants) to take `impl ToString` rather
than `String` for the suggested code, as this simplifies the
requirements on the diagnostic derive.
Signed-off-by: David Wood <david.wood@huawei.com>
* Recover from invalid `'label: ` before block.
* Make suggestion to enclose statements in a block multipart.
* Point at `match`, `while`, `loop` and `unsafe` keywords when failing
to parse their expression.
* Do not suggest `{ ; }`.
* Do not suggest `|` when very unlikely to be what was wanted (in `let`
statements).
Add diagnostics for mistyped inclusive range
Inclusive ranges are correctly typed as `..=`. However, it's quite easy to think of it as being like `==`, and type `..==` instead. This PR adds helpful diagnostics for this case.
Resolves#86395 (there are some other cases there, but I think those should probably have separate issues).
r? `@estebank`