Migrate rustc_ast_passes diagnostics to `SessionDiagnostic` and translatable messages (first part)
Doing a full migration of the `rustc_ast_passes` crate.
Making a draft here since there's not yet a tracking issue for the migrations going on.
`@rustbot` label +A-translation
In some places we use `Vec<Attribute>` and some places we use
`ThinVec<Attribute>` (a.k.a. `AttrVec`). This results in various points
where we have to convert between `Vec` and `ThinVec`.
This commit changes the places that use `Vec<Attribute>` to use
`AttrVec`. A lot of this is mechanical and boring, but there are
some interesting parts:
- It adds a few new methods to `ThinVec`.
- It implements `MapInPlace` for `ThinVec`, and introduces a macro to
avoid the repetition of this trait for `Vec`, `SmallVec`, and
`ThinVec`.
Overall, it makes the code a little nicer, and has little effect on
performance. But it is a precursor to removing
`rustc_data_structures::thin_vec::ThinVec` and replacing it with
`thin_vec::ThinVec`, which is implemented more efficiently.
Don't derive `PartialEq::ne`.
Currently we skip deriving `PartialEq::ne` for C-like (fieldless) enums
and empty structs, thus reyling on the default `ne`. This behaviour is
unnecessarily conservative, because the `PartialEq` docs say this:
> Implementations must ensure that eq and ne are consistent with each other:
>
> `a != b` if and only if `!(a == b)` (ensured by the default
> implementation).
This means that the default implementation (`!(a == b)`) is always good
enough. So this commit changes things such that `ne` is never derived.
The motivation for this change is that not deriving `ne` reduces compile
times and binary sizes.
Observable behaviour may change if a user has defined a type `A` with an
inconsistent `PartialEq` and then defines a type `B` that contains an
`A` and also derives `PartialEq`. Such code is already buggy and
preserving bug-for-bug compatibility isn't necessary.
Two side-effects of the change:
- There is only one error message produced for types where `PartialEq`
cannot be derived, instead of two.
- For coverage reports, some warnings about generated `ne` methods not
being executed have disappeared.
Both side-effects seem fine, and possibly preferable.
- Rename `ast::Lit::token` as `ast::Lit::token_lit`, because its type is
`token::Lit`, which is not a token. (This has been confusing me for a
long time.)
reasonable because we have an `ast::token::Lit` inside an `ast::Lit`.
- Rename `LitKind::{from,to}_lit_token` as
`LitKind::{from,to}_token_lit`, to match the above change and
`token::Lit`.
Simplify format_args builtin macro implementation.
Instead of a FxHashMap<Symbol, (usize, Span)> for the named arguments, this now includes the name and span in the elements of the Vec<FormatArg> directly. The FxHashMap still exists to look up the index, but no longer contains the span. Looking up the name or span of an argument is now trivial and does not need the map anymore.
Instead of a FxHashMap<Symbol, (usize, Span)> for the named arguments,
this now includes the name and span in the elements of the
Vec<FormatArg> directly. The FxHashMap still exists to look up the
index, but no longer contains the span. Looking up the name or span of
an argument is now trivial and does not need the map anymore.
Improve position named arguments lint underline and formatting names
For named arguments used as implicit position arguments, underline both
the opening curly brace and either:
* if there is formatting, the next character (which will either be the
closing curl brace or the `:` denoting the start of formatting args)
* if there is no formatting, the entire arg span (important if there is
whitespace like `{ }`)
This should make it more obvious where the named argument should be.
Additionally, in the lint message, emit the formatting argument names
without a dollar sign to avoid potentially confusion.
Fixes#99907
Properly reject the `may_unwind` option in `global_asm!`
This was accidentally accepted even though it had no effect in
`global_asm!`. The option only makes sense for `asm!` which runs within
a function.
For named arguments used as implicit position arguments, underline both
the opening curly brace and either:
* if there is formatting, the next character (which will either be the
closing curl brace or the `:` denoting the start of formatting args)
* if there is no formatting, the entire arg span (important if there is
whitespace like `{ }`)
This should make it more obvious where the named argument should be.
Additionally, in the lint message, emit the formatting argument names
without a dollar sign to avoid potentially confusion.
Fixes#99907
Currently we skip deriving `PartialEq::ne` for C-like (fieldless) enums
and empty structs, thus reyling on the default `ne`. This behaviour is
unnecessarily conservative, because the `PartialEq` docs say this:
> Implementations must ensure that eq and ne are consistent with each other:
>
> `a != b` if and only if `!(a == b)` (ensured by the default
> implementation).
This means that the default implementation (`!(a == b)`) is always good
enough. So this commit changes things such that `ne` is never derived.
The motivation for this change is that not deriving `ne` reduces compile
times and binary sizes.
Observable behaviour may change if a user has defined a type `A` with an
inconsistent `PartialEq` and then defines a type `B` that contains an
`A` and also derives `PartialEq`. Such code is already buggy and
preserving bug-for-bug compatibility isn't necessary.
Two side-effects of the change:
- There is only one error message produced for types where `PartialEq`
cannot be derived, instead of two.
- For coverage reports, some warnings about generated `ne` methods not
being executed have disappeared.
Both side-effects seem fine, and possibly preferable.
Remove `TreeAndSpacing`.
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
r? `@petrochenkov`
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
Address issue #99265 by checking each positionally used argument
to see if the argument is named and adding a lint to use the name
instead. This way, when named arguments are used positionally in a
different order than their argument order, the suggested lint is
correct.
For example:
```
println!("{b} {}", a=1, b=2);
```
This will now generate the suggestion:
```
println!("{b} {a}", a=1, b=2);
```
Additionally, this check now also correctly replaces or inserts
only where the positional argument is (or would be if implicit).
Also, width and precision are replaced with their argument names
when they exists.
Since the issues were so closely related, this fix for issue #99265
also fixes issue #99266.
Fixes#99265Fixes#99266
Diagnostic width span is not added when '0$' is used as width in format strings
When the following code is run rustc does not add diagnostic spans for the width argument. Such spans are necessary for a clippy lint that I am currently writing.
```rust
println!("Hello {1:0$}!", 5, "x");
// ^^
// Should have a span here
```
Currently `#![forbid(unused_qualifications)]` is incompatible with all
derive's because we add `#[allow(unused_qualifications)]` in all
generated impl's.
Final derive output improvements
With all these changes, the derive output in `deriving-all-codegen.stdout` is pretty close to optimal, i.e. very similar to what you'd write by hand.
r? `@ghost`
Emit warning when named arguments are used positionally in format
Addresses Issue 98466 by emitting an error if a named argument
is used like a position argument (i.e. the name is not used in
the string to be formatted).
Fixes rust-lang#98466
Implement `for<>` lifetime binder for closures
This PR implements RFC 3216 ([TI](https://github.com/rust-lang/rust/issues/97362)) and allows code like the following:
```rust
let _f = for<'a, 'b> |a: &'a A, b: &'b B| -> &'b C { b.c(a) };
// ^^^^^^^^^^^--- new!
```
cc ``@Aaron1011`` ``@cjgillot``
Addresses Issue 98466 by emitting a warning if a named argument
is used like a position argument (i.e. the name is not used in
the string to be formatted).
Fixes rust-lang#98466
Currently, for the enums and comparison traits we always check the tag
for equality before doing anything else. This is a bit clumsy. This
commit changes things so that the tags are handled very much like a
zeroth field in the enum.
For `eq`/ne` this makes the code slightly cleaner.
For `partial_cmp` and `cmp` it's a more notable change: in the case
where the tags aren't equal, instead of having a tag equality check
followed by a tag comparison, it just does a single tag comparison.
The commit also improves how `Hash` works for enums: instead of having
duplicated code to hash the tag for every arm within the match, we do
it just once before the match.
All this required replacing the `EnumNonMatchingCollapsed` value with a
new `EnumTag` value.
For fieldless enums the new code is particularly improved. All the code
now produced is close to optimal, being very similar to what you'd write
by hand.
Use `tag` in names of things referring to tags, instead of the
mysterious `vi`.
Also change `arg_N` in output to `argN`, which has the same length as
`self` and so results in nicer vertical alignments.
By producing `&T` expressions for fields instead of `T`. This matches
what the existing comments (e.g. on `FieldInfo`) claim is happening, and
it's also what most of the trait-specific code needs.
The exception is `PartialEq`, which needs `T` expressions for lots of
special case error messaging to work. So we now convert the `&T` back to
a `T` for `PartialEq`.
E.g. improving code like this:
```
match &*self {
&Enum1::Single { x: ref __self_0 } => {
::core:#️⃣:Hash::hash(&*__self_0, state)
}
}
```
to this:
```
match self {
Enum1::Single { x: __self_0 } => {
::core:#️⃣:Hash::hash(&*__self_0, state)
}
}
```
by removing the `&*`, the `&`, and the `ref`.
I suspect the current generated code predates deref-coercion.
The commit also gets rid of `use_temporaries`, instead passing around
`always_copy`, which makes things a little clearer. And it fixes up some
comments.
`cs_fold` has four distinct cases, covered by three different function
arguments:
- first field
- combine current field with previous results
- no fields
- non-matching enum variants
This commit clarifies things by replacing the three function arguments
with one that takes a new `CsFold` type with four slightly different)
cases
- single field
- combine result for current field with results for previous fields
- no fields
- non-matching enum variants
This makes the code shorter and clearer.
When deriving functions for zero-variant enums, we just generated a
function body that calls `std::instrincs::unreachable`. There is a large
comment with some not-very-useful historical discussion about
alternatives, including some discussion of feature-gating zero-variant
enums, which is clearly irrelevant today.
This commit cuts the comment down greatly.
The deriving code has some complex parts involving iterations over
selflike args and also fields within structs and enum variants.
The return types for a few functions demonstrate this:
- `TraitDef::create_{struct_pattern,enum_variant_pattern}` returns a
`(P<ast::Pat>, Vec<(Span, Option<Ident>, P<Expr>)>)`
- `TraitDef::create_struct_field_accesses` returns a `Vec<(Span,
Option<Ident>, P<Expr>)>`.
This results in per-field data stored within per-selflike-arg data, with
lots of repetition within the per-field data elements. This then has to
be "transposed" in two places (`expand_struct_method_body` and
`expand_enum_method_body`) into per-self-like-arg data stored within
per-field data. It's all quite clumsy and confusing.
This commit rearranges things greatly. Data is obtained in the needed
form up-front, avoiding the need for transposition. Also, various
functions are split, removed, and added, to make things clearer and
avoid tuple return values.
The diff is hard to read, which reflects the messiness of the original
code -- there wasn't an easy way to break these changes into small
pieces. (Sorry!) It's a net reduction of 35 lines and a readability
improvement. The generated code is unchanged.
The deriving code has inconsistent terminology to describe args.
In some places it distinguishes between:
- the `&self` arg (if present), versus
- all other args.
In other places it distinguishes between:
- the `&self` arg (if present) and any other arguments with the same
type (in practice there is at most one, e.g. in `PartialEq::eq`),
versus
- all other args.
The terms "self_args" and "nonself_args" are sometimes used for the
former distinction, and sometimes for the latter. "args" is also
sometimes used for "all other args".
This commit makes the code consistently uses "self_args"/"nonself_args"
for the former and "selflike_args"/"nonselflike_args" for the latter.
This change makes the code easier to read.
The commit also adds a panic on an impossible path (the `Self_` case) in
`extract_arg_details`.
We currently do a match on the comparison of every field in a struct or
enum variant. But the last field has a degenerate match like this:
```
match ::core::cmp::Ord::cmp(&self.y, &other.y) {
::core::cmp::Ordering::Equal =>
::core::cmp::Ordering::Equal,
cmp => cmp,
},
```
This commit changes it to this:
```
::core::cmp::Ord::cmp(&self.y, &other.y),
```
This is fairly straightforward thanks to the existing `cs_fold1`
function.
The commit also removes the `cs_fold` function which is no longer used.
(Note: there is some repetition now in `cs_cmp` and `cs_partial_cmp`. I
will remove that in a follow-up PR.)
It's common to see repeated assertions like this in derived `clone` and
`eq` methods:
```
let _: ::core::clone::AssertParamIsClone<u32>;
let _: ::core::clone::AssertParamIsClone<u32>;
```
This commit avoids them.
All derive ops currently use match-destructuring to access fields. This
is reasonable for enums, but sub-optimal for structs. E.g.:
```
fn eq(&self, other: &Point) -> bool {
match *other {
Self { x: ref __self_1_0, y: ref __self_1_1 } =>
match *self {
Self { x: ref __self_0_0, y: ref __self_0_1 } =>
(*__self_0_0) == (*__self_1_0) &&
(*__self_0_1) == (*__self_1_1),
},
}
}
```
This commit changes derive ops on structs to use field access instead, e.g.:
```
fn eq(&self, other: &Point) -> bool {
self.x == other.x && self.y == other.y
}
```
This is faster to compile, results in smaller binaries, and is simpler to
generate. Unfortunately, we have to keep the old pattern generating code around
for `repr(packed)` structs because something like `&self.x` (which doesn't show
up in `PartialEq` ops, but does show up in `Debug` and `Hash` ops) isn't
allowed. But this commit at least changes those cases to use let-destructuring
instead of match-destructuring, e.g.:
```
fn hash<__H: ::core:#️⃣:Hasher>(&self, state: &mut __H) -> () {
{
let Self(ref __self_0_0) = *self;
{ ::core:#️⃣:Hash::hash(&(*__self_0_0), state) }
}
}
```
There are some unnecessary blocks remaining in the generated code, but I
will fix them in a follow-up PR.
The existing derive code allows for various possibilities that aren't
needed in practice, which complicates the code. There are only a few
auto-derived traits and new ones are unlikely, so this commit simplifies
things.
- `PtrTy` has been eliminated. The `Raw` variant was never used, and the
lifetime for the `Borrowed` variant was always `None`. That left just
the mutability field, which has been inlined as necessary.
- `MethodDef::explicit_self` was a confusing `Option<Option<PtrTy>>`.
Indicating either `&self` or nothing. It's now a `bool`.
- `borrowed_self` is renamed as `self_ref`.
- `Ty::Ptr` is renamed to `Ty::Ref`.
The `&[ast::Variant]` field isn't used.
The `Vec<Ident>` field is only used for its length, but that's always
the same as the length of the `&[Ident]` and so isn't necessary.
[RFC 2011] Optimize non-consuming operators
Tracking issue: https://github.com/rust-lang/rust/issues/44838
Fifth step of https://github.com/rust-lang/rust/pull/96496
The most non-invasive approach that will probably have very little to no performance impact.
## Current behaviour
Captures are handled "on-the-fly", i.e., they are performed in the same place expressions are located.
```rust
// `let a = 1; let b = 2; assert!(a > 1 && b < 100);`
if !(
{ ***try capture `a` and then return `a`*** } > 1 && { ***try capture `b` and then return `b`*** } < 100
) {
panic!( ... );
}
```
As such, some overhead is likely to occur (Specially with very large chains of conditions).
## New behaviour for non-consuming operators
When an operator is known to not take `self`, then it is possible to capture variables **AFTER** the condition.
```rust
// `let a = 1; let b = 2; assert!(a > 1 && b < 100);`
if !( a > 1 && b < 100 ) {
{ ***try capture `a`*** }
{ ***try capture `b`*** }
panic!( ... );
}
```
So the possible impact on the runtime execution time will be diminished.
r? ````@oli-obk````
Currently the generated code for methods like `eq`, `ne`, and `partial_cmp`
includes stuff like this:
```
let __self_vi = ::core::intrinsics::discriminant_value(&*self);
let __arg_1_vi = ::core::intrinsics::discriminant_value(&*other);
if true && __self_vi == __arg_1_vi {
...
}
```
This commit removes the unnecessary `true &&`, and makes the generating
code a little easier to read in the process. It also fixes some errors
in comments.