This migrates everything but the `mbe` and `proc_macro` modules. It also
contains a few cleanups and drive-by/accidental diagnostic improvements
which can be seen in the diff for the UI tests.
Shrink `rustc_parse_format::Piece`
This makes both variants closer together in size (previously they were different by 208 bytes -- 16 vs 224). This may make things worse, but it's worth a try.
r? `@nnethercote`
This makes both variants closer together in size (previously they were
different by 208 bytes -- 16 vs 224). This may make things worse, but
it's worth a try.
Add `type_ascribe!` macro as placeholder syntax for type ascription
This makes it still possible to test the internal semantics of type ascription even once the `:`-syntax is removed from the parser. The macro now gets used in a bunch of UI tests that test the semantics and not syntax of type ascription.
I might have forgotten a few tests but this should hopefully be most of them. The remaining ones will certainly be found once type ascription is removed from the parser altogether.
Part of #101728
rustc_ast_lowering: Stop lowering imports into multiple items
Lower them into a single item with multiple resolutions instead.
This also allows to remove additional `NodId`s and `DefId`s related to those additional items.
`token::Lit` contains a `kind` field that indicates what kind of literal
it is. `ast::MetaItemLit` currently wraps a `token::Lit` but also has
its own `kind` field. This means that `ast::MetaItemLit` encodes the
literal kind in two different ways.
This commit changes `ast::MetaItemLit` so it no longer wraps
`token::Lit`. It now contains the `symbol` and `suffix` fields from
`token::Lit`, but not the `kind` field, eliminating the redundancy.
This is required to distinguish between cooked and raw byte string
literals in an `ast::LitKind`, without referring to an adjacent
`token::Lit`. It's a prerequisite for the next commit.
Lower them into a single item with multiple resolutions instead.
This also allows to remove additional `NodId`s and `DefId`s related to those additional items.
There is code for converting `Attribute` (syntactic) to `MetaItem`
(semantic). There is also code for the reverse direction. The reverse
direction isn't really necessary; it's currently only used when
generating attributes, e.g. in `derive` code.
This commit adds some new functions for creating `Attributes`s directly,
without involving `MetaItem`s: `mk_attr_word`, `mk_attr_name_value_str`,
`mk_attr_nested_word`, and
`ExtCtxt::attr_{word,name_value_str,nested_word}`.
These new methods replace the old functions for creating `Attribute`s:
`mk_attr_inner`, `mk_attr_outer`, and `ExtCtxt::attribute`. Those
functions took `MetaItem`s as input, and relied on many other functions
that created `MetaItems`, which are also removed: `mk_name_value_item`,
`mk_list_item`, `mk_word_item`, `mk_nested_word_item`,
`{MetaItem,MetaItemKind,NestedMetaItem}::token_trees`,
`MetaItemKind::attr_args`, `MetaItemLit::{from_lit_kind,to_token}`,
`ExtCtxt::meta_word`.
Overall this cuts more than 100 lines of code and makes thing simpler.
In `Expander::expand` the code currently uses `mk_attr_outer` to convert
a `MetaItem` to an `Attribute`, and then follows that with
`meta_item_list` which converts back. This commit avoids the unnecessary
conversions.
There was one wrinkle: the existing conversions caused the bogus `<>` on
`Default<>` to be removed. With the conversion gone, we get a second
error message about the `<>`. This is a rare case, so I think it
probably doesn't matter much.
`check_builtin_attribute` calls `parse_meta` to convert an `Attribute`
to a `MetaItem`, which it then checks. However, many callers of
`check_builtin_attribute` start with a `MetaItem`, and then convert it
to an `Attribute` by calling `cx.attribute(meta_item)`. This `MetaItem`
to `Attribute` to `MetaItem` conversion is silly.
This commit adds a new function `check_builtin_meta_item`, which can be
called instead from these call sites. `check_builtin_attribute` also now
calls it. The commit also renames `check_meta` as `check_attr` to better
match its arguments.
Use `as_deref` in compiler (but only where it makes sense)
This simplifies some code :3
(there are some changes that are not exacly `as_deref`, but more like "clever `Option`/`Result` method use")
`MacArgs` is an enum with three variants: `Empty`, `Delimited`, and `Eq`. It's
used in two ways:
- For representing attribute macro arguments (e.g. in `AttrItem`), where all
three variants are used.
- For representing function-like macros (e.g. in `MacCall` and `MacroDef`),
where only the `Delimited` variant is used.
In other words, `MacArgs` is used in two quite different places due to them
having partial overlap. I find this makes the code hard to read. It also leads
to various unreachable code paths, and allows invalid values (such as
accidentally using `MacArgs::Empty` in a `MacCall`).
This commit splits `MacArgs` in two:
- `DelimArgs` is a new struct just for the "delimited arguments" case. It is
now used in `MacCall` and `MacroDef`.
- `AttrArgs` is a renaming of the old `MacArgs` enum for the attribute macro
case. Its `Delimited` variant now contains a `DelimArgs`.
Various other related things are renamed as well.
These changes make the code clearer, avoids several unreachable paths, and
disallows the invalid values.
The current approach to field accesses in derived code:
- Normal case: `&self.0`
- In a packed struct that derives `Copy`: `&{self.0}`
- In a packed struct that doesn't derive `Copy`: `let Self(ref x) = *self`
The `let` pattern used in the third case is equivalent to the simpler
field access in the first case. This commit changes the third case to
use a field access.
The commit also combines two boolean arguments (`is_packed` and
`always_copy`) into a single field (`copy_fields`) earlier, to save
passing both around.
Instead of `ast::Lit`.
Literal lowering now happens at two different times. Expression literals
are lowered when HIR is crated. Attribute literals are lowered during
parsing.
This commit changes the language very slightly. Some programs that used
to not compile now will compile. This is because some invalid literals
that are removed by `cfg` or attribute macros will no longer trigger
errors. See this comment for more details:
https://github.com/rust-lang/rust/pull/102944#issuecomment-1277476773
Delay `include_bytes` to AST lowering
Hopefully addresses #65818.
This PR introduces a new `ExprKind::IncludedBytes` which stores the path and bytes of a file included with `include_bytes!()`. We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.
Add the `#[derive_const]` attribute
Closes#102371. This is a minimal patchset for the attribute to work. There are no restrictions on what traits this attribute applies to.
r? `````@oli-obk`````
`#[test]`: Point at return type if `Termination` bound is unsatisfied
Together with #103142 (already merged) this fully fixes#50291.
I don't consider my current solution of changing a few spans “here and there” very clean since the
failed obligation is a `FunctionArgumentObligation` and we point at a type instead of a function argument.
If you agree with me on this point, I can offer to keep the spans of the existing nodes and instead inject
`let _: AssertRetTyIsTermination<$ret_ty>;` (type to be defined in `libtest`) similar to `AssertParamIsEq` etc.
used by some built-in derive-macros.
I haven't tried that approach yet though and cannot promise that it would actually work out or
be “cleaner” for that matter.
````@rustbot```` label A-libtest A-diagnostics
r? ````@estebank````
The new implementation doesn't use weak lang items and instead changes
`#[alloc_error_handler]` to an attribute macro just like
`#[global_allocator]`.
The attribute will generate the `__rg_oom` function which is called by
the compiler-generated `__rust_alloc_error_handler`. If no `__rg_oom`
function is defined in any crate then the compiler shim will call
`__rdl_oom` in the alloc crate which will simply panic.
This also fixes link errors with `-C link-dead-code` with
`default_alloc_error_handler`: `__rg_oom` was previously defined in the
alloc crate and would attempt to reference the `oom` lang item, even if
it didn't exist. This worked as long as `__rg_oom` was excluded from
linking since it was not called.
This is a prerequisite for the stabilization of
`default_alloc_error_handler` (#102318).
Sort tests at compile time, not at startup
Recently, another Miri user was trying to run `cargo miri test` on the crate `iced-x86` with `--features=code_asm,mvex`. This configuration has a startup time of ~18 minutes. That's ~18 minutes before any tests even start to run. The fact that this crate has over 26,000 tests and Miri is slow makes a lot of code which is otherwise a bit sloppy but fine into a huge runtime issue.
Sorting the tests when the test harness is created instead of at startup time knocks just under 4 minutes out of those ~18 minutes. I have ways to remove most of the rest of the startup time, but this change requires coordinating changes of both the compiler and libtest, so I'm sending it separately.
(except for doctests, because there is no compile-time harness)
This fixes a typo first appearing in #94624
in which test-macro diagnostic uses "a" article twice.
Since I searched sources for " a a " sequences,
I also fixed the same issue in a few source files where I found it.
Signed-off-by: Petr Portnov <gh@progrm-jarvis.ru>
On later stages, the feature is already stable.
Result of running:
rg -l "feature.let_else" compiler/ src/librustdoc/ library/ | xargs sed -s -i "s#\\[feature.let_else#\\[cfg_attr\\(bootstrap, feature\\(let_else\\)#"
These two type names are long and have long matching prefixes. I find
them hard to read, especially in combinations like
`AttrAnnotatedTokenStream::new(vec![AttrAnnotatedTokenTree::Token(..)])`.
This commit renames them as `AttrToken{Stream,Tree}`.
Recently, another Miri user was trying to run `cargo miri test` on the
crate `iced-x86` with `--features=code_asm,mvex`. This configuration has
a startup time of ~18 minutes. That's ~18 minutes before any tests even
start to run. The fact that this crate has over 26,000 tests and Miri is
slow makes a lot of code which is otherwise a bit sloppy but fine into a
huge runtime issue.
Sorting the tests when the test harness is created instead of at startup
time knocks just under 4 minutes out of those ~18 minutes. I have ways
to remove most of the rest of the startup time, but this change requires
coordinating changes of both the compiler and libtest, so I'm sending it
separately.
Replace `rustc_data_structures::thin_vec::ThinVec` with `thin_vec::ThinVec`
`rustc_data_structures::thin_vec::ThinVec` looks like this:
```
pub struct ThinVec<T>(Option<Box<Vec<T>>>);
```
It's just a zero word if the vector is empty, but requires two
allocations if it is non-empty. So it's only usable in cases where the
vector is empty most of the time.
This commit removes it in favour of `thin_vec::ThinVec`, which is also
word-sized, but stores the length and capacity in the same allocation as
the elements. It's good in a wider variety of situation, e.g. in enum
variants where the vector is usually/always non-empty.
The commit also:
- Sorts some `Cargo.toml` dependency lists, to make additions easier.
- Sorts some `use` item lists, to make additions easier.
- Changes `clean_trait_ref_with_bindings` to take a
`ThinVec<TypeBinding>` rather than a `&[TypeBinding]`, because this
avoid some unnecessary allocations.
r? `@spastorino`
Revert let_chains stabilization
This is the revert against master, the beta revert was already done in #100538.
Bumps the stage0 compiler which already has it reverted.
Separate CountIsStar from CountIsParam in rustc_parse_format.
`rustc_parse_format`'s parser would result in the exact same output for `{:.*}` and `{:.0$}`, making it hard for diagnostics to handle these cases properly.
This splits those cases by adding a new `CountIsStar` enum variant.
This fixes#100995
Prerequisite for https://github.com/rust-lang/rust/pull/100996
`rustc_data_structures::thin_vec::ThinVec` looks like this:
```
pub struct ThinVec<T>(Option<Box<Vec<T>>>);
```
It's just a zero word if the vector is empty, but requires two
allocations if it is non-empty. So it's only usable in cases where the
vector is empty most of the time.
This commit removes it in favour of `thin_vec::ThinVec`, which is also
word-sized, but stores the length and capacity in the same allocation as
the elements. It's good in a wider variety of situation, e.g. in enum
variants where the vector is usually/always non-empty.
The commit also:
- Sorts some `Cargo.toml` dependency lists, to make additions easier.
- Sorts some `use` item lists, to make additions easier.
- Changes `clean_trait_ref_with_bindings` to take a
`ThinVec<TypeBinding>` rather than a `&[TypeBinding]`, because this
avoid some unnecessary allocations.
Fix rustc_parse_format precision & width spans
When a `precision`/`width` was `CountIsName - {:name$}` or `CountIs - {:10}` the `precision_span`/`width_span` was set to `None`
For `width` the name span in `CountIsName(_, name_span)` had its `.start` off by one
r? ``@fee1-dead`` / cc ``@PrestonFrom`` since this is similar to #99987
Migrate rustc_ast_passes diagnostics to `SessionDiagnostic` and translatable messages (first part)
Doing a full migration of the `rustc_ast_passes` crate.
Making a draft here since there's not yet a tracking issue for the migrations going on.
`@rustbot` label +A-translation
In some places we use `Vec<Attribute>` and some places we use
`ThinVec<Attribute>` (a.k.a. `AttrVec`). This results in various points
where we have to convert between `Vec` and `ThinVec`.
This commit changes the places that use `Vec<Attribute>` to use
`AttrVec`. A lot of this is mechanical and boring, but there are
some interesting parts:
- It adds a few new methods to `ThinVec`.
- It implements `MapInPlace` for `ThinVec`, and introduces a macro to
avoid the repetition of this trait for `Vec`, `SmallVec`, and
`ThinVec`.
Overall, it makes the code a little nicer, and has little effect on
performance. But it is a precursor to removing
`rustc_data_structures::thin_vec::ThinVec` and replacing it with
`thin_vec::ThinVec`, which is implemented more efficiently.
Don't derive `PartialEq::ne`.
Currently we skip deriving `PartialEq::ne` for C-like (fieldless) enums
and empty structs, thus reyling on the default `ne`. This behaviour is
unnecessarily conservative, because the `PartialEq` docs say this:
> Implementations must ensure that eq and ne are consistent with each other:
>
> `a != b` if and only if `!(a == b)` (ensured by the default
> implementation).
This means that the default implementation (`!(a == b)`) is always good
enough. So this commit changes things such that `ne` is never derived.
The motivation for this change is that not deriving `ne` reduces compile
times and binary sizes.
Observable behaviour may change if a user has defined a type `A` with an
inconsistent `PartialEq` and then defines a type `B` that contains an
`A` and also derives `PartialEq`. Such code is already buggy and
preserving bug-for-bug compatibility isn't necessary.
Two side-effects of the change:
- There is only one error message produced for types where `PartialEq`
cannot be derived, instead of two.
- For coverage reports, some warnings about generated `ne` methods not
being executed have disappeared.
Both side-effects seem fine, and possibly preferable.
- Rename `ast::Lit::token` as `ast::Lit::token_lit`, because its type is
`token::Lit`, which is not a token. (This has been confusing me for a
long time.)
reasonable because we have an `ast::token::Lit` inside an `ast::Lit`.
- Rename `LitKind::{from,to}_lit_token` as
`LitKind::{from,to}_token_lit`, to match the above change and
`token::Lit`.
Simplify format_args builtin macro implementation.
Instead of a FxHashMap<Symbol, (usize, Span)> for the named arguments, this now includes the name and span in the elements of the Vec<FormatArg> directly. The FxHashMap still exists to look up the index, but no longer contains the span. Looking up the name or span of an argument is now trivial and does not need the map anymore.
Instead of a FxHashMap<Symbol, (usize, Span)> for the named arguments,
this now includes the name and span in the elements of the
Vec<FormatArg> directly. The FxHashMap still exists to look up the
index, but no longer contains the span. Looking up the name or span of
an argument is now trivial and does not need the map anymore.
Improve position named arguments lint underline and formatting names
For named arguments used as implicit position arguments, underline both
the opening curly brace and either:
* if there is formatting, the next character (which will either be the
closing curl brace or the `:` denoting the start of formatting args)
* if there is no formatting, the entire arg span (important if there is
whitespace like `{ }`)
This should make it more obvious where the named argument should be.
Additionally, in the lint message, emit the formatting argument names
without a dollar sign to avoid potentially confusion.
Fixes#99907
Properly reject the `may_unwind` option in `global_asm!`
This was accidentally accepted even though it had no effect in
`global_asm!`. The option only makes sense for `asm!` which runs within
a function.
For named arguments used as implicit position arguments, underline both
the opening curly brace and either:
* if there is formatting, the next character (which will either be the
closing curl brace or the `:` denoting the start of formatting args)
* if there is no formatting, the entire arg span (important if there is
whitespace like `{ }`)
This should make it more obvious where the named argument should be.
Additionally, in the lint message, emit the formatting argument names
without a dollar sign to avoid potentially confusion.
Fixes#99907
Currently we skip deriving `PartialEq::ne` for C-like (fieldless) enums
and empty structs, thus reyling on the default `ne`. This behaviour is
unnecessarily conservative, because the `PartialEq` docs say this:
> Implementations must ensure that eq and ne are consistent with each other:
>
> `a != b` if and only if `!(a == b)` (ensured by the default
> implementation).
This means that the default implementation (`!(a == b)`) is always good
enough. So this commit changes things such that `ne` is never derived.
The motivation for this change is that not deriving `ne` reduces compile
times and binary sizes.
Observable behaviour may change if a user has defined a type `A` with an
inconsistent `PartialEq` and then defines a type `B` that contains an
`A` and also derives `PartialEq`. Such code is already buggy and
preserving bug-for-bug compatibility isn't necessary.
Two side-effects of the change:
- There is only one error message produced for types where `PartialEq`
cannot be derived, instead of two.
- For coverage reports, some warnings about generated `ne` methods not
being executed have disappeared.
Both side-effects seem fine, and possibly preferable.
Remove `TreeAndSpacing`.
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
r? `@petrochenkov`
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
Address issue #99265 by checking each positionally used argument
to see if the argument is named and adding a lint to use the name
instead. This way, when named arguments are used positionally in a
different order than their argument order, the suggested lint is
correct.
For example:
```
println!("{b} {}", a=1, b=2);
```
This will now generate the suggestion:
```
println!("{b} {a}", a=1, b=2);
```
Additionally, this check now also correctly replaces or inserts
only where the positional argument is (or would be if implicit).
Also, width and precision are replaced with their argument names
when they exists.
Since the issues were so closely related, this fix for issue #99265
also fixes issue #99266.
Fixes#99265Fixes#99266
Diagnostic width span is not added when '0$' is used as width in format strings
When the following code is run rustc does not add diagnostic spans for the width argument. Such spans are necessary for a clippy lint that I am currently writing.
```rust
println!("Hello {1:0$}!", 5, "x");
// ^^
// Should have a span here
```
Currently `#![forbid(unused_qualifications)]` is incompatible with all
derive's because we add `#[allow(unused_qualifications)]` in all
generated impl's.
Final derive output improvements
With all these changes, the derive output in `deriving-all-codegen.stdout` is pretty close to optimal, i.e. very similar to what you'd write by hand.
r? `@ghost`
Emit warning when named arguments are used positionally in format
Addresses Issue 98466 by emitting an error if a named argument
is used like a position argument (i.e. the name is not used in
the string to be formatted).
Fixes rust-lang#98466
Implement `for<>` lifetime binder for closures
This PR implements RFC 3216 ([TI](https://github.com/rust-lang/rust/issues/97362)) and allows code like the following:
```rust
let _f = for<'a, 'b> |a: &'a A, b: &'b B| -> &'b C { b.c(a) };
// ^^^^^^^^^^^--- new!
```
cc ``@Aaron1011`` ``@cjgillot``
Addresses Issue 98466 by emitting a warning if a named argument
is used like a position argument (i.e. the name is not used in
the string to be formatted).
Fixes rust-lang#98466
Currently, for the enums and comparison traits we always check the tag
for equality before doing anything else. This is a bit clumsy. This
commit changes things so that the tags are handled very much like a
zeroth field in the enum.
For `eq`/ne` this makes the code slightly cleaner.
For `partial_cmp` and `cmp` it's a more notable change: in the case
where the tags aren't equal, instead of having a tag equality check
followed by a tag comparison, it just does a single tag comparison.
The commit also improves how `Hash` works for enums: instead of having
duplicated code to hash the tag for every arm within the match, we do
it just once before the match.
All this required replacing the `EnumNonMatchingCollapsed` value with a
new `EnumTag` value.
For fieldless enums the new code is particularly improved. All the code
now produced is close to optimal, being very similar to what you'd write
by hand.
Use `tag` in names of things referring to tags, instead of the
mysterious `vi`.
Also change `arg_N` in output to `argN`, which has the same length as
`self` and so results in nicer vertical alignments.
By producing `&T` expressions for fields instead of `T`. This matches
what the existing comments (e.g. on `FieldInfo`) claim is happening, and
it's also what most of the trait-specific code needs.
The exception is `PartialEq`, which needs `T` expressions for lots of
special case error messaging to work. So we now convert the `&T` back to
a `T` for `PartialEq`.
E.g. improving code like this:
```
match &*self {
&Enum1::Single { x: ref __self_0 } => {
::core:#️⃣:Hash::hash(&*__self_0, state)
}
}
```
to this:
```
match self {
Enum1::Single { x: __self_0 } => {
::core:#️⃣:Hash::hash(&*__self_0, state)
}
}
```
by removing the `&*`, the `&`, and the `ref`.
I suspect the current generated code predates deref-coercion.
The commit also gets rid of `use_temporaries`, instead passing around
`always_copy`, which makes things a little clearer. And it fixes up some
comments.
`cs_fold` has four distinct cases, covered by three different function
arguments:
- first field
- combine current field with previous results
- no fields
- non-matching enum variants
This commit clarifies things by replacing the three function arguments
with one that takes a new `CsFold` type with four slightly different)
cases
- single field
- combine result for current field with results for previous fields
- no fields
- non-matching enum variants
This makes the code shorter and clearer.
When deriving functions for zero-variant enums, we just generated a
function body that calls `std::instrincs::unreachable`. There is a large
comment with some not-very-useful historical discussion about
alternatives, including some discussion of feature-gating zero-variant
enums, which is clearly irrelevant today.
This commit cuts the comment down greatly.
The deriving code has some complex parts involving iterations over
selflike args and also fields within structs and enum variants.
The return types for a few functions demonstrate this:
- `TraitDef::create_{struct_pattern,enum_variant_pattern}` returns a
`(P<ast::Pat>, Vec<(Span, Option<Ident>, P<Expr>)>)`
- `TraitDef::create_struct_field_accesses` returns a `Vec<(Span,
Option<Ident>, P<Expr>)>`.
This results in per-field data stored within per-selflike-arg data, with
lots of repetition within the per-field data elements. This then has to
be "transposed" in two places (`expand_struct_method_body` and
`expand_enum_method_body`) into per-self-like-arg data stored within
per-field data. It's all quite clumsy and confusing.
This commit rearranges things greatly. Data is obtained in the needed
form up-front, avoiding the need for transposition. Also, various
functions are split, removed, and added, to make things clearer and
avoid tuple return values.
The diff is hard to read, which reflects the messiness of the original
code -- there wasn't an easy way to break these changes into small
pieces. (Sorry!) It's a net reduction of 35 lines and a readability
improvement. The generated code is unchanged.
The deriving code has inconsistent terminology to describe args.
In some places it distinguishes between:
- the `&self` arg (if present), versus
- all other args.
In other places it distinguishes between:
- the `&self` arg (if present) and any other arguments with the same
type (in practice there is at most one, e.g. in `PartialEq::eq`),
versus
- all other args.
The terms "self_args" and "nonself_args" are sometimes used for the
former distinction, and sometimes for the latter. "args" is also
sometimes used for "all other args".
This commit makes the code consistently uses "self_args"/"nonself_args"
for the former and "selflike_args"/"nonselflike_args" for the latter.
This change makes the code easier to read.
The commit also adds a panic on an impossible path (the `Self_` case) in
`extract_arg_details`.
We currently do a match on the comparison of every field in a struct or
enum variant. But the last field has a degenerate match like this:
```
match ::core::cmp::Ord::cmp(&self.y, &other.y) {
::core::cmp::Ordering::Equal =>
::core::cmp::Ordering::Equal,
cmp => cmp,
},
```
This commit changes it to this:
```
::core::cmp::Ord::cmp(&self.y, &other.y),
```
This is fairly straightforward thanks to the existing `cs_fold1`
function.
The commit also removes the `cs_fold` function which is no longer used.
(Note: there is some repetition now in `cs_cmp` and `cs_partial_cmp`. I
will remove that in a follow-up PR.)
It's common to see repeated assertions like this in derived `clone` and
`eq` methods:
```
let _: ::core::clone::AssertParamIsClone<u32>;
let _: ::core::clone::AssertParamIsClone<u32>;
```
This commit avoids them.
All derive ops currently use match-destructuring to access fields. This
is reasonable for enums, but sub-optimal for structs. E.g.:
```
fn eq(&self, other: &Point) -> bool {
match *other {
Self { x: ref __self_1_0, y: ref __self_1_1 } =>
match *self {
Self { x: ref __self_0_0, y: ref __self_0_1 } =>
(*__self_0_0) == (*__self_1_0) &&
(*__self_0_1) == (*__self_1_1),
},
}
}
```
This commit changes derive ops on structs to use field access instead, e.g.:
```
fn eq(&self, other: &Point) -> bool {
self.x == other.x && self.y == other.y
}
```
This is faster to compile, results in smaller binaries, and is simpler to
generate. Unfortunately, we have to keep the old pattern generating code around
for `repr(packed)` structs because something like `&self.x` (which doesn't show
up in `PartialEq` ops, but does show up in `Debug` and `Hash` ops) isn't
allowed. But this commit at least changes those cases to use let-destructuring
instead of match-destructuring, e.g.:
```
fn hash<__H: ::core:#️⃣:Hasher>(&self, state: &mut __H) -> () {
{
let Self(ref __self_0_0) = *self;
{ ::core:#️⃣:Hash::hash(&(*__self_0_0), state) }
}
}
```
There are some unnecessary blocks remaining in the generated code, but I
will fix them in a follow-up PR.
The existing derive code allows for various possibilities that aren't
needed in practice, which complicates the code. There are only a few
auto-derived traits and new ones are unlikely, so this commit simplifies
things.
- `PtrTy` has been eliminated. The `Raw` variant was never used, and the
lifetime for the `Borrowed` variant was always `None`. That left just
the mutability field, which has been inlined as necessary.
- `MethodDef::explicit_self` was a confusing `Option<Option<PtrTy>>`.
Indicating either `&self` or nothing. It's now a `bool`.
- `borrowed_self` is renamed as `self_ref`.
- `Ty::Ptr` is renamed to `Ty::Ref`.
The `&[ast::Variant]` field isn't used.
The `Vec<Ident>` field is only used for its length, but that's always
the same as the length of the `&[Ident]` and so isn't necessary.
[RFC 2011] Optimize non-consuming operators
Tracking issue: https://github.com/rust-lang/rust/issues/44838
Fifth step of https://github.com/rust-lang/rust/pull/96496
The most non-invasive approach that will probably have very little to no performance impact.
## Current behaviour
Captures are handled "on-the-fly", i.e., they are performed in the same place expressions are located.
```rust
// `let a = 1; let b = 2; assert!(a > 1 && b < 100);`
if !(
{ ***try capture `a` and then return `a`*** } > 1 && { ***try capture `b` and then return `b`*** } < 100
) {
panic!( ... );
}
```
As such, some overhead is likely to occur (Specially with very large chains of conditions).
## New behaviour for non-consuming operators
When an operator is known to not take `self`, then it is possible to capture variables **AFTER** the condition.
```rust
// `let a = 1; let b = 2; assert!(a > 1 && b < 100);`
if !( a > 1 && b < 100 ) {
{ ***try capture `a`*** }
{ ***try capture `b`*** }
panic!( ... );
}
```
So the possible impact on the runtime execution time will be diminished.
r? ````@oli-obk````
Currently the generated code for methods like `eq`, `ne`, and `partial_cmp`
includes stuff like this:
```
let __self_vi = ::core::intrinsics::discriminant_value(&*self);
let __arg_1_vi = ::core::intrinsics::discriminant_value(&*other);
if true && __self_vi == __arg_1_vi {
...
}
```
This commit removes the unnecessary `true &&`, and makes the generating
code a little easier to read in the process. It also fixes some errors
in comments.
macros: use typed identifiers in diag and subdiag derive
Using typed identifiers instead of strings with the Fluent identifiers in the diagnostic and subdiagnostic derives - this enables the diagnostic derive to benefit from the compile-time validation that comes with typed identifiers, namely that use of a non-existent Fluent identifier will not compile.
r? `````@oli-obk`````
Using typed identifiers instead of strings with the Fluent identifier
enables the diagnostic derive to benefit from the compile-time
validation that comes with typed identifiers - use of a non-existent
Fluent identifier will not compile.
Signed-off-by: David Wood <david.wood@huawei.com>
Fixup missing renames from `#[main]` to `#[rustc_main]`
In #84217 `#[main]` was removed and replaced with `#[rustc_main]`. In some places the rename was forgotten, which makes the current code confusing, because at first glance it seems that `#[main]` is still around. Perform the renames also in these places.
I noticed this (after first being confused by it) when working on #97802.
r? `@petrochenkov`
(since you reviewed the other PR)
This commit adds new methods that combine sequences of existing
formatting methods.
- `Formatter::debug_{tuple,struct}_field[12345]_finish`, equivalent to a
`Formatter::debug_{tuple,struct}` + N x `Debug{Tuple,Struct}::field` +
`Debug{Tuple,Struct}::finish` call sequence.
- `Formatter::debug_{tuple,struct}_fields_finish` is similar, but can
handle any number of fields by using arrays.
These new methods are all marked as `doc(hidden)` and unstable. They are
intended for the compiler's own use.
Special-casing up to 5 fields gives significantly better performance
results than always using arrays (as was tried in #95637).
The commit also changes the `Debug` deriving code to use these new methods. For
example, where the old `Debug` code for a struct with two fields would be like
this:
```
fn fmt(&self, f: &mut ::core::fmt::Formatter) -> ::core::fmt::Result {
match *self {
Self {
f1: ref __self_0_0,
f2: ref __self_0_1,
} => {
let debug_trait_builder = &mut ::core::fmt::Formatter::debug_struct(f, "S2");
let _ = ::core::fmt::DebugStruct::field(debug_trait_builder, "f1", &&(*__self_0_0));
let _ = ::core::fmt::DebugStruct::field(debug_trait_builder, "f2", &&(*__self_0_1));
::core::fmt::DebugStruct::finish(debug_trait_builder)
}
}
}
```
the new code is like this:
```
fn fmt(&self, f: &mut ::core::fmt::Formatter) -> ::core::fmt::Result {
match *self {
Self {
f1: ref __self_0_0,
f2: ref __self_0_1,
} => ::core::fmt::Formatter::debug_struct_field2_finish(
f,
"S2",
"f1",
&&(*__self_0_0),
"f2",
&&(*__self_0_1),
),
}
}
```
This shrinks the code produced for `Debug` instances
considerably, reducing compile times and binary sizes.
Co-authored-by: Scott McMurray <scottmcm@users.noreply.github.com>
In fc357039f9 `#[main]` was removed and replaced with `#[rustc_main]`.
In some place the rename was forgotten, which makes the current code
confusing, because at first glance it seems that `#[main]` is still
around. Perform the renames also in these places.
proc_macro: don't pass a client-side function pointer through the server.
Before this PR, `proc_macro::bridge::Client<F>` contained both:
* the C ABI entry-point `run`, that the server can call to start the client
* some "payload" `f: F` passed to that entry-point
* in practice, this was always a (client-side Rust ABI) `fn` pointer to the actual function the proc macro author wrote, i.e. `#[proc_macro] fn foo(input: TokenStream) -> TokenStream`
In other words, the client was passing one of its (Rust) `fn` pointers to the server, which was passing it back to the client, for the client to call (see later below for why that was ever needed).
I was inspired by `@nnethercote's` attempt to remove the `get_handle_counters` field from `Client` (see https://github.com/rust-lang/rust/pull/97004#issuecomment-1139273301), which combined with removing the `f` ("payload") field, could theoretically allow for a `#[repr(transparent)]` `Client` that mostly just newtypes the C ABI entry-point `fn` pointer <sub>(and in the context of e.g. wasm isolation, that's *all* you want, since you can reason about it from outside the wasm VM, as just a 32-bit "function table index", that you can pass to the wasm VM to call that function)</sub>.
<hr/>
So this PR removes that "payload". But it's not a simple refactor: the reason the field existed in the first place is because monomorphizing over a function type doesn't let you call the function without having a value of that type, because function types don't implement anything like `Default`, i.e.:
```rust
extern "C" fn ffi_wrapper<A, R, F: Fn(A) -> R>(arg: A) -> R {
let f: F = ???; // no way to get a value of `F`
f(arg)
}
```
That could be solved with something like this, if it was allowed:
```rust
extern "C" fn ffi_wrapper<
A, R,
F: Fn(A) -> R,
const f: F // not allowed because the type is a generic param
>(arg: A) -> R {
f(arg)
}
```
Instead, this PR contains a workaround in `proc_macro::bridge::selfless_reify` (see its module-level comment for more details) that can provide something similar to the `ffi_wrapper` example above, but limited to `F` being `Copy` and ZST (and requiring an `F` value to prove the caller actually can create values of `F` and it's not uninhabited or some other unsound situation).
<hr/>
Hopefully this time we don't have a performance regression, and this has a chance to land.
cc `@mystor` `@bjorn3`
Begin fixing all the broken doctests in `compiler/`
Begins to fix#95994.
All of them pass now but 24 of them I've marked with `ignore HELP (<explanation>)` (asking for help) as I'm unsure how to get them to work / if we should leave them as they are.
There are also a few that I marked `ignore` that could maybe be made to work but seem less important.
Each `ignore` has a rough "reason" for ignoring after it parentheses, with
- `(pseudo-rust)` meaning "mostly rust-like but contains foreign syntax"
- `(illustrative)` a somewhat catchall for either a fragment of rust that doesn't stand on its own (like a lone type), or abbreviated rust with ellipses and undeclared types that would get too cluttered if made compile-worthy.
- `(not-rust)` stuff that isn't rust but benefits from the syntax highlighting, like MIR.
- `(internal)` uses `rustc_*` code which would be difficult to make work with the testing setup.
Those reason notes are a bit inconsistently applied and messy though. If that's important I can go through them again and try a more principled approach. When I run `rg '```ignore \(' .` on the repo, there look to be lots of different conventions other people have used for this sort of thing. I could try unifying them all if that would be helpful.
I'm not sure if there was a better existing way to do this but I wrote my own script to help me run all the doctests and wade through the output. If that would be useful to anyone else, I put it here: https://github.com/Elliot-Roberts/rust_doctest_fixing_tool
Add a new Rust attribute to support embedding debugger visualizers
Implemented [this RFC](https://github.com/rust-lang/rfcs/pull/3191) to add support for embedding debugger visualizers into a PDB.
Added a new attribute `#[debugger_visualizer]` and updated the `CrateMetadata` to store debugger visualizers for crate dependencies.
RFC: https://github.com/rust-lang/rfcs/pull/3191
Cleanup `DebuggerVisualizerFile` type and other minor cleanup of queries.
Merge the queries for debugger visualizers into a single query.
Revert move of `resolve_path` to `rustc_builtin_macros`. Update dependencies in Cargo.toml for `rustc_passes`.
Respond to PR comments. Load visualizer files into opaque bytes `Vec<u8>`. Debugger visualizers for dynamically linked crates should not be embedded in the current crate.
Update the unstable book with the new feature. Add the tracking issue for the debugger_visualizer feature.
Respond to PR comments and minor cleanups.
Change `span_suggestion` (and variants) to take `impl ToString` rather
than `String` for the suggested code, as this simplifies the
requirements on the diagnostic derive.
Signed-off-by: David Wood <david.wood@huawei.com>
Implement sym operands for global_asm!
Tracking issue: #93333
This PR is pretty much a complete rewrite of `sym` operand support for inline assembly so that the same implementation can be shared by `asm!` and `global_asm!`. The main changes are:
- At the AST level, `sym` is represented as a special `InlineAsmSym` AST node containing a path instead of an `Expr`.
- At the HIR level, `sym` is split into `SymStatic` and `SymFn` depending on whether the path resolves to a static during AST lowering (defaults to `SynFn` if `get_early_res` fails).
- `SymFn` is just an `AnonConst`. It runs through typeck and we just collect the resulting type at the end. An error is emitted if the type is not a `FnDef`.
- `SymStatic` directly holds a path and the `DefId` of the `static` that it is pointing to.
- The representation at the MIR level is mostly unchanged. There is a minor change to THIR where `SymFn` is a constant instead of an expression.
- At the codegen level we need to apply the target's symbol mangling to the result of `tcx.symbol_name()` depending on the target. This is done by calling the LLVM name mangler, which handles all of the details.
- On Mach-O, all symbols have a leading underscore.
- On x86 Windows, different mangling is used for cdecl, stdcall, fastcall and vectorcall.
- No mangling is needed on other platforms.
r? `@nagisa`
cc `@eddyb`
Create (unstable) 2024 edition
[On Zulip](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Deprecating.20macro.20scoping.20shenanigans/near/272860652), there was a small aside regarding creating the 2024 edition now as opposed to later. There was a reasonable amount of support and no stated opposition.
This change creates the 2024 edition in the compiler and creates a prelude for the 2024 edition. There is no current difference between the 2021 and 2024 editions. Cargo and other tools will need to be updated separately, as it's not in the same repository. This change permits the vast majority of work towards the next edition to proceed _now_ instead of waiting until 2024.
For sanity purposes, I've merged the "hello" UI tests into a single file with multiple revisions. Otherwise we'd end up with a file per edition, despite them being essentially identical.
````@rustbot```` label +T-lang +S-waiting-on-review
Not sure on the relevant team, to be honest.
Stabilize `derive_default_enum`
This stabilizes `#![feature(derive_default_enum)]`, as proposed in [RFC 3107](https://github.com/rust-lang/rfcs/pull/3107) and tracked in #87517. In short, it permits you to `#[derive(Default)]` on `enum`s, indicating what the default should be by placing a `#[default]` attribute on the desired variant (which must be a unit variant in the interest of forward compatibility).
```````@rustbot``````` label +S-waiting-on-review +T-lang
refactor: simplify few string related interactions
Few small optimizations:
check_doc_keyword: don't alloc string for emptiness check
check_doc_alias_value: get argument as Symbol to prevent needless string convertions
check_doc_attrs: don't alloc vec, iterate over slice.
replace as_str() check with symbol check
get_single_str_from_tts: don't prealloc string
trivial string to str replace
LifetimeScopeForPath::NonElided use Vec<Symbol> instead of Vec<String>
AssertModuleSource use FxHashSet<Symbol> instead of BTreeSet<String>
CrateInfo.crate_name replace FxHashMap<CrateNum, String> with FxHashMap<CrateNum, Symbol>
check_doc_alias_value: get argument as Symbol to prevent needless string convertions
check_doc_attrs: don't alloc vec, iterate over slice. Vec introduced in #83149, but no perf run posted on merge
replace as_str() check with symbol check
get_single_str_from_tts: don't prealloc string
trivial string to str replace
LifetimeScopeForPath::NonElided use Vec<Symbol> instead of Vec<String>
AssertModuleSource use BTreeSet<Symbol> instead of BTreeSet<String>
CrateInfo.crate_name replace FxHashMap<CrateNum, String> with FxHashMap<CrateNum, Symbol>
`MultiSpan` contains labels, which are more complicated with the
introduction of diagnostic translation and will use types from
`rustc_errors` - however, `rustc_errors` depends on `rustc_span` so
`rustc_span` cannot use types like `DiagnosticMessage` without
dependency cycles. Introduce a new `rustc_error_messages` crate that can
contain `DiagnosticMessage` and `MultiSpan`.
Signed-off-by: David Wood <david.wood@huawei.com>
More robust fallback for `use` suggestion
Our old way to suggest where to add `use`s would first look for pre-existing `use`s in the relevant crate/module, and if there are *no* uses, it would fallback on trying to use another item as the basis for the suggestion.
But this was fragile, as illustrated in issue #87613
This PR instead identifies span of the first token after any inner attributes, and uses *that* as the fallback for the `use` suggestion.
Fix#87613
then we just suggest the first legal position where you could inject a use.
To do this, I added `inject_use_span` field to `ModSpans`, and populate it in
parser (it is the span of the first token found after inner attributes, if any).
Then I rewrote the use-suggestion code to utilize it, and threw out some stuff
that is now unnecessary with this in place. (I think the result is easier to
understand.)
Then I added a test of issue 87613.
Improve allowness of the unexpected_cfgs lint
This pull-request improve the allowness (`#[allow(...)]`) of the `unexpected_cfgs` lint.
Before this PR only crate level `#![allow(unexpected_cfgs)]` worked, now with this PR it also work when put around `cfg!` or if it is in a upper level. Making it work ~for the attributes `cfg`, `cfg_attr`, ...~ for the same level is awkward as the current code is design to give "Some parent node that is close to this macro call" (cf. https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExpansionData.html) meaning that allow on the same line as an attribute won't work. I'm note even sure if this would be possible.
Found while working on https://github.com/rust-lang/rust/pull/94298.
r? ````````@petrochenkov````````
As an example:
#[test]
#[ignore = "not yet implemented"]
fn test_ignored() {
...
}
Will now render as:
running 2 tests
test tests::test_ignored ... ignored, not yet implemented
test result: ok. 1 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.00s