In some places we use `Vec<Attribute>` and some places we use
`ThinVec<Attribute>` (a.k.a. `AttrVec`). This results in various points
where we have to convert between `Vec` and `ThinVec`.
This commit changes the places that use `Vec<Attribute>` to use
`AttrVec`. A lot of this is mechanical and boring, but there are
some interesting parts:
- It adds a few new methods to `ThinVec`.
- It implements `MapInPlace` for `ThinVec`, and introduces a macro to
avoid the repetition of this trait for `Vec`, `SmallVec`, and
`ThinVec`.
Overall, it makes the code a little nicer, and has little effect on
performance. But it is a precursor to removing
`rustc_data_structures::thin_vec::ThinVec` and replacing it with
`thin_vec::ThinVec`, which is implemented more efficiently.
- Rename `ast::Lit::token` as `ast::Lit::token_lit`, because its type is
`token::Lit`, which is not a token. (This has been confusing me for a
long time.)
reasonable because we have an `ast::token::Lit` inside an `ast::Lit`.
- Rename `LitKind::{from,to}_lit_token` as
`LitKind::{from,to}_token_lit`, to match the above change and
`token::Lit`.
There is some redundancy here, but the extra assertions make it easier
to keep track of relative things, e.g. `ExprKind` is the biggest part
of `Expr`.
Stringify non-shorthand visibility correctly
This makes `stringify!(pub(in crate))` evaluate to `pub(in crate)` rather than `pub(crate)`, matching the behavior before the `crate` shorthand was removed. Further, this changes `stringify!(pub(in super))` to evaluate to `pub(in super)` rather than the current `pub(super)`. If the latter is not desired (it is _technically_ breaking), it can be undone.
Fixes#99981
`@rustbot` label +C-bug +regression-from-stable-to-beta +T-compiler
Rollup of 8 pull requests
Successful merges:
- #99340 (Fix ICE in Definitions::create_def)
- #99629 (Improve `cannot move out of` error message)
- #99864 (bootstrap: don't emit warn about duplicated deps with same/different features if some of sets actually empty)
- #99911 (Remove some uses of `guess_head_span`)
- #99976 (Make Rustdoc exit with correct error code when scraping examples from invalid files)
- #100003 (Improve size assertions.)
- #100012 (Avoid `Ty` to `String` conversions)
- #100020 (better error when python is not found in x - issue #99648)
Failed merges:
- #99994 (Replace `guess_head_span` with `opt_span`)
r? `@ghost`
`@rustbot` modify labels: rollup
- For any file with four or more size assertions, move them into a
separate module (as is already done for `hir.rs`).
- Add some more for AST nodes and THIR nodes.
- Put the `hir.rs` ones in alphabetical order.
From 72 bytes to 12 bytes (on x86-64).
There are two parts to this:
- Changing various source code offsets from 64-bit to 32-bit. This is
not a problem because the rest of rustc also uses 32-bit source code
offsets. This means `Token` is no longer `Copy` but this causes no
problems.
- Removing the `RawStrError` from `LiteralKind`. Raw string literal
invalidity is now indicated by a `None` value within
`RawStr`/`RawByteStr`, and the new `validate_raw_str` function can be
used to re-lex an invalid raw string literal to get the `RawStrError`.
There is one very small change in behaviour. Previously, if a raw string
literal matched both the `InvalidStarter` and `TooManyHashes` cases,
the latter would override the former. This has now changed, because
`raw_double_quoted_string` now uses `?` and so returns immediately upon
detecting the `InvalidStarter` case. I think this is a slight
improvement to report the earlier-detected error, and it explains the
change in the `test_too_many_hashes` test.
The commit also removes a couple of comments that refer to #77629 and
say that the size of these types don't affect performance. These
comments are wrong, though the performance effect is small.
Remove `TreeAndSpacing`.
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
r? `@petrochenkov`
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
More derive output improvements
This PR includes:
- Some test improvements.
- Some cosmetic changes to derive output that make the code look more like what a human would write.
- Some more fundamental improvements to `cmp` and `partial_cmp` generation.
r? `@Mark-Simulacrum`
It's common to see repeated assertions like this in derived `clone` and
`eq` methods:
```
let _: ::core::clone::AssertParamIsClone<u32>;
let _: ::core::clone::AssertParamIsClone<u32>;
```
This commit avoids them.
Fixup missing renames from `#[main]` to `#[rustc_main]`
In #84217 `#[main]` was removed and replaced with `#[rustc_main]`. In some places the rename was forgotten, which makes the current code confusing, because at first glance it seems that `#[main]` is still around. Perform the renames also in these places.
I noticed this (after first being confused by it) when working on #97802.
r? `@petrochenkov`
(since you reviewed the other PR)
In fc357039f9 `#[main]` was removed and replaced with `#[rustc_main]`.
In some place the rename was forgotten, which makes the current code
confusing, because at first glance it seems that `#[main]` is still
around. Perform the renames also in these places.
Both functions do some modifying of streams using `make_mut`:
- `push` sometimes glues the first token of the next stream to the last
token of the first stream.
- `build` appends tokens to the first stream.
By doing all of this in the one place, things are simpler. The first
stream can be modified in both ways (if necessary) in the one place, and
any next stream with the first token removed doesn't need to be stored.
It's a weird function: it lets you modify the token stream in the middle
of iteration. There is only one call site, and it is only used for the
rare `ProceduralMasquerade` legacy case.
This avoids the name clash with `rustc_serialize::Encoder` (a trait),
and allows lots qualifiers to be removed and imports to be simplified
(e.g. fewer `as` imports).
(This was previously merged as commit 5 in #94732 and then was reverted
in #97905 because of a perf regression caused by commit 4 in #94732.)
Don't suggest adding `let` in certain `if` conditions
Avoid being too eager to suggest `let` in an `if` condition with an `=`, namely when the LHS of the `=` isn't even valid as a pattern (to a first degree approximation).
This heustic I came up with kinda sucks. Let me know if it needs to be refined.
This avoids the name clash with `rustc_serialize::Encoder` (a trait),
and allows lots qualifiers to be removed and imports to be simplified
(e.g. fewer `as` imports).
There are two impls of the `Encoder` trait: `opaque::Encoder` and
`opaque::FileEncoder`. The former encodes into memory and is infallible, the
latter writes to file and is fallible.
Currently, standard `Result`/`?`/`unwrap` error handling is used, but this is a
bit verbose and has non-trivial cost, which is annoying given how rare failures
are (especially in the infallible `opaque::Encoder` case).
This commit changes how `Encoder` fallibility is handled. All the `emit_*`
methods are now infallible. `opaque::Encoder` requires no great changes for
this. `opaque::FileEncoder` now implements a delayed error handling strategy.
If a failure occurs, it records this via the `res` field, and all subsequent
encoding operations are skipped if `res` indicates an error has occurred. Once
encoding is complete, the new `finish` method is called, which returns a
`Result`. In other words, there is now a single `Result`-producing method
instead of many of them.
This has very little effect on how any file errors are reported if
`opaque::FileEncoder` has any failures.
Much of this commit is boring mechanical changes, removing `Result` return
values and `?` or `unwrap` from expressions. The more interesting parts are as
follows.
- serialize.rs: The `Encoder` trait gains an `Ok` associated type. The
`into_inner` method is changed into `finish`, which returns
`Result<Vec<u8>, !>`.
- opaque.rs: The `FileEncoder` adopts the delayed error handling
strategy. Its `Ok` type is a `usize`, returning the number of bytes
written, replacing previous uses of `FileEncoder::position`.
- Various methods that take an encoder now consume it, rather than being
passed a mutable reference, e.g. `serialize_query_result_cache`.
The change was "Show invisible delimiters (within comments) when pretty
printing". It's useful to show these delimiters, but is a breaking
change for some proc macros.
Fixes#97608.
Permit `asm_const` and `asm_sym` to reference generic params
Related #96557
These constructs will be allowed:
```rust
fn foofoo<const N: usize>() {}
unsafe fn foo<const N: usize>() {
asm!("/* {0} */", const N);
asm!("/* {0} */", const N + 1);
asm!("/* {0} */", sym foofoo::<N>);
}
fn barbar<T>() {}
unsafe fn bar<T>() {
asm!("/* {0} */", const std::mem::size_of::<T>());
asm!("/* {0} */", const std::mem::size_of::<(T, T)>());
asm!("/* {0} */", sym barbar::<T>);
asm!("/* {0} */", sym barbar::<(T, T)>);
}
```
`@Amanieu,` I didn't switch inline asms to use `DefKind::InlineAsm`, as I see little value doing that; given that no type inference is needed, it will only make typecking slower and more complex but will have no real gains. I did switch them to follow the same code path as inline asm during symbol resolution, though.
The `error: unconstrained generic constant` you mentioned in #76001 is due to the fact that `to_const` will actually add a wfness obligation to the constant, which we don't need for `asm_const`, so I have that removed.
`@rustbot` label: +A-inline-assembly +F-asm
Begin fixing all the broken doctests in `compiler/`
Begins to fix#95994.
All of them pass now but 24 of them I've marked with `ignore HELP (<explanation>)` (asking for help) as I'm unsure how to get them to work / if we should leave them as they are.
There are also a few that I marked `ignore` that could maybe be made to work but seem less important.
Each `ignore` has a rough "reason" for ignoring after it parentheses, with
- `(pseudo-rust)` meaning "mostly rust-like but contains foreign syntax"
- `(illustrative)` a somewhat catchall for either a fragment of rust that doesn't stand on its own (like a lone type), or abbreviated rust with ellipses and undeclared types that would get too cluttered if made compile-worthy.
- `(not-rust)` stuff that isn't rust but benefits from the syntax highlighting, like MIR.
- `(internal)` uses `rustc_*` code which would be difficult to make work with the testing setup.
Those reason notes are a bit inconsistently applied and messy though. If that's important I can go through them again and try a more principled approach. When I run `rg '```ignore \(' .` on the repo, there look to be lots of different conventions other people have used for this sort of thing. I could try unifying them all if that would be helpful.
I'm not sure if there was a better existing way to do this but I wrote my own script to help me run all the doctests and wade through the output. If that would be useful to anyone else, I put it here: https://github.com/Elliot-Roberts/rust_doctest_fixing_tool
Overhaul `MacArgs`
Motivation:
- Clarify some code that I found hard to understand.
- Eliminate one use of three places where `TokenKind::Interpolated` values are created.
r? `@petrochenkov`
The value in `MacArgs::Eq` is currently represented as a `Token`.
Because of `TokenKind::Interpolated`, `Token` can be either a token or
an arbitrary AST fragment. In practice, a `MacArgs::Eq` starts out as a
literal or macro call AST fragment, and then is later lowered to a
literal token. But this is very non-obvious. `Token` is a much more
general type than what is needed.
This commit restricts things, by introducing a new type `MacArgsEqKind`
that is either an AST expression (pre-lowering) or an AST literal
(post-lowering). The downside is that the code is a bit more verbose in
a few places. The benefit is that makes it much clearer what the
possibilities are (though also shorter in some other places). Also, it
removes one use of `TokenKind::Interpolated`, taking us a step closer to
removing that variant, which will let us make `Token` impl `Copy` and
remove many "handle Interpolated" code paths in the parser.
Things to note:
- Error messages have improved. Messages like this:
```
unexpected token: `"bug" + "found"`
```
now say "unexpected expression", which makes more sense. Although
arbitrary expressions can exist within tokens thanks to
`TokenKind::Interpolated`, that's not obvious to anyone who doesn't
know compiler internals.
- In `parse_mac_args_common`, we no longer need to collect tokens for
the value expression.
The comment on this function explains that it's a specialized version of
`maybe_whole_expr`. But `maybe_whole_expr` doesn't do anything with
`NtIdent`, so `is_whole_expr` also doesn't need to.
Using an obviously-placeholder syntax. An RFC would still be needed before this could have any chance at stabilization, and it might be removed at any point.
But I'd really like to have it in nightly at least to ensure it works well with try_trait_v2, especially as we refactor the traits.
The two paths are equivalent -- they both end up calling `visit_expr()`.
I have kept the more restrictive path, the one that requires that
`token` be an expression nonterminal. (The next commit will simplify this
function further.)
Add `BoundKind` in `visit_param_bounds` to check questions in bounds
From the FIXME in the impl of `AstValidator`. Better bound checks by adding `BoundCtxt` type parameter to `visit_param_bound`
cc `@ecstatic-morse`
This lets us clone just the parts within a `TokenTree` that need
cloning, rather than the entire thing. This is a surprisingly large
performance win, up to 4% on `async-std-1.10.0`.
Report undeclared lifetimes during late resolution.
First step in https://github.com/rust-lang/rust/pull/91557
We reuse the rib design of the current resolution framework. Specific `LifetimeRib` and `LifetimeRibKind` types are introduced. The most important variant is `LifetimeRibKind::Generics`, which happens each time we encounter something which may introduce generic lifetime parameters. It can be an item or a `for<...>` binder. The `LifetimeBinderKind` specifies how this rib behaves with respect to in-band lifetimes.
r? `@petrochenkov`
Remove last vestiges of skippng ident span hashing
This removes a comment that no longer applies, and properly hashes
the full ident for path segments.
Implement sym operands for global_asm!
Tracking issue: #93333
This PR is pretty much a complete rewrite of `sym` operand support for inline assembly so that the same implementation can be shared by `asm!` and `global_asm!`. The main changes are:
- At the AST level, `sym` is represented as a special `InlineAsmSym` AST node containing a path instead of an `Expr`.
- At the HIR level, `sym` is split into `SymStatic` and `SymFn` depending on whether the path resolves to a static during AST lowering (defaults to `SynFn` if `get_early_res` fails).
- `SymFn` is just an `AnonConst`. It runs through typeck and we just collect the resulting type at the end. An error is emitted if the type is not a `FnDef`.
- `SymStatic` directly holds a path and the `DefId` of the `static` that it is pointing to.
- The representation at the MIR level is mostly unchanged. There is a minor change to THIR where `SymFn` is a constant instead of an expression.
- At the codegen level we need to apply the target's symbol mangling to the result of `tcx.symbol_name()` depending on the target. This is done by calling the LLVM name mangler, which handles all of the details.
- On Mach-O, all symbols have a leading underscore.
- On x86 Windows, different mangling is used for cdecl, stdcall, fastcall and vectorcall.
- No mangling is needed on other platforms.
r? `@nagisa`
cc `@eddyb`
Create (unstable) 2024 edition
[On Zulip](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Deprecating.20macro.20scoping.20shenanigans/near/272860652), there was a small aside regarding creating the 2024 edition now as opposed to later. There was a reasonable amount of support and no stated opposition.
This change creates the 2024 edition in the compiler and creates a prelude for the 2024 edition. There is no current difference between the 2021 and 2024 editions. Cargo and other tools will need to be updated separately, as it's not in the same repository. This change permits the vast majority of work towards the next edition to proceed _now_ instead of waiting until 2024.
For sanity purposes, I've merged the "hello" UI tests into a single file with multiple revisions. Otherwise we'd end up with a file per edition, despite them being essentially identical.
````@rustbot```` label +T-lang +S-waiting-on-review
Not sure on the relevant team, to be honest.
By heap allocating the argument within `NtPath`, `NtVis`, and `NtStmt`.
This slightly reduces cumulative and peak allocation amounts, most
notably on `deep-vector`.
Spellchecking compiler comments
This PR cleans up the rest of the spelling mistakes in the compiler comments. This PR does not change any literal or code spelling issues.
It's only needed for macro expansion, not as a general element in the
AST. This commit removes it, adds `NtOrTt` for the parser and macro
expansion cases, and renames the variants in `NamedMatch` to better
match the new type.
More robust fallback for `use` suggestion
Our old way to suggest where to add `use`s would first look for pre-existing `use`s in the relevant crate/module, and if there are *no* uses, it would fallback on trying to use another item as the basis for the suggestion.
But this was fragile, as illustrated in issue #87613
This PR instead identifies span of the first token after any inner attributes, and uses *that* as the fallback for the `use` suggestion.
Fix#87613
then we just suggest the first legal position where you could inject a use.
To do this, I added `inject_use_span` field to `ModSpans`, and populate it in
parser (it is the span of the first token found after inner attributes, if any).
Then I rewrote the use-suggestion code to utilize it, and threw out some stuff
that is now unnecessary with this in place. (I think the result is easier to
understand.)
Then I added a test of issue 87613.
Generator drop tracking: improve break and continue handling
This PR fixes two related issues.
One, sometimes break or continue have a block target instead of an expression target. This seems to mainly happen with try blocks. Since the drop tracking analysis only works on expressions, if we see a block target for break or continue, we substitute the last expression of the block as the target instead.
Two, break and continue were incorrectly being treated as the same, so continue would also show up as an exit from the loop or block. This patch corrects the way continue is handled by keeping a stack of loop entry points and uses those to find the target of the continue.
Fixes#93197
r? `@nikomatsakis`
This commit fixes two issues.
One, sometimes break or continue have a block target instead of an
expression target. This seems to mainly happen with try blocks. Since
the drop tracking analysis only works on expressions, if we see a block
target for break or continue, we substitute the last expression of the
block as the target instead.
Two, break and continue were incorrectly being treated as the same, so
continue would also show up as an exit from the loop or block. This
patch corrects the way continue is handled by keeping a stack of loop
entry points and uses those to find the target of the continue.
`Decoder` has two impls:
- opaque: this impl is already partly infallible, i.e. in some places it
currently panics on failure (e.g. if the input is too short, or on a
bad `Result` discriminant), and in some places it returns an error
(e.g. on a bad `Option` discriminant). The number of places where
either happens is surprisingly small, just because the binary
representation has very little redundancy and a lot of input reading
can occur even on malformed data.
- json: this impl is fully fallible, but it's only used (a) for the
`.rlink` file production, and there's a `FIXME` comment suggesting it
should change to a binary format, and (b) in a few tests in
non-fundamental ways. Indeed #85993 is open to remove it entirely.
And the top-level places in the compiler that call into decoding just
abort on error anyway. So the fallibility is providing little value, and
getting rid of it leads to some non-trivial performance improvements.
Much of this commit is pretty boring and mechanical. Some notes about
a few interesting parts:
- The commit removes `Decoder::{Error,error}`.
- `InternIteratorElement::intern_with`: the impl for `T` now has the same
optimization for small counts that the impl for `Result<T, E>` has,
because it's now much hotter.
- Decodable impls for SmallVec, LinkedList, VecDeque now all use
`collect`, which is nice; the one for `Vec` uses unsafe code, because
that gave better perf on some benchmarks.
Fix star handling in block doc comments
Fixes#92872.
Some extra explanation about this PR and why https://github.com/rust-lang/rust/pull/92357 created this regression: when we merge doc comment kinds for example in:
```rust
/// he
/**
* hello
*/
#[doc = "boom"]
```
We don't want to remove the empty lines between them. However, to correctly compute the "horizontal trim", we still need it, so instead, I put back a part of the "vertical trim" directly in the "horizontal trim" computation so it doesn't impact the output buffer but allows us to correctly handle the stars.
r? `@camelid`
Add Attribute::meta_kind
The `AttrItem::meta` function is being called on a lot of places, however almost always the caller is only interested in the `kind` of the result `MetaItem`. Before, the `path` had to be cloned in order to get the kind, now it does not have to be.
There is a larger related "problem". In a lot of places, something wants to know contents of attributes. This is accessed through `Attribute::meta_item_list`, which calls `AttrItem::meta` (now `AttrItem::meta_kind`), among other methods. When this function is called, the meta item list has to be recreated from scratch. Everytime something asks a simple question (like is this item/list of attributes `#[doc(hidden)]`?), the tokens of the attribute(s) are cloned, parsed and the results are allocated on the heap. That seems really unnecessary. What would be the best way to cache this? Turn `meta_item_list` into a query perhaps? Related PR: https://github.com/rust-lang/rust/pull/92227
r? rust-lang/rustdoc
ast: Avoid aborts on fatal errors thrown from mutable AST visitor
Set the node to some dummy value and rethrow the error instead.
When using the old aborting `visit_clobber` in `InvocationCollector::visit_crate` the next tests abort due to fatal errors:
```
ui\modules\path-invalid-form.rs
ui\modules\path-macro.rs
ui\modules\path-no-file-name.rs
ui\parser\issues\issue-5806.rs
ui\parser\mod_file_with_path_attr.rs
```
Follow up to https://github.com/rust-lang/rust/pull/91313.
Remove `SymbolStr`
This was originally proposed in https://github.com/rust-lang/rust/pull/74554#discussion_r466203544. As well as removing the icky `SymbolStr` type, it allows the removal of a lot of `&` and `*` occurrences.
Best reviewed one commit at a time.
r? `@oli-obk`
Stabilise `feature(const_generics_defaults)`
`feature(const_generics_defaults)` is complete implementation wise and has a pretty extensive test suite so I think is ready for stabilisation.
needs stabilisation report and maybe an RFC 😅
r? `@lcnr`
cc `@rust-lang/project-const-generics`
Fix bug with `#[doc]` string single-character last lines
Fixes#90618.
This is because `.iter().all(|c| c == '*')` returns `true` if there is no character checked. And in case the last line has only one character, it simply returns `true`, making the last line behind removed.
TraitKind -> Trait
TyAliasKind -> TyAlias
ImplKind -> Impl
FnKind -> Fn
All `*Kind`s in AST are supposed to be enums.
Tuple structs are converted to braced structs for the types above, and fields are reordered in syntactic order.
Also, mutable AST visitor now correctly visit spans in defaultness, unsafety, impl polarity and constness.
Optimize bidi character detection.
Should fix most of the performance regression of the bidi character detection (#90514), to be confirmed with a perf run.
rustc_ast: Turn `MutVisitor::token_visiting_enabled` into a constant
It's a visitor property rather than something that needs to be determined at runtime
rustc_span: `Ident::invalid` -> `Ident::empty`
The equivalent for `Symbol`s was renamed some time ago (`kw::Invalid` -> `kw::Empty`), and it makes sense to do the same thing for `Ident`s as well.
Revert anon union parsing
Revert PR #84571 and #85515, which implemented anonymous union parsing in a manner that broke the context-sensitivity for the `union` keyword and thus broke stable Rust code.
Fix#88583.
This reverts commit 059b68dd67.
Note that this was manually adjusted to retain some of the refactoring
introduced by commit 059b68dd67, so that it could
likewise retain the correction introduced in commit
5b4bc05fa5
Improve diagnostics for unary plus operators (#88276)
This pull request improves the diagnostics emitted on parsing a unary plus operator. See #88276.
Before:
```
error: expected expression, found `+`
--> src/main.rs:2:13
|
2 | let x = +1;
| ^ expected expression
```
After:
```
error: leading `+` is not supported
--> main.rs:2:13
|
2 | let x = +1;
| ^
| |
| unexpected `+`
| help: try removing the `+`
```
Detect bare blocks with type ascription that were meant to be a `struct` literal
Address part of #34255.
Potential improvement: silence the other knock down errors in `issue-34255-1.rs`.
- [x] Removed `?const` and change uses of `?const`
- [x] Added `~const` to the AST. It is gated behind const_trait_impl.
- [x] Validate `~const` in ast_validation.
- [ ] Add enum `BoundConstness` to the HIR. (With variants `NotConst` and
`ConstIfConst` allowing future extensions)
- [ ] Adjust trait selection and pre-existing code to use `BoundConstness`.
- [ ] Optional steps (*for this PR, obviously*)
- [ ] Fix#88155
- [ ] Do something with constness bounds in chalk
Use if-let guards in the codebase and various other pattern cleanups
Dogfooding if-let guards as experimentation for the feature.
Tracking issue #51114. Conflicts with #87937.
rfc3052 followup: Remove authors field from Cargo manifests
Since RFC 3052 soft deprecated the authors field, hiding it from
crates.io, docs.rs, and making Cargo not add it by default, and it is
not generally up to date/useful information for contributors, we may as well
remove it from crates in this repo.
Since RFC 3052 soft deprecated the authors field anyway, hiding it from
crates.io, docs.rs, and making Cargo not add it by default, and it is
not generally up to date/useful information, we should remove it from
crates in this repo.
The special case breaks several useful invariants (`ExpnId`s are
globally unique, and never change). This special case
was added back in 2016 in https://github.com/rust-lang/rust/pull/34355
Stabilize "RangeFrom" patterns in 1.55
Implements a partial stabilization of #67264 and #37854.
Reference PR: https://github.com/rust-lang/reference/pull/900
# Stabilization Report
This stabilizes the `X..` pattern, shown as such, offering an exhaustive match for unsigned integers:
```rust
match x as u32 {
0 => println!("zero!"),
1.. => println!("positive number!"),
}
```
Currently if a Rust author wants to write such a match on an integer, they must use `1..={integer}::MAX` . By allowing a "RangeFrom" style pattern, this simplifies the match to not require the MAX path and thus not require specifically repeating the type inside the match, allowing for easier refactoring. This is particularly useful for instances like the above case, where different behavior on "0" vs. "1 or any positive number" is desired, and the actual MAX is unimportant.
Notably, this excepts slice patterns which include half-open ranges from stabilization, as the wisdom of those is still subject to some debate.
## Practical Applications
Instances of this specific usage have appeared in the compiler:
16143d1067/compiler/rustc_middle/src/ty/inhabitedness/mod.rs (L219)673d0db5e3/compiler/rustc_ty_utils/src/ty.rs (L524)
And I have noticed there are also a handful of "in the wild" users who have deployed it to similar effect, especially in the case of rejecting any value of a certain number or greater. It simply makes it much more ergonomic to write an irrefutable match, as done in Katholieke Universiteit Leuven's [SCALE and MAMBA project](05e5db00d5/WebAssembly/scale_std/src/fixed_point.rs (L685-L695)).
## Tests
There were already many tests in [src/test/ui/half-open-range/patterns](90a2e5e3fe/src/test/ui/half-open-range-patterns), as well as [generic pattern tests that test the `exclusive_range_pattern` feature](673d0db5e3/src/test/ui/pattern/usefulness/integer-ranges/reachability.rs), many dating back to the feature's introduction and remaining standing to this day. However, this stabilization comes with some additional tests to explore the... sometimes interesting behavior of interactions with other patterns. e.g. There is, at least, a mild diagnostic improvement in some edge cases, because before now, the pattern `0..=(5+1)` encounters the `half_open_range_patterns` feature gate and can thus emit the request to enable the feature flag, while also emitting the "inclusive range with no end" diagnostic. There is no intent to allow an `X..=` pattern that I am aware of, so removing the flag request is a strict improvement. The arrival of the `J | K` "or" pattern also enables some odd formations.
Some of the behavior tested for here is derived from experiments in this [Playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=58777b3c715c85165ac4a70d93efeefc) example, linked at https://github.com/rust-lang/rust/issues/67264#issuecomment-812770692, which may be useful to reference to observe the current behavior more closely.
In addition tests constituting an explanation of the "slicing range patterns" syntax issue are included in this PR.
## Desiderata
The exclusive range patterns and half-open range patterns are fairly strongly requested by many authors, as they make some patterns much more natural to write, but there is disagreement regarding the "closed" exclusive range pattern or the "RangeTo" pattern, especially where it creates "off by one" gaps in the presence of a "catch-all" wildcard case. Also, there are obviously no range analyses in place that will force diagnostics for e.g. highly overlapping matches. I believe these should be warned on, ideally, and I think it would be reasonable to consider such a blocker to stabilizing this feature, but there is no technical issue with the feature as-is from the purely syntactic perspective as such overlapping or missed matches can already be generated today with such a catch-all case. And part of the "point" of the feature, at least from my view, is to make it easier to omit wildcard matches: a pattern with such an "open" match produces an irrefutable match and does not need the wild card case, making it easier to benefit from exhaustiveness checking.
## History
- Implemented:
- Partially via exclusive ranges: https://github.com/rust-lang/rust/pull/35712
- Fully with half-open ranges: https://github.com/rust-lang/rust/pull/67258
- Unresolved Questions:
- The precedence concerns of https://github.com/rust-lang/rust/pull/48501 were considered as likely requiring adjustment but probably wanting a uniform consistent change across all pattern styles, given https://github.com/rust-lang/rust/issues/67264#issuecomment-720711656, but it is still unknown what changes might be desired
- How we want to handle slice patterns in ranges seems to be an open question still, as witnessed in the discussion of this PR!
I checked but I couldn't actually find an RFC for this, and given "approved provisionally by lang team without an RFC", I believe this might require an RFC before it can land? Unsure of procedure here, on account of this being stabilizing a subset of a feature of syntax.
r? `@scottmcm`
Fix ICE when `main` is declared in an `extern` block
Changes in #84401 to implement `imported_main` changed how the crate entry point is found, and a declared `main` in an `extern` block was detected erroneously. This was causing the ICE described in #86110.
This PR adds a check for this case and emits an error instead. Previously a `main` declaration in an `extern` block was not detected as an entry point at all, so emitting an error shouldn't break anything that worked previously. In 1.52.1 stable this is demonstrated, with a `` `main` function not found`` error.
Fixes#86110
parser: Ensure that all nonterminals have tokens after parsing
`parse_nonterminal` should always result in something with tokens.
This requirement wasn't satisfied in two cases:
- `stmt` nonterminal with expression statements (e.g. `0`, or `{}`, or `path + 1`) because `fn parse_stmt_without_recovery` forgot to propagate `force_collect` in some cases.
- `expr` nonterminal with expressions with built-in attributes (e.g. `#[allow(warnings)] 0`) due to an incorrect optimization in `fn parse_expr_force_collect`, it assumed that all expressions starting with `#` have their tokens collected during parsing, but that's not true if all the attributes on that expression are built-in and inert.
(Discovered when trying to implement eager `cfg` expansion for all attributes https://github.com/rust-lang/rust/pull/83824#issuecomment-817317170.)
r? `@Aaron1011`
further split up const_fn feature flag
This continues the work on splitting up `const_fn` into separate feature flags:
* `const_fn_trait_bound` for `const fn` with trait bounds
* `const_fn_unsize` for unsizing coercions in `const fn` (looks like only `dyn` unsizing is still guarded here)
I don't know if there are even any things left that `const_fn` guards... at least libcore and liballoc do not need it any more.
`@oli-obk` are you currently able to do reviews?
This PR modifies the macro expansion infrastructure to handle attributes
in a fully token-based manner. As a result:
* Derives macros no longer lose spans when their input is modified
by eager cfg-expansion. This is accomplished by performing eager
cfg-expansion on the token stream that we pass to the derive
proc-macro
* Inner attributes now preserve spans in all cases, including when we
have multiple inner attributes in a row.
This is accomplished through the following changes:
* New structs `AttrAnnotatedTokenStream` and `AttrAnnotatedTokenTree` are introduced.
These are very similar to a normal `TokenTree`, but they also track
the position of attributes and attribute targets within the stream.
They are built when we collect tokens during parsing.
An `AttrAnnotatedTokenStream` is converted to a regular `TokenStream` when
we invoke a macro.
* Token capturing and `LazyTokenStream` are modified to work with
`AttrAnnotatedTokenStream`. A new `ReplaceRange` type is introduced, which
is created during the parsing of a nested AST node to make the 'outer'
AST node aware of the attributes and attribute target stored deeper in the token stream.
* When we need to perform eager cfg-expansion (either due to `#[derive]` or `#[cfg_eval]`),
we tokenize and reparse our target, capturing additional information about the locations of
`#[cfg]` and `#[cfg_attr]` attributes at any depth within the target.
This is a performance optimization, allowing us to perform less work
in the typical case where captured tokens never have eager cfg-expansion run.
Extract attribute name once and match it against symbols that are being
validated, instead of using `Session::check_name` for each symbol
individually.
Assume that all validated attributes are used, instead of marking them
as such, since the attribute check should be exhaustive.
Use AnonConst for asm! constants
This replaces the old system which used explicit promotion. See #83169 for more background.
The syntax for `const` operands is still the same as before: `const <expr>`.
Fixes#83169
Because the implementation is heavily based on inline consts, we suffer from the same issues:
- We lose the ability to use expressions derived from generics. See the deleted tests in `src/test/ui/asm/const.rs`.
- We are hitting the same ICEs as inline consts, for example #78174. It is unlikely that we will be able to stabilize this before inline consts are stabilized.
Found with https://github.com/est31/warnalyzer.
Dubious changes:
- Is anyone else using rustc_apfloat? I feel weird completely deleting
x87 support.
- Maybe some of the dead code in rustc_data_structures, in case someone
wants to use it in the future?
- Don't change rustc_serialize
I plan to scrap most of the json module in the near future (see
https://github.com/rust-lang/compiler-team/issues/418) and fixing the
tests needed more work than I expected.
TODO: check if any of the comments on the deleted code should be kept.
Allow registering tool lints with `register_tool`
Previously, there was no way to add a custom tool prefix, even if the tool
itself had registered a lint:
```rust
#![feature(register_tool)]
#![register_tool(xyz)]
#![warn(xyz::my_lint)]
```
```
$ rustc unknown-lint.rs --crate-type lib
error[E0710]: an unknown tool name found in scoped lint: `xyz::my_lint`
--> unknown-lint.rs:3:9
|
3 | #![warn(xyz::my_lint)]
| ^^^
```
This allows opting-in to lints from other tools using `register_tool`.
cc https://github.com/rust-lang/rust/issues/66079#issuecomment-788589193, ``@chorman0773``
r? ``@petrochenkov``
Extend `proc_macro_back_compat` lint to `procedural-masquerade`
We now lint on *any* use of `procedural-masquerade` crate. While this
crate still exists, its main reverse dependency (`cssparser`) no longer
depends on it. Any crates still depending off should stop doing so, as
it only exists to support very old Rust versions.
If a crate actually needs to support old versions of rustc via
`procedural-masquerade`, then they'll just need to accept the warning
until we remove it entirely (at the same time as the back-compat hack).
The latest version of `procedural-masquerade` does work with the
latest rustc, but trying to check for the version seems like more
trouble than it's worth.
While working on this, I realized that the `proc-macro-hack` check was
never actually doing anything. The corresponding enum variant in
`proc-macro-hack` is named `Value` or `Nested` - it has never been
called `Input`. Due to a strange Crater issue, the Crater run that
tested adding this did *not* end up testing it - some of the crates that
would have failed did not actually have their tests checked, making it
seem as though the `proc-macro-hack` check was working.
The Crater issue is being discussed at
https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Nearly.20identical.20Crater.20runs.20processed.20a.20crate.20differently/near/230406661
Despite the `proc-macro-hack` check not actually doing anything, we
haven't gotten any reports from users about their build being broken.
I went ahead and removed it entirely, since it's clear that no one is
being affected by the `proc-macro-hack` regression in practice.
ast/hir: Rename field-related structures
I always forget what `ast::Field` and `ast::StructField` mean despite working with AST for long time, so this PR changes the naming to less confusing and more consistent.
- `StructField` -> `FieldDef` ("field definition")
- `Field` -> `ExprField` ("expression field", not "field expression")
- `FieldPat` -> `PatField` ("pattern field", not "field pattern")
Various visiting and other methods working with the fields are renamed correspondingly too.
The second commit reduces the size of `ExprKind` by boxing fields of `ExprKind::Struct` in preparation for https://github.com/rust-lang/rust/pull/80080.
Previously, there was no way to add a custom tool prefix, even if the tool
itself had registered a lint:
```
#![feature(register_tool)]
#![register_tool(xyz)]
#![warn(xyz::my_lint)]
```
```
$ rustc unknown-lint.rs --crate-type lib
error[E0710]: an unknown tool name found in scoped lint: `xyz::my_lint`
--> unknown-lint.rs:3:9
|
3 | #![warn(xyz::my_lint)]
| ^^^
```
This allows opting-in to lints from other tools using `register_tool`.
StructField -> FieldDef ("field definition")
Field -> ExprField ("expression field", not "field expression")
FieldPat -> PatField ("pattern field", not "field pattern")
Also rename visiting and other methods working on them.
We now lint on *any* use of `procedural-masquerade` crate. While this
crate still exists, its main reverse dependency (`cssparser`) no longer
depends on it. Any crates still depending off should stop doing so, as
it only exists to support very old Rust versions.
If a crate actually needs to support old versions of rustc via
`procedural-masquerade`, then they'll just need to accept the warning
until we remove it entirely (at the same time as the back-compat hack).
The latest version of `procedural-masquerade` does not work with the
latest rustc, but trying to check for the version seems like more
trouble than it's worth.
While working on this, I realized that the `proc-macro-hack` check was
never actually doing anything. The corresponding enum variant in
`proc-macro-hack` is named `Value` or `Nested` - it has never been
called `Input`. Due to a strange Crater issue, the Crater run that
tested adding this did *not* end up testing it - some of the crates that
would have failed did not actually have their tests checked, making it
seem as though the `proc-macro-hack` check was working.
The Crater issue is being discussed at
https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Nearly.20identical.20Crater.20runs.20processed.20a.20crate.20differently/near/230406661
Despite the `proc-macro-hack` check not actually doing anything, we
haven't gotten any reports from users about their build being broken.
I went ahead and removed it entirely, since it's clear that no one is
being affected by the `proc-macro-hack` regression in practice.
Introduce `proc_macro_back_compat` lint, and emit for `time-macros-impl`
Now that future-incompat-report support has landed in nightly Cargo, we
can start to make progress towards removing the various proc-macro
back-compat hacks that have accumulated in the compiler.
This PR introduces a new lint `proc_macro_back_compat`, which results in
a future-incompat-report entry being generated. All proc-macro
back-compat warnings will be grouped under this lint. Note that this
lint will never actually become a hard error - instead, we will remove
the special cases for various macros, which will cause older versions of
those crates to emit some other error.
I've added code to fire this lint for the `time-macros-impl` case. This
is the easiest case out of all of our current back-compat hacks - the
crate was renamed to `time-macros`, so seeing a filename with
`time-macros-impl` guarantees that an older version of the parent `time`
crate is in use.
When Cargo's future-incompat-report feature gets stabilized, affected
users will start to see future-incompat warnings when they build their
crates.
Now that future-incompat-report support has landed in nightly Cargo, we
can start to make progress towards removing the various proc-macro
back-compat hacks that have accumulated in the compiler.
This PR introduces a new lint `proc_macro_back_compat`, which results in
a future-incompat-report entry being generated. All proc-macro
back-compat warnings will be grouped under this lint. Note that this
lint will never actually become a hard error - instead, we will remove
the special cases for various macros, which will cause older versions of
those crates to emit some other error.
I've added code to fire this lint for the `time-macros-impl` case. This
is the easiest case out of all of our current back-compat hacks - the
crate was renamed to `time-macros`, so seeing a filename with
`time-macros-impl` guarantees that an older version of the parent `time`
crate is in use.
When Cargo's future-incompat-report feature gets stabilized, affected
users will start to see future-incompat warnings when they build their
crates.
Change x64 size checks to not apply to x32.
Rust contains various size checks conditional on target_arch = "x86_64", but these checks were never intended to apply to x86_64-unknown-linux-gnux32. Add target_pointer_width = "64" to the conditions.
Implement built-in attribute macro `#[cfg_eval]` + some refactoring
This PR implements a built-in attribute macro `#[cfg_eval]` as it was suggested in https://github.com/rust-lang/rust/pull/79078 to avoid `#[derive()]` without arguments being abused as a way to configure input for other attributes.
The macro is used for eagerly expanding all `#[cfg]` and `#[cfg_attr]` attributes in its input ("fully configuring" the input).
The effect is identical to effect of `#[derive(Foo, Bar)]` which also fully configures its input before passing it to macros `Foo` and `Bar`, but unlike `#[derive]` `#[cfg_eval]` can be applied to any syntax nodes supporting macro attributes, not only certain items.
`cfg_eval` was the first name suggested in https://github.com/rust-lang/rust/pull/79078, but other alternatives are also possible, e.g. `cfg_expand`.
```rust
#[cfg_eval]
#[my_attr] // Receives `struct S {}` as input, the field is configured away by `#[cfg_eval]`
struct S {
#[cfg(FALSE)]
field: u8,
}
```
Tracking issue: https://github.com/rust-lang/rust/issues/82679
Let a portion of DefPathHash uniquely identify the DefPath's crate.
This allows to directly map from a `DefPathHash` to the crate it originates from, without constructing side tables to do that mapping -- something that is useful for incremental compilation where we deal with `DefPathHash` instead of `DefId` a lot.
It also allows to reliably and cheaply check for `DefPathHash` collisions which allows the compiler to gracefully abort compilation instead of running into a subsequent ICE at some random place in the code.
The following new piece of documentation describes the most interesting aspects of the changes:
```rust
/// A `DefPathHash` is a fixed-size representation of a `DefPath` that is
/// stable across crate and compilation session boundaries. It consists of two
/// separate 64-bit hashes. The first uniquely identifies the crate this
/// `DefPathHash` originates from (see [StableCrateId]), and the second
/// uniquely identifies the corresponding `DefPath` within that crate. Together
/// they form a unique identifier within an entire crate graph.
///
/// There is a very small chance of hash collisions, which would mean that two
/// different `DefPath`s map to the same `DefPathHash`. Proceeding compilation
/// with such a hash collision would very probably lead to an ICE and, in the
/// worst case, to a silent mis-compilation. The compiler therefore actively
/// and exhaustively checks for such hash collisions and aborts compilation if
/// it finds one.
///
/// `DefPathHash` uses 64-bit hashes for both the crate-id part and the
/// crate-internal part, even though it is likely that there are many more
/// `LocalDefId`s in a single crate than there are individual crates in a crate
/// graph. Since we use the same number of bits in both cases, the collision
/// probability for the crate-local part will be quite a bit higher (though
/// still very small).
///
/// This imbalance is not by accident: A hash collision in the
/// crate-local part of a `DefPathHash` will be detected and reported while
/// compiling the crate in question. Such a collision does not depend on
/// outside factors and can be easily fixed by the crate maintainer (e.g. by
/// renaming the item in question or by bumping the crate version in a harmless
/// way).
///
/// A collision between crate-id hashes on the other hand is harder to fix
/// because it depends on the set of crates in the entire crate graph of a
/// compilation session. Again, using the same crate with a different version
/// number would fix the issue with a high probability -- but that might be
/// easier said then done if the crates in questions are dependencies of
/// third-party crates.
///
/// That being said, given a high quality hash function, the collision
/// probabilities in question are very small. For example, for a big crate like
/// `rustc_middle` (with ~50000 `LocalDefId`s as of the time of writing) there
/// is a probability of roughly 1 in 14,750,000,000 of a crate-internal
/// collision occurring. For a big crate graph with 1000 crates in it, there is
/// a probability of 1 in 36,890,000,000,000 of a `StableCrateId` collision.
```
Given the probabilities involved I hope that no one will ever actually see the error messages. Nonetheless, I'd be glad about some feedback on how to improve them. Should we create a GH issue describing the problem and possible solutions to point to? Or a page in the rustc book?
r? `@pnkfelix` (feel free to re-assign)
Rust contains various size checks conditional on target_arch = "x86_64",
but these checks were never intended to apply to
x86_64-unknown-linux-gnux32. Add target_pointer_width = "64" to the
conditions.
- Rename `broken_intra_doc_links` to `rustdoc::broken_intra_doc_links`
- Ensure that the old lint names still work and give deprecation errors
- Register lints even when running doctests
Otherwise, all `rustdoc::` lints would be ignored.
- Register all existing lints as removed
This unfortunately doesn't work with `register_renamed` because tool
lints have not yet been registered when rustc is running. For similar
reasons, `check_backwards_compat` doesn't work either. Call
`register_removed` directly instead.
- Fix fallout
+ Rustdoc lints for compiler/
+ Rustdoc lints for library/
Note that this does *not* suggest `rustdoc::broken_intra_doc_links` for
`rustdoc::intra_doc_link_resolution_failure`, since there was no time
when the latter was valid.
When token-based attribute handling is implemeneted in #80689,
we will need to access tokens from `HasAttrs` (to perform
cfg-stripping), and we will to access attributes from `HasTokens` (to
construct a `PreexpTokenStream`).
This PR merges the `HasAttrs` and `HasTokens` traits into a new
`AstLike` trait. The previous `HasAttrs` impls from `Vec<Attribute>` and `AttrVec`
are removed - they aren't attribute targets, so the impls never really
made sense.
Crate root is sufficiently different from `mod` items, at least at syntactic level.
Also remove customization point for "`mod` item or crate root" from AST visitors.