Remove unused dependencies from compiler crates
Various compiler crates have dependencies that they don't appear to use. I used some scripting to detect such dependencies, filtered them based on some manual review, and removed those that do indeed appear to be entirely unused.
Use HTTPS links where possible
While looking at #86583, I wondered how many other (insecure) HTTP links were in `rustc`. This changes most other `http` links to `https`. While most of the links are in comments or documentation, there are a few other HTTP links that are used by CI that are changed to HTTPS.
Notes:
- I didn't change any to or in licences
- Some links don't support HTTPS :(
- Some `http` links were dead, in those cases I upgraded them to their new places (all of which used HTTPS)
Re-add support for parsing (and pretty-printing) inner-attributes in match body
Re-add support for parsing (and pretty-printing) inner-attributes within body of a `match`.
In other words, we can do `match EXPR { #![inner_attr] ARM_1 ARM_2 ... }` again.
I believe this unbreaks the only four crates that crater flagged as broken by PR #83312.
(I am putting this up so that the lang-team can check it out and decide whether it changes their mind about what to do regarding PR #83312.)
Fix two ICEs in the parser
This pull request fixes#84104 and fixes#84148. The latter is caused by an invalid `assert_ne!()` in the parser, which I have simply removed because the error is then caught in another part of the parser.
#84104 is somewhat more subtle and has to do with a suggestion to remove extraneous `<` characters; for instance:
```rust
fn main() {
foo::<Ty<<<i32>();
}
```
currently leads to
```
error: unmatched angle brackets
--> unmatched-langle.rs:2:10
|
2 | foo::<Ty<<<i32>();
| ^^^ help: remove extra angle brackets
```
which is obviously wrong and stems from the fact that the code for issuing the above suggestion does not consider the possibility that there might be other tokens in between the opening angle brackets. In #84104, this has led to a span being generated that ends in the middle of a multi-byte character (because the code issuing the suggestion thought that it was only skipping over `<`, which are single-byte), causing an ICE.
Remove unused feature gates
The first commit removes a usage of a feature gate, but I don't expect it to be controversial as the feature gate was only used to workaround a limitation of rust in the past. (closures never being `Clone`)
The second commit uses `#[allow_internal_unstable]` to avoid leaking the `trusted_step` feature gate usage from inside the index newtype macro. It didn't work for the `min_specialization` feature gate though.
The third commit removes (almost) all feature gates from the compiler that weren't used anyway.
# Stabilization report
## Summary
This stabilizes using macro expansion in key-value attributes, like so:
```rust
#[doc = include_str!("my_doc.md")]
struct S;
#[path = concat!(env!("OUT_DIR"), "/generated.rs")]
mod m;
```
See the changes to the reference for details on what macros are allowed;
see Petrochenkov's excellent blog post [on internals](https://internals.rust-lang.org/t/macro-expansion-points-in-attributes/11455)
for alternatives that were considered and rejected ("why accept no more
and no less?")
This has been available on nightly since 1.50 with no major issues.
## Notes
### Accepted syntax
The parser accepts arbitrary Rust expressions in this position, but any expression other than a macro invocation will ultimately lead to an error because it is not expected by the built-in expression forms (e.g., `#[doc]`). Note that decorators and the like may be able to observe other expression forms.
### Expansion ordering
Expansion of macro expressions in "inert" attributes occurs after decorators have executed, analogously to macro expressions appearing in the function body or other parts of decorator input.
There is currently no way for decorators to accept macros in key-value position if macro expansion must be performed before the decorator executes (if the macro can simply be copied into the output for later expansion, that can work).
## Test cases
- https://github.com/rust-lang/rust/blob/master/src/test/ui/attributes/key-value-expansion-on-mac.rs
- https://github.com/rust-lang/rust/blob/master/src/test/rustdoc/external-doc.rs
The feature has also been dogfooded extensively in the compiler and
standard library:
- https://github.com/rust-lang/rust/pull/83329
- https://github.com/rust-lang/rust/pull/83230
- https://github.com/rust-lang/rust/pull/82641
- https://github.com/rust-lang/rust/pull/80534
## Implementation history
- Initial proposal: https://github.com/rust-lang/rust/issues/55414#issuecomment-554005412
- Experiment to see how much code it would break: https://github.com/rust-lang/rust/pull/67121
- Preliminary work to restrict expansion that would conflict with this
feature: https://github.com/rust-lang/rust/pull/77271
- Initial implementation: https://github.com/rust-lang/rust/pull/78837
- Fix for an ICE: https://github.com/rust-lang/rust/pull/80563
## Unresolved Questions
~~https://github.com/rust-lang/rust/pull/83366#issuecomment-805180738 listed some concerns, but they have been resolved as of this final report.~~
## Additional Information
There are two workarounds that have a similar effect for `#[doc]`
attributes on nightly. One is to emulate this behavior by using a limited version of this feature that was stabilized for historical reasons:
```rust
macro_rules! forward_inner_docs {
($e:expr => $i:item) => {
#[doc = $e]
$i
};
}
forward_inner_docs!(include_str!("lib.rs") => struct S {});
```
This also works for other attributes (like `#[path = concat!(...)]`).
The other is to use `doc(include)`:
```rust
#![feature(external_doc)]
#[doc(include = "lib.rs")]
struct S {}
```
The first works, but is non-trivial for people to discover, and
difficult to read and maintain. The second is a strange special-case for
a particular use of the macro. This generalizes it to work for any use
case, not just including files.
I plan to remove `doc(include)` when this is stabilized. The
`forward_inner_docs` workaround will still compile without warnings, but
I expect it to be used less once it's no longer necessary.
Recover from invalid `struct` item syntax
Parse unsupported "default field const values":
```rust
struct S {
field: Type = const_val,
}
```
Recover from small `:` typo and provide suggestion:
```rust
struct S {
field; Type,
field2= Type,
}
```
Fix `--remap-path-prefix` not correctly remapping `rust-src` component paths and unify handling of path mapping with virtualized paths
This PR fixes#73167 ("Binaries end up containing path to the rust-src component despite `--remap-path-prefix`") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.
`RealFileName::Named` introduced in #72767 is now renamed as `LocalPath`, because this variant wraps a (most likely) valid local filesystem path.
`RealFileName::Devirtualized` is renamed as `Remapped` to be used for remapped path from a real path via `--remap-path-prefix` argument, as well as real path inferred from a virtualized (during compiler bootstrapping) `/rustc/...` path. The `local_path` field is now an `Option<PathBuf>`, as it will be set to `None` before serialisation, so it never reaches any build output. Attempting to serialise a non-`None` `local_path` will cause an assertion faliure.
When a path is remapped, a `RealFileName::Remapped` variant is created. The original path is preserved in `local_path` field and the remapped path is saved in `virtual_name` field. Previously, the `local_path` is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".
`rustc_span::SourceFile`'s fields `unmapped_path` (introduced by #44940) and `name_was_remapped` (introduced by #41508 when `--remap-path-prefix` feature originally added) are removed, as these two pieces of information can be inferred from the `name` field: if it's anything other than a `FileName::Real(_)`, or if it is a `FileName::Real(RealFileName::LocalPath(_))`, then clearly `name_was_remapped` would've been false and `unmapped_path` would've been `None`. If it is a `FileName::Real(RealFileName::Remapped{local_path, virtual_name})`, then `name_was_remapped` would've been true and `unmapped_path` would've been `Some(local_path)`.
cc `@eddyb` who implemented `/rustc/...` path devirtualisation
Parse unsupported "default field const values":
```rust
struct S {
field: Type = const_val,
}
```
Recover from small `:` typo and provide suggestion:
```rust
struct S {
field; Type,
field2= Type,
}
```
In other words, we can do `match EXPR { #![inner_attr] ARM_1 ARM_2 ... }` again.
I believe this unbreaks the only four crates that crater flagged as broken by PR 83312.
(I am putting this up so that the lang-team can check it out and decide whether
it changes their mind about what to do regarding PR 83312.)
Improve diagnostics for functions in `struct` definitions
Tries to implement #76421.
This is probably going to need unit tests, but I wanted to hear from review all the cases tests should cover.
I'd like to follow up with the "mechanically applicable suggestion here that adds an impl block" step, but I'd need guidance. My idea for now would be to try to parse a function, and if that succeeds, create a dummy `ast::Item` impl block to then format it using `pprust`. Would that be a viable approach? Is there a better alternative?
r? `@matklad` cc `@estebank`
This PR modifies the macro expansion infrastructure to handle attributes
in a fully token-based manner. As a result:
* Derives macros no longer lose spans when their input is modified
by eager cfg-expansion. This is accomplished by performing eager
cfg-expansion on the token stream that we pass to the derive
proc-macro
* Inner attributes now preserve spans in all cases, including when we
have multiple inner attributes in a row.
This is accomplished through the following changes:
* New structs `AttrAnnotatedTokenStream` and `AttrAnnotatedTokenTree` are introduced.
These are very similar to a normal `TokenTree`, but they also track
the position of attributes and attribute targets within the stream.
They are built when we collect tokens during parsing.
An `AttrAnnotatedTokenStream` is converted to a regular `TokenStream` when
we invoke a macro.
* Token capturing and `LazyTokenStream` are modified to work with
`AttrAnnotatedTokenStream`. A new `ReplaceRange` type is introduced, which
is created during the parsing of a nested AST node to make the 'outer'
AST node aware of the attributes and attribute target stored deeper in the token stream.
* When we need to perform eager cfg-expansion (either due to `#[derive]` or `#[cfg_eval]`),
we tokenize and reparse our target, capturing additional information about the locations of
`#[cfg]` and `#[cfg_attr]` attributes at any depth within the target.
This is a performance optimization, allowing us to perform less work
in the typical case where captured tokens never have eager cfg-expansion run.
Avoid `;` -> `,` recovery and unclosed `}` recovery from being too verbose
Those two recovery attempts have a very bad interaction that causes too
unnecessary output. Add a simple gate to avoid interpreting a `;` as a
`,` when there are unclosed braces.
Fix#83498.
Those two recovery attempts have a very bad interaction that causes too
unnecessary output. Add a simple gate to avoid interpreting a `;` as a
`,` when there are unclosed braces.
Previously, we would silently remove any `None`-delimiters when
capturing a `TokenStream`, 'flattenting' them to their inner tokens.
This was not normally visible, since we usually have
`TokenKind::Interpolated` (which gets converted to a `None`-delimited
group during macro invocation) instead of an actual `None`-delimited
group.
However, there are a couple of cases where this becomes visible to
proc-macros:
1. A cross-crate `macro_rules!` macro has a `None`-delimited group
stored in its body (as a result of being produced by another
`macro_rules!` macro). The cross-crate `macro_rules!` invocation
can then expand to an attribute macro invocation, which needs
to be able to see the `None`-delimited group.
2. A proc-macro can invoke an attribute proc-macro with its re-collected
input. If there are any nonterminals present in the input, they will
get re-collected to `None`-delimited groups, which will then get
captured as part of the attribute macro invocation.
Both of these cases are incredibly obscure, so there hopefully won't be
any breakage. This change will allow more agressive 'flattenting' of
nonterminals in #82608 without losing `None`-delimited groups.
ast/hir: Rename field-related structures
I always forget what `ast::Field` and `ast::StructField` mean despite working with AST for long time, so this PR changes the naming to less confusing and more consistent.
- `StructField` -> `FieldDef` ("field definition")
- `Field` -> `ExprField` ("expression field", not "field expression")
- `FieldPat` -> `PatField` ("pattern field", not "field pattern")
Various visiting and other methods working with the fields are renamed correspondingly too.
The second commit reduces the size of `ExprKind` by boxing fields of `ExprKind::Struct` in preparation for https://github.com/rust-lang/rust/pull/80080.
StructField -> FieldDef ("field definition")
Field -> ExprField ("expression field", not "field expression")
FieldPat -> PatField ("pattern field", not "field pattern")
Also rename visiting and other methods working on them.
Avoid sorting predicates by `DefId`
Fixes issue #82920
Even if an item does not change between compilation sessions, it may end
up with a different `DefId`, since inserting/deleting an item affects
the `DefId`s of all subsequent items. Therefore, we use a `DefPathHash`
in the incremental compilation system, which is stable in the face of
changes to unrelated items.
In particular, the query system will consider the inputs to a query to
be unchanged if any `DefId`s in the inputs have their `DefPathHash`es
unchanged. Queries are pure functions, so the query result should be
unchanged if the query inputs are unchanged.
Unfortunately, it's possible to inadvertantly make a query result
incorrectly change across compilations, by relying on the specific value
of a `DefId`. Specifically, if the query result is a slice that gets
sorted by `DefId`, the precise order will depend on how the `DefId`s got
assigned in a particular compilation session. If some definitions end up
with different `DefId`s (but the same `DefPathHash`es) in a subsequent
compilation session, we will end up re-computing a *different* value for
the query, even though the query system expects the result to unchanged
due to the unchanged inputs.
It turns out that we have been sorting the predicates computed during
`astconv` by their `DefId`. These predicates make their way into the
`super_predicates_that_define_assoc_type`, which ends up getting used to
compute the vtables of trait objects. This, re-ordering these predicates
between compilation sessions can lead to undefined behavior at runtime -
the query system will re-use code built with a *differently ordered*
vtable, resulting in the wrong method being invoked at runtime.
This PR avoids sorting by `DefId` in `astconv`, fixing the
miscompilation. However, it's possible that other instances of this
issue exist - they could also be easily introduced in the future.
To fully fix this issue, we should
1. Turn on `-Z incremental-verify-ich` by default. This will cause the
compiler to ICE whenver an 'unchanged' query result changes between
compilation sessions, instead of causing a miscompilation.
2. Remove the `Ord` impls for `CrateNum` and `DefId`. This will make it
difficult to introduce ICEs in the first place.
or-patterns: disallow in `let` bindings
~~Blocked on https://github.com/rust-lang/rust/pull/81869~~
Disallows top-level or-patterns before type ascription. We want to reserve this syntactic space for possible future generalized type ascription.
r? ``@petrochenkov``
Suggest character encoding is incorrect when encountering random null bytes
This adds a note whenever null bytes are seen at the start of a token unexpectedly, since those tend to come from UTF-16 encoded files without a [BOM](https://en.wikipedia.org/wiki/Byte_order_mark) (if a UTF-16 BOM appears it won't be valid UTF-8, but if there is no BOM it be both valid UTF-16 and valid but garbled UTF-8). This approach was suggested in https://github.com/rust-lang/rust/issues/73979#issuecomment-653976451.
Closes#73979.
This adds recovery when in array type syntax user writes
[X; Y<Z, ...>]
instead of
[X; Y::<Z, ...>]
Fixes#82566
Note that whenever we parse an expression and know that the next token
cannot be `,`, we should be calling
check_mistyped_turbofish_with_multiple_type_params for this recovery.
Previously we only did this for statement parsing (e.g. `let x = f<a,
b>;`). We now also do it when parsing the length field in array type
syntax.
check_mistyped_turbofish_with_multiple_type_params was previously
expecting type arguments between angle brackets, which is not right, as
we can also see const expressions. We now use generic argument parser
instead of type parser.
Test with one, two, and three generic arguments added to check
consistentcy between
1. check_no_chained_comparison: Called after parsing a nested binop
application like `x < A > ...` where angle brackets are interpreted as
binary operators and `A` is an expression.
2. check_mistyped_turbofish_with_multiple_type_params: called by
`parse_full_stmt` when we expect to see a semicolon after parsing an
expression but don't see it.
(In `T2<1, 2>::C;`, the expression is `T2 < 1`)
When token-based attribute handling is implemeneted in #80689,
we will need to access tokens from `HasAttrs` (to perform
cfg-stripping), and we will to access attributes from `HasTokens` (to
construct a `PreexpTokenStream`).
This PR merges the `HasAttrs` and `HasTokens` traits into a new
`AstLike` trait. The previous `HasAttrs` impls from `Vec<Attribute>` and `AttrVec`
are removed - they aren't attribute targets, so the impls never really
made sense.
Improve suggestion for tuple struct pattern matching errors.
Closes#80174
This change allows numbers to be parsed as field names when pattern matching on structs, which allows us to provide better error messages when tuple structs are matched using a struct pattern.
r? ``@estebank``
ast: Keep expansion status for out-of-line module items
I.e. whether a module `mod foo;` is already loaded from a file or not.
This is a pre-requisite to correctly treating inner attributes on such modules (https://github.com/rust-lang/rust/issues/81661).
With this change AST structures for `mod` items diverge even more for AST structure for the crate root, which previously used `ast::Mod`.
Therefore this PR removes `ast::Mod` from `ast::Crate` in the first commit, these two things are sufficiently different from each other, at least at syntactic level.
Customization points for visiting a "`mod` item or crate root" were also removed from AST visitors (`fn visit_mod`).
`ast::Mod` itself was refactored away in the second commit in favor of `ItemKind::Mod(Unsafe, ModKind)`.
Crate root is sufficiently different from `mod` items, at least at syntactic level.
Also remove customization point for "`mod` item or crate root" from AST visitors.
Along the way, we also implement a handful of diagnostics improvements
and fixes, particularly with respect to the special handling of `||` in
place of `|` and when there are leading verts in function params, which
don't allow top-level or-patterns anyway.
Rollup of 11 pull requests
Successful merges:
- #80523 (#[doc(inline)] sym_generated)
- #80920 (Visit more targets when validating attributes)
- #81720 (Updated smallvec version due to RUSTSEC-2021-0003)
- #81891 ([rustdoc-json] Make `header` a vec of modifiers, and FunctionPointer consistent)
- #81912 (Implement the precise analysis pass for lint `disjoint_capture_drop_reorder`)
- #81914 (Fixing bad suggestion for `_` in `const` type when a function #81885)
- #81919 (BTreeMap: fix internal comments)
- #81927 (Add a regression test for #32498)
- #81965 (Fix MIR pretty printer for non-local DefIds)
- #82029 (Use debug log level for developer oriented logs)
- #82056 (fix ice (#82032))
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
This is a pure refactoring split out from #80689.
It represents the most invasive part of that PR, requiring changes in
every caller of `parse_outer_attributes`
In order to eagerly expand `#[cfg]` attributes while preserving the
original `TokenStream`, we need to know the range of tokens that
corresponds to every attribute target. This is accomplished by making
`parse_outer_attributes` return an opaque `AttrWrapper` struct. An
`AttrWrapper` must be converted to a plain `AttrVec` by passing it to
`collect_tokens_trailing_token`. This makes it difficult to accidentally
construct an AST node with attributes without calling `collect_tokens_trailing_token`,
since AST nodes store an `AttrVec`, not an `AttrWrapper`.
As a result, we now call `collect_tokens_trailing_token` for attribute
targets which only support inert attributes, such as generic arguments
and struct fields. Currently, the constructed `LazyTokenStream` is
simply discarded. Future PRs will record the token range corresponding
to the attribute target, allowing those tokens to be removed from an
enclosing `collect_tokens_trailing_token` call if necessary.
The panic happens when in recovery parsing a full `impl`
(`parse_item_impl`) fails and we drop the `DiagnosticBuilder` for the
recovery suggestion and return the `parse_item_impl` error.
We now raise the original error "expected identifier found `impl`" when
parsing the `impl` fails.
Note that the regression test is slightly simplified version of the
original repro in #81806, to make the error output smaller and more
resilient to unrelated changes in parser error messages.
Fixes#81806
Box the biggest ast::ItemKind variants
This PR is a different approach on https://github.com/rust-lang/rust/pull/81400, aiming to save memory in humongous ASTs.
The three affected item kind enums are:
- `ast::ItemKind` (208 -> 112 bytes)
- `ast::AssocItemKind` (176 -> 72 bytes)
- `ast::ForeignItemKind` (176 -> 72 bytes)
Improve handling of spans around macro result parse errors
Fixes#81543
After we expand a macro, we try to parse the resulting tokens as a AST
node. This commit makes several improvements to how we handle spans when
an error occurs:
* Only ovewrite the original `Span` if it's a dummy span. This preserves
a more-specific span if one is available.
* Use `self.prev_token` instead of `self.token` when emitting an error
message after encountering EOF, since an EOF token always has a dummy
span
* Make `SourceMap::next_point` leave dummy spans unused. A dummy span
does not have a logical 'next point', since it's a zero-length span.
Re-using the span span preserves its 'dummy-ness' for other checks
Fixes#81543
After we expand a macro, we try to parse the resulting tokens as a AST
node. This commit makes several improvements to how we handle spans when
an error occurs:
* Only ovewrite the original `Span` if it's a dummy span. This preserves
a more-specific span if one is available.
* Use `self.prev_token` instead of `self.token` when emitting an error
message after encountering EOF, since an EOF token always has a dummy
span
* Make `SourceMap::next_point` leave dummy spans unused. A dummy span
does not have a logical 'next point', since it's a zero-length span.
Re-using the span span preserves its 'dummy-ness' for other checks
Clone entire `TokenCursor` when collecting tokens
Reverts PR #80830Fixestaiki-e/pin-project#312
We can have an arbitrary number of `None`-delimited group frames pushed
on the stack due to proc-macro invocations, which can legally be exited.
Attempting to account for this would add a lot of complexity for a tiny
performance gain, so let's just use the original strategy.
Reverts PR #80830Fixestaiki-e/pin-project#312
We can have an arbitrary number of `None`-delimited group frames pushed
on the stack due to proc-macro invocations, which can legally be exited.
Attempting to account for this would add a lot of complexity for a tiny
performance gain, so let's just use the original strategy.
Improve diagnostics when parsing angle args
https://github.com/rust-lang/rust/pull/79266 introduced parsing of generic arguments in associated type constraints, this however resulted in possibly very confusing error messages in cases in which closing angle brackets were missing such as in `Vec<(u32, _, _) = vec![]`, which outputs an incorrectly parsed equality constraint error, as noted by `@cynecx.`
This PR tries to provide better error messages in such cases.
r? `@petrochenkov`
Refactor token collection to capture trailing token immediately
Split out from https://github.com/rust-lang/rust/pull/80689 - when we start capturing more information about attribute targets, we'll need to know in advance if we're capturing a trailing token or not.
r? `@ghost`
Currently, when a user uses a struct pattern to pattern match on
a tuple struct, the errors we emit generally suggest adding fields
using their field names, which are numbers. However, numbers are
not valid identifiers, so the suggestions, which use the shorthand
notation, are not valid syntax. This commit changes those errors
to suggest using the actual tuple struct pattern syntax instead,
which is a more actionable suggestion.
Fixes#81007
Previously, we would fail to collect tokens in the proper place when
only builtin attributes were present. As a result, we would end up with
attribute tokens in the collected `TokenStream`, leading to duplication
when we attempted to prepend the attributes from the AST node.
We now explicitly track when token collection must be performed due to
nomterminal parsing.
Set tokens on AST node in `collect_tokens`
A new `HasTokens` trait is introduced, which is used to move logic from
the callers of `collect_tokens` into the body of `collect_tokens`.
In addition to reducing duplication, this paves the way for PR #80689,
which needs to perform additional logic during token collection.
A new `HasTokens` trait is introduced, which is used to move logic from
the callers of `collect_tokens` into the body of `collect_tokens`.
In addition to reducing duplication, this paves the way for PR #80689,
which needs to perform additional logic during token collection.
Rework diagnostics for wrong number of generic args (fixes#66228 and #71924)
This PR reworks the `wrong number of {} arguments` message, so that it provides more details and contextual hints.
Suggest async {} for async || {}
Fixes#76011
This adds support for adding help diagnostics to the feature gating checks and
then uses it for the async_closure gate to add the extra bit of help
information as described in the issue.
We will never need to pop past our starting frame during token
capturing. Using an empty stack allows us to avoid pointless heap
allocations/deallocations.