A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
The change was "Show invisible delimiters (within comments) when pretty
printing". It's useful to show these delimiters, but is a breaking
change for some proc macros.
Fixes#97608.
Begin fixing all the broken doctests in `compiler/`
Begins to fix#95994.
All of them pass now but 24 of them I've marked with `ignore HELP (<explanation>)` (asking for help) as I'm unsure how to get them to work / if we should leave them as they are.
There are also a few that I marked `ignore` that could maybe be made to work but seem less important.
Each `ignore` has a rough "reason" for ignoring after it parentheses, with
- `(pseudo-rust)` meaning "mostly rust-like but contains foreign syntax"
- `(illustrative)` a somewhat catchall for either a fragment of rust that doesn't stand on its own (like a lone type), or abbreviated rust with ellipses and undeclared types that would get too cluttered if made compile-worthy.
- `(not-rust)` stuff that isn't rust but benefits from the syntax highlighting, like MIR.
- `(internal)` uses `rustc_*` code which would be difficult to make work with the testing setup.
Those reason notes are a bit inconsistently applied and messy though. If that's important I can go through them again and try a more principled approach. When I run `rg '```ignore \(' .` on the repo, there look to be lots of different conventions other people have used for this sort of thing. I could try unifying them all if that would be helpful.
I'm not sure if there was a better existing way to do this but I wrote my own script to help me run all the doctests and wade through the output. If that would be useful to anyone else, I put it here: https://github.com/Elliot-Roberts/rust_doctest_fixing_tool
Overhaul `MacArgs`
Motivation:
- Clarify some code that I found hard to understand.
- Eliminate one use of three places where `TokenKind::Interpolated` values are created.
r? `@petrochenkov`
The value in `MacArgs::Eq` is currently represented as a `Token`.
Because of `TokenKind::Interpolated`, `Token` can be either a token or
an arbitrary AST fragment. In practice, a `MacArgs::Eq` starts out as a
literal or macro call AST fragment, and then is later lowered to a
literal token. But this is very non-obvious. `Token` is a much more
general type than what is needed.
This commit restricts things, by introducing a new type `MacArgsEqKind`
that is either an AST expression (pre-lowering) or an AST literal
(post-lowering). The downside is that the code is a bit more verbose in
a few places. The benefit is that makes it much clearer what the
possibilities are (though also shorter in some other places). Also, it
removes one use of `TokenKind::Interpolated`, taking us a step closer to
removing that variant, which will let us make `Token` impl `Copy` and
remove many "handle Interpolated" code paths in the parser.
Things to note:
- Error messages have improved. Messages like this:
```
unexpected token: `"bug" + "found"`
```
now say "unexpected expression", which makes more sense. Although
arbitrary expressions can exist within tokens thanks to
`TokenKind::Interpolated`, that's not obvious to anyone who doesn't
know compiler internals.
- In `parse_mac_args_common`, we no longer need to collect tokens for
the value expression.
Using an obviously-placeholder syntax. An RFC would still be needed before this could have any chance at stabilization, and it might be removed at any point.
But I'd really like to have it in nightly at least to ensure it works well with try_trait_v2, especially as we refactor the traits.
This commit rearranges the `match`. The new code avoids testing for
`MacArgs::Eq` twice, at the cost of repeating the `self.print_path()`
call. I think this is worthwhile because it puts the `match` in a more
standard and readable form.
Parse inner attributes on inline const block
According to https://github.com/rust-lang/rust/pull/84414#issuecomment-826150936, inner attributes are intended to be supported *"in all containers for statements (or some subset of statements)"*.
This PR adds inner attribute parsing and pretty-printing for inline const blocks (https://github.com/rust-lang/rust/issues/76001), which contain statements just like an unsafe block or a loop body.
```rust
let _ = const {
#![allow(...)]
let x = ();
x
};
```
It's only needed for macro expansion, not as a general element in the
AST. This commit removes it, adds `NtOrTt` for the parser and macro
expansion cases, and renames the variants in `NamedMatch` to better
match the new type.
Factor convenience functions out of main printer implementation
The pretty printer in rustc_ast_pretty has a section of methods commented "Convenience functions to talk to the printer". This PR pulls those out to a separate module. This leaves pp.rs with only the minimal API that is core to the pretty printing algorithm.
I found this separation to be helpful in https://github.com/dtolnay/prettyplease because it makes clear when changes are adding some fundamental new capability to the pretty printer algorithm vs just making it more convenient to call some already existing functionality.
The `print_expr` method already places an `ibox(INDENT_UNIT)` around
every expr that gets printed. Some exprs were then using `self.head`
inside of that, which does its own `cbox(INDENT_UNIT)`, resulting in two
levels of indentation:
while true {
stuff;
}
This commit fixes those cases to produce the expected single level of
indentation within every expression containing a block.
while true {
stuff;
}
Previously the pretty printer would compute indentation always relative
to whatever column a block begins at, like this:
fn demo(arg1: usize,
arg2: usize);
This is never the thing to do in the dominant contemporary Rust style.
Rustfmt's default and the style used by the vast majority of Rust
codebases is block indentation:
fn demo(
arg1: usize,
arg2: usize,
);
where every indentation level is a multiple of 4 spaces and each level
is indented relative to the indentation of the previous line, not the
position that the block starts in.
Render more readable macro matcher tokens in rustdoc
Follow-up to #92334.
This PR lifts some of the token rendering logic from https://github.com/dtolnay/prettyplease into rustdoc so that even the matchers for which a source code snippet is not available (because they are macro-generated, or any other reason) follow some baseline good assumptions about where the tokens in the macro matcher are appropriate to space.
The below screenshots show an example of the difference using one of the gnarliest macros I could find. Some things to notice:
- In the **before**, notice how a couple places break in between `$(....)`↵`*`, which is just about the worst possible place that it could break.
- In the **before**, the lines that wrapped are weirdly indented by 1 space of indentation relative to column 0. In the **after**, we use the typical way of block indenting in Rust syntax which is put the open/close delimiters on their own line and indent their contents by 4 spaces relative to the previous line (so 8 spaces relative to column 0, because the matcher itself is indented by 4 relative to the `macro_rules` header).
- In the **after**, macro_rules metavariables like `$tokens:tt` are kept together, which is how just about everybody writing Rust today writes them.
## Before
![Screenshot from 2022-01-14 13-05-53](https://user-images.githubusercontent.com/1940490/149585105-1f182b78-751f-421f-a234-9dbc04fa3bbd.png)
## After
![Screenshot from 2022-01-14 13-06-04](https://user-images.githubusercontent.com/1940490/149585118-d4b52ea7-3e67-4b6e-a12b-31dfb8172f86.png)
r? `@camelid`
PrintStackElems with pbreak=PrintStackBreak::Fits always carried a
meaningless value offset=0. We can combine the two types PrintStackElem
+ PrintStackBreak into one PrintFrame enum that stores offset only for
Broken frames.