Commit Graph

68 Commits

Author SHA1 Message Date
mojave2
f404990eb0
improve AttrTokenStream 2023-09-04 20:07:28 +08:00
Nicholas Nethercote
d72fc5ce44 Remove TokenTreeCursor::replace_prev_and_rewind.
It's no longer used.
2023-07-31 14:47:06 +10:00
Nicholas Nethercote
ff7d5ba65e Move doc comment desugaring out of TokenCursor.
`TokenCursor` currently does doc comment desugaring on the fly, if the
`desugar_doc_comment` field is set. This requires also modifying the
token stream on the fly with `replace_prev_and_rewind`.

This commit moves the doc comment desugaring out of `TokenCursor`, by
introducing a new `TokenStream::desugar_doc_comment` method. This
separation of desugaring and iterating makes the code nicer.
2023-07-31 14:47:03 +10:00
Nicholas Nethercote
4ebf2be8bb Remove Iterator impl for TokenTreeCursor.
This is surprising, but the new comment explains why. It's a logical
conclusion in the drive to avoid `TokenTree` clones.

`TokenTreeCursor` is now only used within `Parser`. It's still needed
due to `replace_prev_and_rewind`.
2023-07-27 11:59:03 +10:00
Nicholas Nethercote
55a732461d Make TokenTree::uninterpolate take &self and return a Cow.
Making it similar to `Token::uninterpolate`. This avoids some more token
tree cloning.
2023-07-27 11:58:42 +10:00
The 8472
7916a2c1c9 Use Lrc::make_mut instead of Lrc::get_mut 2023-07-03 13:40:25 +02:00
The 8472
14e57d7c13 perform TokenStream replacement in-place when possible in expand_macro 2023-07-03 13:29:15 +02:00
Caleb Cartwright
00c3f7552e refactor: add chunks method to TokenStream to obviate rustdoc clones 2023-05-13 16:59:28 -05:00
SparrowLii
089a38880b correct literals for dyn thread safe 2023-05-06 09:34:53 +08:00
SparrowLii
b9746ce039 introduce DynSend and DynSync auto trait 2023-05-06 09:34:18 +08:00
Nicholas Nethercote
a86fc727fa Rename Cursor/CursorRef as TokenTreeCursor/RefTokenTreeCursor.
This makes it clear they return token trees, and makes for a nice
comparison against `TokenCursor` which returns tokens.
2023-02-03 10:06:52 +11:00
Nicholas Nethercote
b23f272db0 Make clear that TokenTree::Token shouldn't contain a delimiter. 2023-02-03 10:06:52 +11:00
Nicholas Nethercote
af1d16e82d Improve doc comment desugaring.
Sometimes the parser needs to desugar a doc comment into `#[doc =
r"foo"]`. Currently it does this in a hacky way: by pushing a "fake" new
frame (one without a delimiter) onto the `TokenCursor` stack.

This commit changes things so that the token stream itself is modified
in place. The nice thing about this is that it means
`TokenCursorFrame::delim_sp` is now only `None` for the outermost frame.
2023-02-03 10:06:52 +11:00
nils
fd7a159710 Fix uninlined_format_args for some compiler crates
Convert all the crates that have had their diagnostic migration
completed (except save_analysis because that will be deleted soon and
apfloat because of the licensing problem).
2023-01-05 19:01:12 +01:00
KaDiWa
9bc69925cb
compiler: remove unnecessary imports and qualified paths 2022-12-10 18:45:34 +01:00
Maybe Waffle
f2b97a8bfe Remove useless borrows and derefs 2022-12-01 17:34:43 +00:00
Maybe Waffle
1d42936b18 Prefer doc comments over //-comments in compiler 2022-11-27 11:19:04 +00:00
Nilstrieb
7bfef19844 Use tidy-alphabetical in the compiler 2022-10-12 17:49:10 +05:30
Nicholas Nethercote
1e848a564b Remove TokenStreamBuilder.
`TokenStreamBuilder` exists to concatenate multiple `TokenStream`s
together. This commit removes it, and moves the concatenation
functionality directly into `TokenStream`, via two new methods
`push_tree` and `push_stream`. This makes things both simpler and
faster.

`push_tree` is particularly important. `TokenStreamBuilder` only had a
single `push` method, which pushed a stream. But in practice most of the
time we push a single token tree rather than a stream, and `push_tree`
avoids the need to build a token stream with a single entry (which
requires two allocations, one for the `Lrc` and one for the `Vec`).

The main `push_tree` use arises from a change to one of the `ToInternal`
impls in `proc_macro_server.rs`. It now returns a `SmallVec` instead of
a `TokenStream`. This return value is then iterated over by
`concat_trees`, which does `push_tree` on each element. Furthermore, the
use of `SmallVec` avoids more allocations, because there is always only
one or two token trees.

Note: the removed `TokenStreamBuilder::push` method had some code to
deal with a quadratic blowup case from #57735. This commit removes the
code. I tried and failed to reproduce the blowup from that PR, before
and after this change. Various other changes have happened to
`TokenStreamBuilder` in the meantime, so I suspect the original problem
is no longer relevant, though I don't have proof of this. Generally
speaking, repeatedly extending a `Vec` without pre-determining its
capacity is *not* quadratic. It's also incredibly common, within rustc
and many other Rust programs, so if there were performance problems
there you'd think it would show up in other places, too.
2022-10-05 12:42:54 +11:00
Nicholas Nethercote
bbb53bf772 Add comments to Spacing. 2022-10-03 11:42:21 +11:00
Nicholas Nethercote
5ab68a82d5 Group together more size assertions.
Also add a few more assertions for some relevant token-related types.

And fix an erroneous comment in `rustc_errors`.
2022-10-01 07:30:23 +10:00
Nicholas Nethercote
d2df07c425 Rename {Create,Lazy}TokenStream as {To,Lazy}AttrTokenStream.
`To` is better than `Create` for indicating that this is a non-consuming
conversion, rather than creating something out of nothing.

And the addition of `Attr` is because the current names makes them sound
like they relate to `TokenStream`, but really they relate to
`AttrTokenStream`.
2022-09-09 17:25:38 +10:00
Nicholas Nethercote
f6c9e1df59 Inline and remove TokenStream::opt_from_ast. 2022-09-09 16:53:17 +10:00
Nicholas Nethercote
81eaf877d6 Tweak some formatting. 2022-09-09 16:40:25 +10:00
Nicholas Nethercote
208ca93cda Change return type of Attribute::tokens.
The `AttrTokenStream` is always immediately turned into a `TokenStream`.
2022-09-09 16:25:31 +10:00
Nicholas Nethercote
a56d345490 Rename AttrAnnotatedToken{Stream,Tree}.
These two type names are long and have long matching prefixes. I find
them hard to read, especially in combinations like
`AttrAnnotatedTokenStream::new(vec![AttrAnnotatedTokenTree::Token(..)])`.

This commit renames them as `AttrToken{Stream,Tree}`.
2022-09-09 12:45:26 +10:00
Nicholas Nethercote
890e759ffc Move Spacing out of AttrAnnotatedTokenStream.
And into `AttrAnnotatedTokenTree::Token`.

PR #99887 did the same thing for `TokenStream`.
2022-09-09 12:00:46 +10:00
bors
eac6c33bc6 Auto merge of #100869 - nnethercote:replace-ThinVec, r=spastorino
Replace `rustc_data_structures::thin_vec::ThinVec` with `thin_vec::ThinVec`

`rustc_data_structures::thin_vec::ThinVec` looks like this:
```
pub struct ThinVec<T>(Option<Box<Vec<T>>>);
```
It's just a zero word if the vector is empty, but requires two
allocations if it is non-empty. So it's only usable in cases where the
vector is empty most of the time.

This commit removes it in favour of `thin_vec::ThinVec`, which is also
word-sized, but stores the length and capacity in the same allocation as
the elements. It's good in a wider variety of situation, e.g. in enum
variants where the vector is usually/always non-empty.

The commit also:
- Sorts some `Cargo.toml` dependency lists, to make additions easier.
- Sorts some `use` item lists, to make additions easier.
- Changes `clean_trait_ref_with_bindings` to take a
  `ThinVec<TypeBinding>` rather than a `&[TypeBinding]`, because this
  avoid some unnecessary allocations.

r? `@spastorino`
2022-09-01 08:01:06 +00:00
Donough Liu
97b1a6146c Use more into_iter rather than drain(..) 2022-08-30 04:42:03 +01:00
Nicholas Nethercote
b38106b6d8 Replace rustc_data_structures::thin_vec::ThinVec with thin_vec::ThinVec.
`rustc_data_structures::thin_vec::ThinVec` looks like this:
```
pub struct ThinVec<T>(Option<Box<Vec<T>>>);
```
It's just a zero word if the vector is empty, but requires two
allocations if it is non-empty. So it's only usable in cases where the
vector is empty most of the time.

This commit removes it in favour of `thin_vec::ThinVec`, which is also
word-sized, but stores the length and capacity in the same allocation as
the elements. It's good in a wider variety of situation, e.g. in enum
variants where the vector is usually/always non-empty.

The commit also:
- Sorts some `Cargo.toml` dependency lists, to make additions easier.
- Sorts some `use` item lists, to make additions easier.
- Changes `clean_trait_ref_with_bindings` to take a
  `ThinVec<TypeBinding>` rather than a `&[TypeBinding]`, because this
  avoid some unnecessary allocations.
2022-08-29 15:42:13 +10:00
Nicholas Nethercote
332dffb1f9 Remove TreeAndSpacing.
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.

This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.

The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`

These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.

This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.

These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
2022-07-29 15:52:15 +10:00
Nicholas Nethercote
2a5487afb4 Merge TokenStreamBuilder::push into TokenStreamBuilder::build.
Both functions do some modifying of streams using `make_mut`:
- `push` sometimes glues the first token of the next stream to the last
  token of the first stream.
- `build` appends tokens to the first stream.

By doing all of this in the one place, things are simpler. The first
stream can be modified in both ways (if necessary) in the one place, and
any next stream with the first token removed doesn't need to be stored.
2022-06-20 13:46:11 +10:00
Nicholas Nethercote
f6b57883e0 Remove TokenStream::from_streams.
By inlining it into the only non-test call site. The one test call site
is changed to use `TokenStreamBuilder`.
2022-06-20 09:33:08 +10:00
Nicholas Nethercote
178b746d04 Remove Cursor::index.
It's unused.
2022-06-20 09:27:58 +10:00
Nicholas Nethercote
ccd956aca6 Remove Cursor::append.
It's a weird function: it lets you modify the token stream in the middle
of iteration. There is only one call site, and it is only used for the
rare `ProceduralMasquerade` legacy case.
2022-06-20 09:19:10 +10:00
Nicholas Nethercote
1acbe7573d Use delayed error handling for Encodable and Encoder infallible.
There are two impls of the `Encoder` trait: `opaque::Encoder` and
`opaque::FileEncoder`. The former encodes into memory and is infallible, the
latter writes to file and is fallible.

Currently, standard `Result`/`?`/`unwrap` error handling is used, but this is a
bit verbose and has non-trivial cost, which is annoying given how rare failures
are (especially in the infallible `opaque::Encoder` case).

This commit changes how `Encoder` fallibility is handled. All the `emit_*`
methods are now infallible. `opaque::Encoder` requires no great changes for
this. `opaque::FileEncoder` now implements a delayed error handling strategy.
If a failure occurs, it records this via the `res` field, and all subsequent
encoding operations are skipped if `res` indicates an error has occurred. Once
encoding is complete, the new `finish` method is called, which returns a
`Result`. In other words, there is now a single `Result`-producing method
instead of many of them.

This has very little effect on how any file errors are reported if
`opaque::FileEncoder` has any failures.

Much of this commit is boring mechanical changes, removing `Result` return
values and `?` or `unwrap` from expressions. The more interesting parts are as
follows.
- serialize.rs: The `Encoder` trait gains an `Ok` associated type. The
  `into_inner` method is changed into `finish`, which returns
  `Result<Vec<u8>, !>`.
- opaque.rs: The `FileEncoder` adopts the delayed error handling
  strategy. Its `Ok` type is a `usize`, returning the number of bytes
  written, replacing previous uses of `FileEncoder::position`.
- Various methods that take an encoder now consume it, rather than being
  passed a mutable reference, e.g. `serialize_query_result_cache`.
2022-06-08 07:01:26 +10:00
Vadim Petrochenkov
8e8fb4f49e rustc_parse: Move AST -> TokenStream conversion logic to rustc_ast 2022-05-22 12:01:07 +03:00
klensy
cc5f3e21ac use CursorRef more, to not to clone Trees 2022-05-18 18:43:48 +03:00
Vadim Petrochenkov
2733ec1be3 rustc_ast: Harmonize delimiter naming with proc_macro::Delimiter 2022-04-28 10:04:29 +03:00
Nicholas Nethercote
643e9f707e Introduced Cursor::next_with_spacing_ref.
This lets us clone just the parts within a `TokenTree` that need
cloning, rather than the entire thing. This is a surprisingly large
performance win, up to 4% on `async-std-1.10.0`.
2022-04-21 13:49:40 +10:00
Nicholas Nethercote
5b653c1a43 Inline Cursor::next_with_spacing. 2022-04-20 12:43:25 +10:00
Nicholas Nethercote
aefbbeec34 Inline and remove TokenTree::{open_tt,close_tt}.
They both have a single call site.
2022-04-19 17:02:48 +10:00
Nicholas Nethercote
ad566b78f2 Tweak Cursor::next_with_spacing.
This makes it more like `CursorRef::next_with_spacing`. There is no
performance effect, just a consistency improvement.
2022-04-19 17:02:48 +10:00
Yuri Astrakhan
5160f8f843 Spellchecking compiler comments
This PR cleans up the rest of the spelling mistakes in the compiler comments. This PR does not change any literal or code spelling issues.
2022-03-30 15:14:15 -04:00
Caio
ef5601b321 2 - Make more use of let_chains
Continuation of #94376.

cc #53667
2022-02-26 13:45:43 -03:00
Nicholas Nethercote
416399dc10 Make Decodable and Decoder infallible.
`Decoder` has two impls:
- opaque: this impl is already partly infallible, i.e. in some places it
  currently panics on failure (e.g. if the input is too short, or on a
  bad `Result` discriminant), and in some places it returns an error
  (e.g. on a bad `Option` discriminant). The number of places where
  either happens is surprisingly small, just because the binary
  representation has very little redundancy and a lot of input reading
  can occur even on malformed data.
- json: this impl is fully fallible, but it's only used (a) for the
  `.rlink` file production, and there's a `FIXME` comment suggesting it
  should change to a binary format, and (b) in a few tests in
  non-fundamental ways. Indeed #85993 is open to remove it entirely.

And the top-level places in the compiler that call into decoding just
abort on error anyway. So the fallibility is providing little value, and
getting rid of it leads to some non-trivial performance improvements.

Much of this commit is pretty boring and mechanical. Some notes about
a few interesting parts:
- The commit removes `Decoder::{Error,error}`.
- `InternIteratorElement::intern_with`: the impl for `T` now has the same
  optimization for small counts that the impl for `Result<T, E>` has,
  because it's now much hotter.
- Decodable impls for SmallVec, LinkedList, VecDeque now all use
  `collect`, which is nice; the one for `Vec` uses unsafe code, because
  that gave better perf on some benchmarks.
2022-01-22 10:38:31 +11:00
EliseZeroTwo
7402eb001b
fix: inner attribute followed by outer attribute causing ICE 2021-10-25 17:31:27 +02:00
klensy
56a2a2ae1f don't clone attrs 2021-05-30 22:44:40 +03:00
klensy
f43ee8ebf6 fix few typos 2021-04-19 15:57:08 +03:00
Aaron Hill
a93c4f05de
Implement token-based handling of attributes during expansion
This PR modifies the macro expansion infrastructure to handle attributes
in a fully token-based manner. As a result:

* Derives macros no longer lose spans when their input is modified
  by eager cfg-expansion. This is accomplished by performing eager
  cfg-expansion on the token stream that we pass to the derive
  proc-macro
* Inner attributes now preserve spans in all cases, including when we
  have multiple inner attributes in a row.

This is accomplished through the following changes:

* New structs `AttrAnnotatedTokenStream` and `AttrAnnotatedTokenTree` are introduced.
  These are very similar to a normal `TokenTree`, but they also track
  the position of attributes and attribute targets within the stream.
  They are built when we collect tokens during parsing.
  An `AttrAnnotatedTokenStream` is converted to a regular `TokenStream` when
  we invoke a macro.
* Token capturing and `LazyTokenStream` are modified to work with
  `AttrAnnotatedTokenStream`. A new `ReplaceRange` type is introduced, which
  is created during the parsing of a nested AST node to make the 'outer'
  AST node aware of the attributes and attribute target stored deeper in the token stream.
* When we need to perform eager cfg-expansion (either due to `#[derive]` or `#[cfg_eval]`),
we tokenize and reparse our target, capturing additional information about the locations of
`#[cfg]` and `#[cfg_attr]` attributes at any depth within the target.
This is a performance optimization, allowing us to perform less work
in the typical case where captured tokens never have eager cfg-expansion run.
2021-04-11 01:31:36 -04:00