`tokenstream::Spacing` appears on all `TokenTree::Token` instances,
both punct and non-punct. Its current usage:
- `Joint` means "can join with the next token *and* that token is a
punct".
- `Alone` means "cannot join with the next token *or* can join with the
next token but that token is not a punct".
The fact that `Alone` is used for two different cases is awkward.
This commit augments `tokenstream::Spacing` with a new variant
`JointHidden`, resulting in:
- `Joint` means "can join with the next token *and* that token is a
punct".
- `JointHidden` means "can join with the next token *and* that token is a
not a punct".
- `Alone` means "cannot join with the next token".
This *drastically* improves the output of `print_tts`. For example,
this:
```
stringify!(let a: Vec<u32> = vec![];)
```
currently produces this string:
```
let a : Vec < u32 > = vec! [] ;
```
With this PR, it now produces this string:
```
let a: Vec<u32> = vec![] ;
```
(The space after the `]` is because `TokenTree::Delimited` currently
doesn't have spacing information. The subsequent commit fixes this.)
The new `print_tts` doesn't replicate original code perfectly. E.g.
multiple space characters will be condensed into a single space
character. But it's much improved.
`print_tts` still produces the old, uglier output for code produced by
proc macros. Because we have to translate the generated code from
`proc_macro::Spacing` to the more expressive `token::Spacing`, which
results in too much `proc_macro::Along` usage and no
`proc_macro::JointHidden` usage. So `space_between` still exists and
is used by `print_tts` in conjunction with the `Spacing` field.
This change will also help with the removal of `Token::Interpolated`.
Currently interpolated tokens are pretty-printed nicely via AST pretty
printing. `Token::Interpolated` removal will mean they get printed with
`print_tts`. Without this change, that would result in much uglier
output for code produced by decl macro expansions. With this change, AST
pretty printing and `print_tts` produce similar results.
The commit also tweaks the comments on `proc_macro::Spacing`. In
particular, it refers to "compound tokens" rather than "multi-char
operators" because lifetimes aren't operators.
Properly print cstr literals in `proc_macro::Literal::to_string`
Previously we printed the contents of the string, rather than the actual string literal (e.g. `the c string` instead of `c"the c string"`).
Fixes#112820
cc #105723
without this, the only way to create a `LitKind::Byte` is by
doing `"b'a'".parse::<Literal>()`, this solves that by enabling
`Literal::byte_character(b'a')`
It lints against features that are inteded to be internal to the
compiler and standard library. Implements MCP #596.
We allow `internal_features` in the standard library and compiler as those
use many features and this _is_ the standard library from the "internal to the compiler and
standard library" after all.
Marking some features as internal wasn't exactly the most scientific approach, I just marked some
mostly obvious features. While there is a categorization in the macro,
it's not very well upheld (should probably be fixed in another PR).
We always pass `-Ainternal_features` in the testsuite
About 400 UI tests and several other tests use internal features.
Instead of throwing the attribute on each one, just always allow them.
There's nothing wrong with testing internal features^^
The status quo is highly confusing, since the overlap is not apparent,
and specialization is not a feature of Rust. This addresses #87545;
I'm not certain if it closes it, since that issue might also be trackign
a *general* solution for hiding specializing impls automatically.
Added byte position range for `proc_macro::Span`
Currently, the [`Debug`](https://doc.rust-lang.org/beta/proc_macro/struct.Span.html#impl-Debug-for-Span) implementation for [`proc_macro::Span`](https://doc.rust-lang.org/beta/proc_macro/struct.Span.html#) calls the debug function implemented in the trait implementation of `server::Span` for the type `Rustc` in the `rustc-expand` crate.
The current implementation, of the referenced function, looks something like this:
```rust
fn debug(&mut self, span: Self::Span) -> String {
if self.ecx.ecfg.span_debug {
format!("{:?}", span)
} else {
format!("{:?} bytes({}..{})", span.ctxt(), span.lo().0, span.hi().0)
}
}
```
It returns the byte position of the [`Span`](https://doc.rust-lang.org/beta/proc_macro/struct.Span.html#) as an interpolated string.
Because this is currently the only way to get a spans position in the file, I might lead someone, who is interested in this information, to parsing this interpolated string back into a range of bytes, which I think is a very non-rusty way.
The proposed `position()`, method implemented in this PR, gives the ability to directly get this info.
It returns a [`std::ops::Range`](https://doc.rust-lang.org/std/ops/struct.Range.html#) wrapping the lowest and highest byte of the [`Span`](https://doc.rust-lang.org/beta/proc_macro/struct.Span.html#).
I put it behind the `proc_macro_span` feature flag because many of the other functions that have a similar footprint also are annotated with it, I don't actually know if this is right.
It would be great if somebody could take a look at this, thank you very much in advanced.
Use associated items of `char` instead of freestanding items in `core::char`
The associated functions and constants on `char` have been stable since 1.52 and the freestanding items have soft-deprecated since 1.62 (https://github.com/rust-lang/rust/pull/95566). This PR ~~marks them as "deprecated in future", similar to the integer and floating point modules (`core::{i32, f32}` etc)~~ replaces all uses of `core::char::*` with `char::*` to prepare for future deprecation of `core::char::*`.
While working on some other changes in the bridge, I noticed that when
running a nested proc-macro (which is currently only possible using
the unstable `TokenStream::expand_expr`), any symbols held by the
proc-macro client would be invalidated, as the same thread would be used
for the nested macro by default, and the interner doesn't handle nested
use.
After discussing with @eddyb, we decided the best approach might be to
force the use of the cross-thread executor for nested invocations, as it
will never re-use thread-local storage, avoiding the issue. This
shouldn't impact performance, as expand_expr is still unstable, and
infrequently used.
This was chosen rather than making the client symbol interner handle
nested invocations, as that would require replacing the internal
interner `Vec` with a `BTreeMap` (as valid symbol id ranges could now be
disjoint), and the symbol interner is known to be fairly perf-sensitive.
This patch adds checks to the execution strategy to use the cross-thread
executor when doing nested invocations. An alternative implementation
strategy could be to track this information in the `ExtCtxt`, however a
thread-local in the `proc_macro` crate was chosen to add an assertion so
that `rust-analyzer` is aware of the issue if it implements
`expand_expr` in the future.
r? @eddyb
This removes some RPC when creating and emitting diagnostics, and
simplifies the bridge slightly.
After this change, there are no remaining methods which take advantage
of the support for `&mut` references to objects in the store as
arguments, meaning that support for them could technically be removed if
we wanted. The only remaining uses of immutable references into the
store are `TokenStream` and `SourceFile`.
This is done by having the crossbeam dependency inserted into the
proc_macro server code from the server side, to avoid adding a
dependency to proc_macro.
In addition, this introduces a -Z command-line option which will switch
rustc to run proc-macros using this cross-thread executor. With the
changes to the bridge in #98186, #98187, #98188 and #98189, the
performance of the executor should be much closer to same-thread
execution.
In local testing, the crossbeam executor was substantially more
performant than either of the two existing CrossThread strategies, so
they have been removed to keep things simple.
This method is still only used for Literal::subspan, however the
implementation only depends on the Span component, so it is simpler and
more efficient for now to pass down only the information that is needed.
In the future, if more information about the Literal is required in the
implementation (e.g. to validate that spans line up as expected with
source text), that extra information can be added back with extra
arguments.
This builds on the symbol infrastructure built for `Ident` to replicate
the `LitKind` and `Lit` structures in rustc within the `proc_macro`
client, allowing literals to be fully created and interacted with from
the client thread. Only parsing and subspan operations still require
sync RPC.
Doing this for all unicode identifiers would require a dependency on
`unicode-normalization` and `rustc_lexer`, which is currently not
possible for `proc_macro` due to it being built concurrently with `std`
and `core`. Instead, ASCII identifiers are validated locally, and an RPC
message is used to validate unicode identifiers when needed.
String values are interned on the both the server and client when
deserializing, to avoid unnecessary copies and keep Ident cheap to copy and
move. This appears to be important for performance.
The client-side interner is based roughly on the one from rustc_span, and uses
an arena inspired by rustc_arena.
RPC messages passing symbols always include the full value. This could
potentially be optimized in the future if it is revealed to be a
performance bottleneck.
Despite now having a relevant implementaion of Display for Ident, ToString is
still specialized, as it is a hot-path for this object.
The symbol infrastructure will also be used for literals in the next
part.
Unfortunately, as it is difficult to depend on crates from within proc_macro,
this is done by vendoring a copy of the hasher as a module rather than
depending on the rustc_hash crate.
This probably doesn't have a substantial impact up-front, however will be more
relevant once symbols are interned within the proc_macro client.
This greatly reduces round-trips to fetch relevant extra information about the
token in proc macro code, and avoids RPC messages to create Group tokens.