`rustc_middle` is a huge crate and it's always good to move stuff out of
it. There are lots of similar methods already on `Span`, so these two
functions, `in_external_macro` and `is_from_async_await`, fit right in.
The diff is big because `in_external_macro` is used a lot by clippy
lints.
Properly record metavar spans for other expansions other than TT
This properly records metavar spans for nonterminals other than tokentree. This means that we operations like `span.to(other_span)` work correctly for macros. As you can see, other diagnostics involving metavars have improved as a result.
Fixes#132908
Alternative to #133270
cc `@ehuss`
cc `@petrochenkov`
I believe this variant name was used incorrectly. The timeline is roughly:
* `FileName::cfg_spec_source_code` was added in
https://github.com/rust-lang/rust/pull/54517. However, it used
`FileName::Quote` instead of `FileName::CfgSpec` which I believe was a
mistake.
* Quote stuff was removed in
https://github.com/rust-lang/rust/pull/51285, but did not remove
`FileName::Quote`.
* `FileName::CfgSpec` was removed in
https://github.com/rust-lang/rust/pull/116474 because it was unused.
This restores it so that the `--cfg` variant uses a name that makes more
sense with how it is used, and restores what I think is the original
intent.
There are a few locations where the crate name is checked against an
enumerated list of `std`, `core`, `alloc`, and `proc_macro`, or some
subset thereof. In most of these cases, all four crates should likely be
treated the same. Change this so the crates are listed in one place, and
that list is used wherever a list of `std` crates is needed.
`test` could be considered relevant in some of these cases, but
generally treating it separate from the others seems preferable while it
is unstable.
There are also a few places that Clippy will be able to use this.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.
This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
This is the only trait specializable outside of the standard library.
Before stabilizing specialization we will probably want to remove
support for this. It was originally made specializable to allow a more
efficient ToString in libproc_macro back when this way the only way to
get any data out of a TokenStream. We now support getting individual
tokens, so proc macros no longer need to call it as often.
Pass end position of span through inline ASM cookie
Before this PR, only the start position of the span was passed though the inline ASM cookie to diagnostics. LLVM 19 has full support for 64-bit inline ASM cookies; this PR uses that to pass the end position of the span in the upper 32 bits, meaning inline ASM diagnostics now point at the entire line the error occurred on, not just the first character of it.
It was inconsistently done (sometimes even within a single function) and
most of the rest of the compiler uses fatal errors instead, which need
to be caught using catch_with_exit_code anyway. Using fatal errors
instead of ErrorGuaranteed everywhere in the driver simplifies things a
bit.
rustc: Fail fast when compiling a source file larger than 4 GiB
Currently if you try to compile a file that is larger than 4 GiB, `rustc` will first read the whole into memory before failing.
If we can read the metadata of the file, we can fail before reading the file.
Skip locking span interner for some syntax context checks
- `from_expansion` now never needs to consult the interner
- `eq_ctxt` now only needs the interner when both spans are fully interned
Delete the `cfg(not(parallel))` serial compiler
Since it's inception a long time ago, the parallel compiler and its cfgs have been a maintenance burden. This was a necessary evil the allow iteration while not degrading performance because of synchronization overhead.
But this time is over. Thanks to the amazing work by the parallel working group (and the dyn sync crimes), the parallel compiler has now been fast enough to be shipped by default in nightly for quite a while now.
Stable and beta have still been on the serial compiler, because they can't use `-Zthreads` anyways.
But this is quite suboptimal:
- the maintenance burden still sucks
- we're not testing the serial compiler in nightly
Because of these reasons, it's time to end it. The serial compiler has served us well in the years since it was split from the parallel one, but it's over now.
Let the knight slay one head of the two-headed dragon!
#113349
Note that the default is still 1 thread, as more than 1 thread is still fairly broken.
cc `@onur-ozkan` to see if i did the bootstrap field removal correctly, `@SparrowLii` on the sync parts
Since it's inception a long time ago, the parallel compiler and its cfgs
have been a maintenance burden. This was a necessary evil the allow
iteration while not degrading performance because of synchronization
overhead.
But this time is over. Thanks to the amazing work by the parallel
working group (and the dyn sync crimes), the parallel compiler has now
been fast enough to be shipped by default in nightly for quite a while
now.
Stable and beta have still been on the serial compiler, because they
can't use `-Zthreads` anyways.
But this is quite suboptimal:
- the maintenance burden still sucks
- we're not testing the serial compiler in nightly
Because of these reasons, it's time to end it. The serial compiler has
served us well in the years since it was split from the parallel one,
but it's over now.
Let the knight slay one head of the two-headed dragon!
Switch from `derivative` to `derive-where`
This is a part of the effort to get rid of `syn 1.*` in compiler's dependencies: #109302
Derivative has not been maintained in nearly 3 years[^1]. It also depends on `syn 1.*`.
This PR replaces `derivative` with `derive-where`[^2], a not dead alternative, which uses `syn 2.*`.
A couple of `Debug` formats have changed around the skipped fields[^3], but I doubt this is an issue.
[^1]: https://github.com/mcarton/rust-derivative/issues/117
[^2]: https://lib.rs/crates/derive-where
[^3]: See the changes in `tests/ui`
We already point these out quite aggressively, telling people not to use them, but would normally be rendered as nothing. Having them visible will make it easier for people to actually deal with them.
```
error: unicode codepoint changing visible direction of text present in literal
--> $DIR/unicode-control-codepoints.rs:26:22
|
LL | println!("{:?}", '�');
| ^-^
| ||
| |'\u{202e}'
| this literal contains an invisible unicode text flow control codepoint
|
= note: these kind of unicode codepoints change the way text flows on applications that support them, but can cause confusion because they change the order of characters on the screen
= help: if their presence wasn't intentional, you can remove them
help: if you want to keep them but make them visible in your source code, you can escape them
|
LL | println!("{:?}", '\u{202e}');
| ~~~~~~~~
```
vs the previous
```
error: unicode codepoint changing visible direction of text present in literal
--> $DIR/unicode-control-codepoints.rs:26:22
|
LL | println!("{:?}", '');
| ^-
| ||
| |'\u{202e}'
| this literal contains an invisible unicode text flow control codepoint
|
= note: these kind of unicode codepoints change the way text flows on applications that support them, but can cause confusion because they change the order of characters on the screen
= help: if their presence wasn't intentional, you can remove them
help: if you want to keep them but make them visible in your source code, you can escape them
|
LL | println!("{:?}", '\u{202e}');
| ~~~~~~~~
```
No longer track "zero-width" chars in `SourceMap`, read directly from the line when calculating the `display_col` of a `BytePos`. Move `char_width` to `rustc_span` and use it from the emitter.
This change allows the following to properly align in terminals (depending on the font, the replaced control codepoints are rendered as 1 or 2 width, on my terminal they are rendered as 1, on VSCode text they are rendered as 2):
```
error: this file contains an unclosed delimiter
--> $DIR/issue-68629.rs:5:17
|
LL | ␜␟ts␀![{i
| -- unclosed delimiter
| |
| unclosed delimiter
LL | ␀␀ fn rݻoa>rݻm
| ^
```
Stop sorting `Span`s' `SyntaxContext`, as that is incompatible with incremental
work towards https://github.com/rust-lang/rust/issues/90317
Luckily no one actually needed these to be sorted, so it didn't even affect diagnostics. I'm guessing they'd have been sorted by creation time anyway, so it wouldn't really have mattered.
r? `@cjgillot`
Introduce `{IndexNewtype,SyntaxContext}::from_u16` for convenience because small indices are sometimes encoded as `u16`.
Use `SpanData::span` instead of `Span::new` where appropriate.
Add a clarifying comment about decoding span parents.
Currently `SourceMap` is constructed slightly later than
`SessionGlobals`, and inserted. This commit changes things so they are
done at the same time.
Benefits:
- `SessionGlobals::source_map` changes from
`Lock<Option<Lrc<SourceMap>>>` to `Option<Lrc<SourceMap>>`. It's still
optional, but mutability isn't required because it's initialized at
construction.
- `set_source_map` is removed, simplifying `run_compiler`, which is
good because that's a critical function and it's nice to make it
simpler.
This requires moving things around a bit, so the necessary inputs are
available when `SessionGlobals` is created, in particular the `loader`
and `hash_kind`, which are no longer computed by `build_session`. These
inputs are captured by the new `SourceMapInputs` type, which is threaded
through various places.