Currently, the output of `rustc --explain foo` displays the raw markdown in a
pager. This is acceptable, but using actual formatting makes it easier to
understand.
This patch consists of three major components:
1. A markdown parser. This is an extremely simple non-backtracking recursive
implementation that requires normalization of the final token stream
2. A utility to write the token stream to an output buffer
3. Configuration within rustc_driver_impl to invoke this combination for
`--explain`. Like the current implementation, it first attempts to print to
a pager with a fallback colorized terminal, and standard print as a last
resort.
If color is disabled, or if the output does not support it, or if printing
with color fails, it will write the raw markdown (which matches current
behavior).
Pagers known to support color are: `less` (with `-r`), `bat` (aka `catbat`),
and `delta`.
The markdown parser does not support the entire markdown specification, but
should support the following with reasonable accuracy:
- Headings, including formatting
- Comments
- Code, inline and fenced block (no indented block)
- Strong, emphasis, and strikethrough formatted text
- Links, anchor, inline, and reference-style
- Horizontal rules
- Unordered and ordered list items, including formatting
This parser and writer should be reusable by other systems if ever needed.
Sometimes, especially with MIR validation, the backtraces from delayed
bugs are noise and make it harder to look at them. Respect the
environment variable and don't print it when the user doesn't want it.
Each of `{D,Subd}iagnosticMessage::{Str,Eager}` has a comment:
```
// FIXME(davidtwco): can a `Cow<'static, str>` be used here?
```
This commit answers that question in the affirmative. It's not the most
compelling change ever, but it might be worth merging.
This requires changing the `impl<'a> From<&'a str>` impls to `impl
From<&'static str>`, which involves a bunch of knock-on changes that
require/result in call sites being a little more precise about exactly
what kind of string they use to create errors, and not just `&str`. This
will result in fewer unnecessary allocations, though this will not have
any notable perf effects given that these are error paths.
Note that I was lazy within Clippy, using `to_string` in a few places to
preserve the existing string imprecision. I could have used `impl
Into<{D,Subd}iagnosticMessage>` in various places as is done in the
compiler, but that would have required changes to *many* call sites
(mostly changing `&format("...")` to `format!("...")`) which didn't seem
worthwhile.
Ensure Fluent messages are in alphabetical order
Fixes#111847
This adds a tidy check to ensure Fluent messages are in alphabetical order, as well as sorting all existing messages. I think the error could be worded better, would appreciate suggestions.
<details>
<summary>Script used to sort files</summary>
```py
import sys
import re
fn = sys.argv[1]
with open(fn, 'r') as f:
data = f.read().split("\n")
chunks = []
cur = ""
for line in data:
if re.match(r"^([a-zA-Z0-9_]+)\s*=\s*", line):
chunks.append(cur)
cur = ""
cur += line + "\n"
chunks.append(cur)
chunks.sort()
with open(fn, 'w') as f:
f.write(''.join(chunks).strip("\n\n") + "\n")
```
</details>
Fix overflow in error emitter
Fix#109854Close#94171 (was already fixed before but missing test)
This bug happens when a multipart suggestion spans more than one line.
The fix is to update the `acc` variable, which didn't handle the case when the text to remove spans multiple lines but the text to add spans only one line.
Also, use `usize::try_from` instead of `as usize` to detect overflows earlier in the future, and point to the source of the overflow (the original issue points to a different place where this value is used, not where the overflow had happened).
And finally add an `if start != end` check to avoid doing any extra work in case of empty ranges.
Long explanation:
Given this test case:
```rust
fn generate_setter() {
String::with_capacity(
//~^ ERROR this function takes 1 argument but 3 arguments were supplied
generate_setter,
r#"
pub(crate) struct Person<T: Clone> {}
"#,
r#""#,
);
}
```
The compiler will try to convert that code into the following:
```rust
fn generate_setter() {
String::with_capacity(
//~^ ERROR this function takes 1 argument but 3 arguments were supplied
/* usize */,
);
}
```
So it creates a suggestion with 3 separate parts:
```
// Replace "generate_setter" with "/* usize */"
SubstitutionPart { span: fuzz_input.rs:4:5: 4:20 (#0), snippet: "/* usize */" }
// Remove second arg (multiline string)
SubstitutionPart { span: fuzz_input.rs:4:20: 7:3 (#0), snippet: "" }
// Remove third arg (r#""#)
SubstitutionPart { span: fuzz_input.rs:7:3: 8:11 (#0), snippet: "" }
```
Each of this parts gets a separate `SubstitutionHighlight` (this marks the relevant text green in a terminal, the values are 0-indexed so `start: 4` means column 5):
```
SubstitutionHighlight { start: 4, end: 15 }
SubstitutionHighlight { start: 15, end: 15 }
SubstitutionHighlight { start: 18446744073709551614, end: 18446744073709551614 }
```
The 2nd and 3rd suggestion are empty (start = end) because they only remove text, so there are no additions to highlight. But the 3rd span has overflowed because the compiler assumes that the 3rd suggestion is on the same line as the first suggestion. The 2nd span starts at column 20 and the highlight starts at column 16 (15+1), so that suggestion is good. But since the 3rd span starts at column 3, the result is `3 - 4`, or column -1, which turns into -2 with 0-indexed, and that's equivalent to `18446744073709551614 as isize`.
With this fix, the resulting `SubstitutionHighlight` are:
```
SubstitutionHighlight { start: 4, end: 15 }
SubstitutionHighlight { start: 15, end: 15 }
SubstitutionHighlight { start: 15, end: 15 }
```
As expected. I guess ideally we shouldn't emit empty highlights when removing text, but I am too scared to change that.
Before:
```
= note: delayed at 0: <rustc_errors::HandlerInner>::emit_diagnostic
at ./compiler/rustc_errors/src/lib.rs:1335:29
1: <rustc_errors::Handler>::emit_diagnostic
at ./compiler/rustc_errors/src/lib.rs:1124:9
...
```
After:
```
= note: delayed at compiler/rustc_parse/src/parser/diagnostics.rs:2158:28
0: <rustc_errors::HandlerInner>::emit_diagnostic
at ./compiler/rustc_errors/src/lib.rs:1335:29
1: <rustc_errors::Handler>::emit_diagnostic
at ./compiler/rustc_errors/src/lib.rs:1124:9
```
This both makes the relevant frame easier to find without having to dig
through diagnostic internals, and avoids the weird-looking formatting
for the first frame.
Introduce `DynSend` and `DynSync` auto trait for parallel compiler
part of parallel-rustc #101566
This PR introduces `DynSend / DynSync` trait and `FromDyn / IntoDyn` structure in rustc_data_structure::marker. `FromDyn` can dynamically check data structures for thread safety when switching to parallel environments (such as calling `par_for_each_in`). This happens only when `-Z threads > 1` so it doesn't affect single-threaded mode's compile efficiency.
r? `@cjgillot`
Currently a `{D,Subd}iagnosticMessage` can be created from any type that
impls `Into<String>`. That includes `&str`, `String`, and `Cow<'static,
str>`, which are reasonable. It also includes `&String`, which is pretty
weird, and results in many places making unnecessary allocations for
patterns like this:
```
self.fatal(&format!(...))
```
This creates a string with `format!`, takes a reference, passes the
reference to `fatal`, which does an `into()`, which clones the
reference, doing a second allocation. Two allocations for a single
string, bleh.
This commit changes the `From` impls so that you can only create a
`{D,Subd}iagnosticMessage` from `&str`, `String`, or `Cow<'static,
str>`. This requires changing all the places that currently create one
from a `&String`. Most of these are of the `&format!(...)` form
described above; each one removes an unnecessary static `&`, plus an
allocation when executed. There are also a few places where the existing
use of `&String` was more reasonable; these now just use `clone()` at
the call site.
As well as making the code nicer and more efficient, this is a step
towards possibly using `Cow<'static, str>` in
`{D,Subd}iagnosticMessage::{Str,Eager}`. That would require changing
the `From<&'a str>` impls to `From<&'static str>`, which is doable, but
I'm not yet sure if it's worthwhile.
Add `rustc_fluent_macro` to decouple fluent from `rustc_macros`
Fluent, with all the icu4x it brings in, takes quite some time to compile. `fluent_messages!` is only needed in further downstream rustc crates, but is blocking more upstream crates like `rustc_index`. By splitting it out, we allow `rustc_macros` to be compiled earlier, which speeds up `x check compiler` by about 5 seconds (and even more after the needless dependency on `serde_json` is removed from `rustc_data_structures`).
Encode hashes as bytes, not varint
In a few places, we store hashes as `u64` or `u128` and then apply `derive(Decodable, Encodable)` to the enclosing struct/enum. It is more efficient to encode hashes directly than try to apply some varint encoding. This PR adds two new types `Hash64` and `Hash128` which are produced by `StableHasher` and replace every use of storing a `u64` or `u128` that represents a hash.
Distribution of the byte lengths of leb128 encodings, from `x build --stage 2` with `incremental = true`
Before:
```
( 1) 373418203 (53.7%, 53.7%): 1
( 2) 196240113 (28.2%, 81.9%): 3
( 3) 108157958 (15.6%, 97.5%): 2
( 4) 17213120 ( 2.5%, 99.9%): 4
( 5) 223614 ( 0.0%,100.0%): 9
( 6) 216262 ( 0.0%,100.0%): 10
( 7) 15447 ( 0.0%,100.0%): 5
( 8) 3633 ( 0.0%,100.0%): 19
( 9) 3030 ( 0.0%,100.0%): 8
( 10) 1167 ( 0.0%,100.0%): 18
( 11) 1032 ( 0.0%,100.0%): 7
( 12) 1003 ( 0.0%,100.0%): 6
( 13) 10 ( 0.0%,100.0%): 16
( 14) 10 ( 0.0%,100.0%): 17
( 15) 5 ( 0.0%,100.0%): 12
( 16) 4 ( 0.0%,100.0%): 14
```
After:
```
( 1) 372939136 (53.7%, 53.7%): 1
( 2) 196240140 (28.3%, 82.0%): 3
( 3) 108014969 (15.6%, 97.5%): 2
( 4) 17192375 ( 2.5%,100.0%): 4
( 5) 435 ( 0.0%,100.0%): 5
( 6) 83 ( 0.0%,100.0%): 18
( 7) 79 ( 0.0%,100.0%): 10
( 8) 50 ( 0.0%,100.0%): 9
( 9) 6 ( 0.0%,100.0%): 19
```
The remaining 9 or 10 and 18 or 19 are `u64` and `u128` respectively that have the high bits set. As far as I can tell these are coming primarily from `SwitchTargets`.
Fluent, with all the icu4x it brings in, takes quite some time to
compile. `fluent_messages!` is only needed in further downstream rustc
crates, but is blocking more upstream crates like `rustc_index`. By
splitting it out, we allow `rustc_macros` to be compiled earlier, which
speeds up `x check compiler` by about 5 seconds (and even more after the
needless dependency on `serde_json` is removed from
`rustc_data_structures`).
Migrate most of `rustc_builtin_macros` to diagnostic impls
cc #100717
This is a couple of days work, but I decided to stop for now before the PR becomes too big. There's around 50 unresolved failures when `rustc::untranslatable_diagnostic` is denied, which I'll finish addressing once this PR goes thtough
A couple of outputs have changed, but in all instances I think the changes are an improvement/are more consistent with other diagnostics (although I'm happy to revert any which seem worse)
This makes it easier to open the messages file while developing on features.
The commit was the result of automatted changes:
for p in compiler/rustc_*; do mv $p/locales/en-US.ftl $p/messages.ftl; rmdir $p/locales; done
for p in compiler/rustc_*; do sed -i "s#\.\./locales/en-US.ftl#../messages.ftl#" $p/src/lib.rs; done