Auto merge of #128465 - GrigorenkoPV:128200, r=estebank

Some `const { }` asserts for #128200

The correctness of the code in #128200 relies on an array being sorted (so that it can be used in a binary search later). This is currently enforced with `// tidy-alphabetical` (and by writing characters in `\u{XXXX}` form). It also relies on there being no entries with duplicate keys, which is not currently enforced.
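
To illustrate why the textual check depends on the `\u{XXXX}` spelling, here is a minimal sketch. The helpers `textually_sorted` and `strictly_increasing` and the two constants are hypothetical names made up for this example, and `textually_sorted` is only a simplified stand-in for what `tidy-alphabetical` does, not its actual implementation.

```rust
// Hypothetical source spellings of three table keys, all in `\u{XXXX}` form.
const ESCAPED: [&str; 3] = [r"'\u{0008}'", r"'\u{0009}'", r"'\u{000b}'"];
// The same keys, but with U+0009 spelled as `'\t'`.
const MIXED: [&str; 3] = [r"'\u{0008}'", r"'\t'", r"'\u{000b}'"];
// The actual characters that both spellings denote.
const KEYS: [char; 3] = ['\u{0008}', '\t', '\u{000b}'];

/// Simplified stand-in for a textual sortedness check (an assumption,
/// not the real tidy logic): compares the source spellings as strings.
fn textually_sorted(spellings: &[&str]) -> bool {
    spellings.windows(2).all(|w| w[0] <= w[1])
}

/// What binary search actually needs here: keys in strictly increasing
/// code point order, which also rules out duplicate keys.
fn strictly_increasing(keys: &[char]) -> bool {
    keys.windows(2).all(|w| w[0] < w[1])
}
```

In this sketch `ESCAPED` passes the textual check and `MIXED` does not, even though `KEYS` is strictly increasing in both cases. The textual check also cannot see that two different spellings denote the same key.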

This PR switches to a `const { }` assertion, which also checks for duplicate entries. Sadly, we cannot use the recently stabilized `is_sorted_by_key` here because it is not const (and it would not let us check for uniqueness anyway). Instead, let's write a manual loop.
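
For anyone who wants to try the pattern in isolation, here is a minimal sketch of such a compile-time check. The table `REPLACEMENTS` and the function `replace_chars` are hypothetical stand-ins, not the real `OUTPUT_REPLACEMENTS` or `normalize_whitespace`, and the sketch assumes a toolchain where inline `const { }` blocks are stable (Rust 1.79 or later).

```rust
// Hypothetical table for illustration only; keys must be strictly increasing.
const REPLACEMENTS: &[(char, &str)] = &[
    ('\u{0007}', ""),
    ('\t', "    "),
    ('\u{000b}', ""),
    ('\u{202e}', "\u{FFFD}"),
];

fn replace_chars(s: &str) -> String {
    // Evaluated at compile time: the build fails if two neighbouring keys
    // are out of order or equal.
    const {
        let mut i = 1;
        while i < REPLACEMENTS.len() {
            assert!(
                REPLACEMENTS[i - 1].0 < REPLACEMENTS[i].0,
                "REPLACEMENTS must be sorted and must contain no duplicate keys"
            );
            i += 1;
        }
    }
    // Look up each char in the sorted table; keep it unchanged if it is absent.
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match REPLACEMENTS.binary_search_by_key(&c, |(k, _)| *k) {
            Ok(i) => out.push_str(REPLACEMENTS[i].1),
            Err(_) => out.push(c),
        }
    }
    out
}
```

Because the comparison is strict (`<`), a swapped pair of entries or a repeated key becomes a compile error instead of a silently wrong binary search. A `while` loop with manual indexing is used because iterators and `for` loops are not usable in const contexts here.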

Alternative approach (perfect hash function): #128463

r? `@ghost`
bors 2024-08-08 09:59:09 +00:00
commit d3a393932e


@ -2595,9 +2595,7 @@ fn num_decimal_digits(num: usize) -> usize {
// We replace some characters so the CLI output is always consistent and underlines aligned. // We replace some characters so the CLI output is always consistent and underlines aligned.
// Keep the following list in sync with `rustc_span::char_width`. // Keep the following list in sync with `rustc_span::char_width`.
// ATTENTION: keep lexicografically sorted so that the binary search will work
const OUTPUT_REPLACEMENTS: &[(char, &str)] = &[ const OUTPUT_REPLACEMENTS: &[(char, &str)] = &[
// tidy-alphabetical-start
// In terminals without Unicode support the following will be garbled, but in *all* terminals // In terminals without Unicode support the following will be garbled, but in *all* terminals
// the underlying codepoint will be as well. We could gate this replacement behind a "unicode // the underlying codepoint will be as well. We could gate this replacement behind a "unicode
// support" gate. // support" gate.
@ -2610,7 +2608,7 @@ const OUTPUT_REPLACEMENTS: &[(char, &str)] = &[
('\u{0006}', ""), ('\u{0006}', ""),
('\u{0007}', ""), ('\u{0007}', ""),
('\u{0008}', ""), ('\u{0008}', ""),
('\u{0009}', " "), // We do our own tab replacement ('\t', " "), // We do our own tab replacement
('\u{000b}', ""), ('\u{000b}', ""),
('\u{000c}', ""), ('\u{000c}', ""),
('\u{000d}', ""), ('\u{000d}', ""),
@ -2643,13 +2641,23 @@ const OUTPUT_REPLACEMENTS: &[(char, &str)] = &[
('\u{2067}', "<EFBFBD>"), ('\u{2067}', "<EFBFBD>"),
('\u{2068}', "<EFBFBD>"), ('\u{2068}', "<EFBFBD>"),
('\u{2069}', "<EFBFBD>"), ('\u{2069}', "<EFBFBD>"),
// tidy-alphabetical-end
]; ];
fn normalize_whitespace(s: &str) -> String { fn normalize_whitespace(s: &str) -> String {
// Scan the input string for a character in the ordered table above. If it's present, replace const {
// it with it's alternative string (it can be more than 1 char!). Otherwise, retain the input let mut i = 1;
// char. At the end, allocate all chars into a string in one operation. while i < OUTPUT_REPLACEMENTS.len() {
assert!(
OUTPUT_REPLACEMENTS[i - 1].0 < OUTPUT_REPLACEMENTS[i].0,
"The OUTPUT_REPLACEMENTS array must be sorted (for binary search to work) \
and must contain no duplicate entries"
);
i += 1;
}
}
// Scan the input string for a character in the ordered table above.
// If it's present, replace it with its alternative string (it can be more than 1 char!).
// Otherwise, retain the input char.
s.chars().fold(String::with_capacity(s.len()), |mut s, c| { s.chars().fold(String::with_capacity(s.len()), |mut s, c| {
match OUTPUT_REPLACEMENTS.binary_search_by_key(&c, |(k, _)| *k) { match OUTPUT_REPLACEMENTS.binary_search_by_key(&c, |(k, _)| *k) {
Ok(i) => s.push_str(OUTPUT_REPLACEMENTS[i].1), Ok(i) => s.push_str(OUTPUT_REPLACEMENTS[i].1),