Commit Graph

76 Commits

Author SHA1 Message Date
Marijn Schouten
2a5225a369 rustc_lexer: typo fix + small cleanups 2025-06-06 13:08:16 +00:00
Matthew Jasper
55f59fb0e3 Fix parsing of frontmatters with inner hyphens 2025-06-04 15:51:36 +00:00
Deadbeef
662182637e Implement RFC 3503: frontmatters
Supercedes #137193
2025-05-05 23:10:08 +08:00
Guillaume Gomez
aff2bc7a88 Replace rustc_lexer/unescape with rustc-literal-escaper crate 2025-04-04 14:44:45 +02:00
Ralf Jung
20d04d8a40 Revert "Rollup merge of #136355 - GuillaumeGomez:proc-macro_add_value_retrieval_methods, r=Amanieu"
This reverts commit 08dfbf49e3, reversing
changes made to 10bcdad7df.
2025-03-18 13:28:56 +01:00
Jacob Pratt
08dfbf49e3
Rollup merge of #136355 - GuillaumeGomez:proc-macro_add_value_retrieval_methods, r=Amanieu
Add `*_value` methods to proc_macro lib

This is the implementation of https://github.com/rust-lang/libs-team/issues/459.

It allows to get the actual value (unescaped) of the different string literals.

Part of https://github.com/rust-lang/rust/issues/136652.

r? libs-api
2025-03-17 05:47:48 -04:00
Nicholas Nethercote
ff0a5fe975 Remove #![warn(unreachable_pub)] from all compiler/ crates.
It's no longer necessary now that `-Wunreachable_pub` is being passed.
2025-03-11 13:14:21 +11:00
许杰友 Jieyou Xu (Joe)
063ef18fdc Revert "Use workspace lints for crates in compiler/ #138084"
Revert <https://github.com/rust-lang/rust/pull/138084> to buy time to
consider options that avoids breaking downstream usages of cargo on
distributed `rustc-src` artifacts, where such cargo invocations fail due
to inability to inherit `lints` from workspace root manifest's
`workspace.lints` (this is only valid for the source rust-lang/rust
workspace, but not really the distributed `rustc-src` artifacts).

This breakage was reported in
<https://github.com/rust-lang/rust/issues/138304>.

This reverts commit 48caf81484, reversing
changes made to c6662879b2.
2025-03-10 18:12:47 +08:00
Nicholas Nethercote
8a3e03392e Remove #![warn(unreachable_pub)] from all compiler/ crates.
(Except for `rustc_codegen_cranelift`.)

It's no longer necessary now that `unreachable_pub` is in the workspace
lints.
2025-03-08 08:41:43 +11:00
Michael Goulet
12e3911d81 Greatly simplify lifetime captures in edition 2024 2025-02-22 22:24:52 +00:00
Guillaume Gomez
94f0f2b603 Reexport literal-escaper from rustc_lexer to allow rust-analyzer to compile 2025-02-10 10:38:22 +01:00
Guillaume Gomez
49d2d5a116 Extract unescape from rustc_lexer into its own crate 2025-02-10 10:38:22 +01:00
gvozdvmozgu
9f469eb600 implement eat_until leveraging memchr in lexer 2025-02-05 07:03:53 -08:00
Eric Huss
a97404eee3 Add test to check unicode identifier version 2024-12-09 06:23:59 -08:00
Michael Goulet
b87e935407 Revert "Reject raw lifetime followed by \' as well"
This reverts commit 1990f15608.
2024-12-01 05:22:16 +00:00
Nicholas Nethercote
4cd2840f00 Clean up c_or_byte_string.
- Rename a misleading local `mk_kind` as `single_quoted`.
- Use `fn` for all three arguments, for consistency.
2024-11-25 16:10:55 +11:00
Nicholas Nethercote
e9a0c3c98c Remove TokenKind::InvalidPrefix.
It was added in #123752 to handle some cases involving emoji, but it
isn't necessary because it's always treated the same as
`TokenKind::InvalidIdent`. This commit removes it, which makes things a
little simpler.
2024-11-19 18:06:22 +11:00
Nicholas Nethercote
2c7c3697db Improve TokenKind comments.
- Improve wording.
- Use backticks consistently for examples.
2024-11-19 18:04:01 +11:00
Nicholas Nethercote
df29f9b0c3 Improve fake_ident_or_unknown_prefix.
- Rename it as `invalid_ident_or_prefix`, which matches the possible
  outputs (`InvalidIdent` or `InvalidPrefix`).
- Use the local wrapper for `is_xid_continue`, for consistency.
- Make it clear what `\u{200d}` means.
2024-11-19 18:01:43 +11:00
Michael Goulet
1990f15608 Reject raw lifetime followed by \' as well 2024-10-30 01:13:18 +00:00
Peter Jaszkowiak
321a5db7d4 Reserve guarded string literals (RFC 3593) 2024-10-08 18:21:16 -06:00
Michael Goulet
97910580aa Add initial support for raw lifetimes 2024-09-06 10:32:48 -04:00
Michael Goulet
3b3e43a386 Format lexer 2024-09-06 10:32:48 -04:00
Michael Goulet
9aaf873396 Reserve prefix lifetimes too 2024-09-06 10:32:48 -04:00
Nicholas Nethercote
6c84c55c9f Add warn(unreachable_pub) to rustc_lexer. 2024-08-27 15:12:46 +10:00
Nicholas Nethercote
84ac80f192 Reformat use declarations.
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
2024-07-29 08:26:52 +10:00
Nicholas Nethercote
75b164d836 Use tidy to sort crate attributes for all compiler crates.
We already do this for a number of crates, e.g. `rustc_middle`,
`rustc_span`, `rustc_metadata`, `rustc_span`, `rustc_errors`.

For the ones we don't, in many cases the attributes are a mess.
- There is no consistency about order of attribute kinds (e.g.
  `allow`/`deny`/`feature`).
- Within attribute kind groups (e.g. the `feature` attributes),
  sometimes the order is alphabetical, and sometimes there is no
  particular order.
- Sometimes the attributes of a particular kind aren't even grouped
  all together, e.g. there might be a `feature`, then an `allow`, then
  another `feature`.

This commit extends the existing sorting to all compiler crates,
increasing consistency. If any new attribute line is added there is now
only one place it can go -- no need for arbitrary decisions.

Exceptions:
- `rustc_log`, `rustc_next_trait_solver` and `rustc_type_ir_macros`,
  because they have no crate attributes.
- `rustc_codegen_gcc`, because it's quasi-external to rustc (e.g. it's
  ignored in `rustfmt.toml`).
2024-06-12 15:49:10 +10:00
Michael Scholten
3c5e88c7d1 Improved the compiler code with clippy 2024-04-24 09:41:44 +02:00
Esteban Küber
19821ad234 Properly handle emojis as literal prefix in macros
Do not accept the following

```rust
macro_rules! lexes {($($_:tt)*) => {}}
lexes!(🐛"foo");
```

Before, invalid emoji identifiers were gated during parsing instead of lexing in all cases, but this didn't account for macro expansion of literal prefixes.

Fix #123696.
2024-04-10 23:19:27 +00:00
Nicholas Nethercote
0ac1195ee0 Invert diagnostic lints.
That is, change `diagnostic_outside_of_impl` and
`untranslatable_diagnostic` from `allow` to `deny`, because more than
half of the compiler has be converted to use translated diagnostics.

This commit removes more `deny` attributes than it adds `allow`
attributes, which proves that this change is warranted.
2024-02-06 13:12:33 +11:00
León Orell Valerian Liehr
c0a9f722c4
Undeprecate and use lint unstable_features 2023-12-20 18:16:28 +01:00
Charles Lew
bca79a26d8 Update lexer emoji diagnostics to Unicode 15.0 2023-07-29 08:47:21 +08:00
Deadbeef
df9bd80d74 reimplement C string literals 2023-07-23 06:54:07 +00:00
León Orell Valerian Liehr
c6643b50ea
Revert the lexing of c_str_literals 2023-07-05 13:11:17 +02:00
Nicholas Nethercote
e52794decd Don't try to eat non-existent decimal digits.
After seeing a `0`, if it's followed by any of `[0-9]`, `_`, `.`, `e`,
or `E`, we consume all the digits. But in the `.`, `e` and `E` cases
this is pointless because we know there aren't any digits.
2023-05-15 18:33:12 +10:00
Nicholas Nethercote
19967c5890 Make Cursor::number less DRY.
A tiny bit of repetition makes this easier to read, and avoids a test on
the "Not a base prefix" match arm.
2023-05-15 18:30:26 +10:00
Deadbeef
78e3455d37 address comments 2023-05-02 10:32:07 +00:00
Deadbeef
a49570fd20 fix TODO comments 2023-05-02 10:32:07 +00:00
Deadbeef
8ff3903643 initial step towards implementing C string literals 2023-05-02 10:30:09 +00:00
Michael Goulet
a047064d6b Revert "Don't recover lifetimes/labels containing emojis as character literals"
Reverts PR #108031
Fixes (doesnt close until beta backported) #109746

This reverts commit e3f9db5fc3.
This reverts commit 98b82aedba.
This reverts commit 380fa26413.
2023-04-10 06:52:41 +00:00
est31
5a02105fff Rustdoc-ify LiteralKind note 2023-03-03 08:39:36 +01:00
许杰友 Jieyou Xu (Joe)
380fa26413
Don't recover lifetimes/labels containing emojis as character literals
Note that at the time of this commit, `unic-emoji-char` seems to have
data tables only up to Unicode 5.0, but Unicode is already newer than
this.

A newer emoji such as `🥺` will not be recognized as an emoji
but older emojis such as `🐱` will.
2023-02-14 17:31:58 +08:00
Maybe Waffle
6a28fb42a8 Remove double spaces after dots in comments 2023-01-17 08:09:33 +00:00
Michael Goulet
aff403cf68 Recover fn keyword as Fn trait in bounds 2022-12-27 06:14:46 +00:00
KaDiWa
9bc69925cb
compiler: remove unnecessary imports and qualified paths 2022-12-10 18:45:34 +01:00
Nicholas Nethercote
358a603f11 Use token::Lit in ast::ExprKind::Lit.
Instead of `ast::Lit`.

Literal lowering now happens at two different times. Expression literals
are lowered when HIR is crated. Attribute literals are lowered during
parsing.

This commit changes the language very slightly. Some programs that used
to not compile now will compile. This is because some invalid literals
that are removed by `cfg` or attribute macros will no longer trigger
errors. See this comment for more details:
https://github.com/rust-lang/rust/pull/102944#issuecomment-1277476773
2022-11-16 09:41:28 +11:00
Dylan DPC
4b50fb3745
Rollup merge of #103919 - nnethercote:unescaping-cleanups, r=matklad
Unescaping cleanups

Some code improvements, and some error message improvements.

Best reviewed one commit at a time.

r? ````@matklad````
2022-11-09 19:21:22 +05:30
Nicholas Nethercote
dba6fc3ef5 Make underscore_literal_suffix a hard error.
It's been a warning for 5.5 years. Time to make it a hard error.

Closes #42326.
2022-11-07 10:00:36 +11:00
Nicholas Nethercote
a203482d2a Inline and remove validate_int_literal.
It has a single callsite, and is fairly small. The `Float` match arm
already has base-specific checking inline, so this makes things more
consistent.
2022-11-04 14:24:41 +11:00
Tshepang Mbambo
b66f92197a rustc_lexer::TokenKind improve docs 2022-10-26 23:32:14 +02:00