310 Commits

Author SHA1 Message Date
Michael Goulet
af8d911d63 Also fix if in else 2024-09-11 17:24:01 -04:00
bors
6d05f12170 Auto merge of #129346 - nnethercote:fix-double-handling-in-collect_tokens, r=petrochenkov
Fix double handling in `collect_tokens`

Double handling of AST nodes can occur in `collect_tokens`. This is when an inner call to `collect_tokens` produces an AST node, and then an outer call to `collect_tokens` produces the same AST node. This can happen in a few places, e.g. expression statements where the statement delegates `HasTokens` and `HasAttrs` to the expression. It will also happen more after #124141.

This PR fixes some double handling cases that cause problems, including #129166.

r? `@petrochenkov`
2024-09-08 05:35:23 +00:00
Michael Goulet
97910580aa Add initial support for raw lifetimes 2024-09-06 10:32:48 -04:00
Nicholas Nethercote
1fdabfbebb Avoid double-handling of attributes in collect_tokens.
By keeping track of attributes that have been previously processed.

This fixes the `macro-rules-derive-cfg.stdout` test, and is necessary
for #124141 which removes nonterminals.

Also shrink the `SmallVec` inline size used in `IntervalSet`. 2 gives
slightly better perf than 4 now that there's an `IntervalSet` in
`Parser`, which is cloned reasonably often.
2024-08-24 06:57:47 +10:00
Nicholas Nethercote
39b38a94e3 Split the assertion in NodeRange::new. 2024-08-23 14:40:08 +10:00
Nicholas Nethercote
9d31f86f0d Overhaul token collection.
This commit does the following.

- Renames `collect_tokens_trailing_token` as `collect_tokens`, because
  (a) it's annoyingly long, and (b) the `_trailing_token` bit is less
  accurate now that its types have changed.

- In `collect_tokens`, adds an `Option<CollectPos>` argument and a
  `UsePreAttrPos` in the return type of `f`. These are used in
  `parse_expr_force_collect` (for vanilla expressions) and in
  `parse_stmt_without_recovery` (for two different cases of expression
  statements). Together these are enough to fix all the problems
  with token collection and assoc expressions. The changes to the
  `stringify.rs` test demonstrate some of these.

- Adds a new test. The code in this test was causing an assertion
  failure prior to this commit, due to an invalid `NodeRange`.

The extra complexity is annoying, but necessary to fix the existing
problems.
2024-08-16 09:07:55 +10:00
Nicholas Nethercote
5aaa2f92ee Add an assertion to NodeRange::new. 2024-08-16 09:07:31 +10:00
Nicholas Nethercote
c8098be41f Convert a bool to Trailing.
This pre-existing type is suitable for use with the return value of the
`f` parameter in `collect_tokens_trailing_token`. The more descriptive
name will be useful because the next commit will add another boolean
value to the return value of `f`.
2024-08-16 09:07:29 +10:00
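
The motivation given in the "Convert a bool to Trailing" commit above (a named type is easier to read than a bare `bool`, especially once a second boolean joins the return value) can be illustrated with a minimal sketch. This is not the compiler's code: the two-variant shape of `Trailing` and `UsePreAttrPos`, the `From<bool>` conversion, and both `callback_*` functions are assumptions made for illustration.

```rust
/// Stand-in for the pre-existing `Trailing` type (two-variant shape assumed).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Trailing {
    No,
    Yes,
}

impl From<bool> for Trailing {
    fn from(b: bool) -> Trailing {
        if b { Trailing::Yes } else { Trailing::No }
    }
}

/// Stand-in for the second flag added by the following commit (shape assumed).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum UsePreAttrPos {
    No,
    Yes,
}

/// Hypothetical callback result using bare bools: `(node, true, false)` does
/// not say which flag is which, and the two are easy to swap by accident.
pub fn callback_with_bools(next_is_comma: bool) -> (u32, bool, bool) {
    (42, next_is_comma, false)
}

/// The same hypothetical result with descriptive types: swapping the two
/// flags is now a type error instead of a silent bug.
pub fn callback_with_enums(next_is_comma: bool) -> (u32, Trailing, UsePreAttrPos) {
    (42, Trailing::from(next_is_comma), UsePreAttrPos::No)
}
```

The enum version costs a few more characters at each call site, which is the trade-off the commit message accepts.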
Nicholas Nethercote
7923b20dd9 Use impl PartialEq<TokenKind> for Token more.
This lets us compare a `Token` with a `TokenKind`. It's used a lot, but
can be used even more, avoiding the need for some `.kind` uses.
2024-08-14 16:37:09 +10:00
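
As a rough illustration of what the `PartialEq<TokenKind>` commit above relies on, the sketch below defines heavily simplified stand-ins for `Token` and `TokenKind` (the real types have many more variants and a proper span type; the field layout and the `is_comma_*` helpers here are assumptions) and shows how the cross-type impl removes the need for `.kind`.

```rust
/// Simplified stand-in for the token kind enum.
#[derive(Clone, Debug, PartialEq)]
pub enum TokenKind {
    Comma,
    Semi,
    Ident(String),
}

/// Simplified stand-in for a token: a kind plus a (start, end) span.
#[derive(Clone, Debug, PartialEq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: (u32, u32),
}

/// Lets a `Token` be compared directly against a `TokenKind`.
impl PartialEq<TokenKind> for Token {
    fn eq(&self, rhs: &TokenKind) -> bool {
        self.kind == *rhs
    }
}

/// Without the impl, callers reach into `.kind` explicitly.
pub fn is_comma_verbose(tok: &Token) -> bool {
    tok.kind == TokenKind::Comma
}

/// With the impl, the token itself can sit on the left-hand side.
pub fn is_comma_concise(tok: &Token) -> bool {
    *tok == TokenKind::Comma
}
```

Both helpers return the same result; the second form is the shorter spelling the commit applies in more places.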
Guillaume Gomez
99a785d62d
Rollup merge of #128994 - nnethercote:fix-Parser-look_ahead-more, r=compiler-errors
Fix bug in `Parser::look_ahead`.

The special case was failing to handle invisible delimiters on one path.

Fixes (but doesn't close until beta backported) #128895.

r? `@davidtwco`
2024-08-12 17:09:20 +02:00
Nicholas Nethercote
46b4c5adc5 Fix bug in Parser::look_ahead.
The special case was failing to handle invisible delimiters on one path.

Fixes #128895.
2024-08-12 13:00:12 +10:00
Michael Goulet
c361c924a0 Use assert_matches around the compiler 2024-08-11 12:25:39 -04:00
Matthias Krüger
7d9ed2a864
Rollup merge of #127921 - spastorino:stabilize-unsafe-extern-blocks, r=compiler-errors
Stabilize unsafe extern blocks (RFC 3484)

# Stabilization report

## Summary

This is a tracking issue for the RFC 3484: Unsafe Extern Blocks

We are stabilizing `#![feature(unsafe_extern_blocks)]`, as described in [Unsafe Extern Blocks RFC 3484](https://github.com/rust-lang/rfcs/pull/3484). This feature makes explicit that declaring an extern block is unsafe. Starting in Rust 2024, all extern blocks must be marked as unsafe. In all editions, items within unsafe extern blocks may be marked as safe to use.

RFC: https://github.com/rust-lang/rfcs/pull/3484
Tracking issue: #123743

## What is stabilized

### Summary of stabilization

We now need extern blocks to be marked as unsafe, and items inside can also have safety modifiers (`unsafe` or `safe`). By default, items with no modifiers are unsafe, which offers easy migration without surprising results.

```rust
unsafe extern {
    // sqrt (from libm) may be called with any `f64`
    pub safe fn sqrt(x: f64) -> f64;

    // strlen (from libc) requires a valid pointer,
    // so we mark it as being an unsafe fn
    pub unsafe fn strlen(p: *const c_char) -> usize;

    // this function doesn't say safe or unsafe, so it defaults to unsafe
    pub fn free(p: *mut core::ffi::c_void);

    pub safe static IMPORTANT_BYTES: [u8; 256];

    pub safe static LINES: SyncUnsafeCell<i32>;
}
```

## Tests

The relevant tests are in `tests/ui/rust-2024/unsafe-extern-blocks`.

## History

- https://github.com/rust-lang/rust/pull/124482
- https://github.com/rust-lang/rust/pull/124455
- https://github.com/rust-lang/rust/pull/125077
- https://github.com/rust-lang/rust/pull/125522
- https://github.com/rust-lang/rust/issues/126738
- https://github.com/rust-lang/rust/issues/126749
- https://github.com/rust-lang/rust/issues/126755
- https://github.com/rust-lang/rust/pull/126757
- https://github.com/rust-lang/rust/pull/126758
- https://github.com/rust-lang/rust/issues/126756
- https://github.com/rust-lang/rust/pull/126973
- https://github.com/rust-lang/rust/pull/127535
- https://github.com/rust-lang/rustfmt/pull/6204

## Unresolved questions

I am not aware of any unresolved questions.
2024-08-03 20:51:51 +02:00
Matthias Krüger
dee57ce043
Rollup merge of #128483 - nnethercote:still-more-cfg-cleanups, r=petrochenkov
Still more `cfg` cleanups

Found while looking closely at `cfg`/`cfg_attr` processing code.

r? `@petrochenkov`
2024-08-03 11:17:44 +02:00
Nicholas Nethercote
d1f05fd184 Distinguish the two kinds of token range.
When collecting tokens there are two kinds of range:
- a range relative to the parser's full token stream (which we get when
  we are parsing);
- a range relative to a single AST node's token stream (which we use
  within `LazyAttrTokenStreamImpl` when replacing tokens).

These are currently both represented with `Range<u32>` and it's easy to
mix them up -- until now I hadn't properly understood the difference.

This commit introduces `ParserRange` and `NodeRange` to distinguish
them. This also requires splitting `ReplaceRange` in two, giving the new
types `ParserReplacement` and `NodeReplacement`. (These latter two names
reduce the overloading of the word "range".)

The commit also rewrites some comments to be clearer.

The end result is a little more verbose, but much clearer.
2024-08-01 19:30:40 +10:00
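
The idea in the "Distinguish the two kinds of token range" commit above is the newtype pattern: two values with the same underlying representation get distinct types so they cannot be mixed up. The sketch below is a minimal illustration under stated assumptions, not the compiler's implementation: the tuple-struct layout, the signature of `NodeRange::new`, and the rule that conversion subtracts the node's start position are all assumed here.

```rust
use std::ops::Range;

/// A range relative to the parser's full token stream.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ParserRange(pub Range<u32>);

/// A range relative to a single AST node's token stream.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct NodeRange(pub Range<u32>);

impl NodeRange {
    /// Rebase a parser-relative range onto a node whose tokens start at
    /// `start_pos` in the parser's stream (assumed conversion rule).
    pub fn new(ParserRange(parser_range): ParserRange, start_pos: u32) -> NodeRange {
        // An empty range, or one starting before the node, would be invalid.
        assert!(!parser_range.is_empty());
        assert!(parser_range.start >= start_pos);
        NodeRange((parser_range.start - start_pos)..(parser_range.end - start_pos))
    }
}

/// Hypothetical consumer that only accepts node-relative ranges. Passing a
/// `ParserRange` here is a compile error rather than a silent off-by-N bug.
pub fn replaced_len(range: &NodeRange) -> u32 {
    range.0.end - range.0.start
}
```

For example, a parser-relative range `10..14` inside a node that starts at position 8 becomes the node-relative range `2..6`.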
Michael Goulet
e4076e34f8 Mark Parser::eat/check methods as must_use 2024-07-29 21:29:08 -04:00
Nicholas Nethercote
84ac80f192 Reformat use declarations.
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
2024-07-29 08:26:52 +10:00
Folkert
d3858f7465
improve error message when global_asm! uses asm! options 2024-07-25 22:33:52 +02:00
Santiago Pastorino
8366c7fe9c
Stabilize unsafe extern blocks (RFC 3484) 2024-07-23 00:29:39 -03:00
bors
3811f40d27 Auto merge of #127957 - matthiaskrgr:rollup-1u5ivck, r=matthiaskrgr
Rollup of 6 pull requests

Successful merges:

 - #127350 (Parser: Suggest Placing the Return Type After Function Parameters)
 - #127621 (Rewrite and rename `issue-22131` and `issue-26006` `run-make` tests to rmake)
 - #127662 (When finding item gated behind a `cfg` flag, point at it)
 - #127903 (`force_collect` improvements)
 - #127932 (rustdoc: fix `current` class on sidebar modnav)
 - #127943 (Don't allow unsafe statics outside of extern blocks)

r? `@ghost`
`@rustbot` modify labels: rollup
2024-07-19 13:39:12 +00:00
Matthias Krüger
9ada89d9a1
Rollup merge of #127903 - nnethercote:force_collect-improvements, r=petrochenkov
`force_collect` improvements

Yet more cleanups relating to `cfg_attr` processing.

r? `@petrochenkov`
2024-07-19 10:48:05 +02:00
Matthias Krüger
c86e13f330
Rollup merge of #127350 - veera-sivarajan:bugfix-126311, r=lcnr
Parser: Suggest Placing the Return Type After Function Parameters

Fixes #126311

This PR suggests placing the return type after the function parameters when it's misplaced after a `where` clause.

This also tangentially improves diagnostics for cases like [this](86d6f1312a/tests/ui/parser/issues/misplaced-return-type-without-where-issue-126311.rs (L1C1-L1C28)) and adds doc comments for `parser::AllowPlus`.
2024-07-19 10:48:03 +02:00
Nicholas Nethercote
1dd566a6d0 Overhaul comments in collect_tokens_trailing_token.
Adding details, clarifying lots of little things, etc. In particular,
the commit adds details of an example. I find this very helpful, because
it's taken me a long time to understand how this code works.
2024-07-19 15:25:55 +10:00
Nicholas Nethercote
ca6649516f Make Parser::num_bump_calls 0-indexed.
Currently in `collect_tokens_trailing_token`, `start_pos` and `end_pos`
are 1-indexed but `replace_ranges` is 0-indexed, which is really
confusing. Making them both 0-indexed makes debugging much easier.
2024-07-19 15:25:55 +10:00
Nicholas Nethercote
757f73f506 Simplify CaptureState::inner_attr_ranges.
The `Option`s within the `ReplaceRange`s within the hashmap are always
`None`. This PR omits them and inserts them when they are extracted from
the hashmap.
2024-07-19 15:25:54 +10:00
Nicholas Nethercote
e69ff1c106 Remove an unnecessary ForceCollect::Yes.
No need to collect tokens on this recovery path, because the parsed
statement isn't even looked at.
2024-07-19 08:20:57 +10:00
Veera
4cad705017 Parser: Suggest Placing the Return Type After Function Parameters 2024-07-18 17:56:34 -04:00
Nicholas Nethercote
487802d6c8 Remove TrailingToken.
It's used in `Parser::collect_tokens_trailing_token` to decide whether
to capture a trailing token. But the callers actually know whether to
capture a trailing token, so it's simpler for them to just pass in a
bool.

Also, the `TrailingToken::Gt` case was weird, because it didn't result
in a trailing token being captured. It could have been subsumed by the
`TrailingToken::MaybeComma` case, and it effectively is in the new code.
2024-07-18 17:28:49 +10:00
Nicholas Nethercote
9c4f3dbd06 Remove references to maybe_whole_expr.
It was removed in #126571.
2024-07-16 16:40:35 +10:00
Matthias Krüger
febe4423c1
Rollup merge of #127273 - nnethercote:fix-DebugParser, r=workingjubilee
Fix `DebugParser`.

I tried using this and it didn't work at all. `prev_token` is never eof, so the accumulator is always false, which means the `then_some` always returns `None`, which means `scan` always returns `None`, and `tokens` always ends up an empty vec. I'm not sure how this code was supposed to work.

(An aside: I find `Iterator::scan` to be a pretty wretched function, that produces code which is very hard to understand. Probably why this is just one of two uses of it in the entire compiler.)

This commit changes it to a simpler imperative style that produces a valid `tokens` vec.

r? `@workingjubilee`
2024-07-14 20:24:58 +02:00
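
The failure mode described in the `DebugParser` fix above can be reproduced in isolation. The sketch below is not the original code: the string tokens, the `"<eof>"` marker, and both function names are assumptions. It only shows the mechanism, namely that a `scan` closure returning `None` ends the iteration, so an accumulator that is never `true` yields an empty vector, and it contrasts that with a plain loop.

```rust
/// Mimics the bug: the state starts as `false` and nothing ever sets it to
/// `true`, so `then_some` returns `None` for the first item and `scan`
/// stops before producing anything.
pub fn collect_with_scan(tokens: &[&str]) -> Vec<String> {
    tokens
        .iter()
        .scan(false, |flag, tok| (*flag).then_some(tok.to_string()))
        .collect()
}

/// A simpler imperative version: push tokens until the end-of-file marker
/// has been pushed (keeping the marker itself is an assumption).
pub fn collect_imperative(tokens: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for tok in tokens {
        out.push(tok.to_string());
        if *tok == "<eof>" {
            break;
        }
    }
    out
}
```

Given the same input, the `scan` version returns an empty vector while the loop returns every token up to and including the marker.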
Jubilee
125343e7ab
Rollup merge of #127558 - nnethercote:more-Attribute-cleanups, r=petrochenkov
More attribute cleanups

A follow-up to #127308.

r? `@petrochenkov`
2024-07-13 20:19:46 -07:00
Nicholas Nethercote
aa0e8e1475 Fix DebugParser.
It currently doesn't work at all. This commit changes it to a simpler
imperative style that produces a valid `tokens` vec.

(An aside: I find `Iterator::scan` to be a pretty wretched function,
that produces code which is very hard to understand. Probably why this
is just one of two uses of it in the entire compiler.)
2024-07-13 16:42:00 +10:00
Nicholas Nethercote
100f3fd133 Add a new special case to Parser::look_ahead.
This new special case is simpler than the old special case because it
only is used when `dist == 1`. But that's still enough to cover ~98% of
cases. This results in equivalent performance to the old special case,
and identical behaviour as the general case.
2024-07-12 13:35:24 +10:00
Nicholas Nethercote
ebe1305b1e Remove the bogus special case from Parser::look_ahead.
The general case at the bottom of `look_ahead` is slow, because it
clones the token cursor. Above it there is a special case for
performance that is hit most of the time and avoids the cloning.
Unfortunately, its behaviour differs from the general case in two ways.

- When within a pair of delimiters, if you look any distance past the
  closing delimiter you get the closing delimiter instead of what comes
  after the closing delimiter.

- It uses `tree_cursor.look_ahead(dist - 1)` which totally confuses
  tokens with token trees. This means that only the first token in a
  token tree will be seen. E.g. in a sequence like `{ a }` the `a` and
  `}` will be skipped over. Bad!

It's likely that these differences weren't noticed before now because
the use of `look_ahead` in the parser is limited to small distances and
relatively few contexts.

Removing the special case causes slowdowns of up to 2% on a range of
benchmarks. The next commit will add a new, correct special case to
regain that lost performance.
2024-07-12 13:33:38 +10:00
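
The second difference listed in the "bogus special case" commit above, confusing tokens with token trees, can be shown with a toy model. Everything below is a hypothetical simplification (string tokens, brace-only delimiters, zero-based `dist`, and all function names are assumptions); it is meant only to show why indexing by tree skips tokens.

```rust
/// Toy token tree: a single token, or a brace-delimited group of trees.
#[derive(Clone, Debug, PartialEq)]
pub enum TokenTree {
    Token(&'static str),
    Delimited(Vec<TokenTree>),
}

/// Flatten trees into the token sequence a parser would actually see.
pub fn flatten(trees: &[TokenTree], out: &mut Vec<&'static str>) {
    for tt in trees {
        match tt {
            TokenTree::Token(t) => out.push(*t),
            TokenTree::Delimited(inner) => {
                out.push("{");
                flatten(inner, out);
                out.push("}");
            }
        }
    }
}

/// Correct look-ahead: the token `dist` positions ahead in the flat stream.
pub fn look_ahead_tokens(trees: &[TokenTree], dist: usize) -> Option<&'static str> {
    let mut flat = Vec::new();
    flatten(trees, &mut flat);
    flat.get(dist).copied()
}

/// Buggy look-ahead: indexes by tree and reports only that tree's first
/// token, so everything else inside a delimited group is skipped.
pub fn look_ahead_trees(trees: &[TokenTree], dist: usize) -> Option<&'static str> {
    match trees.get(dist)? {
        TokenTree::Token(t) => Some(*t),
        TokenTree::Delimited(_) => Some("{"),
    }
}
```

For the input `{ a } b`, looking one position past the `{` should give `a`, but the tree-indexed version jumps straight to `b`, skipping `a` and `}` as the commit message describes.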
Nicholas Nethercote
f5527949f2 Move Spacing into FlatToken.
It's only needed for the `FlatToken::Token` variant. This makes things a
little more concise.
2024-07-09 21:54:32 +10:00
Nicholas Nethercote
3a5c4b6e4e Rename some attribute types for consistency.
- `AttributesData` -> `AttrsTarget`
- `AttrTokenTree::Attributes` -> `AttrTokenTree::AttrsTarget`
- `FlatToken::AttrTarget` -> `FlatToken::AttrsTarget`
2024-07-07 16:14:30 +10:00
Nicholas Nethercote
9d33a8fe51 Simplify ReplaceRange.
Currently the second element is a `Vec<(FlatToken, Spacing)>`. But the
vector always has zero or one elements, and the `FlatToken` is always
`FlatToken::AttrTarget` (which contains an `AttributesData`), and the
spacing is always `Alone`. So we can simplify it to
`Option<AttributesData>`.

An assertion in `to_attr_token_stream` can also be removed, because
`new_tokens.len()` was always 0 or 1, which means that `range.len()`
is always greater than or equal to it, because `range.is_empty()` is
always false (as per the earlier assertion).
2024-07-07 15:58:36 +10:00
Nicholas Nethercote
3d750e2702 Shrink parser positions from usize to u32.
The number of source code bytes can't exceed a `u32`'s range, so a token
position also can't. This reduces the size of `Parser` and
`LazyAttrTokenStreamImpl` by eight bytes each.
2024-07-02 17:03:53 +10:00
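
The size argument in the "Shrink parser positions" commit above can be checked with `std::mem::size_of`. The two structs below are hypothetical and do not mirror `Parser` or `LazyAttrTokenStreamImpl`; the choice of two position fields is an assumption made only to show the arithmetic, and `narrow` is an illustrative helper.

```rust
use std::mem::size_of;

/// Hypothetical struct holding two positions as `usize`.
pub struct WidePositions {
    pub start: usize,
    pub end: usize,
}

/// The same two positions stored as `u32`.
pub struct NarrowPositions {
    pub start: u32,
    pub end: u32,
}

/// Checked narrowing: returns `None` instead of silently truncating a
/// position that does not fit in a `u32`.
pub fn narrow(pos: usize) -> Option<u32> {
    u32::try_from(pos).ok()
}

/// Bytes saved by the narrower layout on the current target.
pub fn bytes_saved() -> usize {
    size_of::<WidePositions>().saturating_sub(size_of::<NarrowPositions>())
}
```

On a target where `usize` is 64 bits wide, the wide struct occupies 16 bytes and the narrow one 8, so this particular layout saves 8 bytes; on 32-bit targets the two are the same size.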
Michael Goulet
108b3f214a Properly gate safe keyword in pre-expansion 2024-06-20 14:14:49 -04:00
bors
894f7a4ba6 Auto merge of #126678 - nnethercote:fix-duplicated-attrs-on-nt-expr, r=petrochenkov
Fix duplicated attributes on nonterminal expressions

This PR fixes a long-standing bug (#86055) whereby expression attributes can be duplicated when expanded through declarative macros.

First, consider how items are parsed in declarative macros:
```
Items:
- parse_nonterminal
  - parse_item(ForceCollect::Yes)
    - parse_item_
      - attrs = parse_outer_attributes
      - parse_item_common(attrs)
        - maybe_whole!
        - collect_tokens_trailing_token
```
The important thing is that the parsing of outer attributes is outside token collection, so the item's tokens don't include the attributes. This is how it's supposed to be.

Now consider how expressions are parsed in declarative macros:
```
Exprs:
- parse_nonterminal
  - parse_expr_force_collect
    - collect_tokens_no_attrs
      - collect_tokens_trailing_token
        - parse_expr
          - parse_expr_res(None)
            - parse_expr_assoc_with
              - parse_expr_prefix
                - parse_or_use_outer_attributes
                - parse_expr_dot_or_call
```
The important thing is that the parsing of outer attributes is inside token collection, so the expr's tokens do include the attributes, i.e. in `AttributesData::tokens`.

This PR fixes the bug by rearranging expression parsing so that outer attribute parsing happens outside of token collection. This requires a number of small refactorings because expression parsing is somewhat complicated. While doing so the PR makes the code a bit cleaner and simpler, by eliminating `parse_or_use_outer_attributes` and `Option<AttrWrapper>` arguments (in favour of the simpler `parse_outer_attributes` and `AttrWrapper` arguments), and simplifying `LhsExpr`.

r? `@petrochenkov`
2024-06-19 13:58:21 +00:00
Nicholas Nethercote
8170acb197 Refactor parse_expr_res.
This removes the final `Option<AttrWrapper>` argument.
2024-06-19 19:12:02 +10:00
Oli Scherer
c91edc3888 Prefer dcx methods over fields or fields' methods 2024-06-18 13:45:08 +00:00
Michael Goulet
68bd001c00 Make parse_seq_to_before_tokens take expected/nonexpected tokens, use in parse_precise_capturing_syntax 2024-06-17 22:35:25 -04:00
Matthias Krüger
4aceaaa7f3
Rollup merge of #126052 - nnethercote:rustc_parse-more-cleanups, r=spastorino
More `rustc_parse` cleanups

Following on from #125815.

r? `@spastorino`
2024-06-07 20:14:30 +02:00
Oli Scherer
cbee17d502 Revert "Create const block DefIds in typeck instead of ast lowering"
This reverts commit ddc5f9b6c1.
2024-06-07 08:33:58 +00:00
Nicholas Nethercote
95b4c07ef8 Reduce pub exposure. 2024-06-06 08:26:54 +10:00
Santiago Pastorino
2a377122dd
Handle safety keyword for extern block inner items 2024-06-04 14:19:42 -03:00
Oli Scherer
ddc5f9b6c1 Create const block DefIds in typeck instead of ast lowering 2024-05-28 13:38:43 +00:00
Santiago Pastorino
6b46a919e1
Rename Unsafe to Safety 2024-05-17 18:33:37 -03:00
Nicholas Nethercote
95e519ecbf Remove NtIdent and NtLifetime.
The extra span is now recorded in the new `TokenKind::NtIdent` and
`TokenKind::NtLifetime`. These both consist of a single token, and so
there are no operator precedence problems with inserting them directly
into the token stream.

The other way to do this would be to wrap the ident/lifetime in invisible
delimiters, but there's a lot of code that assumes an interpolated
ident/lifetime fits in a single token, and changing all that code to work with
invisible delimiters would have been a pain. (Maybe it could be done in a
follow-up.)

This change might not seem like much of a win, but it's a first step toward the
much bigger and long-desired removal of `Nonterminal` and
`TokenKind::Interpolated`. That change is big and complex enough that it's
worth doing this piece separately. (Indeed, this commit is based on part of a
late commit in #114647, a prior attempt at that big and complex change.)
2024-05-14 08:19:58 +10:00