// rust/compiler/rustc_parse/src/parser/attr.rs

use super::{AttrWrapper, Capturing, FnParseMode, ForceCollect, Parser, PathStyle};
use rustc_ast as ast;
use rustc_ast::attr;
use rustc_ast::token::{self, Nonterminal};
use rustc_ast_pretty::pprust;
use rustc_errors::{error_code, Diagnostic, PResult};
use rustc_span::{sym, BytePos, Span};
use std::convert::TryInto;
use tracing::debug;
// Public for rustfmt usage
#[derive(Debug)]
pub enum InnerAttrPolicy<'a> {
Permitted,
Forbidden { reason: &'a str, saw_doc_comment: bool, prev_attr_sp: Option<Span> },
}
const DEFAULT_UNEXPECTED_INNER_ATTR_ERR_MSG: &str = "an inner attribute is not \
permitted in this context";
pub(super) const DEFAULT_INNER_ATTR_FORBIDDEN: InnerAttrPolicy<'_> = InnerAttrPolicy::Forbidden {
reason: DEFAULT_UNEXPECTED_INNER_ATTR_ERR_MSG,
saw_doc_comment: false,
prev_attr_sp: None,
};
enum OuterAttributeType {
DocComment,
DocBlockComment,
Attribute,
}
impl<'a> Parser<'a> {
/// Parses attributes that appear before an item.
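/// For example (an illustrative snippet; the item is arbitrary), given
///
/// ```text
/// #[inline]
/// /// Some docs.
/// fn f() {}
/// ```
///
/// this collects `#[inline]` and the doc comment (as a doc-comment
/// attribute), then stops at `fn`.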
pub(super) fn parse_outer_attributes(&mut self) -> PResult<'a, AttrWrapper> {
let mut attrs: Vec<ast::Attribute> = Vec::new();
let mut just_parsed_doc_comment = false;
let start_pos = self.token_cursor.num_next_calls;
loop {
let attr = if self.check(&token::Pound) {
let inner_error_reason = if just_parsed_doc_comment {
"an inner attribute is not permitted following an outer doc comment"
} else if !attrs.is_empty() {
"an inner attribute is not permitted following an outer attribute"
} else {
DEFAULT_UNEXPECTED_INNER_ATTR_ERR_MSG
};
let inner_parse_policy = InnerAttrPolicy::Forbidden {
reason: inner_error_reason,
saw_doc_comment: just_parsed_doc_comment,
prev_attr_sp: attrs.last().map(|a| a.span),
};
just_parsed_doc_comment = false;
Some(self.parse_attribute(inner_parse_policy)?)
} else if let token::DocComment(comment_kind, attr_style, data) = self.token.kind {
if attr_style != ast::AttrStyle::Outer {
let span = self.token.span;
let mut err = self.sess.span_diagnostic.struct_span_err_with_code(
span,
"expected outer doc comment",
error_code!(E0753),
);
if let Some(replacement_span) = self.annotate_following_item_if_applicable(
&mut err,
span,
match comment_kind {
token::CommentKind::Line => OuterAttributeType::DocComment,
token::CommentKind::Block => OuterAttributeType::DocBlockComment,
},
) {
err.note(
"inner doc comments like this (starting with `//!` or `/*!`) can \
only appear before items",
);
err.span_suggestion_verbose(
replacement_span,
"you might have meant to write a regular comment",
String::new(),
rustc_errors::Applicability::MachineApplicable,
);
}
err.emit();
}
self.bump();
just_parsed_doc_comment = true;
// Always make an outer attribute - this allows us to recover from a misplaced
// inner attribute.
Some(attr::mk_doc_comment(
comment_kind,
ast::AttrStyle::Outer,
data,
self.prev_token.span,
))
} else {
None
};
if let Some(attr) = attr {
attrs.push(attr);
} else {
break;
}
}
Ok(AttrWrapper::new(attrs.into(), start_pos))
}
/// Matches `attribute = # ! [ meta_item ]`.
/// `inner_parse_policy` prescribes how to handle inner attributes.
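/// For example, `#[inline]` parses as an outer attribute and `#![no_std]` as
/// an inner one. When `inner_parse_policy` is `Forbidden`, an inner attribute
/// is reported as an error but still returned, so parsing can recover.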
// Public for rustfmt usage.
pub fn parse_attribute(
&mut self,
inner_parse_policy: InnerAttrPolicy<'_>,
) -> PResult<'a, ast::Attribute> {
debug!(
"parse_attribute: inner_parse_policy={:?} self.token={:?}",
inner_parse_policy, self.token
);
let lo = self.token.span;
// Attributes can't have attributes of their own [Editor's note: not with that attitude]
self.collect_tokens_no_attrs(|this| {
if this.eat(&token::Pound) {
let style = if this.eat(&token::Not) {
ast::AttrStyle::Inner
} else {
ast::AttrStyle::Outer
};
this.expect(&token::OpenDelim(token::Bracket))?;
let item = this.parse_attr_item(false)?;
this.expect(&token::CloseDelim(token::Bracket))?;
let attr_sp = lo.to(this.prev_token.span);
// Emit error if inner attribute is encountered and forbidden.
if style == ast::AttrStyle::Inner {
this.error_on_forbidden_inner_attr(attr_sp, inner_parse_policy);
}
Ok(attr::mk_attr_from_item(item, None, style, attr_sp))
} else {
let token_str = pprust::token_to_string(&this.token);
let msg = &format!("expected `#`, found `{}`", token_str);
Err(this.struct_span_err(this.token.span, msg))
}
})
}
fn annotate_following_item_if_applicable(
&self,
err: &mut Diagnostic,
span: Span,
attr_type: OuterAttributeType,
) -> Option<Span> {
let mut snapshot = self.clone();
let lo = span.lo()
+ BytePos(match attr_type {
OuterAttributeType::Attribute => 1,
_ => 2,
});
let hi = lo + BytePos(1);
let replacement_span = span.with_lo(lo).with_hi(hi);
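// `replacement_span` covers exactly the `!` byte: offset 1 past the `#` of
// `#![...]`, or offset 2 past the `//`/`/*` of `//!` and `/*!`. The
// suggestions built from it delete that `!` or replace it with `/` or `*`.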
if let OuterAttributeType::DocBlockComment | OuterAttributeType::DocComment = attr_type {
snapshot.bump();
}
loop {
// Skip any other attributes; we want the item.
if snapshot.token.kind == token::Pound {
if let Err(err) = snapshot.parse_attribute(InnerAttrPolicy::Permitted) {
err.cancel();
return Some(replacement_span);
}
} else {
break;
}
}
match snapshot.parse_item_common(
AttrWrapper::empty(),
true,
false,
FnParseMode { req_name: |_| true, req_body: true },
ForceCollect::No,
) {
Ok(Some(item)) => {
let attr_name = match attr_type {
OuterAttributeType::Attribute => "attribute",
_ => "doc comment",
};
err.span_label(
item.span,
&format!("the inner {} doesn't annotate this {}", attr_name, item.kind.descr()),
);
err.span_suggestion_verbose(
replacement_span,
&format!(
"to annotate the {}, change the {} from inner to outer style",
item.kind.descr(),
attr_name
),
(match attr_type {
OuterAttributeType::Attribute => "",
OuterAttributeType::DocBlockComment => "*",
OuterAttributeType::DocComment => "/",
})
.to_string(),
rustc_errors::Applicability::MachineApplicable,
);
return None;
}
Err(item_err) => {
item_err.cancel();
}
Ok(None) => {}
}
Some(replacement_span)
}
pub(super) fn error_on_forbidden_inner_attr(&self, attr_sp: Span, policy: InnerAttrPolicy<'_>) {
if let InnerAttrPolicy::Forbidden { reason, saw_doc_comment, prev_attr_sp } = policy {
let prev_attr_note =
if saw_doc_comment { "previous doc comment" } else { "previous outer attribute" };
let mut diag = self.struct_span_err(attr_sp, reason);
if let Some(prev_attr_sp) = prev_attr_sp {
diag.span_label(attr_sp, "not permitted following an outer attribute")
.span_label(prev_attr_sp, prev_attr_note);
}
diag.note(
"inner attributes, like `#![no_std]`, annotate the item enclosing them, and \
are usually found at the beginning of source files",
);
if self
.annotate_following_item_if_applicable(
&mut diag,
attr_sp,
OuterAttributeType::Attribute,
)
.is_some()
{
diag.note("outer attributes, like `#[test]`, annotate the item following them");
};
diag.emit();
}
}
/// Parses an inner part of an attribute (the path and following tokens).
/// The tokens must be either a delimited token stream, or empty token stream,
/// or the "legacy" key-value form.
/// PATH `(` TOKEN_STREAM `)`
/// PATH `[` TOKEN_STREAM `]`
/// PATH `{` TOKEN_STREAM `}`
/// PATH
/// PATH `=` UNSUFFIXED_LIT
/// The delimiters or `=` are still put into the resulting token stream.
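/// For example, in `#[derive(Debug)]` the attr item is `derive(Debug)`:
/// `path` is `derive` and the delimited args are `(Debug)`. In the legacy
/// key-value form `#[doc = "..."]`, the item is `doc = "..."`.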
pub fn parse_attr_item(&mut self, capture_tokens: bool) -> PResult<'a, ast::AttrItem> {
let item = match self.token.kind {
token::Interpolated(ref nt) => match **nt {
Nonterminal::NtMeta(ref item) => Some(item.clone().into_inner()),
_ => None,
},
_ => None,
};
Ok(if let Some(item) = item {
self.bump();
item
} else {
let do_parse = |this: &mut Self| {
let path = this.parse_path(PathStyle::Mod)?;
let args = this.parse_attr_args()?;
Ok(ast::AttrItem { path, args, tokens: None })
};
// Attr items don't have attributes
if capture_tokens { self.collect_tokens_no_attrs(do_parse) } else { do_parse(self) }?
})
}
/// Parses attributes that appear after the opening of an item. These should
/// be preceded by an exclamation mark, but we accept and warn about one
/// terminated by a semicolon.
///
/// Matches `inner_attrs*`.
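/// For example, both of these parse as inner attributes:
///
/// ```text
/// #![allow(dead_code)]
/// //! Module documentation.
/// ```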
crate fn parse_inner_attributes(&mut self) -> PResult<'a, Vec<ast::Attribute>> {
let mut attrs: Vec<ast::Attribute> = vec![];
loop {
let start_pos: u32 = self.token_cursor.num_next_calls.try_into().unwrap();
// Only try to parse if it is an inner attribute (has `!`).
let attr = if self.check(&token::Pound) && self.look_ahead(1, |t| t == &token::Not) {
Some(self.parse_attribute(InnerAttrPolicy::Permitted)?)
} else if let token::DocComment(comment_kind, attr_style, data) = self.token.kind {
if attr_style == ast::AttrStyle::Inner {
self.bump();
Some(attr::mk_doc_comment(comment_kind, attr_style, data, self.prev_token.span))
} else {
None
}
} else {
None
};
if let Some(attr) = attr {
let end_pos: u32 = self.token_cursor.num_next_calls.try_into().unwrap();
// If we are currently capturing tokens, mark the location of this inner attribute.
// If capturing ends up creating a `LazyTokenStream`, we will include
// this replace range with it, removing the inner attribute from the final
// `AttrAnnotatedTokenStream`. Inner attributes are stored in the parsed AST node.
// During macro expansion, they are selectively inserted back into the
// token stream (the first inner attribute is removed each time we invoke the
// corresponding macro).
let range = start_pos..end_pos;
if let Capturing::Yes = self.capture_state.capturing {
self.capture_state.inner_attr_ranges.insert(attr.id, (range, vec![]));
}
attrs.push(attr);
} else {
break;
}
}
Ok(attrs)
}
crate fn parse_unsuffixed_lit(&mut self) -> PResult<'a, ast::Lit> {
let lit = self.parse_lit()?;
debug!("checking if {:?} is unusuffixed", lit);
if !lit.kind.is_unsuffixed() {
self.struct_span_err(lit.span, "suffixed literals are not allowed in attributes")
.help(
"instead of using a suffixed literal (`1u8`, `1.0f32`, etc.), \
use an unsuffixed version (`1`, `1.0`, etc.)",
)
.emit();
}
Ok(lit)
}
/// Parses `cfg_attr(pred, attr_item_list)` where `attr_item_list` is comma-delimited.
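/// For example, `cfg_attr(unix, allow(dead_code), deprecated)` yields the
/// predicate `unix` and the attr items `allow(dead_code)` and `deprecated`,
/// each paired with its span.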
pub fn parse_cfg_attr(&mut self) -> PResult<'a, (ast::MetaItem, Vec<(ast::AttrItem, Span)>)> {
let cfg_predicate = self.parse_meta_item()?;
self.expect(&token::Comma)?;
// Presumably, the majority of the time there will only be one attr.
let mut expanded_attrs = Vec::with_capacity(1);
while self.token.kind != token::Eof {
let lo = self.token.span;
let item = self.parse_attr_item(true)?;
expanded_attrs.push((item, lo.to(self.prev_token.span)));
if !self.eat(&token::Comma) {
break;
}
}
Ok((cfg_predicate, expanded_attrs))
}
/// Matches `COMMASEP(meta_item_inner)`.
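/// For example, an input of `unix, feature = "serde", all(test, debug_assertions)`
/// parses as three nested meta items.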
crate fn parse_meta_seq_top(&mut self) -> PResult<'a, Vec<ast::NestedMetaItem>> {
// Presumably, the majority of the time there will only be one attr.
let mut nmis = Vec::with_capacity(1);
while self.token.kind != token::Eof {
nmis.push(self.parse_meta_item_inner()?);
if !self.eat(&token::Comma) {
break;
}
}
Ok(nmis)
}
/// Matches the following grammar (per RFC 1559).
///
/// meta_item : PATH ( '=' UNSUFFIXED_LIT | '(' meta_item_inner? ')' )? ;
/// meta_item_inner : (meta_item | UNSUFFIXED_LIT) (',' meta_item_inner)? ;
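/// For example, `inline` is a bare path meta item, `deprecated = "note"` is a
/// name-value pair, and `all(unix, target_pointer_width = "64")` is a meta
/// item whose list holds the word `unix` and the pair
/// `target_pointer_width = "64"`.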
pub fn parse_meta_item(&mut self) -> PResult<'a, ast::MetaItem> {
let nt_meta = match self.token.kind {
token::Interpolated(ref nt) => match **nt {
token::NtMeta(ref e) => Some(e.clone()),
_ => None,
},
_ => None,
};
if let Some(item) = nt_meta {
return match item.meta(item.path.span) {
Some(meta) => {
self.bump();
Ok(meta)
}
None => self.unexpected(),
};
}
let lo = self.token.span;
let path = self.parse_path(PathStyle::Mod)?;
let kind = self.parse_meta_item_kind()?;
let span = lo.to(self.prev_token.span);
Ok(ast::MetaItem { path, kind, span })
}
crate fn parse_meta_item_kind(&mut self) -> PResult<'a, ast::MetaItemKind> {
Ok(if self.eat(&token::Eq) {
ast::MetaItemKind::NameValue(self.parse_unsuffixed_lit()?)
} else if self.check(&token::OpenDelim(token::Paren)) {
// Matches `meta_seq = ( COMMASEP(meta_item_inner) )`.
let (list, _) = self.parse_paren_comma_seq(|p| p.parse_meta_item_inner())?;
ast::MetaItemKind::List(list)
} else {
ast::MetaItemKind::Word
})
}
/// Matches `meta_item_inner : (meta_item | UNSUFFIXED_LIT) ;`.
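/// For example, each element of `any(unix, windows)` is an inner meta item,
/// as is the bare unsuffixed literal `512` in something like `#[attr(512)]`
/// (where `attr` is a stand-in name).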
fn parse_meta_item_inner(&mut self) -> PResult<'a, ast::NestedMetaItem> {
match self.parse_unsuffixed_lit() {
Ok(lit) => return Ok(ast::NestedMetaItem::Literal(lit)),
Err(err) => err.cancel(),
}
match self.parse_meta_item() {
Ok(mi) => return Ok(ast::NestedMetaItem::MetaItem(mi)),
Err(err) => err.cancel(),
}
let found = pprust::token_to_string(&self.token);
let msg = format!("expected unsuffixed literal or identifier, found `{}`", found);
Err(self.struct_span_err(self.token.span, &msg))
}
}
pub fn maybe_needs_tokens(attrs: &[ast::Attribute]) -> bool {
// One of the attributes may either itself be a macro,
// or expand to macro attributes (`cfg_attr`).
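// For example, `#[cfg_attr(test, derive(Debug))]` and a user-defined
// proc-macro attribute both need tokens, while a doc comment or a builtin
// attribute such as `#[inline]` does not.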
attrs.iter().any(|attr| {
if attr.is_doc_comment() {
return false;
}
attr.ident().map_or(true, |ident| {
ident.name == sym::cfg_attr || !rustc_feature::is_builtin_attr_name(ident.name)
})
})
}