rustdoc: Replaces fn main search and extern crate search with proper parsing during doctests.
Fixes#21299.
Fixes#33731.
Let me know if there's any additional changes you'd like made!
This commit avoids an allocation when parsing any float and integer
literals that don't involved underscores.
This reduces the number of allocations done for the `tuple-stress`
benchmark by 10%, reducing its instruction count by just under 1%.
syntax: Optimize some literal parsing
Currently in the `wasm-bindgen` project we have a very very large crate that's
procedurally generated, `web-sys`. To generate this crate we parse all of a
browser's WebIDL and we then generate bindings for all of the APIs contained
within.
The resulting Rust file is 18MB large (wow!) and currently takes a very long
time to compile in debug mode. On the nightly compiler a *debug* build takes 90s
for the crate to finish. I was curious what was taking so long and upon
investigating a *massive* portion of the time was spent in the `lit_token`
method of the compiler, primarily formatting strings via `format!`.
Upon some more investigation it looks like the `byte_str_lit` was allocating an
error message once per byte, causing a very large number of allocations to
happen for large literals, of which wasm-bindgen generates quite a few (some are
MB large).
This commit fixes the issue by lazily allocating the error message, only doing
so if the error message is actually needed (which should be never). As a result,
the debug mode compilation time for our `web-sys` crate decreased from 90s to
20s, a very nice improvement! (although we've still got some work to do).
Currently in the `wasm-bindgen` project we have a very very large crate that's
procedurally generated, `web-sys`. To generate this crate we parse all of a
browser's WebIDL and we then generate bindings for all of the APIs contained
within.
The resulting Rust file is 18MB large (wow!) and currently takes a very long
time to compile in debug mode. On the nightly compiler a *debug* build takes 90s
for the crate to finish. I was curious what was taking so long and upon
investigating a *massive* portion of the time was spent in the `lit_token`
method of the compiler, primarily formatting strings via `format!`.
Upon some more investigation it looks like the `byte_str_lit` was allocating an
error message once per byte, causing a very large number of allocations to
happen for large literals, of which wasm-bindgen generates quite a few (some are
MB large).
This commit fixes the issue by lazily allocating the error message, only doing
so if the error message is actually needed (which should be never). As a result,
the debug mode compilation time for our `web-sys` crate decreased from 90s to
20s, a very nice improvement! (although we've still got some work to do).
async/await
This PR implements `async`/`await` syntax for `async fn` in Rust 2015 and `async` closures and `async` blocks in Rust 2018 (tracking issue: https://github.com/rust-lang/rust/issues/50547). Limitations: non-`move` async closures with arguments are currently not supported, nor are `async fn` with multiple different input lifetimes. These limitations are not fundamental and will be removed in the future, however I'd like to go ahead and get this PR merged so we can start experimenting with this in combination with futures 0.3.
Based on https://github.com/rust-lang/rust/pull/51414.
cc @petrochenkov for parsing changes.
r? @eddyb
This is gated on edition 2018 & the `async_await` feature gate.
The parser will accept `async fn` and `async unsafe fn` as fn
items. Along the same lines as `const fn`, only `async unsafe fn`
is permitted, not `unsafe async fn`.The parser will not accept
`async` functions as trait methods.
To do a little code clean up, four fields of the function type
struct have been merged into the new `FnHeader` struct: constness,
asyncness, unsafety, and ABI.
Also, a small bug in HIR printing is fixed: it previously printed
`const unsafe fn` as `unsafe const fn`, which is grammatically
incorrect.
In the common case, the string value in a string literal Token is the
same as the string value in a string literal LitKind. (The exception is
when escapes or \r are involved.) This patch takes advantage of that to
avoid calling str_lit() and re-interning the string in that case. This
speeds up incremental builds for a few of the rustc-benchmarks, the best
by 3%.
`char_lit` uses an allocation in order to ignore '_' chars in \u{...}
literals. This patch changes it to not do that by processing the chars
more directly.
This improves various rustc-perf benchmark measurements by up to 6%,
particularly regex, futures, clap, coercions, hyper, and encoding.
Do not emit type errors on recovered blocks
When a parse error occurs on a block, the parser will recover and create
a block with the statements collected until that point. Now a flag
stating that a recovery has been performed in this block is propagated
so that the type checker knows that the type of the block (which will be
identified as `()`) shouldn't be checked against the expectation to
reduce the amount of irrelevant diagnostic errors shown to the user.
Fix#44579.
When a parse error occurs on a block, the parser will recover and create
a block with the statements collected until that point. Now a flag
stating that a recovery has been performed in this block is propagated
so that the type checker knows that the type of the block (which will be
identified as `()`) shouldn't be checked against the expectation to
reduce the amount of irrelevant diagnostic errors shown to the user.
The Generics now contain one Vec of an enum for the generic parameters,
rather than two separate Vec's for lifetime and type parameters.
Additionally, places that previously used Vec<LifetimeDef> now use
Vec<GenericParam> instead.
This PR kicks off the implementation of the [default binding modes RFC][1] by
introducing the `pat_binding_modes` typeck table mentioned in the [mentoring
instructions][2].
`pat_binding_modes` is populated in `librustc_typeck/check/_match.rs` and
used wherever the HIR would be scraped prior to this PR. Unfortunately, one
blemish, namely a two callers to `contains_explicit_ref_binding`, remains.
This will likely have to be removed when the second part of [1], the
`pat_adjustments` table, is tackled. Appropriate comments have been added.
See #42640.
[1]: https://github.com/rust-lang/rfcs/pull/2005
[2]: https://github.com/rust-lang/rust/issues/42640#issuecomment-313535089
Add Span to ast::WhereClause
This PR adds `Span` field to `ast::WhereClause`. The motivation here is to make rustfmt's life easier when recovering comments before and after where clause.
r? @nrc
This is then later used by `proc_macro` to generate a new
`proc_macro::TokenTree` which preserves span information. Unfortunately this
isn't a bullet-proof approach as it doesn't handle the case when there's still
other attributes on the item, especially inner attributes.
Despite this the intention here is to solve the primary use case for procedural
attributes, attached to functions as outer attributes, likely bare. In this
situation we should be able to now yield a lossless stream of tokens to preserve
span information.
This is useful if parsing from stdin or a String and don't want to try and read in a module from another file. Instead we just leave a stub in the AST.
This commit updates the version number to 1.17.0 as we're not on that version of
the nightly compiler, and at the same time this updates src/stage0.txt to
bootstrap from freshly minted beta compiler and beta Cargo.
Refactor the parser to consume token trees
This is groundwork for efficiently parsing attribute proc macro invocations, bang macro invocations, and `TokenStream`-based attributes and fragment matchers.
This improves parsing performance by 8-15% and expansion performance by 0-5% on a sampling of the compiler's crates.
r? @nrc
This commit introduces 128-bit integers. Stage 2 builds and produces a working compiler which
understands and supports 128-bit integers throughout.
The general strategy used is to have rustc_i128 module which provides aliases for iu128, equal to
iu64 in stage9 and iu128 later. Since nowhere in rustc we rely on large numbers being supported,
this strategy is good enough to get past the first bootstrap stages to end up with a fully working
128-bit capable compiler.
In order for this strategy to work, number of locations had to be changed to use associated
max_value/min_value instead of MAX/MIN constants as well as the min_value (or was it max_value?)
had to be changed to use xor instead of shift so both 64-bit and 128-bit based consteval works
(former not necessarily producing the right results in stage1).
This commit includes manual merge conflict resolution changes from a rebase by @est31.
Most of the Rust community agrees that the vec! macro is clearer when
called using square brackets [] instead of regular brackets (). Most of
these ocurrences are from before macros allowed using different types of
brackets.
There is one left unchanged in a pretty-print test, as the pretty
printer still wants it to have regular brackets.
This commit does the following.
- Removes parsing support for '\X12', '\u123456' and '\U12345678' char
literals. These are no longer valid Rust and rejected by the lexer.
(This strange-sounding situation occurs because the parser rescans
char literals to compute their value.)
- Rearranges the function so that all the escaped values are handled in
a single `match`, and changes the error-handling to use vanilla
assert!() and unwrap().
This reduces the time taken to run
`rustc -Zparse-only rustc-benchmarks/issue-32278-big-array-of-strings`
from 0.18s to 0.15s on my machine, and reduces the number of
instructions (as measured by Cachegrind) from 1.34B to 1.01B.
With the change applied, the time to fully compile that benchmark is
1.96s, so this is a 1.5% improvement.
Added tokenstream parser procedure
A tiny PR that simply adds a procedure for parsing `TokenStream`s to the parser in `src/libsyntax`. This is to ease using `TokenStream`s with the current (old) procedural macro system.
To allow these braced macro invocation, this PR removes the optional expression from `ast::Block` and instead uses a `StmtKind::Expr` at the end of the statement list.
Currently, braced macro invocations in blocks can expand into statements (and items) except when they are last in a block, in which case they can only expand into expressions.
For example,
```rust
macro_rules! make_stmt {
() => { let x = 0; }
}
fn f() {
make_stmt! {} //< This is OK...
let x = 0; //< ... unless this line is commented out.
}
```
Fixes#34418.
syntax-[breaking-change] cc #31645
(Only breaking because ast::TokenTree is now tokenstream::TokenTree.)
This pull request refactors TokenTrees into their own file as src/libsyntax/tokenstream.rs, moving them out of src/libsyntax/ast.rs, in order to prepare for an accompanying TokenStream implementation (per RFC 1566).
When items are inlined from extern crates, the filename in the debug info
is taken from the FileMap that's serialized in the rlib metadata.
Currently this is just FileMap.name, which is whatever path is passed to rustc.
Since libcore and libstd are built by invoking rustc with relative paths,
they wind up with relative paths in the rlib, and when linked into a binary
the debug info uses relative paths for the names, but since the compilation
directory for the final binary, tools trying to read source filenames
will wind up with bad paths. We noticed this in Firefox with source
filenames from libcore/libstd having bad paths.
This change stores an absolute path in FileMap.abs_path, and uses that
if available for writing debug info. This is not going to magically make
debuggers able to find the source, but it will at least provide sensible
paths.
The extra filename and line was mainly there to keep the indentation
relative to the main snippet; now that this doesn't include
filename/line-number as a prefix, it is distracted.