This PR moves all `compile-fail` tests that fail at the parsing stage to a `parse-fail` directory, in order to use the tests in the `parse-fail` directory to test if the new LALR parser rejects the same files as the Rust parser. I also adjusted the `testparser.py` script to handle the tests in `parse-fail` differently.
However during working on this, I discovered, that Rust's parser sometimes fails during parsing, but does not return a nonzero return code, e.g. compiling `/test/compile-fail/doc-before-semi.rs` with `-Z parse-only` prints an error message, but returns status code 0. Compiling the same file without `-Z parse-only`, the same error message is displayed, but error code 101 returned. I'll look into that over the next week.
This restructures tidy.py to walk the tree itself,
and improves performance considerably by not loading entire
files into buffers for licenseck.
Splits build rules into 'tidy', 'tidy-basic', 'tidy-binaries',
'tidy-errors', 'tidy-features'.
This adds a new lexer/parser combo for the entire Rust language can be generated with with flex and bison, taken from my project at https://github.com/bleibig/rust-grammar. There is also a testing script that runs the generated parser with all *.rs files in the repository (except for tests in compile-fail or ones that marked as "ignore-test" or "ignore-lexer-test"). If you have flex and bison installed, you can run these tests using the new "check-grammar" make target.
This does not depend on or interact with the existing testing code in the grammar, which only provides and tests a lexer specification.
OS X users should take note that the version of bison that comes with the Xcode toolchain (2.3) is too old to work with this grammar, they need to download and install version 3.0 or later.
The parser builds up an S-expression-based AST, which can be displayed by giving the "-v" argument to parser-lalr (normally it only gives output on error). It is only a rough approximation of what is parsed and doesn't capture every detail and nuance of the program.
Hopefully this should be sufficient for issue #2234, or at least a good starting point.
The regex library was largely used for non-critical aspects of the compiler and
various external tooling. The library at this point is duplicated with its
out-of-tree counterpart and as such imposes a bit of a maintenance overhead as
well as compile time hit for the compiler itself.
The last major user of the regex library is the libtest library, using regexes
for filters when running tests. This removal means that the filtering has gone
back to substring matching rather than using regexes.
macro_rules! is like an item that defines a macro. Other items don't have a
trailing semicolon, or use a paren-delimited body.
If there's an argument for matching the invocation syntax, e.g. parentheses for
an expr macro, then I think that applies more strongly to the *inner*
delimiters on the LHS, wrapping the individual argument patterns.
This removes a large array of deprecated functionality, regardless of how
recently it was deprecated. The purpose of this commit is to clean out the
standard libraries and compiler for the upcoming alpha release.
Some notable compiler changes were to enable warnings for all now-deprecated
command line arguments (previously the deprecated versions were silently
accepted) as well as removing deriving(Zero) entirely (the trait was removed).
The distribution no longer contains the libtime or libregex_macros crates. Both
of these have been deprecated for some time and are available externally.
This patch for #15877 uses Antlr's input lookahead (`_input.LA(1) != '.'`) to solve the conflict between the LIT_FLOAT and the range syntax.
Note that in order to execute the grammar tests, #20245 should land first.
Closes#14602. As discussed in that issue, the existing `at` and `name`
functions represent two different results with the empty string:
1. Matched the empty string.
2. Did not match anything.
Consider the following example. This regex has two named matched
groups, `key` and `value`. `value` is optional:
```rust
// Matches "foo", "foo;v=bar" and "foo;v=".
regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?");
```
We can access `value` using `caps.name("value")`, but there's no way for
us to distinguish between the `"foo"` and `"foo;v="` cases.
Early this year, @BurntSushi recommended modifying the existing `at` and
`name` functions to return `Option`, instead of adding new functions to
the API.
This is a [breaking-change], but the fix is easy:
- `refs.at(1)` becomes `refs.at(1).unwrap_or("")`.
- `refs.name(name)` becomes `refs.name(name).unwrap_or("")`.
This common representation for delimeters should make pattern matching easier. Having a separate `token::DelimToken` enum also allows us to enforce the invariant that the opening and closing delimiters must be the same in `ast::TtDelimited`, removing the need to ensure matched delimiters when working with token trees.
https://github.com/rust-lang/rfcs/pull/221
The current terminology of "task failure" often causes problems when
writing or speaking about code. You often want to talk about the
possibility of an operation that returns a Result "failing", but cannot
because of the ambiguity with task failure. Instead, you have to speak
of "the failing case" or "when the operation does not succeed" or other
circumlocutions.
Likewise, we use a "Failure" header in rustdoc to describe when
operations may fail the task, but it would often be helpful to separate
out a section describing the "Err-producing" case.
We have been steadily moving away from task failure and toward Result as
an error-handling mechanism, so we should optimize our terminology
accordingly: Result-producing functions should be easy to describe.
To update your code, rename any call to `fail!` to `panic!` instead.
Assuming you have not created your own macro named `panic!`, this
will work on UNIX based systems:
grep -lZR 'fail!' . | xargs -0 -l sed -i -e 's/fail!/panic!/g'
You can of course also do this by hand.
[breaking-change]
- Removes f128 from the grammar, which is no longer support in rustc
- The fragment modifier is added so it won't parse float suffix as a separate token