From 8ac0b3144c93af5108da779848c51825e01e1858 Mon Sep 17 00:00:00 2001
From: Nick Cameron
Date: Tue, 10 Nov 2015 07:34:44 +1300
Subject: [PATCH] Information for new contributors

---
 Contributing.md | 165 ++++++++++++++++++++++++++++++++++++++++++++++--
 README.md       |  28 +++++++-
 2 files changed, 183 insertions(+), 10 deletions(-)

diff --git a/Contributing.md b/Contributing.md
index 06a11973371..50209a22d47 100644
--- a/Contributing.md
+++ b/Contributing.md
@@ -1,6 +1,13 @@
-## Contributing
+# Contributing
 
-### Test and file issues
+There are many ways to contribute to Rustfmt. This document lays out what they
+are and has information for how to get started. If you have any questions about
+contributing or need help with anything, please ping nrc on irc, #rust-tools is
+probably the best channel. Feel free to also ask questions on issues, or file
+new issues specifically to get help.
+
+
+## Test and file issues
 
 It would be really useful to have people use rustfmt on their projects and file
 issues where it does something you don't expect.
@@ -8,9 +15,13 @@ issues where it does something you don't expect.
 A really useful thing to do that on a crate from the Rust repo. If it does
 something unexpected, file an issue; if not, make a PR to the Rust repo with the
 reformatted code. We hope to get the whole repo consistently rustfmt'ed and to
-replace `make tidy` with rustfmt as a medium-term goal.
+replace `make tidy` with rustfmt as a medium-term goal. Issues with stack traces
+for bugs and/or minimal test cases are especially useful.
 
-### Create test cases
+See this [blog post](http://ncameron.org/blog/rustfmt-ing-rust/) for more details.
+
+
+## Create test cases
 
 Having a strong test suite for a tool like this is essential. It is very easy
 to create regressions. Any tests you can add are very much appreciated.
@@ -45,11 +56,151 @@ that toml file located in `./tests/config/` for its configuration.
 Including `// rustfmt-config: small_tabs.toml` will run your test with the
 configuration file found at `./tests/config/small_tabs.toml`.
 
-### Hack!
+
+## Hack!
 
 Here are some [good starting issues](https://github.com/nrc/rustfmt/issues?q=is%3Aopen+is%3Aissue+label%3Aeasy).
-Note than some of those issues tagged 'easy' are not that easy and might be better
-second issues, rather than good first issues to fix.
 
 If you've found areas which need polish and don't have issues, please submit a
 PR, don't feel there needs to be an issue.
+
+
+### Guidelines
+
+Rustfmt bootstraps, that is, part of its test suite is running itself on its
+source code. So, basically, the only style guideline is that you must pass the
+tests. That ensures that the Rustfmt source code adheres to our own conventions.
+
+Talking of tests, if you add a new feature or fix a bug, please also add a test.
+It's really easy, see above for details. Please run `cargo test` before
+submitting a PR to ensure your patch passes all tests, it's pretty quick.
+
+Please try to avoid leaving `TODO`s in the code. There are a few around, but I
+wish there weren't. You can leave `FIXME`s, preferably with an issue number.
+
+
+### A quick tour of Rustfmt
+
+Rustfmt is basically a pretty printer - that is, its mode of operation is to
+take an AST (abstract syntax tree) and print it in a nice way (including staying
+under the maximum permitted width for a line). In order to get that AST, we
+first have to parse the source text, we use the Rust compiler's parser to do
+that (see [src/lib.rs]). We shy away from doing anything too fancy, such as
+algebraic approaches to pretty printing, instead relying on a heuristic
+approach, 'manually' crafting a string for each AST node. This results in quite
+a lot of code, but it is relatively simple.
+
+The AST is a tree view of source code. It carries all the semantic information
+about the code, but not all of the syntax.
In particular, we lose white space
+and comments (although doc comments are preserved). Rustfmt uses a view of the
+AST before macros are expanded, so there are still macro uses in the code. The
+arguments to macros are not an AST, but raw tokens - this makes them harder to
+format.
+
+There are different nodes for every kind of item and expression in Rust. For
+more details see the source code in the compiler -
+[ast.rs](https://dxr.mozilla.org/rust/source/src/libsyntax/ast.rs) - and/or the
+[docs](http://manishearth.github.io/rust-internals-docs/syntax/ast/index.html).
+
+Many nodes in the AST (but not all, annoyingly) have a `Span`. A `Span` is a
+range in the source code, it can easily be converted to a snippet of source
+text. When the AST does not contain enough information for us, we rely heavily
+on `Span`s. For example, we can look between spans to try and find comments, or
+parse a snippet to see how the user wrote their source code.
+
+The downside of using the AST is that we miss some information - primarily white
+space and comments. White space is sometimes significant, although mostly we
+want to ignore it and make our own. We strive to reproduce all comments, but
+this is sometimes difficult. The crufty corners of Rustfmt are where we hack
+around the absence of comments in the AST and try to recreate them as best we
+can.
+
+Our primary tool here is to look between spans for text we've missed. For
+example, in a function call `foo(a, b)`, we have spans for `a` and `b`, in this
+case there is only a comma and a single space between the end of `a` and the
+start of `b`, so there is nothing much to do. But if we look at
+`foo(a /* a comment */, b)`, then between `a` and `b` we find the comment.
+
+At a higher level, Rustfmt has machinery so that we account for text between
+'top level' items. Then we can reproduce that text pretty much verbatim.
We only
+count spans we actually reformat, so if we can't format a span it is not missed
+completely, but is reproduced in the output without being formatted. This is
+mostly handled in [src/missed_spans.rs]. See also `FmtVisitor::last_pos` in
+[src/visitor.rs].
+
+
+#### Some important elements
+
+At the highest level, Rustfmt uses a `Visitor` implementation called `FmtVisitor`
+to walk the AST. This is in [src/visitor.rs]. This is really just used to walk
+items, rather than the bodies of functions. We also cover macros and attributes
+here. Most methods of the visitor call out to `Rewrite` implementations that
+then walk their own children.
+
+The `Rewrite` trait is defined in [src/rewrite.rs]. It is implemented for many
+things that can be rewritten, mostly AST nodes. It has a single function,
+`rewrite`, which is called to rewrite `self` into an `Option<String>`. The
+arguments are `width` which is the horizontal space we write into, and `offset`
+which is how much we are currently indented from the lhs of the page. We also
+take a context which contains information used for parsing, the current block
+indent, and a configuration (see below).
+
+To understand the indents, consider
+
+```
+impl Foo {
+    fn foo(...) {
+        bar(argument_one,
+            baz());
+    }
+}
+```
+
+When formatting the `bar` call we will format the arguments in order, after the
+first one we know we are working on multiple lines (imagine it is longer than
+written). So, when we come to the second argument, the indent we pass to
+`rewrite` is 12, which puts us under the first argument. The current block
+indent (stored in the context) is 8. The former is used for visual indenting
+(when objects are vertically aligned with some marker), the latter is used for
+block indenting (when objects are tabbed in from the lhs). The width available
+for `baz()` will be the maximum width, minus the space used for indenting, minus
+the space used for the `);`.
(Note that actual argument formatting does not
+quite work like this, but it's close enough).
+
+The `rewrite` function returns an `Option` - either we successfully rewrite and
+return the rewritten string for the caller to use, or we fail to rewrite and
+return `None`. This could be because Rustfmt encounters something it doesn't
+know how to reformat, but more often it is because Rustfmt can't fit the item
+into the required width. How to handle this is up to the caller. Often the
+caller just gives up, ultimately relying on the missed spans system to paste in
+the un-formatted source. A better solution (although not performed in many
+places) is for the caller to shuffle around some of its other items to make
+more width, then call the function again with more space.
+
+Since it is common for callers to bail out when a callee fails, we often use a
+`try_opt!` macro to make this pattern more succinct.
+
+One way we might find out that we don't have enough space is when computing how much
+space we have. Something like `available_space = budget - overhead`. Since
+widths are unsigned integers, this would cause underflow. Therefore we use
+checked subtraction: `available_space = try_opt!(budget.checked_sub(overhead))`.
+`checked_sub` returns an `Option`, and if we would underflow `try_opt!` returns
+`None`, otherwise we proceed with the computed space.
+
+Much syntax in Rust is lists: lists of arguments, lists of fields, lists of
+array elements, etc. We have some generic code to handle lists, including how to
+space them in horizontal and vertical space, indentation, comments between
+items, trailing separators, etc. However, since there are so many options, the
+code is a bit complex. Look in [src/lists.rs]. `write_list` is the key function,
+and `ListFormatting` the key structure for configuration. You'll need to make a
+`ListItems` for input, this is usually done using `itemize_list`.
+
+Rustfmt strives to be highly configurable.
Often the first part of a patch is
+creating a configuration option for the feature you are implementing. All
+handling of configuration options is done in [src/config.rs]. Look for the
+`create_config!` macro at the end of the file for all the options. The rest of
+the file defines a bunch of enums used for options, and the machinery to produce
+the config struct and parse a config file, etc. Checking an option is done by
+accessing the correct field on the config struct, e.g., `config.max_width`. Most
+functions have a `Config`, or one can be accessed via a visitor or context of
+some kind.
diff --git a/README.md b/README.md
index 1b8fd4dcffd..4951ce5d582 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,9 @@
 
 A tool for formatting Rust code according to style guidelines.
 
+If you'd like to help out (and you should, it's a fun project!), see
+[Contributing.md].
+
 ## Installation
 
 
@@ -47,12 +50,31 @@ the command line.
 screen, for example.
 
 
+## What style does Rustfmt use?
+
+Rustfmt is designed to be very configurable. You can create a TOML file called
+rustfmt.toml, place it in the project directory and it will apply the options
+in that file. See `cargo run --help-config` for the options which are available,
+or if you prefer to see source code, [src/config.rs].
+
+By default, Rustfmt uses a style which (mostly) conforms to the
+[Rust style guidelines](https://github.com/rust-lang/rust/tree/master/src/doc/style).
+There are many details which the style guidelines do not cover, and in these
+cases we try to adhere to a style similar to that used in the
+[Rust repo](https://github.com/rust-lang/rust). Once Rustfmt is more complete, and
+able to re-format large repositories like Rust, we intend to go through the Rust
+RFC process to nail down the default style in detail.
+
+If there are styling choices you don't agree with, we are usually happy to add
+options covering different styles. File an issue, or even better, submit a PR.
+
+
 ## Gotchas
 
 * For things you do not want rustfmt to mangle, use one of
-    ```rust
-    #[rustfmt_skip]
-    #[cfg_attr(rustfmt, rustfmt_skip)]
+    ```rust
+    #[rustfmt_skip]
+    #[cfg_attr(rustfmt, rustfmt_skip)]
     ```
 * When you run rustfmt, place a file named rustfmt.toml in target file
   directory or its parents to override the default settings of rustfmt.
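
The checked-subtraction pattern described in the Contributing.md tour (`available_space = budget - overhead`, returning `None` rather than underflowing) can be sketched in isolation. This is an illustration under stated assumptions, not code from rustfmt: `available_width` and its parameters are hypothetical names, and the `?` operator stands in for the `try_opt!` macro the patch mentions.

```rust
/// Hypothetical helper (not part of rustfmt): the width left for an item
/// after subtracting the indent and any trailing text such as `);`.
/// Returns `None` instead of underflowing when there is no room.
fn available_width(max_width: usize, indent: usize, trailing: usize) -> Option<usize> {
    // `checked_sub` yields `None` on underflow; `?` passes that `None`
    // straight back to the caller, much like the `try_opt!` pattern.
    let after_indent = max_width.checked_sub(indent)?;
    after_indent.checked_sub(trailing)
}

fn main() {
    // Illustrative numbers: a 100-column limit, a visual indent of 12,
    // and 2 columns reserved for the closing `);`.
    println!("{:?}", available_width(100, 12, 2));
    // An indent wider than the limit leaves no space at all.
    println!("{:?}", available_width(10, 12, 2));
}
```

With these numbers the first call gives `Some(86)` and the second gives `None`, which a caller can treat as "does not fit" and handle as it sees fit.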