mirror of
https://github.com/rust-lang/rust.git
synced 2025-02-05 11:33:04 +00:00
Explain what ifmt! is all about
This commit is contained in:
parent
1f6afa887b
commit
27b4d104c8
@ -8,6 +8,307 @@
|
||||
// option. This file may not be copied, modified, or distributed
|
||||
// except according to those terms.
|
||||
|
||||
/**!
|
||||
|
||||
# The Formatting Module
|
||||
|
||||
This module contains the runtime support for the `ifmt!` syntax extension. This
|
||||
macro is implemented in the compiler to emit calls to this module in order to
|
||||
format arguments at runtime into strings and streams.
|
||||
|
||||
The functions contained in this module should not normally be used in everyday
|
||||
use cases of `ifmt!`. The assumptions made by these functions are unsafe for all
|
||||
inputs, and the compiler performs a large amount of validation on the arguments
|
||||
to `ifmt!` in order to ensure safety at runtime. While it is possible to call
|
||||
these functions directly, it is not recommended to do so in the general case.
|
||||
|
||||
## Usage
|
||||
|
||||
The `ifmt!` macro is intended to be familiar to those coming from C's
|
||||
printf/sprintf functions or Python's `str.format` function. In its current
|
||||
revision, the `ifmt!` macro returns a `~str` type which is the result of the
|
||||
formatting. In the future it will also be able to pass in a stream to format
|
||||
arguments directly while performing minimal allocations.
|
||||
|
||||
Some examples of the `ifmt!` extension are:
|
||||
|
||||
~~~{.rust}
|
||||
ifmt!("Hello") // => ~"Hello"
|
||||
ifmt!("Hello, {:s}!", "world") // => ~"Hello, world!"
|
||||
ifmt!("The number is {:d}", 1) // => ~"The number is 1"
|
||||
ifmt!("{}", ~[3, 4]) // => ~"~[3, 4]"
|
||||
ifmt!("{value}", value=4) // => ~"4"
|
||||
ifmt!("{} {}", 1, 2) // => ~"1 2"
|
||||
~~~
|
||||
|
||||
From these, you can see that the first argument is a format string. It is
|
||||
required by the compiler for this to be a string literal; it cannot be a
|
||||
variable passed in (in order to perform validity checking). The compiler will
|
||||
then parse the format string and determine if the list of arguments provided is
|
||||
suitable to pass to this format string.
|
||||
|
||||
### Positional parameters
|
||||
|
||||
Each formatting argument is allowed to specify which value argument it's
|
||||
referencing, and if omitted it is assumed to be "the next argument". For
|
||||
example, the format string `{} {} {}` would take three parameters, and they
|
||||
would be formatted in the same order as they're given. The format string
|
||||
`{2} {1} {0}`, however, would format arguments in reverse order.
|
||||
|
||||
A format string is required to use all of its arguments, otherwise it is a
|
||||
compile-time error. You may refer to the same argument more than once in the
|
||||
format string, although it must always be referred to with the same type.
|
||||
|
||||
### Named parameters
|
||||
|
||||
Rust itself does not have a Python-like equivalent of named parameters to a
|
||||
function, but the `ifmt!` macro is a syntax extension which allows it to
|
||||
leverage named parameters. Named parameters are listed at the end of the
|
||||
argument list and have the syntax:
|
||||
|
||||
~~~
|
||||
identifier '=' expression
|
||||
~~~
|
||||
|
||||
It is illegal to put positional parameters (those without names) after arguments
|
||||
which have names. Like positional parameters, it is illegal to provided named
|
||||
parameters that are unused by the format string.
|
||||
|
||||
### Argument types
|
||||
|
||||
Each argument's type is dictated by the format string. It is a requirement that
|
||||
every argument is only ever referred to by one type. When specifying the format
|
||||
of an argument, however, a string like `{}` indicates no type. This is allowed,
|
||||
and if all references to one argument do not provide a type, then the format `?`
|
||||
is used (the type's rust-representation is printed). For example, this is an
|
||||
invalid format string:
|
||||
|
||||
~~~
|
||||
{0:d} {0:s}
|
||||
~~~
|
||||
|
||||
Because the first argument is both referred to as an integer as well as a
|
||||
string.
|
||||
|
||||
Because formatting is done via traits, there is no requirement that the
|
||||
`d` format actually takes an `int`, but rather it simply requires a type which
|
||||
ascribes to the `Signed` formatting trait. There are various parameters which do
|
||||
require a particular type, however. Namely if the sytnax `{:.*s}` is used, then
|
||||
the number of characters to print from the string precedes the actual string and
|
||||
must have the type `uint`. Although a `uint` can be printed with `{:u}`, it is
|
||||
illegal to reference an argument as such. For example, this is another invalid
|
||||
format string:
|
||||
|
||||
~~~
|
||||
{:.*s} {0:u}
|
||||
~~~
|
||||
|
||||
### Formatting traits
|
||||
|
||||
When requesting that an argument be formatted with a particular type, you are
|
||||
actually requesting that an argument ascribes to a particular trait. This allows
|
||||
multiple actual types to be formatted via `{:d}` (like `i8` as well as `int`).
|
||||
The current mapping of types to traits is:
|
||||
|
||||
* `?` => Poly
|
||||
* `d` => Signed
|
||||
* `i` => Signed
|
||||
* `u` => Unsigned
|
||||
* `b` => Bool
|
||||
* `c` => Char
|
||||
* `o` => Octal
|
||||
* `x` => LowerHex
|
||||
* `X` => UpperHex
|
||||
* `s` => String
|
||||
* `p` => Pointer
|
||||
* `t` => Binary
|
||||
|
||||
What this means is that any type of argument which implements the
|
||||
`std::fmt::Binary` trait can then be formatted with `{:t}`. Implementations are
|
||||
provided for these traits for a number of primitive types by the standard
|
||||
library as well. Again, the default formatting type (if no other is specified)
|
||||
is `?` which is defined for all types by default.
|
||||
|
||||
When implementing a format trait for your own time, you will have to implement a
|
||||
method of the signature:
|
||||
|
||||
~~~
|
||||
fn fmt(value: &T, f: &mut std::fmt::Formatter);
|
||||
~~~
|
||||
|
||||
Your type will be passed by-reference in `value`, and then the function should
|
||||
emit output into the `f.buf` stream. It is up to each format trait
|
||||
implementation to correctly adhere to the requested formatting parameters. The
|
||||
values of these parameters will be listed in the fields of the `Formatter`
|
||||
struct. In order to help with this, the `Formatter` struct also provides some
|
||||
helper methods.
|
||||
|
||||
## Internationalization
|
||||
|
||||
The formatting syntax supported by the `ifmt!` extension supports
|
||||
internationalization by providing "methods" which execute various differnet
|
||||
outputs depending on the input. The syntax and methods provided are similar to
|
||||
other internationalization systems, so again nothing should seem alien.
|
||||
Currently two methods are supported by this extension: "select" and "plural".
|
||||
|
||||
Each method will execute one of a number of clauses, and then the value of the
|
||||
clause will become what's the result of the argument's format. Inside of the
|
||||
cases, nested argument strings may be provided, but all formatting arguments
|
||||
must not be done through implicit positional means. All arguments inside of each
|
||||
case of a method must be explicitly selected by their name or their integer
|
||||
position.
|
||||
|
||||
Furthermore, whenever a case is running, the special character `#` can be used
|
||||
to reference the string value of the argument which was selected upon. As an
|
||||
example:
|
||||
|
||||
~~~
|
||||
ifmt!("{0, select, other{#}}", "hello") // => ~"hello"
|
||||
~~~
|
||||
|
||||
This example is the equivalent of `{0:s}` essentially.
|
||||
|
||||
### Select
|
||||
|
||||
The select method is a switch over a `&str` parameter, and the parameter *must*
|
||||
be of the type `&str`. An example of the syntax is:
|
||||
|
||||
~~~
|
||||
{0, select, male{...} female{...} other{...}}
|
||||
~~~
|
||||
|
||||
Breaking this down, the `0`-th argument is selected upon with the `select`
|
||||
method, and then a number of cases follow. Each case is preceded by an
|
||||
identifier which is the match-clause to execute the given arm. In this case,
|
||||
there are two explicit cases, `male` and `female`. The case will be executed if
|
||||
the string argument provided is an exact match to the case selected.
|
||||
|
||||
The `other` case is also a required case for all `select` methods. This arm will
|
||||
be executed if none of the other arms matched the word being selected over.
|
||||
|
||||
### Plural
|
||||
|
||||
The plural method is a switch statement over a `uint` parameter, and the
|
||||
parameter *must* be a `uint`. A plural method in its full glory can be specified
|
||||
as:
|
||||
|
||||
~~~
|
||||
{0, plural, offset=1 =1{...} two{...} many{...} other{...}}
|
||||
~~~
|
||||
|
||||
To break this down, the first `0` indicates that this method is selecting over
|
||||
the value of the first positional parameter to the format string. Next, the
|
||||
`plural` method is being executed. An optionally-supplied `offset` is then given
|
||||
which indicates a number to subtract from argument `0` when matching. This is
|
||||
then followed by a list of cases.
|
||||
|
||||
Each case is allowed to supply a specific value to match upon with the syntax
|
||||
`=N`. This case is executed if the value at argument `0` matches N exactly,
|
||||
without taking the offset into account. A case may also be specified by one of
|
||||
five keywords: `zero`, `one`, `two`, `few`, and `many`. These cases are matched
|
||||
on after argument `0` has the offset taken into account. Currently the
|
||||
definitions of `many` and `few` are hardcoded, but they are in theory defined by
|
||||
the current locale.
|
||||
|
||||
Finally, all `plural` methods must have an `other` case supplied which will be
|
||||
executed if none of the other cases match.
|
||||
|
||||
## Syntax
|
||||
|
||||
The syntax for the formatting language used is drawn from other languages, so it
|
||||
should not be too alien. Arguments are formatted with python-like syntax,
|
||||
meaning that arguments are surrounded by `{}` instead of the C-like `%`. The
|
||||
actual grammar for the formatting syntax is:
|
||||
|
||||
~~~
|
||||
format_string := <text> [ format <text> ] *
|
||||
format := '{' [ argument ] [ ':' format_spec ] [ ',' function_spec ] '}'
|
||||
argument := integer | identifier
|
||||
|
||||
format_spec := [[fill]align][sign]['#'][0][width]['.' precision][type]
|
||||
fill := character
|
||||
align := '<' | '>'
|
||||
sign := '+' | '-'
|
||||
width := count
|
||||
precision := count | '*'
|
||||
type := identifier | ''
|
||||
count := parameter | integer
|
||||
parameter := integer '$'
|
||||
|
||||
function_spec := plural | select
|
||||
select := 'select' ',' ( identifier arm ) *
|
||||
plural := 'plural' ',' [ 'offset:' integer ] ( selector arm ) *
|
||||
selector := '=' integer | keyword
|
||||
keyword := 'zero' | 'one' | 'two' | 'few' | 'many' | 'other'
|
||||
arm := '{' format_string '}'
|
||||
~~~
|
||||
|
||||
## Formatting Parameters
|
||||
|
||||
Each argument being formatted can be transformed by a number of formatting
|
||||
parameters (corresponding to `format_spec` in the syntax above). These
|
||||
parameters affect the string representation of what's being formatted. This
|
||||
syntax draws heavily from Python's, so it may seem a bit familiar.
|
||||
|
||||
### Fill/Alignment
|
||||
|
||||
The fill character is provided normally in conjunction with the `width`
|
||||
parameter. This indicates that if the value being formatted is smaller than
|
||||
`width` some extra characters will be printed around it. The extra characters
|
||||
are specified by `fill`, and the alignment can be one of two options:
|
||||
|
||||
* `<` - the argument is left-aligned in `width` columns
|
||||
* `>` - the argument is right-aligned in `width` columns
|
||||
|
||||
### Sign/#/0
|
||||
|
||||
These can all be interpreted as flags for a particular formatter.
|
||||
|
||||
* '+' - This is intended for numeric types and indicates that the sign should
|
||||
always be printed. Positive signs are never printed by default, and the
|
||||
negative sign is only printed by default for the `Signed` trait. This
|
||||
flag indicates that the correct sign (+ or -) should always be printed.
|
||||
* '-' - Currently not used
|
||||
* '#' - This flag is indicates that the "alternate" form of printing should be
|
||||
used. By default, this only applies to the integer formatting traits and
|
||||
performs like:
|
||||
* `x` - precedes the argument with a "0x"
|
||||
* `X` - precedes the argument with a "0x"
|
||||
* `t` - precedes the argument with a "0b"
|
||||
* `o` - precedes the argument with a "0o"
|
||||
* '0' - This is used to indicate for integer formats that the padding should
|
||||
both be done with a `0` character as well as be sign-aware. A format
|
||||
like `{:08d}` would yield `00000001` for the integer `1`, while the same
|
||||
format would yield `-0000001` for the integer `-1`. Notice that the
|
||||
negative version has one fewer zero than the positive version.
|
||||
|
||||
### Width
|
||||
|
||||
This is a parameter for the "minimum width" that the format should take up. If
|
||||
the value's string does not fill up this many characters, then the padding
|
||||
specified by fill/alignment will be used to take up the required space.
|
||||
|
||||
The default fill/alignment for non-numerics is a space and left-aligned. The
|
||||
defaults for numeric formatters is also a space but with right-alignment. If the
|
||||
'0' flag is specified for numerics, then the implicit fill character is '0'.
|
||||
|
||||
The value for the width can also be provided as a `uint` in the list of
|
||||
parameters by using the `2$` syntax indicating that the second argument is a
|
||||
`uint` specifying the width.
|
||||
|
||||
### Precision
|
||||
|
||||
For non-numeric types, this can be considered a "maximum width". If the
|
||||
resulting string is longer than this width, then it is truncated down to this
|
||||
many characters and only those are emitted.
|
||||
|
||||
For integral types, this has no meaning currently.
|
||||
|
||||
For floating-point types, this indicates how many digits after the decimal point
|
||||
should be printed.
|
||||
|
||||
*/
|
||||
|
||||
use prelude::*;
|
||||
|
||||
use cast;
|
||||
|
Loading…
Reference in New Issue
Block a user