mirror of
https://github.com/rust-lang/rust.git
synced 2025-01-25 22:22:44 +00:00
3831 lines
136 KiB
Plaintext
3831 lines
136 KiB
Plaintext
\input texinfo @c -*-texinfo-*-
|
|
@c %**start of header
|
|
@setfilename rust.info
|
|
@settitle Rust Documentation
|
|
@setchapternewpage odd
|
|
@c %**end of header
|
|
|
|
@syncodeindex fn cp
|
|
@include version.texi
|
|
|
|
@ifinfo
|
|
This manual is for the ``Rust'' programming language.
|
|
|
|
|
|
@uref{http://github.com/graydon/rust}
|
|
|
|
Version: @gitversion
|
|
|
|
|
|
Copyright 2006-2010 Graydon Hoare
|
|
|
|
Copyright 2009-2010 Mozilla Foundation
|
|
|
|
See accompanying LICENSE.txt for terms.
|
|
|
|
@end ifinfo
|
|
|
|
@dircategory Programming
|
|
@direntry
|
|
* rust: (rust). Rust programming language
|
|
@end direntry
|
|
|
|
@titlepage
|
|
@title Rust
|
|
@subtitle A safe, concurrent, practical language.
|
|
@author Graydon Hoare
|
|
@author Mozilla Foundation
|
|
|
|
@page
|
|
@vskip 0pt plus 1filll
|
|
|
|
|
|
@uref{http://github.com/graydon/rust}
|
|
|
|
Version: @gitversion
|
|
|
|
@sp 2
|
|
|
|
Copyright @copyright{} 2006-2010 Graydon Hoare
|
|
|
|
Copyright @copyright{} 2009-2010 Mozilla Foundation
|
|
|
|
See accompanying LICENSE.txt for terms.
|
|
|
|
@end titlepage
|
|
|
|
@everyfooting @| @emph{-- Draft @today --} @|
|
|
|
|
@ifnottex
|
|
@node Top
|
|
@top Top
|
|
|
|
Rust Documentation
|
|
|
|
@end ifnottex
|
|
|
|
@menu
|
|
* Disclaimer:: Notes on a work in progress.
|
|
* Introduction:: Background, intentions, lineage.
|
|
* Tutorial:: Gentle introduction to reading Rust code.
|
|
* Reference:: Systematic reference of language elements.
|
|
* Index:: Index
|
|
@end menu
|
|
|
|
@ifnottex
|
|
Complete table of contents
|
|
@end ifnottex
|
|
|
|
@contents
|
|
|
|
@c ############################################################
|
|
@c Disclaimer
|
|
@c ############################################################
|
|
|
|
@node Disclaimer
|
|
@chapter Disclaimer
|
|
|
|
To the reader,
|
|
|
|
Rust is a work in progress. The language continues to evolve as the design
|
|
shifts and is fleshed out in working code. Certain parts work, certain parts
|
|
do not, certain parts will be removed or changed.
|
|
|
|
This manual is a snapshot written in the present tense. Some features
|
|
described do not yet exist in working code. Some may be temporary. It
|
|
is a @emph{draft}, and we ask that you not take anything you read here
|
|
as either definitive or final. The manual is to help you get a sense
|
|
of the language and its organization, not to serve as a complete
|
|
specification. At least not yet.
|
|
|
|
If you have suggestions to make, please try to focus them on @emph{reductions}
|
|
to the language: possible features that can be combined or omitted. At this
|
|
point, every ``additive'' feature we're likely to support is already on the
|
|
table. The task ahead involves combining, trimming, and implementing.
|
|
|
|
|
|
@c ############################################################
|
|
@c Introduction
|
|
@c ############################################################
|
|
|
|
@node Introduction
|
|
@chapter Introduction
|
|
|
|
@quotation
|
|
We have to fight chaos, and the most effective way of doing that is
|
|
to prevent its emergence.
|
|
@flushright
|
|
- Edsger Dijkstra
|
|
@end flushright
|
|
@end quotation
|
|
@sp 2
|
|
|
|
Rust is a curly-brace, block-structured expression language. It visually
|
|
resembles the C language family, but differs significantly in syntactic and
|
|
semantic details. Its design is oriented toward concerns of ``programming in
|
|
the large'', that is, of creating and maintaining @emph{boundaries} -- both
|
|
abstract and operational -- that preserve large-system @emph{integrity},
|
|
@emph{availability} and @emph{concurrency}.
|
|
|
|
It supports a mixture of imperative procedural, concurrent actor, object
|
|
oriented and pure functional styles. Rust also supports generic programming
|
|
and metaprogramming, in both static and dynamic styles.
|
|
|
|
@menu
|
|
* Goals:: Intentions, motivations.
|
|
* Sales Pitch:: A summary for the impatient.
|
|
* Influences:: Relationship to past languages.
|
|
@end menu
|
|
|
|
|
|
@node Goals
|
|
@section Goals
|
|
|
|
The language design pursues the following goals:
|
|
|
|
@sp 1
|
|
@itemize
|
|
@item Compile-time error detection and prevention.
|
|
@item Run-time fault tolerance and containment.
|
|
@item System building, analysis and maintenance affordances.
|
|
@item Clarity and precision of expression.
|
|
@item Implementation simplicity.
|
|
@item Run-time efficiency.
|
|
@item High concurrency.
|
|
@end itemize
|
|
@sp 1
|
|
|
|
Note that most of these goals are @emph{engineering} goals, not showcases for
|
|
sophisticated language technology. Most of the technology in Rust is
|
|
@emph{old} and has been seen decades earlier in other languages.
|
|
|
|
All new languages are developed in a technological context. Rust's goals arise
|
|
from the context of writing large programs that interact with the internet --
|
|
both servers and clients -- and are thus much more concerned with
|
|
@emph{safety} and @emph{concurrency} than older generations of program. Our
|
|
experience is that these two forces do not conflict; rather they drive system
|
|
design decisions toward extensive use of @emph{partitioning} and
|
|
@emph{statelessness}. Rust aims to make these a more natural part of writing
|
|
programs, within the niche of lower-level, practical, resource-conscious
|
|
languages.
|
|
|
|
|
|
@page
|
|
@node Sales Pitch
|
|
@section Sales Pitch
|
|
|
|
The following comprises a brief ``sales pitch'' overview of the salient
|
|
features of Rust, relative to other languages.
|
|
|
|
@itemize
|
|
|
|
@sp 1
|
|
@item No @code{null} pointers
|
|
|
|
The initialization state of every slot is statically computed as part of the
|
|
typestate system (see below), and requires that all slots are initialized
|
|
before use. There is no @code{null} value; uninitialized slots are
|
|
uninitialized, and can only be written to, not read.
|
|
|
|
The common use for @code{null} in other languages -- as a sentinel value -- is
|
|
subsumed into the more general facility of disjoint union types. A program
|
|
must explicitly model its use of such types.
|
|
|
|
@sp 1
|
|
@item Lightweight tasks with no shared mutable state
|
|
|
|
Like many @emph{actor} languages, Rust provides an isolation (and concurrency)
|
|
model based on lightweight tasks scheduled by the language runtime. These
|
|
tasks are very inexpensive and statically unable to mutate one another's local
|
|
memory. Breaking the rule of task isolation is only possible by calling
|
|
external (C/C++) code.
|
|
|
|
Inter-task communication is typed, asynchronous and simplex, based on passing
|
|
messages over channels to ports. Transmission can be rate-limited or
|
|
rate-unlimited. Selection between multiple senders is pseudo-randomized on the
|
|
receiver side.
|
|
|
|
@sp 1
|
|
@item Predictable native code, simple runtime
|
|
@cindex DWARF
|
|
|
|
The meaning and cost of every operation within a Rust program is intended to
|
|
be easy to model for the reader. The code should not ``surprise'' the
|
|
programmer once it has been compiled.
|
|
|
|
Rust compiles to native code. Rust compilation units are large and the
|
|
compilation model is designed around multi-file, whole-library or
|
|
whole-program optimization. The compiled units are standard loadable objects
|
|
(ELF, PE, Mach-O) containing standard metadata (DWARF) and are compatible with
|
|
existing, standard low-level tools (disassemblers, debuggers, profilers,
|
|
dynamic loaders).
|
|
|
|
The Rust runtime library is a small collection of support code for scheduling,
|
|
memory management, inter-task communication, reflection and runtime
|
|
linkage. This library is written in standard C++ and is quite
|
|
straightforward. It presents a simple interface to embeddings. No
|
|
research-level virtual machine, JIT or garbage collection technology is
|
|
required. It should be relatively easy to adapt a Rust front-end on to many
|
|
existing native toolchains.
|
|
|
|
@sp 1
|
|
@item Integrated system-construction facility
|
|
|
|
The units of compilation of Rust are multi-file amalgamations called
|
|
@emph{crates}. A crate is described by a separate, declarative type of source
|
|
file that guides the compilation of the crate, its packaging, its versioning,
|
|
and its external dependencies. Crates are also the units of distribution and
|
|
loading. Significantly: the dependency graph of crates is @emph{acyclic} and
|
|
@emph{anonymous}: there is no global namespace for crates, and module-level
|
|
recursion cannot cross crate barriers.
|
|
|
|
Unlike many languages, individual modules do @emph{not} carry all the
|
|
mechanisms or restrictions of crates. Modules and crates serve different
|
|
roles.
|
|
|
|
@sp 1
|
|
@item Static control over memory allocation, packing and aliasing.
|
|
|
|
Many values in Rust are allocated @emph{within} their containing stack-frame
|
|
or parent structure. Numbers, records, tuples and tags are all allocated this
|
|
way. To allocate such values in the heap, they must be explicitly
|
|
@emph{boxed}. A @dfn{box} is a pointer to a heap allocation that holds another
|
|
value, its @emph{content}.
|
|
|
|
Boxing and unboxing in Rust is explicit, though in many cases (arithmetic
|
|
operations, name-component dereferencing) Rust will automatically ``reach
|
|
through'' the box to access its content. Box values can be passed and assigned
|
|
independently, like pointers in C; the difference is that in Rust they always
|
|
point to live contents, and are not subject to pointer arithmetic.
|
|
|
|
In addition to boxes, Rust supports a kind of pass-by-reference slot called an
|
|
alias. Forming or releasing an alias does not perform reference-count
|
|
operations; aliases can only be formed on referents that will provably outlive
|
|
the alias, and are therefore only used for passing arguments to
|
|
functions. Aliases are not ``general values'', in the sense that they cannot
|
|
be independently manipulated. They are more like C++ references, except that
|
|
like boxes, aliases are safe: they always point to live values.
|
|
|
|
In addition, every slot (stack-local allocation or alias) has a static
|
|
initialization state that is calculated by the typestate system. This permits
|
|
late initialization of slots in functions with complex control-flow, while
|
|
still guaranteeing that every use of a slot occurs after it has been
|
|
initialized.
|
|
|
|
@sp 1
|
|
@item Static control over mutability and garbage collection.
|
|
|
|
Types in Rust are classified into @emph{layers}. There is a layer of immutable
|
|
values, a layer of state values, and a layer of GC values. By default, all
|
|
types are immutable.
|
|
|
|
If a field within a type is declared as @code{mutable}, then the type is part
|
|
of the @code{state} layer and must be declared as such. Any type directly
|
|
marked as @code{state} @emph{or indirectly referring to} a state type is also
|
|
a state type.
|
|
|
|
If a field within a type is potentially cyclic (this is a narrow, but
|
|
well-defined condition involving mutable recursive types) then it is part of
|
|
the @code{gc} layer and must be declared as such.
|
|
|
|
This classification of data types in Rust interacts with the memory allocation,
|
|
transmission and destruction rules. In particular:
|
|
|
|
@itemize
|
|
@item Only immutable values can be sent over channels.
|
|
@item Only non-GC objects can have destructor functions.
|
|
@end itemize
|
|
|
|
Garbage collection, when present, operates per-task and does not interrupt
|
|
other tasks while running. It is limited to types that need it and can be
|
|
statically avoided altogether by limiting the types in a program to the state
|
|
and immutable layers.
|
|
|
|
Non-GC values are reference-counted and have a deterministic destruction
|
|
order: top-down, immediately upon release of the last live reference.
|
|
|
|
State values can refer to non-state values, but not vice-versa; likewise GC
|
|
values can refer to non-GC values but not vice-versa. Rust therefore
|
|
encourages the programmer to write in a style that consists primarily of
|
|
immutable types, but also permits limited, local (per-task) mutability,
|
|
and provides local (per-task) GC only when required.
|
|
|
|
@sp 1
|
|
@item Stack-based iterators
|
|
|
|
Rust provides a type of function-like multiple-invocation iterator that is
|
|
very efficient: the iterator state lives only on the stack and is tightly
|
|
coupled to the loop that invoked it.
|
|
|
|
@sp 1
|
|
@item Direct interface to C code
|
|
|
|
Rust can load and call many C library functions simply by declaring
|
|
them. Calling a C function is an ``unsafe'' action, and can only be taken
|
|
within a block marked with the @code{unsafe} keyword. Every unsafe block
|
|
in a Rust compilation unit must be explicitly authorized in the crate file.
|
|
|
|
@sp 1
|
|
@item Structural algebraic data types
|
|
|
|
The Rust type system is primarily structural, and contains the standard
|
|
assortment of useful ``algebraic'' type constructors from functional
|
|
languages, such as function types, tuples, record types, vectors, and
|
|
nominally-tagged disjoint unions. Such values may be @emph{pattern-matched} in
|
|
an @code{alt} expression.
|
|
|
|
@sp 1
|
|
@item Generic code
|
|
|
|
Rust supports a simple form of parametric polymorphism: functions, iterators,
|
|
types and objects can be parametrized by other types.
|
|
|
|
@sp 1
|
|
@item Argument binding
|
|
|
|
Rust provides a mechanism of partially binding arguments to functions,
|
|
producing new functions that accept the remaining un-bound arguments. This
|
|
mechanism combines some of the features of lexical closures with some of the
|
|
features of currying, in a smaller and simpler package.
|
|
|
|
@sp 1
|
|
@item Local type inference
|
|
|
|
To save some quantity of programmer key-pressing, Rust supports local type
|
|
inference: signatures of functions, objects and iterators always require type
|
|
annotation, but within the body of a function or iterator many slots can be
|
|
declared @code{auto} and Rust will infer the slot's type from its uses.
|
|
|
|
@sp 1
|
|
@item Structural object system
|
|
|
|
Rust has a lightweight object system based on structural object types: there
|
|
is no ``class hierarchy'' nor any concept of inheritance. Method overriding
|
|
and object restriction are performed explicitly on object values, which are
|
|
little more than order-insensitive records of methods sharing a common private
|
|
value. Objects that reside outside the GC layer can have destructors.
|
|
|
|
@sp 1
|
|
@item Dynamic type
|
|
|
|
Rust includes support for values of a top type, @code{any}, that can hold any
|
|
type of value whatsoever. An @code{any} value is a pair of a type code and a
|
|
boxed value of that type. Injection into an @code{any} and projection by
|
|
type-case-selection is integrated into the language.
|
|
|
|
@sp 1
|
|
@item Dynamic metaprogramming (reflection)
|
|
|
|
Rust supports run-time reflection on the structure of a crate, using a
|
|
combination of custom descriptor structures and the DWARF metadata tables used
|
|
to support crate linkage and other runtime services.
|
|
|
|
@sp 1
|
|
@item Static metaprogramming (syntactic extension)
|
|
|
|
Rust supports a system for syntactic extensions that can be loaded into the
|
|
compiler, to implement user-defined notations, macros, program-generators and
|
|
the like. These notations are @emph{marked} using a special form of
|
|
bracketing, such that a reader unfamiliar with the extension can still parse
|
|
the surrounding text by skipping over the bracketed ``extension text''.
|
|
|
|
@sp 1
|
|
@item Idempotent failure
|
|
|
|
If a task fails due to a signal, or if it evaluates the special @code{fail}
|
|
expression, it enters the @emph{failing} state. A failing task unwinds its
|
|
control stack, frees all of its owned resources (executing destructors) and
|
|
enters the @emph{dead} state. Failure is idempotent and non-recoverable.
|
|
|
|
@sp 1
|
|
@item Signal handling
|
|
|
|
Rust has a system for propagating task-failures and other spontaneous
|
|
events between tasks. Some signals can be trapped and redirected to
|
|
channels; other signals are fatal and result in task-failure. Tasks
|
|
can designate other tasks to handle signals for them. This permits
|
|
organizing tasks into mutually-supervising or mutually-failing groups.
|
|
|
|
@sp 1
|
|
@item Deterministic destruction
|
|
|
|
Non-GC objects can have destructor functions, which are executed
|
|
deterministically in top-down ownership order, as control frames are exited
|
|
and/or objects are otherwise freed from data structures holding them. The same
|
|
destructors are run in the same order whether the object is deleted by
|
|
unwinding during failure or normal execution.
|
|
|
|
Similarly, the rules for freeing non-GC values are deterministic and
|
|
predictable: on scope-exit or structure-release, local slots are released
|
|
immediately. Referenced boxes have their reference count decreased and are
|
|
released if the count drops to zero. Aliases are silently forgotten.
|
|
|
|
GC values are local to a task, and are subject to per-task garbage
|
|
collection. As a result, unreferenced GC-layer boxes are not necessarily freed
|
|
immediately; if an unreferenced GC box is part of an acyclic graph, it is
|
|
freed when the last reference to it drops, but if it is part of a reference
|
|
cycle it will be freed when the GC collects it (or when the owning task
|
|
terminates, at the latest).
|
|
|
|
GC values can point to non-GC values but not vice-versa. Doing so merely
|
|
delays (to an undefined future time) the moment when the deterministic,
|
|
top-down destruction sequence for the referenced non-GC values
|
|
@emph{start}. In other words, the non-GC ``leaves'' of a GC value are released
|
|
in a locally-predictable order, even if the ``interior'' cyclic part of the GC
|
|
value is released in an unpredictable order.
|
|
|
|
@sp 1
|
|
@item Typestate system
|
|
|
|
Every storage slot in a Rust frame participates in not only a conventional
|
|
structural static type system, describing the interpretation of memory in the
|
|
slot, but also a @emph{typestate} system. The static typestates of a program
|
|
describe the set of @emph{pure, dynamic predicates} that provably hold over
|
|
some set of slots, at each point in the program's control-flow graph within
|
|
each frame. The static calculation of the typestates of a program is a
|
|
function-local dataflow problem, and handles user-defined predicates in a
|
|
similar fashion to the way the type system permits user-defined types.
|
|
|
|
A short way of thinking of this is: types statically model values,
|
|
typestates statically model @emph{assertions that hold} before and
|
|
after statements and expressions.
|
|
|
|
@end itemize
|
|
|
|
|
|
@page
|
|
@node Influences
|
|
@section Influences
|
|
@sp 2
|
|
|
|
@quotation
|
|
The essential problem that must be solved in making a fault-tolerant
|
|
software system is therefore that of fault-isolation. Different programmers
|
|
will write different modules, some modules will be correct, others will have
|
|
errors. We do not want the errors in one module to adversely affect the
|
|
behaviour of a module which does not have any errors.
|
|
|
|
@flushright
|
|
- Joe Armstrong
|
|
@end flushright
|
|
@end quotation
|
|
@sp 2
|
|
|
|
@quotation
|
|
In our approach, all data is private to some process, and processes can
|
|
only communicate through communications channels. @emph{Security}, as used
|
|
in this paper, is the property which guarantees that processes in a system
|
|
cannot affect each other except by explicit communication.
|
|
|
|
When security is absent, nothing which can be proven about a single module
|
|
in isolation can be guaranteed to hold when that module is embedded in a
|
|
system [...]
|
|
@flushright
|
|
- Robert Strom and Shaula Yemini
|
|
@end flushright
|
|
@end quotation
|
|
@sp 2
|
|
|
|
@quotation
|
|
Concurrent and applicative programming complement each other. The
|
|
ability to send messages on channels provides I/O without side effects,
|
|
while the avoidance of shared data helps keep concurrent processes from
|
|
colliding.
|
|
@flushright
|
|
- Rob Pike
|
|
@end flushright
|
|
@end quotation
|
|
@sp 2
|
|
|
|
@page
|
|
Rust is not a particularly original language. It may however appear unusual by
|
|
contemporary standards, as its design elements are drawn from a number of
|
|
``historical'' languages that have, with a few exceptions, fallen out of
|
|
favour. Five prominent lineages contribute the most:
|
|
|
|
@itemize
|
|
@sp 1
|
|
@item
|
|
The NIL (1981) and Hermes (1990) family. These languages were developed by
|
|
Robert Strom, Shaula Yemini, David Bacon and others in their group at IBM
|
|
Watson Research Center (Yorktown Heights, NY, USA).
|
|
|
|
@sp 1
|
|
@item
|
|
The Erlang (1987) language, developed by Joe Armstrong, Robert Virding, Claes
|
|
Wikstr@"om, Mike Williams and others in their group at the Ericsson Computer
|
|
Science Laboratory (@"Alvsj@"o, Stockholm, Sweden) .
|
|
|
|
@sp 1
|
|
@item
|
|
The Sather (1990) language, developed by Stephen Omohundro, Chu-Cheow Lim,
|
|
Heinz Schmidt and others in their group at The International Computer Science
|
|
Institute of the University of California, Berkeley (Berkeley, CA, USA).
|
|
|
|
@sp 1
|
|
@item
|
|
The Newsqueak (1988), Alef (1995), and Limbo (1996) family. These languages
|
|
were developed by Rob Pike, Phil Winterbottom, Sean Dorward and others in
|
|
their group at Bell labs Computing Sciences Reserch Center (Murray Hill, NJ,
|
|
USA).
|
|
|
|
@sp 1
|
|
@item
|
|
The Napier (1985) and Napier88 (1988) family. These languages were developed
|
|
by Malcolm Atkinson, Ron Morrison and others in their group at the University
|
|
of St. Andrews (St. Andrews, Fife, UK).
|
|
@end itemize
|
|
|
|
@sp 1
|
|
Additional specific influences can be seen from the following languages:
|
|
@itemize
|
|
@item The structural algebraic types and compilation manager of SML.
|
|
@item The syntax-extension systems of Camlp4 and the Common Lisp readtable.
|
|
@item The deterministic destructor system of C++.
|
|
@end itemize
|
|
|
|
@c ############################################################
|
|
@c Tutorial
|
|
@c ############################################################
|
|
|
|
@node Tutorial
|
|
@chapter Tutorial
|
|
|
|
@emph{TODO}.
|
|
|
|
@c ############################################################
|
|
@c Reference
|
|
@c ############################################################
|
|
|
|
@node Reference
|
|
@chapter Reference
|
|
|
|
@menu
|
|
* Ref.Lex:: Lexical structure.
|
|
* Ref.Path:: References to slots and items.
|
|
* Ref.Gram:: Grammar.
|
|
* Ref.Comp:: Compilation and component model.
|
|
* Ref.Mem:: Semantic model of memory.
|
|
* Ref.Task:: Semantic model of tasks.
|
|
* Ref.Item:: The components of a module.
|
|
* Ref.Type:: The types of values held in memory.
|
|
* Ref.Typestate:: Predicates that hold at points in time.
|
|
* Ref.Stmt:: Components of an executable block.
|
|
* Ref.Expr:: Units of execution and evaluation.
|
|
* Ref.Run:: Organization of runtime services.
|
|
@end menu
|
|
|
|
@node Ref.Lex
|
|
@section Ref.Lex
|
|
@c * Ref.Lex:: Lexical structure.
|
|
@cindex Lexical structure
|
|
@cindex Token
|
|
|
|
The lexical structure of a Rust source file or crate file is defined in terms
|
|
of Unicode character codes and character properties.
|
|
|
|
Groups of Unicode character codes and characters are organized into
|
|
@emph{tokens}. Tokens are defined as the longest contiguous sequence of
|
|
characters within the same token type (identifier, keyword, literal, symbol),
|
|
or interrupted by ignored characters.
|
|
|
|
Most tokens in Rust follow rules similar to the C family.
|
|
|
|
Most tokens (including whitespace, keywords, operators and structural symbols)
|
|
are drawn from the ASCII-compatible range of Unicode. Identifiers are drawn
|
|
from Unicode characters specified by the @code{XID_start} and
|
|
@code{XID_continue} rules given by UAX #31@footnote{Unicode Standard Annex
|
|
#31: Unicode Identifier and Pattern Syntax}. String and character literals may
|
|
include the full range of Unicode characters.
|
|
|
|
@emph{TODO: formalize this section much more}.
|
|
|
|
@menu
|
|
* Ref.Lex.Ignore:: Ignored characters.
|
|
* Ref.Lex.Ident:: Identifier tokens.
|
|
* Ref.Lex.Key:: Keyword tokens.
|
|
* Ref.Lex.Res:: Reserved tokens.
|
|
* Ref.Lex.Num:: Numeric tokens.
|
|
* Ref.Lex.Text:: String and character tokens.
|
|
* Ref.Lex.Syntax:: Syntactic extension tokens.
|
|
* Ref.Lex.Sym:: Special symbol tokens.
|
|
@end menu
|
|
|
|
@node Ref.Lex.Ignore
|
|
@subsection Ref.Lex.Ignore
|
|
@c * Ref.Lex.Ignore:: Ignored tokens.
|
|
|
|
Characters considered to be @emph{whitespace} or @emph{comment} are ignored,
|
|
and are not considered as tokens. They serve only to delimit tokens. Rust is
|
|
otherwise a free-form language.
|
|
|
|
@dfn{Whitespace} is any of the following Unicode characters: U+0020 (space),
|
|
U+0009 (tab, @code{'\t'}), U+000A (LF, @code{'\n'}), U+000D (CR, @code{'\r'}).
|
|
|
|
@dfn{Comments} are @emph{single-line comments} or @emph{multi-line comments}.
|
|
|
|
A @dfn{single-line comment} is any sequence of Unicode characters beginning
|
|
with U+002F U+002F (@code{"//"}) and extending to the next U+000A character,
|
|
@emph{excluding} cases in which such a sequence occurs within a string literal
|
|
token or a syntactic extension token.
|
|
|
|
A @dfn{multi-line comments} is any sequence of Unicode characters beginning
|
|
with U+002F U+002A (@code{"/*"}) and ending with U+002A U+002F (@code{"*/"}),
|
|
@emph{excluding} cases in which such a sequence occurs within a string literal
|
|
token or a syntactic extension token. Multi-line comments may be nested.
|
|
|
|
@node Ref.Lex.Ident
|
|
@subsection Ref.Lex.Ident
|
|
@c * Ref.Lex.Ident:: Identifier tokens.
|
|
@cindex Identifier token
|
|
|
|
Identifiers follow the rules given by Unicode Standard Annex #31, in the form
|
|
closed under NFKC normalization, @emph{excluding} those tokens that are
|
|
otherwise defined as keywords or reserved
|
|
tokens. @xref{Ref.Lex.Key}. @xref{Ref.Lex.Res}.
|
|
|
|
That is: an identifier starts with any character having derived property
|
|
@code{XID_Start} and continues with zero or more characters having derived
|
|
property @code{XID_Continue}; and such an identifier is NFKC-normalized during
|
|
lexing, such that all subsequent comparison of identifiers is performed on the
|
|
NFKC-normalized forms.
|
|
|
|
@emph{TODO: define relationship between Unicode and Rust versions}.
|
|
|
|
@footnote{This identifier syntax is a superset of the identifier syntaxes of C
|
|
and Java, and is modeled on Python PEP #3131, which formed the definition of
|
|
identifiers in Python 3.0 and later.}
|
|
|
|
@node Ref.Lex.Key
|
|
@subsection Ref.Lex.Key
|
|
@c * Ref.Lex.Key:: Keyword tokens.
|
|
|
|
The keywords are:
|
|
@cindex Keywords
|
|
|
|
@sp 2
|
|
|
|
@multitable @columnfractions .15 .15 .15 .15 .15
|
|
@item @code{use}
|
|
@tab @code{syntax}
|
|
@tab @code{mutable}
|
|
@tab @code{native}
|
|
@item @code{mod}
|
|
@tab @code{import}
|
|
@tab @code{export}
|
|
@tab @code{let}
|
|
@tab @code{auto}
|
|
@item @code{state}
|
|
@tab @code{gc}
|
|
@tab @code{const}
|
|
@tab @code{thread}
|
|
@item @code{auth}
|
|
@tab @code{unsafe}
|
|
@tab @code{as}
|
|
@tab @code{self}
|
|
@tab @code{log}
|
|
@item @code{bind}
|
|
@tab @code{type}
|
|
@tab @code{true}
|
|
@tab @code{false}
|
|
@tab @code{any}
|
|
@item @code{int}
|
|
@tab @code{uint}
|
|
@tab @code{float}
|
|
@tab @code{char}
|
|
@tab @code{bool}
|
|
@item @code{u8}
|
|
@tab @code{u16}
|
|
@tab @code{u32}
|
|
@tab @code{u64}
|
|
@tab @code{f32}
|
|
@item @code{i8}
|
|
@tab @code{i16}
|
|
@tab @code{i32}
|
|
@tab @code{i64}
|
|
@tab @code{f64}
|
|
@item @code{rec}
|
|
@tab @code{tup}
|
|
@tab @code{tag}
|
|
@tab @code{vec}
|
|
@tab @code{str}
|
|
@item @code{fn}
|
|
@tab @code{iter}
|
|
@tab @code{pred}
|
|
@tab @code{obj}
|
|
@tab @code{drop}
|
|
@item @code{task}
|
|
@tab @code{port}
|
|
@tab @code{chan}
|
|
@tab @code{spawn}
|
|
@tab @code{with}
|
|
@item @code{if}
|
|
@tab @code{else}
|
|
@tab @code{alt}
|
|
@tab @code{case}
|
|
@tab @code{in}
|
|
@item @code{do}
|
|
@tab @code{while}
|
|
@tab @code{break}
|
|
@tab @code{cont}
|
|
@tab @code{note}
|
|
@item @code{assert}
|
|
@tab @code{claim}
|
|
@tab @code{check}
|
|
@tab @code{prove}
|
|
@tab @code{fail}
|
|
@item @code{for}
|
|
@tab @code{each}
|
|
@tab @code{ret}
|
|
@tab @code{put}
|
|
@tab @code{be}
|
|
@end multitable
|
|
|
|
@node Ref.Lex.Res
|
|
@subsection Ref.Lex.Res
|
|
@c * Ref.Lex.Res:: Reserved tokens.
|
|
|
|
The reserved tokens are:
|
|
@cindex Reserved
|
|
|
|
@sp 2
|
|
|
|
@multitable @columnfractions .15 .15 .15 .15 .15
|
|
@item @code{f16}
|
|
@tab @code{f80}
|
|
@tab @code{f128}
|
|
@item @code{m32}
|
|
@tab @code{m64}
|
|
@tab @code{m128}
|
|
@tab @code{dec}
|
|
@end multitable
|
|
|
|
@sp 2
|
|
|
|
At present these tokens have no defined meaning in the Rust language.
|
|
|
|
These tokens may correspond, in some current or future implementation,
|
|
to additional built-in types for decimal floating-point, extended
|
|
binary and interchange floating-point formats, as defined in the IEEE
|
|
754-1985 and IEEE 754-2008 specifications.
|
|
|
|
|
|
@node Ref.Lex.Num
|
|
@subsection Ref.Lex.Num
|
|
@c * Ref.Lex.Num:: Numeric tokens.
|
|
@cindex Number token
|
|
@cindex Hex token
|
|
@cindex Decimal token
|
|
@cindex Binary token
|
|
@cindex Floating-point token
|
|
|
|
@c FIXME: This discussion isn't quite right since 'f' and 'i' can be used as
|
|
@c suffixes
|
|
|
|
A @dfn{number literal} is either an @emph{integer literal} or a
|
|
@emph{floating-point literal}.
|
|
|
|
@sp 1
|
|
An @dfn{integer literal} has one of three forms:
|
|
@enumerate
|
|
@item A @dfn{decimal literal} starts with a @emph{decimal digit} and continues
|
|
with any mixture of @emph{decimal digits} and @emph{underscores}.
|
|
|
|
@item A @dfn{hex literal} starts with the character sequence U+0030
|
|
U+0078 (@code{"0x"}) and continues as any mixture @emph{hex digits}
|
|
and @emph{underscores}.
|
|
|
|
@item A @dfn{binary literal} starts with the character sequence U+0030
|
|
U+0062 (@code{"0b"}) and continues as any mixture @emph{binary digits}
|
|
and @emph{underscores}.
|
|
|
|
@end enumerate
|
|
|
|
By default, an integer literal is of type @code{int}. An integer literal may
|
|
be followed (immediately, without any spaces) by a @dfn{integer suffix}, which
|
|
changes the type of the literal. There are three kinds of integer literal
|
|
suffix:
|
|
|
|
@enumerate
|
|
@item The @code{u} suffix gives the literal type @code{uint}.
|
|
@item The @code{g} suffix gives the literal type @code{big}.
|
|
@item Each of the signed and unsigned machine types @code{u8}, @code{i8},
|
|
@code{u16}, @code{i16}, @code{u32}, @code{i32}, @code{u64} and @code{i64}
|
|
give the literal the corresponding machine type.
|
|
@end enumerate
|
|
|
|
@sp 1
|
|
A @dfn{floating-point literal} has one of two forms:
|
|
@enumerate
|
|
@item Two @emph{decimal literals} separated by a period
|
|
character U+002E ('.'), with an optional @emph{exponent} trailing after the
|
|
second @emph{decimal literal}.
|
|
@item A single @emph{decimal literal} followed by an @emph{exponent}.
|
|
@end enumerate
|
|
|
|
By default, a floating-point literal is of type @code{float}. A floating-point
|
|
literal may be followed (immediately, without any spaces) by a
|
|
@dfn{floating-point suffix}, which changes the type of the literal. There are
|
|
only two floating-point suffixes: @code{f32} and @code{f64}. Each of these
|
|
gives the floating point literal the associated type, rather than
|
|
@code{float}.
|
|
|
|
A set of suffixes are also reserved to accommodate literal support for
|
|
types corresponding to reserved tokens. The reserved suffixes are @code{f16},
|
|
@code{f80}, @code{f128}, @code{m}, @code{m32}, @code{m64} and @code{m128}.
|
|
|
|
@sp 1
|
|
A @dfn{hex digit} is either a @emph{decimal digit} or else a character in the
|
|
ranges U+0061-U+0066 and U+0041-U+0046 (@code{'a'}-@code{'f'},
|
|
@code{'A'}-@code{'F'}).
|
|
|
|
A @dfn{binary digit} is either the character U+0030 or U+0031 (@code{'0'} or
|
|
@code{'1'}).
|
|
|
|
An @dfn{exponent} begins with either of the characters U+0065 or U+0045
|
|
(@code{'e'} or @code{'E'}), followed by an optional @emph{sign character},
|
|
followed by a trailing @emph{decimal literal}.
|
|
|
|
A @dfn{sign character} is either U+002B or U+002D (@code{'+'} or @code{'-'}).
|
|
|
|
|
|
Examples of integer literals of various forms:
|
|
@example
|
|
123; // type int
|
|
123u; // type uint
|
|
123_u; // type uint
|
|
0xff00; // type int
|
|
0xffu8; // type u8
|
|
0b1111_1111_1001_0000_i32; // type i32
|
|
0xffff_ffff_ffff_ffff_ffff_ffffg; // type big
|
|
@end example
|
|
|
|
|
|
Examples of floating-point literals of various forms:
|
|
@example
|
|
123.0; // type float
|
|
0.1; // type float
|
|
0.1f32; // type f32
|
|
12E+99_f64; // type f64
|
|
@end example
|
|
|
|
|
|
@node Ref.Lex.Text
|
|
@subsection Ref.Lex.Text
|
|
@c * Ref.Lex.Key:: String and character tokens.
|
|
@cindex String token
|
|
@cindex Character token
|
|
@cindex Escape sequence
|
|
@cindex Unicode
|
|
|
|
A @dfn{character literal} is a single Unicode character enclosed within two
|
|
U+0027 (single-quote) characters, with the exception of U+0027 itself, which
|
|
must be @emph{escaped} by a preceding U+005C character ('\').
|
|
|
|
A @dfn{string literal} is a sequence of any Unicode characters enclosed
|
|
within two U+0022 (double-quote) characters, with the exception of U+0022
|
|
itself, which must be @emph{escaped} by a preceding U+005C character
|
|
('\').
|
|
|
|
Some additional @emph{escapes} are available in either character or string
|
|
literals. An escape starts with a U+005C ('\') and continues with one
|
|
of the following forms:
|
|
@itemize
|
|
@item An @dfn{8-bit codepoint escape} escape starts with U+0078 ('x') and is
|
|
followed by exactly two @dfn{hex digits}. It denotes the Unicode codepoint
|
|
equal to the provided hex value.
|
|
@item A @dfn{16-bit codepoint escape} starts with U+0075 ('u') and is followed
|
|
by exactly four @dfn{hex digits}. It denotes the Unicode codepoint equal to
|
|
the provided hex value.
|
|
@item A @dfn{32-bit codepoint escape} starts with U+0055 ('U') and is followed
|
|
by exactly eight @dfn{hex digits}. It denotes the Unicode codepoint equal to
|
|
the provided hex value.
|
|
@item A @dfn{whitespace escape} is one of the characters U+006E, U+0072, or
|
|
U+0074, denoting the unicode values U+000A (LF), U+000D (CR) or U+0009 (HT)
|
|
respectively.
|
|
@item The @dfn{backslash escape} is the character U+005C ('\') which must be
|
|
escaped in order to denote @emph{itself}.
|
|
@end itemize
|
|
|
|
@node Ref.Lex.Syntax
|
|
@subsection Ref.Lex.Syntax
|
|
@c * Ref.Lex.Syntax:: Syntactic extension tokens.
|
|
|
|
Syntactic extensions are marked with the @emph{pound} sigil U+0023 (@code{#}),
|
|
followed by a qualified name of a compile-time imported module item, an
|
|
optional parenthesized list of @emph{parsed expressions}, and an optional
|
|
brace-enclosed region of free-form text (with brace-matching and
|
|
brace-escaping used to determine the limit of the
|
|
region). @xref{Ref.Comp.Syntax}.
|
|
|
|
@emph{TODO: formalize those terms more}.
|
|
|
|
@node Ref.Lex.Sym
|
|
@subsection Ref.Lex.Sym
|
|
@c * Ref.Lex.Sym:: Special symbol tokens.
|
|
|
|
@cindex Symbol
|
|
@cindex Operator
|
|
|
|
The special symbols are:
|
|
|
|
@sp 2
|
|
|
|
@multitable @columnfractions .1 .1 .1 .1 .1 .1
|
|
|
|
@item @code{@@}
|
|
@tab @code{_}
|
|
@item @code{#}
|
|
@tab @code{:}
|
|
@tab @code{.}
|
|
@tab @code{;}
|
|
@tab @code{,}
|
|
@item @code{[}
|
|
@tab @code{]}
|
|
@tab @code{@{}
|
|
@tab @code{@}}
|
|
@tab @code{(}
|
|
@tab @code{)}
|
|
@item @code{=}
|
|
@tab @code{<-}
|
|
@tab @code{<|}
|
|
@tab @code{<+}
|
|
@tab @code{->}
|
|
@item @code{+}
|
|
@tab @code{++}
|
|
@tab @code{+=}
|
|
@tab @code{-}
|
|
@tab @code{--}
|
|
@tab @code{-=}
|
|
@item @code{*}
|
|
@tab @code{/}
|
|
@tab @code{%}
|
|
@tab @code{*=}
|
|
@tab @code{/=}
|
|
@tab @code{%=}
|
|
@item @code{&}
|
|
@tab @code{|}
|
|
@tab @code{!}
|
|
@tab @code{~}
|
|
@tab @code{^}
|
|
@item @code{&=}
|
|
@tab @code{|=}
|
|
@tab @code{^=}
|
|
@tab @code{!=}
|
|
@item @code{>>}
|
|
@tab @code{>>>}
|
|
@tab @code{<<}
|
|
@tab @code{<<=}
|
|
@tab @code{>>=}
|
|
@tab @code{>>>=}
|
|
@item @code{<}
|
|
@tab @code{<=}
|
|
@tab @code{==}
|
|
@tab @code{>=}
|
|
@tab @code{>}
|
|
@item @code{&&}
|
|
@tab @code{||}
|
|
@end multitable
|
|
|
|
@page
|
|
@page
|
|
@node Ref.Path
|
|
@section Ref.Path
|
|
@c * Ref.Path:: References to slots and items.
|
|
@cindex Names of items or slots
|
|
@cindex Path name
|
|
@cindex Type parameters
|
|
|
|
A @dfn{path} is a ubiquitous syntactic form in Rust that deserves special
|
|
attention. A path denotes a slot or an
|
|
item. @xref{Ref.Mem.Slot}. @xref{Ref.Item}. Every slot and item in a Rust
|
|
crate has a @emph{canonical path} that refers to it from the crate top-level,
|
|
as well as a number of shorter @emph{relative paths} that may also denote it
|
|
in inner scopes of the crate. There is no way to define a slot or item without
|
|
a canonical path within its crate (with the exception of the crate's implicit
|
|
top-level module). Paths have meaning only within a specific
|
|
crate. @xref{Ref.Comp.Crate}.
|
|
|
|
Paths consist of period-separated components. In the simplest form, path
|
|
components are identifiers. @xref{Ref.Lex.Ident}.
|
|
|
|
Two examples of simple paths consisting of only identifier components:
|
|
@example
|
|
x;
|
|
x.y.z;
|
|
@end example
|
|
|
|
Paths fall into two important categories: @emph{names} and
|
|
@emph{lvals}.
|
|
|
|
A @dfn{name} denotes an item, and is statically resolved to its
|
|
referent at compile time.
|
|
|
|
An @dfn{lval} denotes a slot or some component of a value held within a slot,
|
|
and is statically resolved at compile time to a sequence of memory operations
|
|
and primitive (arithmetic) expressions that will be executed to load or store
|
|
the associated value, starting from the task stack frame, at run time.
|
|
|
|
In some contexts, the Rust grammar accepts a general @emph{path}, but a
|
|
subsequent syntactic restriction requires the path to be an lval or a name. In
|
|
other words: in some contexts an lval is required (for example, on the left
|
|
hand side of the copy operator, @pxref{Ref.Expr.Copy}) and in other contexts a
|
|
name is required (for example, as a type parameter, @pxref{Ref.Item}). In no
|
|
case is the grammar made ambiguous by accepting a general path and restricting
|
|
allowed paths to names or lvals after parsing. These restrictions are noted in
|
|
the grammar. @xref{Ref.Gram}.
|
|
|
|
A name component may include type parameters. Type parameters are denoted by
|
|
square brackets. Square brackets are used @emph{only} to denote type
|
|
parameters in Rust. If a name component includes a type parameter, the type
|
|
parameter must also resolve statically to a type in the environment of the
|
|
name. Type parameters are only part of the names of items. @xref{Ref.Item}.
|
|
|
|
An example of a name with type parameters:
|
|
@example
|
|
m.map[int,str];
|
|
@end example
|
|
|
|
An lval component may include an indexing operator. Index operators are
|
|
enclosed in parentheses and can include any integral expression. Indexing
|
|
operators can only be applied to vectors or strings, and imply a run-time
|
|
bounds-check. @xref{Ref.Type.Vec}.
|
|
|
|
An example of an lval with a dynamic indexing operator:
|
|
@example
|
|
x.y.(1 + v).z;
|
|
@end example
|
|
|
|
@page
|
|
@node Ref.Gram
|
|
@section Ref.Gram
|
|
@c * Ref.Gram:: Grammar.
|
|
|
|
@emph{TODO: mostly LL(1), it reads like C, Alef and bits of Napier;
|
|
formalize here}.
|
|
|
|
@page
|
|
@node Ref.Comp
|
|
@section Ref.Comp
|
|
@c * Ref.Comp:: Compilation and component model.
|
|
@cindex Compilation model
|
|
|
|
Rust is a @emph{compiled} language. Its semantics are divided along a
|
|
@emph{phase distinction} between compile-time and run-time. Those semantic
|
|
rules that have a @emph{static interpretation} govern the success or failure
|
|
of compilation. A program that fails to compile due to violation of a
|
|
compile-time rule has no defined semantics at run-time; the compiler should
|
|
halt with an error report, and produce no executable artifact.
|
|
|
|
The compilation model centres on artifacts called @emph{crates}. Each
|
|
compilation is directed towards a single crate in source form, and if
|
|
successful produces a single crate in executable form.
|
|
|
|
@menu
|
|
* Ref.Comp.Crate:: Units of compilation and linking.
|
|
* Ref.Comp.Meta:: Metadata about a crate.
|
|
* Ref.Comp.Syntax:: Syntax extensions.
|
|
@end menu
|
|
|
|
@node Ref.Comp.Crate
|
|
@subsection Ref.Comp.Crate
|
|
@c * Ref.Comp.Crate:: Units of compilation and linking.
|
|
@cindex Crate
|
|
|
|
A @dfn{crate} is a unit of compilation and linking, as well as versioning,
|
|
distribution and runtime loading. Crates are defined by @emph{crate source
|
|
files}, which are a type of source file written in a special declarative
|
|
language: @emph{crate language}.@footnote{A crate is somewhat analogous to an
|
|
@emph{assembly} in the ECMA-335 CLI model, a @emph{library} in the SML/NJ
|
|
Compilation Manager, a @emph{unit} in the Owens and Flatt module system, or a
|
|
@emph{configuration} in Mesa.} A crate source file describes:
|
|
|
|
@itemize
|
|
@item Metadata about the crate, such as author, name, version, and copyright.
|
|
@item The source-file and directory modules that make up the crate.
|
|
@item The set of syntax extensions to enable for the crate.
|
|
@item Any external crates or native modules that the crate imports to its top level.
|
|
@item The organization of the crate's internal namespace.
|
|
@item The set of names exported from the crate.
|
|
@end itemize
|
|
|
|
A single crate source file may describe the compilation of a large number of
|
|
Rust source files; it is compiled in its entirety, as a single indivisible
|
|
unit. The compilation phase attempts to transform a single crate source file,
|
|
and its referenced contents, into a single compiled crate. Crate source files
|
|
and compiled crates have a 1:1 relationship.
|
|
|
|
The syntactic form of a crate is a sequence of @emph{directives}, some of
|
|
which have nested sub-directives.
|
|
|
|
A crate defines an implicit top-level anonymous module: within this module,
|
|
all members of the crate have canonical path names. @xref{Ref.Path}. The
|
|
@code{mod} directives within a crate file specify sub-modules to include in
|
|
the crate: these are either directory modules, corresponding to directories in
|
|
the filesystem of the compilation environment, or file modules, corresponding
|
|
to Rust source files. The names given to such modules in @code{mod} directives
|
|
become prefixes of the paths of items and slots defined within any included
|
|
Rust source files.
|
|
|
|
The @code{use} directives within the crate specify @emph{other crates} to scan
|
|
for, locate, import into the crate's module namespace during compilation, and
|
|
link against at runtime. Use directives may also occur independently in rust
|
|
source files. These directives may specify loose or tight ``matching
|
|
criteria'' for imported crates, depending on the preferences of the crate
|
|
developer. In the simplest case, a @code{use} directive may only specify a
|
|
symbolic name and leave the task of locating and binding an appropriate crate
|
|
to a compile-time heuristic. In a more controlled case, a @code{use} directive
|
|
may specify any metadata as matching criteria, such as a URI, an author name
|
|
or version number, a checksum or even a cryptographic signature, in order to
|
|
select an an appropriate imported crate. @xref{Ref.Comp.Meta}.
|
|
|
|
The compiled form of a crate is a loadable and executable object file full of
|
|
machine code, in a standard loadable operating-system format such as ELF, PE
|
|
or Mach-O. The loadable object contains extensive DWARF metadata, describing:
|
|
@itemize
|
|
@item Metadata required for type reflection.
|
|
@item The publicly exported module structure of the crate.
|
|
@item Any metadata about the crate, defined by @code{meta} directives.
|
|
@item The crates to dynamically link with at run-time, with matching criteria
|
|
derived from the same @code{use} directives that guided compile-time imports.
|
|
@end itemize
|
|
|
|
The @code{syntax} directives of a crate are similar to the @code{use}
|
|
directives, except they govern the syntax extension namespace (accessed
|
|
through the syntax-extension sigil @code{#}, @pxref{Ref.Comp.Syntax})
|
|
available only at compile time. A @code{syntax} directive also makes its
|
|
extension available to all subsequent directives in the crate file.
|
|
|
|
An example of a crate:
|
|
|
|
@example
|
|
// Metadata about this crate
|
|
meta (author = "Jane Doe",
|
|
name = "projx"
|
|
desc = "Project X",
|
|
ver = "2.5");
|
|
|
|
// Import a module.
|
|
use std (ver = "1.0");
|
|
|
|
// Activate a syntax-extension.
|
|
syntax re;
|
|
|
|
// Define some modules.
|
|
mod foo = "foo.rs";
|
|
mod bar @{
|
|
mod quux = "quux.rs";
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Comp.Meta
|
|
@subsection Ref.Comp.Meta
|
|
@cindex Metadata, in crates
|
|
|
|
@c FIXME: This section is out of date. The @code{meta} keyword has been replaced
|
|
@c by general crate/item attributes.
|
|
|
|
In a crate, a @code{meta} directive associates free form key-value metadata
|
|
with the crate. This metadata can, in turn, be used in providing partial
|
|
matching parameters to syntax-extension loading and crate importing
|
|
directives, denoted by @code{syntax} and @code{use} keywords respectively.
|
|
|
|
Alternatively, metadata can serve as a simple form of documentation.
|
|
|
|
@node Ref.Comp.Syntax
|
|
@subsection Ref.Comp.Syntax
|
|
@c * Ref.Comp.Syntax:: Syntax extension.
|
|
@cindex Syntax extension
|
|
|
|
Rust provides a notation for @dfn{syntax extension}. The notation is a marked
|
|
syntactic form that can appear as an expression, statement or item in the body
|
|
of a Rust program, or as a directive in a Rust crate, and which causes the
|
|
text enclosed within the marked form to be translated through a named
|
|
extension function loaded into the compiler at compile-time.
|
|
|
|
The compile-time extension function must return a value of the corresponding
|
|
Rust AST type, either an expression node, a statement node or an item
|
|
node. @footnote{The syntax-extension system is analogous to the extensible
|
|
reader system provided by Lisp @emph{readtables}, or the Camlp4 system of
|
|
Objective Caml.} @xref{Ref.Lex.Syntax}.
|
|
|
|
A syntax extension is enabled by a @code{syntax} directive, which must occur
|
|
in a crate file. When the Rust compiler encounters a @code{syntax} directive
|
|
in a crate file, it immediately loads the named syntax extension, and makes it
|
|
available for all subsequent crate directives within the enclosing block scope
|
|
of the crate file, and all Rust source files referenced as modules from the
|
|
enclosing block scope of the crate file.
|
|
|
|
For example, this extension might provide a syntax for regular
|
|
expression literals:
|
|
|
|
@example
|
|
// In a crate file:
|
|
|
|
// Requests the 're' syntax extension from the compilation environment.
|
|
syntax re;
|
|
|
|
// Also declares an import dependency on the module 're'.
|
|
use re;
|
|
|
|
// Reference to a Rust source file as a module in the crate.
|
|
mod foo = "foo.rs";
|
|
|
|
@dots{}
|
|
|
|
// In the source file "foo.rs", use the #re syntax extension and
|
|
// the re module at run-time.
|
|
let str s = get_string();
|
|
let regex pattern = #re.pat@{ aa+b? @};
|
|
let bool matched = re.match(pattern, s);
|
|
@end example
|
|
|
|
@page
|
|
@node Ref.Mem
|
|
@section Ref.Mem
|
|
@c * Ref.Mem:: Semantic model of memory.
|
|
@cindex Memory model
|
|
@cindex Box
|
|
@cindex Slot
|
|
|
|
A Rust task's memory consists of a static set of @emph{items}, a set of tasks
|
|
each with its own @emph{stack}, and a @emph{heap}. Immutable portions of the
|
|
heap may be shared between tasks, mutable portions may not.
|
|
|
|
Allocations in the stack consist of @emph{slots}, and allocations in the heap
|
|
consist of @emph{boxes}.
|
|
|
|
@menu
|
|
* Ref.Mem.Alloc:: Memory allocation model.
|
|
* Ref.Mem.Own:: Memory ownership model.
|
|
* Ref.Mem.Slot:: Stack memory model.
|
|
* Ref.Mem.Box:: Heap memory model.
|
|
* Ref.Mem.Acct:: Memory accounting model.
|
|
@end menu
|
|
|
|
@node Ref.Mem.Alloc
|
|
@subsection Ref.Mem.Alloc
|
|
@c * Ref.Mem.Alloc:: Memory allocation model.
|
|
@cindex Item
|
|
@cindex Stack
|
|
@cindex Heap
|
|
@cindex Shared box
|
|
@cindex Task-local box
|
|
|
|
The @dfn{items} of a program are those functions, iterators, objects, modules
|
|
and types that have their value calculated at compile-time and stored uniquely
|
|
in the memory image of the rust process. Items are neither dynamically
|
|
allocated nor freed.
|
|
|
|
A task's @dfn{stack} consists of activation frames automatically allocated on
|
|
entry to each function as the task executes. A stack allocation is reclaimed
|
|
when control leaves the frame containing it.
|
|
|
|
The @dfn{heap} is a general term that describes two separate sets of boxes:
|
|
@emph{task-local} state and GC boxes, and the @emph{shared} immutable boxes.
|
|
|
|
State and GC boxes are @dfn{task-local}, owned by the task. Like any other
|
|
state or GC value, they cannot pass over channels. State and GC boxes do not
|
|
outlive the task that owns them. When unreferenced, they are either
|
|
immediately destructed (if acyclic) or else collected using a general
|
|
(cycle-aware) garbage-collector local to each task. Garbage collection within
|
|
a local heap does not interrupt execution of other tasks.
|
|
|
|
Immutable boxes are @dfn{shared}, and can be multiply-referenced by many
|
|
different tasks. Like any other immutable type, they can pass over channels,
|
|
and live as long as the last task referencing them within a given domain. When
|
|
unreferenced, they are destroyed immediately (due to reference-counting) and
|
|
returned to the heap memory allocator. Destruction of an immutable box also
|
|
executes within the context of the task that drops the last reference to a
|
|
shared heap allocation, so executing a long-running destructor does not
|
|
interrupt execution of other tasks.
|
|
|
|
|
|
@node Ref.Mem.Own
|
|
@subsection Ref.Mem.Own
|
|
@c * Ref.Mem.Own:: Memory ownership model.
|
|
@cindex Ownership
|
|
|
|
A task @emph{owns} all the @emph{stack-local} slot allocations in its stack
|
|
and @emph{task-local} boxes accessible from its stack. A task @emph{shares}
|
|
ownership of @emph{shared} boxes accessible from its stack. A task does not
|
|
own any items.
|
|
|
|
@dfn{Ownership} of an allocation means that the owning task is the only task
|
|
that can access the allocation.
|
|
|
|
@dfn{Sharing} of an allocation means that the same allocation may be
|
|
concurrently read by multiple tasks. The only shared allocations are those
|
|
that are non-state.
|
|
|
|
When a stack frame is exited, its local allocations are all released, and its
|
|
references to boxes (both shared and owned) are dropped.
|
|
|
|
When a task finishes, its stack is necessarily empty and it therefore has no
|
|
references to any boxes.
|
|
|
|
@node Ref.Mem.Slot
|
|
@subsection Ref.Mem.Slot
|
|
@c * Ref.Mem.Slot:: Stack memory model.
|
|
@cindex Stack
|
|
@cindex Slot
|
|
@cindex Local slot
|
|
@cindex Alias slot
|
|
|
|
A task's stack contains slots.
|
|
|
|
A @dfn{slot} is a component of a stack frame. A slot is either @emph{local} or
|
|
an @emph{alias}.
|
|
|
|
A @dfn{local} slot (or @emph{stack-local} allocation) holds a value directly,
|
|
allocated within the stack's memory. The value is a part of the stack frame.
|
|
|
|
An @dfn{alias} references a value outside the frame. An alias may refer to a
|
|
value allocated in another frame @emph{or} a boxed value in the heap. The
|
|
alias-formation rules ensure that the referent of an alias will outlive the
|
|
alias.
|
|
|
|
Local slots are always implicitly mutable.
|
|
|
|
Local slots are not initialized when allocated; the entire frame worth of
|
|
local slots are allocated at once, on frame-entry, in an uninitialized
|
|
state. Subsequent statements within a function may or may not initialize the
|
|
local slots. Local slots can only be used after they have been initialized;
|
|
this condition is guaranteed by the typestate system.
|
|
|
|
Aliases can @emph{only} be declared as arguments in a function or iterator
|
|
signature, bound to the lifetime of a stack frame. Aliases are not general
|
|
values and cannot be held in boxed allocations or other general data types.
|
|
|
|
Alias slots are indicated by the @emph{ampersand} sigil @code{&}.
|
|
|
|
An example function that accepts an alias parameter:
|
|
@example
|
|
type point3d = rec(int x, int y, int z);
|
|
|
|
fn extract_z(&point3d p) -> int @{
|
|
ret p.z;
|
|
@}
|
|
@end example
|
|
|
|
An example function that accepts an alias to a mutable value:
|
|
@example
|
|
fn incr(& mutable int i) @{
|
|
i = i + 1;
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Mem.Box
|
|
@subsection Ref.Mem.Box
|
|
@c * Ref.Mem.Box:: Heap memory model.
|
|
@cindex Box
|
|
@cindex Dereference operator
|
|
|
|
A @dfn{box} is a reference to a reference-counted heap allocation holding
|
|
another value.
|
|
|
|
Box types and values are constructed by the @emph{at} sigil @code{@@}.
|
|
|
|
An example of constructing a box type and value:
|
|
@example
|
|
let @@int x = @@10;
|
|
@end example
|
|
|
|
Some operations implicitly dereference boxes. Examples of such @dfn{implicit
|
|
dereference} operations are:
|
|
@itemize
|
|
@item arithmetic operators (@code{x + y - z})
|
|
@item name-component selection (@code{x.y.z})
|
|
@end itemize
|
|
|
|
An example of an implicit-dereference operation performed on box values:
|
|
@example
|
|
let @@int x = @@10;
|
|
let @@int y = @@12;
|
|
assert (x + y == 22);
|
|
@end example
|
|
|
|
Other operations act on box values as single-word-sized address values,
|
|
automatically adjusting reference counts on the associated heap
|
|
allocation. For these operations, to access the value held in the box requires
|
|
an explicit dereference of the box value. Explicitly dereferencing a box is
|
|
indicated with the unary @emph{star} operator @code{*}. Examples of such
|
|
@dfn{explicit dereference} operations are:
|
|
@itemize
|
|
@item copying box values (@code{x = y})
|
|
@item passing box values to functions (@code{f(x,y)})
|
|
@end itemize
|
|
|
|
An example of an explicit-dereference operation performed on box values:
|
|
@example
|
|
fn takes_boxed(@@int b) @{
|
|
@}
|
|
|
|
fn takes_unboxed(int b) @{
|
|
@}
|
|
|
|
fn main() @{
|
|
let @@int x = @@10;
|
|
takes_boxed(x);
|
|
takes_unboxed(*x);
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Mem.Acct
|
|
@subsection Ref.Mem.Acct
|
|
@c * Ref.Mem.Acct:: Memory accounting model.
|
|
@cindex Domain
|
|
@cindex Accounting
|
|
@cindex Memory budget
|
|
|
|
Every task belongs to a domain, and that domain tracks the amount of memory
|
|
allocated and not yet released by tasks within it. @xref{Ref.Task.Dom}. Each
|
|
domain has a memory budget. The @dfn{budget} of a domain is the maximum amount
|
|
of memory that can be simultaneously allocated in the domain. If a task tries
|
|
to allocate memory within a domain with an exceeded budget, the task will
|
|
receive a signal.
|
|
|
|
Within a task, accounting is strictly enforced: all memory allocated through
|
|
the runtime library, both user data, sub-domains and runtime-support
|
|
structures such as channel and signal queues, are charged to a task's domain.
|
|
|
|
When a communication channel crosses from one domain to another, any value
|
|
sent over the channel is guaranteed to have been @emph{detached} from the
|
|
domain's memory graph (singly referenced, and/or deep-copied), so its memory
|
|
cost is transferred to the receiving domain.
|
|
|
|
|
|
@page
|
|
@node Ref.Task
|
|
@section Ref.Task
|
|
@c * Ref.Task:: Semantic model of tasks.
|
|
@cindex Task
|
|
@cindex Process
|
|
|
|
An executing Rust program consists of a tree of tasks. A Rust @dfn{task}
|
|
consists of an entry function, a stack, a set of outgoing communication
|
|
channels and incoming communication ports, and ownership of some portion of
|
|
the heap of a single operating-system process.
|
|
|
|
Multiple Rust tasks may coexist in a single operating-system
|
|
process. Execution of multiple Rust tasks in a single operating-system process
|
|
may be either truly concurrent or interleaved by the runtime scheduler. Rust
|
|
tasks are lightweight: each consumes less memory than an operating-system
|
|
process, and switching between Rust tasks is faster than switching between
|
|
operating-system processes.
|
|
|
|
@menu
|
|
* Ref.Task.Comm:: Inter-task communication.
|
|
* Ref.Task.Life:: Task lifecycle and state transitions.
|
|
* Ref.Task.Dom:: Task domains.
|
|
* Ref.Task.Sched:: Task scheduling model.
|
|
@end menu
|
|
|
|
@node Ref.Task.Comm
|
|
@subsection Ref.Task.Comm
|
|
@c * Ref.Task.Comm:: Inter-task communication.
|
|
|
|
@cindex Communication
|
|
@cindex Port
|
|
@cindex Channel
|
|
@cindex Message passing
|
|
@cindex Send expression
|
|
@cindex Receive expression
|
|
|
|
With the exception of @emph{unsafe} blocks, Rust tasks are isolated from
|
|
interfering with one another's memory directly. Instead of manipulating shared
|
|
storage, Rust tasks communicate with one another using a typed, asynchronous,
|
|
simplex message-passing system.
|
|
|
|
A @dfn{port} is a communication endpoint that can @emph{receive}
|
|
messages. Ports receive messages from channels.
|
|
|
|
A @dfn{channel} is a communication endpoint that can @emph{send}
|
|
messages. Channels send messages to ports.
|
|
|
|
Each port is implicitly boxed and mutable; as such a port has a unique
|
|
per-task identity and cannot be replicated or transmitted. If a port value is
|
|
copied, both copies refer to the @emph{same} port. New ports can be
|
|
constructed dynamically and stored in data structures.
|
|
|
|
Each channel is bound to a port when the channel is constructed, so the
|
|
destination port for a channel must exist before the channel itself. A channel
|
|
cannot be rebound to a different port from the one it was constructed with.
|
|
|
|
Many channels can be bound to the same port, but each channel is bound to a
|
|
single port. In other words, channels and ports exist in an N:1 relationship,
|
|
N channels to 1 port. @footnote{It may help to remember nautical terminology
|
|
when differentiating channels from ports. Many different waterways --
|
|
channels -- may lead to the same port.}
|
|
|
|
Each port and channel can carry only one type of message. The message type is
|
|
encoded as a parameter of the channel or port type. The message type of a
|
|
channel is equal to the message type of the port it is bound to.
|
|
|
|
Messages are sent asynchronously or semi-synchronously. A channel contains a
|
|
message queue and asynchronously sending a message merely inserts it into the
|
|
sending channel's queue; message receipt is the responsibility of the
|
|
receiving task.
|
|
|
|
Queued messages in channels are charged to the domain of the @emph{sending}
|
|
task. If too many messages are queued for transmission from a single sending
|
|
task, without being received by a receiving task, the sending task may exceed
|
|
its memory budget, which causes a run-time signal. To help control this
|
|
possibility, a semi-synchronous send operation is possible, which blocks until
|
|
there is room in the existing queue and then executes an asynchronous send.
|
|
|
|
The asynchronous message-send operator is @code{<+}. The semi-synchronous
|
|
message-send operator is @code{<|}. @xref{Ref.Expr.Send}. The message-receive
|
|
operator is @code{<-}. @xref{Ref.Expr.Recv}.
|
|
|
|
@node Ref.Task.Life
|
|
@subsection Ref.Task.Life
|
|
@c * Ref.Task.Life:: Task lifecycle and state transitions.
|
|
|
|
@cindex Lifecycle of task
|
|
@cindex Scheduling
|
|
@cindex Running, task state
|
|
@cindex Blocked, task state
|
|
@cindex Failing, task state
|
|
@cindex Dead, task state
|
|
@cindex Soft failure
|
|
@cindex Hard failure
|
|
|
|
The @dfn{lifecycle} of a task consists of a finite set of states and events
|
|
that cause transitions between the states. The lifecycle states of a task are:
|
|
|
|
@itemize
|
|
@item running
|
|
@item blocked
|
|
@item failing
|
|
@item dead
|
|
@end itemize
|
|
|
|
A task begins its lifecycle -- once it has been spawned -- in the
|
|
@emph{running} state. In this state it executes the statements of its entry
|
|
function, and any functions called by the entry function.
|
|
|
|
A task may transition from the @emph{running} state to the @emph{blocked}
|
|
state any time it evaluates a communication expression on a port or channel that
|
|
cannot be immediately completed. When the communication expression can be
|
|
completed -- when a message arrives at a sender, or a queue drains
|
|
sufficiently to complete a semi-synchronous send -- then the blocked task will
|
|
unblock and transition back to @emph{running}.
|
|
|
|
A task may transition to the @emph{failing} state at any time, due to an
|
|
un-trapped signal or the evaluation of a @code{fail} expression. Once
|
|
@emph{failing}, a task unwinds its stack and transitions to the @emph{dead}
|
|
state. Unwinding the stack of a task is done by the task itself, on its own
|
|
control stack. If a value with a destructor is freed during unwinding, the
|
|
code for the destructor is run, also on the task's control
|
|
stack. Running the destructor code causes a temporary transition to a
|
|
@emph{running} state, and allows the destructor code to cause any
|
|
subsequent state transitions. The original task of unwinding and
|
|
failing thereby may suspend temporarily, and may involve (recursive)
|
|
unwinding of the stack of a failed destructor. Nonetheless, the
|
|
outermost unwinding activity will continue until the stack is unwound
|
|
and the task transitions to the @emph{dead} state. There is no way to
|
|
``recover'' from task failure. Once a task has temporarily suspended
|
|
its unwinding in the @emph{failing} state, failure occurring from
|
|
within this destructor results in @emph{hard} failure. The unwinding
|
|
procedure of hard failure frees resources but does not execute
|
|
destructors. The original (soft) failure is still resumed at the
|
|
point where it was temporarily suspended.
|
|
|
|
A task in the @emph{dead} state cannot transition to other states; it exists
|
|
only to have its termination status inspected by other tasks, and/or to await
|
|
reclamation when the last reference to it drops.
|
|
|
|
@node Ref.Task.Dom
|
|
@subsection Ref.Task.Dom
|
|
@c * Ref.Task.Dom:: Task domains
|
|
|
|
@cindex Domain
|
|
@cindex Process
|
|
@cindex Thread
|
|
|
|
Every task belongs to a domain. A @dfn{domain} is a structure that owns tasks,
|
|
schedules tasks, tracks memory allocation within tasks and manages access to
|
|
runtime services on behalf of tasks.
|
|
|
|
Typically each domain runs on a separate operating-system @emph{thread}, or
|
|
within an isolated operating-system @emph{process}. An easy way to think of a
|
|
domain is as an abstraction over either an operating-system thread @emph{or} a
|
|
process.
|
|
|
|
The key feature of a domain is that it isolates memory references created by
|
|
the Rust tasks within it. No Rust task can refer directly to memory outside
|
|
its domain.
|
|
|
|
Tasks can own sub-domains, which in turn own their own tasks. Every domain
|
|
owns one @emph{root task}, which is the root of the tree of tasks owned by the
|
|
domain.
|
|
|
|
@node Ref.Task.Sched
|
|
@subsection Ref.Task.Sched
|
|
@c * Ref.Task.Sched:: Task scheduling model.
|
|
|
|
@cindex Scheduling
|
|
@cindex Preemption
|
|
@cindex Yielding control
|
|
|
|
Every task is @emph{scheduled} within its domain. @xref{Ref.Task.Dom}. The
|
|
currently scheduled task is given a finite @emph{time slice} in which to
|
|
execute, after which it is @emph{descheduled} at a loop-edge or similar
|
|
preemption point, and another task within the domain is scheduled,
|
|
pseudo-randomly.
|
|
|
|
An executing task can @code{yield} control at any time, which deschedules it
|
|
immediately. Entering any other non-executing state (blocked, dead) similarly
|
|
deschedules the task.
|
|
|
|
@page
|
|
@node Ref.Item
|
|
@section Ref.Item
|
|
@c * Ref.Item:: The components of a module.
|
|
|
|
@cindex Item
|
|
@cindex Type parameters
|
|
@cindex Module item
|
|
|
|
An @dfn{item} is a component of a module. Items are entirely determined at
|
|
compile-time, remain constant during execution, and may reside in read-only
|
|
memory.
|
|
|
|
There are five primary kinds of item: modules, functions, iterators, objects and
|
|
type definitions.
|
|
|
|
All items form an implicit scope for the declaration of sub-items. In other
|
|
words, within a function, object or iterator, declarations of items can (in
|
|
many cases) be mixed with the statements, control blocks, and similar
|
|
artifacts that otherwise compose the item body. The meaning of these scoped
|
|
items is the same as if the item was declared outside the scope, except that
|
|
the item's @emph{path name} within the module namespace is qualified by the
|
|
name of the enclosing item. The exact locations in which sub-items may be
|
|
declared is given by the grammar. @xref{Ref.Gram}.
|
|
|
|
Functions, iterators, objects and type definitions may be @emph{parametrized}
|
|
by type. Type parameters are given as a comma-separated list of identifiers
|
|
enclosed in square brackets (@code{[]}), after the name of the item and before
|
|
its definition. The type parameters of an item are part of the name, not the
|
|
type of the item; in order to refer to the type-parametrized item, a
|
|
referencing name must in general provide type arguments as a list of
|
|
comma-separated types enclosed within square brackets (though the
|
|
type-inference system can often infer such argument types from context). There
|
|
are no general parametric types.
|
|
|
|
@menu
|
|
* Ref.Item.Mod:: Items defining modules.
|
|
* Ref.Item.Fn:: Items defining functions.
|
|
* Ref.Item.Pred:: Items defining predicates for typestates.
|
|
* Ref.Item.Iter:: Items defining iterators.
|
|
* Ref.Item.Obj:: Items defining objects.
|
|
* Ref.Item.Type:: Items defining the types of values and slots.
|
|
* Ref.Item.Tag:: Items defining the constructors of a tag type.
|
|
@end menu
|
|
|
|
@node Ref.Item.Mod
|
|
@subsection Ref.Item.Mod
|
|
@c * Ref.Item.Mod:: Items defining sub-modules.
|
|
|
|
@cindex Module item
|
|
@cindex Importing names
|
|
@cindex Exporting names
|
|
@cindex Visibility control
|
|
|
|
A @dfn{module item} contains declarations of other @emph{items}. The items
|
|
within a module may be functions, modules, objects or types. These
|
|
declarations have both static and dynamic interpretation. The purpose of a
|
|
module is to organize @emph{names} and control @emph{visibility}. Modules are
|
|
declared with the keyword @code{mod}.
|
|
|
|
An example of a module:
|
|
@example
|
|
mod math @{
|
|
type complex = (f64,f64);
|
|
fn sin(f64) -> f64 @{
|
|
@dots{}
|
|
@}
|
|
fn cos(f64) -> f64 @{
|
|
@dots{}
|
|
@}
|
|
fn tan(f64) -> f64 @{
|
|
@dots{}
|
|
@}
|
|
@dots{}
|
|
@}
|
|
@end example
|
|
|
|
Modules may also include any number of @dfn{import and export
|
|
declarations}. These declarations must precede any module item declarations
|
|
within the module, and control the visibility of names both within the module
|
|
and outside of it.
|
|
|
|
@menu
|
|
* Ref.Item.Mod.Import:: Declarations for module-local synonyms.
|
|
* Ref.Item.Mod.Export:: Declarations for restricting visibility.
|
|
@end menu
|
|
|
|
@node Ref.Item.Mod.Import
|
|
@subsubsection Ref.Item.Mod.Import
|
|
@c * Ref.Item.Mod.Import:: Declarations for module-local synonyms.
|
|
|
|
@cindex Importing names
|
|
@cindex Visibility control
|
|
|
|
An @dfn{import declaration} creates one or more local name bindings synonymous
|
|
with some other name. Usually an import declaration is used to shorten the
|
|
path required to refer to a module item.
|
|
|
|
@emph{Note}: unlike many languages, Rust's @code{import} declarations do
|
|
@emph{not} declare linkage-dependency with external crates. Linkage
|
|
dependencies are independently declared with @code{use}
|
|
declarations. @xref{Ref.Comp.Crate}.
|
|
|
|
An example of an import:
|
|
@example
|
|
import std.math.sin;
|
|
fn main() @{
|
|
// Equivalent to 'log std.math.sin(1.0);'
|
|
log sin(1.0);
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Item.Mod.Export
|
|
@subsubsection Ref.Item.Mod.Export
|
|
@c * Ref.Item.Mod.Import:: Declarations for restricting visibility.
|
|
|
|
@cindex Exporting names
|
|
@cindex Visibility control
|
|
|
|
An @dfn{export declaration} restricts the set of local declarations within a
|
|
module that can be accessed from code outside the module. By default, all
|
|
local declarations in a module are exported. If a module contains an export
|
|
declaration, this declaration replaces the default export with the export
|
|
specified.
|
|
|
|
An example of an export:
|
|
@example
|
|
mod foo @{
|
|
export primary;
|
|
|
|
fn primary() @{
|
|
helper(1, 2);
|
|
helper(3, 4);
|
|
@}
|
|
|
|
fn helper(int x, int y) @{
|
|
@dots{}
|
|
@}
|
|
@}
|
|
|
|
fn main() @{
|
|
foo.primary(); // Will compile.
|
|
foo.helper(2,3) // ERROR: will not compile.
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Item.Fn
|
|
@subsection Ref.Item.Fn
|
|
@c * Ref.Item.Fn:: Items defining functions.
|
|
@cindex Functions
|
|
@cindex Slots, function input and output
|
|
|
|
A @dfn{function item} defines a sequence of statements associated with a name
|
|
and a set of parameters. Functions are declared with the keyword
|
|
@code{fn}. Functions declare a set of @emph{input slots} as parameters,
|
|
through which the caller passes arguments into the function, and an
|
|
@emph{output slot} through which the function passes results back to the
|
|
caller.
|
|
|
|
A function may also be copied into a first class @emph{value}, in which case
|
|
the value has the corresponding @emph{function type}, and can be used
|
|
otherwise exactly as a function item (with a minor additional cost of calling
|
|
the function, as such a call is indirect). @xref{Ref.Type.Fn}.
|
|
|
|
Every control path in a function ends with a @code{ret} or @code{be}
|
|
expression or with a diverging expression (described later in this
|
|
section). If a control path lacks a @code{ret} expression in source code, an
|
|
implicit @code{ret} expression is appended to the end of the control path
|
|
during compilation, returning the implicit @code{()} value.
|
|
|
|
An example of a function:
|
|
@example
|
|
fn add(int x, int y) -> int @{
|
|
ret x + y;
|
|
@}
|
|
@end example
|
|
|
|
A special kind of function can be declared with a @code{!} character where the
|
|
output slot type would normally be. For example:
|
|
@example
|
|
fn my_err(str s) -> ! @{
|
|
log s;
|
|
fail;
|
|
@}
|
|
@end example
|
|
|
|
We call such functions ``diverging'' because they never return a value to the
|
|
caller. Every control path in a diverging function must end with a @code{fail}
|
|
or a call to another diverging function on every control path. The @code{!}
|
|
annotation does @emph{not} denote a type. Rather, the result type
|
|
of a diverging function is a special type called @math{\bot} (``bottom'') that
|
|
unifies with any type. Rust has no syntax for @math{\bot}.
|
|
|
|
It might be necessary to declare a diverging function because as mentioned
|
|
previously, the typechecker checks that every control path in a function ends
|
|
with a @code{ret}, @code{be}, or diverging expression. So, if @code{my_err}
|
|
were declared without the @code{!} annotation, the following code would not
|
|
typecheck:
|
|
@example
|
|
fn f(int i) -> int @{
|
|
if (i == 42) @{
|
|
ret 42;
|
|
@}
|
|
else @{
|
|
my_err("Bad number!");
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
The typechecker would complain that @code{f} doesn't return a value in the
|
|
@code{else} branch. Adding the @code{!} annotation on @code{my_err} would
|
|
express that @code{f} requires no explicit @code{ret}, as if it returns
|
|
control to the caller, it returns a value (true because it never returns
|
|
control).
|
|
|
|
@node Ref.Item.Pred
|
|
@subsection Ref.Item.Pred
|
|
@c * Ref.Item.Pred:: Items defining predicates.
|
|
@cindex Predicate
|
|
|
|
Any pure boolean function is called a @emph{predicate}, and may be used
|
|
as part of the static typestate system. @xref{Ref.Typestate.Constr}. A
|
|
predicate declaration is identical to a function declaration, except that it
|
|
is declared with the keyword @code{pred} instead of @code{fn}. In addition,
|
|
the typechecker checks the body of a predicate with a restricted set of
|
|
typechecking rules. A predicate
|
|
@itemize
|
|
@item may not contain a @code{put}, @code{send}, @code{recv}, assignment, or
|
|
self-call expression; and
|
|
@item may only call other predicates, not general functions.
|
|
@end itemize
|
|
|
|
An example of a predicate:
|
|
@example
|
|
pred lt_42(int x) -> bool @{
|
|
ret (x < 42);
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Item.Iter
|
|
@subsection Ref.Item.Iter
|
|
@c * Ref.Item.Iter:: Items defining iterators.
|
|
|
|
@cindex Iterators
|
|
@cindex Put expression
|
|
@cindex Put each expression
|
|
@cindex Foreach expression
|
|
|
|
Iterators are function-like items that can @code{put} multiple values during
|
|
their execution before returning or tail-calling.
|
|
|
|
Putting a value is similar to returning a value -- the argument to @code{put}
|
|
is copied into the caller's frame and control transfers back to the caller --
|
|
but the iterator frame is only @emph{suspended} during the put, and will be
|
|
@emph{resumed} at the point after the @code{put}, on the next iteration of
|
|
the caller's loop.
|
|
|
|
The output type of an iterator is the type of value that the function will
|
|
@code{put}, before it eventually evaluates a @code{ret} or @code{be} expression
|
|
of type @code{()} and completes its execution.
|
|
|
|
An iterator can only be called in the loop header of a matching @code{for
|
|
each} loop or as the argument in a @code{put each} expression.
|
|
@xref{Ref.Expr.Foreach}.
|
|
|
|
An example of an iterator:
|
|
@example
|
|
iter range(int lo, int hi) -> int @{
|
|
let int i = lo;
|
|
while (i < hi) @{
|
|
put i;
|
|
i = i + 1;
|
|
@}
|
|
@}
|
|
|
|
let int sum = 0;
|
|
for each (int x in range(0,100)) @{
|
|
sum += x;
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Item.Obj
|
|
@subsection Ref.Item.Obj
|
|
@c * Ref.Item.Obj:: Items defining objects.
|
|
@cindex Objects
|
|
@cindex Object constructors
|
|
|
|
An @dfn{object item} defines the @emph{state} and @emph{methods} of a set of
|
|
@emph{object values}. Object values have object types. @xref{Ref.Type.Obj}.
|
|
|
|
An @emph{object item} declaration -- in addition to providing a scope for
|
|
state and method declarations -- implicitly declares a static function called
|
|
the @emph{object constructor}, as well as a named @emph{object type}. The name
|
|
given to the object item is resolved to a type when used in type context, or a
|
|
constructor function when used in value context (such as a call).
|
|
|
|
Example of an object item:
|
|
@example
|
|
obj counter(int state) @{
|
|
fn incr() @{
|
|
state += 1;
|
|
@}
|
|
fn get() -> int @{
|
|
ret state;
|
|
@}
|
|
@}
|
|
|
|
let counter c = counter(1);
|
|
|
|
c.incr();
|
|
c.incr();
|
|
assert (c.get() == 3);
|
|
@end example
|
|
|
|
There is no @emph{this} or @emph{self} available inside an object's
|
|
methods, either implicitly or explicitly, so you can't directly call
|
|
other methods. For example:
|
|
@example
|
|
obj my_obj() @{
|
|
fn get() -> int @{
|
|
ret 3;
|
|
@}
|
|
fn foo() @{
|
|
auto c = get(); // Fails
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
The current replacement is to write a factory function for your type,
|
|
which provides any private helper functions:
|
|
@example
|
|
type my_obj =
|
|
obj @{
|
|
fn get() -> int;
|
|
fn foo();
|
|
@};
|
|
|
|
fn mk_my_obj() -> my_obj @{
|
|
fn get_helper() -> int @{
|
|
ret 3;
|
|
@}
|
|
|
|
obj impl() @{
|
|
fn get() -> int @{
|
|
ret get_helper();
|
|
@}
|
|
fn foo() @{
|
|
auto c = get_helper(); // Works
|
|
@}
|
|
@}
|
|
|
|
ret impl();
|
|
@}
|
|
@end example
|
|
|
|
This factory function also allows you to bind the object's state
|
|
variables to initial values.
|
|
|
|
@node Ref.Item.Type
|
|
@subsection Ref.Item.Type
|
|
@c * Ref.Item.Type:: Items defining the types of values and slots.
|
|
@cindex Type definitions
|
|
|
|
A @dfn{type definition} defines a set of possible values in
|
|
memory. @xref{Ref.Type}. Type definitions are declared with the keyword
|
|
@code{type}. Every value has a single, specific type; the type-specified
|
|
aspects of a value include:
|
|
|
|
@itemize
|
|
@item Whether the value is composed of sub-values or is indivisible.
|
|
@item Whether the value represents textual or numerical information.
|
|
@item Whether the value represents integral or floating-point information.
|
|
@item The sequence of memory operations required to access the value.
|
|
@item The storage layer the value resides in (immutable, state or gc).
|
|
@end itemize
|
|
|
|
For example, the type @code{rec(u8 x, u8 y)} defines the set of immutable
|
|
values that are composite records, each containing two unsigned 8-bit integers
|
|
accessed through the components @code{x} and @code{y}, and laid out in memory
|
|
with the @code{x} component preceding the @code{y} component.
|
|
|
|
@node Ref.Item.Tag
|
|
@subsection Ref.Item.Tag
|
|
@c * Ref.Item.Type:: Items defining the constructors of a tag type.
|
|
@cindex Tag types
|
|
|
|
A tag item simultaneously declares a new nominal tag type
|
|
(@pxref{Ref.Type.Tag}) as well as a set of @emph{constructors} that can be
|
|
used to create or pattern-match values of the corresponding tag type.
|
|
|
|
The constructors of a @code{tag} type may be recursive: that is, each constructor
|
|
may take an argument that refers, directly or indirectly, to the tag type the constructor
|
|
is a member of. Such recursion has restrictions:
|
|
@itemize
|
|
@item Recursive types can only be introduced through @code{tag} constructors.
|
|
@item A recursive @code{tag} item must have at least one non-recursive
|
|
constructor (in order to give the recursion a basis case).
|
|
@item The recursively argument of recursive tag constructors must be @emph{box}
|
|
values (in order to bound the in-memory size of the constructor).
|
|
@item Recursive type definitions can cross module boundaries, but not module
|
|
@emph{visibility} boundaries, nor crate boundaries (in order to simplify the
|
|
module system).
|
|
@end itemize
|
|
|
|
An example of a @code{tag} item and its use:
|
|
@example
|
|
tag animal @{
|
|
dog;
|
|
cat;
|
|
@}
|
|
|
|
let animal a = dog;
|
|
a = cat;
|
|
@end example
|
|
|
|
An example of a @emph{recursive} @code{tag} item and its use:
|
|
@example
|
|
tag list[T] @{
|
|
nil;
|
|
cons(T, @@list[T]);
|
|
@}
|
|
|
|
let list[int] a = cons(7, cons(13, nil));
|
|
@end example
|
|
|
|
|
|
@page
|
|
@node Ref.Type
|
|
@section Ref.Type
|
|
@cindex Types
|
|
|
|
Every slot and value in a Rust program has a type. The @dfn{type} of a
|
|
@emph{value} defines the interpretation of the memory holding it. The type of
|
|
a @emph{slot} may also include constraints. @xref{Ref.Type.Constr}.
|
|
|
|
Built-in types and type-constructors are tightly integrated into the language,
|
|
in nontrivial ways that are not possible to emulate in user-defined
|
|
types. User-defined types have limited capabilities. In addition, every
|
|
built-in type or type-constructor name is reserved as a @emph{keyword} in
|
|
Rust; they cannot be used as user-defined identifiers in any context.
|
|
|
|
@menu
|
|
* Ref.Type.Any:: An open union of every possible type.
|
|
* Ref.Type.Mach:: Machine-level types.
|
|
* Ref.Type.Int:: The machine-dependent integer types.
|
|
* Ref.Type.Float:: The machine-dependent floating-point types.
|
|
* Ref.Type.Prim:: Primitive types.
|
|
* Ref.Type.Big:: The arbitrary-precision integer type.
|
|
* Ref.Type.Text:: Strings and characters.
|
|
* Ref.Type.Rec:: Labeled products of heterogeneous types.
|
|
* Ref.Type.Tup:: Unlabeled products of heterogeneous types.
|
|
* Ref.Type.Vec:: Open products of homogeneous types.
|
|
* Ref.Type.Tag:: Disjoint unions of heterogeneous types.
|
|
* Ref.Type.Fn:: Subroutine types.
|
|
* Ref.Type.Iter:: Scoped coroutine types.
|
|
* Ref.Type.Port:: Unique inter-task message-receipt endpoints.
|
|
* Ref.Type.Chan:: Copyable inter-task message-send capabilities.
|
|
* Ref.Type.Task:: General coroutine-instance types.
|
|
* Ref.Type.Obj:: Abstract types.
|
|
* Ref.Type.Constr:: Constrained types.
|
|
* Ref.Type.Type:: Types describing types.
|
|
@end menu
|
|
|
|
@node Ref.Type.Any
|
|
@subsection Ref.Type.Any
|
|
@cindex Any type
|
|
@cindex Dynamic type, see @i{Any type}
|
|
@cindex Reflection
|
|
@cindex Alt type expression
|
|
|
|
The type @code{any} is the union of all possible Rust types. A value of type
|
|
@code{any} is represented in memory as a pair consisting of a boxed value of
|
|
some non-@code{any} type @var{T} and a reflection of the type @var{T}.
|
|
|
|
Values of type @code{any} can be used in an @code{alt type} expression, in
|
|
which the reflection is used to select a block corresponding to a particular
|
|
type extraction. @xref{Ref.Expr.Alt}.
|
|
|
|
@node Ref.Type.Mach
|
|
@subsection Ref.Type.Mach
|
|
@cindex Machine types
|
|
@cindex Floating-point types
|
|
@cindex Integer types
|
|
@cindex Word types
|
|
|
|
The machine types are the following:
|
|
|
|
@itemize
|
|
@item
|
|
The unsigned word types @code{u8}, @code{u16}, @code{u32} and @code{u64},
|
|
with values drawn from the integer intervals
|
|
@iftex
|
|
@math{[0, 2^8 - 1]},
|
|
@math{[0, 2^{16} - 1]},
|
|
@math{[0, 2^{32} - 1]} and
|
|
@math{[0, 2^{64} - 1]}
|
|
@end iftex
|
|
@ifhtml
|
|
@html
|
|
[0, 2<sup>8</sup>-1],
|
|
[0, 2<sup>16</sup>-1],
|
|
[0, 2<sup>32</sup>-1] and
|
|
[0, 2<sup>64</sup>-1]
|
|
@end html
|
|
@end ifhtml
|
|
respectively.
|
|
@item
|
|
The signed two's complement word types @code{i8}, @code{i16}, @code{i32} and
|
|
@code{i64}, with values drawn from the integer intervals
|
|
@iftex
|
|
@math{[-(2^7),(2^7)-1)]},
|
|
@math{[-(2^{15}),2^{15}-1)]},
|
|
@math{[-(2^{31}),2^{31}-1)]} and
|
|
@math{[-(2^{63}),2^{63}-1)]}
|
|
@end iftex
|
|
@ifhtml
|
|
@html
|
|
[-(2<sup>7</sup>), 2<sup>7</sup>-1],
|
|
[-(2<sup>15</sup>), 2<sup>15</sup>-1],
|
|
[-(2<sup>31</sup>), 2<sup>31</sup>-1] and
|
|
[-(2<sup>63</sup>), 2<sup>63</sup>-1]
|
|
@end html
|
|
@end ifhtml
|
|
respectively.
|
|
@item
|
|
The IEEE 754-2008 @code{binary32} and @code{binary64} floating-point types:
|
|
@code{f32} and @code{f64}, respectively.
|
|
@end itemize
|
|
|
|
@node Ref.Type.Int
|
|
@subsection Ref.Type.Int
|
|
@cindex Machine-dependent types
|
|
@cindex Integer types
|
|
@cindex Word types
|
|
|
|
|
|
The Rust type @code{uint}@footnote{A Rust @code{uint} is analogous to a C99
|
|
@code{uintptr_t}.} is an unsigned integer type with with
|
|
target-machine-dependent size. Its size, in bits, is equal to the number of
|
|
bits required to hold any memory address on the target machine.
|
|
|
|
The Rust type @code{int}@footnote{A Rust @code{int} is analogous to a C99
|
|
@code{intptr_t}.} is a two's complement signed integer type with
|
|
target-machine-dependent size. Its size, in bits, is equal to the size of the
|
|
rust type @code{uint} on the same target machine.
|
|
|
|
@node Ref.Type.Float
|
|
@subsection Ref.Type.Float
|
|
@cindex Machine-dependent types
|
|
@cindex Floating-point types
|
|
|
|
The Rust type @code{float} is a machine-specific type equal to one of the
|
|
supported Rust floating-point machine types (@code{f32} or @code{f64}). It is
|
|
the largest floating-point type that is directly supported by hardware on the
|
|
target machine, or if the target machine has no floating-point hardware
|
|
support, the largest floating-point type supported by the software
|
|
floating-point library used to support the other floating-point machine types.
|
|
|
|
Note that due to the preference for hardware-supported floating-point, the
|
|
type @code{float} may not be equal to the largest @emph{supported}
|
|
floating-point type.
|
|
|
|
|
|
@node Ref.Type.Prim
|
|
@subsection Ref.Type.Prim
|
|
@cindex Primitive types
|
|
@cindex Integer types
|
|
@cindex Floating-point types
|
|
@cindex Character type
|
|
@cindex Boolean type
|
|
|
|
The primitive types are the following:
|
|
|
|
@itemize
|
|
@item
|
|
The ``nil'' type @code{()}, having the single ``nil'' value
|
|
@code{()}.@footnote{The ``nil'' value @code{()} is @emph{not} a sentinel
|
|
``null pointer'' value for alias slots; the ``nil'' type is the implicit
|
|
return type from functions otherwise lacking a return type, and can be used in
|
|
other contexts (such as message-sending or type-parametric code) as a
|
|
zero-size type.}
|
|
@item
|
|
The boolean type @code{bool} with values @code{true} and @code{false}.
|
|
@item
|
|
The machine types.
|
|
@item
|
|
The machine-dependent integer and floating-point types.
|
|
@end itemize
|
|
|
|
|
|
@node Ref.Type.Big
|
|
@subsection Ref.Type.Big
|
|
@cindex Integer types
|
|
@cindex Big integer type
|
|
|
|
The Rust type @code{big}@footnote{A Rust @code{big} is analogous to a Lisp
|
|
bignum or a Python long integer.} is an arbitrary precision integer type that
|
|
fits in a machine word @emph{when possible} and transparently expands to a
|
|
boxed ``big integer'' allocated in the run-time heap when it overflows or
|
|
underflows outside of the range of a machine word.
|
|
|
|
A Rust @code{big} grows to accommodate extra binary digits as they are needed,
|
|
by taking extra memory from the memory budget available to each Rust task, and
|
|
should only exhaust its range due to memory exhaustion.
|
|
|
|
@node Ref.Type.Text
|
|
@subsection Ref.Type.Text
|
|
@cindex Text types
|
|
@cindex String type
|
|
@cindex Character type
|
|
@cindex Unicode
|
|
@cindex UCS-4
|
|
@cindex UTF-8
|
|
|
|
The types @code{char} and @code{str} hold textual data.
|
|
|
|
A value of type @code{char} is a Unicode character, represented as a 32-bit
|
|
unsigned word holding a UCS-4 codepoint.
|
|
|
|
A value of type @code{str} is a Unicode string, represented as a vector of
|
|
8-bit unsigned bytes holding a sequence of UTF-8 codepoints.
|
|
|
|
@node Ref.Type.Rec
|
|
@subsection Ref.Type.Rec
|
|
@cindex Record types
|
|
@cindex Structure types, see @i{Record types}
|
|
|
|
The record type-constructor @code{rec} forms a new heterogeneous product of
|
|
values.@footnote{The @code{rec} type-constructor is analogous to the
|
|
@code{struct} type-constructor in the Algol/C family, the @emph{record} types
|
|
of the ML family, or the @emph{structure} types of the Lisp family.} Fields of
|
|
a @code{rec} type are accessed by name and are arranged in memory in the order
|
|
specified by the @code{rec} type.
|
|
|
|
An example of a @code{rec} type and its use:
|
|
@example
|
|
type point = rec(int x, int y);
|
|
let point p = rec(x=10, y=11);
|
|
let int px = p.x;
|
|
@end example
|
|
|
|
@node Ref.Type.Tup
|
|
@subsection Ref.Type.Tup
|
|
@cindex Tuple types
|
|
|
|
The tuple type-constructor @code{tup} forms a new heterogeneous product of
|
|
values exactly as the @code{rec} type-constructor does, with the difference
|
|
that tuple members are automatically assigned implicit field names, given by
|
|
ascending integers prefixed by the underscore character: @code{_0}, @code{_1},
|
|
@code{_2}, etc. The members of a tuple are laid out in memory contiguously,
|
|
like a record, in order specified by the tuple type.
|
|
|
|
An example of a tuple type and its use:
|
|
@example
|
|
type pair = tup(int,str);
|
|
let pair p = tup(10,"hello");
|
|
assert (p._0 == 10);
|
|
p._1 = "world";
|
|
assert (p._1 == "world");
|
|
@end example
|
|
|
|
|
|
@node Ref.Type.Vec
|
|
@subsection Ref.Type.Vec
|
|
@cindex Vector types
|
|
@cindex Array types, see @i{Vector types}
|
|
|
|
The vector type-constructor @code{vec} represents a homogeneous array of
|
|
values of a given type. A vector has a fixed size. The layer of a vector type
|
|
is to the layer of its member type, like any type that contains a single
|
|
member type.
|
|
|
|
Vectors can be sliced. A slice expression builds a new vector by copying a
|
|
contiguous range -- given by a pair of indices representing a half-open
|
|
interval -- out of the sliced vector.
|
|
|
|
An example of a @code{vec} type and its use:
|
|
@example
|
|
let vec[int] v = vec(7, 5, 3);
|
|
let int i = v.(2);
|
|
let vec[int] v2 = v.(0,1); // Form a slice.
|
|
@end example
|
|
|
|
Vectors always @emph{allocate} a storage region sufficient to store the first
|
|
power of two worth of elements greater than or equal to the size of the
|
|
vector. This behaviour supports idiomatic in-place ``growth'' of a mutable
|
|
slot holding a vector:
|
|
|
|
@example
|
|
let mutable vec[int] v = vec(1, 2, 3);
|
|
v += vec(4, 5, 6);
|
|
@end example
|
|
|
|
Normal vector concatenation causes the allocation of a fresh vector to hold
|
|
the result; in this case, however, the slot holding the vector recycles the
|
|
underlying storage in-place (since the reference-count of the underlying
|
|
storage is equal to 1).
|
|
|
|
All accessible elements of a vector are always initialized, and access to a
|
|
vector is always bounds-checked.
|
|
|
|
|
|
@node Ref.Type.Tag
|
|
@subsection Ref.Type.Tag
|
|
@cindex Tag types
|
|
@cindex Union types, see @i{Tag types}
|
|
|
|
A @emph{tag type} is a nominal, heterogeneous disjoint union
|
|
type.@footnote{The @code{tag} type is analogous to a @code{data} constructor
|
|
declaration in ML or a @emph{pick ADT} in Limbo.} A @code{tag} @emph{item}
|
|
consists of a number of @emph{constructors}, each of which is independently
|
|
named and takes an optional tuple of arguments.
|
|
|
|
Tag types cannot be denoted @emph{structurally} as types, but must be denoted
|
|
by named reference to a @emph{tag item} declaration. @xref{Ref.Item.Tag}.
|
|
|
|
@node Ref.Type.Fn
|
|
@subsection Ref.Type.Fn
|
|
@cindex Function types
|
|
|
|
The function type-constructor @code{fn} forms new function types. A function
|
|
type consists of a sequence of input slots, an optional set of input
|
|
constraints (@pxref{Ref.Typestate.Constr}) and an output
|
|
slot. @xref{Ref.Item.Fn}.
|
|
|
|
An example of a @code{fn} type:
|
|
@example
|
|
fn add(int x, int y) -> int @{
|
|
ret x + y;
|
|
@}
|
|
|
|
let int x = add(5,7);
|
|
|
|
type binop = fn(int,int) -> int;
|
|
let binop bo = add;
|
|
x = bo(5,7);
|
|
@end example
|
|
|
|
@node Ref.Type.Iter
|
|
@subsection Ref.Type.Iter
|
|
@cindex Iterator types
|
|
|
|
The iterator type-constructor @code{iter} forms new iterator types. An
|
|
iterator type consists a sequence of input slots, an optional set of input
|
|
constraints and an output slot. @xref{Ref.Item.Iter}.
|
|
|
|
An example of an @code{iter} type:
|
|
@example
|
|
iter range(int x, int y) -> int @{
|
|
while (x < y) @{
|
|
put x;
|
|
x += 1;
|
|
@}
|
|
@}
|
|
|
|
for each (int i in range(5,7)) @{
|
|
@dots{};
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Type.Port
|
|
@subsection Ref.Type.Port
|
|
@cindex Port types
|
|
@cindex Communication
|
|
|
|
The port type-constructor @code{port} forms types that describe ports. A port
|
|
is the @emph{receiving end} of a typed, asynchronous, simplex inter-task
|
|
communication facility. @xref{Ref.Task.Comm}. A @code{port} type takes a
|
|
single type parameter, denoting the type of value that can be received from a
|
|
@code{port} value of that type.
|
|
|
|
Ports are modeled as stateful native types, with built-in meaning to the
|
|
language. They cannot be transmitted over channels or otherwise replicated,
|
|
and are always local to the task that creates them.
|
|
|
|
Ports (like channels) can only be carry types of the immutable layer. No
|
|
mutable values can pass over a port or channel.
|
|
|
|
An example of a @code{port} type:
|
|
@example
|
|
type port[vec[str]] svp;
|
|
let svp p = get_port();
|
|
let vec[str] v;
|
|
v <- p;
|
|
@end example
|
|
|
|
@node Ref.Type.Chan
|
|
@subsection Ref.Type.Chan
|
|
@cindex Channel types
|
|
@cindex Communication
|
|
|
|
The channel type-constructor @code{chan} forms types that describe channels. A
|
|
channel is the @emph{sending end} of a typed, asynchronous, simplex inter-task
|
|
communication facility. @xref{Ref.Task.Comm}. A @code{chan} type takes a
|
|
single type parameter, denoting the type of value that can be sent to a
|
|
channel of that type.
|
|
|
|
Channels are immutable, and can be transmitted over channels to other
|
|
tasks. They are modeled as immutable native types with built-in meaning to the
|
|
language.
|
|
|
|
Channels (like ports) can only be carry types of the immutable layer. No
|
|
mutable values can pass over a port or channel.
|
|
|
|
When a task sends a message into a channel, the task forms an outgoing queue
|
|
associated with that channel. The per-task queue @emph{associated} with a
|
|
channel can be indirectly manipulated by the task, but is @emph{not} otherwise
|
|
considered ``part of'' to the channel: the queue is ``part of'' the
|
|
@emph{sending task}. Sending a channel to another task does not copy the queue
|
|
associated with the channel.
|
|
|
|
Channels are also @emph{weak}: a channel is directly coupled to a particular
|
|
destination port on a particular task, but does not keep that port or task
|
|
@emph{alive}. A channel may therefore fail to operate at any moment. If a task
|
|
sends a message to a channel that is connected to a nonexistent port, the
|
|
message is dropped.
|
|
|
|
An example of a @code{chan} type:
|
|
@example
|
|
type chan[vec[str]] svc;
|
|
let svc c = get_chan();
|
|
let vec[str] v = vec("hello", "world");
|
|
c <| v;
|
|
@end example
|
|
|
|
@node Ref.Type.Task
|
|
@subsection Ref.Type.Task
|
|
@cindex Task type
|
|
|
|
The task type @code{task} describes values that are @emph{live
|
|
tasks}.
|
|
|
|
Tasks form an @emph{ownership tree} in which each task (except the root task)
|
|
is directly owned by exactly one parent task. The purpose of a variable of
|
|
@code{task} type is to manage the lifecycle of the associated
|
|
task. Communication is carried out solely using channels and ports.
|
|
|
|
Like ports, tasks are modeled as mutable native types with built-in meaning to
|
|
the language. They cannot be transmitted over channels or otherwise
|
|
replicated, and are always local to the task that spawns them.
|
|
|
|
If all references to a task are dropped (due to the release of any structure
|
|
holding those references), the runtime signals the un-referenced task, which
|
|
then fails. @xref{Ref.Task.Life}.
|
|
|
|
|
|
@node Ref.Type.Obj
|
|
@subsection Ref.Type.Obj
|
|
@c * Ref.Type.Obj:: Object types.
|
|
@cindex Object types
|
|
|
|
A @dfn{object type} describes values of abstract type, that carry some hidden
|
|
@emph{fields} and are accessed through a set of un-ordered
|
|
@emph{methods}. Every object item (@pxref{Ref.Item.Obj}) implicitly declares
|
|
an object type carrying methods with types derived from all the methods of the
|
|
object item.
|
|
|
|
Object types can also be declared in isolation, independent of any object item
|
|
declaration. Such a ``plain'' object type can be used to describe an interface
|
|
that a variety of particular objects may conform to, by supporting a superset
|
|
of the methods.
|
|
|
|
An object type that can contain fields of a given layer must be declared as
|
|
residing in that layer (or lower), like any other type.
|
|
|
|
An example of an object type with two separate object items supporting it, and
|
|
a client function using both items via the object type:
|
|
|
|
@example
|
|
|
|
state type taker =
|
|
state obj @{
|
|
fn take(int);
|
|
@};
|
|
|
|
state obj adder(mutable int x) @{
|
|
fn take(int y) @{
|
|
x += y;
|
|
@}
|
|
@}
|
|
|
|
obj sender(chan[int] c) @{
|
|
fn take(int z) @{
|
|
c <| z;
|
|
@}
|
|
@}
|
|
|
|
fn give_ints(taker t) @{
|
|
t.take(1);
|
|
t.take(2);
|
|
t.take(3);
|
|
@}
|
|
|
|
let port[int] p = port();
|
|
|
|
let taker t1 = adder(0);
|
|
let taker t2 = sender(chan(p));
|
|
|
|
give_ints(t1);
|
|
give_ints(t2);
|
|
|
|
@end example
|
|
|
|
|
|
|
|
@node Ref.Type.Constr
|
|
@subsection Ref.Type.Constr
|
|
@c * Ref.Type.Constr:: Constrained types.
|
|
@cindex Constrained types
|
|
|
|
A @dfn{constrained type} is a type that carries a @emph{formal constraint}
|
|
(@pxref{Ref.Typestate.Constr}), which is similar to a normal constraint except
|
|
that the @emph{base name} of any slots mentioned in the constraint must be the
|
|
special @emph{formal symbol} @emph{*}.
|
|
|
|
When a constrained type is instantiated in a particular slot declaration, the
|
|
formal symbol in the constraint is replaced with the name of the declared slot
|
|
and the resulting constraint is checked immediately after the slot is
|
|
declared. @xref{Ref.Expr.Check}.
|
|
|
|
An example of a constrained type with two separate instantiations:
|
|
@example
|
|
type ordered_range = rec(int low, int high) : less_than(*.low, *.high);
|
|
|
|
let ordered_range rng1 = rec(low=5, high=7);
|
|
// implicit: 'check less_than(rng1.low, rng1.high);'
|
|
|
|
let ordered_range rng2 = rec(low=15, high=17);
|
|
// implicit: 'check less_than(rng2.low, rng2.high);'
|
|
@end example
|
|
|
|
@node Ref.Type.Type
|
|
@subsection Ref.Type.Type
|
|
@c * Ref.Type.Type:: Types describing types.
|
|
@cindex Type type
|
|
|
|
@emph{TODO}.
|
|
|
|
|
|
|
|
* Ref.Typestate:: The static system of predicate analysis.
|
|
@node Ref.Typestate
|
|
@section Ref.Typestate
|
|
@c * Ref.Typestate:: The static system of predicate analysis.
|
|
@cindex Typestate system
|
|
|
|
Rust programs have a static semantics that determine the types of values
|
|
produced by each expression, as well as the @emph{predicates} that hold over
|
|
slots in the environment at each point in time during execution.
|
|
|
|
The latter semantics -- the dataflow analysis of predicates holding over slots
|
|
-- is called the @emph{typestate} system.
|
|
|
|
@menu
|
|
* Ref.Typestate.Point:: Discrete positions in execution.
|
|
* Ref.Typestate.CFG:: The control-flow graph formed by points.
|
|
* Ref.Typestate.Constr:: Predicates applied to slots.
|
|
* Ref.Typestate.Cond:: Constraints required and implied by a point.
|
|
* Ref.Typestate.State:: Constraints that hold at points.
|
|
* Ref.Typestate.Check:: Relating dynamic state to static typestate.
|
|
@end menu
|
|
|
|
@node Ref.Typestate.Point
|
|
@subsection Ref.Typestate.Point
|
|
@c * Ref.Typestate.Point:: Discrete positions in execution.
|
|
@cindex Points
|
|
|
|
Control flows from statement to statement in a block, and through the
|
|
evaluation of each expression, from one sub-expression to another. This
|
|
sequential control flow is specified as a set of @dfn{points}, each of which
|
|
has a set of points before and after it in the implied control flow.
|
|
|
|
For example, this code:
|
|
|
|
@example
|
|
s = "hello, world";
|
|
print(s);
|
|
@end example
|
|
|
|
Consists of 2 statements, 3 expressions and 12 points:
|
|
|
|
@itemize
|
|
@item the point before the first statement
|
|
@item the point before evaluating the static initializer @code{"hello, world"}
|
|
@item the point after evaluating the static initializer @code{"hello, world"}
|
|
@item the point after the first statement
|
|
@item the point before the second statement
|
|
@item the point before evaluating the function value @code{print}
|
|
@item the point after evaluating the function value @code{print}
|
|
@item the point before evaluating the arguments to @code{print}
|
|
@item the point before evaluating the symbol @code{s}
|
|
@item the point after evaluating the symbol @code{s}
|
|
@item the point after evaluating the arguments to @code{print}
|
|
@item the point after the second statement
|
|
@end itemize
|
|
|
|
Whereas this code:
|
|
|
|
@example
|
|
print(x() + y());
|
|
@end example
|
|
|
|
Consists of 1 statement, 7 expressions and 14 points:
|
|
|
|
@itemize
|
|
@item the point before the statement
|
|
@item the point before evaluating the function value @code{print}
|
|
@item the point after evaluating the function value @code{print}
|
|
@item the point before evaluating the arguments to @code{print}
|
|
@item the point before evaluating the arguments to @code{+}
|
|
@item the point before evaluating the function value @code{x}
|
|
@item the point after evaluating the function value @code{x}
|
|
@item the point before evaluating the arguments to @code{x}
|
|
@item the point after evaluating the arguments to @code{x}
|
|
@item the point before evaluating the function value @code{y}
|
|
@item the point after evaluating the function value @code{y}
|
|
@item the point before evaluating the arguments to @code{y}
|
|
@item the point after evaluating the arguments to @code{y}
|
|
@item the point after evaluating the arguments to @code{+}
|
|
@item the point after evaluating the arguments to @code{print}
|
|
@end itemize
|
|
|
|
|
|
The typestate system reasons over points, rather than statements or
|
|
expressions. This may seem counter-intuitive, but points are the more
|
|
primitive concept. Another way of thinking about a point is as a set of
|
|
@emph{instants in time} at which the state of a task is fixed. By contrast, a
|
|
statement or expression represents a @emph{duration in time}, during which the
|
|
state of the task changes. The typestate system is concerned with constraining
|
|
the possible states of a task's memory at @emph{instants}; it is meaningless
|
|
to speak of the state of a task's memory ``at'' a statement or expression, as
|
|
each statement or expression is likely to change the contents of memory.
|
|
|
|
@node Ref.Typestate.CFG
|
|
@subsection Ref.Typestate.CFG
|
|
@c * Ref.Typestate.CFG:: The control-flow graph formed by points.
|
|
@cindex Control-flow graph
|
|
|
|
Each @emph{point} can be considered a vertex in a directed @emph{graph}. Each
|
|
kind of expression or statement implies a number of points @emph{and edges} in
|
|
this graph. The edges connect the points within each statement or expression,
|
|
as well as between those points and those of nearby statements and expressions
|
|
in the program. The edges between points represent @emph{possible} indivisible
|
|
control transfers that might occur during execution.
|
|
|
|
This implicit graph is called the @dfn{control-flow graph}, or @dfn{CFG}.
|
|
|
|
@node Ref.Typestate.Constr
|
|
@subsection Ref.Typestate.Constr
|
|
@c * Ref.Typestate.Constr:: Predicates applied to slots.
|
|
@cindex Predicate
|
|
@cindex Constraint
|
|
|
|
A @dfn{predicate} is a pure boolean function declared with the keyword
|
|
@code{pred}. @xref{Ref.Item.Pred}.
|
|
|
|
A @dfn{constraint} is a predicate applied to specific slots.
|
|
|
|
For example, consider the following code:
|
|
|
|
@example
|
|
pred is_less_than(int a, int b) -> bool @{
|
|
ret a < b;
|
|
@}
|
|
|
|
fn test() @{
|
|
let int x = 10;
|
|
let int y = 20;
|
|
check is_less_than(x,y);
|
|
@}
|
|
@end example
|
|
|
|
This example defines the predicate @code{is_less_than}, and applies it to the
|
|
slots @code{x} and @code{y}. The constraint being checked on the third line of
|
|
the function is @code{is_less_than(x,y)}.
|
|
|
|
Predicates can only apply to slots holding immutable values. The slots a
|
|
predicate applies to can themselves be mutable, but the types of values held
|
|
in those slots must be immutable.
|
|
|
|
@node Ref.Typestate.Cond
|
|
@subsection Ref.Typestate.Cond
|
|
@c * Ref.Typestate.Cond:: Constraints required and implied by a point.
|
|
@cindex Condition
|
|
@cindex Precondition
|
|
@cindex Postcondition
|
|
|
|
A @dfn{condition} is a set of zero or more constraints.
|
|
|
|
Each @emph{point} has an associated @emph{condition}:
|
|
|
|
@itemize
|
|
@item The @dfn{precondition} of a statement or expression is the condition
|
|
required at in the point before it.
|
|
@item The @dfn{postcondition} of a statement or expression is the condition
|
|
enforced in the point after it.
|
|
@end itemize
|
|
|
|
Any constraint present in the precondition and @emph{absent} in the
|
|
postcondition is considered to be @emph{dropped} by the statement or
|
|
expression.
|
|
|
|
@node Ref.Typestate.State
|
|
@subsection Ref.Typestate.State
|
|
@c * Ref.Typestate.State:: Constraints that hold at points.
|
|
@cindex Typestate
|
|
@cindex Prestate
|
|
@cindex Poststate
|
|
|
|
The typestate checking system @emph{calculates} an additional condition for
|
|
each point called its typestate. For a given statement or expression, we call
|
|
the two typestates associated with its two points the prestate and a
|
|
poststate.
|
|
|
|
@itemize
|
|
@item The @dfn{prestate} of a statement or expression is the typestate of the
|
|
point before it.
|
|
@item The @dfn{poststate} of a statement or expression is the typestate of the
|
|
point after it.
|
|
@end itemize
|
|
|
|
A @dfn{typestate} is a condition that has @emph{been determined by the
|
|
typestate algorithm} to hold at a point. This is a subtle but important point
|
|
to understand: preconditions and postconditions are @emph{inputs} to the
|
|
typestate algorithm; prestates and poststates are @emph{outputs} from the
|
|
typestate algorithm.
|
|
|
|
The typestate algorithm analyses the preconditions and postconditions of every
|
|
statement and expression in a block, and computes a condition for each
|
|
typestate. Specifically:
|
|
|
|
@itemize
|
|
@item Initially, every typestate is empty.
|
|
@item Each statement or expression's poststate is given the union of the its
|
|
prestate, precondition, and postcondition.
|
|
@item Each statement or expression's poststate has the difference between its
|
|
precondition and postcondition removed.
|
|
@item Each statement or expression's prestate is given the intersection of the
|
|
poststates of every predecessor point in the CFG.
|
|
@item The previous three steps are repeated until no typestates in the
|
|
block change.
|
|
@end itemize
|
|
|
|
The typestate algorithm is a very conventional dataflow calculation, and can
|
|
be performed using bit-set operations, with one bit per predicate and one
|
|
bit-set per condition.
|
|
|
|
After the typestates of a block are computed, the typestate algorithm checks
|
|
that every constraint in the precondition of a statement is satisfied by its
|
|
prestate. If any preconditions are not satisfied, the mismatch is considered a
|
|
static (compile-time) error.
|
|
|
|
|
|
@node Ref.Typestate.Check
|
|
@subsection Ref.Typestate.Check
|
|
@c * Ref.Typestate.Check:: Relating dynamic state to static typestate.
|
|
@cindex Check statement
|
|
@cindex Assertions, see @i{Check statement}
|
|
|
|
The key mechanism that connects run-time semantics and compile-time analysis
|
|
of typestates is the use of @code{check} expressions. @xref{Ref.Expr.Check}. A
|
|
@code{check} expression guarantees that @emph{if} control were to proceed past
|
|
it, the predicate associated with the @code{check} would have succeeded, so
|
|
the constraint being checked @emph{statically} holds in subsequent
|
|
points.@footnote{A @code{check} expression is similar to an @code{assert}
|
|
call in a C program, with the significant difference that the Rust compiler
|
|
@emph{tracks} the constraint that each @code{check} expression
|
|
enforces. Naturally, @code{check} expressions cannot be omitted from a
|
|
``production build'' of a Rust program the same way @code{asserts} are
|
|
frequently disabled in deployed C programs.}
|
|
|
|
It is important to understand that the typestate system has @emph{no insight}
|
|
into the meaning of a particular predicate. Predicates and constraints are not
|
|
evaluated in any way at compile time. Predicates are treated as specific (but
|
|
unknown) functions applied to specific (also unknown) slots. All the typestate
|
|
system does is track which of those predicates -- whatever they calculate --
|
|
@emph{must have been checked already} in order for program control to reach a
|
|
particular point in the CFG. The fundamental building block, therefore, is the
|
|
@code{check} statement, which tells the typestate system ``if control passes
|
|
this point, the checked predicate holds''.
|
|
|
|
From this building block, constraints can be propagated to function signatures
|
|
and constrained types, and the responsibility to @code{check} a constraint
|
|
pushed further and further away from the site at which the program requires it
|
|
to hold in order to execute properly.
|
|
|
|
|
|
@page
|
|
@node Ref.Stmt
|
|
@section Ref.Stmt
|
|
@c * Ref.Stmt:: Components of an executable block.
|
|
@cindex Statements
|
|
|
|
A @dfn{statement} is a component of a block, which is in turn a component of
|
|
an outer block-expression, a function or an iterator. When a function is
|
|
spawned into a task, the task @emph{executes} statements in an order
|
|
determined by the body of the enclosing structure. Each statement causes the
|
|
task to perform certain actions.
|
|
|
|
Rust has two kinds of statement: declarations and expressions.
|
|
|
|
A declaration serves to introduce a @emph{name} that can be used in the block
|
|
@emph{scope} enclosing the statement: all statements before and after the
|
|
name, from the previous opening curly-brace (@code{@{}) up to the next closing
|
|
curly-brace (@code{@}}).
|
|
|
|
An expression serves the dual roles of causing side effects and producing a
|
|
@emph{value}. Expressions are said to @emph{evaluate to} a value, and the side
|
|
effects are caused during @emph{evaluation}. Many expressions contain
|
|
sub-expressions as operands; the definition of each kind of expression
|
|
dictates whether or not, and in which order, it will evaluate its
|
|
sub-expressions, and how the expression's value derives from the value of its
|
|
sub-expressions.
|
|
|
|
In this way, the structure of execution -- both the overall sequence of
|
|
observable side effects and the final produced value -- is dictated by the
|
|
structure of expressions. Blocks themselves are expressions, so the nesting
|
|
sequence of block, statement, expression, and block can repeatedly nest to an
|
|
arbitrary depth.
|
|
|
|
@menu
|
|
* Ref.Stmt.Decl:: Statement declaring an item or slot.
|
|
* Ref.Stmt.Expr:: Statement evaluating an expression.
|
|
@end menu
|
|
|
|
@node Ref.Stmt.Decl
|
|
@subsection Ref.Stmt.Decl
|
|
@c * Ref.Stmt.Decl:: Statement declaring an item or slot.
|
|
@cindex Declaration statement
|
|
|
|
A @dfn{declaration statement} is one that introduces a @emph{name} into the
|
|
enclosing statement block. The declared name may denote a new slot or a new
|
|
item. The scope of the name extends to the entire containing block, both
|
|
before and after the declaration.
|
|
|
|
@menu
|
|
* Ref.Stmt.Decl.Item:: Statement declaring an item.
|
|
* Ref.Stmt.Decl.Slot:: Statement declaring a slot.
|
|
@end menu
|
|
|
|
@node Ref.Stmt.Decl.Item
|
|
@subsubsection Ref.Stmt.Decl.Item
|
|
@c * Ref.Stmt.Decl.Item:: Statement declaring an item.
|
|
|
|
An @dfn{item declaration statement} has a syntactic form identical to an item
|
|
declaration within a module. Declaring an item -- a function, iterator,
|
|
object, type or module -- locally within a statement block is simply a way of
|
|
restricting its scope to a narrow region containing all of its uses; it is
|
|
otherwise identical in meaning to declaring the item outside the statement
|
|
block.
|
|
|
|
Note: there is no implicit capture of the function's dynamic environment when
|
|
declaring a function-local item.
|
|
|
|
@node Ref.Stmt.Decl.Slot
|
|
@subsubsection Ref.Stmt.Decl.Slot
|
|
@c * Ref.Stmt.Decl.Slot:: Statement declaring an slot.
|
|
@cindex Local slot
|
|
@cindex Variable, see @i{Local slot}
|
|
@cindex Type inference
|
|
@cindex Automatic slot
|
|
|
|
A @code{slot declaration statement} has one one of two forms:
|
|
|
|
@itemize
|
|
@item @code{let} @var{type} @var{slot} @var{optional-init};
|
|
@item @code{auto} @var{slot} @var{optional-init};
|
|
@end itemize
|
|
|
|
Where @var{type} is a type expression, @var{slot} is the name of the slot
|
|
being declared, and @var{optional-init} is either the empty string or an
|
|
equals sign (@code{=}) followed by a primitive expression.
|
|
|
|
Both forms introduce a new slot into the containing block scope. The new slot
|
|
is visible across the entire scope, but is initialized only at the point
|
|
following the declaration statement.
|
|
|
|
The latter (@code{auto}) form of slot declaration causes the compiler to infer
|
|
the static type of the slot through unification with the types of values
|
|
assigned to the slot in the remaining code in the block scope. Inference only
|
|
occurs on frame-local slots, not argument slots. Function, iterator and object
|
|
signatures must always declared types for all argument slots.
|
|
@xref{Ref.Mem.Slot}.
|
|
|
|
@node Ref.Stmt.Expr
|
|
@subsection Ref.Stmt.Expr
|
|
@c * Ref.Stmt.Expr:: Statement evaluating an expression
|
|
@cindex Expression statement
|
|
|
|
An @dfn{expression statement} is one that evaluates an expression and drops
|
|
its result. The purpose of an expression statement is often to cause the side
|
|
effects of the expression's evaluation.
|
|
|
|
@page
|
|
@node Ref.Expr
|
|
@section Ref.Expr
|
|
@c * Ref.Expr:: Parsed and primitive expressions.
|
|
@cindex Expressions
|
|
|
|
|
|
@menu
|
|
* Ref.Expr.Copy:: Expression for copying a value.
|
|
* Ref.Expr.Spawn:: Expressions for creating new tasks.
|
|
* Ref.Expr.Send:: Expressions for sending a value into a channel.
|
|
* Ref.Expr.Recv:: Expression for receiving a value from a channel.
|
|
* Ref.Expr.Call:: Expression for calling a function.
|
|
* Ref.Expr.Bind:: Expression for binding arguments to functions.
|
|
* Ref.Expr.Ret:: Expression for stopping and producing a value.
|
|
* Ref.Expr.Be:: Expression for stopping and executing a tail call.
|
|
* Ref.Expr.Put:: Expression for pausing and producing a value.
|
|
* Ref.Expr.Fail:: Expression for causing task failure.
|
|
* Ref.Expr.Log:: Expression for logging values to diagnostic buffers.
|
|
* Ref.Expr.Note:: Expression for logging values during failure.
|
|
* Ref.Expr.While:: Expression for simple conditional looping.
|
|
* Ref.Expr.Break:: Expression for terminating a loop.
|
|
* Ref.Expr.Cont:: Expression for terminating a single loop iteration.
|
|
* Ref.Expr.For:: Expression for looping over strings and vectors.
|
|
* Ref.Expr.Foreach:: Expression for looping via an iterator.
|
|
* Ref.Expr.If:: Expression for simple conditional branching.
|
|
* Ref.Expr.Alt:: Expression for complex conditional branching.
|
|
* Ref.Expr.Prove:: Expression for static assertion of typestate.
|
|
* Ref.Expr.Check:: Expression for dynamic assertion of typestate.
|
|
* Ref.Expr.Claim:: Expression for static (unsafe) or dynamic assertion of typestate.
|
|
* Ref.Expr.Assert:: Expression for halting the program if a
|
|
boolean condition fails to hold.
|
|
* Ref.Expr.IfCheck:: Expression for dynamic testing of typestate.
|
|
@end menu
|
|
|
|
|
|
@node Ref.Expr.Copy
|
|
@subsection Ref.Expr.Copy
|
|
@c * Ref.Expr.Copy:: Expression for copying a value.
|
|
@cindex Copy expression
|
|
@cindex Assignment operator, see @i{Copy expression}
|
|
|
|
A @dfn{copy expression} consists of an @emph{lval} followed by an equals-sign
|
|
(@code{=}) and a primitive expression. @xref{Ref.Expr}.
|
|
|
|
Executing a copy expression causes the value denoted by the expression --
|
|
either a value or a primitive combination of values -- to be copied into the
|
|
memory location denoted by the @emph{lval}.
|
|
|
|
A copy may entail the adjustment of reference counts, execution of destructors,
|
|
or similar adjustments in order to respect the path through the memory graph
|
|
implied by the @code{lval}, as well as any existing value held in the memory
|
|
being written-to. All such adjustment is automatic and implied by the @code{=}
|
|
operator.
|
|
|
|
An example of three different copy expressions:
|
|
@example
|
|
x = y;
|
|
x.y = z;
|
|
x.y = z + 2;
|
|
@end example
|
|
|
|
@node Ref.Expr.Spawn
|
|
@subsection Ref.Expr.Spawn
|
|
@c * Ref.Expr.Spawn:: Expressions creating new tasks.
|
|
@cindex Spawn expression
|
|
|
|
A @code{spawn} expression consists of keyword @code{spawn}, followed by
|
|
an optional literal string naming the new task and then a normal
|
|
@emph{call} expression (@pxref{Ref.Expr.Call}). A @code{spawn}
|
|
expression causes the runtime to construct a new task executing the
|
|
called function with the given name. The called function is referred
|
|
to as the @dfn{entry function} for the spawned task, and its arguments
|
|
are copied from the spawning task to the spawned task before the
|
|
spawned task begins execution. If no explicit name is present, the
|
|
task is implicitly named with the string of the call expression.
|
|
|
|
Functions taking alias-slot arguments, or returning non-nil values, cannot be
|
|
spawned. Iterators cannot be spawned.
|
|
|
|
The result of a @code{spawn} expression is a @code{task} value.
|
|
|
|
An example of a @code{spawn} expression:
|
|
@example
|
|
fn helper(chan[u8] out) @{
|
|
// do some work.
|
|
out <| result;
|
|
@}
|
|
|
|
let port[u8] out;
|
|
let task p = spawn helper(chan(out));
|
|
let task p2 = spawn "my_helper" helper(chan(out));
|
|
// let task run, do other things.
|
|
auto result <- out;
|
|
|
|
@end example
|
|
|
|
@node Ref.Expr.Send
|
|
@subsection Ref.Expr.Send
|
|
@c * Ref.Expr.Send:: Expressions for sending a value into a channel.
|
|
@cindex Send expression
|
|
@cindex Communication
|
|
|
|
Sending a value through a channel can be done via two different expressions.
|
|
Both expressions take an @emph{lval}, denoting a channel, and a value to send
|
|
into the channel. The action of @emph{sending} varies depending on the
|
|
@dfn{send operator} employed.
|
|
|
|
The @emph{asynchronous send} operator @code{<+} adds a value to the channel's
|
|
queue, without blocking. If the queue is full, it is extended, taking memory
|
|
from the task's domain. If the task memory budget is exhausted, a signal is
|
|
sent to the task.
|
|
|
|
The @emph{semi-synchronous send} operator @code{<|} adds a value to the
|
|
channel's queue @emph{only if} the queue has room; if the queue is full, the
|
|
operation @emph{blocks} the sender until the queue has room.
|
|
|
|
An example of an asynchronous send:
|
|
@example
|
|
chan[str] c = @dots{};
|
|
c <+ "hello, world";
|
|
@end example
|
|
|
|
An example of a semi-synchronous send:
|
|
@example
|
|
chan[str] c = @dots{};
|
|
c <| "hello, world";
|
|
@end example
|
|
|
|
@node Ref.Expr.Recv
|
|
@subsection Ref.Expr.Recv
|
|
@c * Ref.Expr.Recv:: Expression for receiving a value from a channel.
|
|
@cindex Receive expression
|
|
@cindex Communication
|
|
|
|
The @dfn{receive expression} takes an @var{lval} to receive into and an
|
|
expression denoting a port, and applies the @emph{receive operator}
|
|
(@code{<-}) to the pair, copying a value out of the port and into the
|
|
@var{lval}. The expression causes the receiving task to enter the @emph{blocked
|
|
reading} state until a task is sending a value to the port, at which point the
|
|
runtime pseudo-randomly selects a sending task and copies a value from the
|
|
head of one of the task queues to the receiving location in memory, and
|
|
un-blocks the receiving task. @xref{Ref.Run.Comm}.
|
|
|
|
An example of a @emph{receive}:
|
|
@example
|
|
port[str] p = @dots{};
|
|
let str s <- p;
|
|
@end example
|
|
|
|
@node Ref.Expr.Call
|
|
@subsection Ref.Expr.Call
|
|
@c * Ref.Expr.Call:: Expression for calling a function.
|
|
@cindex Call expression
|
|
@cindex Function calls
|
|
|
|
A @dfn{call expression} invokes a function, providing a tuple of input slots
|
|
and an alias slot to serve as the function's output, bound to the @var{lval}
|
|
on the right hand side of the call. If the function eventually returns, then
|
|
the expression completes.
|
|
|
|
A call expression statically requires that the precondition declared in the
|
|
callee's signature is satisfied by the expression prestate. In this way,
|
|
typestates propagate through function boundaries. @xref{Ref.Typestate}.
|
|
|
|
An example of a call expression:
|
|
@example
|
|
let int x = add(1, 2);
|
|
@end example
|
|
|
|
@node Ref.Expr.Bind
|
|
@subsection Ref.Expr.Bind
|
|
@c * Ref.Expr.Bind:: Expression for binding arguments to functions.
|
|
@cindex Bind expression
|
|
@cindex Closures
|
|
@cindex Currying
|
|
|
|
A @dfn{bind expression} constructs a new function from an existing
|
|
function.@footnote{The @code{bind} expression is analogous to the @code{bind}
|
|
expression in the Sather language.} The new function has zero or more of its
|
|
arguments @emph{bound} into a new, hidden boxed tuple that holds the
|
|
bindings. For each concrete argument passed in the @code{bind} expression, the
|
|
corresponding parameter in the existing function is @emph{omitted} as a
|
|
parameter of the new function. For each argument passed the placeholder symbol
|
|
@code{_} in the @code{bind} expression, the corresponding parameter of the
|
|
existing function is @emph{retained} as a parameter of the new function.
|
|
|
|
Any subsequent invocation of the new function with residual arguments causes
|
|
invocation of the existing function with the combination of bound arguments
|
|
and residual arguments that was specified during the binding.
|
|
|
|
An example of a @code{bind} expression:
|
|
@example
|
|
fn add(int x, int y) -> int @{
|
|
ret x + y;
|
|
@}
|
|
type single_param_fn = fn(int) -> int;
|
|
|
|
let single_param_fn add4 = bind add(4, _);
|
|
|
|
let single_param_fn add5 = bind add(_, 5);
|
|
|
|
assert (add(4,5) == add4(5));
|
|
assert (add(4,5) == add5(4));
|
|
|
|
@end example
|
|
|
|
A @code{bind} expression generally stores a copy of the bound arguments in the
|
|
hidden, boxed tuple, owned by the resulting first-class function. For each
|
|
bound slot in the bound function's signature, space is allocated in the hidden
|
|
tuple and populated with a copy of the bound value.
|
|
|
|
The @code{bind} expression is a lightweight mechanism for simulating the more
|
|
elaborate construct of @emph{lexical closures} that exist in other
|
|
languages. Rust has no support for lexical closures, but many realistic uses
|
|
of them can be achieved with @code{bind} expressions.
|
|
|
|
|
|
@node Ref.Expr.Ret
|
|
@subsection Ref.Expr.Ret
|
|
@c * Ref.Expr.Ret:: Expression for stopping and producing a value.
|
|
@cindex Return expression
|
|
|
|
Executing a @code{ret} expression@footnote{A @code{ret} expression is analogous
|
|
to a @code{return} expression in the C family.} copies a value into the output
|
|
slot of the current function, destroys the current function activation frame,
|
|
and transfers control to the caller frame.
|
|
|
|
An example of a @code{ret} expression:
|
|
@example
|
|
fn max(int a, int b) -> int @{
|
|
if (a > b) @{
|
|
ret a;
|
|
@}
|
|
ret b;
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Expr.Be
|
|
@subsection Ref.Expr.Be
|
|
@c * Ref.Expr.Be:: Expression for stopping and executing a tail call.
|
|
@cindex Be expression
|
|
@cindex Tail calls
|
|
|
|
Executing a @code{be} expression @footnote{A @code{be} expression in is
|
|
analogous to a @code{become} expression in Newsqueak or Alef.} destroys the
|
|
current function activation frame and replaces it with an activation frame for
|
|
the called function. In other words, @code{be} executes a tail-call. The
|
|
syntactic form of a @code{be} expression is therefore limited to @emph{tail
|
|
position}: its argument must be a @emph{call expression}, and it must be the
|
|
last expression in a block.
|
|
|
|
An example of a @code{be} expression:
|
|
@example
|
|
fn print_loop(int n) @{
|
|
if (n <= 0) @{
|
|
ret;
|
|
@} else @{
|
|
print_int(n);
|
|
be print_loop(n-1);
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
The above example executes in constant space, replacing each frame with a new
|
|
copy of itself.
|
|
|
|
|
|
|
|
@node Ref.Expr.Put
|
|
@subsection Ref.Expr.Put
|
|
@c * Ref.Expr.Put:: Expression for pausing and producing a value.
|
|
@cindex Put expression
|
|
@cindex Iterators
|
|
|
|
Executing a @code{put} expression copies a value into the output slot of the
|
|
current iterator, suspends execution of the current iterator, and transfers
|
|
control to the current put-recipient frame.
|
|
|
|
A @code{put} expression is only valid within an iterator. @footnote{A
|
|
@code{put} expression is analogous to a @code{yield} expression in the CLU, and
|
|
Sather languages, or in more recent languages providing a ``generator''
|
|
facility, such as Python, Javascript or C#. Like the generators of CLU and
|
|
Sather but @emph{unlike} these later languages, Rust's iterators reside on the
|
|
stack and obey a strict stack discipline.} The current put-recipient will
|
|
eventually resume the suspended iterator containing the @code{put} expression,
|
|
either continuing execution after the @code{put} expression, or terminating its
|
|
execution and destroying the iterator frame.
|
|
|
|
|
|
@node Ref.Expr.Fail
|
|
@subsection Ref.Expr.Fail
|
|
@c * Ref.Expr.Fail:: Expression for causing task failure.
|
|
@cindex Fail expression
|
|
@cindex Failure
|
|
@cindex Unwinding
|
|
|
|
Executing a @code{fail} expression causes a task to enter the @emph{failing}
|
|
state. In the @emph{failing} state, a task unwinds its stack, destroying all
|
|
frames and freeing all resources until it reaches its entry frame, at which
|
|
point it halts execution in the @emph{dead} state.
|
|
|
|
@node Ref.Expr.Log
|
|
@subsection Ref.Expr.Log
|
|
@c * Ref.Expr.Log:: Expression for logging values to diagnostic buffers.
|
|
@cindex Log expression
|
|
@cindex Logging
|
|
|
|
Executing a @code{log} expression may, depending on runtime configuration,
|
|
cause a value to be appended to an internal diagnostic logging buffer provided
|
|
by the runtime or emitted to a system console. Log expressions are enabled or
|
|
disabled dynamically at run-time on a per-task and per-item
|
|
basis. @xref{Ref.Run.Log}.
|
|
|
|
@example
|
|
@end example
|
|
|
|
@node Ref.Expr.Note
|
|
@subsection Ref.Expr.Note
|
|
@c * Ref.Expr.Note:: Expression for logging values during failure.
|
|
@cindex Note expression
|
|
@cindex Logging
|
|
@cindex Unwinding
|
|
@cindex Failure
|
|
|
|
A @code{note} expression has no effect during normal execution. The purpose of
|
|
a @code{note} expression is to provide additional diagnostic information to the
|
|
logging subsystem during task failure. @xref{Ref.Expr.Log}. Using @code{note}
|
|
expressions, normal diagnostic logging can be kept relatively sparse, while
|
|
still providing verbose diagnostic ``back-traces'' when a task fails.
|
|
|
|
When a task is failing, control frames @emph{unwind} from the innermost frame
|
|
to the outermost, and from the innermost lexical block within an unwinding
|
|
frame to the outermost. When unwinding a lexical block, the runtime processes
|
|
all the @code{note} expressions in the block sequentially, from the first
|
|
expression of the block to the last. During processing, a @code{note}
|
|
expression has equivalent meaning to a @code{log} expression: it causes the
|
|
runtime to append the argument of the @code{note} to the internal logging
|
|
diagnostic buffer.
|
|
|
|
An example of a @code{note} expression:
|
|
@example
|
|
fn read_file_lines(&str path) -> vec[str] @{
|
|
note path;
|
|
vec[str] r;
|
|
file f = open_read(path);
|
|
for each (str s in lines(f)) @{
|
|
vec.append(r,s);
|
|
@}
|
|
ret r;
|
|
@}
|
|
@end example
|
|
|
|
In this example, if the task fails while attempting to open or read a file,
|
|
the runtime will log the path name that was being read. If the function
|
|
completes normally, the runtime will not log the path.
|
|
|
|
A value that is marked by a @code{note} expression is @emph{not} copied aside
|
|
when control passes through the @code{note}. In other words, if a @code{note}
|
|
expression notes a particular @var{lval}, and code after the @code{note}
|
|
mutates that slot, and then a subsequent failure occurs, the @emph{mutated}
|
|
value will be logged during unwinding, @emph{not} the original value that was
|
|
denoted by the @var{lval} at the moment control passed through the @code{note}
|
|
expression.
|
|
|
|
@node Ref.Expr.While
|
|
@subsection Ref.Expr.While
|
|
@c * Ref.Expr.While:: Expression for simple conditional looping.
|
|
@cindex While expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
A @code{while} expression is a loop construct. A @code{while} loop may be
|
|
either a simple @code{while} or a @code{do}-@code{while} loop.
|
|
|
|
In the case of a simple @code{while}, the loop begins by evaluating the
|
|
boolean loop conditional expression. If the loop conditional expression
|
|
evaluates to @code{true}, the loop body block executes and control returns to
|
|
the loop conditional expression. If the loop conditional expression evaluates
|
|
to @code{false}, the @code{while} expression completes.
|
|
|
|
In the case of a @code{do}-@code{while}, the loop begins with an execution of
|
|
the loop body. After the loop body executes, it evaluates the loop conditional
|
|
expression. If it evaluates to @code{true}, control returns to the beginning
|
|
of the loop body. If it evaluates to @code{false}, control exits the loop.
|
|
|
|
An example of a simple @code{while} expression:
|
|
@example
|
|
while (i < 10) @{
|
|
print("hello\n");
|
|
i = i + 1;
|
|
@}
|
|
@end example
|
|
|
|
An example of a @code{do}-@code{while} expression:
|
|
@example
|
|
do @{
|
|
print("hello\n");
|
|
i = i + 1;
|
|
@} while (i < 10);
|
|
@end example
|
|
|
|
@node Ref.Expr.Break
|
|
@subsection Ref.Expr.Break
|
|
@c * Ref.Expr.Break:: Expression for terminating a loop.
|
|
@cindex Break expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
Executing a @code{break} expression immediately terminates the innermost loop
|
|
enclosing it. It is only permitted in the body of a loop.
|
|
|
|
@node Ref.Expr.Cont
|
|
@subsection Ref.Expr.Cont
|
|
@c * Ref.Expr.Cont:: Expression for terminating a single loop iteration.
|
|
@cindex Continue expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
Executing a @code{cont} expression immediately terminates the current iteration
|
|
of the innermost loop enclosing it, returning control to the loop
|
|
@emph{head}. In the case of a @code{while} loop, the head is the conditional
|
|
expression controlling the loop. In the case of a @code{for} or @code{for
|
|
each} loop, the head is the iterator or vector-slice increment controlling the
|
|
loop.
|
|
|
|
A @code{cont} expression is only permitted in the body of a loop.
|
|
|
|
|
|
@node Ref.Expr.For
|
|
@subsection Ref.Expr.For
|
|
@c * Ref.Expr.For:: Expression for looping over strings and vectors.
|
|
@cindex For expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
A @dfn{for loop} is controlled by a vector or string. The for loop
|
|
bounds-checks the underlying sequence @emph{once} when initiating the loop,
|
|
then repeatedly copies each value of the underlying sequence into the element
|
|
variable, executing the loop body once per copy. To perform a for loop on a
|
|
sub-range of a vector or string, form a temporary slice over the sub-range and
|
|
run the loop over the slice.
|
|
|
|
Example of 4 for loops, all identical:
|
|
@example
|
|
let vec[foo] v = vec(a, b, c);
|
|
|
|
for (foo e in v.(0, _vec.len(v))) @{
|
|
bar(e);
|
|
@}
|
|
|
|
for (foo e in v.(0,)) @{
|
|
bar(e);
|
|
@}
|
|
|
|
for (foo e in v.(,)) @{
|
|
bar(e);
|
|
@}
|
|
|
|
for (foo e in v) @{
|
|
bar(e);
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Expr.Foreach
|
|
@subsection Ref.Expr.Foreach
|
|
@c * Ref.Expr.Foreach:: Expression for general conditional looping.
|
|
@cindex Foreach expression
|
|
@cindex Loops
|
|
@cindex Control-flow
|
|
|
|
An @dfn{foreach loop} is denoted by the @code{for each} keywords, and is
|
|
controlled by an iterator. The loop executes once for each value @code{put} by
|
|
the iterator. When the iterator returns or fails, the loop terminates.
|
|
|
|
Example of a foreach loop:
|
|
@example
|
|
let str txt;
|
|
let vec[str] lines;
|
|
for each (str s in _str.split(txt, "\n")) @{
|
|
vec.push(lines, s);
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Expr.If
|
|
@subsection Ref.Expr.If
|
|
@c * Ref.Expr.If:: Expression for simple conditional branching.
|
|
@cindex If expression
|
|
@cindex Control-flow
|
|
|
|
An @code{if} expression is a conditional branch in program control. The form of
|
|
an @code{if} expression is a parenthesized condition expression, followed by a
|
|
consequent block, any number of @code{else if} conditions and blocks, and an
|
|
optional trailing @code{else} block. The condition expressions must have type
|
|
@code{bool}. If a condition expression evaluates to @code{true}, the
|
|
consequent block is executed and any subsequent @code{else if} or @code{else}
|
|
block is skipped. If a condition expression evaluates to @code{false}, the
|
|
consequent block is skipped and any subsequent @code{else if} condition is
|
|
evaluated. If all @code{if} and @code{else if} conditions evaluate to @code{false}
|
|
then any @code{else} block is executed.
|
|
|
|
@node Ref.Expr.Alt
|
|
@subsection Ref.Expr.Alt
|
|
@c * Ref.Expr.Alt:: Expression for complex conditional branching.
|
|
@cindex Alt expression
|
|
@cindex Control-flow
|
|
@cindex Switch expression, see @i{Alt expression}
|
|
|
|
An @code{alt} expression is a multi-directional branch in program control.
|
|
There are three kinds of @code{alt} expression: communication @code{alt}
|
|
expressions, pattern @code{alt} expressions, and @code{alt type} expressions.
|
|
|
|
The form of each kind of @code{alt} is similar: an initial @emph{head} that
|
|
describes the criteria for branching, followed by a sequence of zero or more
|
|
@emph{arms}, each of which describes a @emph{case} and provides a @emph{block}
|
|
of expressions associated with the case. When an @code{alt} is executed,
|
|
control enters the head, determines which of the cases to branch to, branches
|
|
to the block associated with the chosen case, and then proceeds to the
|
|
expression following the @code{alt} when the case block completes.
|
|
|
|
@menu
|
|
* Ref.Expr.Alt.Comm:: Expression for branching on communication events.
|
|
* Ref.Expr.Alt.Pat:: Expression for branching on pattern matches.
|
|
* Ref.Expr.Alt.Type:: Expression for branching on types.
|
|
@end menu
|
|
|
|
@node Ref.Expr.Alt.Comm
|
|
@subsubsection Ref.Expr.Alt.Comm
|
|
@c * Ref.Expr.Alt.Comm:: Expression for branching on communication events.
|
|
@cindex Communication alt expression
|
|
@cindex Control-flow
|
|
@cindex Communication
|
|
@cindex Multiplexing
|
|
|
|
The simplest form of @code{alt} expression is the a @emph{communication}
|
|
@code{alt}. The cases of a communication @code{alt}'s arms are send and
|
|
receive expressions. @xref{Ref.Task.Comm}.
|
|
|
|
To execute a communication @code{alt}, the runtime checks all of the ports and
|
|
channels involved in the arms of the expression to see if any @code{case} can
|
|
execute without blocking. If no @code{case} can execute, the task blocks, and
|
|
the runtime unblocks the task when a @code{case} @emph{can} execute. The
|
|
runtime then makes a pseudo-random choice from among the set of @code{case}
|
|
expressions that can execute, executes the expression of the @code{case} and
|
|
branches to the block of that arm.
|
|
|
|
An example of a communication @code{alt} expression:
|
|
@example
|
|
let chan c[int] = foo();
|
|
let port p[str] = bar();
|
|
let int x = 0;
|
|
let vec[str] strs;
|
|
|
|
alt @{
|
|
case (str s <- p) @{
|
|
vec.append(strs, s);
|
|
@}
|
|
case (c <| x) @{
|
|
x++;
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Expr.Alt.Pat
|
|
@subsubsection Ref.Expr.Alt.Pat
|
|
@c * Ref.Expr.Alt.Pat:: Expression for branching on pattern matches.
|
|
@cindex Pattern alt expression
|
|
@cindex Control-flow
|
|
|
|
A pattern @code{alt} expression branches on a @emph{pattern}. The exact form of
|
|
matching that occurs depends on the pattern. Patterns consist of some
|
|
combination of literals, tag constructors, variable binding specifications and
|
|
placeholders (@code{_}). A pattern @code{alt} has a parenthesized @emph{head
|
|
expression}, which is the value to compare to the patterns. The type of the
|
|
patterns must equal the type of the head expression.
|
|
|
|
To execute a pattern @code{alt} expression, first the head expression is
|
|
evaluated, then its value is sequentially compared to the patterns in the arms
|
|
until a match is found. The first arm with a matching @code{case} pattern is
|
|
chosen as the branch target of the @code{alt}, any variables bound by the
|
|
pattern are assigned to local @emph{auto} slots in the arm's block, and
|
|
control enters the block.
|
|
|
|
An example of a pattern @code{alt} expression:
|
|
|
|
@example
|
|
type list[X] = tag(nil, cons(X, @@list[X]));
|
|
|
|
let list[int] x = cons(10, cons(11, nil));
|
|
|
|
alt (x) @{
|
|
case (cons(a, cons(b, _))) @{
|
|
process_pair(a,b);
|
|
@}
|
|
case (cons(v=10, _)) @{
|
|
process_ten(v);
|
|
@}
|
|
case (_) @{
|
|
fail;
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Expr.Alt.Type
|
|
@subsubsection Ref.Expr.Alt.Type
|
|
@c * Ref.Expr.Alt.Type:: Expression for branching on type.
|
|
@cindex Type alt expression
|
|
@cindex Control-flow
|
|
|
|
An @code{alt type} expression is similar to a pattern @code{alt}, but branches
|
|
on the @emph{type} of its head expression, rather than the value. The head
|
|
expression of an @code{alt type} expression must be of type @code{any}, and the
|
|
arms of the expression are slot patterns rather than value patterns. Control
|
|
branches to the arm with a @code{case} that matches the @emph{actual type} of
|
|
the value in the @code{any}.
|
|
|
|
An example of an @code{alt type} expression:
|
|
|
|
@example
|
|
let any x = foo();
|
|
|
|
alt type (x) @{
|
|
case (int i) @{
|
|
ret i;
|
|
@}
|
|
case (list[int] li) @{
|
|
ret int_list_sum(li);
|
|
@}
|
|
case (list[X] lx) @{
|
|
ret list_len(lx);
|
|
@}
|
|
case (_) @{
|
|
ret 0;
|
|
@}
|
|
@}
|
|
@end example
|
|
|
|
|
|
@node Ref.Expr.Prove
|
|
@subsection Ref.Expr.Prove
|
|
@c * Ref.Expr.Prove:: Expression for static assertion of typestate.
|
|
@cindex Prove expression
|
|
@cindex Typestate system
|
|
|
|
A @code{prove} expression has no run-time effect. Its purpose is to statically
|
|
check (and document) that its argument constraint holds at its expression entry
|
|
point. If its argument typestate does not hold, under the typestate algorithm,
|
|
the program containing it will fail to compile.
|
|
|
|
@node Ref.Expr.Check
|
|
@subsection Ref.Expr.Check
|
|
@c * Ref.Expr.Check:: Expression for dynamic assertion of typestate.
|
|
@cindex Check expression
|
|
@cindex Typestate system
|
|
|
|
A @code{check} expression connects dynamic assertions made at run-time to the
|
|
static typestate system. A @code{check} expression takes a constraint to check
|
|
at run-time. If the constraint holds at run-time, control passes through the
|
|
@code{check} and on to the next expression in the enclosing block. If the
|
|
condition fails to hold at run-time, the @code{check} expression behaves as a
|
|
@code{fail} expression.
|
|
|
|
The typestate algorithm is built around @code{check} expressions, and in
|
|
particular the fact that control @emph{will not pass} a check expression with a
|
|
condition that fails to hold. The typestate algorithm can therefore assume
|
|
that the (static) postcondition of a @code{check} expression includes the
|
|
checked constraint itself. From there, the typestate algorithm can perform
|
|
dataflow calculations on subsequent expressions, propagating conditions forward
|
|
and statically comparing implied states and their
|
|
specifications. @xref{Ref.Typestate}.
|
|
|
|
@example
|
|
pred even(&int x) -> bool @{
|
|
ret x & 1 == 0;
|
|
@}
|
|
|
|
fn print_even(int x) : even(x) @{
|
|
print(x);
|
|
@}
|
|
|
|
fn test() @{
|
|
let int y = 8;
|
|
|
|
// Cannot call print_even(y) here.
|
|
|
|
check even(y);
|
|
|
|
// Can call print_even(y) here, since even(y) now holds.
|
|
print_even(y);
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Expr.Claim
|
|
@subsection Ref.Expr.Claim
|
|
@c * Ref.Expr.Claim:: Expression for static (unsafe) or dynamic assertion of typestate.
|
|
@cindex Claim expression
|
|
@cindex Typestate system
|
|
|
|
A @code{claim} expression is an unsafe variant on a @code{check} expression
|
|
that is not actually checked at runtime. Thus, using a @code{claim} implies a
|
|
proof obligation to ensure---without compiler assistance---that an assertion
|
|
always holds.
|
|
|
|
Setting a runtime flag can turn all @code{claim} expressions
|
|
into @code{check} expressions in a compiled Rust program, but the default is to not check the assertion
|
|
contained in a @code{claim}. The idea behind @code{claim} is that performance profiling might identify a
|
|
few bottlenecks in the code where actually checking a given callee's predicate
|
|
is too expensive; @code{claim} allows the code to typecheck without removing
|
|
the predicate check at every other call site.
|
|
|
|
@node Ref.Expr.IfCheck
|
|
@subsection Ref.Expr.IfCheck
|
|
@c * Ref.Expr.IfCheck:: Expression for dynamic testing of typestate.
|
|
@cindex If check expression
|
|
@cindex Typestate system
|
|
@cindex Control-flow
|
|
|
|
An @code{if check} expression combines a @code{if} expression and a @code{check}
|
|
expression in an indivisible unit that can be used to build more complex
|
|
conditional control-flow than the @code{check} expression affords.
|
|
|
|
In fact, @code{if check} is a ``more primitive'' expression than @code{check};
|
|
instances of the latter can be rewritten as instances of the former. The
|
|
following two examples are equivalent:
|
|
|
|
@sp 1
|
|
Example using @code{check}:
|
|
@example
|
|
check even(x);
|
|
print_even(x);
|
|
@end example
|
|
|
|
@sp 1
|
|
Equivalent example using @code{if check}:
|
|
@example
|
|
if check even(x) @{
|
|
print_even(x);
|
|
@} else @{
|
|
fail;
|
|
@}
|
|
@end example
|
|
|
|
@node Ref.Expr.Assert
|
|
@subsection Ref.Expr.Assert
|
|
@c * Ref.Expr.Assert:: Expression that halts the program if a boolean condition fails to hold.
|
|
@cindex Assertions
|
|
|
|
An @code{assert} expression is similar to a @code{check} expression, except
|
|
the condition may be any boolean-typed expression, and the compiler makes no
|
|
use of the knowledge that the condition holds if the program continues to
|
|
execute after the @code{assert}.
|
|
|
|
@page
|
|
@node Ref.Run
|
|
@section Ref.Run
|
|
@c * Ref.Run:: Organization of runtime services.
|
|
@cindex Runtime library
|
|
|
|
The Rust @dfn{runtime} is a relatively compact collection of C and Rust code
|
|
that provides fundamental services and datatypes to all Rust tasks at
|
|
run-time. It is smaller and simpler than many modern language runtimes. It is
|
|
tightly integrated into the language's execution model of memory, tasks,
|
|
communication, reflection, logging and signal handling.
|
|
|
|
@menu
|
|
* Ref.Run.Mem:: Runtime memory management service.
|
|
* Ref.Run.Type:: Runtime built-in type services.
|
|
* Ref.Run.Comm:: Runtime communication service.
|
|
* Ref.Run.Refl:: Runtime reflection system.
|
|
* Ref.Run.Log:: Runtime logging system.
|
|
* Ref.Run.Sig:: Runtime signal handler.
|
|
@end menu
|
|
|
|
@node Ref.Run.Mem
|
|
@subsection Ref.Run.Mem
|
|
@c * Ref.Run.Mem:: Runtime memory management service.
|
|
@cindex Memory allocation
|
|
|
|
The runtime memory-management system is based on a @emph{service-provider
|
|
interface}, through which the runtime requests blocks of memory from its
|
|
environment and releases them back to its environment when they are no longer
|
|
in use. The default implementation of the service-provider interface consists
|
|
of the C runtime functions @code{malloc} and @code{free}.
|
|
|
|
The runtime memory-management system in turn supplies Rust tasks with
|
|
facilities for allocating, extending and releasing stacks, as well as
|
|
allocating and freeing boxed values.
|
|
|
|
@node Ref.Run.Type
|
|
@subsection Ref.Run.Type
|
|
@c * Ref.Run.Mem:: Runtime built-in type services.
|
|
@cindex Built-in types
|
|
|
|
The runtime provides C and Rust code to manage several built-in types:
|
|
@itemize
|
|
@item @code{vec}, the type of vectors.
|
|
@item @code{str}, the type of UTF-8 strings.
|
|
@item @code{big}, the type of arbitrary-precision integers.
|
|
@item @code{chan}, the type of communication channels.
|
|
@item @code{port}, the type of communication ports.
|
|
@item @code{task}, the type of tasks.
|
|
@end itemize
|
|
Support for other built-in types such as simple types, tuples,
|
|
records, and tags is open-coded by the Rust compiler.
|
|
|
|
@node Ref.Run.Comm
|
|
@subsection Ref.Run.Comm
|
|
@c * Ref.Run.Comm:: Runtime communication service.
|
|
@cindex Communication
|
|
@cindex Process
|
|
@cindex Thread
|
|
|
|
The runtime provides code to manage inter-task communication. This includes
|
|
the system of task-lifecycle state transitions depending on the contents of
|
|
queues, as well as code to copy values between queues and their recipients and
|
|
to serialize values for transmission over operating-system inter-process
|
|
communication facilities.
|
|
|
|
@node Ref.Run.Refl
|
|
@subsection Ref.Run.Refl
|
|
@c * Ref.Run.Refl:: Runtime reflection system.
|
|
@cindex Reflection
|
|
@cindex DWARF
|
|
|
|
The runtime reflection system is driven by the DWARF tables emitted into a
|
|
crate at compile-time. Reflecting on a slot or item allocates a Rust data
|
|
structure corresponding to the DWARF DIE for that slot or item.
|
|
|
|
@node Ref.Run.Log
|
|
@subsection Ref.Run.Log
|
|
@c * Ref.Run.Log:: Runtime logging system.
|
|
@cindex Logging
|
|
|
|
The runtime contains a system for directing logging expressions to a logging
|
|
console and/or internal logging buffers. @xref{Ref.Expr.Log}. Logging
|
|
expressions can be enabled or disabled via a two-dimensional filtering process:
|
|
|
|
@itemize
|
|
|
|
@sp 1
|
|
@item
|
|
By Item
|
|
|
|
Each @emph{item} (module, function, iterator, object, type) in Rust has a
|
|
static name-path within its crate module, and can have logging enabled or
|
|
disabled on a name-path-prefix basis.
|
|
|
|
@sp 1
|
|
@item
|
|
By Task
|
|
|
|
Each @emph{task} in a running Rust program has a unique ownership-path through
|
|
the task ownership tree, and can have logging enabled or disabled on an
|
|
ownership-path-prefix basis.
|
|
@end itemize
|
|
|
|
Logging is integrated into the language for efficiency reasons, as well as the
|
|
need to filter logs based on these two built-in dimensions.
|
|
|
|
@node Ref.Run.Sig
|
|
@subsection Ref.Run.Sig
|
|
@c * Ref.Run.Sig:: Runtime signal handler.
|
|
@cindex Signals
|
|
|
|
The runtime signal-handling system is driven by a signal-dispatch table and a
|
|
signal queue associated with each task. Sending a signal to a task inserts the
|
|
signal into the task's signal queue and marks the task as having a pending
|
|
signal. At the next scheduling opportunity, the runtime processes signals in
|
|
the task's queue using its dispatch table. The signal queue memory is charged
|
|
to the task's domain; if the queue grows too big, the task will fail.
|
|
|
|
@c ############################################################
|
|
@c end main body of nodes
|
|
@c ############################################################
|
|
|
|
@page
|
|
@node Index
|
|
@chapter Index
|
|
|
|
@printindex cp
|
|
|
|
@bye
|
|
|
|
@c Local Variables:
|
|
@c mode: texinfo
|
|
@c fill-column: 78;
|
|
@c indent-tabs-mode: nil
|
|
@c buffer-file-coding-system: utf-8-unix
|
|
@c compile-command: "make -C $RBUILD -k 2>&1 | sed -e 's/\\/x\\//x:\\//g'";
|
|
@c End:
|