f624427f87
Optimize `core::str::Chars::count` I wrote this a while ago after seeing this function as a bottleneck in a profile, but never got around to contributing it. I saw it again, and so here it is. The implementation is fairly complex, but I tried to explain what's happening at both a high level (in the header comment for the file), and in line comments in the impl. Hopefully it's clear enough. This implementation (`case00_cur_libcore` in the benchmarks below) is somewhat consistently around 4x to 5x faster than the old implementation (`case01_old_libcore` in the benchmarks below), for a wide variety of workloads, without regressing performance on any of the workload sizes I've tried. I also improved the benchmarks for this code, so that they explicitly check text in different languages and of different sizes (err, the cross product of language x size). The results of the benchmarks are here: <details> <summary>Benchmark results</summary> <pre> test str::char_count::emoji_huge::case00_cur_libcore ... bench: 20,216 ns/iter (+/- 3,673) = 17931 MB/s test str::char_count::emoji_huge::case01_old_libcore ... bench: 108,851 ns/iter (+/- 12,777) = 3330 MB/s test str::char_count::emoji_huge::case02_iter_increment ... bench: 329,502 ns/iter (+/- 4,163) = 1100 MB/s test str::char_count::emoji_huge::case03_manual_char_len ... bench: 223,333 ns/iter (+/- 14,167) = 1623 MB/s test str::char_count::emoji_large::case00_cur_libcore ... bench: 293 ns/iter (+/- 6) = 19331 MB/s test str::char_count::emoji_large::case01_old_libcore ... bench: 1,681 ns/iter (+/- 28) = 3369 MB/s test str::char_count::emoji_large::case02_iter_increment ... bench: 5,166 ns/iter (+/- 85) = 1096 MB/s test str::char_count::emoji_large::case03_manual_char_len ... bench: 3,476 ns/iter (+/- 62) = 1629 MB/s test str::char_count::emoji_medium::case00_cur_libcore ... bench: 48 ns/iter (+/- 0) = 14750 MB/s test str::char_count::emoji_medium::case01_old_libcore ... bench: 217 ns/iter (+/- 4) = 3262 MB/s test str::char_count::emoji_medium::case02_iter_increment ... bench: 642 ns/iter (+/- 7) = 1102 MB/s test str::char_count::emoji_medium::case03_manual_char_len ... bench: 445 ns/iter (+/- 3) = 1591 MB/s test str::char_count::emoji_small::case00_cur_libcore ... bench: 18 ns/iter (+/- 0) = 3777 MB/s test str::char_count::emoji_small::case01_old_libcore ... bench: 23 ns/iter (+/- 0) = 2956 MB/s test str::char_count::emoji_small::case02_iter_increment ... bench: 66 ns/iter (+/- 2) = 1030 MB/s test str::char_count::emoji_small::case03_manual_char_len ... bench: 29 ns/iter (+/- 1) = 2344 MB/s test str::char_count::en_huge::case00_cur_libcore ... bench: 25,909 ns/iter (+/- 39,260) = 13299 MB/s test str::char_count::en_huge::case01_old_libcore ... bench: 102,887 ns/iter (+/- 3,257) = 3349 MB/s test str::char_count::en_huge::case02_iter_increment ... bench: 166,370 ns/iter (+/- 12,439) = 2071 MB/s test str::char_count::en_huge::case03_manual_char_len ... bench: 166,332 ns/iter (+/- 4,262) = 2071 MB/s test str::char_count::en_large::case00_cur_libcore ... bench: 281 ns/iter (+/- 6) = 19160 MB/s test str::char_count::en_large::case01_old_libcore ... bench: 1,598 ns/iter (+/- 19) = 3369 MB/s test str::char_count::en_large::case02_iter_increment ... bench: 2,598 ns/iter (+/- 167) = 2072 MB/s test str::char_count::en_large::case03_manual_char_len ... bench: 2,578 ns/iter (+/- 55) = 2088 MB/s test str::char_count::en_medium::case00_cur_libcore ... bench: 44 ns/iter (+/- 1) = 15295 MB/s test str::char_count::en_medium::case01_old_libcore ... bench: 201 ns/iter (+/- 51) = 3348 MB/s test str::char_count::en_medium::case02_iter_increment ... bench: 322 ns/iter (+/- 40) = 2090 MB/s test str::char_count::en_medium::case03_manual_char_len ... bench: 319 ns/iter (+/- 5) = 2109 MB/s test str::char_count::en_small::case00_cur_libcore ... bench: 15 ns/iter (+/- 0) = 2333 MB/s test str::char_count::en_small::case01_old_libcore ... bench: 14 ns/iter (+/- 0) = 2500 MB/s test str::char_count::en_small::case02_iter_increment ... bench: 30 ns/iter (+/- 1) = 1166 MB/s test str::char_count::en_small::case03_manual_char_len ... bench: 30 ns/iter (+/- 1) = 1166 MB/s test str::char_count::ru_huge::case00_cur_libcore ... bench: 16,439 ns/iter (+/- 3,105) = 19777 MB/s test str::char_count::ru_huge::case01_old_libcore ... bench: 89,480 ns/iter (+/- 2,555) = 3633 MB/s test str::char_count::ru_huge::case02_iter_increment ... bench: 217,703 ns/iter (+/- 22,185) = 1493 MB/s test str::char_count::ru_huge::case03_manual_char_len ... bench: 157,330 ns/iter (+/- 19,188) = 2066 MB/s test str::char_count::ru_large::case00_cur_libcore ... bench: 243 ns/iter (+/- 6) = 20905 MB/s test str::char_count::ru_large::case01_old_libcore ... bench: 1,384 ns/iter (+/- 51) = 3670 MB/s test str::char_count::ru_large::case02_iter_increment ... bench: 3,381 ns/iter (+/- 543) = 1502 MB/s test str::char_count::ru_large::case03_manual_char_len ... bench: 2,423 ns/iter (+/- 429) = 2096 MB/s test str::char_count::ru_medium::case00_cur_libcore ... bench: 42 ns/iter (+/- 1) = 15119 MB/s test str::char_count::ru_medium::case01_old_libcore ... bench: 180 ns/iter (+/- 4) = 3527 MB/s test str::char_count::ru_medium::case02_iter_increment ... bench: 402 ns/iter (+/- 45) = 1579 MB/s test str::char_count::ru_medium::case03_manual_char_len ... bench: 280 ns/iter (+/- 29) = 2267 MB/s test str::char_count::ru_small::case00_cur_libcore ... bench: 12 ns/iter (+/- 0) = 2666 MB/s test str::char_count::ru_small::case01_old_libcore ... bench: 12 ns/iter (+/- 0) = 2666 MB/s test str::char_count::ru_small::case02_iter_increment ... bench: 19 ns/iter (+/- 0) = 1684 MB/s test str::char_count::ru_small::case03_manual_char_len ... bench: 14 ns/iter (+/- 1) = 2285 MB/s test str::char_count::zh_huge::case00_cur_libcore ... bench: 15,053 ns/iter (+/- 2,640) = 20067 MB/s test str::char_count::zh_huge::case01_old_libcore ... bench: 82,622 ns/iter (+/- 3,602) = 3656 MB/s test str::char_count::zh_huge::case02_iter_increment ... bench: 230,456 ns/iter (+/- 7,246) = 1310 MB/s test str::char_count::zh_huge::case03_manual_char_len ... bench: 220,595 ns/iter (+/- 11,624) = 1369 MB/s test str::char_count::zh_large::case00_cur_libcore ... bench: 227 ns/iter (+/- 65) = 20792 MB/s test str::char_count::zh_large::case01_old_libcore ... bench: 1,136 ns/iter (+/- 144) = 4154 MB/s test str::char_count::zh_large::case02_iter_increment ... bench: 3,147 ns/iter (+/- 253) = 1499 MB/s test str::char_count::zh_large::case03_manual_char_len ... bench: 2,993 ns/iter (+/- 400) = 1577 MB/s test str::char_count::zh_medium::case00_cur_libcore ... bench: 36 ns/iter (+/- 5) = 16388 MB/s test str::char_count::zh_medium::case01_old_libcore ... bench: 142 ns/iter (+/- 18) = 4154 MB/s test str::char_count::zh_medium::case02_iter_increment ... bench: 379 ns/iter (+/- 37) = 1556 MB/s test str::char_count::zh_medium::case03_manual_char_len ... bench: 364 ns/iter (+/- 51) = 1620 MB/s test str::char_count::zh_small::case00_cur_libcore ... bench: 11 ns/iter (+/- 1) = 3000 MB/s test str::char_count::zh_small::case01_old_libcore ... bench: 11 ns/iter (+/- 1) = 3000 MB/s test str::char_count::zh_small::case02_iter_increment ... bench: 20 ns/iter (+/- 3) = 1650 MB/s </pre> </details> I also added fairly thorough tests for different sizes and alignments. This completes on my machine in 0.02s, which is surprising given how thorough they are, but it seems to detect bugs in the implementation. (I haven't run the tests on a 32 bit machine yet since before I reworked the code a little though, so... hopefully I'm not about to embarrass myself). This uses similar SWAR-style techniques to the `is_ascii` impl I contributed in https://github.com/rust-lang/rust/pull/74066, so I'm going to request review from the same person who reviewed that one. That said am not particularly picky, and might not have the correct syntax for requesting a review from someone (so it goes). r? `@nagisa` |
||
---|---|---|
.github | ||
compiler | ||
library | ||
src | ||
.editorconfig | ||
.gitattributes | ||
.gitignore | ||
.gitmodules | ||
.mailmap | ||
Cargo.lock | ||
Cargo.toml | ||
CODE_OF_CONDUCT.md | ||
config.toml.example | ||
configure | ||
CONTRIBUTING.md | ||
COPYRIGHT | ||
LICENSE-APACHE | ||
LICENSE-MIT | ||
README.md | ||
RELEASES.md | ||
rustfmt.toml | ||
triagebot.toml | ||
x.py |
The Rust Programming Language
This is the main source code repository for Rust. It contains the compiler, standard library, and documentation.
Note: this README is for users rather than contributors. If you wish to contribute to the compiler, you should read the Getting Started section of the rustc-dev-guide instead. You can ask for help in the #new members Zulip stream.
Quick Start
Read "Installation" from The Book.
Installing from Source
The Rust build system uses a Python script called x.py
to build the compiler,
which manages the bootstrapping process. It lives in the root of the project.
The x.py
command can be run directly on most systems in the following format:
./x.py <subcommand> [flags]
This is how the documentation and examples assume you are running x.py
.
Systems such as Ubuntu 20.04 LTS do not create the necessary python
command by default when Python is installed that allows x.py
to be run directly. In that case you can either create a symlink for python
(Ubuntu provides the python-is-python3
package for this), or run x.py
using Python itself:
# Python 3
python3 x.py <subcommand> [flags]
# Python 2.7
python2.7 x.py <subcommand> [flags]
More information about x.py
can be found
by running it with the --help
flag or reading the rustc dev guide.
Building on a Unix-like system
-
Make sure you have installed the dependencies:
g++
5.1 or later orclang++
3.5 or laterpython
3 or 2.7- GNU
make
3.81 or later cmake
3.13.4 or laterninja
curl
git
ssl
which comes inlibssl-dev
oropenssl-devel
pkg-config
if you are compiling on Linux and targeting Linux
-
Clone the source with
git
:git clone https://github.com/rust-lang/rust.git cd rust
-
Configure the build settings:
The Rust build system uses a file named
config.toml
in the root of the source tree to determine various configuration settings for the build. Copy the defaultconfig.toml.example
toconfig.toml
to get started.cp config.toml.example config.toml
If you plan to use
x.py install
to create an installation, it is recommended that you set theprefix
value in the[install]
section to a directory.Create install directory if you are not installing in default directory
-
Build and install:
./x.py build && ./x.py install
When complete,
./x.py install
will place several programs into$PREFIX/bin
:rustc
, the Rust compiler, andrustdoc
, the API-documentation tool. This install does not include Cargo, Rust's package manager. To build and install Cargo, you may run./x.py install cargo
or set thebuild.extended
key inconfig.toml
totrue
to build and install all tools.
Building on Windows
There are two prominent ABIs in use on Windows: the native (MSVC) ABI used by Visual Studio, and the GNU ABI used by the GCC toolchain. Which version of Rust you need depends largely on what C/C++ libraries you want to interoperate with: for interop with software produced by Visual Studio use the MSVC build of Rust; for interop with GNU software built using the MinGW/MSYS2 toolchain use the GNU build.
MinGW
MSYS2 can be used to easily build Rust on Windows:
-
Grab the latest MSYS2 installer and go through the installer.
-
Run
mingw32_shell.bat
ormingw64_shell.bat
from wherever you installed MSYS2 (i.e.C:\msys64
), depending on whether you want 32-bit or 64-bit Rust. (As of the latest version of MSYS2 you have to runmsys2_shell.cmd -mingw32
ormsys2_shell.cmd -mingw64
from the command line instead) -
From this terminal, install the required tools:
# Update package mirrors (may be needed if you have a fresh install of MSYS2) pacman -Sy pacman-mirrors # Install build tools needed for Rust. If you're building a 32-bit compiler, # then replace "x86_64" below with "i686". If you've already got git, python, # or CMake installed and in PATH you can remove them from this list. Note # that it is important that you do **not** use the 'python2', 'cmake' and 'ninja' # packages from the 'msys2' subsystem. The build has historically been known # to fail with these packages. pacman -S git \ make \ diffutils \ tar \ mingw-w64-x86_64-python \ mingw-w64-x86_64-cmake \ mingw-w64-x86_64-gcc \ mingw-w64-x86_64-ninja
-
Navigate to Rust's source code (or clone it), then build it:
./x.py build && ./x.py install
MSVC
MSVC builds of Rust additionally require an installation of Visual Studio 2017
(or later) so rustc
can use its linker. The simplest way is to get the
Visual Studio, check the “C++ build tools” and “Windows 10 SDK” workload.
(If you're installing cmake yourself, be careful that “C++ CMake tools for Windows” doesn't get included under “Individual components”.)
With these dependencies installed, you can build the compiler in a cmd.exe
shell with:
python x.py build
Currently, building Rust only works with some known versions of Visual Studio. If you have a more recent version installed and the build system doesn't understand, you may need to force rustbuild to use an older version. This can be done by manually calling the appropriate vcvars file before running the bootstrap.
CALL "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat"
python x.py build
Specifying an ABI
Each specific ABI can also be used from either environment (for example, using the GNU ABI in PowerShell) by using an explicit build triple. The available Windows build triples are:
- GNU ABI (using GCC)
i686-pc-windows-gnu
x86_64-pc-windows-gnu
- The MSVC ABI
i686-pc-windows-msvc
x86_64-pc-windows-msvc
The build triple can be specified by either specifying --build=<triple>
when
invoking x.py
commands, or by copying the config.toml
file (as described
in Installing From Source), and modifying the
build
option under the [build]
section.
Configure and Make
While it's not the recommended build system, this project also provides a
configure script and makefile (the latter of which just invokes x.py
).
./configure
make && sudo make install
When using the configure script, the generated config.mk
file may override the
config.toml
file. To go back to the config.toml
file, delete the generated
config.mk
file.
Building Documentation
If you’d like to build the documentation, it’s almost the same:
./x.py doc
The generated documentation will appear under doc
in the build
directory for
the ABI used. I.e., if the ABI was x86_64-pc-windows-msvc
, the directory will be
build\x86_64-pc-windows-msvc\doc
.
Notes
Since the Rust compiler is written in Rust, it must be built by a precompiled "snapshot" version of itself (made in an earlier stage of development). As such, source builds require a connection to the Internet, to fetch snapshots, and an OS that can execute the available snapshot binaries.
Snapshot binaries are currently built and tested on several platforms:
Platform / Architecture | x86 | x86_64 |
---|---|---|
Windows (7, 8, 10, ...) | ✓ | ✓ |
Linux (kernel 2.6.32, glibc 2.11 or later) | ✓ | ✓ |
macOS (10.7 Lion or later) | (*) | ✓ |
(*): Apple dropped support for running 32-bit binaries starting from macOS 10.15 and iOS 11. Due to this decision from Apple, the targets are no longer useful to our users. Please read our blog post for more info.
You may find that other platforms work, but these are our officially supported build environments that are most likely to work.
Getting Help
The Rust community congregates in a few places:
- Stack Overflow - Direct questions about using the language.
- users.rust-lang.org - General discussion and broader questions.
- /r/rust - News and general discussion.
Contributing
If you are interested in contributing to the Rust project, please take a look at the Getting Started guide in the rustc-dev-guide.
License
Rust is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.
See LICENSE-APACHE, LICENSE-MIT, and COPYRIGHT for details.
Trademark
The Rust Foundation owns and protects the Rust and Cargo trademarks and logos (the “Rust Trademarks”).
If you want to use these names or brands, please read the media guide.
Third-party logos may be subject to third-party copyrights and trademarks. See Licenses for details.