Update thread and futex APIs to work with Emscripten
This updates the thread and futex APIs in `std` to match the APIs exposed by
Emscripten. This allows threads to run on `wasm32-unknown-emscripten` and the
thread parker to compile without errors related to the missing `futex` module.
To make use of this, Rust code must be compiled with `-C target-feature=atomics`
and Emscripten must link with `-pthread`.
I have confirmed this works well locally when building multithreaded crates.
Attempting to enable `std` thread tests currently fails for seemingly obscure
reasons and Emscripten is currently disabled in CI, so further work is needed to
have proper test coverage here.
This updates the thread and futex APIs in `std` to match the APIs exposed by
Emscripten. This allows threads to run on `wasm32-unknown-emscripten` and the
thread parker to compile without errors related to the missing `futex` module.
To make use of this, Rust code must be compiled with `-C target-feature=atomics`
and Emscripten must link with `-pthread`.
I have confirmed this works well locally when building multithreaded crates.
Attempting to enable `std` thread tests currently fails for seemingly obscure
reasons and Emscripten is currently disabled in CI, so further work is needed to
have proper test coverage here.
Duration::zero() -> Duration::ZERO
In review for #72790, whether or not a constant or a function should be favored for `#![feature(duration_zero)]` was seen as an open question. In https://github.com/rust-lang/rust/issues/73544#issuecomment-691701670 an invitation was opened to either stabilize the methods or propose a switch to the constant value, supplemented with reasoning. Followup comments suggested community preference leans towards the const ZERO, which would be reason enough.
ZERO also "makes sense" beside existing associated consts for Duration. It is ever so slightly awkward to have a series of constants specifying 1 of various units but leave 0 as a method, especially when they are side-by-side in code. It seems unintuitive for the one non-dynamic value (that isn't from Default) to be not-a-const, which could hurt discoverability of the associated constants overall. Elsewhere in `std`, methods for obtaining a constant value were even deprecated, as seen with [std::u32::min_value](https://doc.rust-lang.org/std/primitive.u32.html#method.min_value).
Most importantly, ZERO costs less to use. A match supports a const pattern, but const fn can only be used if evaluated through a const context such as an inline `const { const_fn() }` or a `const NAME: T = const_fn()` declaration elsewhere. Likewise, while https://github.com/rust-lang/rust/issues/73544#issuecomment-691949373 notes `Duration::zero()` can optimize to a constant value, "can" is not "will". Only const contexts have a strong promise of such. Even without that in mind, the comment in question still leans in favor of the constant for simplicity. As it costs less for a developer to use, may cost less to optimize, and seems to have more of a community consensus for it, the associated const seems best.
r? ```@LukasKalbertodt```
The discussion seems to have resolved that this lint is a bit "noisy" in
that applying it in all places would result in a reduction in
readability.
A few of the trivial functions (like `Path::new`) are fine to leave
outside of closures.
The general rule seems to be that anything that is obviously an
allocation (`Box`, `Vec`, `vec![]`) should be in a closure, even if it
is a 0-sized allocation.
It was only ever used with Vec<u8> anyway. This simplifies some things.
- It no longer needs to be flushed, because that's a no-op anyway for
a Vec<u8>.
- Writing to a Vec<u8> never fails.
- No #[cfg(test)] code is needed anymore to use `realstd` instead of
`std`, because Vec comes from alloc, not std (like Write).
Define `fs::hard_link` to not follow symlinks.
POSIX leaves it [implementation-defined] whether `link` follows symlinks.
In practice, for example, on Linux it does not and on FreeBSD it does.
So, switch to `linkat`, so that we can pick a behavior rather than
depending on OS defaults.
Pick the option to not follow symlinks. This is somewhat arbitrary, but
seems the less surprising choice because hard linking is a very
low-level feature which requires the source and destination to be on
the same mounted filesystem, and following a symbolic link could end
up in a different mounted filesystem.
[implementation-defined]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html
Use Intra-doc links for std::io::buffered
Helps with #75080. I used the implicit link style for intrinsics, as that was what `minnumf32` and others already had.
``@rustbot`` modify labels: T-doc, A-intra-doc-links
r? ``@jyn514``
`#![deny(unsafe_op_in_unsafe_fn)]` in sys/hermit
Partial fix of #73904.
This encloses ``unsafe`` operations in ``unsafe fn`` in ``sys/hermit``.
Some unsafe blocks are not well documented because some system-based functions lack documents.
Partially fix#55002, deprecate in another release
Co-authored-by: Ashley Mannix <kodraus@hey.com>
Update stable version for stabilize_spin_loop
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Use better example for spinlock
As suggested by KodrAus
Remove renamed_spin_loop already available in master
Fix spin loop example
fix various aliasing issues in the standard library
This fixes various cases where the standard library either used raw pointers after they were already invalidated by using the original reference again, or created raw pointers for one element of a slice and used it to access neighboring elements.
Add note to process::arg[s] that args shouldn't be escaped or quoted
This came out of discussion on [forum](https://users.rust-lang.org/t/how-to-get-full-output-from-command/50626), where I recently asked a question and it turned out that the problem was redundant quotation:
```rust
Command::new("rg")
.arg("\"pattern\"") // this will look for "pattern" with quotes included
```
This is something that has bitten me few times already (in multiple languages actually), so It'd be grateful to have it in the docs, even though it's not sctrictly Rust specific problem. Other users also agreed.
This can be really annoying to debug, because in many cases (inluding mine), quotes can be legal part of the argument, so the command doesn't fail, it just behaves unexpectedly. Not everybody (including me) knows that quotes around arguments are part of the shell and not part of the called program. Coincidentally, somoene had the same problem [yesterday](https://www.reddit.com/r/rust/comments/jkxelc/going_crazy_over_running_a_curl_process_from_rust/) on reddit.
I am not a native speaker, so I welcome any corrections or better formulation, I don't expect this to be merged as is. I was also reminded that this is platform/shell specific behaviour, but I didn't find a good way to formulate that briefly, any ideas welcome.
It's also my first PR here, so I am not sure I did everything correctly, I did this just from Github UI.
make exp_m1 and ln_1p examples more representative of use
With this PR, the examples for `exp_m1` would fail if `x.exp() - 1.0` is used instead of `x.exp_m1()`, and the examples for `ln_1p` would fail if `(x + 1.0).ln()` is used instead of `x.ln_1p()`.
Add std::panic::panic_any.
The discussion of #67984 lead to the conclusion that there should be a macro or function separate from `std::panic!()` for throwing arbitrary payloads, to make it possible to deprecate or disallow (in edition 2021) `std::panic!(arbitrary_payload)`.
Alternative names:
- `panic_with!(..)`
- ~~`start_unwind(..)`~~ (panicking doesn't always unwind)
- `throw!(..)`
- `panic_throwing!(..)`
- `panic_with_value(..)`
- `panic_value(..)`
- `panic_with(..)`
- `panic_box(..)`
- `panic(..)`
The equivalent (private, unstable) function in `libstd` is called `std::panicking::begin_panic`.
I suggest `panic_any`, because it allows for any (`Any + Send`) type.
_Tracking issue: #78500_
Uplift `temporary-cstring-as-ptr` lint from `clippy` into rustc
The general consensus seems to be that this lint covers a common enough mistake to warrant inclusion in rustc.
The diagnostic message might need some tweaking, as I'm not sure the use of second-person perspective matches the rest of rustc, but I'd like to hear others' thoughts on that.
(cc #53224).
r? `@oli-obk`
Capture output from threads spawned in tests
This is revival of #75172.
Original text:
> Fixes#42474.
>
> r? `@dtolnay` since you expressed interest in this, but feel free to redirect if you aren't the right person anymore.
---
Closes#75172.
Updated the added documentation in llvm_util.rs to note which copies of LLVM need to be inspected.
Removed avx512bf16 and avx512vp2intersect because they are unsupported before LLVM 9 with the build with external LLVM 8 being supported
Re-introduced detection testing previously removed for un-requestable features tsc and mmx
`#[deny(unsafe_op_in_unsafe_fn)]` in sys/wasm
This is part of #73904.
This encloses unsafe operations in unsafe fn in `libstd/sys/wasm`.
@rustbot modify labels: F-unsafe-block-in-unsafe-fn
`movbe` seems to not be a run-time detectable feature on x86.
It has thus been removed from the list.
It was only commented out to ease comparison against the full list.
This PR both adds in-source documentation on what to look out for
when adding a new (X86) feature set and adds all that are detectable at run-time in Rust stable
as of 1.27.0.
This should only enable the use of the corresponding LLVM intrinsics.
Actual intrinsics need to be added separately in rust-lang/stdarch.
It also re-orders the run-time-detect test statements to be more consistent
with the actual list of intrinsics whitelisted and removes underscores not present
in the actual names (which might be mistaken as being part of the name)
replace `#[allow_internal_unstable]` with `#[rustc_allow_const_fn_unstable]` for `const fn`s
`#[allow_internal_unstable]` is currently used to side-step feature gate and stability checks.
While it was originally only meant to be used only on macros, its use was expanded to `const fn`s.
This pr adds stricter checks for the usage of `#[allow_internal_unstable]` (only on macros) and introduces the `#[rustc_allow_const_fn_unstable]` attribute for usage on `const fn`s.
This pr does not change any of the functionality associated with the use of `#[allow_internal_unstable]` on macros or the usage of `#[rustc_allow_const_fn_unstable]` (instead of `#[allow_internal_unstable]`) on `const fn`s (see https://github.com/rust-lang/rust/issues/69399#issuecomment-712911540).
Note: The check for `#[rustc_allow_const_fn_unstable]` currently only validates that the attribute is used on a function, because I don't know how I would check if the function is a `const fn` at the place of the check. I therefore openend this as a 'draft pull request'.
Closesrust-lang/rust#69399
r? @oli-obk
Throw core::panic!("message") as &str instead of String.
This makes `core::panic!("message")` consistent with `std::panic!("message")`, which throws a `&str` and not a `String`.
This also makes any other panics from `core::panicking::panic` result in a `&str` rather than a `String`, which includes compiler-generated panics such as the panics generated for `mem::zeroed()`.
---
Demonstration:
```rust
use std::panic;
use std::any::Any;
fn main() {
panic::set_hook(Box::new(|panic_info| check(panic_info.payload())));
check(&*panic::catch_unwind(|| core::panic!("core")).unwrap_err());
check(&*panic::catch_unwind(|| std::panic!("std")).unwrap_err());
}
fn check(msg: &(dyn Any + Send)) {
if let Some(s) = msg.downcast_ref::<String>() {
println!("Got a String: {:?}", s);
} else if let Some(s) = msg.downcast_ref::<&str>() {
println!("Got a &str: {:?}", s);
}
}
```
Before:
```
Got a String: "core"
Got a String: "core"
Got a &str: "std"
Got a &str: "std"
```
After:
```
Got a &str: "core"
Got a &str: "core"
Got a &str: "std"
Got a &str: "std"
```
revise Hermit's mutex interface to support the behaviour of StaticMutex
rust-lang/rust#77147 simplifies things by splitting this Mutex type into two types matching the two use cases: StaticMutex and MovableMutex. To support the new behavior of StaticMutex, we move part of the mutex implementation into libstd.
The interface to the OS changed. Consequently, I removed a few functions, which aren't longer needed.
According to [the bionic status page], `linkat` has only been available
since API level 21. Since Android is based on Linux and Linux's `link`
doesn't follow symlinks, just use `link` on Android.
[the bionic status page]: https://android.googlesource.com/platform/bionic/+/master/docs/status.md
Duration::ZERO composes better with match and various other things,
at the cost of an occasional parens, and results in less work for the
optimizer, so let's use that instead.
This expands time's test suite to use more and in more places the
range of methods and constants added to Duration in recent
proposals for the sake of testing more API surface area and
improving legibility.
const keyword: brief paragraph on 'const fn'
`const fn` were mentioned in the title, but called "deterministic functions" which is not their main property (though at least currently it is a consequence of being const-evaluable). This adds a brief paragraph discussing them, also in the hopes of clarifying that they do *not* have any effect on run-time uses.
If pthread mutex initialization fails, the failure will go unnoticed unless
debug assertions are enabled. Any subsequent use of mutex will also silently
fail, since return values from lock & unlock operations are similarly checked
only through debug assertions.
In some implementations the mutex initialization requires a memory
allocation and so it does fail in practice.
Check that initialization succeeds to ensure that mutex guarantees
mutual exclusion.
Add std:🧵:available_concurrency
This PR adds a counterpart to [C++'s `std:🧵:hardware_concurrency`](https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency) to Rust, tracking issue https://github.com/rust-lang/rust/issues/74479.
cc/ `@rust-lang/libs`
## Motivation
Being able to know how many hardware threads a platform supports is a core part of building multi-threaded code. In C++ 11 this has become available through the [`std:🧵:hardware_concurrency`](https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency) API. Currently in Rust most of the ecosystem depends on the [`num_cpus` crate](https://docs.rs/num_cpus/1.13.0/num_cpus/) ([no.35 in top 500 crates](https://docs.google.com/spreadsheets/d/1wwahRMHG3buvnfHjmPQFU4Kyfq15oTwbfsuZpwHUKc4/edit#gid=1253069234)) to provide this functionality. This PR proposes an API to provide access to the number of hardware threads available on a given platform.
__edit (2020-07-24):__ The purpose of this PR is to provide a hint for how many threads to spawn to saturate the processor. There's value in introducing APIs for NUMA and Windows processor groups, but those are intentionally out of scope for this PR. See: https://github.com/rust-lang/rust/pull/74480#issuecomment-662116186.
## Naming
Discussing the naming of the API on Zulip surfaced two options:
- `std:🧵:hardware_concurrency`
- `std:🧵:hardware_threads`
Both options seemed acceptable, but overall people seem to gravitate the most towards `hardware_threads`. Additionally `@jonas-schievink` pointed out that the "hardware threads" terminology is well-established and is used in among other the [RISC-V specification](https://riscv.org/specifications/isa-spec-pdf/) (page 20):
> A component is termed a core if it contains an independent instruction fetch unit. A RISC-V-compatible core might support multiple RISC-V-compatible __hardware threads__, or harts, through multithreading.
It's also worth noting that [the original paper introducing C++'s `std::thread` submodule](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2320.html) unfortunately doesn't feature any discussion on the naming of `hardware_concurrency`, so we can't use that to help inform our decision here.
## Return type
An important consideration `@joshtriplett` brought up is that we don't want to default to `1` for platforms where the number of available threads cannot be retrieved. Instead we want to inform the users of the fact that we don't know and allow them to handle that case. Which is why this PR uses `Option<NonZeroUsize>` as its return type, where `None` is returned on platforms where we don't know the number of hardware threads available.
The reasoning for `NonZeroUsize` vs `usize` is that if the number of threads for a platform are known, they'll always be at least 1. As evidenced by the example the `NonZero*` family of APIs may currently not be the most ergonomic to use, but improving the ergonomics of them is something that I think we can address separately.
## Implementation
`@Mark-Simulacrum` pointed out that most of the code we wanted to expose here was already available under `libtest`. So this PR mostly moves the internal code of libtest into a public API.
Use posix_spawn() on unix if program is a path
Previously `Command::spawn` would fall back to the non-posix_spawn based
implementation if the `PATH` environment variable was possibly changed.
On systems with a modern (g)libc `posix_spawn()` can be significantly
faster. If program is a path itself the `PATH` environment variable is
not used for the lookup and it should be safe to use the
`posix_spawnp()` method. [1]
We found this, because we have a cli application that effectively runs a
lot of subprocesses. It would sometimes noticeably hang while printing
output. Profiling showed that the process was spending the majority of
time in the kernel's `copy_page_range` function while spawning
subprocesses. During this time the process is completely blocked from
running, explaining why users were reporting the cli app hanging.
Through this we discovered that `std::process::Command` has a fast and
slow path for process execution. The fast path is backed by
`posix_spawnp()` and the slow path by fork/exec syscalls being called
explicitly. Using fork for process creation is supposed to be fast, but
it slows down as your process uses more memory. It's not because the
kernel copies the actual memory from the parent, but it does need to
copy the references to it (see `copy_page_range` above!). We ended up
using the slow path, because the command spawn implementation in falls
back to the slow path if it suspects the PATH environment variable was
changed.
Here is a smallish program demonstrating the slowdown before this code
change:
```
use std::process::Command;
use std::time::Instant;
fn main() {
let mut args = std::env::args().skip(1);
if let Some(size) = args.next() {
// Allocate some memory
let _xs: Vec<_> = std::iter::repeat(0)
.take(size.parse().expect("valid number"))
.collect();
let mut command = Command::new("/bin/sh");
command
.arg("-c")
.arg("echo hello");
if args.next().is_some() {
println!("Overriding PATH");
command.env("PATH", std::env::var("PATH").expect("PATH env var"));
}
let now = Instant::now();
let child = command
.spawn()
.expect("failed to execute process");
println!("Spawn took: {:?}", now.elapsed());
let output = child.wait_with_output().expect("failed to wait on process");
println!("Output: {:?}", output);
} else {
eprintln!("Usage: prog [size]");
std::process::exit(1);
}
()
}
```
Running it and passing different amounts of elements to use to allocate
memory shows that the time taken for `spawn()` can differ quite
significantly. In latter case the `posix_spawnp()` implementation is 30x
faster:
```
$ cargo run --release 10000000
...
Spawn took: 324.275µs
hello
$ cargo run --release 10000000 changepath
...
Overriding PATH
Spawn took: 2.346809ms
hello
$ cargo run --release 100000000
...
Spawn took: 387.842µs
hello
$ cargo run --release 100000000 changepath
...
Overriding PATH
Spawn took: 13.434677ms
hello
```
[1]: 5f72f9800b/posix/execvpe.c (L81)
Deny broken intra-doc links in linkchecker
Since rustdoc isn't warning about these links, check for them manually.
This also fixes the broken links that popped up from the lint.
stabilize union with 'ManuallyDrop' fields and 'impl Drop for Union'
As [discussed by @SimonSapin and @withoutboats](https://github.com/rust-lang/rust/issues/55149#issuecomment-634692020), this PR proposes to stabilize parts of the `untagged_union` feature gate:
* It will be possible to have a union with field type `ManuallyDrop<T>` for any `T`.
* While at it I propose we also stabilize `impl Drop for Union`; to my knowledge, there are no open concerns around this feature.
In the RFC discussion, we also talked about allowing `&mut T` as another non-`Copy` non-dropping type, but that felt to me like an overly specific exception so I figured we'd wait if there is actually any use for such a special case.
Some things remain unstable and still require the `untagged_union` feature gate:
* Union with fields that do not drop, are not `Copy`, and are not `ManuallyDrop<_>`. The reason to not stabilize this is to avoid semver concerns around libraries adding `Drop` implementations later. (This is already not fully semver compatible as, to my knowledge, the borrow checker will exploit the non-dropping nature of any type, but it seems prudent to avoid further increasing the amount of trouble adding an `impl Drop` can cause.)
Due to this, quite a few tests still need the `untagged_union` feature, but I think the ones where I could remove the feature flag provide good test coverage for the stable part.
Cc @rust-lang/lang
POSIX leaves it implementation-defined whether `link` follows symlinks.
In practice, for example, on Linux it does not and on FreeBSD it does.
So, switch to `linkat`, so that we can pick a behavior rather than
depending on OS defaults.
Pick the option to not follow symlinks. This is somewhat arbitrary, but
seems the less surprising choice because hard linking is a very
low-level feature which requires the source and destination to be on
the same mounted filesystem, and following a symbolic link could end
up in a different mounted filesystem.
Cleanup cloudabi mutexes and condvars
This gets rid of lots of unnecessary unsafety.
All the AtomicU32s were wrapped in UnsafeCell or UnsafeCell<MaybeUninit>, and raw pointers were used to get to the AtomicU32 inside. This change cleans that up by using AtomicU32 directly.
Also replaces a UnsafeCell<u32> by a safer Cell<u32>.
@rustbot modify labels: +C-cleanup
Static mutex is static
StaticMutex is only ever used with as a static (as the name already suggests). So it doesn't have to be generic over a lifetime, but can simply assume 'static.
This 'static lifetime guarantees the object is never moved, so this is no longer a manually checked requirement for unsafe calls to lock().
@rustbot modify labels: +T-libs +A-concurrency +C-cleanup
For backtrace, use StaticMutex instead of a raw sys Mutex.
The code used the very unsafe `sys::mutex::Mutex` directly, and built its own unlock-on-drop wrapper around it. The StaticMutex wrapper already provides that and is easier to use safely.
@rustbot modify labels: +T-libs +C-cleanup